pycon-us-2017/videos/jiaqi-liu-fuzzy-search-algorithms-how-and-when-to-use-them-pycon-2017.json

{
  "description": "Fuzzy searching, or approximate string matching, is powerful because text\ndata is often messy. For example, shorthand and abbreviated text are\ncommon in various data sets. In addition, output from OCR or\nvoice-to-text conversion tends to be messy or imperfect. Thus, we want to be able\nto make the most of our data by extrapolating as much information as\npossible.\n\nIn this talk, we will explore the various approaches used in fuzzy\nstring matching and demonstrate how they can be used as a feature in a\nmodel or a component in your Python code. We will dive deep into the\napproaches of different algorithms such as Soundex, trigram/n-gram\nsearch, and Levenshtein distance, and discuss the best use cases for each. We\nwill also discuss situations where it\u2019s important to take into account\nthe meaning or intent of a word and demonstrate approaches for measuring\nsemantic similarity using nltk and word2vec. Furthermore, we will\ndemonstrate via live coding how to implement some of these fuzzy search\nalgorithms using Python and/or built-in fuzzy search functions within\nPostgreSQL.\n",
  "duration": 1824,
  "language": "eng",
  "recorded": "2017-05-20",
  "speakers": [
    "Jiaqi Liu"
  ],
  "thumbnail_url": "https://i.ytimg.com/vi/kTS2b6pGElE/hqdefault.jpg",
  "title": "Fuzzy Search Algorithms: How and When to Use Them",
  "videos": [
    {
      "type": "youtube",
      "url": "https://www.youtube.com/watch?v=kTS2b6pGElE"
    }
  ]
}