
matt8955/dsc-nlp-section-recap-nyc-ds-033020

 
 


Natural Language Processing - Recap

Key Takeaways

The key takeaways from this section include:

  • NLP has become increasingly popular over the past few years, and NLP researchers have produced many valuable insights
  • The Natural Language Tool Kit (NLTK) is one of the most popular Python libraries for NLP
  • Regular Expressions (regex) are an important part of NLP and can be used for pattern matching and filtering
  • Regular Expressions can become confusing, so make sure to use our provided cheat sheet the first few times you work with regex
  • It is strongly recommended you take some time to use regex tester websites to ensure you understand how changing your regex pattern affects your results when working towards a correct answer!
  • Feature Engineering is essential when working with text data, both for building useful model inputs and for understanding the dynamics of your text
  • Common feature engineering techniques include removing stop words, stemming, lemmatization, and creating n-grams
  • When diving deeper into grammar and linguistics, context-free grammars and part-of-speech tagging are important
  • In this context, parse trees can help computers resolve the meaning of ambiguous words and sentences
  • How you clean and preprocess your data will have a major effect on the conclusions you'll be able to draw in your NLP classification problems
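
To make the regex and feature engineering takeaways above concrete, here is a minimal sketch that uses only Python's built-in `re` module. It is not the section's own code: the `STOP_WORDS` set, the `TOKEN_PATTERN` regex, and the helper names `tokenize`, `remove_stop_words`, and `ngrams` are all hypothetical and chosen for illustration. In practice you would more likely reach for NLTK's tokenizers, stop word lists, and stemmers.

```python
import re

# Hypothetical, deliberately tiny stop word list for illustration only;
# NLTK ships a much fuller one in nltk.corpus.stopwords.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "on"}

# One or more letters, optionally followed by an apostrophe and more letters.
TOKEN_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)?")


def tokenize(text):
    """Lowercase the text and pull out word-like tokens with a regex."""
    return TOKEN_PATTERN.findall(text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n=2):
    """Return all contiguous n-token windows as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


text = "The cat sat on the mat, and the dog's bowl is empty."
tokens = tokenize(text)
content_tokens = remove_stop_words(tokens)
bigrams = ngrams(content_tokens, n=2)

print(tokens)
print(content_tokens)
print(bigrams)
```

Try changing `TOKEN_PATTERN` (for example, dropping the apostrophe group) and rerunning to see how a small change in the pattern changes the tokens, and therefore every feature built from them.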
