A Jupyter Notebook containing methods for common tasks related to the field of Natural Language Processing

AndreasScharnetzki/NLP

Natural Language Processing

This module contains methods for several tasks in natural language processing (NLP), such as:

P R E P R O C E S S I N G:

Although many high-level APIs are available, cleaning the text manually offers some advantages in terms of customization.

  • Loading the data and selecting relevant parts
  • Removing punctuation
  • Removing words shorter than a chosen length
  • Replacing numbers with their word-based equivalent
  • Removing stopwords
  • Lemmatization
  • Tokenization
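
To illustrate what manual cleaning can look like, here is a minimal sketch of several of the steps above using only the Python standard library. It is not the notebook's own code: the names `preprocess`, `number_to_words`, `STOPWORDS` and `min_length` are hypothetical, the stopword list is deliberately tiny, and number replacement is limited to 0..99. Lemmatization is left out because it needs a dictionary-backed library (for example NLTK or spaCy).

```python
import re
import string

# Tiny illustrative stopword list (an assumption, not the notebook's list).
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "is", "are", "on"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def preprocess(text, min_length=3):
    """Return a list of cleaned tokens for a raw text string."""
    # Lowercase and remove ASCII punctuation.
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    # Replace standalone one- and two-digit numbers with words.
    text = re.sub(r"\b\d{1,2}\b",
                  lambda m: number_to_words(int(m.group())), text)
    # Tokenize on whitespace.
    tokens = text.split()
    # Drop stopwords and words shorter than min_length.
    return [t for t in tokens if t not in STOPWORDS and len(t) >= min_length]


print(preprocess("The 3 cats are sitting on 21 mats, in a row!"))
```

Numbers are replaced before short words are filtered, so a token such as `3` is kept as `three` instead of being discarded for its length.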

N L P:

  • Sentiment Analysis
  • Part-of-Speech Tagging (POS Tagging)
  • Named-Entity Recognition (NER)
  • TF-IDF Scoring
  • Cosine Similarity
  • MinHashing
  • Word Embedding
  • Latent Semantic Analysis (LSA)
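
As an example of how two of the tasks above fit together, the following sketch scores tokenized documents with TF-IDF and compares them with cosine similarity, using only the standard library. It is an illustration under stated assumptions, not the notebook's implementation: the function names `tf_idf` and `cosine_similarity` are hypothetical, and the weighting used here (relative term frequency times `log(N / df)`, without smoothing) is only one of several common TF-IDF variants.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector has no direction
    return dot / (norm_a * norm_b)


docs = [
    ["cat", "sat", "mat"],
    ["cat", "ate", "fish"],
    ["dog", "ate", "fish"],
]
vectors = tf_idf(docs)
print(cosine_similarity(vectors[1], vectors[2]))  # two shared terms
print(cosine_similarity(vectors[0], vectors[2]))  # no shared terms
```

With this unsmoothed variant, a term that occurs in every document gets weight zero, so it contributes nothing to the similarity.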
