Different Data Cleaning Mechanisms To Clean Data For Different NLP Problems.

ShireeshPyreddy/End-To-End-Data-Cleaning-Mechanisms

End-To-End-Data-Cleaning-Mechanisms

Brief History

In my experience, a lot of time goes into cleaning data, especially for NLP problems. So I decided to build a set of generalized data cleaning mechanisms: you can plug the methods in directly for preprocessing before building various models, or select the methods you need, customize them, and use them in your own code.

Cleaning Mechanisms

The following mechanisms are included in cleaningmechanism.py:

  1. Word/Phrase Embeddings Cleaner
  2. Named Entity Recognition Cleaner
  3. Text Summarization Cleaner
  4. Text Generation Cleaner
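
Different tasks call for different amounts of cleaning. The sketch below shows the idea with two hypothetical functions, `clean_for_embeddings` and `clean_for_ner`. These names and the exact steps are assumptions made for illustration and are not taken from cleaningmechanism.py. An embeddings cleaner can usually afford to lowercase and drop punctuation, while an NER cleaner would likely keep case and punctuation because they are useful cues for spotting entities.

```python
import re
import string
from typing import List

# Hypothetical pattern for illustration: matches http(s) URLs up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def clean_for_embeddings(text: str) -> List[str]:
    """Aggressive cleaning: lowercase, drop URLs and ASCII punctuation, tokenize."""
    text = text.lower()
    text = URL_PATTERN.sub(" ", text)
    # Remove every ASCII punctuation character.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Splitting on whitespace also discards repeated spaces.
    return text.split()


def clean_for_ner(text: str) -> str:
    """Light cleaning: keep case and punctuation, drop URLs, collapse whitespace."""
    text = URL_PATTERN.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(clean_for_embeddings("Visit https://example.com NOW, please!"))
    print(clean_for_ner("Dr.  Smith  visited\nParis. See https://example.com"))
```

Keeping each task's cleaner as a separate function makes it easy to pick only the one you need or to adjust a single step without affecting the others.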

Future Scope

Pending topics and methods to cover:

  1. Spelling Correction
  2. Join Split Words
  3. Split Join Words
  4. Grammar Correction
  5. Normalize Slang Words
