Hiten-98/NLPWash

NLPWash

  • A library for cleaning text data. NLPWash's key feature is its flexibility: different projects have different requirements, so the library is designed to let users clean data according to their own needs.
  • Consider a sentiment analysis task. The library's interface lets us decide whether to retain or discard emojis, and even select our preferred stemming approach. This level of control lets users iterate through different configurations and fine-tune preprocessing for the best analysis results.
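
To illustrate the idea of configurable cleaning, here is a minimal sketch using only the Python standard library. It is not NLPWash's actual interface: `CleanConfig`, `clean`, and the option names `keep_emojis` and `lowercase` are hypothetical, and the emoji pattern is a rough approximation that covers only some common Unicode emoji blocks.

```python
import re
from dataclasses import dataclass

# Hypothetical names for illustration only; not NLPWash's API.
# Rough emoji matcher covering some common Unicode emoji blocks (not exhaustive).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


@dataclass
class CleanConfig:
    keep_emojis: bool = True
    lowercase: bool = True


def clean(text: str, config: CleanConfig) -> str:
    """Apply only the cleaning steps that the config switches on."""
    if config.lowercase:
        text = text.lower()
    if not config.keep_emojis:
        text = EMOJI_RE.sub("", text)
    # Collapse any whitespace left behind by removals.
    return " ".join(text.split())


sample = "Great movie 😀 Loved it"
# Try the same input under two configurations and compare the results.
for cfg in (CleanConfig(keep_emojis=True), CleanConfig(keep_emojis=False)):
    print(cfg, "->", clean(sample, cfg))
```

Keeping the options in one small object makes it easy to run the same dataset through several configurations and compare downstream results.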

Working with text data requires a lot of cleaning, including:

  1. Normalization
  2. Removing hyperlinks
  3. Removing HTML tags
  4. Removing punctuation
  5. Tokenization
  6. Removing stopwords
  7. Stemming
  8. Lemmatization

With NLPWash, all of this can be done in a single line of code.
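
For comparison, a hand-written version of several of these steps might look like the sketch below. It uses only the Python standard library and does not show NLPWash's own functions: `clean_text`, `naive_stem`, and the tiny `STOPWORDS` set are assumptions made for illustration. The suffix stripper is a crude stand-in for a real stemmer, and lemmatization is left out because it needs a dictionary or a trained model.

```python
import html
import re
import string

# Tiny illustrative stopword list; real projects use a much fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "and", "or", "to", "of", "in", "for"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"<[^>]+>")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def naive_stem(token):
    """Crude suffix stripper, standing in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters would remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def clean_text(text):
    text = text.lower()                   # normalization (lowercasing only)
    text = URL_RE.sub(" ", text)          # remove hyperlinks
    text = TAG_RE.sub(" ", text)          # remove HTML tags
    text = html.unescape(text)            # decode entities such as &amp;
    text = text.translate(PUNCT_TABLE)    # remove ASCII punctuation
    tokens = text.split()                 # whitespace tokenization
    tokens = [t for t in tokens if t not in STOPWORDS]  # remove stopwords
    return [naive_stem(t) for t in tokens]              # stemming


raw = "<p>The cats are playing in the garden! Visit https://example.com today.</p>"
print(clean_text(raw))
# ['cat', 'play', 'garden', 'visit', 'today']
```

Each step here is a separate line that has to be ordered and maintained by hand, which is the kind of repetition a cleaning library is meant to remove.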

Installation

pip install NLPWash

Examples

  1. Example Code 1
  2. Example Code 2
