This repo demonstrates basic NLP skills for multi-label / multi-class text classification using two Kaggle datasets: Natural Disaster Tweets and Toxic Comments.
- For the Natural Disaster Tweets case, I demonstrate two approaches:
  - Support Vector Machines (SVM)
  - LSTM and CNN neural-network classifiers
- Both approaches reach an F1 score of about 78% on the test set.
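The SVM baseline for the tweets can be sketched roughly as below, assuming scikit-learn; the toy texts, labels, and pipeline settings are illustrative, not the repo's actual code or data.

```python
# Minimal sketch of a TF-IDF + linear SVM text classifier,
# assuming scikit-learn. Data below is a toy stand-in for the
# Natural Disaster Tweets set (1 = real disaster, 0 = not).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = [
    "Forest fire near La Ronge Sask. Canada",
    "Residents asked to shelter in place after flood warning",
    "I love this new song, it is fire",
    "What a beautiful sunny day at the beach",
]
labels = [1, 1, 0, 0]

# TF-IDF features (unigrams + bigrams) feeding a linear SVM.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(texts, labels)

predictions = model.predict(["Evacuation ordered after wildfire spreads"])
```

A linear kernel with TF-IDF features is a common, fast baseline for short-text classification and needs no GPU.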
- For the Toxic Comments case, I demonstrate:
  - SVM
  - Fine-tuning BERT (base, uncased)
- BERT achieves an F1 score of about 98% on the test set.
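The BERT fine-tuning step can be sketched as below, assuming the Hugging Face transformers library and PyTorch; the toy texts, labels, and learning rate are illustrative assumptions, not the repo's actual training setup.

```python
# Hedged sketch of fine-tuning BERT (base uncased) as a binary
# toxic-comment classifier, assuming transformers + PyTorch.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # 0 = non-toxic, 1 = toxic

# Toy stand-in data; the real task uses the Kaggle Toxic Comments set.
texts = ["you are awful", "have a nice day"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# One gradient step; a real run loops over the full dataset for epochs.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
```

Passing `labels` to the model makes it return a cross-entropy loss directly, which is the standard pattern for sequence-classification fine-tuning with this library.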
- Conclusion: SVM is a good choice for a quick classification project; given sufficient compute, fine-tuned BERT delivers excellent text-classification performance.