Natural Language Processing using Python (January 2025) - Lab Activities
Day 1:
- Removing stop words
- POS tagging
- Generating antonyms
- Generating synonyms
Day 2:
- Calculating frequency of top N words
- Using the NLTK and re modules to tokenize text and remove stop words and special characters
- Using the wordcloud library to generate a word cloud from the cleaned text
Day 3: Apply CountVectorizer to the input data and answer the following:
- Count Vectorizer Matrix
- Vocabulary (unique words in corpus)
- Calculating frequency of top N words (top 1 word = most frequent term)
- Finding out words that appear in all/maximum sentences/documents
- (Doubts):
- What would happen if we set stop_words='english' in the CountVectorizer?
 - Should we manually explain the tone of 'happy', or write code for it (as we did to pick out the term that occurs in all documents)?
Day 4: Multinomial Naive Bayes model on 'Movie Review' dataset:
- Preprocessing steps, using the NLTK and re modules, to:
- Tokenize text
- Remove stop words
- Remove special characters
- Word Cloud for the data
- Multinomial Naive Bayes (MultinomialNB) model
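A minimal sketch of the Day 4 model, assuming a CountVectorizer + MultinomialNB pipeline; the four-review dataset below is a tiny stand-in for the actual Movie Review data, with 1 = positive and 0 = negative.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny stand-in for the movie-review dataset (1 = positive, 0 = negative)
reviews = [
    "a wonderful heartfelt film with great acting",
    "brilliant plot and a great cast",
    "terrible pacing and a boring script",
    "dull characters in an awful movie",
]
labels = [1, 1, 0, 0]

# Vectorize counts (dropping English stop words), then fit Multinomial NB
model = make_pipeline(CountVectorizer(stop_words="english"), MultinomialNB())
model.fit(reviews, labels)

pred = model.predict(["great film with a brilliant cast"])
```

In the lab itself, the NLTK/re cleaning steps listed above would be applied before vectorization.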
Capstone Project: Major concepts used:
- Word tokenization
- Stop Words
- Lemmatization