# Sentiment Analysis Using NLP and the TextBlob Library

# 1. Import required libraries
- nltk: A Python library for NLP tasks like tokenization, stemming, and lemmatization.
- spacy: A library for advanced NLP tasks such as named entity recognition (NER).
- textblob: Used for sentiment analysis and text processing.

In [1]:
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer
import spacy
from textblob import TextBlob

# 2. Downloading NLTK Datasets
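- punkt: A tokenizer model used for splitting text into sentences and words.
- stopwords: A dataset of common stopwords in English (like "is", "and", "the").
- wordnet: A lexical database used for lemmatization.
- punkt_tab: The newer format of the Punkt tokenizer data, which recent NLTK releases require for `sent_tokenize` and `word_tokenize`.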

In [5]:
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('punkt_tab')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.


True

# 3. Defining Sample Text
- A multi-sentence text is provided as input for processing.

In [3]:
# Sample text
text = """Natural Language Processing (NLP) is a fascinating field of artificial intelligence.
It focuses on making machines understand human language. NLP powers applications like chatbots,
language translation tools, and voice assistants."""

# 4. Sentence Tokenization
#### TASK 1: TOKENIZATION
- Tokenization: Breaking text into smaller units like words or sentences.
- Sentence Tokenization splits the text into sentences.

In [6]:
print("Sentence Tokenization:")
sentences = sent_tokenize(text)
print(sentences)

Sentence Tokenization:
['Natural Language Processing (NLP) is a fascinating field of artificial intelligence.', 'It focuses on making machines understand human language.', 'NLP powers applications like chatbots, \nlanguage translation tools, and voice assistants.']
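The third sentence in the output above still contains the `\n` from the triple-quoted string. A minimal sketch of one way to tidy that, assuming plain whitespace collapsing is acceptable for your text. The helper name `normalize_whitespace` is made up for this example and is not part of NLTK:

```python
def normalize_whitespace(sentence: str) -> str:
    """Collapse runs of spaces, tabs, and newlines into single spaces."""
    return " ".join(sentence.split())


# Third sentence copied from the output above, including the embedded newline.
raw_sentence = "NLP powers applications like chatbots, \nlanguage translation tools, and voice assistants."
clean_sentence = normalize_whitespace(raw_sentence)
print(clean_sentence)
```

The same helper could be applied to every item with `[normalize_whitespace(s) for s in sentences]`.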


# 5. Word Tokenization
#### TASK 2: WORD TOKENIZATION
- Word Tokenization splits the text into individual words or tokens.

In [7]:
print("\nWord Tokenization:")
words = word_tokenize(text)
print(words)


Word Tokenization:
['Natural', 'Language', 'Processing', '(', 'NLP', ')', 'is', 'a', 'fascinating', 'field', 'of', 'artificial', 'intelligence', '.', 'It', 'focuses', 'on', 'making', 'machines', 'understand', 'human', 'language', '.', 'NLP', 'powers', 'applications', 'like', 'chatbots', ',', 'language', 'translation', 'tools', ',', 'and', 'voice', 'assistants', '.']
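To see why a tokenizer is used instead of `str.split()`, the sketch below splits the first sample sentence on whitespace only. It uses nothing beyond the standard library. Compare the result with the `word_tokenize` output above, where `(`, `NLP`, `)`, and `.` are separate tokens.

```python
sentence = "Natural Language Processing (NLP) is a fascinating field of artificial intelligence."

# Whitespace splitting keeps punctuation glued to the neighbouring word.
naive_tokens = sentence.split()
print(naive_tokens)

# Tokens that still carry punctuation, which word_tokenize separates out.
glued = [tok for tok in naive_tokens if not tok.isalnum()]
print(glued)
```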


# 6. Stopword Removal
#### TASK 3: REMOVING STOPWORDS
- Stopword Removal: Dropping very common words (e.g., "is", "and", "the") that carry little meaning on their own.
- The comparison uses `word.lower()` because the NLTK stopword list is lowercase. Punctuation tokens are not stopwords, so they remain in the output.

In [8]:
stop_words = set(stopwords.words('english'))
filtered_words = [word for word in words if word.lower() not in stop_words]
print("\nFiltered Words (without Stopwords):")
print(filtered_words)


Filtered Words (without Stopwords):
['Natural', 'Language', 'Processing', '(', 'NLP', ')', 'fascinating', 'field', 'artificial', 'intelligence', '.', 'focuses', 'making', 'machines', 'understand', 'human', 'language', '.', 'NLP', 'powers', 'applications', 'like', 'chatbots', ',', 'language', 'translation', 'tools', ',', 'voice', 'assistants', '.']
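The filtered list above still contains punctuation tokens such as `(`, `,`, and `.`, because they are not in the stopword list. A minimal sketch of dropping them in the same pass, assuming alphabetic tokens are all you want to keep. To stay self-contained it hard-codes the tokens of the second sample sentence and a tiny stopword set (`demo_stop_words`, a stand-in for `set(stopwords.words('english'))`):

```python
# Tokens of the second sample sentence, as produced by word_tokenize above.
tokens = ['It', 'focuses', 'on', 'making', 'machines', 'understand', 'human', 'language', '.']

# Stand-in for set(stopwords.words('english')); only a few words, for illustration.
demo_stop_words = {"it", "on", "is", "a", "of", "and"}

# Keep a token only if it is purely alphabetic and not a stopword.
content_words = [
    tok for tok in tokens
    if tok.isalpha() and tok.lower() not in demo_stop_words
]
print(content_words)
```

Note that `str.isalpha()` also discards tokens containing digits or apostrophes, so a looser test may suit other texts better.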


# 7. Stemming
#### TASK 3: STEMMING
- Stemming strips suffixes using fixed rules to reduce words to a stem, which is not always a real word. For example:
   - "fascinating" → "fascin"
   - "processing" → "process"
- Stemming is faster than lemmatization but less accurate (e.g., "studies" → "studi").

In [9]:
ps = PorterStemmer()
stemmed_words = [ps.stem(word) for word in filtered_words]
print("\nStemmed Words:")
print(stemmed_words)


Stemmed Words:
['natur', 'languag', 'process', '(', 'nlp', ')', 'fascin', 'field', 'artifici', 'intellig', '.', 'focus', 'make', 'machin', 'understand', 'human', 'languag', '.', 'nlp', 'power', 'applic', 'like', 'chatbot', ',', 'languag', 'translat', 'tool', ',', 'voic', 'assist', '.']
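The examples from the bullets above can be checked directly. This sketch assumes NLTK is installed (the Porter stemmer itself needs no downloaded data). It also shows that `PorterStemmer.stem` lowercases its input by default, which is why `Natural` became `natur` in the output above.

```python
from nltk.stem import PorterStemmer

ps = PorterStemmer()

# Words from the bullets above, plus one capitalized token from the sample text.
for word in ["fascinating", "processing", "studies", "Natural"]:
    print(f"{word} -> {ps.stem(word)}")
```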


# 8. Lemmatization
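#### TASK 4: LEMMATIZATION
- Lemmatization converts words to their dictionary base form (lemma), taking the word's part of speech into account.
  - Example: "running" → "run" (as a verb), "better" → "good" (as an adjective).
- Lemmatization is slower than stemming but returns real words (e.g., "studies" → "study").
- `WordNetLemmatizer.lemmatize` treats every word as a noun unless a `pos` argument is passed, which is why "fascinating" and "making" are unchanged in the output below.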

In [10]:
lemmatizer = WordNetLemmatizer()
lemmatized_words = [lemmatizer.lemmatize(word) for word in filtered_words]
print("\nLemmatized Words:")
print(lemmatized_words)


Lemmatized Words:
['Natural', 'Language', 'Processing', '(', 'NLP', ')', 'fascinating', 'field', 'artificial', 'intelligence', '.', 'focus', 'making', 'machine', 'understand', 'human', 'language', '.', 'NLP', 'power', 'application', 'like', 'chatbots', ',', 'language', 'translation', 'tool', ',', 'voice', 'assistant', '.']


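# 9. Named Entity Recognition (NER) with spaCy
#### TASK 5: NAMED ENTITY RECOGNITION (NER)
- NER identifies entities such as names, dates, and organizations in text.
- For the sample text, the small English model tags "NLP" as an organization (ORG). NLP is a field of study rather than an organization, so this label is a misclassification; statistical models can mislabel acronyms in this way.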

In [13]:
nlp = spacy.load("en_core_web_sm")  # Small English pipeline; install it first with: python -m spacy download en_core_web_sm
doc = nlp(text)
print("\nNamed Entities:")
for entity in doc.ents:
    print(f"{entity.text} ({entity.label_})")


Named Entities:
NLP (ORG)
NLP (ORG)


# 10. Sentiment Analysis with TextBlob
#### TASK 6: SENTIMENT ANALYSIS
- Sentiment Analysis: Evaluates emotional tone or opinion in the text.
- Polarity measures sentiment on a scale of -1 (negative) to +1 (positive).
- Subjectivity indicates whether the text is factual (close to 0) or opinionated (close to 1).

In [14]:
blob = TextBlob(text)
polarity = blob.sentiment.polarity
subjectivity = blob.sentiment.subjectivity

print("\nSentiment Analysis:")
print(f"Polarity: {polarity}, Subjectivity: {subjectivity}")

# Classify the polarity as positive, negative, or neutral
if polarity > 0:
    print("The sample text has a Positive sentiment.")
elif polarity < 0:
    print("The sample text has a Negative sentiment.")
else:
    print("The sample text has a Neutral sentiment.")



Sentiment Analysis:
Polarity: 0.04999999999999999, Subjectivity: 0.5875
The sample text has a Positive sentiment.
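A polarity of about 0.05 is only marginally above zero, so the strict `> 0` test above labels an almost neutral text as positive. One possible adjustment is a small neutral band around zero. The sketch below uses a hypothetical helper, `label_polarity`, and the `0.1` threshold is an arbitrary assumption for illustration, not a TextBlob default:

```python
def label_polarity(polarity: float, threshold: float = 0.1) -> str:
    """Map a polarity score in [-1, 1] to a label, treating small magnitudes as neutral."""
    if polarity > threshold:
        return "Positive"
    if polarity < -threshold:
        return "Negative"
    return "Neutral"


# Polarity reported for the sample text above.
print(label_polarity(0.04999999999999999))
print(label_polarity(0.6))
print(label_polarity(-0.35))
```

The right threshold depends on the data, so it is worth tuning against a few hand-labelled examples.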


# *************************************************************************

# About Me
## Name - Aatish Kumar Baitha
  - M.Tech (Data Science)
- YouTube
  - https://www.youtube.com/@EngineeringWithAatish/playlists
- My LinkedIn Profile
  - https://www.linkedin.com/in/aatish-kumar-baitha-ba9523191
- My Blog
  - https://computersciencedatascience.blogspot.com/
- My GitHub Profile
  - https://github.com/Aatishkb

# Thank you!

# *************************************************************************