# Task-4: Natural Language Processing (NLP) Basics:
### Write a Python script using NLTK or spaCy to perform basic text processing tasks like tokenization, stemming, or sentiment analysis on a sample text.

## Import Libraries :
First, we import the NLTK components used in this notebook: the word and sentence tokenizers, the Porter stemmer, and the VADER sentiment analyzer.

In [7]:
import nltk
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.stem import PorterStemmer
from nltk.sentiment.vader import SentimentIntensityAnalyzer

## Download Necessary NLTK Data Files :
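Tokenization and sentiment analysis rely on data files that are not bundled with NLTK and must be downloaded once: the 'punkt' tokenizer models and the 'vader_lexicon' for sentiment analysis. Recent NLTK releases (3.9 and later) load the tokenizer data from the 'punkt_tab' package instead, so if 'word_tokenize' raises a LookupError, also run nltk.download('punkt_tab').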

In [8]:

nltk.download('punkt')
nltk.download('vader_lexicon')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Anwar\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package vader_lexicon to
[nltk_data]     C:\Users\Anwar\AppData\Roaming\nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


True

## Sample Text :
We define a sample text to perform our NLP tasks. This text will be tokenized, stemmed, and analyzed for sentiment.

In [9]:
text = "Natural Language Processing is an exciting field. It helps in understanding and analyzing human language."

## Tokenization :
Tokenization is the process of breaking down text into smaller units, such as words or sentences. Here, we use 'word_tokenize' to split the text into words and 'sent_tokenize' to split it into sentences. Note that 'word_tokenize' returns punctuation marks such as '.' as separate tokens.

In [10]:
words = word_tokenize(text)
print("Word Tokenization:", words)

sentences = sent_tokenize(text)
print("Sentence Tokenization:", sentences)

Word Tokenization: ['Natural', 'Language', 'Processing', 'is', 'an', 'exciting', 'field', '.', 'It', 'helps', 'in', 'understanding', 'and', 'analyzing', 'human', 'language', '.']
Sentence Tokenization: ['Natural Language Processing is an exciting field.', 'It helps in understanding and analyzing human language.']
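
As a small follow-up to the tokenization output, the sketch below counts word frequencies while skipping the punctuation tokens. To stay runnable without the NLTK data files, it hardcodes the token list printed above; the names `tokens`, `word_tokens`, and `counts` are illustrative and not part of the original notebook.

```python
from collections import Counter

# Token list copied from the word_tokenize output above (hardcoded so the
# sketch runs without downloading 'punkt').
tokens = ['Natural', 'Language', 'Processing', 'is', 'an', 'exciting',
          'field', '.', 'It', 'helps', 'in', 'understanding', 'and',
          'analyzing', 'human', 'language', '.']

# Keep alphabetic tokens only and lowercase them so 'Language' and
# 'language' are counted together.
word_tokens = [t.lower() for t in tokens if t.isalpha()]

counts = Counter(word_tokens)
print(len(tokens), len(word_tokens))  # 17 15
print(counts.most_common(1))          # [('language', 2)]
```

Filtering with `str.isalpha()` is a deliberately simple choice: it would also drop tokens such as numbers or hyphenated words, which may or may not be what a given task needs.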


## Stemming :
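Stemming reduces words to a common stem by stripping suffixes with heuristic rules. The resulting stem is not always a dictionary word (for example, 'Natural' becomes 'natur'), and NLTK's PorterStemmer lowercases its input by default. We use the PorterStemmer from NLTK to stem the tokenized words.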

In [11]:
ps = PorterStemmer()
stemmed_words = [ps.stem(word) for word in words]
print("Stemmed Words:", stemmed_words)

Stemmed Words: ['natur', 'languag', 'process', 'is', 'an', 'excit', 'field', '.', 'it', 'help', 'in', 'understand', 'and', 'analyz', 'human', 'languag', '.']
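
Because stemming maps inflected forms to one stem, it can be used to group related words. The sketch below shows one way to do that with the same PorterStemmer; the word list `sample_words` and the `groups` dictionary are made-up illustrations, not part of the original notebook. The stemmer is purely rule-based, so this snippet should not need any downloaded data.

```python
from collections import defaultdict

from nltk.stem import PorterStemmer

ps = PorterStemmer()

# Hypothetical word list chosen to show several forms collapsing to one stem.
sample_words = ["helps", "helped", "helping", "language", "languages"]

# Map each stem to the original words that produced it.
groups = defaultdict(list)
for word in sample_words:
    groups[ps.stem(word)].append(word)

for stem, members in groups.items():
    print(stem, "<-", members)
```

Here the three forms of "help" end up under one key and both forms of "language" under another, which is the behavior that makes stemming useful for search and counting.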


## Sentiment Analysis :
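Sentiment analysis estimates the emotional tone of a text. We use VADER (Valence Aware Dictionary and sEntiment Reasoner), a lexicon- and rule-based sentiment analyzer included in NLTK. Its 'polarity_scores' method returns the proportions of the text rated negative ('neg'), neutral ('neu'), and positive ('pos'), plus a normalized 'compound' score between -1 (most negative) and +1 (most positive).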

In [12]:
sid = SentimentIntensityAnalyzer()
sentiment = sid.polarity_scores(text)
print("Sentiment Analysis:", sentiment)

Sentiment Analysis: {'neg': 0.0, 'neu': 0.591, 'pos': 0.409, 'compound': 0.8074}
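
To turn the compound score into a label, a commonly cited convention from the VADER documentation treats scores of 0.05 or higher as positive, -0.05 or lower as negative, and anything in between as neutral. The helper below is a minimal sketch of that rule; the function name `label_sentiment` and its `threshold` parameter are hypothetical, and the score dictionary is hardcoded from the output above so the snippet runs without the 'vader_lexicon' download.

```python
def label_sentiment(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse label.

    The 0.05 cutoff is a commonly used convention, not a value taken
    from the original notebook; adjust it for your own data.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Scores copied from the polarity_scores output above.
sentiment = {'neg': 0.0, 'neu': 0.591, 'pos': 0.409, 'compound': 0.8074}

print(label_sentiment(sentiment['compound']))  # positive
```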


## Conclusion :
In this project, I implemented basic text processing tasks using NLTK: tokenization to break the text into words and sentences, stemming to reduce words to their stems, and sentiment analysis to determine the emotional tone of the text, which VADER rated as positive (compound score 0.8074).