<a href="https://colab.research.google.com/github/toni-ramchandani/AIMasterClassTTT/blob/main/Section1_8_Introduction_to_AI_NLP.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **Natural Language Processing (NLP) Overview**

Natural Language Processing (NLP) is a field of artificial intelligence that enables computers to understand, interpret, and generate human language. Its applications range from text-level tasks, such as sentiment analysis and text summarization, to interactive systems, such as machine translation, chatbots, and virtual assistants. NLP bridges computer science and linguistics, processing language data with both rule-based and machine learning approaches.

### **Core Concepts and Techniques in NLP**

1. **Tokenization**: Splits text into smaller units like words or sentences, enabling easier analysis. Word and sentence tokenization are common methods.

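2. **Stemming and Lemmatization**: Both reduce words to a base form. **Stemming** strips suffixes with heuristic rules, so the result may not be a real word (e.g., "studies" becomes "studi"), while **lemmatization** uses vocabulary and grammar to return the dictionary form (e.g., "studies" becomes "study").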

3. **Part-of-Speech (POS) Tagging**: Identifies grammatical roles (e.g., noun, verb) of words in a sentence, assisting in understanding sentence structure and meaning.

4. **Named Entity Recognition (NER)**: Identifies entities (e.g., names, locations, organizations) within text, useful for extracting structured data from unstructured text.

5. **Bag of Words (BoW) and TF-IDF**: These are feature extraction techniques. **BoW** represents text by its word counts, ignoring word order, while **TF-IDF** (Term Frequency-Inverse Document Frequency) weights each word by how often it appears in a document, discounted by how many documents in the collection contain it.

6. **Word Embeddings**: Word embeddings, like Word2Vec and GloVe, map each word to a fixed vector in continuous space, preserving semantic relationships. Contextual models, such as BERT and GPT, go further and produce a different vector for the same word depending on the surrounding text.

7. **Sequence Models**: Models like **Recurrent Neural Networks (RNNs)** and **Long Short-Term Memory (LSTM)** networks are commonly used for tasks involving sequential data, like text generation and machine translation.

8. **Transformer Models**: Transformers (like BERT and GPT) have revolutionized NLP, enabling high performance on tasks like translation and question answering. They use self-attention mechanisms to capture relationships between words across an entire input sequence.
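
To make tokenization, Bag of Words, and TF-IDF from the list above concrete, here is a minimal sketch using only the Python standard library. The helper names (`tokenize`, `bag_of_words`, `tf_idf`), the regex-based tokenizer, and the sample documents are illustrative assumptions, and the formula used (term count divided by document length, times `log(N / document frequency)`) is just one common textbook variant; libraries typically add smoothing and normalization.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(tokens):
    """Map each token to the number of times it occurs."""
    return Counter(tokens)


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = bag_of_words(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample documents, for illustration only
docs = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "Dogs and cats are pets.",
]

bow = bag_of_words(tokenize(docs[0]))
scores = tf_idf(docs)
print(bow)
print(scores[0])
```

In the first document, "the" has the highest raw count, but because it also appears in the second document its TF-IDF weight falls below that of "sat", which is unique to the first document.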

### **Example Applications of NLP**

1. **Sentiment Analysis**: Detects the sentiment of text, useful in social media monitoring and customer feedback.
2. **Machine Translation**: Translates text from one language to another (e.g., Google Translate).
3. **Chatbots and Virtual Assistants**: NLP powers conversation systems like Siri, Alexa, and customer service bots.
4. **Speech Recognition**: Converts spoken language into text, as used in virtual assistants.
5. **Document Summarization**: Creates concise summaries of long documents, valuable for news and legal industries.



### **Example Code Snippet for Sentiment Analysis with NLTK**

```python
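import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

# Download the VADER lexicon that the analyzer depends on (only needed once)
nltk.download('vader_lexicon')

# Initialize Sentiment Intensity Analyzer
sia = SentimentIntensityAnalyzer()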

# Analyze sentiment of a sentence
text = "I love NLP! It's so interesting and powerful."
score = sia.polarity_scores(text)
print("Sentiment Score:", score)
```

### **Example Code for Named Entity Recognition with spaCy**

```python
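import spacy

# Load the small English pipeline, which includes an NER component.
# A blank English() pipeline has no NER, so doc.ents would be empty.
# Install the model once with: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")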

# Example text
text = "Apple is looking at buying a UK-based startup for $1 billion."

# Process text and print named entities
doc = nlp(text)
for ent in doc.ents:
    print(ent.text, ent.label_)
```



```python
import nltk
nltk.download('vader_lexicon')
```

```text
[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
True
```

```python
from nltk.sentiment import SentimentIntensityAnalyzer

# Initialize Sentiment Intensity Analyzer
sia = SentimentIntensityAnalyzer()

# Analyze sentiment of a sentence
text = "I love NLP! It's so interesting and powerful."
score = sia.polarity_scores(text)
print("Sentiment Score:", score)
```

```text
Sentiment Score: {'neg': 0.0, 'neu': 0.266, 'pos': 0.734, 'compound': 0.9011}
```
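
The `compound` value in the output above is a single normalized score between -1 and 1, and it is often turned into a coarse label. The sketch below shows one way to do that. The function name `label_from_compound` is hypothetical, and the ±0.05 cutoff follows the convention suggested in VADER's own documentation; treat it as an assumption to tune for your data rather than a fixed rule.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# The scores printed above, copied here so the sketch runs without NLTK
score = {'neg': 0.0, 'neu': 0.266, 'pos': 0.734, 'compound': 0.9011}
print(label_from_compound(score["compound"]))  # prints "positive"
```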


### **Running Named Entity Recognition with spaCy**

```python
import spacy

# Load the spaCy model with NER
nlp = spacy.load("en_core_web_sm")

# Example text
text = "Apple is looking at buying a UK-based startup for $1 billion."

# Process text and print named entities
doc = nlp(text)
for ent in doc.ents:
    print(ent.text, ent.label_)
```

```text
Apple ORG
UK GPE
$1 billion MONEY
```
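
Since NER is mainly useful for extracting structured data from unstructured text, a natural next step is to collect the recognized entities by label. The following is a small sketch of that idea. The helper `group_entities` is hypothetical, and the input pairs are copied from the output above so that the block runs without spaCy installed.

```python
from collections import defaultdict


def group_entities(entities):
    """Group (text, label) pairs into a {label: [texts]} dictionary."""
    groups = defaultdict(list)
    for text, label in entities:
        groups[label].append(text)
    return dict(groups)


# (text, label) pairs taken from the NER output shown above
entities = [("Apple", "ORG"), ("UK", "GPE"), ("$1 billion", "MONEY")]
grouped = group_entities(entities)
print(grouped)
```

With a processed spaCy `doc`, the same helper could be called as `group_entities((ent.text, ent.label_) for ent in doc.ents)`.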


These examples provide a starting point for NLP tasks such as sentiment analysis and named entity recognition, both of which build on tokenization that the libraries perform internally. NLP techniques continue to evolve, offering increasingly sophisticated ways to process and understand human language.