# **Introduction to Machine Learning and Artificial Intelligence (August - September 2024)**
<br>

<br>
<br>

# Day 5: Recurrent Neural Networks (RNN) and Large Language Models (LLM)
## Recurrent Neural Networks (RNN):
**1. Purpose:** 
* Designed to handle sequential data, such as time series or text.
* **Activation Function:** RNNs often use the tanh activation function to introduce non-linearity into the hidden-state update.
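
As a rough illustration of the recurrent update, here is a minimal sketch of a single-unit RNN with a tanh activation. The scalar weights `w_x`, `w_h`, `b` and the helper names `rnn_step` and `run_rnn` are assumptions made for this example; real RNN layers use weight matrices and learn them during training.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent update for a single hidden unit:
    h_t = tanh(w_x * x_t + w_h * h_prev + b)."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the cell one element at a time,
    carrying the hidden state from step to step."""
    h = 0.0  # initial hidden state
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states

# Illustrative weights only; they are not trained values.
states = run_rnn([1.0, 0.0, -1.0])
```

Each hidden state depends on the current input and on the previous hidden state, which is how the network carries information along the sequence.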

**2. Sequence to Sequence Modeling:**
* An RNN-based setup used for tasks like translation or summarization, where both the input and the output are sequences, possibly of different lengths.

**3. Word Embeddings:**
* Represent words as vectors in a continuous vector space, capturing semantic meanings.
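
To make "capturing semantic meanings" concrete, the sketch below compares toy word vectors with cosine similarity. The three-dimensional vectors are made up for illustration; real embeddings are learned from data and typically have tens to hundreds of dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings, invented for this example only.
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.75, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

sim_royal = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_fruit = cosine_similarity(embeddings["king"], embeddings["apple"])
```

With these toy vectors, `sim_royal` comes out much higher than `sim_fruit`, which is the behavior expected from embeddings that place related words close together.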

**4. Text Preprocessing:**
* **Lowercasing:** Converts all characters to lowercase.
* **Tokenization:** Splits text into words or tokens.
* **Punctuation and Stop Words Removal:** Eliminates unnecessary punctuation and common words that might not contribute to meaning.
* **Stemming and Lemmatization:** Reduces words to their base or root forms.
* **Handling Contractions:** Expands contractions (e.g., “don’t” to “do not”).
* **Emoji Handling:**
    * **Mapping Emojis:** Map emojis to textual representations to include them in text analysis.
    * [Worked solutions (ChatGPT conversation)](https://chatgpt.com/share/aa12a519-9b42-429c-b530-4771d62c2178)
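
Several of the steps above can be sketched with the standard library alone. The lookup tables `CONTRACTIONS`, `STOP_WORDS`, and `EMOJI_MAP` and the function `preprocess` are hypothetical and deliberately tiny; stemming and lemmatization are left out here because they normally come from a library such as NLTK.

```python
import re

# Tiny illustrative lookup tables; real pipelines use much larger ones.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "it's": "it is"}
STOP_WORDS = {"the", "a", "an", "is", "it", "this", "i"}
EMOJI_MAP = {"🙂": " smile ", "😡": " angry "}

def preprocess(text):
    """Lowercase, map emojis, expand contractions, strip punctuation,
    tokenize on whitespace, and drop stop words."""
    text = text.lower().replace("’", "'")  # normalize curly apostrophes
    for emoji, name in EMOJI_MAP.items():
        text = text.replace(emoji, name)
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    text = re.sub(r"[^\w\s]", " ", text)  # remove punctuation
    tokens = text.split()
    return [t for t in tokens if t not in STOP_WORDS]

tokens = preprocess("I don’t like this movie 😡!")
# tokens == ["do", "not", "like", "movie", "angry"]
```

The stop-word list here intentionally keeps "not", since negation usually matters for sentiment analysis.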

**5. NLTK Library:**
* **`nltk.stem`:** Provides stemmers (for example `PorterStemmer`) and lemmatizers (for example `WordNetLemmatizer`).

**6. Sentiment Analysis Preprocessing:**
* Involves cleaning and preparing text data for sentiment classification.

**7. Count Vectorizer:**
* **Bag-of-Words Model:** Represents each document by how often each vocabulary word occurs in it, ignoring word order. Simple, but not very powerful.
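
The bag-of-words idea can be sketched in a few lines. The function `bag_of_words` is a hypothetical helper that splits on whitespace only; in practice a library class such as scikit-learn's `CountVectorizer` handles tokenization and produces a sparse matrix.

```python
from collections import Counter

def bag_of_words(documents):
    """Return a sorted vocabulary and one row of word counts per document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for doc in tokenized for word in doc})
    counts = [Counter(doc) for doc in tokenized]
    matrix = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, matrix

vocabulary, matrix = bag_of_words(["good movie", "bad movie bad acting"])
# vocabulary == ["acting", "bad", "good", "movie"]
# matrix == [[0, 0, 1, 1], [1, 2, 0, 1]]
```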

**8. TF-IDF (Term Frequency-Inverse Document Frequency):**
* A more advanced technique that reflects how important a word is in a document relative to its frequency across all documents.
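
A minimal sketch of the textbook formula, TF-IDF = (term frequency in the document) x log(N / number of documents containing the term), follows. The helper `tf_idf` is hypothetical. Library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their numbers will differ from this sketch.

```python
import math

def tf_idf(documents):
    """Return one {word: score} dictionary per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    vocabulary = {w for doc in tokenized for w in doc}
    # Document frequency: in how many documents does each word appear?
    df = {w: sum(1 for doc in tokenized if w in doc) for w in vocabulary}
    scores = []
    for doc in tokenized:
        row = {}
        for w in set(doc):
            tf = doc.count(w) / len(doc)       # relative frequency in this document
            idf = math.log(n_docs / df[w])     # rarer words get a larger weight
            row[w] = tf * idf
        scores.append(row)
    return scores

scores = tf_idf(["good movie", "bad movie", "great movie"])
```

Here "movie" appears in every document, so its score is 0, while "good" appears in only one document and receives a positive weight.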

**9. Sentiment Classification:**
* **Examples:** Analyzing movie reviews or restaurant feedback to classify sentiment.

**10. The Vanishing Gradient Problem:**
* A problem where gradients shrink as they are propagated back through many layers or time steps, so the weights that influence early inputs learn very slowly or stop learning.
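
The effect can be seen numerically in a one-unit tanh RNN. The sketch below multiplies the per-step derivatives from the chain rule; the function name `gradient_through_time`, the recurrent weight `w_h=0.5`, and the constant input are assumptions chosen for illustration.

```python
import math

def gradient_through_time(n_steps, w_h=0.5, x=1.0):
    """Return |d h_T / d h_0| for h_t = tanh(w_h * h_{t-1} + x)."""
    h = 0.0
    grad = 1.0
    for _ in range(n_steps):
        h = math.tanh(w_h * h + x)
        # Chain rule: d h_t / d h_{t-1} = w_h * (1 - h_t^2)
        grad *= w_h * (1.0 - h * h)
    return abs(grad)

grad_short = gradient_through_time(5)
grad_long = gradient_through_time(50)
```

Because every factor is smaller than 1 in this setup, `grad_long` is many orders of magnitude smaller than `grad_short`: the signal from early time steps has almost no influence on the update.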

**11. Long Short-Term Memory (LSTM):**
* A type of RNN that can remember long-term dependencies by using memory cells to store information.

**12. Attention Mechanism:**
* Allows models to focus on different parts of the input sequence when making predictions.
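
One common form is scaled dot-product attention, the variant used in Transformers; choosing that variant here is an assumption, since the notes do not name one. The sketch below scores each key against a query, turns the scores into weights with softmax, and returns the weighted sum of the values. The helper names and toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw scores to positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

# Toy inputs: three positions, each with a 2-d key and a 1-d value.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0], [20.0], [30.0]]
weights, context = attention([1.0, 0.0], keys, values)
```

The first and third keys align with the query, so they receive equal weights that are larger than the weight of the second key.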

**13. Transformers:**
* Models that handle entire sequences at once, addressing limitations of RNNs.

**14. ANM retention comparison (rough rules of thumb, not hard limits):**
* **RNN:** Retains context over roughly 8 words.
* **LSTM:** Retains context over roughly 50-100 words.
* **Attention Mechanism:** Handles roughly 1000 words.
* **Transformers:** Efficiently process entire sequences.

**15. Spam/Ham Classification example:**
* Classification task to differentiate between spam and non-spam messages.

**16. Handling Class Imbalance:**
* Techniques such as **SMOTE** (Synthetic Minority Over-sampling Technique), **MSMOTE** (Modified SMOTE), **KNN** (K-Nearest Neighbors, which SMOTE uses to pick the neighbors it interpolates between), and **boosting**.
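
The core idea of SMOTE, creating a synthetic minority sample on the line segment between a minority point and one of its nearest minority neighbors, can be sketched as follows. The function `smote_sample` and the toy data are hypothetical; a maintained implementation is available as the `SMOTE` class in the imbalanced-learn package.

```python
import random

def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a random minority sample
    and one of its k nearest minority neighbors."""
    if rng is None:
        rng = random.Random(0)
    base = rng.choice(minority)
    # Sort the other minority points by squared Euclidean distance to base.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbor = rng.choice(others[:k])
    gap = rng.random()  # position along the segment, in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbor)]

minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [5.0, 5.0]]
synthetic = [smote_sample(minority, rng=random.Random(i)) for i in range(5)]
```

Each synthetic point lies between two real minority points, so the minority class grows without simply duplicating existing rows.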

**17. Keras Pad Sequences:**
* Adds padding to sequences to ensure uniform length across all input sequences.
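
The behavior can be sketched without Keras. The helper `pad_sequences_sketch` below is hypothetical and mimics the defaults of Keras's `pad_sequences` (zeros added at the front, long sequences truncated from the front); it assumes `maxlen` is at least 1.

```python
def pad_sequences_sketch(sequences, maxlen=None, value=0):
    """Left-pad (and left-truncate) integer sequences to a common length."""
    if maxlen is None:
        maxlen = max(len(s) for s in sequences)
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]  # keep the last maxlen items
        padded.append([value] * (maxlen - len(seq)) + seq)
    return padded

padded = pad_sequences_sketch([[1, 2, 3], [4, 5], [6]], maxlen=3)
# padded == [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
```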

**18. Early Stopping:**
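* Stops training when the model's performance on a validation set stops improving, based on hyperparameters like **min_delta** (the smallest change that counts as an improvement) and **patience** (how many epochs without improvement to tolerate).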
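
The stopping rule can be sketched as a loop over recorded validation losses. The function `early_stopping_epoch` and the loss values are hypothetical; in Keras the equivalent logic lives in the `EarlyStopping` callback.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.01):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # epochs since the last real improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss  # improved by more than min_delta
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

stop_epoch = early_stopping_epoch([0.90, 0.70, 0.65, 0.649, 0.66, 0.64])
# stop_epoch == 4
```

The drop from 0.65 to 0.649 is smaller than `min_delta`, so it does not count as an improvement, and training stops after two such epochs in a row.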

**19. Embedding:**
* Represents categorical variables (like words) as dense vectors.

**20. Sparse vs. Dense Matrices:**
* **Sparse Matrix:** Contains mostly zeros, so large-scale data can be stored efficiently by recording only the non-zero entries.
* **Dense Matrix:** Contains mostly non-zero elements, so every entry is stored.
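
One simple sparse layout is a dictionary of keys that stores only the non-zero entries. The helper `to_sparse` below is a hypothetical sketch of that idea; libraries such as `scipy.sparse` provide optimized formats for real workloads.

```python
def to_sparse(dense):
    """Store only non-zero entries as {(row, col): value}."""
    return {
        (i, j): v
        for i, row in enumerate(dense)
        for j, v in enumerate(row)
        if v != 0
    }

dense = [
    [0, 0, 3, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 2],
]
sparse = to_sparse(dense)
# sparse == {(0, 2): 3, (2, 0): 1, (2, 3): 2}
```

The dense form stores 12 numbers, while the sparse form stores 3 entries. Bag-of-words matrices are a typical case, since each document uses only a small part of the vocabulary.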

**21. Word Embedding Models:**
* **GloVe (Global Vectors for Word Representation):** Pretrained embeddings capturing word similarities.
* **FastText:** Extends word embeddings to include subword information.
* **GoogLeNet and AlexNet:** Convolutional neural networks for image classification. They are not word embedding models, but were influential in neural network development.

## Transformers and HuggingFace:

**1. HuggingFace:** 
* Provides pre-trained transformer models and tools.
* [HuggingFace link](https://huggingface.co)
* **Examples:** Text summarization, image generation, and translation using models hosted on HuggingFace.

**2. DistilBART:**
* A distilled version of the BART model for text summarization.

**3. Retrieval-Augmented Generation (RAG):**
* Combines retrieval-based and generative approaches for more effective text generation.
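
The retrieval half of RAG can be sketched with a toy word-overlap search that picks the most relevant document and places it in the prompt. The functions `retrieve` and `build_prompt` and the two documents are hypothetical; real systems usually retrieve with embedding similarity, and the generation step (sending the prompt to a language model) is omitted here.

```python
def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, documents):
    """Put the retrieved context in front of the question."""
    context = "\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

documents = [
    "LSTM networks use memory cells to keep long-term information.",
    "TF-IDF weighs words by how rare they are across documents.",
]
prompt = build_prompt("What do LSTM networks use?", documents)
```

The resulting prompt contains the LSTM document but not the TF-IDF one, so a generative model would answer with the retrieved facts in view.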

## Fine-Tuning Large Language Models (LLM):

**1. OpenAI Playground:**
* An interactive tool for experimenting with OpenAI's language models.
* **Fine-tuning:** Adjusting pre-trained models to better fit specific tasks or datasets.
* **Retrieval-Augmented Generation (RAG):** Supplying retrieved documents to the model at query time.

## Future Topics:
**1. Intro to GANs (Generative Adversarial Networks):** 
* Models for generating new data samples.

**2. VAEs (Variational Autoencoders):** 
* Generative models that learn latent representations.

**3. OpenAI API:** 
* API for interacting with OpenAI models.

**4. Prompt Engineering:** 
* Designing effective prompts for LLMs.

**5. LangChain:** 
* Framework for building applications with language models.

**6. Other Models:** 
* Exploring models like LLaMA (Large Language Model Meta AI).

**7. RAG:** 
* Retrieval-Augmented Generation