# **Buckle Up! We are starting our Week 2 roller coaster**

In our first week we covered some theoretical concepts and completed our setup, so it's time to start building!

## 📓**Conversational AI Concepts & Model Pipelines**

🎯 By the end of this week, you will:

- Understand LLMs, STT, TTS models and their roles.

- Know how to connect to LLMs with APIs (Groq as example).

- Use Python (requests + JSON) for API interaction.

- Start building a basic chatbot with memory and preprocessing.

---

## 🌟 Large Language Models (LLMs) 🌟

---

### ❗ **Question 1**: What is an LLM?

👉 It’s like a super-smart text predictor that can read, understand, and generate human-like sentences.

You give it some words → it guesses the next words in a way that makes sense.

For example:

1) You ask a question → it gives you an answer.

2) You write a sentence → it can complete it.

3) You give it a topic → it can write an essay, code, or even a story.

So, it's a type of AI trained on huge amounts of text data to generate or understand text.
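
To make the "text predictor" idea concrete, here is a toy sketch that guesses the next word by counting which word most often follows another in a tiny corpus. This is only an illustration: real LLMs use neural networks over tokens, not word-pair counts, and the names `train_bigrams` and `predict_next` are made up for this example.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    table = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        table[current][nxt] += 1
    return table

def predict_next(table, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = table.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = "the cat sat on the mat the cat ate the fish"
table = train_bigrams(corpus)
print(predict_next(table, "the"))  # cat ("cat" follows "the" twice)
```

An LLM does the same job in spirit (pick a likely continuation), but it looks at the whole preceding context instead of a single word.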

---

### Types of LLMs

1. Encoder-only models (e.g., BERT)

    - Best for understanding text (classification, sentiment analysis, embeddings).

    - ❌ Not good at generating text.

2. Decoder-only models (e.g., GPT, LLaMA, Mistral)

    - Best for text generation (chatbots, writing, summarization).

    - What we use in chatbots.

3. Encoder-decoder models (e.g., T5, BART)

    - Good at transforming text (translation, summarization, Q&A).

### Must-Knows about LLMs

- They don’t “think” like humans → They predict text based on training.

- Garbage in → garbage out: Poor prompts = poor answers.

- Token limits: Models can only “see” a certain number of tokens (word pieces) at a time. This is called the context window.

- Biases: Trained on internet text → may reflect biases/errors.
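
The token-limit point can be sketched in a few lines: before sending a conversation to a model, keep only the newest messages that fit a budget. The "about 4 characters per token" estimate below is a rough rule of thumb for English and an assumption here; real tokenizers give different counts. `estimate_tokens` and `trim_to_budget` are illustrative names, not part of any library.

```python
def estimate_tokens(text):
    """Very rough token estimate (assumption: ~4 characters per token)."""
    return max(1, len(text) // 4)

def trim_to_budget(messages, max_tokens):
    """Keep the newest messages whose estimated tokens fit in max_tokens."""
    kept = []
    used = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(msg["content"])
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))  # restore chronological order

history = [
    {"role": "user", "content": "a" * 40},       # ~10 tokens
    {"role": "assistant", "content": "b" * 40},  # ~10 tokens
    {"role": "user", "content": "c" * 40},       # ~10 tokens
]
print(len(trim_to_budget(history, max_tokens=20)))  # 2
```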

### 💡 **Quick Questions**: 

1. Why might a chatbot built on BERT (encoder-only) struggle to answer open-ended questions?

- Answer 👉

---

## 🌟 Speech-to-Text (STT) 🌟

---

### ❗ **Question 2**: What is STT?

👉 It listens to your voice and turns it into written text.

- Converts **audio → text**.
- Enables voice input for conversational AI.
- Think of it as the **ears** of the chatbot.

**Popular STT Models**:

1) **Whisper (OpenAI)** – strong at multilingual speech recognition.
2) **Google Speech-to-Text API** – widely used, real-time transcription.
3) **Vosk** – lightweight, offline speech recognition.

**Common Usages**

1) Voice assistants (Alexa, Siri, Google Assistant).
2) Automated captions in meetings or lectures.
3) Voice-enabled customer support.

---

### Must-Knows about STT

- Accuracy depends on **noise, accents, clarity of speech**.

- Some models need an **internet connection** (API-based); others run **offline**.

- Preprocessing audio (noise reduction) improves results.


### 💡 **Quick Questions**: 

2. Why do you think meeting transcription apps like Zoom or Google Meet struggle when multiple people talk at once?

- Answer 👉

---

## 🌟 Text-to-Speech (TTS) 🌟

---

### ❗ **Question 3**: What is TTS?

👉 It takes written text and speaks it out loud in a human-like voice.

- Converts **text → audio (speech)**.
- Think of it as the **mouth** of the chatbot.
- Makes AI “speak” naturally.

**Popular TTS Models**:

1) **Google TTS** – supports many languages and voices.
2) **Amazon Polly** – lifelike voice synthesis with customization.
3) **ElevenLabs** – cutting-edge, realistic voice cloning.

**Common Usages**

1) Screen readers for visually impaired users.
2) AI chatbots with voice output.
3) Audiobooks or podcast generation.

---

### Must-Knows about TTS

- Some voices sound robotic; others use **neural TTS** for natural tones.

- Latency matters → If too slow, conversation feels unnatural.

- Some TTS services allow **custom voices**.
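
Now that we have met the ears (STT), the brain (LLM), and the mouth (TTS), one voice turn can be sketched as three steps in a row, with a timer around each step to see where latency comes from. The three functions below are hypothetical stand-ins that just pass text around; in a real bot each one would call an actual STT, LLM, or TTS service.

```python
import time

def transcribe(audio_bytes):
    """Hypothetical STT stand-in: pretends the audio is already text."""
    return audio_bytes.decode("utf-8")

def generate_reply(text):
    """Hypothetical LLM stand-in: echoes the input."""
    return f"You said: {text}"

def synthesize(text):
    """Hypothetical TTS stand-in: pretends the bytes are audio."""
    return text.encode("utf-8")

def voice_turn(audio_bytes):
    """Run STT -> LLM -> TTS once and record how long each stage took."""
    timings = {}

    start = time.perf_counter()
    text = transcribe(audio_bytes)
    timings["stt"] = time.perf_counter() - start

    start = time.perf_counter()
    reply = generate_reply(text)
    timings["llm"] = time.perf_counter() - start

    start = time.perf_counter()
    speech = synthesize(reply)
    timings["tts"] = time.perf_counter() - start

    return speech, timings

speech, timings = voice_turn(b"hello")
print(speech, sorted(timings))
```

The user waits for the sum of all three stages, so a slow stage anywhere in the chain makes the whole conversation feel sluggish.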

### 💡 **Quick Questions**: 

3. If you were designing a voice-based AI tutor, what qualities would you want in its TTS voice (tone, speed, clarity, etc.)?

- Answer 👉

---

## 🌟 Using APIs for LLMs with Groq 🌟

In [None]:
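import os
from groq import Groq

# Read the key from the environment instead of hardcoding it in the notebook.
client = Groq(api_key=os.environ.get("GROQ_API_KEY"))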

response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Hello! What is conversational AI?"}]
)

print(response.choices[0].message.content)


Conversational AI refers to the technology that enables computers or digital systems to simulate human-like conversations with humans. This is achieved through the use of natural language processing (NLP) and machine learning algorithms that allow AI systems to understand, interpret, and respond to human input in a way that feels natural and intuitive.

Conversational AI can take many forms, including:

1. **Chatbots**: These are AI-powered software programs that can engage in text-based conversations with humans, often used to provide customer support, answer frequently asked questions, or facilitate transactions.
2. **Virtual assistants**: These are AI-powered digital assistants, such as Siri, Google Assistant, or Alexa, that can understand voice commands and respond with relevant information or actions.
3. **Voice-controlled interfaces**: These are AI-powered interfaces that allow users to interact with devices using voice commands, such as smart speakers or home automation systems.
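
The SDK call above hides the HTTP details. Since one goal this week is to use `requests` + JSON directly, here is a hedged sketch of roughly what the same call looks like by hand. The endpoint URL and the `GROQ_API_KEY` variable name are assumptions based on Groq's OpenAI-compatible API, so check the official docs before relying on them. `build_chat_request` and `send_chat_request` are illustrative names.

```python
import json
import os

# Assumed endpoint (OpenAI-compatible); verify against the Groq docs.
GROQ_CHAT_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_chat_request(prompt, model="llama-3.1-8b-instant", api_key=None):
    """Build the headers and JSON body for a single-turn chat request."""
    api_key = api_key or os.environ.get("GROQ_API_KEY", "")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, json.dumps(payload)

def send_chat_request(prompt):
    """Send the request and return the reply text (needs network + a key)."""
    import requests  # third-party: pip install requests

    headers, body = build_chat_request(prompt)
    response = requests.post(GROQ_CHAT_URL, headers=headers, data=body, timeout=30)
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

headers, body = build_chat_request("Hello!", api_key="test-key")
print(body)
```

Only `send_chat_request` touches the network; building the request separately makes it easy to inspect the JSON before spending any API quota.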

---

## 🌟 Assignments 🌟

### 📝 Assignment 1: LLM Understanding

* Write a short note (3–4 sentences) explaining the difference between **encoder-only, decoder-only, and encoder-decoder LLMs**.
* Give one example usage of each.

Encoder-only models (like BERT) are designed to understand and create rich contextual representations of input text. They are best for tasks where deep comprehension is key, such as sentiment analysis or named entity recognition.
Example: Classifying if a product review is positive or negative.

Decoder-only models (like GPT) are designed for text generation. They predict the next most likely word in a sequence, making them ideal for creative or conversational tasks.
Example: Writing a story continuation or a marketing email.

Encoder-decoder models (like T5) are designed for transformation tasks, where the input and output can be in different forms or languages. The encoder understands the input, and the decoder generates the corresponding output.
Example: Translating an English sentence into French.


### 📝 Assignment 2: STT/TTS Exploration

* Find **one STT model** and **one TTS model** (other than Whisper/Google).
* Write down:

  * What it does.
  * One possible application.

STT Model: Meta's MMS (Massively Multilingual Speech)

What it does: It is a single model that can transcribe speech to text in over 1,100 languages, even from low-resource languages that are typically unsupported.

Application: Creating accessible transcription tools for historical audio archives or community recordings in endangered languages.

TTS Model: ElevenLabs

What it does: It generates highly realistic and emotionally expressive synthetic speech in multiple languages and voices from text input.

Application: Producing voiceovers for audiobooks or educational videos with a consistent and engaging narrator voice.

### 📝 Assignment 3: Build a Chatbot with Memory

* Write a Python program that:

  * Takes user input in a loop.
  * Sends it to the Groq API.
  * Stores the last 5 messages in memory.
  * Ends when the user types `"quit"`.



In [2]:
import os
from groq import Groq

# Read the key from the environment so it never appears in the notebook.
client = Groq(api_key=os.environ.get("GROQ_API_KEY"))
conversation_memory = []  # full history; only the tail is sent to the model

while True:
    user_input = input("You: ")
    if user_input.strip().lower() == "quit":
        break

    conversation_memory.append({"role": "user", "content": user_input})
    # Sliding window: the model sees only the 5 most recent messages.
    recent_messages = conversation_memory[-5:]

    chat_completion = client.chat.completions.create(
        # The system prompt is re-sent on every call, in front of the window.
        messages=[{"role": "system", "content": "You are a helpful assistant."}] + recent_messages,
        model="llama3-8b-8192",
    )

    assistant_response = chat_completion.choices[0].message.content
    print("Assistant:", assistant_response)
    conversation_memory.append({"role": "assistant", "content": assistant_response})

You:  My name is Mujtaba. WHat is your name and Role ?


Assistant: Nice to meet you, Mujtaba! My name is Ada, and I'm a helpful assistant. I'm a computer program designed to assist and communicate with users like you in a helpful and friendly way. My role is to provide information, answer questions, and help with tasks to the best of my abilities. I'm here to assist you with any queries or needs you may have, so please feel free to ask me anything!


You:  Alright, Welcome Ada. So, can you please explain my work schedule for today ?


Assistant: I'm happy to help! However, I apologize, but I don't have any information about your work schedule. As a helpful assistant, I am not connected to your personal or professional life outside of our conversation. I don't have access to your calendar or work schedule.

If you would like to share your schedule with me, I'd be happy to help you remember or plan your day. Alternatively, you can also check your own calendar or ask your employer or HR department for your schedule information.


You:  Alright, that's fine. I want you to lookup , what's the name of bowler in ICC ranking no. 1 ?


Assistant: As of my latest data update, the ICC Test Bowling Rankings (which are the most widely recognized and authoritative rankings in the world of cricket) have Pat Cummins from Australia as the number 1 bowler!


You:  Alright, good. Now, can you recall the first thing I said to you ? In my first prompt what did I say to you ?


Assistant: Your first prompt was: "Alright, Welcome Ada. So, can you please explain my work schedule for today?"


You:  Good, Bye.


Assistant: Goodbye! It was nice chatting with you. If you have any other questions or need assistance in the future, don't hesitate to reach out. Have a great day!


You:  quit
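
The transcript above shows the 5-message window at work: when asked to recall the "first" prompt, the bot quoted the second one, because the real first message had already slid out of the last five. The offline sketch below replays the same shape of conversation with short placeholder strings (`u1`, `a1`, ...) and no API calls. It uses `collections.deque(maxlen=5)`, which is a variation on the list slicing in the cell above: the deque discards old messages for good instead of keeping the full history.

```python
from collections import deque

# A deque with maxlen=5 silently drops the oldest item when a sixth arrives.
memory = deque(maxlen=5)

def remember(role, content):
    memory.append({"role": role, "content": content})

# Placeholder turns standing in for the real conversation.
turns = [
    ("user", "u1"), ("assistant", "a1"),
    ("user", "u2"), ("assistant", "a2"),
    ("user", "u3"), ("assistant", "a3"),
    ("user", "u4"),  # "what was my first prompt?"
]
for role, content in turns:
    remember(role, content)

print([m["content"] for m in memory])  # ['u2', 'a2', 'u3', 'a3', 'u4']
```

By the fourth user message, `u1` is gone, so the oldest user message the model can see is `u2`.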


### 📝 Assignment 4: Preprocessing Function

* Write a function to clean user input:

  * Lowercase text.
  * Remove punctuation.
  * Strip extra spaces.

Test with: `"  HELLo!!!  How ARE you?? "`




In [3]:
import string

def clean_input(text):
    text = text.lower()
    text = text.translate(str.maketrans('', '', string.punctuation))
    text = ' '.join(text.split())
    return text

test_text = "  HELLo!!!  How ARE you?? "
print(clean_input(test_text))

hello how are you
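
One thing to watch: `string.punctuation` only contains ASCII punctuation, so curly quotes (`’`) or an ellipsis character (`…`) typed on a phone would survive `clean_input`. A possible variant, sketched below, uses a regex that removes everything except word characters and whitespace. Note that `\w` also keeps digits and underscores. Both function names are made up for this comparison.

```python
import re
import string

def clean_ascii_only(text):
    """Same approach as clean_input: strips ASCII punctuation only."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())

def clean_unicode(text):
    """Variant: drops anything that is not a word character or whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())

sample = "John’s dog… is FAST!"
print(clean_ascii_only(sample))  # john’s dog… is fast
print(clean_unicode(sample))     # johns dog is fast
```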


### 📝 Assignment 5: Text Preprocessing

* Write a function that:

    * Converts text to lowercase.
    * Removes punctuation & numbers.
    * Removes stopwords (`the, is, and...`).
    * Applies stemming or lemmatization.
    * Removes words shorter than 3 characters.
    * Keeps only nouns, verbs, and adjectives (using POS tagging).

In [9]:
import re
import nltk
from nltk.corpus import stopwords, wordnet
from nltk.stem import WordNetLemmatizer
from nltk import pos_tag, word_tokenize

nltk.download("punkt")
nltk.download("stopwords")
nltk.download("wordnet")
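nltk.download("averaged_perceptron_tagger_eng")
# Note: some recent NLTK releases may also need nltk.download("punkt_tab") for word_tokenize.

def get_wordnet_pos(tag):
    """Map a Penn Treebank tag to a WordNet POS constant (None if not J/V/N)."""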
    if tag.startswith("J"):
        return wordnet.ADJ
    elif tag.startswith("V"):
        return wordnet.VERB
    elif tag.startswith("N"):
        return wordnet.NOUN
    else:
        return None

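def clean_text(text):
    text = text.lower()
    # Keep only ASCII letters and whitespace (drops punctuation and digits).
    text = re.sub(r"[^a-z\s]", "", text)
    tokens = word_tokenize(text)
    stop_words = set(stopwords.words("english"))
    tokens = [w for w in tokens if w not in stop_words]
    # Tag each remaining token so we can filter by part of speech.
    tagged_tokens = pos_tag(tokens, lang="eng")
    lemmatizer = WordNetLemmatizer()
    lemmatized = []
    for word, tag in tagged_tokens:
        wn_tag = get_wordnet_pos(tag)
        # wn_tag is None for anything that is not a noun, verb, or adjective.
        if wn_tag:
            lemma = lemmatizer.lemmatize(word, wn_tag)
            # Drop very short words (fewer than 3 characters).
            if len(lemma) >= 3:
                lemmatized.append(lemma)
    return lemmatized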

sample_text = "The cats are running quickly, but John’s dog is faster!"
print(clean_text(sample_text))



['cat', 'run', 'john', 'dog']


### 📝 Assignment 6: Reflection

* Answer in 2–3 sentences:

    * Why is context memory important in chatbots?
    * Why should beginners always check **API limits and pricing**?

Context memory is crucial in chatbots because it allows them to maintain the thread of a conversation, reference previous user statements, and provide coherent and relevant responses. Without it, each user input would be treated as an isolated query, making the interaction feel broken and frustrating.

Beginners should always check API limits and pricing to avoid unexpected charges and to understand the constraints of the service they are building on. This prevents their application from failing or incurring high costs once it moves beyond a simple prototype.
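
One practical consequence of API limits: when you hit a rate limit, the usual remedy is to wait and retry, doubling the wait each time (exponential backoff). The sketch below shows the pattern with a made-up `RateLimited` exception standing in for whatever error your provider's SDK actually raises. `call_with_backoff` is an illustrative helper, not a library function.

```python
import time

class RateLimited(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""

def call_with_backoff(fn, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on RateLimited, wait base_delay * 2**attempt and retry."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == max_retries:
                raise  # out of retries: let the caller see the error
            sleep(base_delay * (2 ** attempt))

# Demo: a fake API call that fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimited()
    return "ok"

delays = []
print(call_with_backoff(flaky, sleep=delays.append), delays)  # ok [1.0, 2.0]
```

Passing `sleep` as a parameter lets the demo record the delays instead of actually waiting.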

---

### **Hints:**

1) Stemming:
    - Cuts off word endings to get the “root.”
    - Very mechanical → may produce non-real words.
    - Example:
        - "studies" → "studi"
        - "running" → "run"

2) Lemmatization:
    - Smarter → uses vocabulary + grammar rules.
    - Returns a dictionary word (the **lemma**) and works best when it knows the word's part of speech.
    - Example:
        - "studies" → "study"
        - "running" → "run"

3) Part-of-Speech (POS) tagging means labeling each word in a sentence with its grammatical role — like **noun, verb, adjective, adverb, pronoun, etc.**

    - Example:
        - Sentence → *“The cat is sleeping on the mat.”*

    - POS tags →
        - The → Determiner (DT)
        - cat → Noun (NN)
        - is → Verb (VBZ)
        - sleeping → Verb (VBG)
        - on → Preposition (IN)
        - the → Determiner (DT)
        - mat → Noun (NN)

    - **In short:** POS tagging helps machines understand **how words function in a sentence**, which is useful in NLP tasks like machine translation, text classification, and question answering.
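
The difference between hints 1 and 2 can be sketched without any libraries. The stemmer below is a toy set of suffix rules (it is not the Porter algorithm that NLTK provides), and the lemmatizer is a tiny hand-written dictionary standing in for a real vocabulary like WordNet. Both are assumptions made purely for illustration.

```python
def toy_stem(word):
    """Mechanically chop common suffixes; may return non-words."""
    for suffix, replacement in (("ies", "i"), ("ing", ""), ("ed", ""), ("s", "")):
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            stem = word[: len(word) - len(suffix)] + replacement
            # Collapse a doubled final consonant: "runn" -> "run".
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeiou":
                stem = stem[:-1]
            return stem
    return word

# Tiny hand-made lookup table standing in for a real lemma dictionary.
TOY_LEMMAS = {"studies": "study", "running": "run", "mice": "mouse"}

def toy_lemmatize(word):
    """Look the word up; unknown words come back unchanged."""
    return TOY_LEMMAS.get(word, word)

for w in ("studies", "running", "mice"):
    print(w, "->", toy_stem(w), "|", toy_lemmatize(w))
```

The rules get "running" right but turn "studies" into "studi" and leave "mice" alone, while the dictionary lookup returns real words for all three.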


---

### ✅ Recap

This week you learned:

* **LLMs**: Types, uses, must-knows.
* **STT & TTS**: How they connect with LLMs.
* **APIs**: Connecting to LLMs with Groq.
* **Chatbot**: Built your first chatbot foundation, with memory and preprocessing.