### 📝 Assignment 1: LLM Understanding

* Write a short note (3–4 sentences) explaining the difference between **encoder-only, decoder-only, and encoder-decoder LLMs**.
* Give one example usage of each.


Encoder-only models read the whole input at once and build representations for understanding and analyzing text; BERT, for example, is used for tasks like sentiment analysis. Decoder-only models generate text one token at a time from the preceding context; GPT, for example, powers chatbots and story writing. Encoder-decoder models pair an encoder that reads the input with a decoder that writes the output; T5, for example, is used for translation and summarization.
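
One way to make the distinction concrete is the attention mask each family typically uses. The sketch below is illustrative only (the function name `attention_mask` is hypothetical and not taken from any library): encoder-style models such as BERT let every token attend to every other token, while decoder-style models such as GPT use a causal mask so that a token sees only itself and earlier tokens.

```python
def attention_mask(num_tokens, causal):
    """Return a num_tokens x num_tokens grid; 1 means the row token may attend to the column token."""
    return [
        [1 if (not causal or col <= row) else 0 for col in range(num_tokens)]
        for row in range(num_tokens)
    ]


encoder_mask = attention_mask(3, causal=False)  # bidirectional: all ones
decoder_mask = attention_mask(3, causal=True)   # causal: lower triangle only

for name, mask in [("encoder", encoder_mask), ("decoder", decoder_mask)]:
    print(name)
    for row in mask:
        print(" ", row)
```

An encoder-decoder model such as T5 combines the two: a bidirectional mask over the input on the encoder side, and a causal mask on the decoder side, which also attends to the encoder's output.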

### 📝 Assignment 2: STT/TTS Exploration

* Find **one STT model** and **one TTS model** (other than Whisper/Google).
* Write down:

  * What it does.
  * One possible application.

For speech-to-text (STT), Vosk is an open-source toolkit that runs offline and supports multiple languages, which makes it useful for mobile apps that need voice commands without internet access. For text-to-speech (TTS), Amazon Polly converts written text into natural-sounding speech and is commonly used in customer service systems and for narrating audiobooks.

### 📝 Assignment 3: Build a Chatbot with Memory

* Write a Python program that:

  * Takes user input in a loop.
  * Sends it to Groq API.
  * Stores the last 5 messages in memory.
  * Ends when user types `"quit"`.

In [28]:
%pip install groq



In [29]:
from google.colab import userdata
import os

# Load the key from Colab secrets into the environment; never echo it in the cell output
os.environ["GROQ_API_KEY"] = userdata.get('GROQ_API_KEY')

In [30]:
# Alternative to Colab secrets: enter the key at a hidden prompt (never paste it into the notebook or share it)
import os
from getpass import getpass

# getpass() hides what you type, unlike input(), so the key does not end up in the cell output
os.environ["GROQ_API_KEY"] = getpass("Enter your Groq API key: ")


Enter your Groq API key: [redacted]


In [31]:
# Chatbot with memory (last 5 messages)
from groq import Groq

# Initialize Groq client
groq_client = Groq(api_key=os.getenv("GROQ_API_KEY"))

# Store conversation history
conversation_history = []

print("Chatbot ready! Type 'quit' to end the chat.\n")

while True:
    user_message = input("You: ")

    if user_message.strip().lower() == "quit":
        print("Chatbot: Goodbye!")
        break

    # Add user input to history
    conversation_history.append({"role": "user", "content": user_message})

    # Keep only the last 5 messages
    if len(conversation_history) > 5:
        conversation_history = conversation_history[-5:]

    # Send conversation to Groq API
    response = groq_client.chat.completions.create(
        model="llama-3.1-8b-instant",
        messages=conversation_history
    )

    bot_message = response.choices[0].message.content
    print("Chatbot:", bot_message)

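    # Add bot reply to history, then trim again so memory never holds more than 5 messages
    conversation_history.append({"role": "assistant", "content": bot_message})
    conversation_history = conversation_history[-5:]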


Chatbot ready! Type 'quit' to end the chat.

You: Hello How are you?
Chatbot: Hello. I'm just a computer program, so I don't have feelings in the way that humans do. I exist to provide information and help with tasks, and I'm functioning properly. Is there anything I can help with or assist you with today?
You: yes i konw about you tell me about your specific ations
Chatbot: I'm an AI designed to understand and generate human-like text. Here are some of my specific capabilities:

1. **Natural Language Processing (NLP)**: I can understand and interpret human language, including nuances and context.
2. **Language Generation**: I can generate text in various styles, formats, and tones, from simple responses to complex essays.
3. **Conversational Dialogue**: I can engage in multi-turn conversations, using context and understanding to respond to follow-up questions and statements.
4. **Knowledge Retrieval**: I have access to a vast knowledge base, which I can draw upon to provide informatio
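
The five-message memory can also be sketched with `collections.deque`, which drops the oldest entry automatically once the window is full. This is a minimal sketch and not the code that produced the output above; `make_memory` and `remember` are hypothetical helper names, and no API call is made here.

```python
from collections import deque

MAX_MESSAGES = 5  # window size taken from the assignment


def make_memory(max_messages=MAX_MESSAGES):
    """Create a bounded history; appending beyond maxlen discards the oldest message."""
    return deque(maxlen=max_messages)


def remember(memory, role, content):
    """Store one message and return the history as a plain list."""
    memory.append({"role": role, "content": content})
    return list(memory)


memory = make_memory()
for i in range(4):
    remember(memory, "user", f"question {i}")
    remember(memory, "assistant", f"answer {i}")

print(len(memory))           # number of messages kept
print(memory[0]["content"])  # oldest message still in the window
```

The list returned by `remember` has the same shape as `conversation_history`, so it could be passed as the `messages` argument of the chat call.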

### 📝 Assignment 4: Preprocessing Function

* Write a function to clean user input:

  * Lowercase text.
  * Remove punctuation.
  * Strip extra spaces.

Test with: `"  HELLo!!!  How ARE you?? "`


In [14]:
import string

def clean_text(text):
    # lowercase
    text = text.lower()
    # remove all ASCII punctuation, not just "!?.,"
    text = text.translate(str.maketrans("", "", string.punctuation))
    # collapse runs of whitespace and strip the ends
    text = " ".join(text.split())
    return text

# Test with the string from the assignment
sample = "  HELLo!!!  How ARE you?? "
print("Original:", sample)
print("Cleaned:", clean_text(sample))


Original:   HELLo!!!  How ARE you?? 
Cleaned: hello how are you


### 📝 Assignment 5: Text Preprocessing

* Write a function that:

    * Converts text to lowercase.
    * Removes punctuation & numbers.
    * Removes stopwords (`the, is, and...`).
    * Applies stemming or lemmatization.
    * Removes words shorter than 3 characters.
    * Keeps only nouns, verbs, and adjectives (using POS tagging).

In [21]:
!pip -q install nltk

In [26]:
import nltk
import string
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk import pos_tag, word_tokenize

# Download the NLTK resources used below
# (newer NLTK releases may also need "punkt_tab" and "averaged_perceptron_tagger_eng")
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("wordnet")
nltk.download("omw-1.4")
nltk.download("averaged_perceptron_tagger")


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package omw-1.4 to /root/nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!


True

In [32]:
# Function for preprocessing
def preprocess_text(text):
    # Lowercase
    text = text.lower()

    # Tokenize
    tokens = word_tokenize(text)

    # Remove punctuation & numbers
    tokens = [word for word in tokens if word.isalpha()]

    # Remove stopwords
    stop_words = set(stopwords.words("english"))
    tokens = [word for word in tokens if word not in stop_words]

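    # Lemmatization (no POS is passed, so each word is treated as a noun
    # and verb forms such as "transforming" are left unchanged)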
    lemmatizer = WordNetLemmatizer()
    tokens = [lemmatizer.lemmatize(word) for word in tokens]

    # POS tagging
    tagged = pos_tag(tokens)

    # Keep only nouns, verbs, adjectives
    allowed_pos = {"NN", "NNS", "VB", "VBD", "VBG", "VBN", "VBP", "VBZ", "JJ", "JJR", "JJS"}
    filtered = [word for word, tag in tagged if tag in allowed_pos and len(word) >= 3]

    return " ".join(filtered)

#  Test
user_input = input("Enter a sentence: ")
print("Processed:", preprocess_text(user_input))


Enter a sentence: Artificial Intelligence in 2025 is transforming healthcare, making diagnosis faster and more accurate!!!
Processed: artificial intelligence transforming healthcare making diagnosis accurate
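
In the output above, "transforming" and "making" keep their "-ing" form because `lemmatize` is called without a part of speech and therefore treats each word as a noun. A possible refinement, sketched here under assumptions, is to tag first and pass a WordNet POS letter to the lemmatizer. The helper names `penn_to_wordnet` and `lemmatize_tagged` are hypothetical; the letters "n", "v", "a", "r" are the values WordNet uses for noun, verb, adjective, and adverb.

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag (as returned by nltk.pos_tag) to a WordNet POS letter."""
    if tag.startswith("J"):
        return "a"  # adjective
    if tag.startswith("V"):
        return "v"  # verb
    if tag.startswith("R"):
        return "r"  # adverb
    return "n"      # fall back to noun, the lemmatizer's own default


def lemmatize_tagged(tagged_tokens, lemmatizer):
    """Lemmatize (word, tag) pairs with any object exposing lemmatize(word, pos=...)."""
    return [
        lemmatizer.lemmatize(word, pos=penn_to_wordnet(tag))
        for word, tag in tagged_tokens
    ]


# With NLTK, this would be called roughly as:
#   tagged = pos_tag(word_tokenize(text.lower()))
#   lemmas = lemmatize_tagged(tagged, WordNetLemmatizer())
```

Tagging the full sentence before removing stopwords may also give the tagger more context to work with, at the cost of reordering the steps in `preprocess_text`.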


### 📝 Assignment 6: Reflection

* Answer in 2–3 sentences:

    * Why is context memory important in chatbots?
    * Why should beginners always check **API limits and pricing**?

In my chatbot code, I stored the last five messages so the model could give more meaningful replies. Context memory is important because it lets the chatbot continue the conversation smoothly instead of repeating itself or losing track of what was said.

Since my chatbot connects to the Groq API, every message uses up tokens from my plan. Beginners should check API limits and pricing so they don’t run out of free quota or get charged unexpectedly while testing their chatbot.
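
One simple way to keep an eye on usage while testing is to add up the tokens reported for each request and stop before a self-imposed limit. The sketch below is only an illustration: `TokenBudget` is a hypothetical helper, the limit of 1000 is made up and is not Groq's actual quota, and the per-request counts are assumed to come from a field such as `response.usage.total_tokens` (check the Groq SDK documentation for the exact name).

```python
class TokenBudget:
    """Track tokens spent across requests against a self-imposed limit."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def record(self, total_tokens):
        """Add one request's token count and return what is left."""
        self.used += total_tokens
        return self.remaining()

    def remaining(self):
        return max(self.limit - self.used, 0)

    def exhausted(self):
        return self.used >= self.limit


budget = TokenBudget(limit=1000)  # hypothetical limit for a test session
budget.record(350)                # tokens reported for the first request
budget.record(400)                # tokens reported for the second request
print(budget.remaining())
```

In the chat loop, `budget.exhausted()` could be checked before each request so the program stops instead of sending more messages.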