In [38]:
!pip install -U langchain-openai langchain-google-genai langchain-groq

Collecting langchain-groq
  Downloading langchain_groq-1.1.0-py3-none-any.whl.metadata (2.4 kB)
Collecting groq<1.0.0,>=0.30.0 (from langchain-groq)
  Downloading groq-0.37.1-py3-none-any.whl.metadata (16 kB)
Downloading langchain_groq-1.1.0-py3-none-any.whl (19 kB)
Downloading groq-0.37.1-py3-none-any.whl (137 kB)
Installing collected packages: groq, langchain-groq
Successfully installed groq-0.37.1 langchain-groq-1.1.0


In [49]:
from langchain.chat_models import init_chat_model
import os
from google.colab import userdata
from openai import OpenAI
from IPython.display import Markdown, display

In [45]:
os.environ["GROQ_API_KEY"] = userdata.get("GROQ_API_KEY")

llm = init_chat_model(model="openai/gpt-oss-120b", model_provider="groq")

In [50]:
display(Markdown(llm.invoke('what is the capital of india').content))

The capital of India is **New Delhi**.

In [51]:
inputs = [
    'Explain langchain in one line',
    'explain what is llm',
    'explain what is vector embedding'
]

response = llm.batch(inputs)
for res in response:
    display(Markdown(res.content))

LangChain is a framework that lets developers build LLM‑powered applications by chaining together language model calls, data sources, and tools in a modular, programmable workflow.

**LLM = Large Language Model**

A *large language model* (LLM) is a type of artificial‑intelligence system that has been trained on massive amounts of text data so that it can understand and generate human‑like language. Here’s a breakdown of what that means:

---

## 1. Core Idea
- **“Language model”**: A statistical model that predicts the next word (or token) in a sequence given the words that came before it. By repeatedly making these predictions, it can produce coherent sentences, answer questions, translate text, and more.
- **“Large”**: Refers to two things:
  1. **Scale of data** – billions to trillions of words from books, articles, webpages, code, etc.
  2. **Scale of the model** – millions to billions (or even trillions) of parameters, which are the internal numeric weights the model learns during training.
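
To make the next-token idea concrete, here is a minimal sketch of a bigram model: it predicts the next word purely from counts of which word followed which in a tiny corpus. This is only an illustration. Real LLMs use neural networks over sub-word tokens rather than count tables, and the corpus and the helper names `predict_next` and `generate` are made up for this example.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, purely for illustration
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count how often each word follows each preceding word
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = follow_counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def generate(start, max_words=5):
    """Greedily extend `start` one predicted word at a time."""
    words = [start]
    for _ in range(max_words):
        nxt = predict_next(words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)

print(predict_next("the"))
print(generate("sat", max_words=2))
```

Repeating the single-step prediction is what turns a next-word predictor into a text generator, which is the same loop an LLM runs at a much larger scale.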

---

## 2. How LLMs Are Built

| Step | What Happens |
|------|--------------|
| **Data collection** | Gather a huge, diverse corpus of text (e.g., web crawls, books, scientific papers). |
| **Tokenization** | Break the text into manageable units (words, sub‑words, or characters) called *tokens*. |
| **Model architecture** | Most modern LLMs use the **Transformer** architecture (introduced in 2017). It relies on self‑attention mechanisms that let the model weigh the relevance of each token to every other token. |
| **Training** | The model learns by trying to predict masked or next tokens and adjusting its parameters to reduce prediction error. This is done on powerful GPUs/TPUs for days or weeks. |
| **Fine‑tuning (optional)** | After the general training, the model can be further trained on a narrower dataset or with reinforcement learning from human feedback (RLHF) to make it safer or more task‑specific. |

---

## 3. What LLMs Can Do

- **Text generation** – write essays, stories, code snippets, poetry, etc.
- **Question answering** – retrieve or synthesize information from its training knowledge.
- **Summarization** – condense long documents into shorter abstracts.
- **Translation** – convert text between languages.
- **Conversation** – act as a chatbot or virtual assistant.
- **Classification & extraction** – label sentiment, pull out entities, detect topics.
- **Code assistance** – autocomplete, debug, or explain programming code.

---

## 4. Why “Large” Matters

- **Pattern capture**: Bigger models with more parameters can learn subtler statistical regularities, idioms, reasoning steps, and even some world knowledge.
- **Generalization**: With enough data, LLMs can perform tasks they were never explicitly trained for (zero‑shot or few‑shot learning).
- **Quality vs. cost**: Larger models tend to produce higher‑quality output but require more compute, memory, and energy, making them expensive to train and run.

---

## 5. Limitations & Risks

| Issue | Explanation |
|-------|-------------|
| **Hallucination** | The model may generate plausible‑looking but factually incorrect statements. |
| **Bias** | It inherits biases present in its training data (gender, racial, cultural, etc.). |
| **Data privacy** | If training data contains personal or copyrighted material, there are legal/ethical concerns. |
| **Resource intensive** | Training and even inference can consume large amounts of electricity and specialized hardware. |
| **Interpretability** | Understanding why a model made a specific prediction is still an open research problem. |

---

## 6. Popular LLM Families (as of 2025)

| Model | Parameters | Notable Traits |
|-------|------------|----------------|
| **GPT‑4** (OpenAI) | Not publicly disclosed | Strong reasoning, multimodal (text+image). |
| **Claude** (Anthropic) | Not publicly disclosed | Emphasizes safety through “Constitutional AI” training. |
| **LLaMA 2** (Meta) | 7 B – 70 B | Openly available weights, widely used for research. |
| **Gemma** (Google DeepMind) | 2 B – 7 B | Optimized for efficiency on consumer hardware. |
| **Mistral** (Mistral AI) | 7 B (Mistral 7B) | Focus on instruction‑following and fine‑tuning. |

---

## 7. Quick Analogy

Think of an LLM like a **very large, highly experienced autocomplete**. When you type a few words, it draws on everything it has “read” to guess the most likely continuation, but it also has learned patterns that let it *reason* about what you might want—much like a seasoned writer who can finish a story from a single opening line.

---

## 8. Bottom‑Line Summary

- **LLM = Large Language Model** → a neural network trained on massive text corpora.
- It predicts the next token, enabling it to generate and understand language.
- Size (data + parameters) gives it impressive capabilities, but also brings cost, bias, and reliability challenges.
- LLMs are now foundational tools for chatbots, writing assistants, code helpers, translation services, and many other AI‑driven applications.

### What a Vector Embedding Is

A **vector embedding** (often just called an *embedding*) is a way of representing a piece of data—such as a word, a sentence, an image, a user, or even a graph node—as a point in a continuous, high‑dimensional numeric space. In that space, the geometry (distances and directions) captures the relationships and similarities that are important for the task at hand.

In other words, an embedding is a **dense, fixed‑length vector of real numbers** that encodes the meaning, structure, or behavior of the original object in a form that machine‑learning models can easily work with.

---

## 1. Why Do We Need Embeddings?

| Traditional Representation | Embedding Representation |
|----------------------------|--------------------------|
| **Sparse / high‑dimensional** (e.g., one‑hot vectors for words: 1 in a 50 000‑dimensional vector, 0 elsewhere) | **Dense / low‑dimensional** (e.g., 300‑dimensional real‑valued vector) |
| No notion of similarity—two different words are orthogonal | Similar items are *close* (small Euclidean / cosine distance) |
| Hard for models to learn patterns from raw high‑dim data | Enables efficient learning, generalization, and transfer across tasks |

Embeddings turn raw, discrete symbols into a smooth numeric landscape where “nearby” points mean “semantically or structurally similar”.

---

## 2. How Are Embeddings Obtained?

### 2.1. **Learning from Data (Data‑Driven Embeddings)**  
Most modern embeddings are **learned** by optimizing a neural network on a large corpus of examples.

| Method | Typical Domain | Core Idea |
|--------|----------------|-----------|
| **Word2Vec (CBOW / Skip‑gram)** | Text | Predict a word from its context (or vice‑versa). Words that appear in similar contexts get similar vectors. |
| **GloVe** | Text | Factorize a word‑co‑occurrence matrix; the dot product of two word vectors approximates their log‑co‑occurrence probability. |
| **FastText** | Text | Like Word2Vec but also uses sub‑word (character n‑gram) information, giving better representations for rare or misspelled words. |
| **BERT / RoBERTa / GPT** | Text (sentence / token level) | Train deep Transformers on masked‑language‑model or autoregressive objectives; the hidden states become contextual embeddings. |
| **DeepWalk / node2vec** | Graphs | Perform random walks on a graph, treat walks like sentences, and apply Word2Vec to learn node embeddings. |
| **ResNet / Vision Transformers** | Images | Pass an image through a CNN or ViT; the activation of a penultimate layer (often after a pooling step) is the image embedding. |
| **Autoencoders / Variational Autoencoders** | Any modality | Encode data into a bottleneck layer (the latent vector) and decode it back; the bottleneck learns a compact embedding. |
| **Metric‑learning losses (Triplet, Contrastive)** | Faces, product recommendations, etc. | Force embeddings of similar items to be close and dissimilar items to be far apart. |

### 2.2. **Hand‑Crafted / Pre‑Defined Embeddings**  
Before deep learning, people sometimes built embeddings manually:

* **One‑hot vectors** (very sparse, no similarity information).  
* **TF‑IDF vectors** (sparse, weighted word counts).  
* **Principal Component Analysis (PCA) / SVD** of co‑occurrence matrices.

These are still useful in low‑resource settings but are far less expressive than learned dense embeddings.

---

## 3. What Does an Embedding Look Like?

A concrete example: the word **“king”** might be represented (after training a 300‑dimensional Word2Vec model) as:

```
[ 0.215, -0.134, 0.487, ..., -0.021 ]   (300 numbers)
```

Two important properties:

* **Dimensionality** – The length of the vector (e.g., 50, 128, 768, 1024) is a hyper‑parameter. Higher dimensions can capture more nuance but risk over‑fitting and higher computational cost.
* **Semantic arithmetic** – Because the space is learned, simple vector algebra often reveals relationships:
  ```
  embedding("king") - embedding("man") + embedding("woman") ≈ embedding("queen")
  ```
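
The arithmetic above can be tried on a toy scale. The sketch below uses hand-made 2-D vectors (one axis for "royalty", one for "gender") instead of learned embeddings, so it only illustrates the mechanics; the vectors, the `toy` dictionary, and the `analogy` helper are assumptions made up for this example.

```python
import math

# Hand-made toy vectors: [royalty, gender]. Real embeddings are learned, not designed.
toy = {
    "king":  [1.0,  1.0],
    "queen": [1.0, -1.0],
    "man":   [0.0,  1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.0],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(toy[a], toy[b], toy[c])]
    candidates = [w for w in toy if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(toy[w], target))

print(analogy("king", "man", "woman"))
```

With learned embeddings the match is approximate rather than exact, which is why the result is usually found by a nearest-neighbor search like the one in `analogy`.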

---

## 4. How Do We Use Embeddings?

| Task | How Embeddings Help |
|------|---------------------|
| **Similarity / Retrieval** | Compute cosine similarity between vectors to find “nearest neighbors” (e.g., search for similar documents, images, or users). |
| **Classification** | Feed embeddings as input features to a downstream classifier (e.g., sentiment analysis from sentence embeddings). |
| **Clustering** | Group items by distance in embedding space (e.g., topic discovery, product segmentation). |
| **Transfer Learning** | Re‑use a pre‑trained embedding (e.g., BERT) as a starting point for a new task, drastically reducing required labeled data. |
| **Recommendation** | Represent users and items in the same space; recommend items whose vectors are close to a user’s vector. |
| **Graph Analysis** | Node embeddings enable link prediction, community detection, and graph‑based recommendation. |

---

## 5. Key Intuitions & Analogies

1. **Map Analogy** – Think of each object as a city. A raw one‑hot representation tells you only the city’s name, not where it is on a map. An embedding gives you latitude/longitude (coordinates). Nearby cities (vectors) are likely to share cultural, economic, or geographic traits.

2. **Language Translation** – If you train separate embeddings for English and French words on comparable corpora, you can often learn a linear transformation that approximately aligns the two spaces. This suggests that much of the geometry of meaning is shared across languages.

3. **Compress‑and‑Explain** – Embeddings are a compressed summary that keeps the “essence” needed for the downstream task, discarding irrelevant details (e.g., exact pixel values of an image after a few convolutional layers).

---

## 6. Practical Tips

| Situation | Recommendation |
|-----------|----------------|
| **Small dataset** | Use a pre‑trained embedding (e.g., GloVe, FastText, BERT) and fine‑tune only a small head on top. |
| **Domain‑specific vocabulary** | Fine‑tune a language model on your domain text, or train FastText on your own corpus to capture rare terms. |
| **Memory constraints** | Choose a lower‑dimensional embedding (e.g., 64‑dim) or apply dimensionality reduction (PCA, quantization). |
| **Need for interpretability** | Use techniques like **t‑SNE** or **UMAP** to visualize the embedding space; inspect nearest neighbors for sanity checks. |
| **Cross‑modal tasks** (e.g., image‑text retrieval) | Align embeddings from different modalities into a shared space using contrastive learning (e.g., CLIP). |

---

## 7. Common Pitfalls

1. **Assuming Euclidean distance is always the right metric** – For many embeddings, **cosine similarity** works better because magnitude differences are less meaningful.
2. **Over‑relying on a single embedding** – Different models capture different aspects (syntax vs. semantics, local vs. global context). Ensemble or multi‑view embeddings can improve robustness.
3. **Ignoring bias** – Embeddings inherit biases present in the training data (gender, racial, cultural). Use debiasing techniques (e.g., projection onto a neutral subspace) when fairness matters.
4. **Treating embeddings as static** – Contextual models (BERT, GPT) produce **different vectors for the same word** depending on surrounding text. Make sure you extract the appropriate token or sentence representation.
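
The first pitfall is easy to see with numbers. In this minimal sketch (toy 2-D vectors chosen for illustration, not real embeddings), one vector points in the same direction as the reference but is ten times longer, while another has a similar length but a different direction. Euclidean distance and cosine similarity disagree about which one is "closer".

```python
import math

def euclidean(a, b):
    """Straight-line distance; sensitive to vector magnitude."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; ignores magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

short = [1.0, 2.0]
long_ = [10.0, 20.0]   # same direction as `short`, 10x the magnitude
other = [2.0, 1.0]     # similar magnitude to `short`, different direction

print("short vs long_:", euclidean(short, long_), cosine_similarity(short, long_))
print("short vs other:", euclidean(short, other), cosine_similarity(short, other))
```

Euclidean distance ranks `other` as the nearer vector, while cosine similarity ranks `long_` as the better match, so the choice of metric changes the retrieval result.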

---

## 8. A Minimal Code Sketch (Python)

Below is a tiny example using the popular `sentence-transformers` library to obtain a **sentence embedding**:

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('all-MiniLM-L6-v2')   # 384-dim embeddings, fast & small

sentences = [
    "The cat sat on the mat.",
    "A feline was lounging on a rug."
]

embeddings = model.encode(sentences, normalize_embeddings=True)  # shape (2, 384)

# Cosine similarity (since vectors are already normalized)
similarity = util.cos_sim(embeddings[0], embeddings[1])
print(f"Similarity: {similarity.item():.4f}")   # values closer to 1.0 mean the sentences are closer in meaning
```

The `embeddings` variable now holds two dense vectors that you can store, index (e.g., with FAISS), or feed into downstream models.

---

## 9. TL;DR (One‑Sentence Summary)

A vector embedding is a learned, dense numeric representation that maps discrete objects (words, images, users, etc.) into a continuous space where geometric closeness reflects semantic or structural similarity, enabling efficient similarity search, classification, and transfer learning across many AI tasks.
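
Back to the `llm.batch(inputs)` call that produced the three answers above: conceptually, a batch call sends several independent prompts and returns the results in the same order as the inputs. The sketch below imitates that with a thread pool and a fake model function. It is an assumption about the general pattern, not LangChain's actual implementation; `fake_llm_call` and `toy_batch` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fake_llm_call(prompt):
    """Hypothetical stand-in for a network call to a model."""
    time.sleep(0.05)  # simulate request latency
    return f"answer to: {prompt}"

def toy_batch(prompts, max_workers=4):
    """Run all calls concurrently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_llm_call, prompts))

toy_inputs = [
    "Explain langchain in one line",
    "explain what is llm",
    "explain what is vector embedding",
]
toy_results = toy_batch(toy_inputs)
for res in toy_results:
    print(res)
```

Because the calls overlap in time, the total wait is closer to the slowest single call than to the sum of all calls.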

In [54]:
from langchain_groq import ChatGroq

llm = ChatGroq(model_name="openai/gpt-oss-120b", temperature=0.7)
display(Markdown(llm.invoke('what is the capital of India').content))

The capital of India is **New Delhi**.

### Prompt Template

In [61]:
from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    "Translate the following sentence from {source_language} to {target_language}: {sentence}"
)
template

PromptTemplate(input_variables=['sentence', 'source_language', 'target_language'], input_types={}, partial_variables={}, template='Translate the following sentence from {source_language} to {target_language}: {sentence}')
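
The repr above lists the three placeholders found in the template string. As a rough mental model only (this is a standard-library sketch, not how `PromptTemplate` is implemented), a template is a format string plus the set of names it expects. `placeholder_names` and `fill` are hypothetical helpers written for this illustration.

```python
from string import Formatter

template_text = (
    "Translate the following sentence from {source_language} "
    "to {target_language}: {sentence}"
)

def placeholder_names(text):
    """Collect the {name} fields in a format string, sorted alphabetically."""
    return sorted({field for _, field, _, _ in Formatter().parse(text) if field})

def fill(text, **values):
    """Substitute values, failing loudly if a placeholder is missing."""
    missing = set(placeholder_names(text)) - set(values)
    if missing:
        raise KeyError(f"missing variables: {sorted(missing)}")
    return text.format(**values)

print(placeholder_names(template_text))
print(fill(template_text, source_language="English",
           target_language="Bengali", sentence="how are you?"))
```

Checking for missing names before formatting gives a clearer error than a bare `KeyError` from `str.format`, which is one practical reason to wrap prompts in a template object.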

In [63]:
msg = template.invoke({'source_language': 'English', 'target_language': 'Bengali', 'sentence': 'how are you?'})

In [65]:
display(Markdown(llm.invoke(msg).content))

আপনি কেমন আছেন?

### Output Parser

In [96]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()
parser.invoke(llm.invoke(msg))

'**Bengali translation:**  \n- Formal / polite: **আপনি কেমন আছেন?**  \n- Informal / casual: **তুমি কেমন আছো?**'

In [100]:
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field

# temperature=0 keeps the output as deterministic as possible, which helps JSON parsing
llm = ChatGroq(model_name="openai/gpt-oss-120b", temperature=0)

class MyOutputSchema(BaseModel):
    query: str = Field(description="User query")
    result: str = Field(description="Full LLM-generated response")

parser = JsonOutputParser(pydantic_object=MyOutputSchema)

# The template has two placeholders, so both are declared as input variables
prompt = PromptTemplate(
    input_variables=["query", "format_instructions"],
    template="Answer the following question.\n{query}\n{format_instructions}",
)

filled_prompt = prompt.format(
    query="give me one movie name and its release year",
    format_instructions=parser.get_format_instructions(),
)

response = llm.invoke(filled_prompt)
structured_data = parser.invoke(response.content)
structured_data

{'query': 'give me one movie name and its release year',
 'result': 'The Shawshank Redemption (1994)'}
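
To show what a JSON output parser has to deal with, here is a rough standard-library sketch: models often wrap JSON in a Markdown code fence or add a sentence before it, so the parser must pull out the payload before calling `json.loads`. This is not the `JsonOutputParser` implementation; `parse_json_reply` and the sample reply are made up for illustration.

```python
import json
import re

FENCE = "`" * 3  # built this way to avoid a literal fence inside this example

def parse_json_reply(text, required_keys=()):
    """Extract a JSON object from model text, tolerating a Markdown code fence."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    payload = match.group(1) if match else text
    data = json.loads(payload.strip())
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"reply is missing keys: {missing}")
    return data

# Made-up model reply, for illustration only
reply = (
    "Here you go:\n" + FENCE + "json\n"
    '{"query": "one movie", "result": "The Shawshank Redemption (1994)"}\n'
    + FENCE
)
print(parse_json_reply(reply, required_keys=("query", "result")))
```

The format instructions sent in the prompt and the key check on the way back are two halves of the same contract: one asks the model for a shape, the other verifies it.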

### LangChain Expression Language (LCEL)

In [106]:
chain = prompt | llm | parser

# Run chain
result = chain.invoke({
    "query": "give me one movie name and its release year",
    "format_instructions": parser.get_format_instructions(),
})
result

{'query': 'give me one movie name and its release year',
 'result': 'The Shawshank Redemption (1994)'}
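
The `prompt | llm | parser` syntax relies on Python's `|` operator being overloaded so that each step's output becomes the next step's input. The sketch below shows that idea with a tiny class; it is a conceptual stand-in, not LangChain's `Runnable`, and `Step`, `fake_prompt`, `fake_llm`, and `fake_parser` are hypothetical names.

```python
class Step:
    """Minimal stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that feeds a's output into b
        return Step(lambda value: other.invoke(self.invoke(value)))

# Hypothetical stand-ins for the prompt, model, and parser
fake_prompt = Step(lambda d: f"answer the following question.\n{d['query']}")
fake_llm = Step(lambda text: text.upper())          # pretend model
fake_parser = Step(lambda text: {"result": text})   # pretend parser

toy_chain = fake_prompt | fake_llm | fake_parser
print(toy_chain.invoke({"query": "name one movie"}))
```

Because the composed object is itself a `Step`, chains can be nested or extended with further `|` calls, which is the property that makes the pipe syntax convenient.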