# Encoder vs Decoder: BERT, GPT-2, and Gemini 2.5 Flash

This notebook is designed for teaching:

- **BERT** – encoder-only (understands text → embeddings)
- **GPT-2** – decoder-only (generates text)
- **Gemini 2.5 Flash** – used both as:
  - an **encoder-like** model (via text embeddings)
  - a **decoder-like** model (via text generation)

Students will see side-by-side code and complete small assignments.


## 0. Install Dependencies

In [None]:
!pip install -q transformers torch google-generativeai

## 0.1 Common Imports

In [None]:
from transformers import AutoTokenizer, AutoModel, AutoModelForCausalLM
import torch
import torch.nn.functional as F
import numpy as np
import pandas as pd
import google.generativeai as genai

## 1. BERT – Encoder-Only Model (Understands Text)

BERT reads a full sentence and converts it into a **vector representation** (embedding).
It is **not** designed to generate new text.


### 1.1 Load BERT

In [None]:
# Load BERT (encoder-only)
bert_name = "bert-base-uncased"
bert_tokenizer = AutoTokenizer.from_pretrained(bert_name)
bert_model = AutoModel.from_pretrained(bert_name)

print("Loaded BERT model:", bert_name)


Loaded BERT model: bert-base-uncased


### 1.2 Encode Example Sentences

We will encode two similar sentences and inspect their embeddings.


In [None]:
sentences_bert = [
    "I like playing football.",
    "I enjoy playing soccer."
]

# Tokenize: convert text → token IDs
encoded_bert = bert_tokenizer(
    sentences_bert,
    padding=True,
    truncation=True,
    return_tensors="pt"
)

with torch.no_grad():
    outputs_bert = bert_model(**encoded_bert)

# Take the [CLS] token embedding (position 0) as a sentence embedding
bert_cls_embeddings = outputs_bert.last_hidden_state[:, 0, :]  # [batch, hidden]

print("BERT sentence embedding shape:", bert_cls_embeddings.shape)

BERT sentence embedding shape: torch.Size([2, 768])
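The cell above uses the `[CLS]` vector as the sentence embedding. A common alternative is to average the token vectors while skipping padding positions. The sketch below illustrates that idea on made-up numbers in plain Python; `mean_pool`, `tokens`, and `mask` are hypothetical names for illustration, not part of the notebook's BERT pipeline.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (real tokens, not padding)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            for i, value in enumerate(vec):
                total[i] += value
            count += 1
    return [t / count for t in total]

# Toy example: two real tokens and one padding token (values are made up).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]  # the last position is padding and must be ignored

pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

With real BERT outputs, the same masking would be applied to `last_hidden_state` using the tokenizer's `attention_mask`.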


### 1.3 Measure Sentence Similarity

In [None]:
bert_sim = F.cosine_similarity(bert_cls_embeddings[0], bert_cls_embeddings[1], dim=0)
print("Cosine similarity between the two BERT sentence embeddings:", bert_sim.item())

Cosine similarity between the two BERT sentence embeddings: 0.9696494936943054


**Discussion:**

- BERT converts sentences into vectors (embeddings).
- Similar sentences → **high cosine similarity**.
- BERT is an **encoder**: it understands text, but does **not** generate new text with `.generate()`.
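To make the similarity score concrete, here is a minimal dependency-free sketch of cosine similarity, the dot product divided by the product of the vector lengths. The three-dimensional vectors are made up for illustration and are far smaller than BERT's 768-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (hypothetical values, not real model outputs).
v_football = [0.9, 0.1, 0.2]
v_soccer = [0.8, 0.2, 0.1]
v_stocks = [0.1, 0.9, 0.3]

sim_close = cosine_similarity(v_football, v_soccer)
sim_far = cosine_similarity(v_football, v_stocks)
print("similar pair:  ", round(sim_close, 4))
print("unrelated pair:", round(sim_far, 4))
```

Vectors pointing in nearly the same direction score close to 1, and orthogonal vectors score 0.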


## 2. GPT-2 – Decoder-Only Model (Generates Text)

GPT-2 takes a **prompt** and generates a continuation by predicting the **next token**
over and over again (autoregressive generation).
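The loop behind autoregressive generation can be sketched without any model. In the toy example below, a hypothetical lookup table `NEXT_TOKEN_SCORES` stands in for the network; it is a simplification, since a real model such as GPT-2 conditions on the whole prefix rather than only the last token.

```python
# Hypothetical next-token scores: for each current token, candidate next tokens.
NEXT_TOKEN_SCORES = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.7, "dog": 0.3},
    "cat": {"sat": 0.8, "ran": 0.2},
    "sat": {"</s>": 0.9, "down": 0.1},
}

def greedy_generate(start, max_new_tokens=10):
    """Repeatedly append the highest-scoring next token (greedy decoding)."""
    tokens = [start]
    for _ in range(max_new_tokens):
        candidates = NEXT_TOKEN_SCORES.get(tokens[-1])
        if not candidates:
            break  # no known continuation
        next_token = max(candidates, key=candidates.get)
        if next_token == "</s>":
            break  # end-of-sequence token stops generation
        tokens.append(next_token)
    return tokens

generated = greedy_generate("<s>")
print(generated)  # ['<s>', 'the', 'cat', 'sat']
```

Each step feeds the output so far back in as input, which is what "autoregressive" means.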


### 2.1 Load GPT-2

In [None]:
# Load GPT-2 (decoder-only)
gpt_name = "gpt2"
gpt_tokenizer = AutoTokenizer.from_pretrained(gpt_name)
gpt_model = AutoModelForCausalLM.from_pretrained(gpt_name)

print("Loaded GPT-2 model:", gpt_name)


Loaded GPT-2 model: gpt2


### 2.2 Generate Text from a Prompt

In [None]:
gpt_prompt = "In the future, artificial intelligence will"

# Tokenize prompt
gpt_inputs = gpt_tokenizer(gpt_prompt, return_tensors="pt")

# Generate continuation
gpt_output_ids = gpt_model.generate(
    **gpt_inputs,
    max_length=40,      # total tokens (prompt + generated)
    do_sample=False,    # deterministic
    num_beams=1
)

gpt_generated_text = gpt_tokenizer.decode(gpt_output_ids[0], skip_special_tokens=True)

print("GPT-2 PROMPT:")
print(gpt_prompt)
print("\nGPT-2 CONTINUATION:")
print(gpt_generated_text)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


GPT-2 PROMPT:
In the future, artificial intelligence will

GPT-2 CONTINUATION:
In the future, artificial intelligence will be able to do things like search for information about people, and to do things like search for information about people.

"We're going to see a lot


**Discussion:**

- GPT-2 is a **decoder-only** language model.
- It uses **autoregressive generation**: at each step it predicts the next token.
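- Decoder-only models suit open-ended generation tasks such as **chat, story generation, code generation, and autocomplete**, although the small GPT-2 checkpoint used here often produces repetitive or off-topic text.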
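The generation cell above uses `do_sample=False`, so the most likely token is always chosen. The alternative is to sample from the predicted distribution, usually after scaling the logits by a temperature. The sketch below shows the arithmetic on made-up logits; `logits` and `vocab` are hypothetical and not taken from GPT-2.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a lower temperature sharpens the distribution."""
    scaled = [value / temperature for value in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next tokens.
vocab = ["be", "have", "never"]
logits = [2.0, 1.0, 0.1]

greedy_choice = vocab[logits.index(max(logits))]  # deterministic pick
probs_sharp = softmax(logits, temperature=0.5)    # more peaked
probs_flat = softmax(logits, temperature=2.0)     # closer to uniform

rng = random.Random(0)  # fixed seed so the sketch is repeatable
sampled = rng.choices(vocab, weights=softmax(logits), k=5)

print("greedy:", greedy_choice)
print("T=0.5:", [round(p, 3) for p in probs_sharp])
print("T=2.0:", [round(p, 3) for p in probs_flat])
print("sampled:", sampled)
```

In `transformers`, the corresponding `generate()` arguments are `do_sample=True` and `temperature`.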


## 3. Gemini 2.5 Flash – Encoder-like and Decoder-like Behavior

We will use Gemini in two ways:

1. **Encoder-like**: create embeddings and compute sentence similarity.
2. **Decoder-like**: generate text from a prompt.

> ✅ Make sure you have a **Gemini API key** (Free Tier) from Google AI Studio.


### 3.1 Configure Gemini API Key

In [None]:
# 👉 IMPORTANT: Replace YOUR_API_KEY_HERE with your actual Gemini API key.
genai.configure(api_key="YOUR_API_KEY_HERE")

# Load the generative model (decoder-like)
gemini_model = genai.GenerativeModel("gemini-2.5-flash")

print("Gemini model ready: gemini-2.5-flash")

Gemini model ready: gemini-2.5-flash
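Pasting a key directly into a cell makes it easy to leak when the notebook is shared. One possible alternative is to read it from an environment variable. The sketch below assumes a variable named `GEMINI_API_KEY`; that name and the `load_api_key` helper are hypothetical choices for this example, not something the Gemini SDK requires.

```python
import os

def load_api_key(env_var="GEMINI_API_KEY"):
    """Return the API key stored in an environment variable, or fail loudly."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running this notebook."
        )
    return key

# Usage in the notebook (commented out so this sketch runs without a key):
# genai.configure(api_key=load_api_key())
```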


### 3.2 Encoder-like: Text Embeddings and Similarity

In [None]:
embed_model_name = "models/text-embedding-004"

gemini_text1 = "I like playing football."
gemini_text2 = "I enjoy playing soccer."

gemini_emb1 = genai.embed_content(model=embed_model_name, content=gemini_text1)["embedding"]
gemini_emb2 = genai.embed_content(model=embed_model_name, content=gemini_text2)["embedding"]

def cosine(a, b):
    a = np.array(a)
    b = np.array(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

gemini_sim = cosine(gemini_emb1, gemini_emb2)
print("Gemini embedding dimension:", len(gemini_emb1))
print("Cosine similarity between the two Gemini embeddings:", gemini_sim)

Gemini embedding dimension: 768
Cosine similarity between the two Gemini embeddings: 0.8197920340047521


### 3.3 Decoder-like: Text Generation

In [None]:
gemini_prompt = "In the future, artificial intelligence will"

gemini_response = gemini_model.generate_content(gemini_prompt)

print("GEMINI PROMPT:")
print(gemini_prompt)
print("\nGEMINI CONTINUATION:")
print(gemini_response.text)

GEMINI PROMPT:
In the future, artificial intelligence will

GEMINI CONTINUATION:
In the future, artificial intelligence will be **transformative**, but its exact trajectory and impact are subject to ongoing development, ethical considerations, and societal choices. Here's a breakdown of what's widely anticipated:

1.  **Become More Capable and Pervasive:**
    *   **Smarter and More Autonomous:** AI will continue to improve its ability to learn, reason, and make decisions, often without human intervention. This will range from highly specialized narrow AI to more general-purpose systems.
    *   **Better Understanding of Context and Nuance:** Current AI often struggles with genuine understanding. Future AI will likely develop a deeper grasp of human language, emotion, and complex social contexts, leading to more natural and helpful interactions.
    *   **Embodied AI:** AI will be increasingly integrated into physical robots, drones, and autonomous vehicles, allowing it to interact wit

**Discussion:**

- `text-embedding-004` turns text into vectors → encoder-like behavior.
- `gemini-2.5-flash` generates text from a prompt → decoder-like behavior.


## 4. Side-by-Side Comparison

### 4.1 Unified Helper Functions

These small helpers make it easy to compare encoder vs decoder behavior across models.


In [None]:
def bert_encode(sentence: str):
    tokens = bert_tokenizer(sentence, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        out = bert_model(**tokens)
    return out.last_hidden_state[:, 0, :].squeeze(0)  # CLS embedding

def gpt_generate(prompt: str, max_length: int = 40):
    tokens = gpt_tokenizer(prompt, return_tensors="pt")
    out_ids = gpt_model.generate(**tokens, max_length=max_length, do_sample=False)
    return gpt_tokenizer.decode(out_ids[0], skip_special_tokens=True)

def gemini_embed(sentence: str):
    return genai.embed_content(model="models/text-embedding-004", content=sentence)["embedding"]

def gemini_generate(prompt: str):
    resp = gemini_model.generate_content(prompt)
    return resp.text

### 4.2 Try All Three Models on Similar Tasks

In [None]:
s1 = "I love learning about transformers."
s2 = "I enjoy studying deep learning models."

print("=== BERT similarity ===")
b1 = bert_encode(s1)
b2 = bert_encode(s2)
print("Cosine similarity:", F.cosine_similarity(b1, b2, dim=0).item())

print("\n=== Gemini similarity ===")
g1 = gemini_embed(s1)
g2 = gemini_embed(s2)
print("Cosine similarity:", cosine(g1, g2))

print("\n=== GPT-2 generation ===")
print(gpt_generate("Explain transformers in simple words:"))

print("\n=== Gemini generation ===")
print(gemini_generate("Explain transformers in simple words:"))

=== BERT similarity ===
Cosine similarity: 0.9562077522277832

=== Gemini similarity ===
Cosine similarity: 0.6452502103229109

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.

=== GPT-2 generation ===
Explain transformers in simple words:

The following code is a simple example of a transformer.

import { transformers } from './transformers'; import { transformers.

=== Gemini generation ===
Imagine you're reading a really long sentence.

**The Old Way (before Transformers):**
Older computer models would read sentences word by word, like a child sounding out words. "The... quick... brown... fox..." By the time they got to the end, they often forgot what was at the beginning, making it hard to understand the full meaning, especially in long sentences.

**The Transformer Way:**

1.  **Read Everything At Once:** Instead of reading word by word, a Transformer reads the **entire sentence (or a large chunk of it) all at once**. It takes in all the words simultaneously.

2.  **"Pay Attention" (The Magic Part):** This is the core trick. For *each word* in the sentence, the Transformer asks: "Which other words in this sentence a

## 5. Student Assignments (Approx. 30–40 Minutes)

These tasks help you explore encoder vs decoder behavior across all three systems.


### 🧩 Task 1 – BERT: Three-Sentence Similarity

Use BERT to compare the following three sentences:

1. `I like playing football.`  
2. `I enjoy playing soccer.`  
3. `The stock market is down today.`  

**Your tasks:**

- Compute cosine similarity for all pairs: (1,2), (1,3), (2,3).
- Print the values clearly.
- In 3–4 lines, explain which sentences are semantically closest and why.


In [None]:
# TODO: Student code for Task 1 – BERT three-sentence similarity
sentences_bert = [
    "I like playing football.",
    "I enjoy playing soccer.",
    "The stock market is down today"
]

# Tokenize: convert text → token IDs
encoded_bert = bert_tokenizer(
    sentences_bert,
    padding=True,
    truncation=True,
    return_tensors="pt"
)

with torch.no_grad():
    outputs_bert = bert_model(**encoded_bert)

# Take the [CLS] token embedding (position 0) as a sentence embedding
bert_cls_embeddings = outputs_bert.last_hidden_state[:, 0, :]  # [batch, hidden]

print("BERT sentence embedding shape:", bert_cls_embeddings.shape)

# Write your explanation in a markdown cell after running this code.


BERT sentence embedding shape: torch.Size([3, 768])


In [None]:
sim = F.cosine_similarity(bert_cls_embeddings[0], bert_cls_embeddings[1], dim=0)
print("pair(0,1): \n")
print("Cosine similarity between the two sentences:", sim.item())

pair(0,1): 

Cosine similarity between the two sentences: 0.969649612903595


In [None]:
sim = F.cosine_similarity(bert_cls_embeddings[1], bert_cls_embeddings[2], dim=0)
print("pair(1,2): \n")
print("Cosine similarity between the two sentences:", sim.item())

pair(1,2): 

Cosine similarity between the two sentences: 0.8546391129493713


In [None]:
sim = F.cosine_similarity(bert_cls_embeddings[0], bert_cls_embeddings[2], dim=0)
print("pair(0,2): \n")
print("Cosine similarity between the two sentences:", sim.item())

pair(0,2): 

Cosine similarity between the two sentences: 0.8786798715591431


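**Results:**

- Pair (0, 1), "I like playing football." vs "I enjoy playing soccer.": ~0.970
- Pair (1, 2), "I enjoy playing soccer." vs "The stock market is down today": ~0.855
- Pair (0, 2), "I like playing football." vs "The stock market is down today": ~0.879

Sentences 0 and 1 are semantically closest: both describe enjoying a ball sport. The unrelated stock-market sentence still scores above 0.85 against both, so raw BERT `[CLS]` similarities separate related from unrelated sentences by only a small margin.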
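The three separate cells above can be collapsed into a single loop over all index pairs. The sketch below shows one way to do that with `itertools.combinations`, using made-up two-dimensional vectors in place of the BERT embeddings; `pairwise_similarities` and `toy_vectors` are hypothetical names.

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def pairwise_similarities(vectors):
    """Map each unordered index pair (i, j) with i < j to its cosine similarity."""
    return {
        (i, j): cosine_similarity(vectors[i], vectors[j])
        for i, j in combinations(range(len(vectors)), 2)
    }

# Toy vectors standing in for three sentence embeddings.
toy_vectors = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]

pairs = pairwise_similarities(toy_vectors)
for (i, j), sim in pairs.items():
    print(f"pair({i},{j}): {sim:.4f}")
```

With the notebook's tensors, each row of `bert_cls_embeddings` would be converted with `.tolist()` first, or the same loop could call `F.cosine_similarity` directly.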

### 🧩 Task 2 – GPT-2: Prompt Style and Continuation

Run GPT-2 with the following prompts (or your own creative ones):

1. `Machine learning is transforming healthcare by`  
2. `Once upon a time in a distant galaxy,`  

**Your tasks:**

- Generate up to 40 tokens for each prompt.
- Compare the two outputs:
  - How does GPT-2 adapt to the style/topic of each prompt?
  - Write 4–5 lines of observations.
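Note that `max_length` in `generate()` counts the prompt tokens as well as the generated ones, so the number of new tokens depends on the prompt length. The small sketch below spells out that arithmetic; the prompt length of 7 is a made-up figure, not the measured token count of either prompt.

```python
def new_token_budget(prompt_token_count, max_length):
    """Tokens left for generation when max_length covers prompt + continuation."""
    return max(0, max_length - prompt_token_count)

# Hypothetical prompt of 7 tokens with max_length=40.
print(new_token_budget(7, 40))  # 33
```

To cap only the continuation regardless of prompt length, `generate()` also accepts `max_new_tokens`.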


In [None]:
# TODO: Student code for Task 2 – GPT-2 generation with different prompts
gpt_prompt = "Machine learning is transforming healthcare by"

# Tokenize prompt
gpt_inputs = gpt_tokenizer(gpt_prompt, return_tensors="pt")

# Generate continuation
gpt_output_ids = gpt_model.generate(
    **gpt_inputs,
    max_length=40,      # total tokens (prompt + generated)
    do_sample=False,    # deterministic
    num_beams=1
)

gpt_generated_text = gpt_tokenizer.decode(gpt_output_ids[0], skip_special_tokens=True)

print("GPT-2 PROMPT:")
print(gpt_prompt)
print("\nGPT-2 CONTINUATION:")
print(gpt_generated_text)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


GPT-2 PROMPT:
Machine learning is transforming healthcare by

GPT-2 CONTINUATION:
Machine learning is transforming healthcare by providing a new way to understand and predict the health of patients.

The new technology is called "deep learning," and it's being used to train doctors and nurses


In [None]:
# TODO: Student code for Task 2 – GPT-2 generation with different prompts
gpt_prompt = "Once upon a time in a distant galaxy"

# Tokenize prompt
gpt_inputs = gpt_tokenizer(gpt_prompt, return_tensors="pt")

# Generate continuation
gpt_output_ids = gpt_model.generate(
    **gpt_inputs,
    max_length=50,      # total tokens (prompt + generated)
    do_sample=False,    # deterministic
    num_beams=1
)

gpt_generated_text = gpt_tokenizer.decode(gpt_output_ids[0], skip_special_tokens=True)

print("GPT-2 PROMPT:")
print(gpt_prompt)
print("\nGPT-2 CONTINUATION:")
print(gpt_generated_text)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


GPT-2 PROMPT:
Once upon a time in a distant galaxy

GPT-2 CONTINUATION:
Once upon a time in a distant galaxy, the galaxy was a vast, vast, vast, vast, vast, vast, vast, vast, vast, vast, vast, vast, vast, vast, vast, vast, vast, vast, vast
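The "vast, vast, vast" loop is a typical failure of greedy decoding: once a phrase becomes the most likely continuation of itself, the model keeps choosing it. One common countermeasure is to forbid any n-gram from appearing twice, which `generate()` exposes as `no_repeat_ngram_size`. The sketch below illustrates only the underlying check on a list of tokens; it is a simplified illustration and not the library's implementation.

```python
def would_repeat_ngram(tokens, candidate, n):
    """True if appending `candidate` would create an n-gram already present in `tokens`."""
    if len(tokens) < n - 1:
        return False  # not enough history to form an n-gram
    new_ngram = tuple(tokens[len(tokens) - (n - 1):] + [candidate])
    seen = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
    return new_ngram in seen

history = ["vast", ",", "vast", ","]

# Appending "vast" would repeat the bigram (",", "vast"), so it would be blocked.
print(would_repeat_ngram(history, "vast", 2))  # True
# Appending "and" creates a new bigram, so it would be allowed.
print(would_repeat_ngram(history, "and", 2))   # False
```

Rerunning the cell above with, for example, `no_repeat_ngram_size=2` or `do_sample=True` is one way to observe how the continuation changes.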


### 🧩 Task 3 – Gemini: Embeddings and Generation

Repeat Tasks 1 and 2 using **Gemini**:

1. Use `text-embedding-004` to compute embeddings for the same three sentences from Task 1.
2. Compute cosine similarity for all pairs.
3. Use `gemini-2.5-flash` to generate text for the same prompts from Task 2.

**Your tasks:**

- Compare the Gemini similarities with BERT similarities.
- Compare Gemini generations with GPT-2 generations.
- Write 6–8 lines summarizing similarities and differences.


In [None]:
embed_model_name = "models/text-embedding-004"

gemini_text1 = "I like playing football."
gemini_text2 = "I enjoy playing soccer."
gemini_text3 = "The stock market is down today."

gemini_emb1 = genai.embed_content(model=embed_model_name, content=gemini_text1)["embedding"]
gemini_emb2 = genai.embed_content(model=embed_model_name, content=gemini_text2)["embedding"]
gemini_emb3 = genai.embed_content(model=embed_model_name, content=gemini_text3)["embedding"]

def cosine(a, b):
    a = np.array(a)
    b = np.array(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

gemini_sim = cosine(gemini_emb1, gemini_emb2)
print("Gemini embedding dimension:", len(gemini_emb1))
print("Cosine similarity between the two Gemini embeddings:", gemini_sim)

Gemini embedding dimension: 768
Cosine similarity between the two Gemini embeddings: 0.8197920340047521


In [None]:
gemini_sim = cosine(gemini_emb1, gemini_emb2)
print("Gemini embedding dimension:", len(gemini_emb1))
print("Cosine similarity between the two Gemini embeddings gemini_text1 and  gemini_text2 :", gemini_sim)

Gemini embedding dimension: 768
Cosine similarity between the two Gemini embeddings gemini_text1 and  gemini_text2 : 0.8197920340047521


In [None]:
gemini_sim = cosine(gemini_emb2, gemini_emb3)
print("Gemini embedding dimension:", len(gemini_emb1))
print("Cosine similarity between the two Gemini embeddings gemini_text2 and  gemini_text3 :", gemini_sim)

Gemini embedding dimension: 768
Cosine similarity between the two Gemini embeddings gemini_text2 and  gemini_text3 : 0.26086988066268707


In [None]:
gemini_sim = cosine(gemini_emb1, gemini_emb3)
print("Gemini embedding dimension:", len(gemini_emb1))
print("Cosine similarity between the two Gemini embeddings gemini_text1 and  gemini_text3 :", gemini_sim)

Gemini embedding dimension: 768
Cosine similarity between the two Gemini embeddings gemini_text1 and  gemini_text3 : 0.289561688143993


In [None]:
gemini_prompt = "Machine learning is transforming healthcare by"

gemini_response = gemini_model.generate_content(gemini_prompt)

print("GEMINI PROMPT:")
print(gemini_prompt)
print("\nGEMINI CONTINUATION:")
print(gemini_response.text)

GEMINI PROMPT:
Machine learning is transforming healthcare by

GEMINI CONTINUATION:
Machine learning is transforming healthcare by **revolutionizing virtually every aspect of the industry**, from diagnosis and treatment to drug discovery and operational efficiency. Here are the key ways:

1.  **Enabling Earlier and More Accurate Diagnosis:**
    *   **Image Recognition:** ML algorithms can analyze medical images (X-rays, MRIs, CT scans, pathology slides, dermatological images) with high accuracy, often surpassing human capabilities in detecting subtle patterns indicative of diseases like cancer, diabetic retinopathy, and glaucoma years earlier.
    *   **Pattern Recognition in EHRs:** By sifting through vast amounts of electronic health record (EHR) data, ML can identify complex patterns and correlations that might indicate a higher risk for certain conditions, leading to proactive interventions.

2.  **Personalizing Treatment Plans (Precision Medicine):**
    *   **Genomic Analysis:**

In [None]:
gemini_prompt = "Once upon a time in a distant galaxy,"

gemini_response = gemini_model.generate_content(gemini_prompt)

print("GEMINI PROMPT:")
print(gemini_prompt)
print("\nGEMINI CONTINUATION:")
print(gemini_response.text)

GEMINI PROMPT:
Once upon a time in a distant galaxy,

GEMINI CONTINUATION:
Once upon a time in a distant galaxy, nestled within a nebula that shimmered with the colors of a thousand sunsets, lay the planet of Aethelgard. Aethelgard was not merely a world; it was a living jewel, its surface crisscrossed by rivers of liquid starlight and dotted with cities built into gigantic, bioluminescent crystals that pulsed with an internal, gentle light. Its inhabitants, the Aethelans, were a species of sentient light-weavers, beings composed of shimmering energy, capable of manipulating electromagnetic fields and communicating through complex patterns of light and color. They were renowned throughout the sector for their unparalleled wisdom, their intricate star-maps etched into the very fabric of spacetime, and their profound connection to the cosmic symphony.

For millennia, Aethelgard had thrived, a beacon of peace and knowledge. Its primary function was to maintain the Grand Cosmic Archive, a 

In [None]:
# TODO: Student code for Task 3 – Gemini embeddings and generation


# 1) Embeddings for three sentences


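Print a similarity matrix that lets you compare all the models.

Note that BERT and Gemini embeddings come from different models with unaligned vector spaces, so a cosine similarity computed between a BERT vector and a Gemini vector is not semantically meaningful. The near-zero BERT/Gemini entries in the matrices below reflect that mismatch rather than any disagreement about the sentence.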

In [None]:
import torch
import numpy as np
import pandas as pd

# Example sentences
s1 = "I like playing football."
s2 = "I enjoy playing soccer."
sentences = [s1, s2]

# ---------------------------
# 1. Compute BERT embeddings
# ---------------------------
# Assume bert_encode returns a 1D torch tensor
b_embeddings = torch.stack([bert_encode(s) for s in sentences])  # [2, hidden_dim]
b_norm = b_embeddings / b_embeddings.norm(dim=1, keepdim=True)

# For comparison, we can use the first sentence embedding as representative
bert_vec = b_norm[0]  # shape: [hidden_dim]

# ---------------------------
# 2. Compute Gemini embeddings
# ---------------------------
# gemini_embed returns a Python list of floats; np.stack turns the lists into a 2D array
g_embeddings = np.stack([gemini_embed(s) for s in sentences])
g_norm = g_embeddings / np.linalg.norm(g_embeddings, axis=1, keepdims=True)
gemini_vec = torch.from_numpy(g_norm[0]).float()  # take first sentence

# ---------------------------
# 3. Generate GPT-2 text for the first sentence and embed it
# ---------------------------
# Assume gpt_generate returns a string
gpt_text = gpt_generate(s1)

# Embed the GPT-generated text (you can use BERT or Gemini embeddings)
gpt_vec = bert_encode(gpt_text)
gpt_vec = gpt_vec / gpt_vec.norm()  # normalize

# ---------------------------
# 4. Construct 3x3 similarity matrix
# ---------------------------
vectors = [bert_vec, gemini_vec, gpt_vec]
model_names = ["BERT", "Gemini", "GPT-2"]

sim_matrix = torch.zeros((3,3))

for i in range(3):
    for j in range(3):
        sim_matrix[i,j] = torch.cosine_similarity(vectors[i], vectors[j], dim=0)

# ---------------------------
# 5. Labeled DataFrame
# ---------------------------
sim_df = pd.DataFrame(sim_matrix.numpy(), index=model_names, columns=model_names)

print("=== Cross-Model Similarity Matrix ===")
print(sim_df)


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


=== Cross-Model Similarity Matrix ===
            BERT    Gemini     GPT-2
BERT    1.000000  0.087568  0.817321
Gemini  0.087568  1.000000  0.100768
GPT-2   0.817321  0.100768  1.000000
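The BERT/GPT-2 entry is high because both of those vectors were produced by BERT, while the Gemini vector lives in a different space. One way to build intuition for the near-zero entries, offered as an analogy rather than a measurement on these models, is that two unrelated directions in a 768-dimensional space are almost always close to orthogonal. The sketch below checks this with random vectors; the seed and the Gaussian distribution are arbitrary choices.

```python
import math
import random

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

rng = random.Random(42)  # fixed seed for repeatability
dim = 768                # same dimensionality as the embeddings in this notebook

# Two independent random vectors: stand-ins for vectors from unrelated spaces.
u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
v = [rng.gauss(0.0, 1.0) for _ in range(dim)]

sim_unrelated = cosine_similarity(u, v)
sim_self = cosine_similarity(u, u)

print("unrelated vectors:", round(sim_unrelated, 4))
print("vector with itself:", round(sim_self, 4))
```

A fairer cross-model comparison is to compute similarities between sentences within each model separately and then compare those scores.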


In [None]:
import torch
import numpy as np
import pandas as pd

# --------------------------------------------------------
# Helper functions (same as Section 4.1)
# --------------------------------------------------------

def bert_encode(sentence: str):
    tokens = bert_tokenizer(sentence, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        out = bert_model(**tokens)
    # CLS embedding
    return out.last_hidden_state[:, 0, :].squeeze(0)  # shape: [hidden_dim]

def gpt_generate(prompt: str, max_length: int = 40):
    tokens = gpt_tokenizer(prompt, return_tensors="pt")
    out_ids = gpt_model.generate(
        **tokens, max_length=max_length, do_sample=False
    )
    return gpt_tokenizer.decode(out_ids[0], skip_special_tokens=True)

def gemini_embed(sentence: str):
    return genai.embed_content(
        model="models/text-embedding-004",
        content=sentence
    )["embedding"]  # Python list of floats

def gemini_generate(prompt: str):
    resp = gemini_model.generate_content(prompt)
    return resp.text

# --------------------------------------------------------
# 1. Example sentences
# --------------------------------------------------------

s1 = "I love AI."
s2 = "AI is lovely"
sentences = [s1, s2]

# --------------------------------------------------------
# 2. BERT embeddings
# --------------------------------------------------------

bert_emb = torch.stack([bert_encode(s) for s in sentences])   # [2, hidden]
bert_emb_norm = bert_emb / bert_emb.norm(dim=1, keepdim=True)
bert_vec = bert_emb_norm[0]  # use sentence 1

# --------------------------------------------------------
# 3. Gemini embeddings → convert to torch
# --------------------------------------------------------

gemini_emb = np.stack([gemini_embed(s) for s in sentences])   # [2, hidden]
gemini_emb_norm = gemini_emb / np.linalg.norm(gemini_emb, axis=1, keepdims=True)
gemini_vec = torch.tensor(gemini_emb_norm[0], dtype=torch.float32)

# --------------------------------------------------------
# 4. GPT-2 generation → BERT embedding
# --------------------------------------------------------

gpt_output_text = gpt_generate(s1)
gpt_vec = bert_encode(gpt_output_text)
gpt_vec = gpt_vec / gpt_vec.norm()

# --------------------------------------------------------
# 5. Build 3×3 similarity matrix
# --------------------------------------------------------

vectors = [bert_vec, gemini_vec, gpt_vec]
names = ["BERT", "Gemini", "GPT-2"]

sim_matrix = torch.zeros((3, 3))

for i in range(3):
    for j in range(3):
        sim_matrix[i, j] = torch.cosine_similarity(vectors[i], vectors[j], dim=0)

# --------------------------------------------------------
# 6. Build DataFrame
# --------------------------------------------------------

sim_df = pd.DataFrame(sim_matrix.numpy(), index=names, columns=names)

print("\n=== CROSS-MODEL SIMILARITY MATRIX ===\n")
print(sim_df)


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



=== CROSS-MODEL SIMILARITY MATRIX ===

            BERT    Gemini     GPT-2
BERT    1.000000  0.009073  0.832366
Gemini  0.009073  1.000000  0.024357
GPT-2   0.832366  0.024357  1.000000
