# Course 4 Module 3 Lesson 1 Screencasts

## M3L1SC1: Retrieving Knowledge: Embeddings and Vector Search with FAISS

### Step 1: Setting Up Your Environment

In [None]:
import os
os.system('pip install faiss-cpu sentence-transformers > /dev/null 2>&1')

0

In [None]:
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np

### Step 2: Generating and Indexing Embeddings

In [None]:
# Build the index
model = SentenceTransformer("paraphrase-MiniLM-L6-v2")

documents = [
    "Deep learning models are solving complex problems.",
    "Generative AI can create lifelike images and videos.",
    "AI models need optimization to reduce biases."
]

embeddings = model.encode(documents, convert_to_numpy=True)
embeddings = embeddings.astype("float32")           # FAISS expects float32

dim = embeddings.shape[1]
index = faiss.IndexFlatL2(dim)
index.add(embeddings)

# Run a query
query      = "How do AI models optimize data?"
query_vec  = model.encode([query], convert_to_numpy=True).astype("float32")

k = 2
distances, indices = index.search(query_vec, k)
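
To make the search step less of a black box, here is a minimal NumPy sketch of what an exact (flat) L2 index computes: the squared Euclidean distance from each query to every stored vector, followed by a top-k selection. The function name `brute_force_l2_search` and the toy 2-D vectors are illustrative assumptions, not part of the screencast; FAISS performs the same computation far more efficiently.

```python
import numpy as np

def brute_force_l2_search(vectors, queries, k):
    """Return (distances, indices) of the k nearest stored vectors per query.

    Distances are squared Euclidean distances, sorted in ascending order.
    """
    # Broadcast to pairwise differences: shape (n_queries, n_vectors, dim)
    diffs = queries[:, None, :] - vectors[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)

    # Indices of the k smallest distances for each query
    order = np.argsort(sq_dists, axis=1)[:, :k]
    return np.take_along_axis(sq_dists, order, axis=1), order

# Toy 2-D vectors standing in for embeddings (illustrative values only)
toy_vectors = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]], dtype="float32")
toy_query = np.array([[0.9, 0.1]], dtype="float32")

toy_distances, toy_indices = brute_force_l2_search(toy_vectors, toy_query, k=2)
print(toy_indices)     # nearest first: vector 1, then vector 0
print(toy_distances)   # approximately 0.02 and 0.82
```

The return shapes mirror `index.search`: one row per query and one column per neighbor, which is why the notebook reads results from `indices[0]` and `distances[0]`.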


### Step 3: Retrieving Top Matches


In [None]:
for rank, doc_id in enumerate(indices[0]):
    print(
        f"Rank {rank+1}:",
        f"Match = '{documents[doc_id]}'",
        f"Distance = {distances[0][rank]:.4f}"
    )


Rank 1: Match = 'AI models need optimization to reduce biases.' Distance = 22.0849
Rank 2: Match = 'Generative AI can create lifelike images and videos.' Distance = 40.5746
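
The values printed above are squared L2 distances, so smaller means closer. They exceed 4, the largest squared distance possible between two unit-length vectors, which indicates the embeddings were not normalized. If you would rather reason in cosine similarity, one option (an assumption here, not something the screencast does) is to normalize the embeddings first, for example with `normalize_embeddings=True` in `model.encode`. For unit vectors the two measures are linked by the identity `||a - b||^2 = 2 - 2*cos(a, b)`, which the sketch below checks on toy vectors. The helper name `cosine_from_squared_l2` is hypothetical.

```python
import numpy as np

def cosine_from_squared_l2(sq_dist):
    """Convert a squared L2 distance between unit vectors to cosine similarity."""
    return 1.0 - sq_dist / 2.0

# Toy vectors (illustrative values only), scaled to unit length
a = np.array([3.0, 4.0])
b = np.array([4.0, 3.0])
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)

sq_dist = float(np.sum((a_unit - b_unit) ** 2))   # squared L2 distance
cos_direct = float(a_unit @ b_unit)               # cosine similarity computed directly
cos_recovered = cosine_from_squared_l2(sq_dist)   # same value via the identity

print(sq_dist, cos_direct, cos_recovered)
```

The conversion only holds after normalization; applying it to the raw distances shown above would give meaningless values.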


## M3L1SC2: Grounded Generation: Adding Retrieval to an LLM Pipeline


### Step 1: Setting Up Your Environment

In [None]:
# !pip install transformers datasets faiss-cpu sentence-transformers

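# Install quietly; redirecting output hides the "requirement already satisfied" messages
import os
os.system('pip install transformers datasets faiss-cpu sentence-transformers > /dev/null 2>&1')

# Imports used by the embedding and indexing steps below
import faiss
from sentence_transformers import SentenceTransformer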



### Step 2: Creating Embeddings

In [None]:
# Embed & index the documents
model = SentenceTransformer("paraphrase-MiniLM-L6-v2")

documents = [
    "The moon affects ocean tides through gravitational pull.",
    "Solar power harnesses energy from the sun's rays.",
    "Generative AI models can create lifelike digital avatars."
]

embeddings = model.encode(documents, convert_to_numpy=True).astype("float32")

### Step 3: Indexing with FAISS

In [None]:
# Initialize your FAISS index
dim = embeddings.shape[1]
index = faiss.IndexFlatL2(dim)
index.add(embeddings)


### Step 4: Query Processing and Retrieval

In [None]:
# Example query you'll use
query = "How does the sun generate energy?"

# Encoding the query
query_embedding = model.encode([query], convert_to_numpy=True).astype("float32")

# Retrieve top 2 similar documents
k = 2
distances, indices = index.search(query_embedding, k)

# Prepare the enhanced prompt
retrieved_text  = " ".join(documents[i] for i in indices[0])
complete_prompt = (
    f"Context: {retrieved_text}\n\n"
    f"Q: {query}\n"
    f"A:"
)
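
Because the index always returns `k` results, the prompt can end up carrying context that has little to do with the question. A minimal sketch of one way to guard against that follows: wrap prompt assembly in a helper that skips hits beyond a distance cutoff. The name `build_prompt`, the `max_distance` parameter, and the toy distances are all assumptions for illustration; a sensible cutoff depends on the embedding model and would need to be tuned on real queries.

```python
def build_prompt(query, documents, indices, distances, max_distance=None):
    """Assemble a context-grounded prompt from retrieved documents.

    Hits with index -1 (FAISS's marker for "no result") or with a distance
    above `max_distance` are left out of the context.
    """
    kept = [
        documents[i]
        for i, d in zip(indices, distances)
        if i >= 0 and (max_distance is None or d <= max_distance)
    ]
    context = "\n".join(f"- {doc}" for doc in kept) if kept else "(no relevant context found)"
    return f"Context:\n{context}\n\nQ: {query}\nA:"

toy_documents = [
    "The moon affects ocean tides through gravitational pull.",
    "Solar power harnesses energy from the sun's rays.",
]
# Hypothetical search results: document 1 is close, document 0 is far away
toy_prompt = build_prompt(
    "How does the sun generate energy?",
    toy_documents,
    indices=[1, 0],
    distances=[10.0, 45.0],
    max_distance=30.0,
)
print(toy_prompt)
```

With real search results you would pass `indices[0]` and `distances[0]` from `index.search`.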

### Step 5: Generating Grounded Responses

In [None]:
import torch
from transformers import pipeline

hf_model_name = "gpt2-medium"          # change to any available text-generation model

generator = pipeline(
    "text-generation",
    model=hf_model_name,
    tokenizer=hf_model_name,
    device=0 if torch.cuda.is_available() else -1   # first GPU if PyTorch can see one, else CPU
)

response = generator(
    complete_prompt,
    max_new_tokens=256,    # limit on newly generated tokens (the prompt is not counted)
    num_return_sequences=1,
    do_sample=False        # greedy decoding (deterministic); set True and add temperature/top_p to sample
)

print("Enhanced response:\n", response[0]["generated_text"])


Enhanced response:
 Context: Solar power harnesses energy from the sun's rays. The moon affects ocean tides through gravitational pull.

Q: How does the sun generate energy?
A: The sun's energy is generated by the interaction of hydrogen atoms with electrons in the outer layers of the solar system. The hydrogen atoms are charged and excited by the electrons, creating a magnetic field. The electrons then move through the hydrogen atoms, creating a magnetic field. The hydrogen atoms then move through the outer layers of the solar system, creating a magnetic field. The hydrogen atoms then move through the outer layers of the solar system, creating a magnetic field. The hydrogen atoms then move through the outer layers of the solar system, creating a magnetic field. The hydrogen atoms then move through the outer layers of the solar system, creating a magnetic field. The hydrogen atoms then move through the outer layers of the solar system, creating a magnetic field. The hydrogen atoms then
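
The generated text above repeats the prompt and stops mid-sentence once the token limit is reached. A small post-processing step can tidy this up before the answer is shown to a user. The sketch below is one possible approach under stated assumptions: the helper name `extract_answer` is hypothetical, and it simply strips the prompt prefix and drops any trailing unfinished sentence. The text-generation pipeline also accepts `return_full_text=False`, which should make the first half unnecessary.

```python
def extract_answer(generated_text, prompt):
    """Return only the model's answer, trimmed to the last complete sentence."""
    # Remove the echoed prompt if the output starts with it
    if generated_text.startswith(prompt):
        answer = generated_text[len(prompt):]
    else:
        answer = generated_text
    answer = answer.strip()

    # Cut after the last sentence-ending punctuation mark, if there is one
    last_stop = max(answer.rfind(ch) for ch in ".!?")
    if last_stop != -1:
        answer = answer[: last_stop + 1]
    return answer

# Toy strings standing in for a real prompt and model output
toy_prompt_text = "Q: How does the sun generate energy?\nA:"
toy_generated = toy_prompt_text + " First sentence. Second sentence. The hydrogen atoms then"

toy_answer = extract_answer(toy_generated, toy_prompt_text)
print(toy_answer)   # First sentence. Second sentence.
```

This does not address the repeated sentences themselves, which come from greedy decoding with a small model; it only cleans up the presentation.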