
# Using Jupyter Notebooks
:label:`sec_jupyter`


This section describes how to edit and run the code
in each section of this book
using the Jupyter Notebook. Make sure you have
installed Jupyter and downloaded the
code as described in
:ref:`chap_installation`.
If you want to know more about Jupyter, see the excellent tutorial in
the official [documentation](https://jupyter.readthedocs.io/en/latest/).


## Editing and Running the Code Locally

Suppose that the local path of the book's code is `xx/yy/d2l-en/`. Use the shell to change the directory to this path (`cd xx/yy/d2l-en`) and run the command `jupyter notebook`. If your browser does not open automatically, open http://localhost:8888 yourself. You will see the Jupyter interface and all the folders containing the code of the book, as shown in :numref:`fig_jupyter00`.

![The folders containing the code of this book.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter00.png?raw=1)
:width:`600px`
:label:`fig_jupyter00`


You can access the notebook files by clicking on the folder displayed on the webpage.
They usually have the suffix ".ipynb".
For the sake of brevity, we create a temporary "test.ipynb" file.
The content displayed after you click it is
shown in :numref:`fig_jupyter01`.
This notebook includes a markdown cell and a code cell. The markdown cell contains "This Is a Title" and "This is text.",
and the code cell contains two lines of Python code.

![Markdown and code cells in the "test.ipynb" file.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter01.png?raw=1)
:width:`600px`
:label:`fig_jupyter01`


Double click on the markdown cell to enter edit mode.
Add a new text string "Hello world." at the end of the cell, as shown in :numref:`fig_jupyter02`.

![Edit the markdown cell.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter02.png?raw=1)
:width:`600px`
:label:`fig_jupyter02`


As demonstrated in :numref:`fig_jupyter03`,
click "Cell" $\rightarrow$ "Run Cells" in the menu bar to run the edited cell.

![Run the cell.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter03.png?raw=1)
:width:`600px`
:label:`fig_jupyter03`

After running, the markdown cell is shown in :numref:`fig_jupyter04`.

![The markdown cell after running.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter04.png?raw=1)
:width:`600px`
:label:`fig_jupyter04`


Next, click on the code cell. Multiply the elements by 2 after the last line of code, as shown in :numref:`fig_jupyter05`.

![Edit the code cell.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter05.png?raw=1)
:width:`600px`
:label:`fig_jupyter05`


You can also run the cell with a shortcut ("Ctrl + Enter" by default) and obtain the output shown in :numref:`fig_jupyter06`.

![Run the code cell to obtain the output.](https://github.com/d2l-ai/d2l-en-colab/blob/master/img/jupyter06.png?raw=1)
:width:`600px`
:label:`fig_jupyter06`


When a notebook contains more cells, we can click "Kernel" $\rightarrow$ "Restart & Run All" in the menu bar to run all the cells in the entire notebook. By clicking "Help" $\rightarrow$ "Edit Keyboard Shortcuts" in the menu bar, you can edit the shortcuts according to your preferences.

## Advanced Options

Beyond local editing, two things are quite important: editing the notebooks in the markdown format and running Jupyter remotely.
The latter matters when we want to run the code on a faster server.
The former matters since Jupyter's native ipynb format stores a lot of auxiliary data that is
irrelevant to the content,
mostly related to how and where the code is run.
This clutters the diffs that Git produces and makes
reviewing contributions very difficult.
Fortunately, there is an alternative: native editing in the markdown format.
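
To make the point about auxiliary data concrete, here is a minimal sketch of a notebook with one markdown cell and one code cell in a simplified version of the ipynb (nbformat 4) JSON layout. The cell contents and the helper name `strip_execution_state` are made up for illustration; the sketch only shows which fields record execution state rather than content.

```python
import copy
import json

# A hand-written, simplified notebook in the nbformat 4 layout (contents are illustrative).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {"kernelspec": {"name": "python3", "display_name": "Python 3"}},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# This Is a Title\n", "This is text."]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 7,  # changes every time the cell is re-run
            "source": ["x = [1, 2, 3]\n", "[2 * v for v in x]"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "metadata": {},
                    "data": {"text/plain": ["[2, 4, 6]"]},
                }
            ],
        },
    ],
}


def strip_execution_state(nb):
    """Return a copy of the notebook without outputs and execution counters."""
    clean = copy.deepcopy(nb)
    for cell in clean["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return clean


before = len(json.dumps(notebook))
after = len(json.dumps(strip_execution_state(notebook)))
print(f"{before} characters with execution state, {after} without")
```

Re-running a cell changes `execution_count` and `outputs` even when `source` is untouched, which is why two versions of the same ipynb file can differ in Git without any change to the content.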

### Markdown Files in Jupyter

If you wish to contribute to the content of this book, you need to modify the
source file (md file, not ipynb file) on GitHub.
Using the notedown plugin we
can modify notebooks in the md format directly in Jupyter.


First, install the notedown plugin, run the Jupyter Notebook, and load the plugin:

```
pip install d2l-notedown  # You may need to uninstall the original notedown.
jupyter notebook --NotebookApp.contents_manager_class='notedown.NotedownContentsManager'
```

You may also turn on the notedown plugin by default whenever you run the Jupyter Notebook.
First, generate a Jupyter Notebook configuration file (if it has already been generated, you can skip this step).

```
jupyter notebook --generate-config
```

Then, add the following line to the end of the Jupyter Notebook configuration file (for Linux or macOS, usually in the path `~/.jupyter/jupyter_notebook_config.py`):

```
c.NotebookApp.contents_manager_class = 'notedown.NotedownContentsManager'
```

After that, you only need to run the `jupyter notebook` command to turn on the notedown plugin by default.

### Running Jupyter Notebooks on a Remote Server

Sometimes, you may want to run Jupyter notebooks on a remote server and access it through a browser on your local computer. If Linux or macOS is installed on your local machine (Windows can also support this function through third-party software such as PuTTY), you can use port forwarding:

```
ssh myserver -L 8888:localhost:8888
```

The above string `myserver` is the address of the remote server.
Then we can use http://localhost:8888 to access the remote server `myserver` that runs Jupyter notebooks. We will detail how to run Jupyter notebooks on AWS instances
later in this appendix.
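
Before opening the browser, it can help to check that the forwarded port actually answers. The following is a minimal sketch using only the Python standard library. The helper name `port_is_open` is hypothetical, and the default host and port simply mirror the forwarding command above.

```python
import socket


def port_is_open(host="localhost", port=8888, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if port_is_open():
    print("Something is listening on localhost:8888; try opening it in the browser.")
else:
    print("Nothing answers on localhost:8888; check that the ssh tunnel is still running.")
```

A successful connection only shows that some process is listening on the port. It does not confirm that the process is the remote Jupyter server.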

### Timing

We can use the `ExecuteTime` plugin to time the execution of each code cell in Jupyter notebooks.
Use the following commands to install the plugin:

```
pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
jupyter nbextension enable execute_time/ExecuteTime
```

## Summary

* Using the Jupyter Notebook tool, we can edit, run, and contribute to each section of the book.
* We can run Jupyter notebooks on remote servers using port forwarding.


## Exercises

1. Edit and run the code in this book with the Jupyter Notebook on your local machine.
1. Edit and run the code in this book with the Jupyter Notebook *remotely* via port forwarding.
1. Compare the running time of the operations $\mathbf{A}^\top \mathbf{B}$ and $\mathbf{A} \mathbf{B}$ for two square matrices in $\mathbb{R}^{1024 \times 1024}$. Which one is faster?


[Discussions](https://discuss.d2l.ai/t/421)


In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [1]:
!pip install langchain langchain-community langgraph langchain-google-genai
!pip install sentence-transformers faiss-cpu chromadb numpy
!pip install "fuzzywuzzy[speedup]" regex tqdm
!pip install flask  # later for web UI






In [2]:
import os
import json
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain_core.documents import Document

# --- Configuration & Initialization ---
VECTOR_DB_DIR = "data/faiss_vectorstore_final"
SUBCHAPTER_METADATA_FILE = "subchapter_metadata.json"
CHUNKS_FILE = "merged_chunks_with_figures.json"

# Initialize embeddings
embedding_model_name = "thenlper/gte-large"
embedding_model_kwargs = {"device": "cuda"}
embedding_encode_kwargs = {"normalize_embeddings": True}
embeddings = HuggingFaceBgeEmbeddings(
    model_name=embedding_model_name,
    model_kwargs=embedding_model_kwargs,
    encode_kwargs=embedding_encode_kwargs
)

# --- Load and Process Data ---
try:
    with open(SUBCHAPTER_METADATA_FILE, "r", encoding="utf-8") as f:
        subchapter_metadata = json.load(f)
    print("✅ Subchapter metadata loaded.")
    # Convert the dictionary to a list of Documents for FAISS
    subchapter_docs = [
        Document(page_content=subchapter_metadata[doc_id], metadata={"doc_id": doc_id})
        for doc_id in subchapter_metadata
    ]
except FileNotFoundError:
    print(f"❌ Error: {SUBCHAPTER_METADATA_FILE} not found. Please ensure it's in the same directory.")
    exit()
except json.JSONDecodeError:
    print(f"❌ Error: {SUBCHAPTER_METADATA_FILE} is not a valid JSON file.")
    exit()

try:
    with open(CHUNKS_FILE, "r", encoding="utf-8") as f:
        original_chunks = json.load(f)
    print("✅ Original data loaded.")
    # Convert the list of dictionaries to a list of Documents for FAISS
    content_docs = [
        Document(page_content=chunk.get("content", ""), metadata=chunk)
        for chunk in original_chunks if chunk.get("content") and chunk.get("id")
    ]
    # Create a dictionary for quick lookup by ID
    content_lookup = {str(chunk.get("id")): chunk for chunk in original_chunks}
except FileNotFoundError:
    print(f"❌ Error: {CHUNKS_FILE} not found. Please ensure it's in the same directory.")
    exit()
except json.JSONDecodeError:
    print(f"❌ Error: {CHUNKS_FILE} is not a valid JSON file.")
    exit()

# --- Create Indexes ---
# exist_ok=True makes this a no-op when the directory is already there
os.makedirs(VECTOR_DB_DIR, exist_ok=True)

# Create and Save Subchapter Index
if subchapter_docs:
    print("🚀 Creating subchapter FAISS index...")
    vectorstore_subchapters = FAISS.from_documents(subchapter_docs, embeddings)
    vectorstore_subchapters.save_local(os.path.join(VECTOR_DB_DIR, "subchapter_faiss_index"))
    print("✅ Subchapter FAISS index created and saved successfully.")
else:
    print("⚠️ No valid subchapters found. Skipping subchapter index creation.")

# Create and Save Full Content Index
if content_docs:
    print("🚀 Creating full content FAISS index...")
    vectorstore_full_content = FAISS.from_documents(content_docs, embeddings)
    vectorstore_full_content.save_local(os.path.join(VECTOR_DB_DIR, "content_faiss_index"))
    print("✅ Full content FAISS index created and saved successfully.")
else:
    print("❌ No valid content chunks found. Cannot create full content index.")
    exit()


✅ Subchapter metadata loaded.
✅ Original data loaded.
🚀 Creating subchapter FAISS index...
✅ Subchapter FAISS index created and saved successfully.
🚀 Creating full content FAISS index...
✅ Full content FAISS index created and saved successfully.


In [6]:
import os
import json
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain_core.documents import Document

# --- Configuration & Initialization ---
VECTOR_DB_DIR = "data/faiss_vectorstore_final"
SUBCHAPTER_METADATA_FILE = "subchapter_metadata.json"
CHUNKS_FILE = "merged_chunks_with_figures.json"

# Initialize embeddings
embedding_model_name = "thenlper/gte-large"
embedding_model_kwargs = {"device": "cuda"}
embedding_encode_kwargs = {"normalize_embeddings": True}
embeddings = HuggingFaceBgeEmbeddings(
    model_name=embedding_model_name,
    model_kwargs=embedding_model_kwargs,
    encode_kwargs=embedding_encode_kwargs
)

# --- Functions to load indexes ---
def load_vectorstores(vector_dir, embeddings_model):
    """Loads existing FAISS vector stores from disk."""
    vectorstore_subchapters = None
    try:
        vectorstore_subchapters = FAISS.load_local(
            folder_path=os.path.join(vector_dir, "subchapter_faiss_index"),
            embeddings=embeddings_model,
            allow_dangerous_deserialization=True
        )
        print("✅ Subchapter FAISS vector store loaded successfully.")
    except Exception:
        print("⚠️ Subchapter FAISS store not found. Skipping.")

    vectorstore_full_content = None
    try:
        vectorstore_full_content = FAISS.load_local(
            folder_path=os.path.join(vector_dir, "content_faiss_index"),
            embeddings=embeddings_model,
            allow_dangerous_deserialization=True
        )
        print("✅ Full content FAISS vector store loaded successfully.")
    except Exception as e:
        print(f"❌ Error loading full content FAISS vector store: {e}")
        return None, None

    return vectorstore_subchapters, vectorstore_full_content

# --- Main Execution Block ---
if __name__ == "__main__":

    # Load the vector stores
    vectorstore_subchapters, vectorstore_full_content = load_vectorstores(VECTOR_DB_DIR, embeddings)
    if not vectorstore_subchapters or not vectorstore_full_content:
        print("Please run the create_faiss_indexes.py script first to generate the indexes.")
        exit()

    try:
        with open(CHUNKS_FILE, "r", encoding="utf-8") as f:
            original_chunks = json.load(f)

        # Create a direct lookup map: Subchapter Name -> Document Chunk
        subchapter_to_chunk_map = {chunk.get("subchapter", "").strip(): chunk for chunk in original_chunks}
        print("✅ Subchapter to content mapping created.")
    except FileNotFoundError:
        print(f"❌ Error: {CHUNKS_FILE} not found.")
        exit()
    except json.JSONDecodeError:
        print(f"❌ Error: {CHUNKS_FILE} is not a valid JSON file.")
        exit()

    # Define a sample query for testing
    query_to_test = "What is a double displacement Reaction?"

    print(f"\n🔍 Performing two-stage search for: '{query_to_test}'...")

    # Stage 1: Search the subchapter FAISS index
    subchapter_results = vectorstore_subchapters.similarity_search(query_to_test, k=1)

    if not subchapter_results:
        print("No matching subchapter found.")
    else:
        top_subchapter_doc = subchapter_results[0]
        subchapter_name = top_subchapter_doc.page_content.strip()

        print(f"✅ Found top-ranked subchapter: '{subchapter_name}'")

        # Stage 2: Use the exact subchapter name for a direct lookup
        print("\n🔍 Performing direct content lookup...")

        final_doc = subchapter_to_chunk_map.get(subchapter_name)

        if final_doc:
            print("\n--- Final Result ---")
            print(f"Subchapter: {final_doc.get('subchapter', 'N/A')}")
            content_snippet = final_doc.get('content', '').replace('\n', ' ').strip()
            print(f"Content (first 300 chars): {content_snippet[:300]}...")

            figures_str = final_doc.get("figures", "")
            if figures_str:
                print("Figures:")
                print(f" - {figures_str}")
            else:
                print("Figures: None")
        else:
            print(f"❌ Error: Could not retrieve content for subchapter '{subchapter_name}'.")

✅ Subchapter FAISS vector store loaded successfully.
✅ Full content FAISS vector store loaded successfully.
✅ Subchapter to content mapping created.

🔍 Performing two-stage search for: 'What is a double displacement Reaction?'...
✅ Found top-ranked subchapter: '1.2.4 Double Displacement Reaction'

🔍 Performing direct content lookup...

--- Final Result ---
Subchapter: 1.2.4 Double Displacement Reaction
Content (first 300 chars): Activity 1.10 n Take about 3 mL of sodium sulphate solution in a test tube. n In another test tube, take about 3 mL of barium chloride solution. n Mix the two solutions (Fig. 1.9). n What do you observe? Figure 1.9 Formation of barium sulphate and sodium chloride You will observe that a white substa...
Figures:
 - [{'figure': 'Figure 1.9', 'desc': 'Figure 1.9 Formation of barium sulphate and sodium chloride.'}]
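
The two-stage lookup in the cell above can be illustrated without FAISS or a GPU. The sketch below swaps in a toy bag-of-words cosine similarity for the embedding model, so its scores are not comparable to those of the real index. The titles, contents, and function names are made up for illustration.

```python
import math
import re
from collections import Counter

# Toy data: subchapter title -> content chunk (illustrative only).
chunks = {
    "1.2.3 Displacement Reaction": "A more reactive element displaces a less reactive one.",
    "1.2.4 Double Displacement Reaction": "Two compounds exchange ions to form new compounds.",
    "6.3 Respiration": "Cells break down glucose to release energy.",
}


def bag_of_words(text):
    """Lowercase word counts; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def two_stage_search(query):
    # Stage 1: rank subchapter titles by similarity to the query.
    q = bag_of_words(query)
    best_title = max(chunks, key=lambda title: cosine(q, bag_of_words(title)))
    # Stage 2: exact dictionary lookup of the content by title.
    return best_title, chunks[best_title]


title, content = two_stage_search("What is a double displacement reaction?")
print(title)
print(content)
```

Because stage 2 is an exact lookup keyed on the title returned by stage 1, a wrong match in stage 1 cannot be corrected later. The quality of the title index therefore determines the final result.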


In [11]:
import os
import json
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain_core.documents import Document
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI  # ✅ Use Google's Gemini model

# --- Configuration ---
VECTOR_DB_DIR = "data/faiss_vectorstore_final"
CHUNKS_FILE = "merged_chunks_with_figures.json"
TOP_K = 5  # number of chunks to retrieve
EMBEDDING_MODEL = "thenlper/gte-large"
DEVICE = "cuda"

# Read the Gemini API key from the environment instead of hardcoding it.
# Set GOOGLE_API_KEY before running this cell (for example via Colab secrets).
GOOGLE_API_KEY = os.environ.get("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise RuntimeError("Set the GOOGLE_API_KEY environment variable first.")

# --- Initialize embeddings ---
embeddings = HuggingFaceBgeEmbeddings(
    model_name=EMBEDDING_MODEL,
    model_kwargs={"device": DEVICE},
    encode_kwargs={"normalize_embeddings": True}
)

# --- Load FAISS indexes ---
def load_faiss_indexes(vector_dir):
    """Load subchapter and content FAISS vector stores"""
    try:
        vectorstore_subchapters = FAISS.load_local(
            os.path.join(vector_dir, "subchapter_faiss_index"),
            embeddings,
            allow_dangerous_deserialization=True
        )
        print("✅ Subchapter FAISS loaded.")
    except Exception:
        vectorstore_subchapters = None
        print("⚠️ Subchapter FAISS not found.")

    try:
        vectorstore_content = FAISS.load_local(
            os.path.join(vector_dir, "content_faiss_index"),
            embeddings,
            allow_dangerous_deserialization=True
        )
        print("✅ Content FAISS loaded.")
    except Exception as e:
        vectorstore_content = None
        print(f"❌ Error loading content FAISS: {e}")

    return vectorstore_subchapters, vectorstore_content

# --- Load chunks for metadata/figures ---
with open(CHUNKS_FILE, "r", encoding="utf-8") as f:
    chunks = json.load(f)
# Map: subchapter -> chunk
subchapter_to_chunk = {c.get("subchapter", "").strip(): c for c in chunks}

# --- Load FAISS ---
vectorstore_subchapters, vectorstore_content = load_faiss_indexes(VECTOR_DB_DIR)
if not vectorstore_subchapters or not vectorstore_content:
    raise RuntimeError("FAISS indexes missing. Run index creation first.")

# --- Prompt Template ---
lesson_prompt_template = """
You are an 8th-grade science teacher. Using the following content, generate a **detailed lesson**:
Content: {retrieved_content}

Figures (if any): {figures}

Please structure the lesson as follows:
1. Short funny/memorable intro story.
2. Explanation with examples.
3. Embed figures using <img src='...' alt='Figure'> syntax.
4. Key takeaways at the end.
5. Summary paragraph.

Generate in Markdown format.
"""

prompt_template = PromptTemplate(
    input_variables=["retrieved_content", "figures"],
    template=lesson_prompt_template
)

# --- Initialize LLM with Gemini ---
llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash-8b",  # Free-tier Gemini model
    temperature=0.7,
    max_tokens=2000,
    api_key=GOOGLE_API_KEY  # direct API key
)

# --- Retrieval + Generation Function ---
def generate_lesson(query: str, top_k: int = TOP_K):
    # 1️⃣ Retrieve top subchapter(s)
    sub_results = vectorstore_subchapters.similarity_search(query, k=1)
    if not sub_results:
        print("⚠️ No matching subchapter found.")
        return None

    top_subchapter = sub_results[0].page_content.strip()
    print(f"Top subchapter: {top_subchapter}")

    # 2️⃣ Retrieve top content chunks
    chunk_doc = subchapter_to_chunk.get(top_subchapter)
    if not chunk_doc:
        print(f"❌ No chunk found for subchapter '{top_subchapter}'")
        return None

    # Optional: Use FAISS content index for more fine-grained retrieval
    content_results = vectorstore_content.similarity_search(query, k=top_k)
    combined_texts = []
    figures_list = []

    for doc in content_results:
        combined_texts.append(doc.page_content)
        figures_list.append(doc.metadata.get("figures", ""))

    retrieved_content = "\n\n".join(combined_texts)
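    # The figure metadata holds names and captions, not image URLs,
    # so pass it to the prompt as plain text instead of wrapping it in <img> tags.
    figures_str = "\n".join(str(f) for f in figures_list if f)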

    # 3️⃣ Generate lesson via LLM
    # `prompt | llm` replaces the deprecated LLMChain(...).run(...) call.
    chain = prompt_template | llm
    response = chain.invoke({"retrieved_content": retrieved_content, "figures": figures_str})
    return response.content

# --- Example Usage ---
if __name__ == "__main__":
    user_query = input("Enter a topic/question: ").strip()
    lesson_output = generate_lesson(user_query, top_k=5)
    if lesson_output:
        print("\n\n--- Generated Lesson ---\n")
        print(lesson_output)


✅ Subchapter FAISS loaded.
✅ Content FAISS loaded.
Enter a topic/question: photosynthesis
Top subchapter: 6.3 RESPIRATION






--- Generated Lesson ---

# 8th Grade Science:  Fueling Life - Photosynthesis, Respiration, and Energy

**1. Short Intro Story:**

Imagine a tiny plant, Pip, desperately trying to get energy to grow taller than the other weeds. Pip knows he needs fuel, but he doesn't have a gas station.  He needs to make his own!  Today, we're going to discover how plants and even us, humans, get the energy to live and grow – through processes like photosynthesis and respiration.

**2. Explanation with Examples:**

**Photosynthesis: The Plant's Power Plant**

Plants, like Pip, are autotrophs. They can make their own food using sunlight. This incredible process is called photosynthesis.  Photosynthesis is like a plant's personal solar panel.

* **Ingredients:** Plants take in carbon dioxide from the air and water from the soil.  They also need sunlight and chlorophyll, a green pigment in their leaves, to kickstart the process.
<img src='[{"figure": "Figure 6.1", "desc": "Cross-section of a leaf."}, {"

In [2]:
import google.generativeai as genai
import os

# The API key must already be set in the GOOGLE_API_KEY environment variable;
# never hardcode a real key in a notebook.

# Configure the API key
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# List all available models
for m in genai.list_models():
    print(m.name)

models/embedding-gecko-001
models/gemini-1.5-pro-latest
models/gemini-1.5-pro-002
models/gemini-1.5-pro
models/gemini-1.5-flash-latest
models/gemini-1.5-flash
models/gemini-1.5-flash-002
models/gemini-1.5-flash-8b
models/gemini-1.5-flash-8b-001
models/gemini-1.5-flash-8b-latest
models/gemini-2.5-pro-preview-03-25
models/gemini-2.5-flash-preview-05-20
models/gemini-2.5-flash
models/gemini-2.5-flash-lite-preview-06-17
models/gemini-2.5-pro-preview-05-06
models/gemini-2.5-pro-preview-06-05
models/gemini-2.5-pro
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-001
models/gemini-2.0-flash-exp-image-generation
models/gemini-2.0-flash-lite-001
models/gemini-2.0-flash-lite
models/gemini-2.0-flash-preview-image-generation
models/gemini-2.0-flash-lite-preview-02-05
models/gemini-2.0-flash-lite-preview
models/gemini-2.0-pro-exp
models/gemini-2.0-pro-exp-02-05
models/gemini-exp-1206
models/gemini-2.0-flash-thinking-exp-01-21
models/gemini-2.0-flash-thinking-exp
models/ge




In [10]:
import os
import google.generativeai as genai
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage

# --- Read your Gemini API key from the environment ---
# NOTE: It is not secure to hardcode API keys. Set GOOGLE_API_KEY before running this cell.
API_KEY = os.environ["GOOGLE_API_KEY"]

# 1️⃣ List available models using the native library
# First, configure the native Google Generative AI library with your API key.
genai.configure(api_key=API_KEY)

print("📋 Listing available models:")
try:
    # Use a list comprehension to get a list of models that support the generateContent method
    available_models = [m.name for m in genai.list_models() if 'generateContent' in m.supported_generation_methods]
    for model in available_models:
        print(" -", model)
except Exception as e:
    print(f"❌ Error listing models: {e}")
    # Handle the case where the API key might not be valid for this operation.

print("-" * 30)

# 2️⃣ Test a simple content generation with a free-tier compatible model
# This uses the LangChain wrapper.
test_model = "gemini-1.5-flash-8b"  # A valid model for the free tier.
llm_test = ChatGoogleGenerativeAI(
    model=test_model,
    temperature=0.7,
    max_output_tokens=200
)

# Simple test query using the LangChain `invoke` method for a single message.
query = "Explain the water cycle in simple terms."

try:
    response = llm_test.invoke(query)

    # Print the output from the LangChain response object
    print("\n💬 Generated Response:")
    print(response.content)

except Exception as e:
    print(f"❌ Error during content generation: {e}")

📋 Listing available models:
 - models/gemini-1.5-pro-latest
 - models/gemini-1.5-pro-002
 - models/gemini-1.5-pro
 - models/gemini-1.5-flash-latest
 - models/gemini-1.5-flash
 - models/gemini-1.5-flash-002
 - models/gemini-1.5-flash-8b
 - models/gemini-1.5-flash-8b-001
 - models/gemini-1.5-flash-8b-latest
 - models/gemini-2.5-pro-preview-03-25
 - models/gemini-2.5-flash-preview-05-20
 - models/gemini-2.5-flash
 - models/gemini-2.5-flash-lite-preview-06-17
 - models/gemini-2.5-pro-preview-05-06
 - models/gemini-2.5-pro-preview-06-05
 - models/gemini-2.5-pro
 - models/gemini-2.0-flash-exp
 - models/gemini-2.0-flash
 - models/gemini-2.0-flash-001
 - models/gemini-2.0-flash-exp-image-generation
 - models/gemini-2.0-flash-lite-001
 - models/gemini-2.0-flash-lite
 - models/gemini-2.0-flash-preview-image-generation
 - models/gemini-2.0-flash-lite-preview-02-05
 - models/gemini-2.0-flash-lite-preview
 - models/gemini-2.0-pro-exp
 - models/gemini-2.0-pro-exp-02-05
 - models/gemini-exp-1206
 - m

The multi-agent version, with a separate agent that explains the figures:

In [None]:
import os
import json
from langgraph.graph import StateGraph
from langgraph.types import Command
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain_core.documents import Document
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain_google_genai import ChatGoogleGenerativeAI

# -----------------------------
# Configuration
# -----------------------------
VECTOR_DB_DIR = "data/faiss_vectorstore_final"
CHUNKS_FILE = "merged_chunks_with_figures.json"
FIGURES_FILE = "chapter_figures.json"
TOP_K = 5

GOOGLE_API_KEY = os.environ["GOOGLE_API_KEY"]  # set this variable before running the cell

EMBEDDING_MODEL = "thenlper/gte-large"
DEVICE = "cuda"

# -----------------------------
# Load Chunks & Figures
# -----------------------------
with open(CHUNKS_FILE, "r", encoding="utf-8") as f:
    chunks = json.load(f)
subchapter_to_chunk = {c.get("subchapter", "").strip(): c for c in chunks}

with open(FIGURES_FILE, "r", encoding="utf-8") as f:
    figures_data = json.load(f)
# Map: chapter -> list of figure dicts
chapter_to_figures = {f['chapter']: f['figures'] for f in figures_data}

# -----------------------------
# Initialize Embeddings
# -----------------------------
embeddings = HuggingFaceBgeEmbeddings(
    model_name=EMBEDDING_MODEL,
    model_kwargs={"device": DEVICE},
    encode_kwargs={"normalize_embeddings": True}
)

# -----------------------------
# Load FAISS
# -----------------------------
def load_faiss_indexes(vector_dir):
    try:
        vectorstore_subchapters = FAISS.load_local(
            os.path.join(vector_dir, "subchapter_faiss_index"),
            embeddings,
            allow_dangerous_deserialization=True
        )
        vectorstore_content = FAISS.load_local(
            os.path.join(vector_dir, "content_faiss_index"),
            embeddings,
            allow_dangerous_deserialization=True
        )
        return vectorstore_subchapters, vectorstore_content
    except Exception as e:
        raise RuntimeError(f"FAISS load error: {e}")

vectorstore_subchapters, vectorstore_content = load_faiss_indexes(VECTOR_DB_DIR)

# -----------------------------
# Initialize Gemini LLM
# -----------------------------
llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash-8b",  # free-tier
    temperature=0.7,
    max_tokens=2000,
    api_key=GOOGLE_API_KEY
)

# -----------------------------
# Prompt Templates
# -----------------------------
lesson_prompt_template = """
You are an 8th-grade science teacher.
Content: {retrieved_content}
Figures: {figures}

Generate a lesson:
1. Funny intro story
2. Explain concepts with examples
3. Embed figures using <img src='...' alt='Figure'>
4. Key takeaways
5. Summary paragraph
"""

figure_prompt_template = """
You are a teacher explaining figures to students.
Figure Descriptions: {figure_descriptions}

Explain each figure briefly and clearly in 2-3 sentences.
"""

lesson_prompt = PromptTemplate(
    input_variables=["retrieved_content", "figures"],
    template=lesson_prompt_template
)

figure_prompt = PromptTemplate(
    input_variables=["figure_descriptions"],
    template=figure_prompt_template
)

# -----------------------------
# Agent Functions
# -----------------------------
def main_agent(query: str):
    # 1️⃣ Retrieve top subchapter
    sub_results = vectorstore_subchapters.similarity_search(query, k=1)
    if not sub_results:
        print("⚠️ No matching subchapter found.")
        return
    top_subchapter = sub_results[0].page_content.strip()
    print(f"Top Subchapter: {top_subchapter}")

    # 2️⃣ Retrieve content chunks
    chunk_doc = subchapter_to_chunk.get(top_subchapter)
    if not chunk_doc:
        print(f"❌ No chunk found for subchapter '{top_subchapter}'")
        return

    content_results = vectorstore_content.similarity_search(query, k=TOP_K)
    combined_texts = [doc.page_content for doc in content_results]
    retrieved_content = "\n\n".join(combined_texts)

    # 3️⃣ Check if figures exist, call figure_description_agent
    figures_list = chunk_doc.get("figures", [])
    figures_str = ""
    if figures_list:
        figures_str = figure_description_agent(top_subchapter, figures_list)

    # 4️⃣ Generate lesson via LLM
    # `prompt | llm` replaces the deprecated LLMChain(...).run(...) call.
    chain = lesson_prompt | llm
    lesson = chain.invoke({"retrieved_content": retrieved_content, "figures": figures_str}).content
    print("\n--- Generated Lesson ---\n")
    print(lesson)

def figure_description_agent(subchapter, figures_list):
    # Map figure names to descriptions
    chapter_figures = chapter_to_figures.get(subchapter.split()[0], [])
    desc_map = {f["name"]: f.get("desc", "") for f in chapter_figures}

    figure_descriptions = []
    for fig in figures_list:
        # Chunk figures may be dicts such as {"figure": ..., "desc": ...}
        # (as in the search output shown earlier) or plain figure names.
        if isinstance(fig, dict):
            fig_name = fig.get("figure", "")
            desc = fig.get("desc") or desc_map.get(fig_name, "No description available")
        else:
            fig_name = fig
            desc = desc_map.get(fig_name, "No description available")
        figure_descriptions.append(f"{fig_name}: {desc}")

    figure_text = "\n".join(figure_descriptions)

    # Generate explanation using LLM
    chain = figure_prompt | llm
    return chain.invoke({"figure_descriptions": figure_text}).content

# -----------------------------
# Entry Point
# -----------------------------
# No LangGraph graph is built yet: this is the simplest orchestration,
# which you can later expand with StateGraph states and handoffs.
def run_aira():
    user_query = input("Enter subchapter/topic: ").strip()
    main_agent(user_query)

if __name__ == "__main__":
    run_aira()
