This notebook is for the SMART Workshop (Djerba, Dec 2024). Part I implements a RAG system, and Part II introduces a simple agentic system.


In [1]:
! pip install langchain langchain-chroma langchain-google-genai



# Part I: RAG From Scratch


In [2]:
import google.generativeai as genai

genai.configure(api_key="your_api_key")
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("tell me about Djerba")
print(response.text)



BadRequest: 400 POST https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?%24alt=json%3Benum-encoding%3Dint: API key not valid. Please pass a valid API key.



`(2) LangSmith`

https://docs.smith.langchain.com/

In [4]:
import os
# Replace the placeholders with your own keys; never share real keys in a notebook
os.environ['LANGCHAIN_API_KEY'] = 'your_langsmith_api_key'
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
os.environ['GOOGLE_API_KEY'] = 'your_google_api_key'

Imports

In [5]:
import bs4
from langchain import hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_google_genai import ChatGoogleGenerativeAI,GoogleGenerativeAIEmbeddings
from langchain.schema import Document
from langchain_core.vectorstores import InMemoryVectorStore



Loading data

In [12]:
import pandas as pd

# Add the correct file name
file_name = "valorant_agents_extended.csv"  # Make sure this is the correct path

# Add the separator type (comma-separated values)
separator_type = ','

# Load the CSV file
df = pd.read_csv(file_name, sep=separator_type, encoding="utf-8")

# Print the first two values from the "Biography" column
print(df['Biography'].head(2))


0    Representing her home country of South Korea, ...
1    Hailing from the U.K., Phoenix's star power sh...
Name: Biography, dtype: object


Split the text into chunks
This step is not needed in this example because the data is already divided: each row is a separate element.

In [17]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,  # Size of each chunk
    chunk_overlap=50  # Overlap between chunks
)

# Assuming 'df' contains your data and 'Biography' column holds the text you want to split
# Replace 'Biography' with the actual column name if different
documents = [Document(page_content=text) for text in df['Biography']]

chunks = text_splitter.split_documents(documents)
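
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a fixed-window splitter. It is a simplification for illustration only: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences, which this sketch does not do. The function name `split_with_overlap` and the sample string are hypothetical.

```python
def split_with_overlap(text, chunk_size=500, chunk_overlap=50):
    """Cut text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so a sentence
    that straddles a boundary still appears whole in at least one chunk
    (as long as it is shorter than the overlap).
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(demo_chunks)
```

Each chunk starts 7 characters after the previous one (10 minus 3), so the last 3 characters of one chunk are repeated at the start of the next.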

Create and populate a vector database

In [18]:
from langchain_chroma import Chroma

# Create a Chroma database.
# Here we use the Google AI embedding model.

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

vector_store = Chroma(
    collection_name="abstract_collection",
    embedding_function=embeddings,
    persist_directory="./chroma_langchain_db",  # Where to save data locally, remove if not necessary
)

# Since the rows are already split, we put every biography in its own document

docs = [Document(page_content=biography) for biography in df['Biography']]

vector_store.add_documents(documents=docs)

['e37ac9ee-b45d-4f22-ae57-a7084fdce6f7',
 '34592c86-3eb3-4258-b7ff-4060d0542ae0',
 '564e67a8-1948-4d30-aa4c-78014b965655',
 'fd17c891-2a31-4d3a-a1f6-c87d0110e900',
 '23a78b7d-b5b6-4b29-ac5c-b89e0ee5063e',
 '1c884b41-2ad5-4dae-a8d4-e763d88d448f',
 '0f7623fd-919d-44df-91dc-c3c9f1f70603',
 '0d903b0f-bd82-44f3-9e16-79d5ed42887a',
 '62198c86-3187-4db8-bf84-801ce13e6295',
 '7c2ce7b6-11f0-48ec-b401-c880bf1539cf',
 '78a58800-ec20-4689-aabb-c30a5326b23f',
 '835d0288-902b-46e2-ade7-2966a9a351a5',
 'b5a6afc1-033f-4194-9780-8874003502a1',
 '03636dbc-77c0-44c4-b944-fac98e1ac536',
 '2d8544d2-1cf1-44dd-aca2-4e8a441a5d10',
 '5f8ba10c-2339-4e73-8331-4df7b612ba47',
 '6a95a8e2-ae60-4f00-a6d7-94f1c92f05a7',
 'be1ebb73-3fd9-4ffa-8ce2-a6ac6f6be037',
 'b9c9fbcc-3b6b-4a14-9c3f-ca47fe421929',
 'dd24d613-a333-46b5-a0dc-f4fb006253be',
 '5f1e476c-0389-4d73-84ca-e1198cba632a',
 'fb47476a-29aa-4240-900c-15e8cf946399',
 '11b04370-76c4-4791-9d80-20ade9456be8',
 '1b11cc59-c009-4914-b512-4976e52a9218']

Search for an element in the database using similarity search

In [21]:
# Retrieve the two documents closest to the question
results = vector_store.similarity_search(
    "abilites of sage ?",
    k=2,
)
print(results)

[Document(id='564e67a8-1948-4d30-aa4c-78014b965655', metadata={}, page_content='The stronghold of China, Sage creates safety for herself and her team wherever she goes.'), Document(id='7c2ce7b6-11f0-48ec-b401-c880bf1539cf', metadata={}, page_content='Raze explodes out of Brazil with her big personality and big guns. With her blunt-force-trauma playstyle, she excels at flushing entrenched enemies.')]
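
Under the hood, similarity search compares the embedding of the question with the embedding of every stored document and returns the closest ones. The sketch below illustrates the idea with cosine similarity on tiny hand-made vectors. The vectors, the names `cosine_similarity` and `top_k`, and the choice of cosine similarity are assumptions made for illustration; real embeddings have hundreds of dimensions, and the metric Chroma actually uses depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy 3-dimensional "embeddings" (made up for this example)
toy_vectors = {
    "sage": [0.9, 0.1, 0.0],
    "raze": [0.6, 0.6, 0.2],
    "reyna": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.2, 0.0]

nearest = top_k(toy_query, toy_vectors, k=2)
print(nearest)
```

The `k` argument plays the same role as `k=2` in `similarity_search`: it caps how many neighbours are returned, even if the later ones are only loosely related to the question.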


In [25]:

#### RETRIEVAL and GENERATION ####

# Prompt
prompt = hub.pull("rlm/rag-prompt")

# LLM
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.3)

# Post-processing
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


qa_chain = (
    {
        "context": vector_store.as_retriever() | format_docs,
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)

qa_chain.invoke("what is the role of reyna")



'Reyna excels in single combat, gaining power with each kill.  She is a dominant force in one-on-one fights.  Her origin is Mexico.'
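
The chain above can be read as three plain steps: retrieve passages, paste them into a prompt together with the question, and send that prompt to the model. The sketch below spells those steps out without LangChain so it runs offline. Everything in it is hypothetical: `PROMPT_TEMPLATE` is a stand-in and not the text of `rlm/rag-prompt`, and `stub_retrieve` and `stub_generate` replace the vector store and the LLM.

```python
def join_passages(passages):
    """Same idea as format_docs above, but for plain strings."""
    return "\n\n".join(passages)


# Stand-in template (an assumption; the notebook pulls its prompt from the hub)
PROMPT_TEMPLATE = (
    "Use the context below to answer the question.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def answer_question(question, retrieve, generate):
    """Retrieve passages, build the prompt, and pass it to the generator."""
    passages = retrieve(question)
    prompt = PROMPT_TEMPLATE.format(
        context=join_passages(passages),
        question=question,
    )
    return generate(prompt)


def stub_retrieve(question):
    # Stands in for vector_store.as_retriever(); always returns the same passages
    return ["Reyna excels in single combat.",
            "Sage creates safety for her team."]


def stub_generate(prompt):
    # Stands in for the LLM; echoes the prompt so it can be inspected
    return prompt


rag_prompt = answer_question("what is the role of reyna", stub_retrieve, stub_generate)
print(rag_prompt)
```

Printing the prompt this way is a convenient check when an answer looks wrong: it shows whether the problem lies in retrieval (irrelevant context) or in generation.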

Adding a web interface

In [26]:
!pip install gradio


In [27]:
import gradio as gr

def rag_search(question):
    return qa_chain.invoke(question)

# Expose the RAG chain through a simple text-in, text-out web interface
demo = gr.Interface(fn=rag_search, inputs="text", outputs="text")
demo.launch()

Running Gradio in a Colab notebook requires sharing enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://a6ac31b37a0fd92d2d.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




# Part II: Creating agents

In [None]:
!pip install colorama



In [None]:
import google.generativeai as genai
from colorama import Fore, Style

# Step 1: define an agent
def create_agent(model_name):
    return genai.GenerativeModel(model_name)

# Step 2: Initialize two agents
ContentGenerator = create_agent(model_name="gemini-1.5-flash")
contentReviewer = create_agent(model_name="gemini-1.5-flash")  # Both agents use the same model here


# Step 3: Start a conversation loop
def two_agents_conversation(agent1, agent2, initial_message, rounds=5):
    print("Initial Message:", initial_message)
    comments = "no comments"

    for i in range(rounds):
        print(f"\nRound {i + 1}:")

        # Agent 1 drafts a plan, taking the latest review comments into account
        response1 = agent1.generate_content("You are a trip planner. Generate a plan based on the following request: " + initial_message
                                            + " while considering the following comments: " + comments
                                            + ". Your answer should be no longer than 10 lines.")
        print(Fore.BLUE + "Agent 1:", response1.text, Style.RESET_ALL)

        # Pass Agent 1's plan to Agent 2 for review
        response2 = agent2.generate_content("Consider the following request: " + initial_message
                                            + ". What do you think about the following plan? Only generate comments for improvements. "
                                            + "Make your answer no longer than 5 lines. " + response1.text)
        print(Fore.RED + "Agent 2:", response2.text, Style.RESET_ALL)

        # Feed the reviewer's comments into the next round
        comments = response2.text

# Step 4: Start the conversation
initial_message = "three days in Tunisia"
two_agents_conversation(ContentGenerator, contentReviewer, initial_message)



Initial Message: three days in Tunisia

Round 1:
Agent 1: Day 1: Explore Tunis Medina (old city) and the Bardo Museum.

Day 2:  Travel to Sidi Bou Said, a charming blue-and-white village, then onto Carthage for historical sites.

Day 3:  Desert excursion to the Sahara (optional, requires early start) or explore the coastal town of Hammamet.  Enjoy a final Tunisian dinner.

Agent 2: Day 3's Sahara trip is ambitious for a single day; consider a shorter desert experience or focusing solely on Hammamet.  Tunis Medina could easily take longer than a single day. Prioritize must-see sights within the time constraints.  Factor in travel time between locations, especially for Day 2.  Pre-book transportation, especially for the desert excursion.


Round 2:
Agent 1: **Day 1:** Tunis Medina (morning/afternoon), Carthage ruins (afternoon/evening).

**Day 2:** Hammamet (full day): beach, medina, optional boat trip.  Pre-book transport.

**Day 3:**  Hammamet: Relax, explore further, or