This notebook accompanies the SMART Workshop (Djerba, Dec 2024). Part I implements a retrieval-augmented generation (RAG) system from scratch, and Part II introduces a simple agentic system.


# Part I: RAG from scratch


In [None]:
import google.generativeai as genai

genai.configure(api_key="your_api_key")
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("tell me about Djerba")
print(response.text)

Djerba (or Jerba) is an island off the coast of Tunisia in North Africa, located in the Gulf of Gabès.  It's the largest island in North Africa and is known for its distinct culture, beautiful beaches, and rich history. Here's a breakdown of what makes Djerba special:

**Key Features:**

* **Beaches:** Djerba boasts stunning beaches, renowned for their fine, white sand and turquoise waters.  Many resorts cater to tourists seeking sun, sea, and sand holidays.  The island is a popular destination for beach vacations, especially for Europeans.

* **Culture:** Djerba has a unique blend of Berber, Arab, and Jewish cultures, creating a fascinating tapestry of traditions.  The island's history is marked by the peaceful coexistence of these groups, though unfortunately, many Jewish residents emigrated following Tunisian independence. The island's distinct architecture, cuisine, and customs reflect this multicultural heritage.  You'll see influences reflected in the architecture, clothing, food

In [None]:
! pip install langchain langchain-chroma langchain-google-genai

`(2) LangSmith`

https://docs.smith.langchain.com/

In [None]:
import os
os.environ['LANGCHAIN_API_KEY'] = 'your_api_key'
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
os.environ['GOOGLE_API_KEY'] = 'your_api_key'
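
Hard-coding keys in a notebook makes them easy to leak when the notebook is shared. As a minimal sketch, assuming the key is either already exported in the environment or typed in interactively, a small helper can fill in the variable only when it is missing. The name `load_key` and its `prompt_fn` parameter are hypothetical, not part of any library.

```python
import os
from getpass import getpass

def load_key(name, prompt_fn=getpass):
    """Return the value of environment variable `name`, prompting once if it is missing."""
    value = os.environ.get(name)
    if not value:
        # Ask the user without echoing the key, then cache it for later cells
        value = prompt_fn(f"Enter {name}: ")
        os.environ[name] = value
    return value
```

Calling `load_key("GOOGLE_API_KEY")` and `load_key("LANGCHAIN_API_KEY")` before the cells that need them would then replace the hard-coded assignments.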

Imports

In [None]:
from langchain import hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain.schema import Document
from langchain_core.vectorstores import InMemoryVectorStore



Loading data

In [None]:
import pandas as pd

#TODO add file name
#TODO add separator type
df = pd.read_csv('your_file_name.csv', sep='separator_type')  # Load the CSV file

print(df['Abstract'].head(2)) #print first two abstracts

0    Wireless sensor network (WSN) plays an importa...
1    No matter when, transportation construction is...
Name: Abstract, dtype: object


Split the text into chunks

This step is not needed in this example, because the data is already divided: each row is a separate element.

In [None]:
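from langchain.text_splitter import RecursiveCharacterTextSplitter

# split_documents expects a list of Document objects, so wrap each abstract first
documents = [Document(page_content=abstract) for abstract in df['Abstract']]

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,  # Size of each chunk
    chunk_overlap=50  # Overlap between chunks
)
chunks = text_splitter.split_documents(documents)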
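
To see what `chunk_size` and `chunk_overlap` mean, here is a simplified sketch that cuts a string into fixed-size windows sharing a few characters with their neighbours. It is an illustration only: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences, which this hypothetical `sliding_chunks` function does not do.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=50):
    """Cut `text` into windows of `chunk_size` characters that overlap by `chunk_overlap`."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

sample = "abcdefghij" * 3  # 30 characters
pieces = sliding_chunks(sample, chunk_size=12, chunk_overlap=4)
for piece in pieces:
    print(piece)
```

Each piece starts with the last four characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.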

Create and populate a vector database

In [None]:
from langchain_chroma import Chroma

# Create a Chroma database.
# Here we use the Google AI embedding model.

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

#TODO add embeddings
vector_store = Chroma(
    collection_name="abstract_collection",
    embedding_function=your_embeddings,
    persist_directory="./chroma_langchain_db",  # Where to save data locally, remove if not necessary
)

# Since the elements are already split, we put each abstract in its own document

docs = [Document(page_content=abstract) for abstract in df['Abstract']]

vector_store.add_documents(documents=docs)

Search for an element in the database using similarity search

In [None]:
#TODO add your question
results = vector_store.similarity_search(
    "add_your_question",
    k=2,
)
print(results)

[Document(metadata={}, page_content='Wireless sensor network (WSN) plays an important role in information collection and processing in Intelligent Transportation Systems (ITS) recently. With rapid development of ITS, the size and complexity of WSN is rapidly increasing. Thus, how to evaluate the reliability of large-scale WSN has attracted more and more attention. In this article, reliability is defined as the probability that the WSN is functional. A confident information coverage (CIC) model-based reliability algorithm (CICRA) is proposed to comprehensively consider coverage reliability and connectivity reliability. Especially, to determine the impact of failing nodes on connectivity in large-scale WSN, a grid clustering connectivity algorithm (GCCA) is proposed to reduce the complexity of computing the connectivity between large-scale wireless sensor nodes, which transforms the connectivity problem between nodes into a grid connectivity problem. In addition, a 3-order tensor modelin
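
Conceptually, a similarity search embeds the question as a vector, scores it against every stored vector, and returns the `k` best matches. The sketch below does this with cosine similarity on hand-made three-dimensional vectors. Both the vectors and the helper names are made up for illustration, and cosine similarity is only one common choice: the metric Chroma actually applies depends on how the collection is configured.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(reverse=True)  # highest similarity first
    return [idx for _, idx in scored[:k]]

# Toy "embeddings": documents 0 and 1 point in nearly the same direction as the query
toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
best = top_k([1.0, 0.05, 0.0], toy_vectors, k=2)
print(best)
```

A real vector store avoids scoring every document by using an approximate nearest-neighbour index, but the ranking idea is the same.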

In [None]:

#### RETRIEVAL and GENERATION ####

# Prompt
prompt = hub.pull("rlm/rag-prompt")

# LLM
#TODO add your model
llm = ChatGoogleGenerativeAI(model="your_model", temperature=0.3)

# Post-processing
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


qa_chain = (
    {
        "context": vector_store.as_retriever() | format_docs,
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)

qa_chain.invoke("what is reliability in the context of WSN?")



'Reliability in the context of WSN is defined as the probability that the WSN is functional. It is a measure of the ability of the WSN to perform its intended function under specified conditions. Reliability is an important aspect of WSN design, as it affects the overall performance and dependability of the system.'
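
The chain above can be read as four plain steps: retrieve passages, join them into a context string, fill a prompt template, and send the prompt to the model. The sketch below mirrors those steps with stand-in functions so that it runs without an API key. Every name in it is hypothetical, and the prompt wording is illustrative rather than the actual `rlm/rag-prompt` template.

```python
def fake_retriever(question):
    """Stand-in for vector_store.as_retriever(): returns hard-coded passages."""
    return ["Reliability is the probability that the WSN is functional."]

def format_passages(passages):
    """Join retrieved passages into one context string, as format_docs does."""
    return "\n\n".join(passages)

def build_prompt(inputs):
    """Fill an illustrative prompt template with the context and the question."""
    return f"Context:\n{inputs['context']}\n\nQuestion: {inputs['question']}\nAnswer:"

def fake_llm(prompt_text):
    """Stand-in for the chat model: reports the prompt size instead of generating text."""
    return f"(model would answer using a {len(prompt_text)}-character prompt)"

def run_chain(question):
    # Same shape as the dict at the head of qa_chain: the question feeds both keys
    inputs = {
        "context": format_passages(fake_retriever(question)),
        "question": question,
    }
    return fake_llm(build_prompt(inputs))

print(run_chain("what is reliability in the context of WSN?"))
```

In the LangChain version, the `|` operator composes these same steps, and `RunnablePassthrough()` plays the role of passing the question through unchanged.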

Adding a web interface

In [None]:
!pip install gradio


In [None]:
import gradio as gr

def rag_search(question):
    return qa_chain.invoke(question)

#TODO add function
demo = gr.Interface(fn=your_function, inputs="text", outputs="text")
demo.launch()

Running Gradio in a Colab notebook requires sharing enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://7b4265991b5d78aeba.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




# Part II: Creating agents

In [None]:
!pip install colorama



In [None]:
from colorama import Fore, Style

# Step 1: define an agent (reuses `genai`, imported and configured in Part I)
def create_agent(model_name):
    return genai.GenerativeModel(model_name)

# Step 2: initialize two agents
content_generator = create_agent(model_name="gemini-1.5-flash")
content_reviewer = create_agent(model_name="gemini-1.5-flash")  # Both agents use the same model here


# Step 3: run a conversation loop
def two_agents_conversation(agent1, agent2, initial_message, rounds=5):
    print("Initial Message:", initial_message)
    comments = "no comments"

    for i in range(rounds):
        print(f"\nRound {i + 1}:")

        # Agent 1 drafts a plan, taking the reviewer's latest comments into account
        response1 = agent1.generate_content(
            "You are a trip planner. Generate a plan based on the following request: " + initial_message
            + " while considering the following comments: " + comments
            + ". Your answer should be no longer than 10 lines.")
        print(Fore.BLUE + "Agent 1:", response1.text + Style.RESET_ALL)

        # Agent 2 reviews Agent 1's plan against the original request
        response2 = agent2.generate_content(
            "Consider the following request: " + initial_message
            + ". What do you think about the following plan? Only generate comments for improvements. "
            + "Make your answer no longer than 5 lines. " + response1.text)
        print(Fore.RED + "Agent 2:", response2.text + Style.RESET_ALL)

        # Carry the reviewer's comments into the next round
        comments = response2.text

# Step 4: start the conversation
initial_message = "three days in Tunisia"
two_agents_conversation(content_generator, content_reviewer, initial_message)



Initial Message: three days in Tunisia

Round 1:
Agent 1: Day 1: Explore Tunis Medina (old city) and the Bardo Museum.

Day 2:  Travel to Sidi Bou Said, a charming blue-and-white village, then onto Carthage for historical sites.

Day 3:  Desert excursion to the Sahara (optional, requires early start) or explore the coastal town of Hammamet.  Enjoy a final Tunisian dinner.

Agent 2: Day 3's Sahara trip is ambitious for a single day; consider a shorter desert experience or focusing solely on Hammamet.  Tunis Medina could easily take longer than a single day. Prioritize must-see sights within the time constraints.  Factor in travel time between locations, especially for Day 2.  Pre-book transportation, especially for the desert excursion.


Round 2:
Agent 1: **Day 1:** Tunis Medina (morning/afternoon), Carthage ruins (afternoon/evening).

**Day 2:** Hammamet (full day): beach, medina, optional boat trip.  Pre-book transport.

**Day 3:**  Hammamet: Relax, explore further, or
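
The generator and reviewer pattern used in this part does not depend on any particular model. As a rough sketch, with hypothetical stand-in functions replacing the two Gemini agents so that it runs offline, the feedback loop reduces to the following:

```python
def generator(request, comments):
    """Hypothetical stand-in for the planner agent."""
    return f"plan for '{request}' (addressing: {comments})"

def reviewer(request, plan):
    """Hypothetical stand-in for the reviewer agent."""
    return f"shorten the plan ({len(plan)} characters)"

def feedback_loop(generate, review, request, rounds=3):
    """Alternate between drafting and reviewing, feeding each review into the next draft."""
    comments = "no comments"
    history = []
    for _ in range(rounds):
        plan = generate(request, comments)
        comments = review(request, plan)
        history.append((plan, comments))
    return history

for plan, comments in feedback_loop(generator, reviewer, "three days in Tunisia", rounds=2):
    print("Agent 1:", plan)
    print("Agent 2:", comments)
```

Because the loop only needs two callables, the stand-ins can be swapped for functions that wrap `generate_content` calls without changing the control flow.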