In [1]:
! pip install numpy==1.26.4 ipykernel phidata openai ipywidgets duckduckgo-search yfinance crawl4ai lancedb sentence-transformers torch pypdf chromadb duckdb




The code in the next cell performs the following actions:

1. **Importing Modules**:
    - `os`: Provides a way to interact with the operating system.
    - `load_dotenv` from `dotenv`: Loads environment variables from a `.env` file into the process environment.

2. **Loading Environment Variables**:
    - `load_dotenv()`: Reads the `.env` file and loads the variables into the environment.

3. **Clearing Conflicting Environment Variables (commented out)**:
    - A commented-out block defines a list of environment variables (`OPENAI_API_KEY`, `OPENAI_BASE_URL`, `OPENAI_API_BASE`), removes each one that exists, and prints a warning for each removal. Uncomment it if stale values cause conflicts.

4. **Setting New Environment Variables**:
    - `OPENAI_API_KEY` is set to the value of `OPEN_ROUTER_KEY` from the environment.
    - `OPENAI_API_BASE` and `OPENAI_BASE_URL` are both set to `https://openrouter.ai/api/v1`, so OpenAI-compatible clients send their requests to OpenRouter.

This setup routes all OpenAI-compatible calls in the notebook through OpenRouter before any agent is created.
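The same configuration can be wrapped in a small function. This is a minimal sketch, assuming the key is stored under `OPEN_ROUTER_KEY` as in the cell below; the helper name `configure_openai_env` is hypothetical and not part of any library. It fails fast when the key is missing and clears stale values before setting new ones.

```python
import os

OPENROUTER_URL = "https://openrouter.ai/api/v1"


def configure_openai_env(env, key_name="OPEN_ROUTER_KEY", base_url=OPENROUTER_URL):
    """Point OpenAI-compatible clients at base_url using the key stored under key_name."""
    api_key = env.get(key_name)
    if not api_key:
        raise RuntimeError(f"{key_name} is not set; add it to your .env file")
    # Remove stale values first so an old key or base URL cannot leak through.
    for var in ("OPENAI_API_KEY", "OPENAI_BASE_URL", "OPENAI_API_BASE"):
        env.pop(var, None)
    env["OPENAI_API_KEY"] = api_key
    env["OPENAI_BASE_URL"] = base_url
    env["OPENAI_API_BASE"] = base_url
    return env


# In the notebook, call it after load_dotenv():
#     configure_openai_env(os.environ)
```

Taking the mapping as a parameter keeps the function easy to try with a plain dictionary before touching the real environment.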


In [2]:

import os
from dotenv import load_dotenv
# Load environment variables
load_dotenv()


# env_vars_to_clear = ['OPENAI_API_KEY', 'OPENAI_BASE_URL', 'OPENAI_API_BASE']
# for var in env_vars_to_clear:
#     if os.getenv(var):
#         print(f"⚠️  Removing conflicting {var}")
#         del os.environ[var]


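# Fail fast with a clear message if the key is missing from .env;
# assigning None to os.environ would raise an opaque TypeError.
open_router_key = os.getenv("OPEN_ROUTER_KEY")
if not open_router_key:
    raise RuntimeError("OPEN_ROUTER_KEY is not set; add it to your .env file")
os.environ["OPENAI_API_KEY"] = open_router_key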
os.environ['OPENAI_API_BASE'] = 'https://openrouter.ai/api/v1'
os.environ['OPENAI_BASE_URL'] = 'https://openrouter.ai/api/v1'



# os.environ["OPENAI_API_KEY"] = os.getenv("OPEN_AI_KEY")






The code in the next cell performs the following actions:

1. **Importing Modules**:
    - `Agent` from `phi.agent`: Used to create an intelligent agent.
    - `Tool` from `phi.tools`: Used when defining tools for an agent.
    - `OpenAIChat` from `phi.model.openai`: The chat model wrapper used by the agents in this notebook.

The `pip install phidata` line stays commented out because the first cell already installs the library.


In [3]:
# Import the phidata modules for working with agents, tools, and models.
# phidata is already installed by the first cell; uncomment to install it on its own.
# ! pip install phidata
from phi.agent import Agent
from phi.tools import Tool
from phi.model.openai import OpenAIChat


The code in the next cell performs the following actions:

1. **Importing Modules**:
    - `Agent` and `AgentKnowledge` from `phi.agent`: These are used to create an intelligent agent and manage its knowledge base.
    - `SentenceTransformerEmbedder` from `phi.embedder.sentence_transformer`: This is used to generate embeddings for text with a Sentence Transformers model. The `OpenAIEmbedder` import is kept as a commented-out alternative.
    - `LanceDb` and `SearchType` from `phi.vectordb.lancedb`: These are used to manage a vector database for storing and searching embeddings.

2. **Creating a Knowledge Base**:
    - An `AgentKnowledge` object is created with a `LanceDb` vector database.
    - The vector database is configured with:
        - `table_name`: The name of the table in the database (`"custom_knowledge"`).
        - `uri`: The location of the database (`"tmp/lancedb"`).
        - `search_type`: The type of search to perform (`SearchType.vector`).
        - `embedder`: The embedding model to use (`SentenceTransformerEmbedder` with the model `"all-MiniLM-L6-v2"`).

3. **Adding Information to the Knowledge Base**:
    - A sample text (`"The sky is green"`) is loaded into the knowledge base. The statement is deliberately false, so an answer of "green" can only have come from the knowledge base.

4. **Creating an Agent**:
    - An `Agent` object is created with the knowledge base.
    - The agent is given a tool to search the knowledge base when needed (`search_knowledge=True`).

This setup initializes an intelligent agent with a knowledge base that can store and retrieve information using vector embeddings.
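To make the idea of vector search concrete, here is a toy sketch in plain Python. It is not how LanceDB is implemented, and the three-dimensional vectors are hand-made stand-ins for real embeddings; the helper names `cosine_similarity` and `search` are illustrative only. It ranks stored texts by cosine similarity to a query vector and returns the closest one.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, table, limit=1):
    """Return the `limit` texts whose vectors are most similar to query_vec."""
    ranked = sorted(
        table,
        key=lambda row: cosine_similarity(query_vec, row["vector"]),
        reverse=True,
    )
    return [row["text"] for row in ranked[:limit]]


# Hand-made vectors standing in for real embeddings (assumption for illustration).
table = [
    {"text": "The sky is green", "vector": [0.9, 0.1, 0.0]},
    {"text": "Bread needs yeast", "vector": [0.0, 0.2, 0.9]},
]
query_vec = [0.8, 0.2, 0.1]  # pretend embedding of a question about the sky

print(search(query_vec, table))  # ['The sky is green']
```

A real embedder produces vectors with hundreds of dimensions, but the ranking step follows the same principle.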

In [4]:
from phi.agent import Agent, AgentKnowledge
# from phi.embedder.openai import OpenAIEmbedder
from phi.embedder.sentence_transformer import SentenceTransformerEmbedder
from phi.vectordb.lancedb import LanceDb, SearchType

# Create a knowledge base for the Agent
knowledge_base = AgentKnowledge(vector_db=LanceDb(
        table_name="custom_knowledge",
        uri="tmp/lancedb",
        search_type=SearchType.vector,
        # embedder=OpenAIEmbedder(model="text-embedding-3-small"))
        embedder=SentenceTransformerEmbedder(model="all-MiniLM-L6-v2" ) 
    ),)



# Add information to the knowledge base
knowledge_base.load_text("The sky is green")

# Add the knowledge base to the Agent and
# give it a tool to search the knowledge base as needed
agent = Agent(knowledge=knowledge_base, search_knowledge=True)

[2025-12-20T15:29:42Z WARN lance::dataset::write::insert] No existing dataset at /Users/arifshaikh/Documents/GitHub/GenAIEngineering-Cohort3/Week12/Day_1/tmp/lancedb/custom_knowledge.lance, it will be created



The code in the next two cells performs the following actions:

1. **Running the Agent Without a Hint**:
    - The `run` method of the `agent` object is called with the input `"what is the color of sky?"`.
    - The prompt does not mention the knowledge base, so the model may answer from its general knowledge without calling the search tool. In the recorded output it answers "blue".

2. **Running the Agent With an Explicit Instruction**:
    - The `run` method is called again with the input `"answer from the provided knowledge base, what is the color of sky?"`.
    - This time the agent calls `search_knowledge_base`, retrieves the stored text `"The sky is green"`, and answers from it.

Together, the two cells show that `search_knowledge=True` makes the search tool available, but the model still decides whether to use it for a given prompt.
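Whether a run actually consulted the knowledge base can be checked from the response itself. The following is a minimal sketch, assuming the message layout visible in phidata's `RunResponse` repr: a `messages` list whose entries carry a `tool_calls` list of dicts with a `function` entry. The helper `knowledge_queries` is hypothetical, and the `SimpleNamespace` objects are stand-ins for real messages so the example runs without an API call.

```python
import json
from types import SimpleNamespace


def knowledge_queries(messages):
    """Collect the query strings passed to search_knowledge_base tool calls."""
    queries = []
    for message in messages:
        for call in getattr(message, "tool_calls", None) or []:
            function = call.get("function", {})
            if function.get("name") == "search_knowledge_base":
                # Tool arguments arrive as a JSON-encoded string.
                queries.append(json.loads(function["arguments"]).get("query"))
    return queries


# Stand-in messages mirroring the fields shown in a RunResponse repr.
messages = [
    SimpleNamespace(role="user", tool_calls=None),
    SimpleNamespace(
        role="assistant",
        tool_calls=[
            {
                "function": {
                    "name": "search_knowledge_base",
                    "arguments": '{"query": "color of the sky"}',
                },
                "type": "function",
            }
        ],
    ),
]

print(knowledge_queries(messages))  # ['color of the sky']
```

With a live agent, something like `knowledge_queries(agent.run(prompt).messages)` should return an empty list when the model answered without searching.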


In [5]:
agent.run("what is the color of sky?")

RunResponse(content="The color of the sky is typically blue during the daytime. This is because molecules in the Earth's atmosphere scatter sunlight in all directions, and blue light is scattered more than other colors because it travels in shorter, smaller waves. This scattering causes the direct sunlight to appear white, while the scattered light from the sky appears blue.", content_type='str', event='RunResponse', messages=[Message(role='user', content='what is the color of sky?', name=None, tool_call_id=None, tool_calls=None, audio=None, images=None, videos=None, tool_name=None, tool_args=None, tool_call_error=None, stop_after_tool_call=False, metrics={}, references=None, created_at=1766244583), Message(role='assistant', content="The color of the sky is typically blue during the daytime. This is because molecules in the Earth's atmosphere scatter sunlight in all directions, and blue light is scattered more than other colors because it travels in shorter, smaller waves. This scatter

In [6]:
agent.run("answer from the provided knowledge base, what is the color of sky?")

RunResponse(content='According to the knowledge base, the sky is green.', content_type='str', event='RunResponse', messages=[Message(role='user', content='answer from the provided knowledge base, what is the color of sky?', name=None, tool_call_id=None, tool_calls=None, audio=None, images=None, videos=None, tool_name=None, tool_args=None, tool_call_error=None, stop_after_tool_call=False, metrics={}, references=None, created_at=1766244585), Message(role='assistant', content='', name=None, tool_call_id=None, tool_calls=[{'id': 'call_6FG3JJzfliuv0SAwBZ5XDZTn', 'function': {'arguments': '{"query":"color of the sky"}', 'name': 'search_knowledge_base'}, 'type': 'function', 'index': 0}], audio=None, images=None, videos=None, tool_name=None, tool_args=None, tool_call_error=None, stop_after_tool_call=False, metrics={'time': 0.6532571249990724, 'input_tokens': 77, 'prompt_tokens': 77, 'output_tokens': 19, 'completion_tokens': 19, 'total_tokens': 96, 'prompt_tokens_details': {'audio_tokens': 0, '

In [7]:
# from phi.agent import Agent, AgentKnowledge
# from phi.vectordb.lancedb import LanceDb
# from phi.vectordb.search import SearchType
# from phi.embedder.fastembed import FastEmbedEmbedder

# # Create knowledge base with recreate=True
# knowledge_base = AgentKnowledge(
#     vector_db=LanceDb(
#         table_name="fastembed_documents",
#         uri="tmp/lancedb",
#         search_type=SearchType.vector,
#         embedder=FastEmbedEmbedder(),
#     )
# )

# # Add data to the knowledge base
# knowledge_base.load_text("The sky is green")
# # knowledge_base.load_text("Machine learning is a subset of artificial intelligence")

# # Load with recreate=True to rebuild the schema
# # knowledge_base.load(recreate=True)  # This will recreate the table


The code in the next cell performs the following actions:

1. **Creating a Knowledge Base from a PDF**:
    - The `PDFUrlKnowledgeBase` class is used to create a knowledge base from a PDF file.
    - The PDF file is located at the URL `"https://phi-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"`.

2. **Configuring the Vector Database**:
    - The knowledge base uses `LanceDb` as the vector database for storing and searching embeddings.
    - The vector database is configured with:
        - `table_name`: The name of the table in the database (`"recipes"`).
        - `uri`: The location of the database (`"tmp/lancedb"`).
        - `search_type`: The type of search to perform (`SearchType.vector`).
        - `embedder`: The embedding model to use (`SentenceTransformerEmbedder` with the model `"all-MiniLM-L6-v2"`; the `OpenAIEmbedder` line is kept as a commented-out alternative).

3. **Loading the Knowledge Base**:
    - The `load()` method is called to load the PDF content into the knowledge base.
    - After the first run, this call can be commented out to avoid reloading the data.

4. **Creating an Agent**:
    - An `Agent` object is created with the following configurations:
        - `model`: The agent uses the `OpenAIChat` model with the ID `"gpt-4o"`.
        - `knowledge`: The knowledge base created from the PDF is added to the agent.
        - `show_tool_calls`: Enables displaying tool calls made by the agent.
        - `markdown`: Enables markdown formatting for the agent's responses.
        - `stream`: Enables streaming of the agent's responses.

5. **Commented Example Query**:
    - An example query (`"How do I make chicken and galangal in coconut milk soup"`) is provided but commented out.
    - Uncommenting this line allows the agent to generate a response based on the knowledge base.

This code demonstrates how to create a knowledge base from a PDF, configure it with a vector database, and use it with an intelligent agent to answer queries.


In [15]:
from phi.knowledge.pdf import PDFUrlKnowledgeBase

# Create a knowledge base from a PDF
knowledge_base = PDFUrlKnowledgeBase(
    urls=["https://phi-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"],
    # Use LanceDB as the vector database
    vector_db=LanceDb(
        table_name="recipes",
        uri="tmp/lancedb",
        search_type=SearchType.vector,
        # embedder=OpenAIEmbedder(model="text-embedding-3-small"),
        embedder=SentenceTransformerEmbedder(model="all-MiniLM-L6-v2" ) 
    ),
)
# Comment out after first run as the knowledge base is loaded
knowledge_base.load()

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    # Add the knowledge base to the agent
    knowledge=knowledge_base,
    show_tool_calls=True,
    markdown=True,  
    stream=True
)
# agent.print_response("How do I make chicken and galangal in coconut milk soup")


The code in the next cell performs the following actions:

1. **Importing Modules**:
    - `Markdown` and `display` from `IPython.display`: These are used to render and display Markdown content in the Jupyter Notebook.

2. **Running the Agent**:
    - The `run` method of the `agent` object is called with the input `"restrict response from provided knowledge base. How do I make chicken and galangal in coconut milk soup"`.
    - This prompts the agent to search its knowledge base and provide a response based on the stored information.

3. **Displaying the Response**:
    - The response from the agent is stored in the variable `response`.
    - The `content` attribute of the `response` object is rendered as Markdown using the `Markdown` class.
    - The `display` function is used to display the rendered Markdown content in the notebook.

This code demonstrates how to query the agent and display its response in a readable Markdown format within the Jupyter Notebook.


In [16]:
from IPython.display import Markdown, display

response=agent.run("restrict response from provided knowledge base. How do I make chicken and galangal in coconut milk soup")

# print(response.content)
display(Markdown(response.content))


Running:
 - search_knowledge_base(query=chicken and galangal in coconut milk soup recipe)

Here's a recipe for Chicken and Galangal in Coconut Milk Soup (Tom Kha Gai):

### Ingredients (One Serving):
- 150 grams chicken, cut into bite-size pieces
- 50 grams sliced young galangal
- 100 grams lightly crushed lemongrass, julienned
- 100 grams straw mushrooms
- 250 grams coconut milk
- 100 grams chicken stock
- 3 tablespoons lime juice
- 3 tablespoons fish sauce
- 2 leaves kaffir lime, shredded
- 1-2 bird’s eye chilies, pounded
- 3 leaves coriander

### Directions:
1. **Prepare the Base:** Bring the chicken stock and coconut milk to a slow boil.
2. **Add Ingredients:** Add galangal, lemongrass, chicken, and mushrooms. 
3. **Season:** Once the soup returns to a boil, season it with fish sauce.
4. **Cook Chicken:** Wait until the chicken is cooked, then add the kaffir lime leaves and bird’s eye chilies.
5. **Finish & Serve:** Remove the pot from heat and add lime juice. Garnish with coriander leaves.

### Tips:
- **Heat Control:** Keep the heat low to prevent the coconut milk oil from separating.
- **Galangal:** Reduce amount if using mature galangal.
- **Lime Juice:** Add after removing from heat for more aroma.
- **Chilies:** Adjust chilies for less spice if needed.

Enjoy your flavorful and aromatic soup!


The code in the next cell performs the following actions:

1. **Importing Modules**:
    - `CSVKnowledgeBase` from `phi.knowledge.csv`: Used to create a knowledge base from a CSV file.
    - `ChromaDb` from `phi.vectordb.chroma`: Used to manage a vector database for storing and searching embeddings.

2. **Creating a Knowledge Base**:
    - A `CSVKnowledgeBase` object is created with the following configurations:
        - `path`: Specifies where to look for CSV data (`'.'`, the current directory).
        - `vector_db`: Configures the vector database using `ChromaDb` with the collection name `"imdb_csv"`.

This setup initializes a knowledge base from a CSV file and configures it with a vector database for efficient storage and retrieval of embeddings.


In [18]:
from phi.knowledge.csv import CSVKnowledgeBase
from phi.vectordb.chroma import ChromaDb

knowledge_base = CSVKnowledgeBase(path='.', vector_db=ChromaDb(collection="imdb_csv"))
print(knowledge_base)


reader=CSVReader(chunk=True, chunk_size=3000, separators=['\n', '\n\n', '\r', '\r\n', '\n\r', '\t', ' ', '  '], chunking_strategy=<phi.document.chunking.fixed.FixedSizeChunking object at 0x38ae3f5f0>) vector_db=<phi.vectordb.chroma.chromadb.ChromaDb object at 0x38d7b6710> num_documents=5 optimize_on=1000 driver='knowledge' chunking_strategy=<phi.document.chunking.fixed.FixedSizeChunking object at 0x38ae3f5f0> path='.'
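The printed configuration mentions `chunk=True`, `chunk_size=3000`, and a `FixedSizeChunking` strategy. As a rough illustration of what fixed-size chunking means, here is a minimal sketch; it is an assumption-laden toy, not phidata's implementation, and it ignores the separators listed in the output. The function name `fixed_size_chunks` is hypothetical.

```python
def fixed_size_chunks(text, chunk_size=3000):
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[start:start + chunk_size] for start in range(0, len(text), chunk_size)]


sample = "a" * 7000  # stand-in for the text of one CSV file
chunks = fixed_size_chunks(sample)
print([len(chunk) for chunk in chunks])  # [3000, 3000, 1000]
```

Each chunk is then embedded and stored as its own entry, which is why a long file yields several searchable records.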



The code in the next cell performs the following actions:

1. **Importing Modules**:
    - `Agent` from `phi.agent`: Used to create an intelligent agent.

2. **Creating an Agent**:
    - An `Agent` object is created with the following configurations:
        - `knowledge`: The knowledge base created from the CSV file is added to the agent.
        - `search_knowledge`: Enables the agent to search the knowledge base when needed.

3. **Loading the Knowledge Base**:
    - The `load` method of the `knowledge` attribute of the agent is called with the parameter `recreate=False`.
    - This ensures that the knowledge base is loaded without recreating it, preserving any existing data.

This code demonstrates how to initialize an agent with a CSV-based knowledge base and prepare it for querying.


In [19]:
from phi.agent import Agent


agent = Agent(
    knowledge=knowledge_base,
    search_knowledge=True,
)

agent.knowledge.load(recreate=False)






The code in the next cell performs the following actions:

1. **Querying the Agent**:
    - The `run` method of the `agent` object is called with the input `"List action and Comedy movies"`.
    - This prompts the agent to search its knowledge base and provide a response based on the stored information.

2. **Displaying the Response**:
    - The response from the agent is stored in the variable `response`.
    - The `content` attribute of the `response` object is rendered as Markdown using the `Markdown` class.
    - The `display` function is used to display the rendered Markdown content in the notebook.

This code demonstrates how to query the agent for a specific request and display its response in a readable Markdown format within the Jupyter Notebook.


In [None]:
response=agent.run("List action and Comedy movies")
#response=agent.run("List comedy movies")

display(Markdown(response.content))

Here are some comedy movies you might enjoy:

1. **Disaster Movie (2008)** - A group of young adults are caught in a series of natural disasters and catastrophic events. Directed by Jason Friedberg and starring Carmen Electra and Vanessa Lachey.

2. **Diary of a Wimpy Kid: Dog Days (2012)** - Greg is out of school for the summer and is looking forward to a memorable vacation. Directed by David Bowers and starring Zachary Gordon and Steve Zahn.

3. **Fool's Gold (2008)** - A married couple on the brink of divorce is brought back together by the hunt for a lost treasure. Directed by Andy Tennant and starring Matthew McConaughey and Kate Hudson.

4. **The Dictator (2012)** - Follows the heroic story of a dictator who will do anything to avoid democracy. Directed by Larry Charles and starring Sacha Baron Cohen and Anna Faris.

5. **You Don't Mess with the Zohan (2008)** - An Israeli Special Forces Soldier fakes his death to live his dream of becoming a New York hairstylist. Directed by Dennis Dugan and starring Adam Sandler and John Turturro.

6. **Horrible Bosses 2 (2014)** - After failing with a startup business, three friends plan a loaded kidnapping scheme. Directed by Sean Anders and starring Jason Bateman and Charlie Day.

7. **The Princess and the Frog (2009)** - A waitress, who dreams of opening a restaurant, must turn a frog back into a prince. Directed by Ron Clements and starring Anika Noni Rose and Keith David.

8. **3 Idiots (2009)** - A story about two friends who are trying to locate their long-lost college buddy, who inspired them in different ways. Directed by Rajkumar Hirani and starring Aamir Khan and Madhavan.

9. **Zoolander 2 (2016)** - Derek and Hansel are lured back to the fashion world but face a conspiracy. Directed by Ben Stiller and starring Owen Wilson and Penélope Cruz.

10. **Folk Hero & Funny Guy (2016)** - A singer-songwriter helps his friend's struggling comedy career and personal life by making him his opening act. Directed by Jeff Grace and starring Alex Karpovsky and Wyatt Russell.

These movies offer a range of comedic themes from slapstick to more story-driven comedy. Enjoy your watch!

In [13]:
agent.run("what is the color of sky?")

RunResponse(content="The color of the sky is typically blue. This is due to a phenomenon called Rayleigh scattering, where molecules and small particles in the Earth's atmosphere scatter sunlight. Blue light is scattered more than other colors because it travels in shorter, smaller waves, making the sky appear blue to our eyes.", content_type='str', event='RunResponse', messages=[Message(role='user', content='what is the color of sky?', name=None, tool_call_id=None, tool_calls=None, audio=None, images=None, videos=None, tool_name=None, tool_args=None, tool_call_error=None, stop_after_tool_call=False, metrics={}, references=None, created_at=1766244645), Message(role='assistant', content="The color of the sky is typically blue. This is due to a phenomenon called Rayleigh scattering, where molecules and small particles in the Earth's atmosphere scatter sunlight. Blue light is scattered more than other colors because it travels in shorter, smaller waves, making the sky appear blue to our e

In [14]:
agent.run("answer from the provided knowledge base, what is the color of sky?")

RunResponse(content='The search results did not provide relevant results to the query "color of sky." However, based on known information, the color of the sky is generally blue due to Rayleigh scattering, which is the scattering of sunlight off the molecules of the atmosphere.', content_type='str', event='RunResponse', messages=[Message(role='user', content='answer from the provided knowledge base, what is the color of sky?', name=None, tool_call_id=None, tool_calls=None, audio=None, images=None, videos=None, tool_name=None, tool_args=None, tool_call_error=None, stop_after_tool_call=False, metrics={}, references=None, created_at=1766244647), Message(role='assistant', content='', name=None, tool_call_id=None, tool_calls=[{'id': 'call_Eq3G4R5aqASBeWMPu9670bJE', 'function': {'arguments': '{"query":"color of sky"}', 'name': 'search_knowledge_base'}, 'type': 'function', 'index': 0}], audio=None, images=None, videos=None, tool_name=None, tool_args=None, tool_call_error=None, stop_after_tool