# Introduction to Large Language Models

## What are large language models?

Large language models are advanced artificial intelligence programs that can understand and generate human-like text. They're trained on vast amounts of written material from the internet and other sources, allowing them to process and produce language on a wide range of topics.

While they can produce remarkably human-like responses, they don't truly understand or think like humans do. Instead, they excel at recognizing and replicating language patterns to generate appropriate responses to prompts or questions.

![img](https://www.investopedia.com/thmb/ulGrKT5WnVclGMOgQQVe65OtmeI=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/large-language-model-7563532-final-9e350e9fa02d4685887aa061af7a2de2.png)

## How do LLMs work?

Large language models are trained on massive datasets of text using a process called machine learning. They analyze patterns in this data to understand how language is structured and used. During training, the model is repeatedly shown examples of text and learns to predict what words or phrases are likely to come next. This process involves adjusting millions or even billions of internal parameters to improve its predictions.



As training progresses, the model becomes better at understanding context, grammar, and even subtle nuances in language. Once trained, when given a prompt or question, the model uses its learned patterns to generate relevant and coherent responses. It's like the model has developed a complex map of language, allowing it to navigate and produce human-like text on a vast array of topics.
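
To make the "predict what comes next" idea concrete, here is a toy sketch. It is not how a real LLM works internally: real models use neural networks with millions or billions of parameters, while this example just counts which word follows which in a tiny made-up corpus. The corpus and the function names `train_bigram_model` and `predict_next` are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict

# A tiny, made-up training corpus (illustrative only).
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."


def train_bigram_model(text):
    """Count, for every word, which words follow it and how often."""
    follower_counts = defaultdict(Counter)
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        follower_counts[current_word][next_word] += 1
    return follower_counts


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if the word was never seen."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


model = train_bigram_model(corpus)
print(predict_next(model, "sat"))  # "on" is the only word ever seen after "sat"
print(predict_next(model, "the"))  # "cat" follows "the" more often than any other word
```

The "training" here is just counting, and the "prediction" is a lookup. An LLM does the same job, predicting the next piece of text, but it learns far richer patterns that take the whole preceding context into account.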

![img](https://miro.medium.com/v2/resize:fit:2000/1*faLf-OAINgRAyMyCLyZLvg.png)

## Capabilities and Uses of LLMs

Large language models (LLMs) have a wide range of applications across various fields, thanks to their ability to understand and generate human-like text. These AI-powered tools can assist with numerous language-related tasks, enhancing productivity and creativity in many areas. Here are some key capabilities and use-cases of LLMs:

- Text generation (stories, articles, code)
- Question answering
- Language translation
- Summarization
- Sentiment analysis
- Code completion and generation

These applications demonstrate the versatility of LLMs in handling diverse language-based tasks, from creative writing to technical analysis. As the technology continues to evolve, we can expect to see even more innovative uses emerge in various industries and everyday life.

![img](https://www.baeldung.com/wp-content/uploads/sites/4/2023/05/Foundation-Models.jpg)

## What are the LLMs that are available today?

Several prominent large language models (LLMs) are available today, each with unique features and capabilities.

- **OpenAI GPT series**: Includes GPT-3 and GPT-4, known for versatility and wide-ranging applications.
- **Google Bard (LaMDA)**: Renowned for its conversational abilities.
- **Meta's LLaMA**: Focuses on efficient performance.
- **Claude by Anthropic**: Emphasizes safety and ethical use.
- **Mistral series**: Prioritizes open-source development.







![img](https://miro.medium.com/v2/resize:fit:2000/1*8kymcqfvPfQzfhVF6fUaHA.png)

## Where can I try them out?

You can try chatting with some of these LLMs on the following websites:

- https://claude.ai/
- https://chatgpt.com/
- https://www.meta.ai/

# Introduction to Gemini

Google Gemini is a family of multimodal large language models developed by Google DeepMind, serving as the successor to LaMDA and PaLM 2.



![gemini.jpg](https://upload.wikimedia.org/wikipedia/commons/thumb/8/8a/Google_Gemini_logo.svg/440px-Google_Gemini_logo.svg.png)

## Models available in Gemini family

These are the models available in the Gemini 1.0 family:

- Gemini Ultra (most capable, most expensive, and slowest)
- Gemini Pro (a middle ground in capability, cost, and speed)
- Gemini Nano (designed for on-device use with a low memory footprint)

These are the models available in the Gemini 1.5 family:

- Gemini 1.5 Flash (less capable than Pro, but cheaper and faster)
- Gemini 1.5 Pro



> Try out Gemini at ***https://gemini.google.com/***



# Working with Gemini

Google offers a free-tier API for experimenting with Gemini. We'll be using Gemini to build our chatbot.

Let's get an API key to get started.

## Getting an API key



1. Visit https://ai.google.dev/gemini-api
2. Log in with your personal Google account.
3. Click **"Get API key in Google AI Studio"**.
4. Create a new API key.
5. Copy the key, open **Secrets** in Colab, create a new secret named **GOOGLE_API_KEY**, and paste the key as its value.



## Installing the required dependencies

Run the cell below to install the Python libraries we need to work with Gemini.

In [None]:
%pip install -q llama-index
%pip install -q llama-index-llms-gemini
%pip install -q google-generativeai

## Simple completion

**Large Language Models (LLMs)** can generate text based on a given prompt or input. This is called a completion. Think of it like autofill, but for sentences or even entire conversations! For example, if you type "**The capital of France is**", an LLM can complete it with "**Paris**".



If you start telling a story with "**Once upon a time in a land far, far away**", an LLM can continue the story with its own ideas.

Completions can be used for tasks like writing assistance, or even generating code - the possibilities are endless!

![img](https://storage.googleapis.com/download.tensorflow.org/tflite/examples/autocomplete_fig2.gif)

Let's now try out some completions with Gemini. Uncomment the cell below and try out some completions.

In [None]:
# from llama_index.llms.gemini import Gemini
# from google.colab import userdata

# # Read the API Key from Google colab secrets, and assign it to a variable
# gemini_api_key = userdata.get("GOOGLE_API_KEY")

# # Initialise the Gemini client with the API Key we got in the previous step
# response = Gemini(api_key=gemini_api_key, model="models/gemini-1.5-flash").complete("What national team does Virat Kohli play for ?")
# print(response)

## Chat Completion

Large language models (LLMs) offer advanced chat completion capabilities, enabling highly interactive and dynamic conversations. These models, such as OpenAI's GPT-4, Google's Bard (LaMDA), and Meta's LLaMA, can understand and generate human-like responses, making them ideal for customer support, virtual assistants, and educational tools.

Let's try chatting with Gemini via the API.

In [None]:
# from llama_index.core.llms import ChatMessage
# from llama_index.llms.gemini import Gemini

# # Get the chat history
# messages = [
#     ChatMessage(role="user", content="Hello friend!"),
#     ChatMessage(role="assistant", content="Hi there friend. What can I help you with today ?"),
#     ChatMessage(
#         role="user", content="What is hello in spanish ?"
#     ),
# ]

# # Initialise the Gemini client with API key, and pass along the chat history
# response = Gemini(api_key=gemini_api_key, model="models/gemini-1.5-flash").chat(messages)
# print(response)

## Streaming

Streaming means the model's response is sent to you piece by piece as it is generated, instead of arriving all at once after the full answer is ready. This matters for chatbots, virtual assistants, and translation services, where users expect to see output quickly. Because the first words appear almost immediately, the interaction feels faster and more natural, even when the complete response takes a while to generate.
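
Here is a minimal sketch of the idea, with no LLM involved. A Python generator stands in for a streaming API: it hands back the answer in small chunks, and the caller prints each chunk as soon as it receives it. The `fake_stream` function and its `chunk_size` parameter are hypothetical names used only for this illustration.

```python
def fake_stream(text, chunk_size=8):
    """Yield `text` in small pieces, the way a streaming API yields partial output."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


full_answer = "Streaming sends the answer piece by piece."
received = []
for chunk in fake_stream(full_answer):
    # The caller can display each piece as soon as it arrives.
    print(chunk, end="", flush=True)
    received.append(chunk)
print()
```

Joining all the received chunks gives back the complete answer, so nothing is lost by streaming. The only difference is when the user gets to see each part.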

![img](https://cdn.shopify.com/s/files/1/0779/4361/files/SidekickStreaming-Gif-02-v003.gif?v=1690295248)

Let's now replicate the above experience with Gemini

In [None]:
# import sys
# import time

# from llama_index.llms.gemini import Gemini

# # Initialise the Gemini client with API key
# llm = Gemini(api_key=gemini_api_key, model="models/gemini-1.5-pro")

# # Call the streaming API endpoint
# resp = llm.stream_complete(
#     "Write about Virat Kohli in about 1000 words"
# )
# for r in resp:
#     print(r.text, end="")
#     sys.stdout.flush()
#     time.sleep(0.2)


# Building a simple bot



Based on what we've learned before, let's put together a simple bot with streaming support

In [None]:
# import sys
# import time

# from llama_index.llms.gemini import Gemini

# # Initialise the Gemini client
# llm = Gemini(api_key=gemini_api_key, model="models/gemini-1.5-flash")

# while True:
#     # Collect the query from the user
#     query = input("Enter your query: ")
#     if query == "q":
#         break

#     # Make a streaming API call to get answer to user query
#     resp = llm.stream_complete(query)
#     for r in resp:
#         print(r.text, end="")
#         sys.stdout.flush()
#         time.sleep(0.2)

# Asking queries about real life events

If you ask the bot a question such as "**Who won the T20 World Cup in 2024?**", it will struggle to answer.



1.   It'll either say the event hasn't happened yet
2.   (or) it'll make up an answer that is not factual



This is because

1. ***LLMs do not know your data*** - LLMs are often limited to their pre-trained knowledge and data. Once trained, many LLMs do not have the ability to access data beyond their training data cutoff point.

2. ***Factual grounding and consistency*** - LLMs are powerful tools for generating creative and engaging text, but they can sometimes struggle with factual accuracy. This is because LLMs are trained on massive amounts of text data, which may contain inaccuracies or biases.

Let's now see how to overcome this limitation of LLMs.

In [None]:
# import sys
# import time

# from llama_index.llms.gemini import Gemini

# # Initialising the Gemini Client
# llm = Gemini(api_key=gemini_api_key, model="models/gemini-1.5-flash")

# PROMPT = """
# Here is some information which could help in answering the user question.

# Context:
# {context}

# Using the above information, you can answer the user's question.
# Here is the user's question:

# {user_question}
# """

# def get_context():
#     return "India won the T20 World cup in 2024."


# while True:
#     query = input("Enter your query: ")
#     if query == "q":
#         break

#     # Inject the information into the prompt, we send to the LLM
#     prompt = PROMPT.format(context=get_context(), user_question=query)

#     resp = llm.stream_complete(prompt=prompt)
#     for r in resp:
#         print(r.text, end="")
#         sys.stdout.flush()
#         time.sleep(0.2)

By feeding the LLM the relevant information needed to answer the user's question, the LLM can now answer queries that it previously couldn't, since it wasn't trained on that data.

Let's now ask our bot a question like "**Who were all the players who were part of the Indian cricket team that won the T20 World Cup?**"

Obviously the bot isn't able to answer it. It lacks this information. Let's now add this information as well.

In [None]:
# import sys
# import time

# from llama_index.llms.gemini import Gemini

# # Initialise the Gemini client
# llm = Gemini(api_key=gemini_api_key, model="models/gemini-1.5-flash")

# PROMPT = """
# Here is some information which could help in answering the user question.

# Context:
# {context}

# Using the above information, you can answer the user's question.
# Here is the user's question:

# {user_question}
# """

# def get_context():
#     # Add information about who won the world cup, and the squad details as well.
#     return """
#     India won the 2024 T20 World cup.

#     These were the Indian team players who were part of the squad that won the 2024 T20 World Cup:
#     1. Rohit Sharma
#     2. Virat Kohli
#     3. Rishabh Pant
#     4. Sanju Samson
#     5. Yashasvi Jaiswal
#     6. Suryakumar Yadav
#     7. Shivam Dube
#     8. Hardik Pandya
#     9. Axar Patel
#     10. Ravindra Jadeja
#     11. Jasprit Bumrah
#     12. Mohammed Siraj
#     13. Kuldeep Yadav
#     14. Yuzvendra Chahal
#     15. Arshdeep Singh
#     """


# while True:
#     # Collect the user query
#     query = input("Enter your query: ")
#     if query == "q":
#         break

#     # Inject necessary information into the prompt, which is fed to the LLM
#     prompt = PROMPT.format(context=get_context(), user_question=query)
#     resp = llm.stream_complete(prompt=prompt)
#     for r in resp:
#         print(r.text, end="")
#         sys.stdout.flush()
#         time.sleep(0.2)

If we try asking "**What was the score of the first match that India played in the T20 World Cup?**", the bot will not be able to answer.

As usual, we'd have to add that information to the prompt. But LLMs have a limited context length, so we can't keep passing all of this information to the LLM every time.

As we start increasing the amount of information that we pass along to the LLM,



1.   **The response times start increasing**
2.   **The cost (which is based on the number of tokens, the small pieces of text a model reads, that we send in the prompt) also increases**

Apart from this, LLMs have a limit on the number of tokens that we can pass along (the context window). It is finite!

So we have to be frugal about what information we're sending to the LLM.
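
To get a feel for why prompt size matters, here is a rough back-of-the-envelope sketch. Every number in it is an assumption made up for illustration: the "4 characters per token" figure is only a common rule of thumb for English text, and the context limit and price are hypothetical values, not those of Gemini or any real model.

```python
# Assumptions for illustration only: none of these numbers come from a real model or price list.
CHARS_PER_TOKEN = 4          # rough rule of thumb for English text
CONTEXT_LIMIT_TOKENS = 1000  # hypothetical context window
PRICE_PER_1K_TOKENS = 0.01   # hypothetical price in dollars


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def prompt_report(context, question):
    """Estimate the size and cost of a prompt built from context plus question."""
    tokens = estimate_tokens(context + "\n" + question)
    return {
        "tokens": tokens,
        "fits": tokens <= CONTEXT_LIMIT_TOKENS,
        "cost": tokens / 1000 * PRICE_PER_1K_TOKENS,
    }


question = "Who won the T20 World Cup in 2024?"
small_context = "India won the T20 World Cup in 2024."
large_context = small_context * 200  # stands in for pasting in many documents

small = prompt_report(small_context, question)
large = prompt_report(large_context, question)
print(small)
print(large)
```

The small prompt fits comfortably, while the large one exceeds the hypothetical limit and costs far more, even though both contain the same useful fact. That is the motivation for sending only the relevant information.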



Retrieving only the information relevant to a given query and passing it along to the LLM is called **RAG (Retrieval-Augmented Generation)**.

# Introduction to RAG

## What is RAG?

Retrieval augmented generation, or RAG, is an architectural approach that can improve the efficacy of large language model (LLM) applications by leveraging custom data.

This is done by retrieving data/documents relevant to a question or task and providing them as context for the LLM.

With this extra context, the LLM can generate answers that are up-to-date, factually correct, and relevant to a specific domain.

![fm-jumpstar.png](https://docs.aws.amazon.com/images/sagemaker/latest/dg/images/jumpstart/jumpstart-fm-rag.jpg)


## Key Phases in RAG

The key phases in RAG are:



1. Data preparation
2. Data indexing
3. Information retrieval
4. LLM Inference (Answer generation)
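
The four phases above can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not a production design: the "index" is a plain dictionary of words, retrieval is simple word overlap, and the final phase only builds the prompt that would be sent to an LLM rather than calling one. The sample text and all function names are assumptions made for this example.

```python
# Phase 1: data preparation - split raw text into small passages.
raw_text = (
    "India won the T20 World Cup in 2024. "
    "Virat Kohli plays for India. "
    "Paris is the capital of France."
)
passages = [p.strip() for p in raw_text.split(". ") if p.strip()]


# Phase 2: data indexing - map each lowercase word to the passages containing it.
def build_index(passages):
    index = {}
    for passage_id, passage in enumerate(passages):
        for word in passage.lower().replace(".", "").split():
            index.setdefault(word, set()).add(passage_id)
    return index


# Phase 3: information retrieval - pick the passage sharing the most words with the query.
def retrieve(index, passages, query):
    scores = {}
    for word in query.lower().replace("?", "").split():
        for passage_id in index.get(word, ()):
            scores[passage_id] = scores.get(passage_id, 0) + 1
    if not scores:
        return ""
    return passages[max(scores, key=scores.get)]


# Phase 4: LLM inference - a stand-in that only builds the prompt an LLM would receive.
def build_prompt(context, question):
    return f"Context:\n{context}\n\nQuestion: {question}"


index = build_index(passages)
question = "What is the capital of France?"
context = retrieve(index, passages, question)
print(build_prompt(context, question))
```

Only the one passage about France ends up in the prompt, even though the "knowledge base" also contains unrelated cricket facts.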



![redbricks.jpg](https://www.databricks.com/sites/default/files/inline-images/glossary-rag-image-2.png?v=1704903053)

## What are the benefits of RAG?

The RAG approach has a number of key benefits, including:

1. ***Providing up-to-date and accurate responses***: RAG ensures that the response of an LLM is not based solely on static, stale training data. Rather, the model uses up-to-date external data sources to provide responses.
2. ***Reducing inaccurate responses, or hallucinations:*** By grounding the LLM model's output on relevant, external knowledge, RAG attempts to mitigate the risk of responding with incorrect or fabricated information (also known as hallucinations). Outputs can include citations of original sources, allowing human verification.
3. ***Providing domain-specific, relevant responses***: Using RAG, the LLM will be able to provide contextually relevant responses tailored to an organization's proprietary or domain-specific data.
4. ***Being efficient and cost-effective***: Compared to other approaches to customizing LLMs with domain-specific data, RAG is simple and cost-effective. Organizations can deploy RAG without needing to customize the model. This is especially beneficial when models need to be updated frequently with new data.


## Where can we use RAG?

1. ***Question and answer chatbots***: Incorporating LLMs with chatbots allows them to automatically derive more accurate answers from company documents and knowledge bases. Chatbots are used to automate customer support and website lead follow-up to answer questions and resolve issues quickly.

2. ***Search augmentation***: Incorporating LLMs with search engines that augment search results with LLM-generated answers can better answer informational queries and make it easier for users to find the information they need to do their jobs.

# Building a chatbot with RAG which can handle player related queries

## Installing the required dependencies

In [None]:
%pip install -q llama-index
%pip install -q llama-index-llms-gemini
%pip install -q google-generativeai
%pip install -q llama-index-embeddings-gemini
%pip install -q requests
%pip install -q beautifulsoup4

## Downloading the necessary data

Let's scrape the biographies of some Indian cricket players from Wikipedia, which we'll use while building a RAG app.

We'll build a chatbot that answers queries about Indian cricket players based on the biographies we scrape from Wikipedia.

In [None]:
# import os
# import requests
# from bs4 import BeautifulSoup


# INDIAN_PLAYERS = [
#     "https://en.wikipedia.org/wiki/Suryakumar_Yadav",
#     "https://en.wikipedia.org/wiki/Yashasvi_Jaiswal",
#     "https://en.wikipedia.org/wiki/Virat_Kohli",
#     "https://en.wikipedia.org/wiki/Rohit_Sharma",
#     "https://en.wikipedia.org/wiki/Hardik_Pandya",
#     "https://en.wikipedia.org/wiki/Ravindra_Jadeja",
#     "https://en.wikipedia.org/wiki/Axar_Patel",
#     "https://en.wikipedia.org/wiki/Kuldeep_Yadav",
#     "https://en.wikipedia.org/wiki/Jasprit_Bumrah",
# ]


# def scrape_wiki(url):
#     # Make an API call to get the page content
#     response = requests.get(url)
#     soup = BeautifulSoup(response.content, "html.parser")
#     name = soup.find("h1", class_="firstHeading").text

#     # Extract all the text from para HTML tags
#     para_tags = soup.find_all("p")
#     full_page_text = ""
#     for para in para_tags:
#         full_page_text += para.text

#     return name, full_page_text


# # Store player bio taken from wikipedia in text files
# if not os.path.exists("/content/player_wiki"):
#     os.mkdir("/content/player_wiki")

# for player in INDIAN_PLAYERS:
#     name, text = scrape_wiki(player)
#     # Use "w" so that re-running the cell overwrites files instead of raising FileExistsError
#     with open(f"/content/player_wiki/{name}.txt", "w") as f:
#         f.write(text)


# # Look around what information is scraped
# files = os.listdir("/content/player_wiki")
# for file_name in files[:10]:
#     file_path = f"/content/player_wiki/{file_name}"

#     with open(file_path) as f:
#         file_content = f.read()

#     # removesuffix drops only the ".txt" extension; strip(".txt") would also
#     # remove any leading or trailing ".", "t", or "x" characters from the name
#     print(file_name.removesuffix(".txt"))
#     print(file_content[:100], end="\n\n")


## Building a chatbot

In [None]:
# import sys
# import time

# from llama_index.llms.gemini import Gemini
# from llama_index.core import SimpleDirectoryReader
# from llama_index.core import Settings

# # Initialise Gemini Flash LLM to be used for generating answers
# gemini_llm = Gemini(model="models/gemini-1.5-flash", api_key=gemini_api_key)

# Settings.llm = gemini_llm

# # Read the data from the directory
# reader = SimpleDirectoryReader(input_dir="/content/player_wiki/")
# documents = reader.load_data()

# PROMPT = """
# Here is some information which could help in answering the user question.

# Context:
# {context}

# Using the above information, you can answer the user's question.
# Here is the user's question:

# {user_question}
# """

# def get_context(query_text):
#     # Add a simple keyword search over the documents.
#     # Documents which have lot of keywords matching the query_text will be returned.
#     # This is a simple way to retrieve the context information.
#     context = ""
#     words = query_text.split()

#     # Remove stop words
#     stop_words = ["is", "the", "who", "what", "where", "when", "why", "how"]
#     words = [word for word in words if word.lower() not in stop_words]

#     doc_score_map = {}
#     for word in words:
#         for doc in documents:
#             if word in doc.text:
#                 doc_score_map[doc.doc_id] = doc_score_map.get(doc.doc_id, 0) + 1

#     if doc_score_map:
#         doc_id = max(doc_score_map, key=doc_score_map.get)

#         for doc in documents:
#             if doc.doc_id == doc_id:
#                 context = doc.text
#                 break

#     print("\033[34mRetrieved context: ", context, end="\n\n\033[0m")
#     return context


# while True:
#     query = input("Enter your query: ")
#     if query == "q":
#         break

#     prompt = PROMPT.format(context=get_context(query), user_question=query)
#     response = gemini_llm.stream_complete(prompt=prompt)

#     print("Answer: ", end="")
#     for r in response:
#         print(r.text, end="")
#         sys.stdout.flush()
#         time.sleep(0.2)

In the above example, we built a chatbot that used a simple keyword search. It worked well in some cases and poorly in others.

Keyword search is not the best way to search for information among a large set of information.

Keyword search, while useful, has several limitations that can affect its effectiveness. Here are some common pitfalls of keyword search:

- **Lack of context:** Keyword search often misses the broader meaning or intent behind a query, focusing only on specific words.
- **Synonym confusion:** It may miss relevant results that use synonyms or related terms instead of the exact keywords.
- **Ambiguity:** Words with multiple meanings can lead to irrelevant results.
- **Overemphasis on popularity:** Popular content may overshadow more relevant but less well-known information.
- **Missing conceptual matches:** Keyword search struggles with finding content that's conceptually related but doesn't contain the exact search terms.
- **Over-reliance on exact matches:** This can lead to missing valuable information that's phrased differently.
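
The synonym problem is easy to reproduce. In the sketch below, a simple word-overlap score (a hypothetical `keyword_score` helper, written just for this illustration) finds a document when the query uses the document's exact words, but scores zero when the query expresses the same intent with different words.

```python
def keyword_score(query, document):
    """Count how many query words appear verbatim in the document."""
    document_words = set(document.lower().split())
    return sum(1 for word in query.lower().split() if word in document_words)


documents = {
    "doc_a": "kohli is a prolific batsman who scores heavily",
    "doc_b": "the stadium sells tickets online",
}

exact_query = "prolific batsman"
synonym_query = "high-scoring batter"  # same intent, different words

print(keyword_score(exact_query, documents["doc_a"]))    # 2: both words match
print(keyword_score(synonym_query, documents["doc_a"]))  # 0: no word matches
```

Both queries are asking for the same thing, yet the second one scores no better against the relevant document than it would against an unrelated one.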

Instead of searching for relevant information based on keywords alone, what if we could understand the stored information and retrieve what is most relevant to the query?

This is where **semantic search** comes in.

# Making our chatbot truly understand meaning of the queries

## Semantic search

Semantic search is an advanced approach to finding information that goes beyond simple keyword matching. Instead of just looking for exact words, it tries to understand the meaning and context of your search query. This method is like having a smart librarian who doesn't just match book titles, but understands what you're really looking for.

Semantic search can pick up on synonyms, related concepts, and even the intent behind your question. This makes it much better at finding relevant information, even when the exact words in your search don't appear in the results.

## Vector embeddings

Vector embeddings are a key technology that makes semantic search possible. They're a way of turning words, sentences, or even entire documents into long lists of numbers. These numbers represent the meaning of the text in a way that computers can understand and compare.

Imagine if you could turn the meaning of every word into a unique point in space. Words with similar meanings would be close together, while very different words would be far apart.

Vector embeddings work in a similar way, allowing computers to measure how similar or different pieces of text are in terms of their meaning, not just their spelling. This helps search engines find results that are truly relevant to what you're looking for, even if they use different words to express the same idea.
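
One common way to measure how close two embeddings are is cosine similarity, which compares the directions of two vectors. The sketch below uses hand-made 3-number vectors purely for illustration; real embedding models produce much longer vectors, and the values here are invented, not generated by any model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made 3-dimensional "embeddings" (purely illustrative values).
toy_embeddings = {
    "batsman": [0.9, 0.8, 0.1],
    "batter": [0.85, 0.75, 0.15],
    "banana": [0.1, 0.2, 0.9],
}

similar = cosine_similarity(toy_embeddings["batsman"], toy_embeddings["batter"])
different = cosine_similarity(toy_embeddings["batsman"], toy_embeddings["banana"])
print(f"batsman vs batter: {similar:.3f}")
print(f"batsman vs banana: {different:.3f}")
```

The two cricket words score close to 1, while the unrelated word scores much lower. A vector search does this comparison between the query's embedding and the embeddings of the stored text, then returns the closest matches.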

![img](https://dkharazi.github.io/ecc71bb7c9e227b292dd909b02dbf4e8/embedding.svg)

## Generating embeddings with Gemini

Now that we understand embeddings, let's put them to use. Let's use the embedding models available in the Gemini family to generate embeddings.

In [None]:
# from llama_index.embeddings.gemini import GeminiEmbedding

# # Initialise the Gemini embedding model Client with the API Key
# gemini_embedding_model = GeminiEmbedding(api_key=gemini_api_key, model_name="models/embedding-001")

# embedding = gemini_embedding_model.get_text_embedding("Hello, world!")
# print(embedding)

## Swapping keyword search with Vector search

Let's now use LlamaIndex to build a chatbot that uses vector search instead of keyword search.

In [None]:
# import sys
# import time

# from llama_index.llms.gemini import Gemini
# from llama_index.embeddings.gemini import GeminiEmbedding
# from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
# from llama_index.core import Settings

# # Initialise the gemini embedding model client to be used for search
# gemini_embedding_model = GeminiEmbedding(api_key=gemini_api_key, model_name="models/embedding-001")

# # Initialise Gemini Flash LLM client to be used for generating answers
# gemini_llm = Gemini(model="models/gemini-1.5-flash", api_key=gemini_api_key)

# # Mark the LLM and embedding model to be used as default
# Settings.embed_model = gemini_embedding_model
# Settings.llm = gemini_llm

# # Read the data from the directory
# reader = SimpleDirectoryReader(input_dir="/content/player_wiki/")
# documents = reader.load_data()

# # Create an in-memory vector store index from the documents
# index = VectorStoreIndex.from_documents(documents=documents, show_progress=True)
# # similarity_top_k belongs to the retriever: fetch the 3 most similar chunks per query
# index_retriever = index.as_retriever(similarity_top_k=3)

# PROMPT = """
# Here is some information which could help in answering the user question.

# Context:
# {context}

# Using the above information, you can answer the user's question.
# Here is the user's question:

# {user_question}
# """

# def get_context(query_text):
#     relevant_docs =  index_retriever.retrieve(query_text)

#     context = ""
#     for doc in relevant_docs:
#         context += doc.get_content()

#     print("\033[34mRetrieved context: ", context, end="\n\n\033[0m")
#     return context


# while True:
#     query = input("Enter your query: ")
#     if query == "q":
#         break

#     prompt = PROMPT.format(context=get_context(query), user_question=query)

#     print("Answer: ", end="")
#     resp = gemini_llm.stream_complete(prompt=prompt)
#     for r in resp:
#         print(r.text, end="")
#         sys.stdout.flush()
#         time.sleep(0.2)

## Using LlamaIndex end-to-end instead of just for querying

Instead of using LlamaIndex just for querying, let's use LlamaIndex to build our chatbot.

In [None]:
# from llama_index.llms.gemini import Gemini
# from llama_index.embeddings.gemini import GeminiEmbedding
# from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
# from llama_index.core import Settings

# # Initialise the gemini embedding model client to be used for search
# gemini_embedding_model = GeminiEmbedding(api_key=gemini_api_key, model_name="models/embedding-001")

# # Initialise Gemini Flash LLM client to be used for generating answers
# gemini_llm = Gemini(model="models/gemini-1.5-flash", api_key=gemini_api_key)

# # Mark the LLM and embedding model to be used as default
# Settings.embed_model = gemini_embedding_model
# Settings.llm = gemini_llm

# # Read the data from the directory
# reader = SimpleDirectoryReader(input_dir="/content/player_wiki/")
# documents = reader.load_data()

# # Create an in-memory vector store index from the documents
# index = VectorStoreIndex.from_documents(documents=documents, show_progress=True, verbose=True)

# # Construct a query engine on top of the index
# query_engine = index.as_query_engine(similarity_top_k=3)

# while True:
#   query = input("Enter your query: ")
#   if query == "q":
#     break

#   response = query_engine.query(query)
#   print(f"Response: {response}")

# Adding ability for our chatbot to answer national team related queries

## Downloading necessary data

In [None]:
# NATIONAL_CRICKET_TEAMS = [
#     "https://en.wikipedia.org/wiki/Australia_national_cricket_team",
#     "https://en.wikipedia.org/wiki/England_cricket_team",
#     "https://en.wikipedia.org/wiki/India_national_cricket_team",
#     "https://en.wikipedia.org/wiki/New_Zealand_national_cricket_team",
#     "https://en.wikipedia.org/wiki/Pakistan_national_cricket_team",
#     "https://en.wikipedia.org/wiki/South_Africa_national_cricket_team",
#     "https://en.wikipedia.org/wiki/Sri_Lanka_national_cricket_team",
#     "https://en.wikipedia.org/wiki/West_Indies_cricket_team",
# ]

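# # Store team bio taken from wikipedia in text files
# # (reuses `os` and `scrape_wiki` from the player scraping cell above)
# if not os.path.exists("/content/team_wiki"):
#     os.mkdir("/content/team_wiki")

# for team in NATIONAL_CRICKET_TEAMS:
#     name, text = scrape_wiki(team)
#     # Write to the same absolute directory created above; "w" allows re-running the cell
#     with open(f"/content/team_wiki/{name}.txt", "w") as f:
#         f.write(text)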

## Setting up our bot to answer queries related to teams

In [None]:
# from llama_index.llms.gemini import Gemini
# from llama_index.embeddings.gemini import GeminiEmbedding
# from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
# from llama_index.core import Settings


# # Initialise the gemini embedding model to be used for search
# gemini_embedding_model = GeminiEmbedding(api_key=gemini_api_key, model_name="models/embedding-001")

# # Initialise Gemini Flash LLM to be used for generating answers
# gemini_llm = Gemini(model="models/gemini-1.5-flash", api_key=gemini_api_key)

# # Mark the LLM and embedding model to be used as default
# Settings.embed_model = gemini_embedding_model
# Settings.llm = gemini_llm

# # Read the data from the directory
# reader = SimpleDirectoryReader(input_dir="/content/team_wiki/")
# documents = reader.load_data()

# # Create a in-memory vector store index from the documents
# index = VectorStoreIndex.from_documents(documents=documents, show_progress=True, verbose=True)

# # Construct a query engine on top of the index
# query_engine = index.as_query_engine(similarity_top_k=3, streaming=True)

# while True:
#   query = input("Enter your query: ")
#   if query == "q":
#     break

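#   response = query_engine.query(query)
#   # The engine was created with streaming=True, so print the chunks as they arrive
#   print("Response: ", end="")
#   response.print_response_stream()
#   print()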

# Adding routing to our bot to determine which source needs to be queried

So far, our bot can handle queries about either players or national teams, depending on which data source it was built on.

What if we want our bot to answer questions from both of these data sources? What if our bot could choose the appropriate data source based on the query?

We'll build a simple router, which determines which data source (index) to use based on the query from the user.
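
Before using an LLM to make this choice, it can help to see the routing idea on its own. The sketch below is a hypothetical stand-in: instead of asking an LLM to read the tool descriptions and pick one, `select_tool` uses a few hard-coded keyword hints, and the "answer" functions are placeholders rather than real index lookups. The tool names mirror the ones used in the cell that follows, but everything else is an assumption for illustration.

```python
# Each "tool" pairs a description with a function that answers from one data source.
# The answer functions are placeholders; a real bot would query an index here.
tools = {
    "player_wiki_tool": {
        "description": "queries about specific cricket players",
        "answer": lambda query: f"[player index] {query}",
    },
    "team_wiki_tool": {
        "description": "queries about national cricket teams",
        "answer": lambda query: f"[team index] {query}",
    },
}

PLAYER_HINTS = {"who", "player", "batsman", "bowler", "captain"}
TEAM_HINTS = {"team", "teams", "national", "squad"}


def select_tool(query):
    """Hypothetical stand-in for an LLM selector: choose a tool from keyword hints."""
    words = set(query.lower().replace("?", "").split())
    team_hits = len(words & TEAM_HINTS)
    player_hits = len(words & PLAYER_HINTS)
    return "team_wiki_tool" if team_hits > player_hits else "player_wiki_tool"


def route(query):
    """Pick a tool for the query, then let only that tool answer it."""
    tool_name = select_tool(query)
    return tool_name, tools[tool_name]["answer"](query)


print(route("Who is Virat Kohli?"))
print(route("Which national team has won the most titles?"))
```

Hard-coded hints like these break down quickly, which is why an LLM-based selector that reads each tool's description is a more flexible way to make the same decision.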

![img](https://miro.medium.com/v2/resize:fit:1400/0*Ado23RkTTpNFRO0b)

In [None]:
# from llama_index.llms.gemini import Gemini
# from llama_index.embeddings.gemini import GeminiEmbedding
# from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Settings
# from llama_index.core.tools import QueryEngineTool
# from llama_index.core.query_engine import RouterQueryEngine
# from llama_index.core.selectors import LLMSingleSelector

# # Initialise the gemini embedding model client to be used for search
# gemini_embedding_model = GeminiEmbedding(api_key=gemini_api_key, model_name="models/embedding-001", embed_batch_size=50)

# # Initialise Gemini Flash LLM client to be used for generating answers
# gemini_llm = Gemini(model="models/gemini-1.5-flash", api_key=gemini_api_key)

# # Mark the LLM and embedding model to be used as default
# Settings.embed_model = gemini_embedding_model
# Settings.llm = gemini_llm

# # Read the player data from the directory
# reader = SimpleDirectoryReader(input_dir="/content/player_wiki/")
# player_documents = reader.load_data()

# # Create a in-memory vector store index from the player documents
# player_index = VectorStoreIndex.from_documents(documents=player_documents, show_progress=True)
# player_query_engine = player_index.as_query_engine(streaming=True)

# # Read the team data from the directory
# team_reader = SimpleDirectoryReader(input_dir="/content/team_wiki/")
# team_documents = team_reader.load_data()

# # Create an in-memory vector store index from the team documents
# team_index = VectorStoreIndex.from_documents(documents=team_documents, show_progress=True)
# team_query_engine = team_index.as_query_engine(streaming=True)

# player_tool = QueryEngineTool.from_defaults(
#     query_engine=player_query_engine,
#     description=(
#         "Useful for queries that are about specific cricket players. Contains data about players taken from their wikipedia page."
#     ),
#     name="player_wiki_tool"
# )

# team_tool = QueryEngineTool.from_defaults(
#     query_engine=team_query_engine,
#     description=(
#         "Useful for queries that are about national cricket teams. Contains data about teams taken from their wikipedia page."
#     ),
#     name="team_wiki_tool"
# )

# query_engine = RouterQueryEngine(
#     selector=LLMSingleSelector.from_defaults(),
#     query_engine_tools=[
#         player_tool,
#         team_tool,
#     ],
#     verbose=True
# )

# response = query_engine.query("Who is Virat Kohli ?")
# response.print_response_stream()


# Making our bot handle multi-turn conversations

Multi-turn conversations are dialogues that extend beyond a single exchange, requiring participants to remember and build upon previous messages. For chatbots, memory is essential to engage in these more complex and natural interactions. Without memory, a chatbot would treat each user message as an isolated query, leading to disjointed and often frustrating exchanges.

By incorporating memory, chatbots can:

- **Follow conversation threads**: Keep track of the current topic and refer back to earlier points.
- **Understand context**: Interpret user messages correctly based on the conversation history.
- **Provide relevant responses**: Offer information that builds on previous exchanges.
- **Handle follow-up questions**: Respond appropriately to queries that depend on earlier context.
- **Maintain coherence**: Ensure the conversation flows logically from one turn to the next.

Memory enables chatbots to participate in fluid, multi-turn dialogues that more closely resemble human conversation.
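
To make the idea of memory concrete, here is a minimal sketch of a chat history with a token budget. It is only an illustration: `SimpleChatMemory` is a hypothetical class, not LlamaIndex's `ChatMemoryBuffer`, and it assumes a crude "one token per word" count instead of a real tokenizer.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleChatMemory:
    """Toy chat history that drops the oldest turns once a budget is exceeded."""

    token_limit: int = 50
    messages: list = field(default_factory=list)

    @staticmethod
    def count_tokens(text: str) -> int:
        # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
        return len(text.split())

    def total_tokens(self) -> int:
        return sum(self.count_tokens(m["content"]) for m in self.messages)

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        # Evict from the front until the history fits the budget again,
        # but always keep the most recent message.
        while len(self.messages) > 1 and self.total_tokens() > self.token_limit:
            self.messages.pop(0)

    def as_prompt(self) -> str:
        # Flatten the history into text that would accompany the next query.
        return "\n".join(f"{m['role']}: {m['content']}" for m in self.messages)


memory = SimpleChatMemory(token_limit=20)
memory.add("user", "Who is Virat Kohli ?")
memory.add("assistant", "He is an Indian cricketer.")
memory.add("user", "Which IPL team does he play for ?")
memory.add("assistant", "He plays for Royal Challengers Bengaluru.")
print(memory.as_prompt())
```

Because the whole history is sent along with each new query, the model can resolve "he" in the follow-up question. The budget matters because the history cannot grow forever: once it is exceeded, the oldest turn is forgotten first.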

We'll now look at how to add conversational history to our bot, and make it a full-fledged chatbot that you can have natural conversations with.

🚀 Let's get started !

In [None]:
# from llama_index.llms.gemini import Gemini
# from llama_index.embeddings.gemini import GeminiEmbedding
# from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Settings
# from llama_index.core.llms import ChatMessage, MessageRole
# from llama_index.core.chat_engine import ContextChatEngine
# from llama_index.core.retrievers import RouterRetriever
# from llama_index.core.tools import RetrieverTool
# from llama_index.core.memory import ChatMemoryBuffer
# from llama_index.core.selectors import LLMSingleSelector


# # Initialise the gemini embedding model client to be used for search
# gemini_embedding_model = GeminiEmbedding(api_key=gemini_api_key, model_name="models/embedding-001")

# # Initialise Gemini Flash LLM client to be used for generating answers
# gemini_llm = Gemini(model="models/gemini-1.5-flash", api_key=gemini_api_key)

# # Mark the LLM and embedding model to be used as default
# Settings.embed_model = gemini_embedding_model
# Settings.llm = gemini_llm

# # Read the player data from the directory
# reader = SimpleDirectoryReader(input_dir="/content/player_wiki/")
# player_documents = reader.load_data()

# # Create an in-memory vector store index from the player documents
# player_index = VectorStoreIndex.from_documents(documents=player_documents)
# player_retriever = player_index.as_retriever()

# # Read the team data from the directory
# team_reader = SimpleDirectoryReader(input_dir="/content/team_wiki/")
# team_documents = team_reader.load_data()

# # Create an in-memory vector store index from the team documents
# team_index = VectorStoreIndex.from_documents(documents=team_documents)
# team_retriever = team_index.as_retriever()

# player_tool = RetrieverTool.from_defaults(
#     retriever=player_retriever,
#     description=(
#         "Useful for queries that are about specific cricket players. Contains data about players taken from their wikipedia page."
#     ),
#     name="player_wiki_tool"
# )

# team_tool = RetrieverTool.from_defaults(
#     retriever=team_retriever,
#     description=(
#         "Useful for queries that are about national cricket teams. Contains data about teams taken from their wikipedia page."
#     ),
#     name="team_wiki_tool"
# )

# retriever = RouterRetriever(
#     selector=LLMSingleSelector.from_defaults(),
#     retriever_tools=[
#         player_tool,
#         team_tool,
#     ]
# )

# # Initialise a chat memory buffer component
# memory = ChatMemoryBuffer.from_defaults(token_limit=150000)

# # Initialise chat engine with the retriever, LLM and the memory component
# chat_engine = ContextChatEngine(retriever=retriever, prefix_messages=[], llm=gemini_llm, memory=memory)

# while True:
#   query = input("Human: ")
#   if query == "q":
#     break

#   # Pass along the user query to the chat engine
#   response = chat_engine.stream_chat(query)
#   print("AI: ", end="")
#   response.print_response_stream()



---



# Handling complex questions

So far, our bot can:

1. Answer queries about players and teams.
2. Determine which data source needs to be used, depending on the query.
3. Handle multi-turn conversations.

However, it can't handle very complex queries like "**Who is the current Indian team captain ? Which IPL team does he play for ? What is his birth date ?**"

To solve this, we'll have to break down the query into multiple sub-questions, which we need to answer in order, and then answer the entire question.

We can break the example question down into:

1. Who is the current Indian team captain ?
2. Once we figure out that the Indian team captain is Rohit Sharma, we should then get an answer for "IPL team that Rohit Sharma plays for".
3. Then figure out his birth date.
4. Answer the entire question.
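
The steps above can be sketched in plain Python. This is a toy illustration, not how the agent is implemented: `KNOWLEDGE` and `lookup` are hypothetical stand-ins for the retrievers and the LLM, and the stored values are there only to make the example run.

```python
# Toy knowledge base standing in for the retrievers; values are illustrative only.
KNOWLEDGE = {
    "captain of India": "Rohit Sharma",
    "IPL team of Rohit Sharma": "Mumbai Indians",
    "birth date of Rohit Sharma": "30 April 1987",
}


def lookup(question: str) -> str:
    """Hypothetical stand-in for a retriever plus LLM answering one sub-question."""
    return KNOWLEDGE.get(question, "unknown")


def answer_multi_step() -> str:
    # Step 1: resolve the entity that the later sub-questions depend on.
    captain = lookup("captain of India")
    # Steps 2 and 3: reuse the answer from step 1 to phrase the follow-ups.
    ipl_team = lookup(f"IPL team of {captain}")
    birth_date = lookup(f"birth date of {captain}")
    # Step 4: combine the partial answers into one response.
    return f"{captain} captains India, plays for {ipl_team} and was born on {birth_date}."


print(answer_multi_step())
```

The key point is the ordering: the second and third sub-questions cannot even be phrased until the first one has been answered.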


![img](https://docs.llamaindex.ai/en/v0.10.19/_images/multi_step_diagram.png)

In [None]:
# from llama_index.llms.gemini import Gemini
# from llama_index.embeddings.gemini import GeminiEmbedding
# from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Settings
# from llama_index.core.llms import ChatMessage, MessageRole
# from llama_index.core.retrievers import RouterRetriever
# from llama_index.core.tools import RetrieverTool
# from llama_index.core.agent import ReActAgent


# # Initialise the gemini embedding model client to be used for search
# gemini_embedding_model = GeminiEmbedding(api_key=gemini_api_key, model_name="models/embedding-001")

# # Initialise Gemini Flash LLM client to be used for generating answers
# gemini_llm = Gemini(model="models/gemini-1.5-flash", api_key=gemini_api_key)

# # Mark the LLM and embedding model to be used as default
# Settings.embed_model = gemini_embedding_model
# Settings.llm = gemini_llm

# # Read the player data from the directory
# reader = SimpleDirectoryReader(input_dir="/content/player_wiki/")
# player_documents = reader.load_data()

# # Create an in-memory vector store index from the player documents
# player_index = VectorStoreIndex.from_documents(documents=player_documents)
# player_retriever = player_index.as_retriever()

# # Read the team data from the directory
# team_reader = SimpleDirectoryReader(input_dir="/content/team_wiki/")
# team_documents = team_reader.load_data()

# # Create an in-memory vector store index from the team documents
# team_index = VectorStoreIndex.from_documents(documents=team_documents)
# team_retriever = team_index.as_retriever()

# player_tool = RetrieverTool.from_defaults(
#     retriever=player_retriever,
#     description=(
#         "Useful for queries that are about specific cricket players. Contains data about players taken from their wikipedia page."
#     ),
#     name="player_wiki_tool"
# )

# team_tool = RetrieverTool.from_defaults(
#     retriever=team_retriever,
#     description=(
#         "Useful for queries that are about national cricket teams. Contains data about teams taken from their wikipedia page."
#     ),
#     name="team_wiki_tool"
# )

# # use ReAct Agent
# agent = ReActAgent.from_tools(
#     tools = [player_tool, team_tool],
#     llm=gemini_llm,
#     verbose=True
# )

In [None]:
# response = agent.chat("Who is the current Indian team captain ? Which IPL team does he play for ?")

Our bot now can also handle complex questions 🎉

But if you think about it, the data we fed to our bot contains only basic information about the players and their teams.

What if you want the bot to answer with live scores, or get scorecards from past matches ?

There are free Cricket APIs that can provide us with this data.

But, as you know, LLMs cannot make API calls on their own. So how do we solve this ?

## Bridging the Gap with Tools

To overcome these limitations and enhance the capabilities of LLMs, researchers and developers have introduced the concept of "tools." These tools are external systems or functions that LLMs can leverage to accomplish specific parts of a given task.

Tools in the context of LLMs are:
- External APIs, software, or functions
- Specialized systems designed for specific tasks
- Interfaces that allow LLMs to interact with the real world or access current data

# Tool-Using Agents

Tool-using agents are AI systems designed to interact with and utilize various tools or external resources to accomplish tasks more effectively. These agents combine the power of language models with the ability to use specific tools, bridging the gap between general language understanding and specialized task execution.
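
As a rough mental model, a tool-using agent keeps a registry of callable tools, lets the LLM choose a tool name and arguments, and then runs the matching function. The sketch below only illustrates that dispatch step. The tool names, the `run_tool_call` helper and the shape of the tool call are assumptions made for this example, and the call the LLM would normally produce is hard-coded.

```python
from typing import Callable, Dict


def get_live_score(team: str) -> str:
    # Placeholder: a real tool would call an external API here.
    return f"(live score for {team} would be fetched here)"


def add_numbers(a: float, b: float) -> float:
    return a + b


# Registry mapping tool names to callables; the agent picks a name and supplies arguments.
TOOLS: Dict[str, Callable] = {
    "get_live_score": get_live_score,
    "add_numbers": add_numbers,
}


def run_tool_call(tool_call: dict):
    """Execute a tool call of the form {"name": ..., "arguments": {...}}."""
    name = tool_call["name"]
    if name not in TOOLS:
        return f"Unknown tool: {name}"
    return TOOLS[name](**tool_call["arguments"])


# In a real agent the LLM would emit this structure; here it is hard-coded.
result = run_tool_call({"name": "add_numbers", "arguments": {"a": 2, "b": 3}})
print(result)
```

The result of the tool call is then handed back to the LLM, which uses it to write the final answer.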

# Building an Agent with an API calling tool

### What are we going to build ?

Let's try to build a tool that calls an API to give us match-level info and scorecards for the T20 World Cup.

I will be using a free cricket data API service called [Cricket Data](https://cricketdata.org)

*Steps to get the API Key*

1. Sign up for an account here [Signup](https://cricketdata.org/signup.aspx).
2. Verify your email and log in.
3. You should be able to see your free API key on the dashboard page.

### Build the tools to call the API.

Let's build our first tool, which gets the list of all matches for the T20 World Cup.

In [None]:
# import requests
# from llama_index.core.tools import FunctionTool

# # this is a constant series id for the series we are
# # going to fetch the info for.
# SERIES_ID = "e079ef23-b5e9-4802-93e9-dd2f27db0533"

# CRICKET_API = userdata.get("CRICKET_API")

# def get_matches_list():
#   """
#   Return the list of Matches and their information
#   for the 2024 T20 world cup
#   """
#   url = f"https://api.cricapi.com/v1/series_info?apikey={CRICKET_API}&id={SERIES_ID}"

#   response = requests.get(url)
#   return response.json()

# get_matches_list_tool = FunctionTool.from_defaults(fn=get_matches_list)
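
A small aside on the request URL: formatting the query string by hand works for the values used here, but keys or ids containing special characters would need escaping. A minimal sketch using the standard library is shown below. `build_url` is a hypothetical helper and `"demo-key"` is a placeholder, while the endpoint and parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.cricapi.com/v1"


def build_url(endpoint: str, api_key: str, resource_id: str) -> str:
    """Build a request URL with properly escaped query parameters."""
    query = urlencode({"apikey": api_key, "id": resource_id})
    return f"{BASE_URL}/{endpoint}?{query}"


# "demo-key" is a placeholder, not a real key.
url = build_url("series_info", "demo-key", "e079ef23-b5e9-4802-93e9-dd2f27db0533")
print(url)
```

The resulting string can be passed to `requests.get(url, timeout=10)`, and calling `response.raise_for_status()` before `response.json()` surfaces HTTP errors early instead of handing an error page to the agent.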


Let's quickly test our tool by providing it to an Agent.

In [None]:
# from llama_index.core.agent import ReActAgent

# GOOGLE_API_KEY = userdata.get("GOOGLE_API_KEY")

# llm = Gemini(
#     api_key=GOOGLE_API_KEY,
#     model="models/gemini-1.5-flash"
# )

# api_agent = ReActAgent.from_tools([get_matches_list_tool], llm=llm, verbose=True, max_iterations=50)

In [None]:
# # ask a simple question
# response = api_agent.chat("Who played the first game in the series ?")

Wow, our bot is now capable of giving us match results. But I also want match-specific info and scorecards. Let's build two more tools that can:

1. Get us a match-specific scorecard.

2. Get us all the team squads in the series.

In [None]:
# import requests
# from llama_index.core.tools import FunctionTool
# from llama_index.core.agent import ReActAgent

# # this is a constant series id for the series we are
# # going to fetch the info for.
# SERIES_ID = "e079ef23-b5e9-4802-93e9-dd2f27db0533"

# CRICKET_API = userdata.get("CRICKET_API")

# def get_matches_list():
#   """
#   Return the list of Matches and their information
#   for the 2024 T20 world cup
#   """
#   url = f"https://api.cricapi.com/v1/series_info?apikey={CRICKET_API}&id={SERIES_ID}"

#   response = requests.get(url)
#   return response.json()


# def get_match_scorecard(match_id: str):
#   """
#   Returns score card for a specific match for the 2024 T20 world cup
#   Args
#    - match_id - UUID of the match which can be obtained using get_matches_list
#   """
#   url = f"https://api.cricapi.com/v1/match_scorecard?apikey={CRICKET_API}&id={match_id}"

#   response = requests.get(url)
#   return response.json()


# def get_series_squad():
#   """
#   Returns team squads for the 2024 T20 world cup
#   """
#   url = f"https://api.cricapi.com/v1/series_squad?apikey={CRICKET_API}&id={SERIES_ID}"

#   response = requests.get(url)
#   return response.json()

# get_match_scorecard_tool = FunctionTool.from_defaults(fn=get_match_scorecard)
# get_matches_list_tool = FunctionTool.from_defaults(fn=get_matches_list)
# get_series_squad_tool = FunctionTool.from_defaults(fn=get_series_squad)

# GOOGLE_API_KEY = userdata.get("GOOGLE_API_KEY")

# llm = Gemini(
#     api_key=GOOGLE_API_KEY,
#     model="models/gemini-1.5-flash"
# )

# api_agent = ReActAgent.from_tools([get_matches_list_tool, get_series_squad_tool, get_match_scorecard_tool], llm=llm, verbose=True, max_iterations=50)

Let's interact with our agent.

In [None]:
# response = api_agent.chat("Who won the T20 world cup ? what's the scorecard ?")

In [None]:
# response = api_agent.chat("Give me the Indian squad that played in the T20 world cup")

Now our bot is capable of doing so many more things. Kudos to tools.

But there is still one thing missing. If you guys are "Cricket Nerds" like me 😉, you might want a post-series analysis of how each player performed.

You can do this with APIs, but calling the match-specific API for each match and trying to analyse the results is very tedious.

Let's say we have a CSV dataset of all player stats from the T20 World Cup.

Let's build an analysis agent 🎉

# Let's build a tool that can analyse CSV dataset and perform operations

This agent will be able to answer questions from a CSV which contains player statistics from the T20 World Cup.

## Install required dependencies

- pandas: pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool,
built on top of the Python programming language.

In [None]:
# install required dependencies
%pip install pandas

## Download the required dataset

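Our datasets are available on GitHub:

- https://raw.githubusercontent.com/happyfoxinc/agentic-rag-workshop/main/data/batting_stats_for_icc_mens_t20_world_cup_2024.csv
- https://raw.githubusercontent.com/happyfoxinc/agentic-rag-workshop/main/data/bowling_stats_for_icc_mens_t20_world_cup_2024.csv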

In [None]:
!wget -O batting_stats.csv https://raw.githubusercontent.com/happyfoxinc/agentic-rag-workshop/main/data/batting_stats_for_icc_mens_t20_world_cup_2024.csv
!wget -O bowling_stats.csv https://raw.githubusercontent.com/happyfoxinc/agentic-rag-workshop/main/data/bowling_stats_for_icc_mens_t20_world_cup_2024.csv

## Import Data

Let's import the data and take a look at what we are going to work with.

We are going to work with 2 different datasets related to the recent T20 World Cup, which India won 😀

1. Batting stats
2. Bowling stats

In [None]:
# # load data using pandas
# import pandas as pd

# batting_df = pd.read_csv("/content/batting_stats.csv")

# # see what we are dealing with
# batting_df.head()

In [None]:
# bowling_df = pd.read_csv("/content/bowling_stats.csv")

# bowling_df.head()

## Building a tool that can interact with our dataset

Now, let's build a simple tool that can get some insights from the data we have. Let's say you want the following.

For the batting stats dataset,

1. Get the player with the highest number of runs

2. Get the player with the highest strike rate

For the bowling stats dataset,

1. Get the player with the highest number of wickets

2. Get the player with the lowest economy
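
All four insights follow the same pattern: find the row where a column is largest or smallest. The sketch below shows that pattern with plain Python on made-up rows (the players and numbers are not real tournament figures). The column names `Player`, `Runs`, `SR`, `Wkts` and `Econ` match the ones used by the tool in the next cell, which does the same thing with pandas `idxmax` and `idxmin`.

```python
# Made-up rows for illustration; they are not real tournament figures.
batting_rows = [
    {"Player": "Player A", "Runs": 120, "SR": 135.5},
    {"Player": "Player B", "Runs": 95, "SR": 160.2},
]
bowling_rows = [
    {"Player": "Player C", "Wkts": 9, "Econ": 7.1},
    {"Player": "Player D", "Wkts": 6, "Econ": 5.4},
]

# "Highest" insights pick the row that maximises a column ...
top_scorer = max(batting_rows, key=lambda row: row["Runs"])
best_strike_rate = max(batting_rows, key=lambda row: row["SR"])
top_wicket_taker = max(bowling_rows, key=lambda row: row["Wkts"])
# ... while the economy insight picks the row that minimises it.
best_economy = min(bowling_rows, key=lambda row: row["Econ"])

print(top_scorer["Player"], best_strike_rate["Player"],
      top_wicket_taker["Player"], best_economy["Player"])
```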

In [None]:
# from llama_index.core.tools import FunctionTool
# from llama_index.core.agent import ReActAgent

# def get_t20_world_cup_stat_insights(stats_type: str, insight_type: str):
#   """
#   This will return the requested insights for the given stats_type
#   and insight_type

#   Supported stats_type:
#   1. batting
#   2. bowling

#   Supported insight_types
#   1. For batting
#     a. highest_run_scorer
#     b. player_with_highest_strike_rate
#   2. For bowling
#     a. highest_wicket_taker
#     b. player_with_least_economy
#   """
#   if stats_type == "batting":
#     if insight_type == "highest_run_scorer":
#       row_with_highest_score = batting_df.loc[batting_df["Runs"].idxmax()]

#       return row_with_highest_score.to_dict()
#     elif insight_type == "player_with_highest_strike_rate":
#       row_with_highest_strike_rate = batting_df.loc[batting_df["SR"].idxmax()]

#       return row_with_highest_strike_rate.to_dict()
#     else:
#       return "Invalid insight_type for batting"
#   elif stats_type == "bowling":
#     if insight_type == "highest_wicket_taker":
#       row_with_highest_wickets = bowling_df.loc[bowling_df["Wkts"].idxmax()]

#       return row_with_highest_wickets.to_dict()
#     elif insight_type == "player_with_least_economy":
#       row_with_least_economy = bowling_df.loc[bowling_df["Econ"].idxmin()]

#       return row_with_least_economy.to_dict()
#     else:
#       return "Invalid insight_type for bowling"
#   else:
#     return "Invalid stats_type"


# get_t20_world_cup_stat_insights_tool = FunctionTool.from_defaults(fn=get_t20_world_cup_stat_insights)

# # initialize our agent
# GOOGLE_API_KEY = userdata.get("GOOGLE_API_KEY")

# llm = Gemini(
#     api_key=GOOGLE_API_KEY,
#     model="models/gemini-1.5-flash"
# )

# agent = ReActAgent.from_tools([get_t20_world_cup_stat_insights_tool], llm=llm, verbose=True)

Let's interact with our agent

In [None]:
# response = agent.chat("Who was the best run scorer and how many runs did he score ?")

In [None]:
# response = agent.chat("Who has the best strike rate ?")

In [None]:
# response = agent.chat("Compare and contrast between the player with highest score and highest strike rate")

In [None]:
# response = agent.chat("Who took the most wickets and who has the least economy in the tournament ?")

With all these tools in place, we can:

1. Get match-specific info

2. Get the squads for each team

3. Analyse each player's statistics

Now, let's go ahead and wire these up with our chatbot 🥳

# Enhancing our RAG Chatbot with tools

By building the above, we were able to get statistics for the T20 World Cup via both a CSV and an API.

Now, let's add these capabilities to our existing chatbot by providing it with these tools.

In [None]:
# import os
# import requests
# from bs4 import BeautifulSoup


# INDIAN_PLAYERS = [
#     "https://en.wikipedia.org/wiki/Suryakumar_Yadav",
#     "https://en.wikipedia.org/wiki/Yashasvi_Jaiswal",
#     "https://en.wikipedia.org/wiki/Virat_Kohli",
#     "https://en.wikipedia.org/wiki/Rohit_Sharma",
#     "https://en.wikipedia.org/wiki/Hardik_Pandya",
#     "https://en.wikipedia.org/wiki/Ravindra_Jadeja",
#     "https://en.wikipedia.org/wiki/Axar_Patel",
#     "https://en.wikipedia.org/wiki/Kuldeep_Yadav",
#     "https://en.wikipedia.org/wiki/Jasprit_Bumrah",
# ]


# def scrape_wiki(url):
#     # Make an API call to get the page content
#     response = requests.get(url)
#     soup = BeautifulSoup(response.content, "html.parser")
#     name = soup.find("h1", class_="firstHeading").text

#     # Extract all the text from para HTML tags
#     para_tags = soup.find_all("p")
#     full_page_text = ""
#     for para in para_tags:
#         full_page_text += para.text

#     return name, full_page_text


# # Store player bio taken from wikipedia in text files
# if not os.path.exists("/content/player_wiki"):
#     os.mkdir("/content/player_wiki")

# player_data = []
# for player in INDIAN_PLAYERS:
#     name, text = scrape_wiki(player)
#     with open(f"/content/player_wiki/{name}.txt", "x") as f:
#         f.write(text)

# NATIONAL_CRICKET_TEAMS = [
#     "https://en.wikipedia.org/wiki/Australia_national_cricket_team",
#     "https://en.wikipedia.org/wiki/England_cricket_team",
#     "https://en.wikipedia.org/wiki/India_national_cricket_team",
#     "https://en.wikipedia.org/wiki/New_Zealand_national_cricket_team",
#     "https://en.wikipedia.org/wiki/Pakistan_national_cricket_team",
#     "https://en.wikipedia.org/wiki/South_Africa_national_cricket_team",
#     "https://en.wikipedia.org/wiki/Sri_Lanka_national_cricket_team",
#     "https://en.wikipedia.org/wiki/West_Indies_cricket_team",
# ]

# # Store team bio taken from wikipedia in text files
# if not os.path.exists("/content/team_wiki"):
#     os.mkdir("/content/team_wiki")

# for team in NATIONAL_CRICKET_TEAMS:
#     name, text = scrape_wiki(team)
#     with open(f"/content/team_wiki/{name}.txt", "x") as f:
#         f.write(text)


In [None]:
# import requests
# from llama_index.llms.gemini import Gemini
# from llama_index.core.agent import ReActAgent
# from llama_index.embeddings.gemini import GeminiEmbedding
# from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
# from google.colab import userdata
# from llama_index.core import Settings
# from llama_index.core.tools import RetrieverTool
# from llama_index.core.tools import FunctionTool

# # download dataset


# # Initialise the gemini embedding model client to be used for search
# gemini_embedding_model = GeminiEmbedding(api_key=gemini_api_key, model_name="models/embedding-001")

# # Initialise Gemini Flash LLM client to be used for generating answers
# gemini_llm = Gemini(model="models/gemini-1.5-flash", api_key=gemini_api_key)

# # Mark the LLM and embedding model to be used as default
# Settings.embed_model = gemini_embedding_model
# Settings.llm = gemini_llm

# # Read the player data from the directory
# reader = SimpleDirectoryReader(input_dir="/content/player_wiki/")
# player_documents = reader.load_data()

# # Create an in-memory vector store index from the player documents
# player_index = VectorStoreIndex.from_documents(documents=player_documents)
# player_retriever = player_index.as_retriever()

# # Read the team data from the directory
# team_reader = SimpleDirectoryReader(input_dir="/content/team_wiki/")
# team_documents = team_reader.load_data()

# # Create an in-memory vector store index from the team documents
# team_index = VectorStoreIndex.from_documents(documents=team_documents)
# team_retriever = team_index.as_retriever()

# player_tool = RetrieverTool.from_defaults(
#     retriever=player_retriever,
#     description=(
#         "Useful for queries that are about specific cricket players. Contains data about players taken from their wikipedia page."
#     ),
#     name="player_wiki_tool"
# )

# team_tool = RetrieverTool.from_defaults(
#     retriever=team_retriever,
#     description=(
#         "Useful for queries that are about national cricket teams. Contains data about teams taken from their wikipedia page."
#     ),
#     name="team_wiki_tool"
# )

# # this is a constant series id for the series we are
# # going to fetch the info for.
# SERIES_ID = "e079ef23-b5e9-4802-93e9-dd2f27db0533"

# CRICKET_API = userdata.get("CRICKET_API")

# def get_matches_list():
#   """
#   Return the list of Matches and their information
#   for the 2024 T20 world cup
#   """
#   url = f"https://api.cricapi.com/v1/series_info?apikey={CRICKET_API}&id={SERIES_ID}"

#   response = requests.get(url)
#   return response.json()


# def get_match_scorecard(match_id: str):
#   """
#   Returns score card for a specific match for the 2024 T20 world cup
#   Args
#    - match_id - UUID of the match which can be obtained using get_matches_list
#   """
#   url = f"https://api.cricapi.com/v1/match_scorecard?apikey={CRICKET_API}&id={match_id}"

#   response = requests.get(url)
#   return response.json()


# def get_series_squad():
#   """
#   Returns team squads for the 2024 T20 world cup
#   """
#   url = f"https://api.cricapi.com/v1/series_squad?apikey={CRICKET_API}&id={SERIES_ID}"

#   response = requests.get(url)
#   return response.json()

# def get_t20_world_cup_stat_insights(stats_type: str, insight_type: str):
#   """
#   This will return the requested insights for the given stats_type
#   and insight_type

#   Supported stats_type:
#   1. batting
#   2. bowling

#   Supported insight_types
#   1. For batting
#     a. highest_run_scorer
#     b. player_with_highest_strike_rate
#   2. For bowling
#     a. highest_wicket_taker
#     b. player_with_least_economy
#   """
#   if stats_type == "batting":
#     if insight_type == "highest_run_scorer":
#       row_with_highest_score = batting_df.loc[batting_df["Runs"].idxmax()]

#       return row_with_highest_score.to_dict()
#     elif insight_type == "player_with_highest_strike_rate":
#       row_with_highest_strike_rate = batting_df.loc[batting_df["SR"].idxmax()]

#       return row_with_highest_strike_rate.to_dict()
#     else:
#       return "Invalid insight_type for batting"
#   elif stats_type == "bowling":
#     if insight_type == "highest_wicket_taker":
#       row_with_highest_wickets = bowling_df.loc[bowling_df["Wkts"].idxmax()]

#       return row_with_highest_wickets.to_dict()
#     elif insight_type == "player_with_least_economy":
#       row_with_least_economy = bowling_df.loc[bowling_df["Econ"].idxmin()]

#       return row_with_least_economy.to_dict()
#     else:
#       return "Invalid insight_type for bowling"
#   else:
#     return "Invalid stats_type"


# get_t20_world_cup_stat_insights_tool = FunctionTool.from_defaults(fn=get_t20_world_cup_stat_insights)
# get_match_scorecard_tool = FunctionTool.from_defaults(fn=get_match_scorecard)
# get_matches_list_tool = FunctionTool.from_defaults(fn=get_matches_list)
# get_series_squad_tool = FunctionTool.from_defaults(fn=get_series_squad)

# GOOGLE_API_KEY = userdata.get("GOOGLE_API_KEY")

# llm = Gemini(
#     api_key=GOOGLE_API_KEY,
#     model="models/gemini-1.5-flash-latest"
# )

# agent = ReActAgent.from_tools([player_tool, team_tool, get_matches_list_tool, get_series_squad_tool, get_match_scorecard_tool, get_t20_world_cup_stat_insights_tool], llm=llm, verbose=True, max_iterations=50)

# agent.chat_repl()

Our chatbot is now capable of:

1. Answering questions about player bios, stats, match info, etc. from the World Cup.
2. Holding multi-turn conversations.
3. Handling multi-hop queries (queries that may need info from different data sources).

Well done. You guys are awesome 🎉

## Now that we have created our own chatbot in fewer than 50 lines of code, what can our next steps be to make it more awesome 🎉 ?

1. Create a **FastAPI** app which can expose our chatbot outside of this notebook

> We have already implemented a small FastAPI app which can be a starting point to extend and implement your own.
You can find it here > https://github.com/happyfoxinc/agentic-rag-workshop/tree/main/webapp

2. Use your own dataset instead of the T20 World Cup dataset we used. Try to play around with your data and see the power of AI.

# Exercises

## Exercise 1

In this exercise we can try to extend the capabilities of our `get_t20_world_cup_stat_insights` tool.


Suggestions:

1. Add a capability to get the player with the lowest score

2. Add a capability to get the player with the least number of wickets

3. Add a capability to get the player who has played the most number of balls

### Existing tool code

*Some tips for doing the exercise*

1. If you encounter any issues when performing operations on the dataset, check whether the dataset has any inappropriate values and try to clean them.

2. If you hit a "variable not found" or "import not found" error, try running the cells above where the imports/variables are declared, or feel free to copy and paste them here.

3. For operations that can be performed with pandas on a CSV dataset, try reading through the pandas documentation to understand what is needed to complete this task.
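
As an illustration of the first tip, here is a minimal sketch of a cleaning helper. It assumes, without having checked the actual CSV files, that a numeric column might contain values such as `"45*"` (a not-out marker) or `"-"` (no data). `to_number` is a hypothetical helper, not part of the workshop code.

```python
from typing import Optional


def to_number(raw: str) -> Optional[float]:
    """Convert a raw CSV cell to a float, or None when it holds no number."""
    cleaned = raw.strip().rstrip("*")  # drop a trailing not-out marker such as "45*"
    if cleaned in ("", "-", "NA"):
        return None
    try:
        return float(cleaned)
    except ValueError:
        return None


print([to_number(v) for v in ["45*", " 12 ", "-", "abc", "7.5"]])
```

Once a column has been cleaned this way, operations such as finding the maximum or minimum no longer trip over stray text values.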

In [None]:
from llama_index.core.tools import FunctionTool

def get_t20_world_cup_stat_insights(stats_type: str, insight_type: str):
  """
  This will return the requested insights for the given stats_type
  and insight_type

  Supported stats_type:
  1. batting
  2. bowling

  Supported insight_types
  1. For batting
    a. highest_run_scorer
    b. player_with_highest_strike_rate
  2. For bowling
    a. highest_wicket_taker
    b. player_with_least_economy
  """
  if stats_type == "batting":
    if insight_type == "highest_run_scorer":
      row_with_highest_score = batting_df.loc[batting_df["Runs"].idxmax()]

      return f"{row_with_highest_score['Player']} has scored {row_with_highest_score['Runs']} runs, which is the highest."
    elif insight_type == "player_with_highest_strike_rate":
      row_with_highest_strike_rate = batting_df.loc[batting_df["SR"].idxmax()]

      return f"{row_with_highest_strike_rate['Player']} has {row_with_highest_strike_rate['SR']} strike rate, which is the highest."
    else:
      return "Invalid insight_type for batting"
  elif stats_type == "bowling":
    if insight_type == "highest_wicket_taker":
      row_with_highest_wickets = bowling_df.loc[bowling_df["Wkts"].idxmax()]

      return f"{row_with_highest_wickets['Player']} has taken {row_with_highest_wickets['Wkts']} wickets, which is the highest."
    elif insight_type == "player_with_least_economy":
      row_with_least_economy = bowling_df.loc[bowling_df["Econ"].idxmin()]

      return f"{row_with_least_economy['Player']} has {row_with_least_economy['Econ']} economy, which is the least."
    else:
      return "Invalid insight_type for bowling"
  else:
    return "Invalid stats_type"


get_t20_world_cup_stat_insights_tool = FunctionTool.from_defaults(fn=get_t20_world_cup_stat_insights)

## Exercise 2

In this exercise you are going to bring your own dataset to the table.

Try to,

1. Bring your own PDF dataset

2. Process and ingest it, then create an index for it.

3. Create a query engine for the index and add it as an extra tool to our chatbot.
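
For step 2, it can help to picture what "processing" a long document usually involves: the text is split into smaller, overlapping chunks before those chunks are embedded and indexed. The sketch below is a simplified, character-based illustration of that idea. `chunk_text` and its default sizes are assumptions for this example, not the splitter that LlamaIndex uses.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list:
    """Split text into fixed-size character chunks that overlap their neighbours."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    # Each chunk starts `step` characters after the previous one,
    # so consecutive chunks share `overlap` characters.
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
print(chunks)
```

The overlap reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.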

Please refer to the different parts of this notebook and the [Llama Index Docs](https://docs.llamaindex.ai/en/stable/) to complete this exercise.

All the very best for these exercises. I hope you will have fun doing them.

# Thanks everyone, you have been a great audience. We hope you guys had fun and learnt something valuable. 🥳