# Summary:
Here, I attempted to build a **chatbot as an AI-based conversational tool** to improve the user experience when querying data on the Italian population in 2023. Two open-source Large Language Models (LLMs), **Llama3** from Meta and **Mixtral** from Mistral, were used to interpret user prompts and produce human-like responses as the generation step of the **Retrieval Augmented Generation (RAG)** method. The whole procedure was implemented with the **LangChain** orchestration framework.




In [None]:
# Install required libraries and frameworks for the project
! pip install langchain_community
! pip install langchain
! pip install sentence-transformers
! pip install faiss-cpu
! pip install --upgrade gradio
! pip install langchain_groq
! pip install langchain_openai
! pip install ragas

In [None]:
# Import required modules and classes

import pandas as pd #For data manipulation and to load and process the CSV file.
from langchain.text_splitter import RecursiveCharacterTextSplitter #For splitting text into smaller chunks for processing.
from langchain.embeddings import HuggingFaceEmbeddings #To generate text embeddings using HuggingFace models.
from langchain.vectorstores import FAISS #For efficient vector similarity search.
from langchain_groq import ChatGroq #To integrate Groq-based conversational capabilities and deploy LLMs easily.
from langchain.memory import ConversationBufferMemory #To maintain a memory buffer for conversational context.
from langchain.chains import ConversationalRetrievalChain #For building conversational systems with retrieval capabilities.
from langchain.prompts import PromptTemplate #For defining structured prompts for language models.
from langchain_core.prompts import ChatPromptTemplate
import os #Used to interact with environment variables, such as setting `OPENAI_API_KEY` for authentication

In [None]:
from getpass import getpass #To securely input sensitive information like API keys.
# Prompt for the personal OpenAI key instead of hard-coding it in the notebook
os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")

Load the dataset from a CSV file into a pandas DataFrame.
(The file contains data about the population in Italy for 2023)

In [None]:
df = pd.read_csv('/content/popolazione_Italia_2023_Places_updated.csv')

In [None]:
# Path to the FAISS vector store database.
# This is where the precomputed vector embeddings are stored for efficient retrieval.
DB_FAISS_PATH = "vectorstore/db_faiss"

In [None]:
df.head(10)

Unnamed: 0,Type of place,Codice,Luogo,Codice_Luogo,Maschi,Femmine,Totale,province,region,group_of_region,Country
0,Country,IT,Italia,[IT] Italia,28814832,30182369,58997201,Italy,Italy,Italy,Italy
1,Group of regions,ITCD,Nord,[ITCD] Nord,13429002,13988146,27417148,Nord,Nord,Nord,Italy
2,Group of regions,ITC,Nord-ovest,[ITC] Nord-ovest,7759911,8098715,15858626,Nord-ovest,Nord-ovest,Nord-ovest,Italy
3,Region,ITC1,Piemonte,[ITC1] Piemonte,2072771,2178580,4251351,Piemonte,Piemonte,Nord-ovest,Italy
4,Province,ITC11,Torino,[ITC11] Torino,1069885,1134747,2204632,Torino,Piemonte,Nord-ovest,Italy
5,Cities,1001,Agliè,[001001] Agliè,1229,1339,2568,Torino,Piemonte,Nord-ovest,Italy
6,Cities,1002,Airasca,[001002] Airasca,1871,1798,3669,Torino,Piemonte,Nord-ovest,Italy
7,Cities,1003,Ala di Stura,[001003] Ala di Stura,241,223,464,Torino,Piemonte,Nord-ovest,Italy
8,Cities,1004,Albiano d'Ivrea,[001004] Albiano d'Ivrea,776,852,1628,Torino,Piemonte,Nord-ovest,Italy
9,Cities,1006,Almese,[001006] Almese,3095,3197,6292,Torino,Piemonte,Nord-ovest,Italy


The following code converts the population data in the DataFrame into a LangChain Document object. It loops through the DataFrame rows and writes one descriptive sentence per location in Italy, covering the type of place, group of regions, province, region, and the male, female, and total population. The sentences are concatenated into a single text string, each terminated by the marker "#####". Finally, the text is wrapped in a Document object whose metadata marks the source as local, which makes it compatible with LangChain workflows such as StuffDocumentsChain.

In [None]:
from langchain.schema.document import Document
text = ""
for ind in df.index:
    text += f"{df['Luogo'][ind]} is kind of {df['Type of place'][ind]} of the part {df['group_of_region'][ind]} of Italy in province {df['province'][ind]} and region of {df['region'][ind]} that has {df['Maschi'][ind]} male population and {df['Femmine'][ind]} female population and {df['Totale'][ind]} persons as total population#####"
#Converting text to LangChain documents so that StuffDocumentsChain can understand Input
documents = Document(page_content=text, metadata={"source": "local"})
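
To make the generated text concrete, the sketch below builds the same kind of sentence for a single record copied from the `df.head(10)` preview, using a plain dictionary instead of a DataFrame. The helper name `row_to_sentence` is hypothetical and only mirrors the f-string template of the loop above.

```python
# Hypothetical helper that mirrors the sentence template used in the loop above.
def row_to_sentence(row: dict) -> str:
    return (
        f"{row['Luogo']} is kind of {row['Type of place']} "
        f"of the part {row['group_of_region']} of Italy "
        f"in province {row['province']} and region of {row['region']} "
        f"that has {row['Maschi']} male population "
        f"and {row['Femmine']} female population "
        f"and {row['Totale']} persons as total population#####"
    )

# One record copied from the table preview (index 5, Agliè).
sample_row = {
    "Luogo": "Agliè",
    "Type of place": "Cities",
    "group_of_region": "Nord-ovest",
    "province": "Torino",
    "region": "Piemonte",
    "Maschi": 1229,
    "Femmine": 1339,
    "Totale": 2568,
}

sentence = row_to_sentence(sample_row)
print(sentence)
```

Every record ends with the `#####` marker, which is what the chunking step relies on to tell locations apart.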

A **chunk** is a smaller segment of a larger text, used to make processing more efficient and manageable, especially in applications like information retrieval or language model inputs. Splitting text into chunks helps preserve context while staying within token limits of language models.

This code splits the large text document into smaller, overlapping chunks for better processing. It uses a RecursiveCharacterTextSplitter, specifying a chunk size of 250 characters and an overlap of 20 characters to retain context between chunks. The text is split at the separator "#####". The resulting chunks are stored in chunked_docs, and the print statement outputs the total number of chunks created.

In [None]:
# Split the text into chunks
# `separators` expects a list of strings, so the "#####" marker is passed as a one-element list
text_splitter = RecursiveCharacterTextSplitter(chunk_size=250, chunk_overlap=20, separators=["#####"])
chunked_docs  = text_splitter.split_documents([documents])

print(len(chunked_docs))

8057


In [None]:
chunked_docs[1345]

Document(metadata={'source': 'local'}, page_content='#####Albisola Superiore is kind of Cities of the part Nord-ovest of Italy in province Savona and region of Liguria that has 4483 male population and 5114 female population and 9597 persons as total population####')
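
The idea of one chunk per location can be checked without LangChain. The sketch below is a plain-Python approximation, not the RecursiveCharacterTextSplitter algorithm: it splits a two-record text on the `#####` marker and verifies that each record fits within the 250-character budget. The constant names are assumptions made for this illustration.

```python
SEPARATOR = "#####"
CHUNK_SIZE = 250  # same character budget as the splitter configured above

# Two records from the table preview, joined the same way as the notebook's text.
text = (
    "Agliè is kind of Cities of the part Nord-ovest of Italy in province Torino "
    "and region of Piemonte that has 1229 male population and 1339 female "
    "population and 2568 persons as total population#####"
    "Airasca is kind of Cities of the part Nord-ovest of Italy in province Torino "
    "and region of Piemonte that has 1871 male population and 1798 female "
    "population and 3669 persons as total population#####"
)

# Split on the marker and drop the empty piece left by the trailing separator.
records = [piece for piece in text.split(SEPARATOR) if piece]

# Records that would not fit into a single chunk of CHUNK_SIZE characters.
oversized = [record for record in records if len(record) > CHUNK_SIZE]

print(len(records), len(oversized))
```

If `oversized` were non-empty for the real data, those records would either exceed the chunk size or be cut in the middle of a sentence, so the chunk size is worth checking against the longest record.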

In [None]:
# Download Sentence Transformers Embedding From Hugging Face
embeddings = HuggingFaceEmbeddings(model_name = 'sentence-transformers/all-MiniLM-L6-v2')

In [None]:
# Converting the text Chunks into embeddings and saving the embeddings into FAISS Knowledge Base
docsearch = FAISS.from_documents(chunked_docs, embeddings)

docsearch.save_local(DB_FAISS_PATH)

The next cell sets up the conversational system that answers questions about population data for different types of areas. It initializes a ChatGroq model with a temperature, a model name, and an API key. A system message guides the assistant's behavior, and the `human` variable holds the placeholder for the user's input text. A ChatPromptTemplate combines both messages, and the prompt is piped into the model to form a chain. A retriever is configured to run a similarity search and return the top 3 matching chunks. Finally, a ConversationalRetrievalChain combines the model and the retriever to answer questions from the retrieved data and return the source documents it used.

In [None]:
llm = ChatGroq(temperature=0.5, model_name='Mixtral-8x7b-32768', groq_api_key="Groq API key") #Personal Groq API key must be added
# llm = ChatGroq(temperature=0.5, model_name='Llama3-8b-8192', groq_api_key="Groq API key")
system = """You are an assistant to answer the questions about population in different types of areas. Recognize the type of location and answer the question based on its information."""
human = "{text}"
# Reuse the `human` placeholder defined above instead of repeating the literal
prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])
chain = prompt | llm
retriever = docsearch.as_retriever(
    search_type="similarity",
    search_kwargs={'k':3}
)
qa = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    return_source_documents=True,
    verbose=True
)
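
The retriever's "top 3 matches" behavior can be illustrated with a toy ranking. The sketch below ranks hand-made 3-dimensional vectors by cosine similarity and keeps the best `k`. The vectors, names, and helper functions are invented for the illustration, and the FAISS store in the notebook may rank by a different distance measure.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=3):
    """Return the names of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Toy "embeddings" (real sentence embeddings have hundreds of dimensions).
toy_vectors = {
    "Prato (province)": [0.9, 0.1, 0.0],
    "Prato (city)": [0.8, 0.2, 0.1],
    "Assisi": [0.1, 0.9, 0.2],
    "Torino": [0.0, 0.2, 0.9],
}

query_vector = [1.0, 0.1, 0.0]
nearest = top_k(query_vector, toy_vectors, k=3)
print(nearest)
```

With `k=3`, the third result (Assisi) is only weakly related to the query, which shows why a fixed `k` can pull unrelated chunks into the prompt when few chunks are truly relevant.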

The following code sets up a simple chatbot interface using Gradio, which interacts with the previously defined LangChain model to answer questions about the population in Italy for 2023. The chatbot function handles user queries, providing responses based on a conversational history. If the user enters 'exit', the conversation ends; if the input is empty, the bot prompts for a valid query. For valid queries, the bot retrieves an answer using the ConversationalRetrievalChain, appends the question and answer to the chat history, and returns the response.

The Gradio interface (gr.Interface) is configured with:

*   A Textbox for user input, where the user can ask questions.
*   A State to maintain the chat history.
*   A Textbox for displaying the bot's response.

The interface is launched with a title and description, making it easy to interact with the chatbot.

In [None]:
import gradio as gr
def chatbot(query, chat_history):
    if query == 'exit':
        return 'Exiting', chat_history

    if query == '':
        return 'Please enter a valid prompt.', chat_history

    result = qa({"question": query, "chat_history": chat_history})
    chat_history.append((query, result['answer']))
    return result['answer'], chat_history

# Initial empty chat history
chat_history = []

# Define the Gradio interface
iface = gr.Interface(
    fn=chatbot,
    inputs=[
        gr.Textbox(lines=1, placeholder="Ask your question about population in Italy of the year 2023:", label="DataDive: Your Database Navigator:"),
        gr.State(chat_history)
    ],
    outputs=[
        gr.Textbox(label="Response"),
        gr.State()
    ],
    title="Chatbot based on LangChain using Mixtral",
    description="Enter your query and receive a response from the chatbot."
)

# Launch the interface
iface.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://bf77a9e09194786d65.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
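
The control flow of the `chatbot` function can be exercised without an LLM or a running interface. The sketch below is a minimal, assumption-laden variant: `fake_qa` is a hypothetical stand-in for the ConversationalRetrievalChain that simply echoes the question, and `chatbot_sketch` repeats the three branches (exit, empty input, valid query).

```python
# Hypothetical stand-in for the retrieval chain: echoes the question, calls no model.
def fake_qa(inputs: dict) -> dict:
    return {"answer": f"(stub answer to: {inputs['question']})"}

def chatbot_sketch(query, chat_history, qa_chain=fake_qa):
    if query == "exit":
        return "Exiting", chat_history
    if query == "":
        return "Please enter a valid prompt.", chat_history
    result = qa_chain({"question": query, "chat_history": chat_history})
    # Only valid queries are stored, so the history stays free of empty turns.
    chat_history.append((query, result["answer"]))
    return result["answer"], chat_history

history = []
reply_1, history = chatbot_sketch("What is the total population of Leini?", history)
reply_2, history = chatbot_sketch("", history)
reply_3, history = chatbot_sketch("exit", history)
print(len(history))
```

Only the first call adds a turn to the history; the empty input and the `exit` command return their fixed messages and leave the history unchanged.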




# **Assessment Part:**

RAGAS (Retrieval Augmented Generation Assessment) evaluates a RAG system from several angles: the retrieval system's ability to identify pertinent passages, the LLM's ability to use them correctly, and the overall quality of the generated output. Because the performance of each component in the RAG pipeline strongly influences the overall experience, Ragas provides metrics designed to assess each element of the pipeline independently.


In the context of RAGAS, ground truth refers to the reference (correct) answer against which the system's performance is evaluated. It serves as the benchmark for assessing the accuracy and relevance of the generated responses. For some RAGAS metrics, the model's output is compared with the ground truth to measure how well the system retrieves the correct information from external sources and uses it to generate an appropriate answer.

In [None]:
#Set up ground truth
eval_questions = [
     "What is the total population of province Prato?",
     "where is Assisi and what is the male population of there?",
     "which city in Napoli province is the most populous?",
     "Compare the population of men and women in the city of Roma.",
     "Tell me about the difference in sex between the people who live in Cusano Milanino?",
     "What is the total population of Leini?",
     "What is the total and Male population of Novara province?",
     "What is the female population of Palmi?",
     "What is the percentage of the total population of Italy that resides in the region Lombardy",
     "What is the ratio of male to female population in the province Latina?",
     "In Sicilia region, how does the female population compare to the male population in terms of percentage",
     "tell me about the population of women in Belmonte Mezzagno?",
     "what is the exact female population of Tivoli?",
     "How many male populations do reside in Ercolano?",
     "Tell me about the total population of Bari?",
     "How many people live in the city of Castellaneta?",
     "Which region in the Nord-est group has the most evenly balanced gender ratio?",
     "What is the male population of the region Piemonte?",
     "what is population of the region Emilia-Romagna?",
     "How does the male population of Alseno city compare to the female population"
]

eval_answers = [
     "The total population in Prato is 259244",
     "Assisi as a city in province Perugia and region of Umbria in center of Italy has 13339 male population",
     "The city of Napoli with total population of 917510 is the most populous city in the province of Napoli",
     "According to the provided data, the population of men in the city of Roma is 1308818, and the population of women is 1446491.",
     "The male population of Cusano Milanino is 8991, while the female population is 9900. Thus, the difference between the male and female population is 909.",
     "The total population of Leini is 16294.",
     "the total population Novara province is 362502 and male population in Novara province is 176980",
     "The female population of Palmi is 8733",
     "About 17% of all the people who live in Italy live in the area of Lombardy.",
     "The male to female population ratio in province Latina is 98%",
     "In Sicilia, the female population is approximately 51.3%, while the male population is 48.7%",
     "in the city of Belmonte Mezzagno, the women population is 5530",
     "The female population of Tivoli is 28032",
     "the male population in Ercolano is 24407",
     "Bari has a total population of 1225048 as a province, and 316736 as a city.",
     "the total population of Castellaneta is 16220 people",
     "The most balanced gender ratio in the Nord-est group is found in Veneto",
     "The male population of the region Piemonte is 2,072,771",
     "The total population of the region Emilia-Romagna is 4435758",
     "the male population in Alseno city is 2315, and the female population is 2374"
]

examples = [
    {"query": q, "ground_truth": [eval_answers[i]]}
    for i, q in enumerate(eval_questions)
]
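
Several reference answers contain derived quantities (a difference, a ratio, a percentage) rather than values read directly from the table. The sketch below shows how such figures can be recomputed, using the male and female counts quoted in the reference answer for Cusano Milanino. The variable names are illustrative and the rounding to one decimal is an assumption.

```python
# Counts quoted in the reference answer for Cusano Milanino.
male, female = 8991, 9900

# Absolute difference between the female and male population.
difference = female - male

# Male-to-female ratio and female share, both as percentages rounded to one decimal.
male_to_female_pct = round(100 * male / female, 1)
female_share_pct = round(100 * female / (male + female), 1)

print(difference, male_to_female_pct, female_share_pct)
```

Recomputing derived answers this way helps keep the ground truth consistent with the source table, which matters because answer correctness is scored against these reference strings.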

In [None]:
examples

In [None]:
from ragas.integrations.langchain import EvaluatorChain
#EvaluatorChain is a tool that helps assess the effectiveness of a RAG system integrated with LangChain by evaluating its retrieval and generation capabilities.

In [None]:
qasecond = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    return_source_documents=True,
    verbose=True
)

In [None]:
result0 = qasecond({"question": eval_questions[0] , "chat_history": []})
result0

**Metrics definition:**

These metrics fall into two groups, generation assessment and retrieval assessment, described as follows.


1. Generation metrics:

*   Faithfulness: This metric measures the factual consistency of the generated answer against the given context.
*   Answer Correctness: This metric measures the accuracy of the generated answer compared to the ground truth.
*   Answer Relevancy: This metric measures how pertinent the generated answer is to the question; incomplete or redundant answers score lower.

2. Retrieval metrics:

*   Context Recall: This metric measures how much the retrieved context aligns with the annotated answer, treated as the ground truth.
*   Context Precision: This metric measures whether the retrieved chunks that are relevant to the ground truth are ranked near the top of the retrieved contexts.
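
The arithmetic behind three of these metrics can be sketched with hand-supplied verdicts. In Ragas the per-claim and per-chunk verdicts come from an LLM judge; in the simplified sketch below they are given as booleans, and the function names are hypothetical rather than part of the Ragas API.

```python
def faithfulness_score(claim_supported):
    """Share of the answer's claims that the retrieved context supports."""
    return sum(claim_supported) / len(claim_supported)

def context_recall_score(statement_attributed):
    """Share of ground-truth statements that can be attributed to the context."""
    return sum(statement_attributed) / len(statement_attributed)

def context_precision_score(chunk_relevant):
    """Mean of precision@k over the ranks k that hold a relevant chunk."""
    hits = 0
    precisions = []
    for rank, relevant in enumerate(chunk_relevant, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

# Toy verdicts: 3 of 4 claims supported, 1 of 2 statements attributed,
# and relevant chunks retrieved at ranks 1 and 3.
faith = faithfulness_score([True, True, False, True])
recall = context_recall_score([True, False])
precision = context_precision_score([True, False, True])
print(faith, recall, round(precision, 4))
```

The context precision example shows the effect of ranking: the same two relevant chunks at ranks 1 and 2 would score 1.0, while placing one of them at rank 3 lowers the score.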


In [None]:
from ragas.integrations.langchain import EvaluatorChain
#from ragas.langchain.evalchain import RagasEvaluatorChain
from ragas.metrics import (
    faithfulness,
    answer_correctness,
    context_precision,
    context_recall,
    answer_relevancy
)

In [None]:
# Each chain is named after the metric it wraps
faithfulness_chain = EvaluatorChain(metric=faithfulness)
answer_correctness_chain = EvaluatorChain(metric=answer_correctness)
context_precision_chain = EvaluatorChain(metric=context_precision)
context_recall_chain = EvaluatorChain(metric=context_recall)

In [None]:
result0['answer']

'The total population of province Prato is 259244 persons.'

In [None]:
results = []
contexts = []
for query in eval_questions:
    resulttt = qasecond({"question": query , "chat_history" : []})

    results.append(resulttt['answer'])
    sources = resulttt["source_documents"]
    # Keep only the text of each retrieved chunk for this question
    contexts.append([doc.page_content for doc in sources])

In [None]:
resulttt

In [None]:
from datasets import Dataset

In [None]:
from ragas import evaluate


The following code defines a dictionary "d" that contains key evaluation components: question, answer, contexts, and ground_truth. The question and answer represent the inputs and outputs of the evaluation, while contexts refer to the relevant documents or information used in generating the answers, and ground_truth provides the correct answers for comparison. The dictionary d is then converted into a dataset using Dataset.from_dict(d). The evaluate function is used to assess the model's performance, using multiple metrics such as faithfulness, answer correctness, answer relevancy, context precision, and context recall. The evaluation results are converted into a Pandas DataFrame (score_df), which is then saved as a CSV file (EvaluationScores.csv) for further analysis and reporting, with UTF-8 encoding and no row index.
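
Before building the dataset, it can help to confirm that the four columns line up. The sketch below is a hypothetical sanity check, not part of Ragas: it verifies that all columns have the same length and that every `contexts` entry is a list of strings. The toy context text is a placeholder.

```python
def check_eval_columns(columns: dict) -> int:
    """Return the number of rows if the evaluation columns are consistent."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"Column lengths differ: {lengths}")
    for ctx in columns["contexts"]:
        if not isinstance(ctx, list) or not all(isinstance(c, str) for c in ctx):
            raise ValueError("Each 'contexts' entry must be a list of strings")
    return next(iter(lengths.values()))

# One-row toy dataset in the same shape as the dictionary d.
toy_columns = {
    "question": ["What is the total population of Leini?"],
    "answer": ["The total population of Leini is 16294."],
    "contexts": [["(text of a retrieved chunk)"]],
    "ground_truth": ["The total population of Leini is 16294."],
}

n_rows = check_eval_columns(toy_columns)
print(n_rows)
```

A length mismatch typically means a question failed during the retrieval loop, so catching it here is cheaper than discovering it after the evaluation calls have been made.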








In [None]:
d = {
    "question": eval_questions,
    "answer": results,
    "contexts": contexts,
    "ground_truth": eval_answers
}

dataset = Dataset.from_dict(d)
score = evaluate(dataset, metrics=[faithfulness, answer_correctness, answer_relevancy, context_precision, context_recall])
score_df = score.to_pandas()
score_df.to_csv("EvaluationScores.csv", encoding="utf-8", index=False)



In [None]:
score_df[['faithfulness','answer_correctness','answer_relevancy', 'context_precision', 'context_recall']].mean(axis=0)


Metric,Mean score
faithfulness,0.880357
answer_correctness,0.688299
answer_relevancy,0.726833
context_precision,0.9
context_recall,0.75
