# RAG with Galileo, LangChain and GPT
Retrieval-Augmented Generation (RAG) is an architectural approach that improves large language model (LLM) applications by supplying them with relevant custom data at query time. In this example, we use LangChain, an orchestration framework for language pipelines, to build an assistant that loads information from a document (here, a local PDF) and uses it to answer user questions.

## Step 0: Configuring the environment

This step installs the libraries needed to connect to Galileo and the models.

In [None]:
!pip install langchain-community
!pip install langchain
!pip install langchain_openai
!pip install promptquality #Galileo
!pip install chromadb
!pip install sentence-transformers
!pip install openai
!pip install PyPDF
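
After installation, a quick check can confirm that each package is importable before moving on. The sketch below is not part of the original notebook: the helper name `missing_modules` is hypothetical, and the import names are assumed to correspond to the packages installed above.

```python
from importlib.util import find_spec

# Import names assumed to correspond to the pip packages installed above.
REQUIRED_MODULES = [
    "langchain",
    "langchain_community",
    "langchain_openai",
    "promptquality",
    "chromadb",
    "sentence_transformers",
    "openai",
    "pypdf",
]

def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

If any names are reported as missing, rerun the corresponding `pip install` line and restart the kernel.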

## Step 1: Data Loading

In this step, we use the LangChain framework to extract the content of a local PDF file containing the product documentation. `WebBaseLoader` is also imported as a starting point for loading data from web pages instead.

In [1]:
# Both loaders live in langchain_community; WebBaseLoader is only needed for web pages
from langchain_community.document_loaders import PyPDFLoader, WebBaseLoader

file_path = (
    "docs/AIStudioDoc.pdf"
)
pdf_loader = PyPDFLoader(file_path)
pdf_data = pdf_loader.load()

USER_AGENT environment variable not set, consider setting it to identify your requests.


## Step 2: Connect to Galileo
We use Galileo's Prompt Quality library (`promptquality`) to log in with an API key generated in the Galileo console. To create your API key, go to: https://console.hp.galileocloud.io/api-keys
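
The cells in this notebook read their keys from environment variables, and `os.environ[...]` raises a `KeyError` when a variable is not set. A minimal sketch of a pre-flight check follows; the helper `missing_env_vars` is hypothetical, and the variable names are the ones used in this notebook's cells.

```python
import os

def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]

# Variable names taken from the cells in this notebook.
missing = missing_env_vars(["GALILEO_API_KEY_HP", "OPENAI_API_KEY"])
if missing:
    print("Set these environment variables before continuing:", ", ".join(missing))
```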

In [2]:
import promptquality as pq
import os

os.environ['GALILEO_API_KEY'] = os.environ['GALILEO_API_KEY_HP']
galileo_url = "https://console.hp.galileocloud.io/"
config = pq.login(galileo_url)



👋 You have logged into 🔭 Galileo (https://console.hp.galileocloud.io/) as minh@rungalileo.io.


## Step 3: Model Selection

In this example, we use the GPT-3.5 model `gpt-3.5-turbo-instruct`, hosted by OpenAI, as our LLM. Other models can be swapped in, as the commented alternatives in the cell show.

In [4]:
import os

from langchain_openai import OpenAI
llm = OpenAI(model_name="gpt-3.5-turbo-instruct", api_key=os.environ["OPENAI_API_KEY"])

#from langchain_openai import ChatOpenAI
#llm = ChatOpenAI(model_name="gpt-4o-mini", api_key=os.environ["OPENAI_API_KEY"])

### Code to connect to Hugging Face models
#import yaml
#from langchain_community.llms import HuggingFaceEndpoint
#with open('config.yaml') as file:
    #hf_config = yaml.safe_load(file)  # separate name so the Galileo login `config` is not overwritten
#huggingfacehub_api_token = hf_config["hf_key"]
#repo_id = "mistralai/Mistral-7B-Instruct-v0.2"
#llm = HuggingFaceEndpoint(
   #huggingfacehub_api_token=huggingfacehub_api_token,
   #repo_id=repo_id,
#)

## Step 4: Embed, Chunk, Construct Chain
First, we split the loaded documents into chunks, so that we have smaller, more specific texts to add to our vector database.
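
To make the roles of chunk size and chunk overlap concrete, here is a simplified sketch that cuts text into fixed-size character windows. It is an illustration only: `chunk_text` is a hypothetical helper, and `RecursiveCharacterTextSplitter` behaves differently because it prefers to split on separators such as paragraph and line breaks.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
print(chunk_text(sample, chunk_size=10, chunk_overlap=4))
# ['abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz']
```

Each chunk repeats the last four characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when the overlap is large enough.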

Then, we transform the texts into embeddings and store them in a vector database. This allows us to run similarity searches and retrieve the chunks most relevant to a question.
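
The idea behind similarity search can be sketched without a vector database. The example below ranks chunks by cosine similarity, which is one common choice; the metric a given vector store actually uses depends on its configuration. The three-dimensional vectors and chunk labels are made up for illustration and stand in for real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, indexed, k=2):
    """Return the k texts whose vectors are most similar to query_vec."""
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Toy vectors standing in for real embeddings (values invented for this sketch).
index = [
    ("chunk about workspaces", [0.9, 0.1, 0.0]),
    ("chunk about projects", [0.1, 0.9, 0.1]),
    ("chunk about experiments", [0.0, 0.2, 0.9]),
]
print(top_k([0.8, 0.2, 0.1], index, k=1))
# ['chunk about workspaces']
```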

Next, we define a pipeline that receives a question, retrieves and formats the context documents, and uses the LLM to answer the question based on that context. The output is then parsed into a string for easy reading.
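
The flow of that pipeline can be sketched in plain Python, with stand-in functions in place of the retriever and the model. The names `build_prompt`, `answer`, `fake_retrieve`, and `echo_llm` are hypothetical and exist only so that the sketch runs without an API key; the notebook itself composes the same steps with LangChain runnables.

```python
def format_docs(chunks):
    """Join retrieved chunk texts with blank lines between them."""
    return "\n\n".join(chunks)

def build_prompt(instruction, context, query):
    """Place the system instruction, the context, and the question into one prompt."""
    return f"{instruction}\n\n{context}\n\nQuestion: {query}\n"

def answer(query, retrieve, llm, instruction):
    """Retrieve context, fill the prompt, call the model, and return plain text."""
    context = format_docs(retrieve(query))
    prompt = build_prompt(instruction, context, query)
    return str(llm(prompt)).strip()

# Hypothetical stand-ins so the sketch runs without any API key.
def fake_retrieve(query):
    return ["first retrieved chunk", "second retrieved chunk"]

def echo_llm(prompt):
    return f"(model output for a prompt of {len(prompt)} characters)"

print(answer("What is a workspace?", fake_retrieve, echo_llm, "Answer using the context:"))
```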

Finally, through callbacks, we choose the metrics we want to monitor in the Galileo console. We pass a list of queries to the chain and log the results to Galileo.
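
The experiment cell in this step sweeps chunk sizes, chunk overlaps, and system instructions, and each combination becomes one Galileo run tagged with its settings. As a rough sketch of how many runs that produces, the grid can be enumerated with `itertools.product`; the helper `run_configs` is hypothetical and the values mirror those defined in the cell.

```python
from itertools import product

CHUNK_SIZES = list(range(500, 800, 100))     # [500, 600, 700]
CHUNK_OVERLAPS = list(range(0, 300, 100))    # [0, 100, 200]
NUM_SYS_INSTRUCTIONS = 2                     # the notebook defines two system instructions

def run_configs(chunk_sizes, chunk_overlaps, num_instructions):
    """List every (chunk_size, chunk_overlap, instruction_index) combination."""
    return [
        {"chunk_size": size, "chunk_overlap": overlap, "sys_instruction_index": idx}
        for size, overlap, idx in product(chunk_sizes, chunk_overlaps, range(num_instructions))
    ]

configs = run_configs(CHUNK_SIZES, CHUNK_OVERLAPS, NUM_SYS_INSTRUCTIONS)
print(len(configs))  # 18
print(configs[0])    # {'chunk_size': 500, 'chunk_overlap': 0, 'sys_instruction_index': 0}
```

Every run sends the full list of queries to the model, so the total number of LLM calls is the number of runs multiplied by the number of queries.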

In [5]:
from typing import List

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Default sentence-transformers embedding model
embedding = HuggingFaceEmbeddings()
# Chunk sizes 500, 600, 700 and overlaps 0, 100, 200 (in characters)
CHUNK_SIZES = list(range(500, 800, 100))
CHUNK_OVERLAPS = list(range(0, 300, 100))

def format_docs(docs: List[Document]) -> str:
    return "\n\n".join([d.page_content for d in docs])

SYS_INSTRUCTIONS = [
    """You are a virtual assistant for a Data Science platform called AI Studio. Answer the question based on the following context:""",
    """You are a specialized virtual assistant for AI Studio, a comprehensive Data Science platform. Your role is to assist users with tasks related to data analysis, machine learning, and model building. Always base your answers on the provided context and ensure responses are directly relevant to the platform's functionality, tools, and workflows, prioritizing precision and clarity in your explanations.
    When answering user queries, ensure that responses are concise yet informative. If the context provided is insufficient or the information is unavailable, acknowledge this clearly, and offer alternative suggestions or next steps the user can take within the AI Studio platform. Avoid making unsupported assumptions and focus on providing actionable insights related to the user's Data Science tasks.
    In situations where users inquire about advanced data science topics (such as hyperparameter tuning, feature engineering, or algorithm selection), provide detailed, clear, and example-driven explanations. Make sure your guidance is digestible for users at various experience levels. Avoid unnecessary jargon, and when advanced technical terms are necessary, include brief definitions or explanations to ensure clarity.
    If the user’s query seems ambiguous or lacks enough information, prompt them for clarification to refine the scope of your response. This will help you provide more accurate and relevant information within the context of AI Studio’s features, such as model evaluation, data preprocessing, or code execution. Always aim for accuracy and ensure your response aligns with the platform’s capabilities and tools.""",

]

# Run your chain experiments across multiple inputs with the galileo callback
inputs = [
    "What is AI Studio",
    "How to create projects in AI Studio?",
    "How to monitor experiments?",
    "What are the different workspaces available?",
    "What, exactly, is a workspace?",
    "How to share my experiments with my team?",
    "Can I access my Git repository?",
    "Do I have access to files on my local computer?",
    "How do I access files on the cloud?",
    "Can I invite more people to my team?",
    "How do I install dependencies in AI Studio?",
    "How to set up version control in AI Studio?",
    "What kind of data formats can be imported?",
    "Can I schedule experiments to run automatically?",
    "How do I manage different environments?",
    "What are the options for running experiments?",
    "Can I collaborate with external users on projects?",
    "How can I track changes in the project?",
    "Is there a way to automate reporting?",
    "What integrations are supported?",
    "How do I update existing projects?",
    "What is the best way to organize my projects?",
    "How do I manage roles and permissions?",
    "Can I customize my workspace layout?",
    "What security features are in place?",
    "Is there a limit to the number of experiments I can run?",
    "How do I visualize experiment results?",
    "Can I deploy models directly from AI Studio?",
    "How do I back up my data?",
    "Can I use third-party libraries?",
    "Is there a way to optimize my experiments for performance?",
    "What are the best practices for data handling in AI Studio?",
    "How do I create pipelines for workflows?",
    "Can I set up notifications for completed tasks?",
    "What are the hardware requirements for running AI Studio?",
    "How do I manage resource usage?",
    "Can I import projects from other platforms?",
    "Is there support for real-time collaboration?",
    "How do I perform hyperparameter tuning?",
    "What are the limitations of the free tier?",
    "How can I extend AI Studio with custom tools?"
]

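for CHUNK_SIZE in CHUNK_SIZES:
    for CHUNK_OVERLAP in CHUNK_OVERLAPS:
        # Split and index once per chunking configuration; the index does not depend on the system instruction
        text_splitter = RecursiveCharacterTextSplitter(chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP)
        splits = text_splitter.split_documents(pdf_data)
        # A distinct collection name keeps chunks from different configurations out of Chroma's shared default collection
        vectordb = Chroma.from_documents(
            documents=splits,
            embedding=embedding,
            collection_name=f"aistudio_{CHUNK_SIZE}_{CHUNK_OVERLAP}",
        )
        retriever = vectordb.as_retriever()

        for idx, SYS_INSTRUCTION in enumerate(SYS_INSTRUCTIONS):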

            template = """[SYS_INSTRUCTION]

                {context}

                Question: {query}
                """.replace("[SYS_INSTRUCTION]", SYS_INSTRUCTION)
            prompt = ChatPromptTemplate.from_template(template)
            chain = {"context": retriever | format_docs, "query": RunnablePassthrough()} | prompt | llm | StrOutputParser()
            
            # Metadata tagging for our Galileo Evaluate run
            metadata_tag_vectordb = pq.RunTag(key="Vector Database", value=vectordb.__class__.__name__, tag_type=pq.TagType.GENERIC)
            metadata_tag_retriever = pq.RunTag(key="Retriever", value=retriever.get_name(), tag_type=pq.TagType.GENERIC)
            metadata_tag_embeddings = pq.RunTag(key="Embeddings", value=embedding.model_name, tag_type=pq.TagType.GENERIC)
            metadata_tag_orchestration = pq.RunTag(key="Orchestration", value="LangChain", tag_type=pq.TagType.GENERIC)
            metadata_tag_model = pq.RunTag(key="Model", value=llm.model_name, tag_type=pq.TagType.GENERIC)
            metadata_tag_CHUNK_SIZE = pq.RunTag(key="CHUNK_SIZE", value=str(CHUNK_SIZE), tag_type=pq.TagType.GENERIC)
            metadata_tag_CHUNK_OVERLAP = pq.RunTag(key="CHUNK_OVERLAP", value=str(CHUNK_OVERLAP), tag_type=pq.TagType.GENERIC)
            metadata_tag_SYS_INSTRUCTION = pq.RunTag(key="SYS_INSTRUCTION_INDEX", value=str(idx), tag_type=pq.TagType.GENERIC)
            run_tags = [metadata_tag_vectordb, metadata_tag_retriever, metadata_tag_embeddings, metadata_tag_orchestration, metadata_tag_model, metadata_tag_CHUNK_SIZE, metadata_tag_CHUNK_OVERLAP, metadata_tag_SYS_INSTRUCTION]

            # Create callback handler
            prompt_handler = pq.GalileoPromptCallback(
                project_name="AIStudio_RAG_Evaluate_35",
                scorers=[
                    pq.Scorers.context_adherence_plus,
                    pq.Scorers.correctness,
                    pq.Scorers.chunk_attribution_utilization_luna,
                    pq.Scorers.toxicity,
                    pq.Scorers.sexist,
                    pq.Scorers.pii,
                ],
                run_tags=run_tags
            )

            # run the chain
            chain.batch(inputs, config=dict(callbacks=[prompt_handler]))
            
            # publish the results of your run
            prompt_handler.finish()

Processing complete!: 100%|██████████| 5/5 [00:02<00:00,  1.79it/s]   


Initial job complete, executing scorers asynchronously. Current status:
rag_nli: Computing 🚧
cost: Computing 🚧
toxicity: Computing 🚧
sexist: Computing 🚧
pii: Computing 🚧
protect_status: Done ✅
latency: Done ✅
groundedness: Computing 🚧
factuality: Computing 🚧
🔭 View your prompt run on the Galileo console at: https://console.hp.galileocloud.io/prompt/chains/3562c0af-f907-48a5-bf96-cf0e351c90fa/9c846e65-abcf-40e6-a484-e4bcad859252?taskType=12


