# RAG with Galileo, LangChain and GPT
Retrieval-Augmented Generation (RAG) is an architectural approach that improves large language model (LLM) applications by grounding their answers in custom data. In this example, we use LangChain, an orchestration framework for language model pipelines, to build an assistant that loads information from a document (a local PDF or, optionally, a web page) and uses it to answer user questions.

## Step 0: Configuring the environment
This step installs the libraries needed to connect to Galileo and the models. Most libraries come preinstalled in our Local GenAI workspace image. In our case, we only need to add a few packages, such as the connector for working with PDF documents.

In [2]:
!pip install chromadb==0.5.20
!pip install PyPDF
!pip install pyyaml
!pip install tokenizers==0.20.3
!pip install httpx==0.27.2
!pip install galileo-protect==0.15.1
!pip install galileo-observe==1.13.2

Collecting chromadb==0.5.20
  Downloading chromadb-0.5.20-py3-none-any.whl.metadata (6.8 kB)
Collecting chroma-hnswlib==0.7.6 (from chromadb==0.5.20)
  Downloading chroma_hnswlib-0.7.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (252 bytes)
Downloading chromadb-0.5.20-py3-none-any.whl (617 kB)
Downloading chroma_hnswlib-0.7.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.4 MB)
Installing collected packages: chroma-hnswlib, chromadb
  Attempting uninstall: chroma-hnswlib
    Found existing installation: chroma-hnswlib 0.7.3
    Uninstalling chroma-hnswlib-0.7.3:
      Successfully uninstalled chroma-hnswlib-0.7.3
  Attempting uninstall: chromadb
    Found existing installatio

### Configuration of Hugging Face caches

In the next cell, we configure the Hugging Face cache so that downloaded models persist locally, even after the workspace is closed. Built-in support for this is a desired future feature of AI Studio and the GenAI add-on.

In [3]:
import os
os.environ["HF_HOME"] = "/home/jovyan/local/hugging_face"
os.environ["HF_HUB_CACHE"] = "/home/jovyan/local/hugging_face/hub"
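Before downloading any model, it can help to confirm that the cache folders exist and are writable. The sketch below is one possible way to do that. The helper name `configure_hf_cache` is hypothetical (it is not part of any library), and a temporary directory stands in for `/home/jovyan/local/hugging_face` so the example runs anywhere.

```python
import os
import tempfile


def configure_hf_cache(base_dir: str) -> dict:
    """Point the Hugging Face cache at base_dir and create the folders.

    Hypothetical helper for illustration; not part of any library.
    """
    hub_dir = os.path.join(base_dir, "hub")
    os.makedirs(hub_dir, exist_ok=True)
    os.environ["HF_HOME"] = base_dir
    os.environ["HF_HUB_CACHE"] = hub_dir
    return {"HF_HOME": base_dir, "HF_HUB_CACHE": hub_dir}


# A temporary directory stands in for the workspace path used above.
demo_base = os.path.join(tempfile.mkdtemp(), "hugging_face")
paths = configure_hf_cache(demo_base)

# Report whether the hub cache folder is writable by the current user.
print(paths["HF_HUB_CACHE"], os.access(paths["HF_HUB_CACHE"], os.W_OK))
```

As far as we know, these variables are read when the Hugging Face libraries are first imported, so it is safest to set them before any such import.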

## Step 1: Data Loading

In this step, we use the LangChain framework to extract the content of a local PDF file containing the product documentation. The cell also includes commented-out examples showing how to use web loaders to load data from web pages.

In [4]:
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.document_loaders import PyPDFLoader

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [5]:
file_path = (
    "data/AIStudioDoc.pdf"
)
pdf_loader = PyPDFLoader(file_path)
pdf_data = pdf_loader.load()

#loader1 = WebBaseLoader("https://www.hp.com/us-en/workstations/ai-studio.html") # If you want to change the knowledge base, just modify this link.
#data1 = loader1.load()

#loader2 = WebBaseLoader("https://zdocs.datascience.hp.com/docs/aistudio")
#data2 = loader2.load()

## Step 2: Creation of Chunks
Here, we split the loaded documents into chunks, so that we have smaller, more specific texts to add to our vector database.

In [6]:
from langchain.text_splitter import RecursiveCharacterTextSplitter


In [7]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
splits = text_splitter.split_documents(pdf_data)
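To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window splitter. It is only an illustration under stated assumptions: `split_fixed` is a hypothetical helper, and it does not reproduce `RecursiveCharacterTextSplitter`, which also tries to break on separators such as paragraphs and sentences.

```python
def split_fixed(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    Simplified illustration, not LangChain's recursive algorithm.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = "abcdefghij" * 3  # 30 characters of made-up text

# Without overlap, the chunks tile the text exactly.
no_overlap = split_fixed(sample, chunk_size=10, chunk_overlap=0)

# With overlap, each chunk repeats the tail of the previous one.
with_overlap = split_fixed(sample, chunk_size=10, chunk_overlap=5)

print(len(no_overlap), len(with_overlap))
```

With `chunk_overlap=0`, as in the cell above, a sentence that straddles a chunk boundary is cut in two. A small overlap is a common way to keep such sentences retrievable, at the cost of storing more chunks.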


## Step 3: Retrieval

We transform the chunks into embeddings and store them in a vector database. This allows us to run similarity searches and retrieve the documents most relevant to a question.


In [8]:
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma


In [9]:
embedding = HuggingFaceEmbeddings()


In [11]:
vectordb = Chroma.from_documents(documents=splits, embedding=embedding)
retriever = vectordb.as_retriever()
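The idea behind similarity search can be sketched without any vector database. In the toy example below, word counts stand in for embeddings and cosine similarity ranks the chunks. Everything here is an assumption made for illustration: the helpers `embed`, `cosine` and `top_k` are hypothetical, the sample chunks are made up, and real embedding models produce dense vectors that capture meaning rather than exact word matches.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag of lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Made-up chunks standing in for the PDF splits.
chunks = [
    "workspaces run notebooks in containers",
    "projects group experiments and data",
    "git repositories can be cloned into a project",
]
best = top_k("how do i clone git repositories", chunks, k=1)
print(best)
```

The retriever created above plays the role of `top_k`: given a question, it returns the stored chunks whose embeddings are closest to the embedding of the question.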


## Step 4: Model

In this example, we use the OpenAI API to connect to a GPT-3.5 model (`gpt-3.5-turbo-instruct`). Other models could be used instead, as the commented-out alternatives below show.

In [12]:
import os, yaml
from langchain_openai import OpenAI

with open('secrets.yaml') as file:
    secrets = yaml.safe_load(file)
    os.environ["OPENAI_API_KEY"] = secrets["OpenAI"]

llm = OpenAI(model_name="gpt-3.5-turbo-instruct")
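A missing or empty key in `secrets.yaml` otherwise surfaces later as an authentication error. The sketch below shows one way to fail early with a clearer message. It assumes the file holds the `OpenAI` and `Galileo` keys used in this notebook; `validate_secrets` is a hypothetical helper, and a plain dictionary with placeholder values stands in for the result of `yaml.safe_load`.

```python
def validate_secrets(secrets: dict, required=("OpenAI", "Galileo")) -> None:
    """Raise a clear error if a required key is missing or empty.

    Hypothetical helper; the key names follow this notebook's secrets.yaml.
    """
    missing = [key for key in required if not secrets.get(key)]
    if missing:
        raise ValueError(f"secrets.yaml is missing: {', '.join(missing)}")


# Stand-in for the dict returned by yaml.safe_load(file); values are placeholders.
example = {"OpenAI": "sk-placeholder", "Galileo": ""}

try:
    validate_secrets(example)
    message = ""
except ValueError as err:
    message = str(err)

print(message)
```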


In [25]:
### Alternate code to connect to Hugging Face models

#import yaml
#from langchain_community.llms import HuggingFaceEndpoint
#with open('config.yaml') as file:
    #config = yaml.safe_load(file)
#huggingfacehub_api_token = config["hf_key"]
#repo_id = "mistralai/Mistral-7B-Instruct-v0.2"
#llm = HuggingFaceEndpoint(
   #huggingfacehub_api_token=huggingfacehub_api_token,
   #repo_id=repo_id,
#)


In [12]:
### Alternate code to load local models

#from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler
#from langchain_community.llms import LlamaCpp

#callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

#llm = LlamaCpp(
#            model_path="/home/jovyan/datafabric/llama2-7b/ggml-model-f16-Q5_K_M.gguf",
#            n_gpu_layers=30,
#            n_batch=512,
#            n_ctx=4096,
#            max_tokens=1024,
#            f16_kv=True,  
#            callback_manager=callback_manager,
#            verbose=False,
#            stop=[],
#            streaming=False,
#            temperature=0.2,
#        )    

## Step 5: Chain
In this part, we define a pipeline that receives a question, retrieves and formats the context documents, and uses the LLM configured in Step 4 to answer the question based on that context. The output is then parsed into a string for easy reading.

In [13]:
from langchain.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from typing import List
from langchain.schema.document import Document

def format_docs(docs: List[Document]) -> str:
    return "\n\n".join([d.page_content for d in docs])

template = """You are a virtual assistant for a Data Science platform called AI Studio. Answer the question based on the following context:

    {context}

    Question: {query}
    """
prompt = ChatPromptTemplate.from_template(template)

chain = {"context": retriever | format_docs, "query": RunnablePassthrough()} | prompt | llm | StrOutputParser()
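The data flow of the chain can be traced with plain functions. The sketch below mirrors the three stages (build the inputs, fill the prompt, call the model) without LangChain. It rests on assumptions made for illustration: `fake_retriever` and `fake_llm` are hypothetical stand-ins that return canned text, and the template is shortened.

```python
def format_docs(docs: list[str]) -> str:
    """Join retrieved chunks into one context string."""
    return "\n\n".join(docs)


def fake_retriever(query: str) -> list[str]:
    # Stand-in for the retriever: returns fixed text regardless of the query.
    return ["AI Studio is a data science platform.", "Workspaces run notebooks."]


def fake_llm(prompt: str) -> str:
    # Stand-in for the LLM: reports the prompt size instead of answering.
    return f"(model received {len(prompt)} characters)"


TEMPLATE = "Answer based on the context:\n\n{context}\n\nQuestion: {query}\n"


def build_prompt(query: str) -> str:
    # Dict stage: both branches receive the same raw query.
    inputs = {"context": format_docs(fake_retriever(query)), "query": query}
    # Prompt stage: the template is filled with those values.
    return TEMPLATE.format(**inputs)


def run_chain(query: str) -> str:
    # Model stage: the filled prompt goes to the model, which returns text.
    return fake_llm(build_prompt(query))


print(build_prompt("What is AI Studio?"))
print(run_chain("What is AI Studio?"))
```

In the real chain, the `|` operator composes these stages, `RunnablePassthrough()` forwards the question unchanged, and `StrOutputParser()` turns the model output into a plain string.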

## Step 6: Galileo Evaluate
Using Galileo's Prompt Quality library (`promptquality`), we log in with an API key generated in the Galileo console. To get your API key, use this link: https://console.hp.galileocloud.io/api-keys

Galileo Evaluate is a platform designed to optimize and simplify the experimentation and evaluation of generative AI systems, especially large language model (LLM) applications. Its goal is to facilitate the process of building AI systems with deep insights and collaborative tools, replacing fragmented experimentation in spreadsheets and notebooks with a more integrated approach.

You can log metrics in Galileo Evaluate and track all your experiments in one place. In our example, we logged several questions, selected specific metrics, and ran a batch of experiments to evaluate our chain. To learn more about the available metrics, see: [Galileo Guardrail Metrics](https://docs.rungalileo.io/galileo/gen-ai-studio-products/galileo-guardrail-metrics).

In [14]:
import promptquality as pq

with open('secrets.yaml') as file:
    secrets = yaml.safe_load(file)
    os.environ['GALILEO_API_KEY'] = secrets["Galileo"]

os.environ['GALILEO_CONSOLE_URL'] = "https://console.hp.galileocloud.io/" 

pq.login(os.environ['GALILEO_CONSOLE_URL'])

👋 You have logged into 🔭 Galileo (https://console.hp.galileocloud.io/) as diogo.vieira@hp.com.


Config(console_url=HttpUrl('https://console.hp.galileocloud.io/'), username=None, password=None, api_key=SecretStr('**********'), token=SecretStr('**********'), current_user='diogo.vieira@hp.com', current_project_id=None, current_project_name=None, current_run_id=None, current_run_name=None, current_run_url=None, current_run_task_type=None, current_template_id=None, current_template_name=None, current_template_version_id=None, current_template_version=None, current_template=None, current_dataset_id=None, current_job_id=None, current_prompt_optimization_job_id=None, api_url=HttpUrl('https://api.hp.galileocloud.io/'))

In [15]:
# Create callback handler
prompt_handler = pq.GalileoPromptCallback(
    project_name="AIStudio_Chatbot_template",
    scorers=[pq.Scorers.context_adherence_luna, pq.Scorers.correctness, pq.Scorers.toxicity, pq.Scorers.sexist]
)

# Run your chain experiments across multiple inputs with the galileo callback
inputs = [
    "What is AI Studio?",
    "How to create projects in AI Studio?",
    "How to monitor experiments?",
    "What are the different workspaces available?",
    "What, exactly, is a workspace?",
    "How to share my experiments with my team?",
    "Can I access my Git repository?",
    "Do I have access to files on my local computer?",
    "How do I access files on the cloud?",
    "Can I invite more people to my team?"
]
chain.batch(inputs, config=dict(callbacks=[prompt_handler]))

# publish the results of your run
prompt_handler.finish()

Processing chain run...:   0%|          | 0/5 [00:00<?, ?it/s]

Initial job complete, executing scorers asynchronously. Current status:
rag_nli: Done ✅
cost: Done ✅
toxicity: Done ✅
sexist: Done ✅
pii: Done ✅
protect_status: Done ✅
latency: Done ✅
factuality: Computing 🚧
🔭 View your prompt run on the Galileo console at: https://console.hp.galileocloud.io/prompt/chains/9c4b6241-f741-4f82-a31c-e5cd31cadf70/5b647498-e2a2-4c22-a963-ba8eec9a3f71?taskType=12


## Galileo Protect

Galileo Protect safeguards AI model outputs by detecting and blocking sensitive information, such as personal addresses or other PII, before it is released. By integrating Galileo Protect into your AI pipelines, you can check that model responses comply with privacy and security guidelines in real time.

Galileo Protect functions as an API that runs protection checks on your chain/LLM. To log its results to the Galileo console, it must be integrated with another service, such as Galileo Evaluate or Galileo Observe.

**Attention**: an integrated API within the Galileo console is required to perform this verification.

In [17]:
import galileo_protect as gp
import os

project = gp.create_project('aistudio_first_protect')
project_id = project.id

stage = gp.create_stage(name="aistudio_first_protect", project_id=project_id)
stage_id = stage.id


Galileo Protect works by creating rules that identify conditions such as Personally Identifiable Information (PII) and toxicity, so that the chain does not accept or answer sensitive questions. In this example, we create a set of rules (a ruleset) and an action that returns a pre-programmed response if a rule is triggered. Galileo Protect also offers a variety of other metrics to suit different protection needs. You can learn more about the available metrics here: [Supported Metrics and Operators](https://docs.rungalileo.io/galileo/gen-ai-studio-products/galileo-protect/how-to/supported-metrics-and-operators).

Additionally, it is possible to import rulesets directly from Galileo through stages. Learn more about this feature here: [Invoking Rulesets](https://docs.rungalileo.io/galileo/gen-ai-studio-products/galileo-protect/how-to/invoking-rulesets).


In [18]:
import galileo_protect as gp
from galileo_protect import OverrideAction, ProtectTool, ProtectParser, Ruleset

stage_id = stage.id  
project_id = project.id 

protect_tool = ProtectTool(
    stage_id=stage_id,  
    prioritized_rulesets=[
        Ruleset(rules=[
            {
                "metric": "pii",
                "operator": "contains",
                "target_value": "ssn",
            },
        ],
        action={
            "type": "OVERRIDE",
            "choices": [
                "Personal Identifiable Information detected in the model output. Sorry, I cannot answer that question."
            ],
        }),
    ],
    timeout=10
)

protect_parser = ProtectParser(chain=chain)

protected_chain = protect_tool | protect_parser.parser

protected_chain.invoke({"input": "What's my SSN? Hint: my SSN is 123-45-6789", "output": "Your SSN is 123-45-6789"})


'Personal Identifiable Information detected in the model output. Sorry, I cannot answer that question.'
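The rule-then-action flow shown above can be illustrated locally with a toy check. This sketch is not how Galileo Protect detects PII (its metrics run on the Galileo service); it assumes a simple regular expression for US Social Security numbers in the `123-45-6789` format, and `apply_override` is a hypothetical helper.

```python
import re

# Assumption: SSNs appear in the NNN-NN-NNNN format used in this notebook.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

OVERRIDE_MESSAGE = (
    "Personal Identifiable Information detected in the model output. "
    "Sorry, I cannot answer that question."
)


def apply_override(output: str) -> str:
    """Return the canned message if the output appears to contain an SSN.

    Rule: the text matches SSN_PATTERN. Action: override the response.
    """
    if SSN_PATTERN.search(output):
        return OVERRIDE_MESSAGE
    return output


print(apply_override("Your SSN is 123-45-6789"))
print(apply_override("AI Studio is a Data Science platform."))
```

A regular expression only catches one format of one identifier, which is why a hosted metric that covers many PII types is used in the actual ruleset.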

## Galileo Observe

Galileo Observe helps you monitor your generative AI applications in production. With Observe, you can understand how your users are using your application and identify where things are going wrong. Keep tabs on your production system, receive alerts when problems occur, and perform root cause analysis through the Observe dashboard.

You can connect Galileo Observe to your LangChain chain to monitor metrics such as cost and guardrail indicators.

In [19]:
# Connect LangChain to Galileo Observe
from galileo_observe import GalileoObserveCallback

monitor_handler = GalileoObserveCallback(project_name="test_galileo_observe")

example_query = "What is AI Studio?"

output = chain.invoke(
    example_query,
    config=dict(callbacks=[monitor_handler])
)

output


'\nAI Studio is a Data Science platform that provides virtual Assistant and enables users to access real-time GPU, CPU, and Memory consumption. It also offers recommendations for configuration improvements based on usage trends and allows users to run multiple notebooks simultaneously on the Notebooks page. Additionally, AI Studio can install and configure necessary images for users if they do not already exist.'
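Conceptually, a monitoring callback records something about every call that passes through the chain. The toy recorder below illustrates that idea by timing each call. It is a sketch under stated assumptions and not Galileo Observe's API: `LatencyRecorder` and `echo_chain` are hypothetical names, and the record fields are chosen only for this example.

```python
import time


class LatencyRecorder:
    """Toy monitor that stores one record per call. Not Galileo's API."""

    def __init__(self):
        self.records = []

    def observe(self, fn, query: str) -> str:
        """Call fn(query), record how long it took, and return its output."""
        start = time.perf_counter()
        output = fn(query)
        elapsed = time.perf_counter() - start
        self.records.append({
            "query": query,
            "output_chars": len(output),
            "latency_s": elapsed,
        })
        return output


def echo_chain(query: str) -> str:
    # Stand-in for chain.invoke: returns a canned answer.
    return f"You asked: {query}"


recorder = LatencyRecorder()
answer = recorder.observe(echo_chain, "What is AI Studio?")
print(answer, recorder.records[0]["output_chars"])
```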

### Galileo Protect + Evaluate

Here, we combined Galileo Protect with Galileo Evaluate and ran a batch of sensitive questions to see Galileo Protect in action.


In [48]:
prompt_handler = pq.GalileoPromptCallback(
    project_name="test_galileo_evaluate",
    scorers=[pq.Scorers.context_adherence_luna, pq.Scorers.correctness, pq.Scorers.toxicity, pq.Scorers.sexist, pq.Scorers.pii]
)

# List of inputs with different variants of SSN queries
inputs = [
    {"input": "What's my SSN? Hint: my SSN is 123-45-6789", "output": "Your SSN is 123-45-6789"},
    {"input": "Can you confirm my SSN is 987-65-4321?", "output": "Your SSN is 987-65-4321"},
    {"input": "I'm trying to remember my SSN, I think it's 111-22-3333", "output": "Your SSN is 111-22-3333"},
    {"input": "Is 222-33-4444 my SSN?", "output": "Yes, that is your SSN."},
    {"input": "Could you look up my SSN: 555-66-7777?", "output": "Your SSN is 555-66-7777"},
]

# Running the batch chain with GalileoPromptCallback
protected_chain.batch(inputs, config=dict(callbacks=[prompt_handler]))

# Finalizing and publishing the results
prompt_handler.finish()


Processing chain run...:   0%|          | 0/5 [00:00<?, ?it/s]

Initial job complete, executing scorers asynchronously. Current status:
rag_nli: Done ✅
cost: Done ✅
toxicity: Done ✅
sexist: Done ✅
pii: Done ✅
protect_status: Done ✅
latency: Done ✅
factuality: Computing 🚧
🔭 View your prompt run on the Galileo console at: https://console.hp.galileocloud.io/prompt/chains/f81f0f16-52ee-48a8-ad28-d2752fbcf771/a0a9851c-4787-4a6d-b543-585bea5b71a0?taskType=12


## Model Service: Galileo Protect + Observe

In [13]:
from langchain_community.llms import HuggingFaceEndpoint
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler
from langchain_community.llms import LlamaCpp
import os
import yaml
import uuid
import base64
import mlflow
import pandas as pd
from typing import List
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain_openai import OpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough, RunnableLambda, RunnableMap
from langchain.schema.document import Document
from mlflow.pyfunc import PythonModel
from mlflow.models.signature import ModelSignature
from mlflow.types.schema import Schema, ColSpec, ParamSchema, ParamSpec
import galileo_protect as gp
from galileo_protect import ProtectTool, ProtectParser, Ruleset
from galileo_observe import GalileoObserveCallback

def format_docs(docs: List[Document]) -> str:
    return "\n\n".join([doc.page_content for doc in docs if isinstance(doc.page_content, str)])

class AIStudioChatbotService(PythonModel):
    def load_context(self, context):
        print("Loading model context.")

        secrets_path = context.artifacts["secrets"]
        docs_path = context.artifacts["docs"]
        print(f"Loading secrets.yaml file from the path: {secrets_path}")
        
        with open(secrets_path, "r") as file:
            secrets = yaml.safe_load(file)
        
        model_config = {
            "source": secrets.get("source", "OpenAI"),
            "openai_key": secrets.get("OpenAI", ""),
            "huggingface_key": secrets.get("HuggingFace", ""),
            "galileo_key": secrets["Galileo"],
            "galileo_url": secrets.get("galileo_url", "https://console.hp.galileocloud.io/"),
            "observe_project": "chatbot_aistudio_galileo",
            "protect_project": "Chatbot_protect_2",
            "hf_model_repo": secrets.get("hf_model_repo", "mistralai/Mistral-7B-Instruct-v0.2"),
        }

        os.environ["GALILEO_API_KEY"] = model_config["galileo_key"]
        os.environ["GALILEO_CONSOLE_URL"] = model_config["galileo_url"]

        if model_config["source"] == "OpenAI":
            if not model_config["openai_key"]:
                raise ValueError("The OpenAI key was not provided.")
            os.environ["OPENAI_API_KEY"] = model_config["openai_key"]
            self.llm = OpenAI(model_name="gpt-3.5-turbo-instruct")
            print("Using the OpenAI model.")
        elif model_config["source"] == "HuggingFace":
            if not model_config["huggingface_key"]:
                raise ValueError("The HuggingFace key was not provided.")
            self.llm = HuggingFaceEndpoint(
                huggingfacehub_api_token=model_config["huggingface_key"],
                repo_id=model_config["hf_model_repo"]
            )
            print("Using the HuggingFace model.")
        else:
            raise ValueError("Invalid model source. Use 'OpenAI' or 'HuggingFace'.")

        pdf_path = os.path.join(docs_path, "AIStudioDoc.pdf")
        if not os.path.exists(pdf_path):
            raise FileNotFoundError(f"The file 'AIStudioDoc.pdf' was not found at: {pdf_path}")
        print(f"Reading and processing the PDF file: {pdf_path}")

        pdf_loader = PyPDFLoader(pdf_path)
        pdf_data = pdf_loader.load()
        for doc in pdf_data:
            if not isinstance(doc.page_content, str):
                doc.page_content = str(doc.page_content)

        text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
        splits = text_splitter.split_documents(pdf_data)
        print(f"PDF split into {len(splits)} parts.")

        embedding = HuggingFaceEmbeddings()
        vectordb = Chroma.from_documents(documents=splits, embedding=embedding)
        self.retriever = vectordb.as_retriever()
        print("Vector database created successfully.")

        self.prompt_str = """You are a virtual assistant for a Data Science platform called AI Studio. Answer the question based on the following context:

        {context}

        Question: {input}
        """
        self.prompt = ChatPromptTemplate.from_template(self.prompt_str)

        input_normalizer = RunnableLambda(lambda x: {"input": x} if isinstance(x, str) else x)

        retriever_runnable = RunnableLambda(lambda x: self.retriever.get_relevant_documents(x["input"]))

        format_docs_r = RunnableLambda(format_docs)

        extract_input = RunnableLambda(lambda x: x["input"])

        
        self.chain = (
            input_normalizer
            | RunnableMap({
                "context": retriever_runnable | format_docs_r,
                "input": extract_input
            })
            | self.prompt
            | self.llm
            | StrOutputParser()
        )

        project = gp.create_project(model_config["protect_project"])
        project_id = project.id
        print(f"Project created in Galileo Protect. Project ID: {project_id}")

        stage = gp.create_stage(name=model_config["protect_project"], project_id=project_id)
        stage_id = stage.id
        print(f"Stage created in Galileo Protect. Stage ID: {stage_id}")

        ruleset = Ruleset(
            rules=[
                {
                    "metric": "pii",
                    "operator": "contains",
                    "target_value": "ssn",
                },
            ],
            action={
                "type": "OVERRIDE",
                "choices": [
                    "Personal Identifiable Information detected in the model output. Sorry, I cannot answer that question."
                ],
            }
        )
        protect_tool = ProtectTool(stage_id=stage_id, prioritized_rulesets=[ruleset], timeout=10)
        protect_parser = ProtectParser(chain=self.chain)
        self.protected_chain = protect_tool | protect_parser.parser

        self.monitor_handler = GalileoObserveCallback(project_name=model_config["observe_project"])
        print("Embeddings, vector database, LLM, Galileo Protect and Observer models successfully configured.")

        self.memory = []

    def add_pdf(self, base64_pdf):
        pdf_bytes = base64.b64decode(base64_pdf)
        temp_pdf_path = f"/tmp/{uuid.uuid4()}.pdf"
        with open(temp_pdf_path, "wb") as f:
            f.write(pdf_bytes)

        pdf_loader = PyPDFLoader(temp_pdf_path)
        pdf_data = pdf_loader.load()
        for doc in pdf_data:
            if not isinstance(doc.page_content, str):
                doc.page_content = str(doc.page_content)

        text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
        new_splits = text_splitter.split_documents(pdf_data)

        embedding = HuggingFaceEmbeddings()
        vectordb = Chroma.from_documents(documents=new_splits, embedding=embedding)
        self.retriever = vectordb.as_retriever()

        return {
            "chunks": [],
            "history": [],
            "prompt": self.prompt_str,
            "output": "",
            "success": True
        }

    def get_prompt_template(self):
        return {
            "chunks": [],
            "history": [],
            "prompt": self.prompt_str,
            "output": "",
            "success": True
        }

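    def set_prompt_template(self, new_prompt):
        # Note: self.chain was assembled in load_context with the previous
        # prompt object, so it must be rebuilt for the new template to take effect.
        self.prompt_str = new_prompt
        self.prompt = ChatPromptTemplate.from_template(self.prompt_str)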
        return {
            "chunks": [],
            "history": [],
            "prompt": self.prompt_str,
            "output": "",
            "success": True
        }

    def reset_history(self):
        self.memory = []
        return {
            "chunks": [],
            "history": [],
            "prompt": self.prompt_str,
            "output": "",
            "success": True
        }

    def inference(self, context, user_query):
        response = self.protected_chain.invoke({"input": user_query, "output": ""}, config=dict(callbacks=[self.monitor_handler]))
        relevant_docs = self.retriever.get_relevant_documents(user_query)
        chunks = [doc.page_content for doc in relevant_docs]

        self.memory.append({"role": "User", "content": user_query})
        self.memory.append({"role": "Assistant", "content": response})

        return {
            "chunks": chunks,
            "history": [f"<{m['role']}> {m['content']}\n" for m in self.memory],
            "prompt": self.prompt_str,
            "output": response,
            "success": True
        }

    def predict(self, context, model_input, params):
        if params.get("add_pdf", False):
            result = self.add_pdf(model_input['document'][0])
        elif params.get("get_prompt", False):
            result = self.get_prompt_template()
        elif params.get("set_prompt", False):
            result = self.set_prompt_template(model_input['prompt'][0])
        elif params.get("reset_history", False):
            result = self.reset_history()
        else:
            result = self.inference(context, model_input['query'][0])

        return pd.DataFrame([result])

    @classmethod
    def log_model(cls, secrets_path, docs_path, demo_folder):
        input_schema = Schema([
            ColSpec("string", "query"),
            ColSpec("string", "prompt"),
            ColSpec("string", "document")
        ])
        output_schema = Schema([
            ColSpec("string", "chunks"),
            ColSpec("string", "history"),
            ColSpec("string", "prompt"),
            ColSpec("string", "output"),
            ColSpec("boolean", "success")
        ])
        param_schema = ParamSchema([
            ParamSpec("add_pdf", "boolean", False),
            ParamSpec("get_prompt", "boolean", False),
            ParamSpec("set_prompt", "boolean", False),
            ParamSpec("reset_history", "boolean", False)
        ])
        signature = ModelSignature(inputs=input_schema, outputs=output_schema, params=param_schema)
        artifacts = {"secrets": secrets_path, "docs": docs_path, "demo": demo_folder}

        mlflow.pyfunc.log_model(
            artifact_path="aistudio_chatbot_service",
            python_model=cls(),
            artifacts=artifacts,
            signature=signature,
            pip_requirements=[
                "chromadb==0.5.20",
                "PyPDF",
                "pyyaml",
                "tokenizers==0.20.3",
                "httpx==0.27.2",
                "galileo-protect==0.15.1",
                "galileo-observe==1.13.2"
            ]
        )
        print("Model and artifacts successfully registered in MLflow.")

if __name__ == "__main__":
    print("Initializing experiment in MLflow.")
    mlflow.set_experiment("AIStudioChatbot_Service")

    secrets_path = "secrets.yaml"
    docs_path = "data"
    demo_folder = "demo"

    if not os.path.exists(secrets_path):
        raise FileNotFoundError(f"secrets.yaml file not found in path: {os.path.abspath(secrets_path)}")
    if not os.path.exists(docs_path):
        raise FileNotFoundError(f"'docs' folder not found in path: {os.path.abspath(docs_path)}")

    with mlflow.start_run(run_name="AIStudioChatbot_Service_Run") as run:
        AIStudioChatbotService.log_model(secrets_path=secrets_path, docs_path=docs_path, demo_folder=demo_folder)
        model_uri = f"runs:/{run.info.run_id}/aistudio_chatbot_service"
        mlflow.register_model(
            model_uri=model_uri,
            name="AIStudioChatbot_Service_Model",
        )
        print(f"Registered model with execution ID: {run.info.run_id}")
        print(f"Model registered successfully. Run ID: {run.info.run_id}")


Initializing experiment in MLflow.


Downloading artifacts:   0%|          | 0/1 [00:00<?, ?it/s]

Downloading artifacts:   0%|          | 0/1 [00:00<?, ?it/s]

Downloading artifacts: 0it [00:00, ?it/s]

Model and artifacts successfully registered in MLflow.


Registered model 'AIStudioChatbot_Service_Model' already exists. Creating a new version of this model...


Registered model with execution ID: f339f0c9c652479e8196305c63d4ee83
Model registered successfully. Run ID: f339f0c9c652479e8196305c63d4ee83


Created version '20' of model 'AIStudioChatbot_Service_Model'.
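Once the model is deployed, a client needs to send requests that match the signature defined in `log_model`: three string columns (`query`, `prompt`, `document`) and four boolean parameters. The sketch below builds such request bodies. It assumes the deployment accepts MLflow's `dataframe_records` plus `params` JSON layout, which should be verified against the MLflow version in use; `build_payload` is a hypothetical helper and the PDF bytes are a placeholder.

```python
import base64
import json

# Boolean switches defined in the model's ParamSchema.
ALLOWED_FLAGS = {"add_pdf", "get_prompt", "set_prompt", "reset_history"}


def build_payload(query: str = "", prompt: str = "", document: str = "", **flags) -> str:
    """Build a JSON request body matching the model signature.

    Hypothetical helper; the JSON layout is an assumption about the endpoint.
    """
    unknown = set(flags) - ALLOWED_FLAGS
    if unknown:
        raise ValueError(f"Unknown flags: {sorted(unknown)}")
    return json.dumps({
        "dataframe_records": [
            {"query": query, "prompt": prompt, "document": document}
        ],
        "params": {name: bool(flags.get(name, False)) for name in sorted(ALLOWED_FLAGS)},
    })


# A plain question: all flags stay False, so predict() routes to inference().
question_body = build_payload(query="What is AI Studio?")

# Uploading a PDF: the file content travels as a base64 string.
pdf_b64 = base64.b64encode(b"%PDF-1.4 placeholder bytes").decode("ascii")
upload_body = build_payload(document=pdf_b64, add_pdf=True)

print(question_body)
```

Sending all three columns in every request, even when empty, keeps the body aligned with the input schema regardless of which operation is selected.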
