# 🚀 LLMOps for Production RAG

<a target="_blank" href="https://colab.research.google.com/github/unionai-oss/llmops-production-rag/blob/main/workshop.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

Welcome to the LLMOps for Production RAG workshop! In this workshop, we will cover:

1. Creating a baseline RAG pipeline
2. Bootstrapping an evaluation dataset
3. Optimizing RAG hyperparameters


In [None]:
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    !git clone https://github.com/unionai-oss/llmops-production-rag.git
    %cd llmops-production-rag
    %pip install -r requirements.lock.txt
    %pip install gradio

In [None]:
%cd /content/llmops-production-rag
!union create login --auth device-flow --serverless

## 🔑 Create OpenAI API Key Secret on Union

First, go to https://platform.openai.com/account/api-keys and create an OpenAI API key.

Then, run the following command to make the secret accessible on Union:

In [None]:
!union create secret openai_api_key

In [None]:
!union get secret

If you have issues with the secret, you can delete it by uncommenting and running the command in the code cell below:

In [None]:
#!union delete secret openai_api_key

## 🗂️ Creating a Baseline RAG Pipeline

Create the vector store:

In [None]:
!union run --remote llmops_rag/vector_store.py create_vector_store --limit 10
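
For intuition about what a vector store does, here is a minimal sketch, not the workshop's implementation in `llmops_rag/vector_store.py`. It assumes a toy bag-of-words "embedding" in place of a real embedding model, and the names `embed`, `cosine`, and `ToyVectorStore` are hypothetical. The idea it illustrates is the usual one: embed every document once, then answer a query by ranking documents by similarity to the query embedding.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (document, embedding) pairs."""

    def __init__(self, docs: list[str]):
        self.entries = [(doc, embed(doc)) for doc in docs]

    def search(self, query: str, k: int = 1) -> list[str]:
        """Return the k documents most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


# Illustrative documents, not the workshop's knowledge base.
store = ToyVectorStore([
    "DataFrame.to_csv writes a dataframe to a csv file",
    "read_csv loads a csv file into a dataframe",
    "merge joins two dataframes on a key",
])
top = store.search("how do I write a dataframe to csv", k=1)
print(top)
```

A real pipeline would swap `embed` for a learned embedding model and the list scan for an approximate nearest-neighbor index, but the retrieve-by-similarity contract stays the same.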

Then run the simple RAG pipeline:

In [None]:
!union run --remote llmops_rag/rag_basic.py rag_basic --questions '["How do I read and write a pandas dataframe to csv format?"]'

You can also run the pipeline with an Ollama server:

In [None]:
!union run --remote llmops_rag/rag_basic.py rag_basic_ollama --questions '["How do I read and write a pandas dataframe to csv format?"]'

### 💻 Run RAG pipeline with Gradio App

In [None]:
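import gradio as gr

# `bot` and `add_message` are the chat handlers defined in this repo's app.py.
from app import bot, add_message

with gr.Blocks() as demo:
    chatbot = gr.Chatbot(elem_id="chatbot", bubble_full_width=False, type="messages")

    chat_input = gr.Textbox(
        interactive=True,
        placeholder="How do I write a dataframe to csv?",
        show_label=False,
    )
    # On submit, append the user's message to the chat history.
    chat_msg = chat_input.submit(
        add_message, [chatbot, chat_input], [chatbot, chat_input]
    )
    # Then generate the bot's reply from the updated history.
    bot_msg = chat_msg.then(bot, chatbot, chatbot, api_name="bot_response")
    # Finally, make the textbox interactive again for the next question.
    bot_msg.then(lambda: gr.Textbox(interactive=True), None, [chat_input])

demo.launch(debug=True)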

## 🥾 Bootstrapping an Evaluation Dataset

Next, generate a question-and-answer dataset from the raw knowledge base created
in the previous step.

In [None]:
!union run --remote llmops_rag/create_qa_dataset.py create_qa_dataset --n_questions_per_doc 5 --n_answers_per_question 5

Filter the dataset with an LLM critic:

In [None]:
!union run --remote llmops_rag/create_llm_filtered_dataset.py create_llm_filtered_dataset
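
The filtering step can be pictured as scoring each question-answer pair with a critic and keeping only the pairs above a threshold. The sketch below is an assumption about the general pattern, not the logic in `llmops_rag/create_llm_filtered_dataset.py`: `QAPair`, `filter_with_critic`, `stub_critic`, and the `min_score` threshold are all hypothetical, and the stub heuristic stands in for an actual LLM call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class QAPair:
    question: str
    answer: str


def filter_with_critic(
    pairs: list[QAPair],
    critic: Callable[[QAPair], float],
    min_score: float = 0.5,
) -> list[QAPair]:
    """Keep only the pairs whose critic score reaches min_score."""
    return [pair for pair in pairs if critic(pair) >= min_score]


def stub_critic(pair: QAPair) -> float:
    """Placeholder critic (assumption): a real critic would prompt an LLM for a score."""
    is_question = pair.question.strip().endswith("?")
    has_substance = len(pair.answer.split()) >= 3
    return 1.0 if is_question and has_substance else 0.0


# Illustrative pairs, not taken from the generated dataset.
pairs = [
    QAPair("How do I write a dataframe to csv?", "Call df.to_csv with a file path."),
    QAPair("pandas csv", "Yes."),
]
kept = filter_with_critic(pairs, stub_critic)
print([pair.question for pair in kept])
```

Passing the critic in as a function keeps the filtering logic independent of which model or prompt does the judging.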

## 📊 RAG Hyperparameter Optimization

Experiment with different chunk sizes:

In [None]:
!union run --remote llmops_rag/optimize_rag.py optimize_rag --gridsearch_config config/chunksize_experiment.yaml
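
A grid search evaluates every combination of the candidate values listed for each hyperparameter. As a rough sketch of that expansion, assuming a hypothetical search space (the keys `chunk_size` and `chunk_overlap` and their values are illustrative and are not the contents of `config/chunksize_experiment.yaml`):

```python
from itertools import product


def expand_grid(grid: dict[str, list]) -> list[dict]:
    """Return one dict per combination of the candidate values in `grid`."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


# Hypothetical search space, for illustration only.
example_grid = {"chunk_size": [256, 512, 1024], "chunk_overlap": [0, 64]}
configs = expand_grid(example_grid)

for config in configs:
    print(config)
```

The number of runs is the product of the list lengths (here 3 × 2 = 6), so each added hyperparameter multiplies the cost of the experiment.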

Experiment with different embedding models:

In [None]:
!union run --remote llmops_rag/optimize_rag.py optimize_rag --gridsearch_config config/embedding_model_experiment.yaml

## 🧪 More experiments to run

Experiment with different splitters:

In [None]:
!union run --remote llmops_rag/optimize_rag.py optimize_rag --gridsearch_config config/splitter_experiment.yaml

Experiment with reranking:

In [None]:
!union run --remote llmops_rag/optimize_rag.py optimize_rag --gridsearch_config config/reranking_experiment.yaml
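
Reranking generally means retrieving a broad set of candidates first, then reordering them with a second, usually more expensive, scorer and keeping the best few. The sketch below illustrates that two-stage idea only; it is not the workshop's reranker. `rerank` and `overlap_scorer` are hypothetical names, and the word-overlap score is a stand-in for a real reranking model.

```python
from typing import Callable


def overlap_scorer(query: str, doc: str) -> float:
    """Fraction of query words found in the document (assumption: stand-in for a reranking model)."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


def rerank(
    query: str,
    candidates: list[str],
    scorer: Callable[[str, str], float],
    top_n: int = 2,
) -> list[str]:
    """Reorder first-stage candidates by scorer(query, doc) and keep the top_n."""
    ordered = sorted(candidates, key=lambda doc: scorer(query, doc), reverse=True)
    return ordered[:top_n]


# Illustrative first-stage retrieval results, in their original order.
candidates = [
    "merge joins two dataframes on a key",
    "read_csv loads a csv file into a dataframe",
    "to_csv writes a dataframe to a csv file",
]
reranked = rerank("write a dataframe to csv", candidates, overlap_scorer, top_n=2)
print(reranked)
```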

Experiment with document retrieval search parameters:

In [None]:
!union run --remote llmops_rag/optimize_rag.py optimize_rag --gridsearch_config config/search_params_experiment.yaml