## Load OpenAI API Key

### Add OPENAI_API_KEY secret

First, add your OpenAI API key to the Colab secrets.
Open the 'Secrets' tab in the left sidebar.
There, press 'Add new secret' and set the name to `OPENAI_API_KEY`.
Then set the value to your OpenAI API key.
Be sure to turn on the toggle for notebook access!

If the code below runs without errors, you are ready to go!

In [None]:
from google.colab import output
output.enable_custom_widget_manager()

In [None]:
from google.colab import userdata
import os
openai_api_key = userdata.get('OPENAI_API_KEY')
assert bool(openai_api_key), "You have to set OPENAI_API_KEY at colab secrets."
os.environ["OPENAI_API_KEY"] = openai_api_key
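If you also want to run this notebook outside Colab, a minimal sketch like the one below may help. It tries the Colab secret first and falls back to an environment variable. The helper name `get_openai_api_key` is hypothetical and is not part of AutoRAG or Colab.

```python
import os


def get_openai_api_key(secret_name="OPENAI_API_KEY"):
    """Return the key from Colab secrets if possible, else from the environment."""
    try:
        # This module is only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        userdata = None

    key = None
    if userdata is not None:
        try:
            key = userdata.get(secret_name)
        except Exception:
            # The secret is missing or notebook access is not granted.
            key = None

    # Fall back to a plain environment variable (e.g. on a local machine).
    if not key:
        key = os.environ.get(secret_name)
    if not key:
        raise RuntimeError(f"Set {secret_name} in Colab secrets or as an environment variable.")
    return key
```

Calling `os.environ["OPENAI_API_KEY"] = get_openai_api_key()` would then behave like the cell above inside Colab, while still working elsewhere.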

In [None]:
%%shell
apt-get remove -y python3-blinker
pip install blinker==1.8.2

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages were automatically installed and are no longer required:
  distro-info-data gir1.2-glib-2.0 gir1.2-packagekitglib-1.0 libappstream4 libgirepository-1.0-1
  libglib2.0-bin libpackagekit-glib2-18 libpolkit-agent-1-0 libpolkit-gobject-1-0 libstemmer0d
  libxmlb2 libyaml-0-2 lsb-release packagekit pkexec policykit-1 polkitd python-apt-common
  python3-apt python3-cffi-backend python3-cryptography python3-dbus python3-distro python3-gi
  python3-httplib2 python3-importlib-metadata python3-jeepney python3-jwt python3-keyring
  python3-lazr.uri python3-more-itertools python3-pkg-resources python3-pyparsing
  python3-secretstorage python3-six python3-wadllib python3-zipp
Use 'apt autoremove' to remove them.
The following packages will be REMOVED:
  python3-blinker python3-launchpadlib python3-lazr.restfulclient python3-oauthlib
  python3-software-properties software-propertie



In [None]:
%pip install -Uq ipykernel==5.5.6 ipywidgets-bokeh==1.0.2

  Preparing metadata (setup.py) ... done
  Building wheel for ipywidgets-bokeh (setup.py) ... done


In [None]:
%pip install -Uq "AutoRAG>=0.3.7" datasets

In [None]:
import nest_asyncio
nest_asyncio.apply()

## Download data and preprocess

In this tutorial, we will use the `eli5` dataset for evaluation.

In [None]:
import os
os.makedirs('/content/eli5_data', exist_ok=True)  # do not fail if the cell is re-run

In [None]:
import os

from datasets import load_dataset

def load_eli5_dataset(save_path):
    # set file path
    file_path = "MarkrAI/eli5_sample_autorag"

    # load dataset
    corpus_dataset = load_dataset(file_path, "corpus")['train'].to_pandas()
    qa_train_dataset = load_dataset(file_path, "qa")['train'].to_pandas()
    qa_test_dataset = load_dataset(file_path, "qa")['test'].to_pandas()

    # save data, refusing to overwrite files left by a previous run
    datasets_to_save = {
        "corpus.parquet": corpus_dataset,
        "qa_train.parquet": qa_train_dataset,
        "qa_test.parquet": qa_test_dataset,
    }
    for file_name in datasets_to_save:
        if os.path.exists(os.path.join(save_path, file_name)):
            raise ValueError(f"{file_name} already exists")
    for file_name, dataset in datasets_to_save.items():
        dataset.to_parquet(os.path.join(save_path, file_name))

In [None]:
load_eli5_dataset("/content/eli5_data")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Some datasets params were ignored: ['splits']. Make sure to use only valid params for the dataset builder and to have a up-to-date version of the `datasets` library.


In [None]:
import pandas as pd
qa_df = pd.read_parquet('/content/eli5_data/qa_train.parquet')
# In this sample code, we optimize the pipeline with only 50 samples.
sample_qa_df = qa_df.sample(50, random_state=42)
sample_qa_df = sample_qa_df.reset_index(drop=True)  # drop the old index instead of keeping it as a column
sample_qa_df.to_parquet('/content/eli5_data/qa_sample.parquet')

## Making config YAML file

This config tests three retrieval methods: vectordb, bm25, and hybrid_rrf.
It uses one prompt and the OpenAI gpt-4o-mini model for generation.
It also evaluates generation performance with meteor, rouge, and sem_score.

You can learn more about the config YAML file [here](https://marker-inc-korea.github.io/AutoRAG/optimization/custom_config.html).
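To give a feel for what an embedding-based metric such as sem_score measures, here is a minimal sketch of cosine similarity between two embedding vectors. It assumes the metric compares the embedding of the generated answer with the embedding of the ground-truth answer. The vectors are made-up toy values, not real OpenAI embeddings, and this is not AutoRAG's own implementation.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; treat it as 0
    return dot / (norm_a * norm_b)


# Toy embeddings (hypothetical values for illustration only).
generated_embedding = [0.2, 0.7, 0.1]
ground_truth_embedding = [0.25, 0.65, 0.05]

score = cosine_similarity(generated_embedding, ground_truth_embedding)
print(f"similarity: {score:.3f}")
```

A score close to 1 means the two answers point in almost the same direction in embedding space, even if their wording differs.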

In [None]:
%%writefile config.yaml

node_lines:
- node_line_name: retrieve_node_line
  nodes:
    - node_type: retrieval
      strategy:
        metrics: [retrieval_f1, retrieval_recall, retrieval_ndcg, retrieval_mrr]
      top_k: 3
      modules:
        - module_type: vectordb
        - module_type: bm25
        - module_type: hybrid_rrf
          weight_range: (4,80)
- node_line_name: post_retrieve_node_line
  nodes:
    - node_type: prompt_maker
      strategy:
        metrics:
          - metric_name: meteor
          - metric_name: rouge
          - metric_name: sem_score
            embedding_model: openai
      modules:
        - module_type: fstring
          prompt: "Read the passages and answer the given question. \n Question: {query} \n Passage: {retrieved_contents} \n Answer : "
    - node_type: generator
      strategy:
        metrics:
          - metric_name: meteor
          - metric_name: rouge
          - metric_name: sem_score
            embedding_model: openai
      modules:
        - module_type: openai_llm
          llm: gpt-4o-mini
          batch: 16 # If you have low tier at OpenAI, decrease this.

Writing config.yaml
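The `hybrid_rrf` module in this config combines the rankings of the other two retrievers. As a rough illustration of reciprocal rank fusion in general, each passage receives `1 / (k + rank)` from every ranking it appears in, and the scores are summed. The sketch below assumes that the `weight` value searched through `weight_range` plays the role of `k`. The function name and passage ids are made up, and AutoRAG's real implementation may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_k=3):
    """Fuse several ranked id lists into one list using reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (highest first); break ties by id for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in fused[:top_k]]


# Hypothetical top-3 results from the two base retrievers.
vectordb_ranking = ["doc_a", "doc_b", "doc_c"]
bm25_ranking = ["doc_b", "doc_d", "doc_a"]

fused_ids = reciprocal_rank_fusion([vectordb_ranking, bm25_ranking], k=4, top_k=3)
print(fused_ids)
```

A small `k` rewards passages that rank near the top of either list, while a large `k` flattens the differences between ranks. That trade-off is why the config searches over a range of values.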


**You must use one project folder per dataset.**
This means that if the dataset changes even slightly, you need to make a new project folder.

In [None]:
# make project folder
import os
os.makedirs('/content/project_dir', exist_ok=True)  # do not fail if the cell is re-run
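One way to follow the "one project folder per dataset" rule is to derive the folder name from the dataset files themselves, so that a changed dataset automatically lands in a new folder. The sketch below is only an illustration. The helpers `dataset_fingerprint` and `project_dir_for` are hypothetical and are not part of AutoRAG.

```python
import hashlib
from pathlib import Path


def dataset_fingerprint(*data_paths, length=8):
    """Short hash of the names and contents of the given dataset files."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in data_paths):
        digest.update(path.name.encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()[:length]


def project_dir_for(base_dir, *data_paths):
    """Create (if needed) and return a project folder tied to this exact dataset."""
    project_dir = Path(base_dir) / f"project_{dataset_fingerprint(*data_paths)}"
    project_dir.mkdir(parents=True, exist_ok=True)
    return project_dir
```

For example, `project_dir_for('/content', '/content/eli5_data/qa_sample.parquet', '/content/eli5_data/corpus.parquet')` would return the same folder on every run until one of the two files changes.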

## Run evaluation

In [None]:
from autorag.evaluator import Evaluator
evaluator = Evaluator(qa_data_path='/content/eli5_data/qa_sample.parquet', corpus_data_path='/content/eli5_data/corpus.parquet',
                      project_dir='/content/project_dir')

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.


0it [00:00, ?it/s]

Once the trial below finishes, you have successfully evaluated the RAG pipeline with your data!

You can check the results in the project directory, which is the `project_dir` folder in the file browser on your left. Browse its subfolders and open the `.csv` files.

In [None]:
evaluator.start_trial('/content/config.yaml')

100%|██████████| 77/77 [00:02<00:00, 31.36it/s]
[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.


Downloading builder script:   0%|          | 0.00/7.02k [00:00<?, ?B/s]

[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!
[nltk_data] Downloading package omw-1.4 to /root/nltk_data...


Generating embeddings:   0%|          | 0/5 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/5 [00:00<?, ?it/s]

100%|██████████| 77/77 [00:27<00:00,  2.80it/s]
[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!
[nltk_data] Downloading package omw-1.4 to /root/nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!


Generating embeddings:   0%|          | 0/50 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/50 [00:00<?, ?it/s]
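Instead of clicking through the folders to find the `.csv` result files mentioned earlier, you can list them from code. This is a small standard-library sketch, and the helper name `list_result_csvs` is made up for this example.

```python
from pathlib import Path


def list_result_csvs(project_dir):
    """Return all .csv files under project_dir as sorted relative paths."""
    root = Path(project_dir)
    return sorted(path.relative_to(root).as_posix() for path in root.rglob("*.csv"))


# Example usage inside this notebook:
# for csv_path in list_result_csvs('/content/project_dir'):
#     print(csv_path)
```

Each listed file can then be opened with `pd.read_csv` to inspect the scores recorded during the trial.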

## Check out the optimization result

You can run a dashboard from the evaluation result.
Just specify the trial folder and run the CLI command to launch the dashboard.

In [None]:
!autorag dashboard --trial_dir /content/project_dir/0

2024-09-25 04:03:22.209322: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-25 04:03:22.255631: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-25 04:03:22.268141: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-25 04:03:22.296941: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.






## Extract pipeline

Now, let's deploy the optimal pipeline found by the evaluation!

---

First, you need to save the optimal pipeline as a YAML file.

Let's create a `best.yaml` file using the `extract_best_config` function.

`output_path` must be a `.yaml` or `.yml` file. If it is None, the function does not save a YAML file and just returns the config as a dict.

In [None]:
from autorag.deploy import extract_best_config
extract_best_config(trial_path='/content/project_dir/0', output_path='/content/project_dir/0/best.yaml')

{'node_lines': [{'node_line_name': 'retrieve_node_line',
   'nodes': [{'node_type': 'retrieval',
     'strategy': {'metrics': ['retrieval_f1',
       'retrieval_recall',
       'retrieval_ndcg',
       'retrieval_mrr']},
     'modules': [{'module_type': 'HybridRRF',
       'top_k': 3,
       'weight': 4.0,
       'target_modules': ('VectorDB', 'BM25'),
       'target_module_params': ({'top_k': 3}, {'top_k': 3})}]}]},
  {'node_line_name': 'post_retrieve_node_line',
   'nodes': [{'node_type': 'prompt_maker',
     'strategy': {'metrics': [{'metric_name': 'meteor'},
       {'metric_name': 'rouge'},
       {'metric_name': 'sem_score', 'embedding_model': 'openai'}]},
     'modules': [{'module_type': 'Fstring',
       'prompt': 'Read the passages and answer the given question. \n Question: {query} \n Passage: {retrieved_contents} \n Answer : '}]},
    {'node_type': 'generator',
     'strategy': {'metrics': [{'metric_name': 'meteor'},
       {'metric_name': 'rouge'},
       {'metric_name': 'se
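Since the function returns a plain dict, you can inspect the chosen modules without reading the whole structure. The sketch below walks a config dict shaped like the output above. The helper `summarize_modules` is hypothetical, and `example_config` is an abbreviated copy of the visible part of that output (the generator node is omitted because its output is cut off).

```python
def summarize_modules(config):
    """List (node_line_name, node_type, module_type) for every module in a config dict."""
    summary = []
    for node_line in config["node_lines"]:
        for node in node_line["nodes"]:
            for module in node.get("modules", []):
                summary.append(
                    (node_line["node_line_name"], node["node_type"], module["module_type"])
                )
    return summary


# Abbreviated stand-in for the dict returned by extract_best_config.
example_config = {
    "node_lines": [
        {
            "node_line_name": "retrieve_node_line",
            "nodes": [
                {
                    "node_type": "retrieval",
                    "modules": [{"module_type": "HybridRRF", "top_k": 3, "weight": 4.0}],
                }
            ],
        },
        {
            "node_line_name": "post_retrieve_node_line",
            "nodes": [
                {"node_type": "prompt_maker", "modules": [{"module_type": "Fstring"}]}
            ],
        },
    ]
}

for row in summarize_modules(example_config):
    print(row)
```

In the notebook, you would pass the value returned by `extract_best_config(...)` instead of `example_config`.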

## Deploy your optimal RAG pipeline

Second, you can deploy the pipeline as a `CLI`, `API server`, or `Web Interface`.

### 1. Run as a CLI

You can use the optimal RAG pipeline right away by loading the extracted YAML file with `Runner`.

In [None]:
from autorag.deploy import Runner
runner = Runner.from_yaml('/content/project_dir/0/best.yaml', project_dir='/content/project_dir')
runner.run('who are you?')

'I am an AI language model created by OpenAI, designed to assist users by providing information, answering questions, and engaging in conversation based on the text input I receive.'
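To ask several questions in a row, you can wrap the `run` call shown above in a small helper. The sketch below relies only on the object having a `run(query)` method. `answer_questions` and `EchoRunner` are hypothetical names, and `EchoRunner` is a stand-in so the example works without an API key.

```python
def answer_questions(runner, questions):
    """Call runner.run once per question and collect the answers in a dict."""
    return {question: runner.run(question) for question in questions}


class EchoRunner:
    """Stand-in for a real runner; it only echoes the query back."""

    def run(self, query):
        return f"(answer to: {query})"


answers = answer_questions(EchoRunner(), ["who are you?", "why is the sky blue?"])
for question, answer in answers.items():
    print(question, "->", answer)
```

In the notebook, you would pass the `runner` created above instead of `EchoRunner()`. Note that each call sends requests to OpenAI.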

### 2. Run as an API server

You can also run this pipeline as an API server.

See the API endpoint documentation [here](https://marker-inc-korea.github.io/AutoRAG/deploy/api_endpoint.html).

Warning: the API server might not work on Colab.

In [None]:
%pip freeze | grep -i quart

Quart==0.19.8


In [None]:
from autorag.deploy import ApiRunner
runner = ApiRunner.from_yaml('/content/project_dir/0/best.yaml', project_dir='/content/project_dir')
runner.run_api_server()
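Once the server is running, a client in another process can send it a query over HTTP. The sketch below only builds the request with the standard library. The base URL, the `/v1/run` path, and the `{"query": ...}` body are assumptions for illustration, so check the API endpoint documentation linked above for the actual route and schema of your AutoRAG version.

```python
import json
import urllib.request


def build_run_request(query, base_url="http://localhost:8000", path="/v1/run"):
    """Build a JSON POST request; base_url, path, and body shape are assumptions."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request("who are you?")
print(request.get_method(), request.full_url)

# To actually send it while the server is up:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Keeping the URL and path as parameters makes it easy to adjust them once you have confirmed the real endpoint.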

### 3. Run as a Web Interface

You can also run this pipeline as a web interface.

See the web interface documentation [here](https://marker-inc-korea.github.io/AutoRAG/deploy/web.html).

Warning: the web interface might not work on Colab.

In [None]:
!autorag run_web --yaml_path /content/project_dir/0/best.yaml --project_dir /content/project_dir

2024-09-25 03:59:54.500453: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-25 03:59:54.537443: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-25 03:59:54.548693: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered




