# 🍫 Tune your RAG data pipeline and evaluate its performance

> ⚠️ This notebook can be run on your local machine or on a virtual machine and requires [Docker Compose](https://docs.docker.com/desktop/).
> Please note that it is not compatible with Google Colab as the latter does not support Docker.

In this notebook we demonstrate how to iteratively evaluate and tune a Retrieval-Augmented Generation (RAG) system using [Fondant](https://fondant.ai).

We will:

1. Set up a [Weaviate](https://weaviate.io/platform) vector store
2. Define a parameter set to test
3. Run a Fondant pipeline with those parameters to index our documents into the vector store
4. Run a Fondant pipeline with those parameters to evaluate the performance
5. Inspect the evaluation results and data between each processing step
6. Repeat steps 2-5 until we're happy with the results

<div align="center">
<img src="../art/iteration.png" width="1000"/>
</div>

## Set up environment

> ⚠️ This section checks the prerequisites of your environment. Read any errors or warnings carefully.

Ensure that a **Python version between 3.8 and 3.10** is available

In [1]:
import sys
if sys.version_info < (3, 8, 0) or sys.version_info >= (3, 11, 0):
    raise Exception(f"A Python version between 3.8 and 3.10 is required. You are running {sys.version}")

Check if **docker compose** is installed and the **docker daemon** is running

In [2]:
!docker compose version

Docker Compose version v2.19.1
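
The cell above only confirms that the Compose plugin is installed. As a minimal sketch (not part of the original notebook), the daemon can be probed as well by running `docker info`, which typically exits with a non-zero code when the daemon is unreachable. The helper name `command_succeeds` is hypothetical.

```python
import shutil
import subprocess


def command_succeeds(args, timeout=10):
    """Return True if the executable is on PATH and the command exits with code 0."""
    if shutil.which(args[0]) is None:
        return False  # executable not found
    try:
        result = subprocess.run(
            args,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


# `docker compose version` only needs the CLI plugin, while
# `docker info` has to talk to the daemon (assumption: it fails if the daemon is down).
compose_installed = command_succeeds(["docker", "compose", "version"])
daemon_running = command_succeeds(["docker", "info"])
print(f"compose installed: {compose_installed}, daemon running: {daemon_running}")
```

If `daemon_running` is `False` while `compose_installed` is `True`, starting Docker Desktop (or the Docker service) is the likely fix.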


Install the Fondant framework

In [4]:
!pip install -q -r ../requirements.txt --disable-pip-version-check && echo "Success"

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
datasets 2.16.1 requires fsspec[http]<=2023.10.0,>=2023.1.0, but you have fsspec 2023.12.2 which is incompatible.
Success


**Check if GPU is available**

In [3]:
import logging
import subprocess

try:
    subprocess.check_output('nvidia-smi')
    logging.info("Found GPU, using it!")
    number_of_accelerators = 1
    accelerator_name = "GPU"
except Exception:
    logging.warning("We recommend running this pipeline on a GPU, but none could be found; using CPU instead")
    number_of_accelerators = None
    accelerator_name = None



## Spin up the Weaviate vector store

> ⚠️ For **Apple M1/M2** chip users:
> 
> - In Docker Desktop Dashboard `Settings -> Features in development`, make sure to **un**check `Use containerd` for pulling and storing images. More info [here](https://docs.docker.com/desktop/settings/mac/#beta-features)
> - Make sure that Docker uses the linux/amd64 platform and not arm64 (the cell below should take care of that)

Run **Weaviate** with Docker Compose

In [4]:
!docker compose -f weaviate_service/docker-compose.yaml up --detach

[1A[1B[0G[?25l[+] Running 2/0
 [32m‚úî[0m Container weaviate_service-weaviate-1       [32mRunning[0m                     [34m0.0s [0m
 [32m‚úî[0m Container weaviate_service-contextionary-1  [32mRunning[0m                     [34m0.0s [0m
[?25h

Make sure you have **Weaviate client v3**

Make sure the vector database is running and accessible

In [5]:
import logging
import weaviate

try:
    local_weaviate_client = weaviate.Client("http://localhost:8081")
    logging.info("Connected to Weaviate instance")
except weaviate.WeaviateStartUpError:
    logging.error("Cannot connect to weaviate instance, is it running?")

            Please consider upgrading to the latest version. See https://weaviate.io/developers/weaviate/client-libraries/python for details.
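
Independently of the client library, reachability can also be checked over plain HTTP. The sketch below assumes Weaviate's standard REST readiness endpoint `/v1/.well-known/ready` and the port `8081` used in this notebook; the function name `weaviate_is_ready` is hypothetical.

```python
import urllib.error
import urllib.request


def weaviate_is_ready(base_url, timeout=2.0):
    """Return True if the readiness endpoint answers with HTTP 200."""
    url = base_url.rstrip("/") + "/v1/.well-known/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: treat as not ready.
        return False


print(weaviate_is_ready("http://localhost:8081"))
```

A check like this can be useful in a retry loop right after `docker compose up`, since the container may need a few seconds before it accepts requests.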


#### Indexing pipeline

`pipeline_index.py` processes text data and loads it into the vector database

<div align="center">
<img src="../art/indexing_ltr.png" width="800"/>
</div>

- [**Load data**](https://github.com/ml6team/fondant/tree/main/components/load_from_parquet): loads data from the Hugging Face Hub
- [**Chunk data**](https://github.com/ml6team/fondant/tree/main/components/chunk_text): divides the text into sections of a certain size and with a certain overlap
- [**Embed chunks**](https://github.com/ml6team/fondant/tree/main/components/embed_text): embeds each chunk as a vector, e.g. using [Cohere](https://cohere.com/embeddings)
- [**Index vector store**](https://github.com/ml6team/fondant/tree/main/components/index_weaviate): writes data and embeddings to the vector store
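
To make the effect of chunk size and overlap concrete, here is a simplified character-based sliding window. It is only an illustration of the two parameters, not the chunking component's actual implementation, and the function name `chunk_text` is chosen for this sketch.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last `chunk_overlap` characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.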

> 💡 This notebook defaults to the first 1000 rows of the [wikitext](https://huggingface.co/datasets/wikitext) dataset for demonstration purposes, but you can load your own dataset using one of the other load components available on the [**Fondant Hub**](https://fondant.ai/en/latest/components/hub/#component-hub) or by creating your own [**custom load component**](https://fondant.ai/en/latest/guides/implement_custom_components/). Keep in mind that changing the dataset means you also need to change the evaluation dataset used in the evaluation pipeline.

#### Create the indexing pipeline

We are reusing the indexing pipeline from the [indexing notebook](./indexing.ipynb). To do so, we have extracted the code into a separate file and created a function that parameterizes the entire pipeline.

In [6]:
import pipeline_index
import utils

# Path where data and artifacts will be stored
BASE_PATH = "./data"
utils.create_directory_if_not_exists(BASE_PATH)

# Parameters shared between indexing and evaluation pipeline
shared_args = {
    "base_path": BASE_PATH,
    "embed_model_provider": "huggingface",
    "embed_model": "all-MiniLM-L6-v2",
    "weaviate_url": f"http://{utils.get_host_ip()}:8081",
    "weaviate_class": "Pipeline1", # Capitalized, avoid special characters (_, =, -, etc.)
}

# Parameters for the indexing pipeline
indexing_args = {
    "n_rows_to_load": 1000,
    "chunk_args": {"chunk_size": 512, "chunk_overlap": 32}
}

# Parameters for the GPU resources
resources_args = {
    "number_of_accelerators": number_of_accelerators,
    "accelerator_name": accelerator_name,
}

indexing_pipeline = pipeline_index.create_pipeline(**shared_args, **indexing_args, **resources_args)
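
When iterating over several configurations, it can help to generate the parameter sets up front. The sketch below is one possible way to do that, under the assumption that each configuration should write to its own Weaviate class so that runs do not overwrite each other. The helper `weaviate_class_name` and the candidate values are hypothetical; the class names follow the constraint noted in the cell above (capitalized, no special characters).

```python
import itertools
import re


def weaviate_class_name(prefix, **params):
    """Build a capitalized class name without special characters."""
    parts = [prefix] + [f"{key.capitalize()}{value}" for key, value in sorted(params.items())]
    name = re.sub(r"[^0-9A-Za-z]", "", "".join(parts))
    return name[:1].upper() + name[1:]


# Candidate values to try (illustrative, not recommendations)
chunk_sizes = [256, 512]
chunk_overlaps = [32, 64]

parameter_sets = []
for chunk_size, chunk_overlap in itertools.product(chunk_sizes, chunk_overlaps):
    parameter_sets.append(
        {
            "weaviate_class": weaviate_class_name("Run", size=chunk_size, overlap=chunk_overlap),
            "chunk_args": {"chunk_size": chunk_size, "chunk_overlap": chunk_overlap},
        }
    )

for params in parameter_sets:
    print(params)
```

Each entry could then be merged into the dictionaries above, for example `{**shared_args, "weaviate_class": params["weaviate_class"]}` and `{**indexing_args, "chunk_args": params["chunk_args"]}`, before calling `pipeline_index.create_pipeline`.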

#### Run the indexing pipeline

> 💡 The first time you run a pipeline, you need to **download a Docker image for each component**, which may take a minute.

> 💡 Use a **GPU** or an external API to speed up the embedding step.

> 💡 Steps that have been processed before are **cached** and will be skipped in subsequent runs, which speeds up processing.
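
The general idea behind this kind of step caching can be sketched as follows: derive a key from the component and its arguments, and skip execution when that key has been seen before. This is a conceptual illustration only, not Fondant's implementation; `cache_key`, `run_step`, and the in-memory `cache` dictionary are hypothetical.

```python
import hashlib
import json


def cache_key(component_name, arguments):
    """Hash the component name and its arguments into a stable key."""
    payload = json.dumps({"component": component_name, "arguments": arguments}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


cache = {}  # cache key -> stored result (stand-in for output stored on disk)


def run_step(component_name, arguments, compute):
    """Return (result, was_cached); only call compute on a cache miss."""
    key = cache_key(component_name, arguments)
    if key in cache:
        return cache[key], True  # identical step seen before: skip execution
    cache[key] = compute(arguments)
    return cache[key], False


args = {"chunk_size": 512, "chunk_overlap": 32}
_, was_cached = run_step("chunk_text", args, lambda a: f"chunks for {a}")
print(was_cached)  # False
_, was_cached = run_step("chunk_text", args, lambda a: f"chunks for {a}")
print(was_cached)  # True
```

The sketch treats every step in isolation. In a real pipeline a step's input also matters, so re-running an upstream step generally means the downstream steps have to run again too.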

In [7]:
from fondant.pipeline.runner import DockerRunner

runner = DockerRunner()
runner.run(indexing_pipeline)

INFO:root:Found reference to un-compiled pipeline... compiling
INFO:fondant.pipeline.compiler:Compiling indexing-pipeline to .fondant/compose.yaml
INFO:fondant.pipeline.compiler:Base path found on local system, setting up ./data as mount volume
INFO:fondant.pipeline.pipeline:Sorting pipeline component graph topologically.
INFO:fondant.pipeline.pipeline:All pipeline component specifications match.
INFO:fondant.pipeline.compiler:Compiling service for load_from_hugging_face_hub
INFO:fondant.pipeline.compiler:Compiling service for chunktextcomponent
INFO:fondant.pipeline.compiler:Compiling service for embed_text
INFO:fondant.pipeline.compiler:Compiling service for index_weaviate
INFO:fondant.pipeline.compiler:Successfully compiled to .fondant/compose.yaml
 load_from_hugging_face_hub Pulling 
 embed_text Pulling 
 index_weaviate Pulling 
 chunktextcomponent Pulling 


Starting pipeline run...


 load_from_hugging_face_hub Pulled 
 chunktextcomponent Pulled 
 embed_text Pulled 
 index_weaviate Pulled 
 Container indexing-pipeline-load_from_hugging_face_hub-1  Recreate
 Container indexing-pipeline-load_from_hugging_face_hub-1  Recreated
 Container indexing-pipeline-chunktextcomponent-1  Recreate
 Container indexing-pipeline-chunktextcomponent-1  Recreated
 Container indexing-pipeline-embed_text-1  Recreate
 Container indexing-pipeline-embed_text-1  Recreated
 Container indexing-pipeline-index_weaviate-1  Recreate
 Container indexing-pipeline-index_weaviate-1  Recreated


Attaching to indexing-pipeline-chunktextcomponent-1, indexing-pipeline-embed_text-1, indexing-pipeline-index_weaviate-1, indexing-pipeline-load_from_hugging_face_hub-1


indexing-pipeline-load_from_hugging_face_hub-1  | [2024-02-06 12:18:43,582 | fondant.cli | INFO] Component `LoadFromHubComponent` found in module main
indexing-pipeline-load_from_hugging_face_hub-1  | [2024-02-06 12:18:43,588 | fondant.component.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
indexing-pipeline-load_from_hugging_face_hub-1  | [2024-02-06 12:18:43,592 | fondant.component.executor | INFO] Skipping component execution
indexing-pipeline-load_from_hugging_face_hub-1  | [2024-02-06 12:18:43,594 | fondant.component.executor | INFO] Matching execution detected for component. The last execution of the component originated from `indexing-pipeline-20240206095839`.
indexing-pipeline-load_from_hugging_face_hub-1  | [2024-02-06 12:18:43,599 | fondant.component.executor | INFO] Saving output manifest to /data/indexing-pipeline/indexing-pipeline-20240206131838/load_from_hugging_face_hub/man

indexing-pipeline-load_from_hugging_face_hub-1 exited with code 0
indexing-pipeline-chunktextcomponent-1          | Collecting langchain==0.0.329 (from -r requirements.txt (line 1))
indexing-pipeline-chunktextcomponent-1          |   Obtaining dependency information for langchain==0.0.329 from https://files.pythonhosted.org/packages/42/4e/86204994aeb2e4ac367a7fade896b13532eae2430299052eb2c80ca35d2c/langchain-0.0.329-py3-none-any.whl.metadata
indexing-pipeline-chunktextcomponent-1          |   Downloading langchain-0.0.329-py3-none-any.whl.metadata (16 kB)
indexing-pipeline-chunktextcomponent-1          | Collecting SQLAlchemy<3,>=1.4 (from langchain==0.0.329->-r requirements.txt (line 1))
indexing-pipeline-chunktextcomponent-1          |   Obtaining dependency information for SQLAlchemy<3,>=1.4 from https://files.pythonhosted.org/packages/7a/de/0ca53bf49d213bea164b0bd0187d3c94d6fea650b7679a8e41c91e3182d7/SQLAlchemy-2.0.25-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metad

indexing-pipeline-chunktextcomponent-1          | 
indexing-pipeline-chunktextcomponent-1          | [notice] A new release of pip is available: 23.2.1 -> 24.0
indexing-pipeline-chunktextcomponent-1          | [notice] To update, run: pip install --upgrade pip
indexing-pipeline-chunktextcomponent-1          | 
indexing-pipeline-chunktextcomponent-1          | [2024-02-06 12:18:56,376 | fondant.cli | INFO] Component `ChunkTextComponent` found in module main
indexing-pipeline-chunktextcomponent-1          | [2024-02-06 12:18:56,380 | fondant.component.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
indexing-pipeline-chunktextcomponent-1          | [2024-02-06 12:18:56,384 | fondant.component.executor | INFO] Previous component `load_from_hugging_face_hub` run was cached. Cached pipeline id: indexing-pipeline-20240206095839
indexing-pipeline-chunktextcomponent-1          | [2024-02-06 12:18:56

indexing-pipeline-chunktextcomponent-1 exited with code 0


indexing-pipeline-embed_text-1                  | [2024-02-06 12:19:00,042 | fondant.cli | INFO] Component `EmbedTextComponent` found in module main
indexing-pipeline-embed_text-1                  | [2024-02-06 12:19:00,049 | fondant.component.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
indexing-pipeline-embed_text-1                  | [2024-02-06 12:19:00,055 | fondant.component.executor | INFO] Caching disabled for the component
indexing-pipeline-embed_text-1                  | [2024-02-06 12:19:00,055 | root | INFO] Executing component
indexing-pipeline-embed_text-1                  | [2024-02-06 12:19:03,098 | sentence_transformers.SentenceTransformer | INFO] Load pretrained SentenceTransformer: all-MiniLM-L6-v2
.gitattributes:   0%|          | 0.00/1.18k [00:00<?, ?B/s]
.gitattributes: 100%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1.18k/1.18k [00:00<00:00, 1.77MB/s]
1_Pooling/config.json:  

[                                        ] | 0% Completed | 613.46 us
[                                        ] | 0% Completed | 116.45 ms
[                                        ] | 0% Completed | 216.90 ms
[                                        ] | 0% Completed | 322.53 ms
[                                        ] | 0% Completed | 430.40 ms
[                                        ] | 0% Completed | 530.65 ms
[                                        ] | 0% Completed | 631.18 ms
[                                        ] | 0% Completed | 731.55 ms
[                                        ] | 0% Completed | 832.06 ms
[                                        ] | 0% Completed | 939.03 ms
[                                        ] | 0% Completed | 1.04 s
[                                        ] | 0% Completed | 1.14 s
[                                        ] | 0% Completed | 1.24 s
[                                        ] | 0% Completed | 1.34 s
[                               

Batches:  11%|‚ñà         | 1/9 [00:04<00:35,  4.41s/it]
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  10%|‚ñà         | 1/10 [00:04<00:40,  4.55s/it]
indexing-pipeline-embed_text-1                  | [A[A


[                                        ] | 0% Completed | 4.55 s
[                                        ] | 0% Completed | 4.65 s
[                                        ] | 0% Completed | 4.75 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  11%|‚ñà         | 1/9 [00:04<00:37,  4.63s/it]
indexing-pipeline-embed_text-1                  | [A


[                                        ] | 0% Completed | 4.85 s
[                                        ] | 0% Completed | 4.96 s
[                                        ] | 0% Completed | 5.06 s
[                                        ] | 0% Completed | 5.16 s
[                                        ] | 0% Completed | 5.26 s
[                                        ] | 0% Completed | 5.37 s
[                                        ] | 0% Completed | 5.53 s
[                                        ] | 0% Completed | 5.65 s
[                                        ] | 0% Completed | 5.75 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  12%|‚ñà‚ñé        | 1/8 [00:05<00:39,  5.58s/it]
indexing-pipeline-embed_text-1                  | [A[A[A


[                                        ] | 0% Completed | 5.85 s
[                                        ] | 0% Completed | 5.95 s
[                                        ] | 0% Completed | 6.05 s
[                                        ] | 0% Completed | 6.15 s
[                                        ] | 0% Completed | 6.25 s
[                                        ] | 0% Completed | 6.36 s
[                                        ] | 0% Completed | 6.46 s
[                                        ] | 0% Completed | 6.56 s
[                                        ] | 0% Completed | 6.66 s
[                                        ] | 0% Completed | 6.76 s
[                                        ] | 0% Completed | 6.86 s
[                                        ] | 0% Completed | 6.96 s
[                                        ] | 0% Completed | 7.06 s
[                                        ] | 0% Completed | 7.16 s
[                                        ] | 0% Completed | 7.

indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  20%|‚ñà‚ñà        | 2/10 [00:07<00:30,  3.81s/it]
indexing-pipeline-embed_text-1                  | [A[A


[                                        ] | 0% Completed | 8.16 s
[                                        ] | 0% Completed | 8.26 s
[                                        ] | 0% Completed | 8.36 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  22%|‚ñà‚ñà‚ñè       | 2/9 [00:08<00:28,  4.07s/it]
indexing-pipeline-embed_text-1                  | [A


[                                        ] | 0% Completed | 8.46 s
[                                        ] | 0% Completed | 8.56 s
[                                        ] | 0% Completed | 8.66 s
[                                        ] | 0% Completed | 8.76 s
[                                        ] | 0% Completed | 8.86 s


Batches:  22%|‚ñà‚ñà‚ñè       | 2/9 [00:08<00:29,  4.27s/it]


[                                        ] | 0% Completed | 8.96 s
[                                        ] | 0% Completed | 9.06 s
[                                        ] | 0% Completed | 9.17 s
[                                        ] | 0% Completed | 9.27 s
[                                        ] | 0% Completed | 9.37 s
[                                        ] | 0% Completed | 9.47 s
[                                        ] | 0% Completed | 9.57 s
[                                        ] | 0% Completed | 9.67 s
[                                        ] | 0% Completed | 9.77 s
[                                        ] | 0% Completed | 9.87 s
[                                        ] | 0% Completed | 9.97 s
[                                        ] | 0% Completed | 10.07 s
[                                        ] | 0% Completed | 10.17 s
[                                        ] | 0% Completed | 10.27 s
[                                        ] | 0% Completed |

indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  25%|‚ñà‚ñà‚ñå       | 2/8 [00:10<00:30,  5.04s/it]
indexing-pipeline-embed_text-1                  | [A[A[A


[                                        ] | 0% Completed | 10.57 s
[                                        ] | 0% Completed | 10.67 s
[                                        ] | 0% Completed | 10.77 s
[                                        ] | 0% Completed | 10.87 s
[                                        ] | 0% Completed | 10.97 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  30%|‚ñà‚ñà‚ñà       | 3/10 [00:10<00:23,  3.42s/it]
indexing-pipeline-embed_text-1                  | [A[A


[                                        ] | 0% Completed | 11.08 s
[                                        ] | 0% Completed | 11.18 s
[                                        ] | 0% Completed | 11.28 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  33%|‚ñà‚ñà‚ñà‚ñé      | 3/9 [00:11<00:20,  3.46s/it]
indexing-pipeline-embed_text-1                  | [A


[                                        ] | 0% Completed | 11.38 s
[                                        ] | 0% Completed | 11.48 s
[                                        ] | 0% Completed | 11.59 s
[                                        ] | 0% Completed | 11.69 s
[                                        ] | 0% Completed | 11.81 s
[                                        ] | 0% Completed | 11.92 s
[                                        ] | 0% Completed | 12.02 s
[                                        ] | 0% Completed | 12.12 s
[                                        ] | 0% Completed | 12.22 s
[                                        ] | 0% Completed | 12.32 s
[                                        ] | 0% Completed | 12.42 s
[                                        ] | 0% Completed | 12.52 s


Batches:  33%|‚ñà‚ñà‚ñà‚ñé      | 3/9 [00:12<00:24,  4.09s/it]


[                                        ] | 0% Completed | 12.62 s
[                                        ] | 0% Completed | 12.72 s
[                                        ] | 0% Completed | 12.82 s
[                                        ] | 0% Completed | 12.92 s
[                                        ] | 0% Completed | 13.02 s
[                                        ] | 0% Completed | 13.12 s
[                                        ] | 0% Completed | 13.22 s
[                                        ] | 0% Completed | 13.32 s
[                                        ] | 0% Completed | 13.42 s
[                                        ] | 0% Completed | 13.52 s
[                                        ] | 0% Completed | 13.63 s
[                                        ] | 0% Completed | 13.73 s
[                                        ] | 0% Completed | 13.83 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  44%|‚ñà‚ñà‚ñà‚ñà‚ñç     | 4/9 [00:13<00:15,  3.17s/it]
indexing-pipeline-embed_text-1                  | [A


[                                        ] | 0% Completed | 13.93 s
[                                        ] | 0% Completed | 14.03 s
[                                        ] | 0% Completed | 14.13 s
[                                        ] | 0% Completed | 14.26 s
[                                        ] | 0% Completed | 14.38 s
[                                        ] | 0% Completed | 14.48 s
[                                        ] | 0% Completed | 14.58 s
[                                        ] | 0% Completed | 14.68 s
[                                        ] | 0% Completed | 14.78 s
[                                        ] | 0% Completed | 14.88 s
[                                        ] | 0% Completed | 14.98 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  40%|‚ñà‚ñà‚ñà‚ñà      | 4/10 [00:14<00:21,  3.65s/it]
indexing-pipeline-embed_text-1                  | [A[A
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  38%|‚ñà‚ñà‚ñà‚ñä      | 3/8 [00:14<00:24,  4.87s/it]
indexing-pipeline-embed_text-1                  | [A[A[A


[                                        ] | 0% Completed | 15.08 s
[                                        ] | 0% Completed | 15.18 s
[                                        ] | 0% Completed | 15.28 s
[                                        ] | 0% Completed | 15.38 s
[                                        ] | 0% Completed | 15.48 s
[                                        ] | 0% Completed | 15.58 s
[                                        ] | 0% Completed | 15.69 s
[                                        ] | 0% Completed | 15.79 s
[                                        ] | 0% Completed | 15.89 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  56%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñå    | 5/9 [00:15<00:10,  2.74s/it]
indexing-pipeline-embed_text-1                  | [A


[                                        ] | 0% Completed | 15.99 s
[                                        ] | 0% Completed | 16.09 s
[                                        ] | 0% Completed | 16.19 s
[                                        ] | 0% Completed | 16.29 s
[                                        ] | 0% Completed | 16.39 s


Batches:  44%|‚ñà‚ñà‚ñà‚ñà‚ñç     | 4/9 [00:16<00:19,  3.94s/it]


[                                        ] | 0% Completed | 16.49 s
[                                        ] | 0% Completed | 16.59 s
[                                        ] | 0% Completed | 16.69 s
[                                        ] | 0% Completed | 16.79 s
[                                        ] | 0% Completed | 16.89 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  67%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñã   | 6/9 [00:16<00:06,  2.14s/it]
indexing-pipeline-embed_text-1                  | [A


[                                        ] | 0% Completed | 16.99 s
[                                        ] | 0% Completed | 17.09 s
[                                        ] | 0% Completed | 17.19 s
[                                        ] | 0% Completed | 17.29 s
[                                        ] | 0% Completed | 17.39 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  78%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñä  | 7/9 [00:17<00:03,  1.64s/it]
indexing-pipeline-embed_text-1                  | [A


[                                        ] | 0% Completed | 17.49 s
[                                        ] | 0% Completed | 17.59 s
[                                        ] | 0% Completed | 17.69 s
[                                        ] | 0% Completed | 17.80 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  89%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñâ | 8/9 [00:17<00:01,  1.20s/it]
indexing-pipeline-embed_text-1                  | [A
Batches: 100%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 9/9 [00:17<00:00,  1.96s/it]
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 


[########                                ] | 20% Completed | 17.90 s
[########                                ] | 20% Completed | 18.00 s
[########                                ] | 20% Completed | 18.10 s
[########                                ] | 20% Completed | 18.20 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  50%|‚ñà‚ñà‚ñà‚ñà‚ñà     | 5/10 [00:18<00:17,  3.51s/it][A[A


[########                                ] | 20% Completed | 18.30 s
[########                                ] | 20% Completed | 18.40 s
[########                                ] | 20% Completed | 18.51 s
[########                                ] | 20% Completed | 18.61 s
[########                                ] | 20% Completed | 18.71 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  50%|‚ñà‚ñà‚ñà‚ñà‚ñà     | 4/8 [00:18<00:17,  4.41s/it][A[A[A


[########                                ] | 20% Completed | 18.81 s
[########                                ] | 20% Completed | 18.91 s
[########                                ] | 20% Completed | 19.01 s
[########                                ] | 20% Completed | 19.11 s


Batches:  56%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñå    | 5/9 [00:18<00:14,  3.54s/it]


[########                                ] | 20% Completed | 19.21 s
[########                                ] | 20% Completed | 19.32 s
[########                                ] | 20% Completed | 19.42 s
[########                                ] | 20% Completed | 19.52 s
[########                                ] | 20% Completed | 19.63 s
[########                                ] | 20% Completed | 19.73 s
[########                                ] | 20% Completed | 19.83 s
[########                                ] | 20% Completed | 19.93 s
[########                                ] | 20% Completed | 20.03 s
[########                                ] | 20% Completed | 20.13 s
[########                                ] | 20% Completed | 20.23 s
[########                                ] | 20% Completed | 20.33 s
[########                                ] | 20% Completed | 20.53 s
[########                                ] | 20% Completed | 20.71 s
[########                         

Batches:  67%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñã   | 6/9 [00:20<00:08,  2.88s/it]


[########                                ] | 20% Completed | 20.91 s
[########                                ] | 20% Completed | 21.01 s
[########                                ] | 20% Completed | 21.11 s
[########                                ] | 20% Completed | 21.21 s
[########                                ] | 20% Completed | 21.31 s


indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  60%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà    | 6/10 [00:21<00:13,  3.34s/it][A[A


[########                                ] | 20% Completed | 21.42 s
[########                                ] | 20% Completed | 21.52 s


Batches:  78%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñä  | 7/9 [00:21<00:04,  2.18s/it]


[########                                ] | 20% Completed | 21.62 s
[########                                ] | 20% Completed | 21.72 s


Batches:  89%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñâ | 8/9 [00:21<00:01,  1.57s/it]
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
indexing-pipeline-embed_text-1                  | 
Batches:  62%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñé   | 5/8 [00:21<00:11,  3.95s/it][A[A[A
Batches: 100%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 9/9 [00:21<00:00,  1.13s/it]
Batches: 100%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 9/9 [00:21<00:00,  2.42s/it]


[########                                ] | 20% Completed | 21.82 s
[########                                ] | 20% Completed | 21.94 s
Batches: 100%|██████████| 10/10 [00:23<00:00,  2.36s/it]
indexing-pipeline-embed_text-1                  | [2024-02-06 12:19:44,047 | fondant.component.executor | INFO] Saving output manifest to /data/indexing-pipeline/indexing-pipeline-20240206131838/embed_text/manifest.json
indexing-pipeline-embed_text-1                  | [2024-02-06 12:19:44,048 | fondant.component.executor | INFO] Writing cache key with manifest reference to /data/indexing-pipeline/cache/dd31849c92bae5c6f64b2cca40426e9f.txt


[########################                ] | 60% Completed | 23.78 s
[########################################] | 100% Completed | 23.88 s
indexing-pipeline-embed_text-1 exited with code 0


indexing-pipeline-index_weaviate-1              | [2024-02-06 12:19:48,587 | fondant.cli | INFO] Component `IndexWeaviateComponent` found in module main
indexing-pipeline-index_weaviate-1              | [2024-02-06 12:19:48,598 | fondant.component.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
indexing-pipeline-index_weaviate-1              | [2024-02-06 12:19:48,623 | fondant.component.executor | INFO] Previous component `embed_text` is not cached. Invalidating cache for current and subsequent components
indexing-pipeline-index_weaviate-1              | [2024-02-06 12:19:48,623 | fondant.component.executor | INFO] Caching disabled for the component
indexing-pipeline-index_weaviate-1              | [2024-02-06 12:19:48,624 | root | INFO] Executing component
indexing-pipeline-index_weaviate-1              |             Please consider upgrading to the latest version. See https://weaviate.io

indexing-pipeline-index_weaviate-1 exited with code 0
Finished pipeline run.


## Evaluation Pipeline

`pipeline_eval.py` evaluates retrieval performance using the questions provided in your test dataset.

<div align="center">
<img src="../art/evaluation_ltr.png" width="800"/>
</div>

- [**Load eval data**](https://github.com/ml6team/fondant/tree/main/components/load_from_csv): loads the evaluation dataset (questions) from a csv file
- [**Embed questions**](https://github.com/ml6team/fondant/tree/main/components/embed_text): embeds each question as a vector, e.g. using [Cohere](https://cohere.com/embeddings)
- [**Query vector store**](https://github.com/ml6team/fondant/tree/main/components/retrieve_from_weaviate): retrieves the most relevant chunks for each question from the vector store
- [**Evaluate**](https://github.com/ml6team/fondant/tree/0.8.0/components/evaluate_ragas): evaluates the retrieved chunks for each question, e.g. using [RAGAS](https://docs.ragas.io/en/latest/index.html)
- [**Aggregate**](https://github.com/ml6team/fondant-usecase-RAG/tree/main/src/components/aggregate_eval_results): calculates aggregated results
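
To make the **Query vector store** step more concrete, here is a minimal sketch of top-k retrieval by cosine similarity. It is only an illustration: the function names and the toy 3-dimensional vectors are invented for this example, and Weaviate itself uses an approximate nearest-neighbour index rather than this brute-force ranking.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_top_k(question_vec, chunk_vecs, top_k=2):
    """Return the ids of the top_k chunks most similar to the question."""
    ranked = sorted(
        chunk_vecs,
        key=lambda chunk_id: cosine_similarity(question_vec, chunk_vecs[chunk_id]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical embeddings, invented for illustration only.
toy_chunks = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.9, 0.1, 0.0],
    "chunk_c": [0.0, 1.0, 0.0],
}
toy_question = [1.0, 0.05, 0.0]

top_chunks = retrieve_top_k(toy_question, toy_chunks, top_k=2)
print(top_chunks)
```

The `top_k` argument plays the same role as `retrieval_top_k` in the evaluation arguments: it caps how many chunks are handed to the evaluator for each question.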

### Create the evaluation pipeline

⚠️ If you want to use an **OpenAI** model for evaluation, you will need an [API key](https://platform.openai.com/docs/quickstart) (see the TODO in the cell below).

Change the arguments below if you want to run the pipeline with different parameters.

In [8]:
evaluation_args = {
    "retrieval_top_k": 2,
    "llm_module_name": "langchain.chat_models",
    "llm_class_name": "ChatOpenAI",
    "llm_kwargs": {
        "openai_api_key": "",  # TODO: Update with your key or use a different model
        "model_name": "gpt-3.5-turbo",
    },
    "evaluation_metrics": ["context_precision", "context_relevancy"],
}
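
If you would rather not paste the key into the notebook, one option is to read it from an environment variable. The sketch below assumes the key has been exported as `OPENAI_API_KEY` before starting Jupyter; the variable name is a common convention, not something this notebook requires.

```python
import os

# Read the key from the environment; fall back to an empty string if it is unset.
# Assumption: the key was exported as OPENAI_API_KEY before launching the notebook.
openai_api_key = os.environ.get("OPENAI_API_KEY", "")

llm_kwargs = {
    "openai_api_key": openai_api_key,
    "model_name": "gpt-3.5-turbo",
}
```

The resulting dictionary has the same shape as `evaluation_args["llm_kwargs"]` and could be assigned to it.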

We begin by initializing our pipeline.

In [16]:
import pyarrow as pa
from fondant.pipeline import Pipeline
evaluation_pipeline = Pipeline(
        name="evaluation-pipeline",
        description="Pipeline to evaluate a RAG solution",
        base_path=shared_args["base_path"],
)


We have created a set of evaluation questions to measure the retrieval performance of the RAG system. The questions are stored in a CSV file, which we load with the reusable `load_from_csv` component.

In [17]:
evaluation_set_filename = "wikitext_1000_q.csv"

load_from_csv = evaluation_pipeline.read(
    "load_from_csv",
    arguments={
        "dataset_uri": "/evaldata/" + evaluation_set_filename,
        # mounted dir from within docker as extra_volumes
        "column_separator": ";",
    },
    produces={
        "question": pa.string(),
    },
)
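
To see how a semicolon-separated file maps to the `question` field, here is a small sketch using Python's standard `csv` module. The sample rows and the extra `source` column are invented for illustration; the real questions live in `wikitext_1000_q.csv`, and the actual component may parse the file differently.

```python
import csv
import io

# Hypothetical file content, invented for illustration only.
sample_csv = (
    "question;source\n"
    "Who wrote the novel?;doc_1\n"
    "When was the bridge built?;doc_2\n"
)

# delimiter=";" corresponds to the "column_separator" argument above.
reader = csv.DictReader(io.StringIO(sample_csv), delimiter=";")
questions = [row["question"] for row in reader]
print(questions)
```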

Next, we embed the questions so that we can retrieve matching chunks from the vector store. Here we once again use the reusable `embed_text` component.

In [21]:
from fondant.pipeline import Resources
embed_text_op = load_from_csv.apply(
    "embed_text",
    arguments={
        "model_provider": shared_args["embed_model_provider"],
        "model": shared_args["embed_model"]
    },
    consumes={
        "text": "question",
    },
    resources=Resources(
        accelerator_number=number_of_accelerators,
        accelerator_name=accelerator_name,
    ),
    cluster_type="local" if number_of_accelerators is not None else "default",
)

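Finally, we retrieve the most relevant chunks for each embedded question from the vector store, evaluate the retrieved chunks with RAGAS, and aggregate the results. These three steps use custom components imported from the local `components` package.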

In [19]:
from components.retrieve_from_weaviate import RetrieveFromWeaviateComponent
from components.evaluate_ragas import RagasEvaluator
from components.aggregrate_eval_results import AggregateResults

retrieve_chunks = embed_text_op.apply(
    RetrieveFromWeaviateComponent,
    arguments={
        "weaviate_url": shared_args["weaviate_url"],
        "class_name": shared_args["weaviate_class"],
        "top_k": evaluation_args["retrieval_top_k"],
    },
    cache=False,
)

retriever_eval = retrieve_chunks.apply(
    RagasEvaluator,
    arguments={
        "llm_module_name": evaluation_args["llm_module_name"],
        "llm_class_name": evaluation_args["llm_class_name"],
        "llm_kwargs": evaluation_args["llm_kwargs"],
    }
)

retriever_eval.apply(
    AggregateResults
)

 Consumes: {'question': {'type': 'string'}, 'embedding': {'type': 'array', 'items': {'type': 'float32'}}}
 Consumes: {'question': {'type': 'string'}, 'embedding': {'type': 'array', 'items': {'type': 'float32'}}, 'retrieved_chunks': {'type': 'array', 'items': {'type': 'string'}}, 'context_precision': {'type': 'float32'}, 'context_relevancy': {'type': 'float32'}}


<fondant.pipeline.pipeline.Dataset at 0x1370e6ad0>
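
The output above shows that the last component consumes one `context_precision` and one `context_relevancy` score per question. As a rough sketch of what aggregating them could look like, the example below averages each metric over all questions. The scores are invented, and averaging is an assumption for illustration; the actual `AggregateResults` component may compute something else.

```python
from statistics import mean

# Hypothetical per-question scores, invented for illustration only.
per_question = [
    {"context_precision": 1.0, "context_relevancy": 0.5},
    {"context_precision": 0.5, "context_relevancy": 0.25},
]
metric_names = ["context_precision", "context_relevancy"]

# Assumption: aggregation means taking the mean of each metric over all questions.
aggregated = {
    name: mean(row[name] for row in per_question) for name in metric_names
}
print(aggregated)
```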

#### Run the evaluation pipeline

In [20]:
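import os

# Only run the evaluation if the Weaviate class created by the indexing pipeline exists.
if utils.check_weaviate_class_exists(
    local_weaviate_client,
    shared_args["weaviate_class"]
):
    runner = DockerRunner()
    # Mount the local evaluation_datasets folder at /evaldata inside the containers.
    extra_volumes = [os.path.join(os.path.abspath("."), "evaluation_datasets") + ":/evaldata"]
    runner.run(evaluation_pipeline, extra_volumes=extra_volumes)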

INFO:root:Class Pipeline1 exists in Weaviate.


INFO:root:Found reference to un-compiled pipeline... compiling
INFO:fondant.pipeline.compiler:Compiling evaluation-pipeline to .fondant/compose.yaml
INFO:fondant.pipeline.compiler:Base path found on local system, setting up ./data as mount volume
INFO:fondant.pipeline.pipeline:Sorting pipeline component graph topologically.
INFO:fondant.pipeline.pipeline:All pipeline component specifications match.
INFO:fondant.pipeline.compiler:Compiling service for load_from_csv
INFO:fondant.pipeline.compiler:Compiling service for embed_text
INFO:fondant.pipeline.compiler:Compiling service for retrievefromweaviatecomponent
INFO:fondant.pipeline.compiler:Compiling service for ragasevaluator
INFO:fondant.pipeline.compiler:Compiling service for aggregateresults
INFO:fondant.pipeline.compiler:Successfully compiled to .fondant/compose.yaml
 retrievefromweaviatecomponent Pulling 
 embed_text Pulling 
 load_from_csv Pulling 
 aggregateresults Pulling 
 ragasevaluator Pulling 


Starting pipeline run...


 ragasevaluator Pulled 
 embed_text Pulled 
 load_from_csv Pulled 
 aggregateresults Pulled 
 retrievefromweaviatecomponent Pulled 
 Container evaluation-pipeline-load_from_csv-1  Recreate
 Container evaluation-pipeline-load_from_csv-1  Recreated
 Container evaluation-pipeline-embed_text-1  Recreate
 Container evaluation-pipeline-embed_text-1  Recreated
 Container evaluation-pipeline-retrievefromweaviatecomponent-1  Recreate
 Container evaluation-pipeline-retrievefromweaviatecomponent-1  Recreated
 Container evaluation-pipeline-ragasevaluator-1  Recreate
 Container evaluation-pipeline-ragasevaluator-1  Recreated
 Container evaluation-pipeline-aggregateresults-1  Recreate
 Container evaluation-pipeline-aggregateresults-1  Recreated


Attaching to evaluation-pipeline-aggregateresults-1, evaluation-pipeline-embed_text-1, evaluation-pipeline-load_from_csv-1, evaluation-pipeline-ragasevaluator-1, evaluation-pipeline-retrievefromweaviatecomponent-1


evaluation-pipeline-load_from_csv-1                  | [2024-02-06 12:29:18,986 | fondant.cli | INFO] Component `CSVReader` found in module main
evaluation-pipeline-load_from_csv-1                  | [2024-02-06 12:29:18,991 | fondant.component.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
evaluation-pipeline-load_from_csv-1                  | [2024-02-06 12:29:18,994 | fondant.component.executor | INFO] Skipping component execution
evaluation-pipeline-load_from_csv-1                  | [2024-02-06 12:29:18,996 | fondant.component.executor | INFO] Matching execution detected for component. The last execution of the component originated from `evaluation-pipeline-20240206105318`.
evaluation-pipeline-load_from_csv-1                  | [2024-02-06 12:29:19,002 | fondant.component.executor | INFO] Saving output manifest to /data/evaluation-pipeline/evaluation-pipeline-20240206132913/load_from_

evaluation-pipeline-load_from_csv-1 exited with code 0


evaluation-pipeline-embed_text-1                     | [2024-02-06 12:29:22,282 | fondant.cli | INFO] Component `EmbedTextComponent` found in module main
evaluation-pipeline-embed_text-1                     | [2024-02-06 12:29:22,287 | fondant.component.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
evaluation-pipeline-embed_text-1                     | [2024-02-06 12:29:22,290 | fondant.component.executor | INFO] Previous component `load_from_csv` run was cached. Cached pipeline id: evaluation-pipeline-20240206105318
evaluation-pipeline-embed_text-1                     | [2024-02-06 12:29:22,292 | fondant.component.executor | INFO] Skipping component execution
evaluation-pipeline-embed_text-1                     | [2024-02-06 12:29:22,293 | fondant.component.executor | INFO] Matching execution detected for component. The last execution of the component originated from `evaluation-pipeline

evaluation-pipeline-embed_text-1 exited with code 0
evaluation-pipeline-retrievefromweaviatecomponent-1  | Collecting weaviate-client==3.24.1 (from -r requirements.txt (line 1))
evaluation-pipeline-retrievefromweaviatecomponent-1  |   Obtaining dependency information for weaviate-client==3.24.1 from https://files.pythonhosted.org/packages/59/8f/44d164ed990f7c6faf28125925160af9004595020aeaaf01e94462e3bf8e/weaviate_client-3.24.1-py3-none-any.whl.metadata
evaluation-pipeline-retrievefromweaviatecomponent-1  |   Downloading weaviate_client-3.24.1-py3-none-any.whl.metadata (3.3 kB)
evaluation-pipeline-retrievefromweaviatecomponent-1  | Collecting validators<1.0.0,>=0.21.2 (from weaviate-client==3.24.1->-r requirements.txt (line 1))
evaluation-pipeline-retrievefromweaviatecomponent-1  |   Obtaining dependency information for validators<1.0.0,>=0.21.2 from https://files.pythonhosted.org/packages/3a/0c/785d317eea99c3739821718f118c70537639aa43f96bfa1d83a71f68eaf6/validators-0.22.0-py3-none-any.

evaluation-pipeline-retrievefromweaviatecomponent-1  | 
evaluation-pipeline-retrievefromweaviatecomponent-1  | [notice] A new release of pip is available: 23.2.1 -> 24.0
evaluation-pipeline-retrievefromweaviatecomponent-1  | [notice] To update, run: pip install --upgrade pip
evaluation-pipeline-retrievefromweaviatecomponent-1  | 
evaluation-pipeline-retrievefromweaviatecomponent-1  | [2024-02-06 12:29:27,117 | fondant.cli | INFO] Component `RetrieveFromWeaviateComponent` found in module main
evaluation-pipeline-retrievefromweaviatecomponent-1  | [2024-02-06 12:29:27,122 | fondant.component.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
evaluation-pipeline-retrievefromweaviatecomponent-1  | [2024-02-06 12:29:27,124 | fondant.component.executor | INFO] Caching disabled for the component
evaluation-pipeline-retrievefromweaviatecomponent-1  | [2024-02-06 12:29:27,125 | root | INFO] Executing c

[                                        ] | 0% Completed | 625.37 us
[                                        ] | 0% Completed | 101.04 ms
[                                        ] | 0% Completed | 201.95 ms
[########################################] | 100% Completed | 302.22 ms


evaluation-pipeline-retrievefromweaviatecomponent-1  | [2024-02-06 12:29:27,982 | fondant.component.executor | INFO] Saving output manifest to /data/evaluation-pipeline/evaluation-pipeline-20240206132913/retrievefromweaviatecomponent/manifest.json
evaluation-pipeline-retrievefromweaviatecomponent-1  | [2024-02-06 12:29:27,982 | fondant.component.executor | INFO] Writing cache key with manifest reference to /data/evaluation-pipeline/cache/7046a1d39ca5adb567e53d6d7bbc0c86.txt


evaluation-pipeline-retrievefromweaviatecomponent-1 exited with code 0
evaluation-pipeline-ragasevaluator-1                 | Collecting ragas==0.0.21 (from -r requirements.txt (line 1))
evaluation-pipeline-ragasevaluator-1                 |   Obtaining dependency information for ragas==0.0.21 from https://files.pythonhosted.org/packages/24/96/1b72b4081f53f0bfcf42525edfbf5a544fca597aaefde3272c28b70f080c/ragas-0.0.21-py3-none-any.whl.metadata
evaluation-pipeline-ragasevaluator-1                 |   Downloading ragas-0.0.21-py3-none-any.whl.metadata (4.6 kB)
evaluation-pipeline-ragasevaluator-1                 | Collecting datasets (from ragas==0.0.21->-r requirements.txt (line 1))
evaluation-pipeline-ragasevaluator-1                 |   Obtaining dependency information for datasets from https://files.pythonhosted.org/packages/ec/93/454ada0d1b289a0f4a86ac88dbdeab54921becabac45da3da787d136628f/datasets-2.16.1-py3-none-any.whl.metadata
evaluation-pipeline-ragasevaluator-1                 |

evaluation-pipeline-ragasevaluator-1                 | ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
evaluation-pipeline-ragasevaluator-1                 | gcsfs 2023.12.2.post1 requires fsspec==2023.12.2, but you have fsspec 2023.10.0 which is incompatible.
evaluation-pipeline-ragasevaluator-1                 | adlfs 2024.1.0 requires fsspec>=2023.12.0, but you have fsspec 2023.10.0 which is incompatible.
evaluation-pipeline-ragasevaluator-1                 | s3fs 2023.12.2 requires fsspec==2023.12.2, but you have fsspec 2023.10.0 which is incompatible.
evaluation-pipeline-ragasevaluator-1                 | 
evaluation-pipeline-ragasevaluator-1                 | [notice] A new release of pip is available: 23.2.1 -> 24.0
evaluation-pipeline-ragasevaluator-1                 | [notice] To update, run: pip install --upgrade pip
evaluation-pipeline-ragasevaluator

evaluation-pipeline-ragasevaluator-1                 | Successfully installed SQLAlchemy-2.0.25 annotated-types-0.6.0 anyio-4.2.0 dataclasses-json-0.6.4 datasets-2.16.1 dill-0.3.7 distro-1.9.0 filelock-3.13.1 fsspec-2023.10.0 greenlet-3.0.3 h11-0.14.0 httpcore-1.0.2 httpx-0.26.0 huggingface-hub-0.20.3 jsonpatch-1.33 jsonpointer-2.4 langchain-0.1.5 langchain-community-0.0.17 langchain-core-0.1.18 langsmith-0.0.86 marshmallow-3.20.2 multiprocess-0.70.15 mypy-extensions-1.0.0 nest-asyncio-1.6.0 openai-1.11.1 pyarrow-hotfix-0.6 pydantic-2.6.1 pydantic-core-2.16.2 pysbd-0.3.4 ragas-0.0.21 regex-2023.12.25 sniffio-1.3.0 tenacity-8.2.3 tiktoken-0.5.2 tqdm-4.66.1 typing-inspect-0.9.0 xxhash-3.4.1


evaluation-pipeline-ragasevaluator-1                 | [2024-02-06 12:29:49,340 | fondant.cli | INFO] Component `RagasEvaluator` found in module main
evaluation-pipeline-ragasevaluator-1                 | [2024-02-06 12:29:49,344 | fondant.component.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
evaluation-pipeline-ragasevaluator-1                 | [2024-02-06 12:29:49,349 | fondant.component.executor | INFO] Previous component `retrievefromweaviatecomponent` is not cached. Invalidating cache for current and subsequent components
evaluation-pipeline-ragasevaluator-1                 | [2024-02-06 12:29:49,349 | fondant.component.executor | INFO] Caching disabled for the component
evaluation-pipeline-ragasevaluator-1                 | [2024-02-06 12:29:49,349 | root | INFO] Executing component
evaluation-pipeline-ragasevaluator-1                 | 
evaluation-pipeline-ragasevaluator-1      

[                                        ] | 0% Completed | 629.17 us
[                                        ] | 0% Completed | 101.66 ms
[                                        ] | 0% Completed | 202.07 ms


evaluation-pipeline-ragasevaluator-1                 | Traceback (most recent call last):
evaluation-pipeline-ragasevaluator-1                 |   File "/usr/local/bin/fondant", line 8, in <module>
evaluation-pipeline-ragasevaluator-1                 |     sys.exit(entrypoint())
evaluation-pipeline-ragasevaluator-1                 |              ^^^^^^^^^^^^
evaluation-pipeline-ragasevaluator-1                 |   File "/usr/local/lib/python3.11/site-packages/fondant/cli.py", line 89, in entrypoint
evaluation-pipeline-ragasevaluator-1                 |     args.func(args)
evaluation-pipeline-ragasevaluator-1                 |   File "/usr/local/lib/python3.11/site-packages/fondant/cli.py", line 711, in execute
evaluation-pipeline-ragasevaluator-1                 |     executor.execute(component)
evaluation-pipeline-ragasevaluator-1                 |   File "/usr/local/lib/python3.11/site-packages/fondant/component/executor.py", line 383, in execute
evaluation-pipeline-ragasevaluator-1 

Finished pipeline run.


service "ragasevaluator" didn't complete successfully: exit 1
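
In the run above, the `ragasevaluator` service exited with code 1. The traceback is cut off, so the cause is not visible here. One thing worth ruling out before a rerun is a blank setting in `llm_kwargs`, such as the empty `openai_api_key` placeholder. The helper below is a hypothetical pre-flight check written for this example; it is not part of Fondant or of this repository's `utils` module.

```python
def find_empty_llm_kwargs(llm_kwargs):
    """Return the names of LLM settings that are missing or blank."""
    return [
        name
        for name, value in llm_kwargs.items()
        if value is None or (isinstance(value, str) and not value.strip())
    ]


# Same shape as evaluation_args["llm_kwargs"], with the key left blank.
example_kwargs = {"openai_api_key": "", "model_name": "gpt-3.5-turbo"}

empty_settings = find_empty_llm_kwargs(example_kwargs)
print(empty_settings)
```

Calling `find_empty_llm_kwargs(evaluation_args["llm_kwargs"])` before `runner.run(...)` and stopping when the list is non-empty avoids waiting for the containers to start only to fail on a missing setting.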


#### Show evaluation results

In [None]:
utils.get_metrics_latest_run(base_path=BASE_PATH)

## Explore data

You can also inspect your data and results at each step of the pipelines using the **Fondant data explorer**. The first time you run the data explorer, it needs to download the Docker image, which may take a minute. You can then access the data explorer at **http://localhost:8501/**

Enjoy the exploration! 🍫

Press the ◼️ in the notebook toolbar to **stop the explorer**.

In [None]:
from fondant.explore import run_explorer_app

run_explorer_app(base_path=BASE_PATH)

To stop the explorer, run the cell below.

In [None]:
from fondant.explore import stop_explorer_app

stop_explorer_app()

## Clean up your environment

After your pipelines have run successfully, you can **clean up** your environment and stop the Weaviate database.

In [None]:
!docker compose -f weaviate/docker-compose.yaml down

## Feedback

Please share your experience or **let us know how we can improve** through our 
* [**Discord**](https://discord.gg/HnTdWhydGp) 
* [**GitHub**](https://github.com/ml6team/fondant)

And of course feel free to give us a [**star** ⭐](https://github.com/ml6team/fondant) if you like what we are doing!