# 🍫Tune your RAG data pipeline and evaluate its performance

> ⚠️ This notebook can be run on your local machine or on a virtual machine and requires [Docker Compose](https://docs.docker.com/desktop/).
> Please note that it is not compatible with Google Colab as the latter does not support Docker.

In this notebook we demonstrate how to iteratively evaluate and tune a Retrieval-Augmented Generation (RAG) system using [Fondant](https://fondant.ai).

We will:

1. Set up a [Weaviate](https://weaviate.io/platform) vector store
2. Define a parameter set to test
3. Run a Fondant pipeline with those parameters to index our documents into the vector store
4. Run a Fondant pipeline with those parameters to evaluate the performance
5. Inspect the evaluation results and data between each processing step
6. Repeat steps 2-5 until we're happy with the results

<div align="center">
<img src="../art/iteration.png" width="1000"/>
</div>
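The iteration loop above can be pictured as a small parameter sweep. The following is a minimal sketch, not part of Fondant: `run_evaluation` is a hypothetical stand-in for running the indexing and evaluation pipelines, and the candidate values are placeholders.

```python
from itertools import product

# Candidate values to try; placeholders chosen for illustration only.
param_grid = {
    "embed_model": ["all-MiniLM-L6-v2"],
    "top_k": [2, 4],
}


def iter_param_sets(grid):
    """Yield one dict per combination of the values in `grid`."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))


def run_evaluation(params):
    """Hypothetical stand-in: a real run would execute both pipelines
    with `params` and return the aggregated metric."""
    return 0.0


# Evaluate every parameter set and keep the best-scoring one.
results = [(params, run_evaluation(params)) for params in iter_param_sets(param_grid)]
best_params, best_score = max(results, key=lambda item: item[1])
```

Keeping the parameter sets in one dictionary makes it easy to record which combination produced which score between iterations.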

## Set up environment

> ⚠️ This section checks the prerequisites of your environment. Read any errors or warnings carefully.

Ensure that a **Python version between 3.8 and 3.10** is available

In [1]:
import sys
if sys.version_info < (3, 8, 0) or sys.version_info >= (3, 11, 0):
    raise Exception(f"A Python version between 3.8 and 3.10 is required. You are running {sys.version}")

Check if **docker compose** is installed and the **docker daemon** is running

In [2]:
!docker compose version

Docker Compose version v2.19.1
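If you prefer a check that fails softly instead of printing shell output, the same test can be done from Python. This is a sketch using only the standard library; the function name `docker_compose_available` is our own, and it only verifies that the `docker compose` command responds, not that the daemon is running (a call such as `docker info` would be needed for that).

```python
import shutil
import subprocess


def docker_compose_available(timeout: float = 10.0) -> bool:
    """Return True if `docker compose version` exits successfully."""
    # No docker binary on the PATH means there is nothing to run.
    if shutil.which("docker") is None:
        return False
    try:
        completed = subprocess.run(
            ["docker", "compose", "version"],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return completed.returncode == 0


compose_ok = docker_compose_available()
```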


Install Fondant framework

In [3]:
!pip install -q -r ../requirements.txt --disable-pip-version-check && echo "Success"

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
datasets 2.16.1 requires fsspec[http]<=2023.10.0,>=2023.1.0, but you have fsspec 2023.12.2 which is incompatible.
Success


## Spin up the Weaviate vector store

> ⚠️ For **Apple M1/M2** chip users:
> 
> - In Docker Desktop Dashboard `Settings -> Features in development`, make sure to **un**check `Use containerd` for pulling and storing images. More info [here](https://docs.docker.com/desktop/settings/mac/#beta-features)
> - Make sure that Docker uses linux/amd64 platform and not arm64 (cell below should take care of that)

Run **Weaviate** with Docker compose

In [27]:
!docker compose -f weaviate_service/docker-compose.yaml up --detach

[1A[1B[0G[?25l[+] Running 0/0
 ⠋ Network weaviate_service_default  Creating                              [34m0.0s [0m
[?25h[1A[1A[0G[?25l[34m[+] Running 1/1[0m
 [32m✔[0m Network weaviate_service_default            [32mCreated[0m                     [34m0.1s [0m
 ⠋ Container weaviate_service-contextionary-1  Creating                    [34m0.0s [0m
 ⠋ Container weaviate_service-weaviate-1       Creating                    [34m0.0s [0m
[?25h[1A[1A[1A[1A[0G[?25l[+] Running 1/3
 [32m✔[0m Network weaviate_service_default            [32mCreated[0m                     [34m0.1s [0m
 ⠙ Container weaviate_service-contextionary-1  Creating                    [34m0.1s [0m
 ⠙ Container weaviate_service-weaviate-1       Creating                    [34m0.1s [0m
[?25h[1A[1A[1A[1A[0G[?25l[+] Running 1/3
 [32m✔[0m Network weaviate_service_default            [32mCreated[0m                     [34m0.1s [0m
 ⠿ Container weaviate_service-contextionary-1  

Make sure you have **Weaviate client v3**

Make sure the vectorDB is running and accessible

In [4]:
import logging
import weaviate

try:
    local_weaviate_client = weaviate.Client("http://localhost:8081")
    logging.info("Connected to Weaviate instance")
except weaviate.WeaviateStartUpError:
    logging.error("Cannot connect to weaviate instance, is it running?")

            Please consider upgrading to the latest version. See https://weaviate.io/developers/weaviate/client-libraries/python for details.
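Right after `docker compose up`, the containers may need a few seconds before they accept requests. A minimal sketch of a readiness wait is shown below, using only the standard library. The helper names are our own, and the sketch assumes Weaviate's readiness endpoint at `/v1/.well-known/ready` on the port used in this notebook.

```python
import time
import urllib.error
import urllib.request


def http_ready(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts and HTTP error codes all mean "not ready".
        return False


def wait_until_ready(probe, attempts: int = 30, delay: float = 1.0) -> bool:
    """Call `probe()` until it returns True or `attempts` is exhausted."""
    for attempt in range(attempts):
        if probe():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A call such as `wait_until_ready(lambda: http_ready("http://localhost:8081/v1/.well-known/ready"))` would then block until the store answers or the attempts run out.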


#### Indexing pipeline

Before we can evaluate retrieval from a vector database, we have to index documents into it. The indexing pipeline lives in the indexing notebook. Before you continue here, work through that notebook to initialise the database and index the documents.

## Evaluation Pipeline

In [9]:
import utils
base_path = "./data"
utils.create_directory_if_not_exists(base_path)
weaviate_url = f"http://{utils.get_host_ip()}:8081"
weaviate_class = "Index"

`pipeline_eval.py` evaluates retrieval performance using the questions provided in your test dataset

<div align=center>
<img src="../art/evaluation_ltr.png" width="800"/>
</div>

- [**Load eval data**](https://github.com/ml6team/fondant/tree/main/components/load_from_csv): loads the evaluation dataset (questions) from a csv file
- [**Embed questions**](https://github.com/ml6team/fondant/tree/main/components/embed_text): embeds each question as a vector, e.g. using [Cohere](https://cohere.com/embeddings)
- [**Query vector store**](https://github.com/ml6team/fondant/tree/main/components/retrieve_from_weaviate): retrieves the most relevant chunks for each question from the vector store
- [**Evaluate**](https://github.com/ml6team/fondant/tree/0.8.0/components/evaluate_ragas): evaluates the retrieved chunks for each question, e.g. using [RAGAS](https://docs.ragas.io/en/latest/index.html)
- [**Aggregate**](https://github.com/ml6team/fondant-usecase-RAG/tree/main/src/components/aggregate_eval_results): calculates aggregated results

### Create the evaluation pipeline

⚠️ If you want to use an **OpenAI** model for evaluation, you will need an [API key](https://platform.openai.com/docs/quickstart) (set it in the cell below)

Change the arguments below if you want to run the pipeline with different parameters.

In [5]:
import os
os.environ["OPENAI_API_KEY"] = ""

We begin by initializing our pipeline.

In [10]:
import pyarrow as pa
from fondant.pipeline import Pipeline
evaluation_pipeline = Pipeline(
        name="evaluation-pipeline",
        description="Pipeline to evaluate a RAG system",
        base_path=base_path,
)


We have created a set of evaluation questions to measure the retrieval performance of the RAG system. First, we load the CSV file containing these questions using a reusable component, `load_from_csv`.

In [11]:
evaluation_set_filename = "wikitext_1000_q.csv"

load_from_csv = evaluation_pipeline.read(
    "load_from_csv",
    arguments={
        "dataset_uri": "/evaldata/" + evaluation_set_filename,
        # mounted dir from within docker as extra_volumes
        "column_separator": ";",
    },
    produces={
        "question": pa.string(),
    },
)
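To make the expected input concrete, here is a small sketch of a semicolon-separated file with a `question` column, read with the standard library. The sample questions are invented, and the exact layout of `wikitext_1000_q.csv` is an assumption based on the `column_separator` and `produces` arguments above.

```python
import csv
import io

# Hypothetical sample content; the real file holds your own questions.
sample = (
    "question\n"
    "When was the bridge opened?\n"
    "In which year, and where, was the treaty signed?\n"
)


def read_questions(handle, column="question", separator=";"):
    """Return the non-empty values of `column` from a delimited file."""
    reader = csv.DictReader(handle, delimiter=separator)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError(f"Expected a '{column}' column, found {reader.fieldnames}")
    return [row[column].strip() for row in reader if row[column] and row[column].strip()]


questions = read_questions(io.StringIO(sample))
```

Using `;` as the separator means that commas inside a question do not split it into several fields.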

Next, we embed the questions so that we can query the vector store with them. Here we once again use a reusable component, `embed_text`.

In [12]:
embed_text_op = load_from_csv.apply(
    "embed_text",
    arguments={
        "model_provider": "huggingface",
        "model": "all-MiniLM-L6-v2"
    },
    consumes={
        "text": "question",
    }
)

Before we can evaluate the retrieval, we need to fetch the most relevant chunks for each question. To do so, we build a custom lightweight component and add it to the pipeline.

In [13]:
import pandas as pd
import pyarrow as pa
from fondant.component import PandasTransformComponent
from fondant.pipeline import lightweight_component


@lightweight_component(
    produces={"retrieved_chunks": pa.list_(pa.string())},
    extra_requires=["weaviate-client==3.24.1"],
)
class RetrieveFromWeaviateComponent(PandasTransformComponent):
    def __init__(self, *, weaviate_url: str, class_name: str, top_k: int) -> None:
        import weaviate

        self.client = weaviate.Client(
            url=weaviate_url,
            additional_config=None,
            additional_headers=None,
        )
        self.class_name = class_name
        self.k = top_k

    def teardown(self) -> None:
        # Ensure the weaviate client is closed at the end of the component lifetime
        del self.client

    def retrieve_chunks_from_embeddings(self, vector_query: list):
        """Get results from weaviate database."""
        query = (
            self.client.query.get(self.class_name, ["passage"])
            .with_near_vector({"vector": vector_query})
            .with_limit(self.k)
            .with_additional(["distance"])
        )

        result = query.do()
        result_dict = result["data"]["Get"][self.class_name]
        return [retrieved_chunk["passage"] for retrieved_chunk in result_dict]

    def transform(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        dataframe["retrieved_chunks"] = dataframe["embedding"].apply(self.retrieve_chunks_from_embeddings)
        return dataframe

# Add component to pipeline
retrieve_chunks = embed_text_op.apply(
    RetrieveFromWeaviateComponent,
    arguments={
        "weaviate_url": weaviate_url,
        "class_name": weaviate_class,
        "top_k": 2
    },
)

 Consumes: {'question': {'type': 'string'}, 'embedding': {'type': 'array', 'items': {'type': 'float32'}}}
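Conceptually, the near-vector query with a limit returns the `top_k` passages whose embeddings lie closest to the question embedding. The sketch below illustrates that idea with a brute-force search in plain Python, assuming cosine distance. It is not how Weaviate works internally (a vector store would typically use an approximate index), and the toy vectors and passages are invented.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def top_k_passages(query_vector, indexed, k):
    """Return the k passages whose vectors are closest to `query_vector`.

    `indexed` is a list of (passage, vector) pairs.
    """
    ranked = sorted(indexed, key=lambda item: cosine_distance(query_vector, item[1]))
    return [passage for passage, _ in ranked[:k]]


# Toy 2-dimensional "embeddings"; real ones have hundreds of dimensions.
indexed = [
    ("passage about rivers", [1.0, 0.0]),
    ("passage about bridges", [0.8, 0.6]),
    ("passage about cooking", [0.0, 1.0]),
]
retrieved = top_k_passages([0.9, 0.1], indexed, k=2)
```

Raising `top_k` gives the evaluator more context per question, at the cost of more text to judge.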


`RetrieveFromWeaviateComponent` produces `retrieved_chunks`. We evaluate these chunks using RAGAS, an open-source library that assesses RAG systems by leveraging LLMs. In this example, we'll use gpt-3.5-turbo. Essentially, we pass each question along with its retrieved chunks to an LLM and ask it to judge how relevant the chunks are to the question.

Feel free to explore the RAGAS documentation and modify the component to suit your needs. RAGAS provides support for altering the prompt and adapting it to your specific domain or language.
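To build some intuition for a context relevancy score, the sketch below computes the fraction of context sentences that a judge marks as relevant to the question. This is our simplified reading of the metric, not the RAGAS implementation: the LLM is replaced by a hypothetical keyword-based judge, and sentences are split with a naive regular expression.

```python
import re


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def keyword_judge(question, sentence):
    """Hypothetical stand-in for the LLM: a sentence counts as relevant
    if it contains at least one word of the question longer than 3 letters."""
    words = {w for w in re.findall(r"\w+", question.lower()) if len(w) > 3}
    return any(w in sentence.lower() for w in words)


def context_relevancy_sketch(question, contexts, judge=keyword_judge):
    """Return relevant sentences / all sentences over the retrieved chunks."""
    sentences = [s for chunk in contexts for s in split_sentences(chunk)]
    if not sentences:
        return 0.0
    relevant = sum(1 for s in sentences if judge(question, s))
    return relevant / len(sentences)


score = context_relevancy_sketch(
    "When was the bridge opened?",
    ["The bridge opened in 1932. It spans the harbour.", "Pasta is boiled in salted water."],
)
```

In this toy case only one of the three context sentences mentions the bridge, so the score is one third. A score close to 1 would mean the retrieved chunks contain little besides relevant text.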

In [23]:
@lightweight_component(
    consumes={
        "question": pa.string(),
        "retrieved_chunks": pa.list_(pa.string()),
    },
    produces={

        "context_relevancy": pa.float32(),
    },
    extra_requires=["ragas==0.1.0"],
)
class RagasEvaluator(PandasTransformComponent):
    def __init__(self, *, open_ai_key: str) -> None:
        import os
        os.environ["OPENAI_API_KEY"] = open_ai_key

    def transform(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        from datasets import Dataset
        from ragas import evaluate
        from ragas.metrics import context_relevancy
        from langchain_openai.chat_models import ChatOpenAI

        gpt_evaluator = ChatOpenAI(model_name="gpt-3.5-turbo")

        dataframe = dataframe.rename(
            columns={"retrieved_chunks": "contexts"},
        )
        
        dataset = Dataset.from_pandas(dataframe)

        result = evaluate(
            dataset,  
            metrics=[context_relevancy],
            llm=gpt_evaluator,
        )

        results_df = result.to_pandas()
        results_df = results_df.set_index(dataframe.index)

        return results_df
    
# Add component to pipeline
retriever_eval = retrieve_chunks.apply(
    RagasEvaluator,
    arguments={
        "open_ai_key": os.getenv("OPENAI_API_KEY")
    }
)

The `RagasEvaluator` component appends one additional column to our dataset: `context_relevancy`, scored for each question and its retrieved chunks. To evaluate the overall performance of our RAG setup, we need to aggregate these results. For demonstration purposes, we'll write the aggregated result to a file. Of course, you can export it to any dashboard tool of your choice.

In [25]:
from fondant.component import DaskWriteComponent
import dask.dataframe as dd


@lightweight_component(
    consumes={
        "context_relevancy": pa.float32(),
    }
)
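class AggregateResults(DaskWriteComponent):
    def write(self, dataframe: dd.DataFrame) -> None:
        import pandas as pd
        # The Dask mean is lazy; compute it to get a plain float
        mean_context_relevancy = dataframe["context_relevancy"].mean().compute()
        # Wrap the scalar in a list so pandas can build a one-row frame
        df = pd.DataFrame({
            "context_relevancy": [mean_context_relevancy]
        })

        # /evaldata is the directory mounted into the container via extra_volumes
        df.to_csv("/evaldata/aggregated_results.csv", index=False)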

# Add component to pipeline
retriever_eval.apply(
    AggregateResults, 
    consumes={
        "context_relevancy": "context_relevancy"
    }
)

<fondant.pipeline.pipeline.Dataset at 0x13634baf0>
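The aggregation itself is just a mean over the per-question scores, written out as a one-row table. Here is a minimal standard-library sketch of that step, independent of Fondant and Dask. The scores are invented, and skipping missing values is our own design choice rather than something the pipeline above does.

```python
import csv
import io
from statistics import fmean


def aggregate_scores(scores):
    """Return the mean of the scores, ignoring missing (None/NaN) values."""
    valid = [s for s in scores if s is not None and s == s]  # s == s is False for NaN
    if not valid:
        raise ValueError("no valid scores to aggregate")
    return fmean(valid)


def write_summary(handle, metrics):
    """Write a one-row CSV with one column per metric."""
    writer = csv.DictWriter(handle, fieldnames=list(metrics))
    writer.writeheader()
    writer.writerow(metrics)


per_question = [0.5, 0.25, float("nan"), 0.75]
summary = {"context_relevancy": aggregate_scores(per_question)}
buffer = io.StringIO()
write_summary(buffer, summary)
```

Writing one column per metric keeps the file layout stable if you later add further metrics to the summary dictionary.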

#### Run the evaluation pipeline

In [26]:
import os
from fondant.pipeline.runner import DockerRunner
runner = DockerRunner() 
extra_volumes = [str(os.path.join(os.path.abspath('.'), "evaluation_datasets")) + ":/evaldata"]
runner.run(evaluation_pipeline, extra_volumes=extra_volumes)

INFO:root:Found reference to un-compiled pipeline... compiling
INFO:fondant.pipeline.compiler:Compiling evaluation-pipeline to .fondant/compose.yaml
INFO:fondant.pipeline.compiler:Base path found on local system, setting up ./data as mount volume
INFO:fondant.pipeline.pipeline:Sorting pipeline component graph topologically.
INFO:fondant.pipeline.pipeline:All pipeline component specifications match.
INFO:fondant.pipeline.compiler:Compiling service for load_from_csv
INFO:fondant.pipeline.compiler:Compiling service for embed_text
INFO:fondant.pipeline.compiler:Compiling service for retrievefromweaviatecomponent
INFO:fondant.pipeline.compiler:Compiling service for ragasevaluator
INFO:fondant.pipeline.compiler:Compiling service for aggregateresults
INFO:fondant.pipeline.compiler:Successfully compiled to .fondant/compose.yaml
 load_from_csv Pulling 
 ragasevaluator Pulling 
 aggregateresults Pulling 
 embed_text Pulling 
 retrievefromweaviatecomponent Pulling 


Starting pipeline run...


 retrievefromweaviatecomponent Pulled 
 ragasevaluator Pulled 
 load_from_csv Pulled 
 embed_text Pulled 
 aggregateresults Pulled 
 Container evaluation-pipeline-load_from_csv-1  Recreate
 Container evaluation-pipeline-load_from_csv-1  Recreated
 Container evaluation-pipeline-embed_text-1  Recreate
 Container evaluation-pipeline-embed_text-1  Recreated
 Container evaluation-pipeline-retrievefromweaviatecomponent-1  Recreate
 Container evaluation-pipeline-retrievefromweaviatecomponent-1  Recreated
 Container evaluation-pipeline-ragasevaluator-1  Recreate
 Container evaluation-pipeline-ragasevaluator-1  Recreated
 Container evaluation-pipeline-aggregateresults-1  Recreate
 Container evaluation-pipeline-aggregateresults-1  Recreated


Attaching to evaluation-pipeline-aggregateresults-1, evaluation-pipeline-embed_text-1, evaluation-pipeline-load_from_csv-1, evaluation-pipeline-ragasevaluator-1, evaluation-pipeline-retrievefromweaviatecomponent-1


evaluation-pipeline-load_from_csv-1                  | [2024-02-08 13:30:29,549 | fondant.cli | INFO] Component `CSVReader` found in module main
evaluation-pipeline-load_from_csv-1                  | [2024-02-08 13:30:29,554 | fondant.component.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
evaluation-pipeline-load_from_csv-1                  | [2024-02-08 13:30:29,558 | fondant.component.executor | INFO] Skipping component execution
evaluation-pipeline-load_from_csv-1                  | [2024-02-08 13:30:29,561 | fondant.component.executor | INFO] Matching execution detected for component. The last execution of the component originated from `evaluation-pipeline-20240206105318`.
evaluation-pipeline-load_from_csv-1                  | [2024-02-08 13:30:29,566 | fondant.component.executor | INFO] Saving output manifest to /data/evaluation-pipeline/evaluation-pipeline-20240208143024/load_from_

evaluation-pipeline-load_from_csv-1 exited with code 0


evaluation-pipeline-embed_text-1                     | [2024-02-08 13:30:33,559 | fondant.cli | INFO] Component `EmbedTextComponent` found in module main
evaluation-pipeline-embed_text-1                     | [2024-02-08 13:30:33,564 | fondant.component.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
evaluation-pipeline-embed_text-1                     | [2024-02-08 13:30:33,569 | fondant.component.executor | INFO] Previous component `load_from_csv` run was cached. Cached pipeline id: evaluation-pipeline-20240206105318
evaluation-pipeline-embed_text-1                     | [2024-02-08 13:30:33,571 | fondant.component.executor | INFO] Skipping component execution
evaluation-pipeline-embed_text-1                     | [2024-02-08 13:30:33,574 | fondant.component.executor | INFO] Matching execution detected for component. The last execution of the component originated from `evaluation-pipeline

evaluation-pipeline-embed_text-1 exited with code 0
evaluation-pipeline-retrievefromweaviatecomponent-1  | Collecting weaviate-client==3.24.1 (from -r requirements.txt (line 1))
evaluation-pipeline-retrievefromweaviatecomponent-1  |   Obtaining dependency information for weaviate-client==3.24.1 from https://files.pythonhosted.org/packages/59/8f/44d164ed990f7c6faf28125925160af9004595020aeaaf01e94462e3bf8e/weaviate_client-3.24.1-py3-none-any.whl.metadata
evaluation-pipeline-retrievefromweaviatecomponent-1  |   Downloading weaviate_client-3.24.1-py3-none-any.whl.metadata (3.3 kB)
evaluation-pipeline-retrievefromweaviatecomponent-1  | Collecting validators<1.0.0,>=0.21.2 (from weaviate-client==3.24.1->-r requirements.txt (line 1))
evaluation-pipeline-retrievefromweaviatecomponent-1  |   Obtaining dependency information for validators<1.0.0,>=0.21.2 from https://files.pythonhosted.org/packages/3a/0c/785d317eea99c3739821718f118c70537639aa43f96bfa1d83a71f68eaf6/validators-0.22.0-py3-none-any.

evaluation-pipeline-retrievefromweaviatecomponent-1  | 
evaluation-pipeline-retrievefromweaviatecomponent-1  | [notice] A new release of pip is available: 23.2.1 -> 24.0
evaluation-pipeline-retrievefromweaviatecomponent-1  | [notice] To update, run: pip install --upgrade pip
evaluation-pipeline-retrievefromweaviatecomponent-1  | 
evaluation-pipeline-retrievefromweaviatecomponent-1  | [2024-02-08 13:30:38,734 | fondant.cli | INFO] Component `RetrieveFromWeaviateComponent` found in module main
evaluation-pipeline-retrievefromweaviatecomponent-1  | [2024-02-08 13:30:38,739 | fondant.component.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
evaluation-pipeline-retrievefromweaviatecomponent-1  | [2024-02-08 13:30:38,742 | fondant.component.executor | INFO] Previous component `embed_text` run was cached. Cached pipeline id: evaluation-pipeline-20240206105318
evaluation-pipeline-retrievefromweavia

evaluation-pipeline-retrievefromweaviatecomponent-1 exited with code 0
evaluation-pipeline-ragasevaluator-1                 | Collecting ragas==0.1.0 (from -r requirements.txt (line 1))
evaluation-pipeline-ragasevaluator-1                 |   Obtaining dependency information for ragas==0.1.0 from https://files.pythonhosted.org/packages/5e/94/97777b227098625c48fcde0ac292caff3bf2b2a8c6b1cd49e417498722c2/ragas-0.1.0-py3-none-any.whl.metadata
evaluation-pipeline-ragasevaluator-1                 |   Downloading ragas-0.1.0-py3-none-any.whl.metadata (4.7 kB)
evaluation-pipeline-ragasevaluator-1                 | Collecting datasets (from ragas==0.1.0->-r requirements.txt (line 1))
evaluation-pipeline-ragasevaluator-1                 |   Obtaining dependency information for datasets from https://files.pythonhosted.org/packages/ec/93/454ada0d1b289a0f4a86ac88dbdeab54921becabac45da3da787d136628f/datasets-2.16.1-py3-none-any.whl.metadata
evaluation-pipeline-ragasevaluator-1                 |   Do

evaluation-pipeline-ragasevaluator-1                 | ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
evaluation-pipeline-ragasevaluator-1                 | gcsfs 2023.12.2.post1 requires fsspec==2023.12.2, but you have fsspec 2023.10.0 which is incompatible.
evaluation-pipeline-ragasevaluator-1                 | adlfs 2024.1.0 requires fsspec>=2023.12.0, but you have fsspec 2023.10.0 which is incompatible.
evaluation-pipeline-ragasevaluator-1                 | s3fs 2023.12.2 requires fsspec==2023.12.2, but you have fsspec 2023.10.0 which is incompatible.
evaluation-pipeline-ragasevaluator-1                 | 
evaluation-pipeline-ragasevaluator-1                 | [notice] A new release of pip is available: 23.2.1 -> 24.0
evaluation-pipeline-ragasevaluator-1                 | [notice] To update, run: pip install --upgrade pip
evaluation-pipeline-ragasevaluator

evaluation-pipeline-ragasevaluator-1                 | Successfully installed SQLAlchemy-2.0.25 annotated-types-0.6.0 anyio-4.2.0 appdirs-1.4.4 dataclasses-json-0.6.4 datasets-2.16.1 dill-0.3.7 distro-1.9.0 filelock-3.13.1 fsspec-2023.10.0 greenlet-3.0.3 h11-0.14.0 httpcore-1.0.2 httpx-0.26.0 huggingface-hub-0.20.3 jsonpatch-1.33 jsonpointer-2.4 langchain-0.1.5 langchain-community-0.0.19 langchain-core-0.1.21 langchain-openai-0.0.5 langsmith-0.0.87 marshmallow-3.20.2 multiprocess-0.70.15 mypy-extensions-1.0.0 nest-asyncio-1.6.0 openai-1.11.1 pyarrow-hotfix-0.6 pydantic-2.6.1 pydantic-core-2.16.2 pysbd-0.3.4 ragas-0.1.0 regex-2023.12.25 sniffio-1.3.0 tenacity-8.2.3 tiktoken-0.5.2 tqdm-4.66.1 typing-inspect-0.9.0 xxhash-3.4.1


evaluation-pipeline-ragasevaluator-1                 | [2024-02-08 13:30:58,364 | fondant.cli | INFO] Component `RagasEvaluator` found in module main
evaluation-pipeline-ragasevaluator-1                 | [2024-02-08 13:30:58,370 | fondant.component.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
evaluation-pipeline-ragasevaluator-1                 | [2024-02-08 13:30:58,385 | fondant.component.executor | INFO] Previous component `retrievefromweaviatecomponent` run was cached. Cached pipeline id: evaluation-pipeline-20240208135836
evaluation-pipeline-ragasevaluator-1                 | [2024-02-08 13:30:58,386 | fondant.component.executor | INFO] No matching execution for component detected
evaluation-pipeline-ragasevaluator-1                 | [2024-02-08 13:30:58,386 | root | INFO] Executing component
evaluation-pipeline-ragasevaluator-1                 | [2024-02-08 13:30:58,556 | root | 

Evaluating:   0%|          | 0/10 [00:00<?, ?it/s]
evaluation-pipeline-ragasevaluator-1                 | [2024-02-08 13:31:00,342 | openai._base_client | INFO] Retrying request to /chat/completions in 0.787427 seconds
evaluation-pipeline-ragasevaluator-1                 | [2024-02-08 13:31:00,345 | openai._base_client | INFO] Retrying request to /chat/completions in 0.88

evaluation-pipeline-ragasevaluator-1                 | [2024-02-08 13:31:01,381 | httpx | INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
evaluation-pipeline-ragasevaluator-1                 | [2024-02-08 13:31:01,394 | httpx | INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


evaluation-pipeline-ragasevaluator-1                 | [2024-02-08 13:31:02,236 | httpx | INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



#### Show evaluation results

In [22]:
import pandas as pd
df = pd.read_csv("./evaluation_datasets/aggregated_results.csv")
df

## Explore data

You can also check your data and results at each step in the pipelines using the **Fondant data explorer**. The first time you run the data explorer, you need to download the docker image which may take a minute. Then you can access the data explorer at: **http://localhost:8501/**

Enjoy the exploration! 🍫 

Press the ◼️ in the notebook toolbar to **stop the explorer**.

In [None]:
from fondant.explore import run_explorer_app
run_explorer_app(base_path=base_path)

To stop the explorer, run the cell below.

In [None]:
from fondant.explore import stop_explorer_app
stop_explorer_app()

## Clean up your environment

After your pipeline has run successfully, you can **clean up** your environment and stop the Weaviate database.

In [None]:
!docker compose -f weaviate_service/docker-compose.yaml down

## Feedback

Please share your experience or **let us know how we can improve** through our 
* [**Discord**](https://discord.gg/HnTdWhydGp) 
* [**GitHub**](https://github.com/ml6team/fondant)

And of course feel free to give us a [**star** ⭐](https://github.com/ml6team/fondant) if you like what we are doing!