Retrieval Augmented Generation (RAG) Tuning

Check out our blog post on how to fine-tune your RAG pipeline using Fondant!

Introduction

This repository contains data pipelines and ready-to-use notebooks for tuning RAG systems, either manually or automatically through parameter search. It builds on Fondant, a free and open-source framework for production-ready, easy, and shareable data processing. Check out the Fondant website if you want to learn more, and join our Discord if you want to stay up to date.

Available notebooks

A simple RAG indexing pipeline

A notebook with a simple Fondant pipeline to index your data into a RAG system.

Iterative tuning of a RAG indexing pipeline

A notebook that iteratively runs a Fondant pipeline to evaluate a RAG system using RAGAS.

Getting started

⚠️ Prerequisites:

  • A Python version between 3.8 and 3.10 installed on your system.
  • Docker and Docker Compose installed and configured on your system. More info here.
  • A GPU is recommended to run the model-based components of the pipeline.
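
The first two prerequisites can be checked quickly before installing anything. The snippet below is a minimal sketch and not part of this repository: the function names are hypothetical, the version range is assumed to be inclusive on both ends, and the Docker check only confirms that a `docker` executable is on the PATH. It does not verify Docker Compose or that the Docker daemon is running.

```python
import shutil
import sys

# Version bounds taken from the prerequisites list (assumed inclusive).
MIN_PYTHON = (3, 8)
MAX_PYTHON = (3, 10)


def python_version_supported(version=None):
    """Return True if (major, minor) lies between 3.8 and 3.10 inclusive."""
    major_minor = tuple(version or sys.version_info)[:2]
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def docker_available():
    """Return True if a `docker` executable is found on the PATH."""
    return shutil.which("docker") is not None


def check_prerequisites():
    """Collect readable problems; an empty list means both checks passed."""
    problems = []
    if not python_version_supported():
        problems.append(
            "Python %d.%d is outside the 3.8-3.10 range" % sys.version_info[:2]
        )
    if not docker_available():
        problems.append("docker was not found on the PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

Running the file directly prints one line per failed check and prints nothing when both checks pass.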

Cloning the repository

Clone this repository to your local machine using one of the following commands:

HTTPS

git clone https://github.com/ml6team/fondant-usecase-rag.git

SSH

git clone git@github.com:ml6team/fondant-usecase-rag.git

Installing the requirements

pip install -r requirements.txt

Confirm that Fondant has been installed correctly on your system by executing the following command:

fondant --help

Running the pipeline

There are two options to run the pipeline: