
Generative AI - RAG Workshop

This repository contains the source code for the workshop "Can LLMs Learn? Let’s Customize an LLM to Chat With Your Own Data" held at C3 Festival 2024.

The application is a RAG app that takes a user's description and interests as input and generates a list of must-see speakers at C3 Festival.

Feel free to fork it and use it as a template for your own projects.

Step-by-step tutorial

Two common challenges when working with LLMs are:

  1. Adding new or proprietary data - LLMs are trained on public data from the Internet up to a certain cutoff date. To add newer or private data, you must either retrain the model or provide the information in the prompt.
  2. Costs - the cost of each prompt grows with the number of tokens it contains. The longer the prompt, the more expensive the call.

Prerequisites

Create a server/.env file to export the OPENAI_API_KEY:

# .env
OPENAI_API_KEY="your_openai_api_key"
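
Later snippets reference OPENAI_API_KEY, DATABASE_PATH, and CONTEXT_DOCS_NUMBER as constants. A minimal sketch of how the server might load them, assuming the dotenv package and hypothetical values for the last two:

    // Load environment variables from server/.env (assumes the dotenv package)
    import * as dotenv from "dotenv";
    dotenv.config();

    const OPENAI_API_KEY = process.env.OPENAI_API_KEY;

    // Hypothetical values - adjust them to your setup
    const DATABASE_PATH = "./lancedb";  // where the LanceDB files live
    const CONTEXT_DOCS_NUMBER = 3;      // how many chunks to retrieve as context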

Create a vector store

The vector store is a database that holds the proprietary data as embeddings, so it can be searched by semantic similarity.

Load the data into documents:

    // Document loading
    import { TextLoader } from "langchain/document_loaders/fs/text";

    const loader = new TextLoader("./data/talks.txt");
    const raw_documents = await loader.load();
    console.log(raw_documents.length);                 // number of loaded documents
    console.log(raw_documents[0].pageContent.length);  // characters in the first document

Split the data into more manageable chunks:

    // Document splitting
    import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

    const splitter = new RecursiveCharacterTextSplitter({
        separators: ["\n\n", "\n", ",", " ", ""],  // try paragraph, line, clause, then word boundaries
        chunkSize: 1024,    // maximum characters per chunk
        chunkOverlap: 256,  // characters shared between consecutive chunks
    });

    const documents = await splitter.splitDocuments(raw_documents);
    console.log(documents.length);
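
Each chunk keeps the loader's metadata (the source file) plus the line range it was cut from; the vector schema below mirrors this shape. You can inspect it directly (assuming at least one chunk exists):

    // Inspect one chunk's metadata: source file and line range
    console.log(documents[0].metadata);
    // e.g. { source: "./data/talks.txt", loc: { lines: { from: 1, to: 12 } } }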

Create the vector store/vector database:

    import fs from "fs";
    import * as lancedb from "vectordb";
    import { OpenAIEmbeddings } from "@langchain/openai";

    // Use the OpenAIEmbeddings model to create embeddings from text
    const embeddings = new OpenAIEmbeddings({ openAIApiKey: OPENAI_API_KEY });

    // Create the database directory if it doesn't exist
    if (!fs.existsSync(DATABASE_PATH)) {
      try {
        fs.mkdirSync(DATABASE_PATH);
      } catch (e) {
        console.error(`Error creating directory '${DATABASE_PATH}':`, e);
      }
    }

    // Connect to the vector store
    const db = await lancedb.connect(DATABASE_PATH);
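
Optionally, confirm the connection by listing the tables that already exist (a debugging sketch using the vectordb connection API):

    // List existing tables in the database (empty on the first run)
    console.log(await db.tableNames());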

Set the vector schema - this depends heavily on what the document metadata looks like:

    // Create a table whose schema is inferred from this sample row:
    // an embedding vector plus the metadata fields produced by the splitter
    const table = await db.createTable(
      "vectors",
      [
        {
          vector: await embeddings.embedQuery("string"),
          text: "contents",
          source: "filename",
          loc: { lines: { from: 0, to: 0 } },
        },
      ],
      { writeMode: lancedb.WriteMode.Overwrite }
    );

Save the embeddings in the vector store:

    import { LanceDB } from "@langchain/community/vectorstores/lancedb";

    // Save the data as OpenAI embeddings in a table
    await LanceDB.fromDocuments(documents, embeddings, { table });
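
As a sanity check, the table should now contain one row per chunk (assuming the vectordb table API):

    // Sanity check: one row per chunk
    console.log(await table.countRows());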

Retrieve data based on a question

Open the vector database:

      import { connect } from "vectordb";
      import { LanceDB } from "@langchain/community/vectorstores/lancedb";
      import { OpenAIEmbeddings } from "@langchain/openai";

      // Connect to the database
      const db = await connect(DATABASE_PATH);

      // Open the table
      const table = await db.openTable("vectors");

      // Initialize the vector store object with the OpenAI embeddings and the table
      const vectorStore = new LanceDB(new OpenAIEmbeddings(), { table });

Optionally, for debugging, you can run a similarity search to see which data from the vector database is relevant to the question. This data will be added as context to the final prompt:

      // Debugging: Retrieve the most similar context to the input question
      const result = await vectorStore.similaritySearch(user.description, CONTEXT_DOCS_NUMBER);
      for (const item of result) {
        console.log("Context metadata: ", item.metadata);
        console.log("Context content: ", item.pageContent.slice(0, 10));  // first 10 characters
      }

Create the pipeline that constructs the final prompt. The retrieved data will be substituted for {context}, and {question} will be the user's actual description:

      // Retrieve the most similar context to the input question
      const retriever = vectorStore.asRetriever({
        k: CONTEXT_DOCS_NUMBER,    // number of chunks to return
        searchType: "similarity",
        verbose: true,
      });

      import { RunnableLambda, RunnableMap, RunnablePassthrough } from "@langchain/core/runnables";
      import { ChatPromptTemplate } from "@langchain/core/prompts";

      // Create a pipeline that feeds the input question and the retrieved context to the model
      const setupAndRetrieval = RunnableMap.from({
        context: new RunnableLambda({
          func: (input: string) =>
            retriever
              .invoke(input)
              .then((response) => response.map((item) => item.pageContent).join(" ")),
        }).withConfig({ runName: "context" }),
        question: new RunnablePassthrough(),
      });
      // Define the prompt that will be fed to the model
      const prompt = ChatPromptTemplate.fromMessages([
        [
          "ai",
          `Your task is to advise me on the top 3 speakers I should see at a conference.

Based on the provided user description, select the top 3 speakers you would recommend to the user.
You must also mention why you selected these speakers.

You must respond as a JSON object with the following structure: a list of speakers with the following fields: speaker, reason.

Do not add any additional information to the response.

Respond only based on the context provided below - do not use any external information:

Context: {context}`,
        ],
        [
          "human",
          `User description: {question}`,
        ],
      ]);
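
To check what the model will actually receive, you can run just the retrieval-plus-prompt part of the chain on a sample input (a debugging sketch; the description string is made up):

      // Preview the fully formatted messages for a sample user description
      const preview = await setupAndRetrieval.pipe(prompt).invoke("I am a frontend developer interested in AI.");
      console.log(preview.messages);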

Initialize the model:

      import { OpenAI } from "@langchain/openai";

      // Define the OpenAI model
      const model = new OpenAI({
        modelName: "gpt-4o",
        openAIApiKey: OPENAI_API_KEY,
        temperature: 0.9,  // higher temperature gives more varied recommendations
        verbose: true,
      });
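
Since gpt-4o is a chat model, you may prefer LangChain's ChatOpenAI wrapper, which targets the chat completions API directly (an equivalent sketch, assuming @langchain/openai):

      import { ChatOpenAI } from "@langchain/openai";

      // Chat-model wrapper; a drop-in alternative to the OpenAI class above
      const chatModel = new ChatOpenAI({
        modelName: "gpt-4o",
        openAIApiKey: OPENAI_API_KEY,
        temperature: 0.9,
      });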

Lastly, the chain is completed with an output parser. In our case, the output will be a plain string.

      import { StringOutputParser } from "@langchain/core/output_parsers";

      // Create an output parser that converts the model's response to a string
      const outputParser = new StringOutputParser();

Run the chain:

      // Chain everything together: retrieval, prompt, model, and output parsing
      const chain = setupAndRetrieval.pipe(prompt).pipe(model).pipe(outputParser);

      // Invoke the chain to answer the question
      const rawResponse = await chain.invoke(user.description);

Finally, clean up the response so it can be parsed as JSON in the frontend:

      // Strip the Markdown code fences the model may wrap around the JSON
      const response = rawResponse.replace('```json', '').replace('```', '');
      const recommendationList = JSON.parse(response) as Recommendation[];
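
The Recommendation type used in the cast matches the JSON structure requested in the prompt (a sketch; the actual definition lives in the repository):

      // One entry in the model's JSON response
      interface Recommendation {
        speaker: string;  // the recommended speaker's name
        reason: string;   // why this speaker fits the user's interests
      }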

Easily deploy with Genezio

You can test your full-stack application locally by running:

genezio local

Or you can deploy it to production by running:

genezio deploy

The last command will give you a publicly available URL where your app is deployed.

Support

Feel free to use this repository as a starting point for implementing your own RAG application on top of the OpenAI API.

If you encounter any issues, please open a GitHub issue and I'll try to help you.

Resources