# Getting Started with RAG using Fireworks Fast Inference LLMs

<a href="https://colab.research.google.com/github/fw-ai/cookbook/blob/main/recipes/rag/rag-paper-titles.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

While large language models (LLMs) have capabilities that enable advanced use cases, they suffer from issues such as factual inconsistency and hallucination. Retrieval-augmented generation (RAG) improves their reliability by combining the LLM with external knowledge: relevant information is retrieved and added to the prompt context to help the model accomplish a task.

This tutorial shows how to get started with RAG using a vector store and open-source LLMs. To showcase RAG, this use case covers building a system that suggests short, easy-to-read ML paper titles from original ML paper titles. Paper titles can be too technical for a general audience, so using RAG to generate short titles based on previously created short titles can make research papers more accessible and more useful for science communication, such as newsletters or blogs.

Before getting started, let's first install the libraries we will use:

In [None]:
%%capture
!pip install chromadb tqdm fireworks-ai python-dotenv pandas
!pip install sentence-transformers

Let's download the dataset we will use:

In [None]:
!wget https://raw.githubusercontent.com/dair-ai/ML-Papers-of-the-Week/main/research/ml-potw-10232023.csv
!mkdir -p data
!mv ml-potw-10232023.csv data/

--2024-04-07 23:16:07--  https://raw.githubusercontent.com/dair-ai/ML-Papers-of-the-Week/main/research/ml-potw-10232023.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.111.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 664158 (649K) [text/plain]
Saving to: ‘ml-potw-10232023.csv’


2024-04-07 23:16:07 (20.0 MB/s) - ‘ml-potw-10232023.csv’ saved [664158/664158]



In [None]:
%%capture
!pip install datasets

Before continuing, you need to obtain a Fireworks API Key to use the Mistral 7B model.

Check out this quick guide to obtain your Fireworks API key: https://readme.fireworks.ai/docs

In [None]:
import fireworks.client
import os
import dotenv
import chromadb
import json
from tqdm.auto import tqdm
import pandas as pd
import random
from google.colab import userdata

# you can set envs using Colab secrets
fireworks.client.api_key = userdata.get('FIREWORKS_API_KEY')
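
The cell above only works inside Google Colab, because `google.colab` cannot be imported anywhere else. As a minimal sketch of a fallback, assuming the key is also available in an environment variable with the same name as the Colab secret (`FIREWORKS_API_KEY`), a small helper could try Colab first and then the environment. The helper name `load_api_key` is hypothetical and not part of any library.

```python
import os


def load_api_key(name="FIREWORKS_API_KEY"):
    """Return the key from Colab secrets if available, else from the environment."""
    try:
        # Only importable inside Google Colab.
        from google.colab import userdata
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            key = userdata.get(name)
            if key:
                return key
        except Exception:
            # Secret missing or notebook access not granted; fall back below.
            pass

    # Assumption: outside Colab the key lives in an environment variable.
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set")
    return key
```

With this in place, the assignment would read `fireworks.client.api_key = load_api_key()`. If the key is kept in a local `.env` file, calling `dotenv.load_dotenv()` beforehand copies it into the environment.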

## Getting Started

Let's define a function to get completions from the Fireworks inference platform.

In [None]:
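def get_completion(prompt, model=None, max_tokens=50):
    """Return the completion text for `prompt` from a Fireworks-hosted model."""

    fw_model_dir = "accounts/fireworks/models/"

    # fall back to Llama 2 7B when no model name is given
    if model is None:
        model = fw_model_dir + "llama-v2-7b"
    else:
        model = fw_model_dir + model

    completion = fireworks.client.Completion.create(
        model=model,
        prompt=prompt,
        max_tokens=max_tokens,
        temperature=0  # keep the output as deterministic as possible
    )

    # return only the generated text of the first choice
    return completion.choices[0].text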

Let's first try the function with a simple prompt:

In [None]:
get_completion("Hello, my name is")

' Katie and I am a 20 year old student at the University of Leeds. I am currently studying a BA in English Literature and Creative Writing. I have been working as a tutor for over 3 years now and I'

Now let's set the model name for Mistral-7B-Instruct, which the RAG use case below uses:

In [None]:
mistral_llm = "mistral-7b-instruct-4k"


## RAG Use Case: Generating Short Paper Titles

For the RAG use case, we will be using [a dataset](https://github.com/dair-ai/ML-Papers-of-the-Week/tree/main/research) that contains a list of weekly top trending ML papers.

The user will provide an original paper title. We will then use the dataset to build a context of short, catchy paper titles that helps the model generate a catchy title for the original input.



### Step 1: Load the Dataset

Let's first load the dataset we will use:

In [None]:
# load the first 1,000 rows of the movie dataset from the Hugging Face Hub

from datasets import load_dataset
ds = load_dataset("Coder-Dragon/wikipedia-movies", split='train[:1000]')

# Convert dataset to pandas DataFrame
df = ds.to_pandas()

# remove rows with missing titles or plots
df = df.dropna(subset=["Title", "Plot"])

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [None]:
df.head()

| | Release Year | Title | Origin/Ethnicity | Director | Cast | Genre | Wiki Page | Plot | Image |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1901 | Kansas Saloon Smashers | American | Unknown | | unknown | https://en.wikipedia.org/wiki/Kansas_Saloon_Sm... | A bartender is working at a saloon, serving dr... | upload.wikimedia.org/wikipedia/commons/2/2d/Ka... |
| 1 | 1901 | Love by the Light of the Moon | American | Unknown | | unknown | https://en.wikipedia.org/wiki/Love_by_the_Ligh... | The moon, painted with a smiling face hangs ov... | upload.wikimedia.org/wikipedia/commons/2/22/St... |
| 2 | 1901 | The Martyred Presidents | American | Unknown | | unknown | https://en.wikipedia.org/wiki/The_Martyred_Pre... | The film, just over a minute long, is composed... | upload.wikimedia.org/wikipedia/commons/e/e4/Th... |
| 3 | 1903 | Alice in Wonderland | American | Cecil Hepworth | May Clark | unknown | https://en.wikipedia.org/wiki/Alice_in_Wonderl... | Alice follows a large white rabbit down a "Rab... | upload.wikimedia.org/wikipedia/commons/9/9a/No... |
| 4 | 1903 | The Great Train Robbery | American | Edwin S. Porter | | western | https://en.wikipedia.org/wiki/The_Great_Train_... | The film opens with two bandits breaking into ... | upload.wikimedia.org/wikipedia/commons/5/51/Th... |


## Modifying the dataframe for easier analysis

In [None]:
columns_to_drop = ['Release Year', 'Origin/Ethnicity', 'Director', 'Cast', 'Genre', 'Wiki Page', 'Image']
df2 = df.drop(columns=columns_to_drop)

In [None]:
# convert dataframe to list of dicts with Title and Plot columns only

df_dict = df2.to_dict(orient="records")

In [None]:
df_dict[0]

{'Title': 'Kansas Saloon Smashers',
 'Plot': "A bartender is working at a saloon, serving drinks to customers. After he fills a stereotypically Irish man's bucket with beer, Carrie Nation and her followers burst inside. They assault the Irish man, pulling his hat over his eyes and then dumping the beer over his head. The group then begin wrecking the bar, smashing the fixtures, mirrors, and breaking the cash register. The bartender then sprays seltzer water in Nation's face before a group of policemen appear and order everybody to leave.[1]"}

We will use SentenceTransformer to generate embeddings and store them in a Chroma collection.

In [None]:
from chromadb import Documents, EmbeddingFunction, Embeddings
from sentence_transformers import SentenceTransformer
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

class MyEmbeddingFunction(EmbeddingFunction):
    def __call__(self, input: Documents) -> Embeddings:
        batch_embeddings = embedding_model.encode(input)
        return batch_embeddings.tolist()

embed_fn = MyEmbeddingFunction()

# Initialize the chromadb directory and client.
client = chromadb.PersistentClient(path="./chromadb")

# create collection; embed_fn is used to embed query_texts at query time
collection = client.get_or_create_collection(
    name="movie_plots_1920s_cinema",
    embedding_function=embed_fn,
)


We will now generate embeddings for batches:

In [None]:
# Generate embeddings and index the movies in batches
batch_size = 50

# loop through the batches, generating and storing embeddings
for i in tqdm(range(0, len(df_dict), batch_size)):

    i_end = min(i + batch_size, len(df_dict))
    batch = df_dict[i:i_end]

    # Replace title with "No Title" if empty string
    batch_titles = [str(movie["Title"]) if str(movie["Title"]) != "" else "No Title" for movie in batch]
    # Use each record's position in df_dict as a unique, reproducible ID
    batch_ids = [str(idx) for idx in range(i, i_end)]
    # Keep the plot as metadata; its text is what gets embedded
    batch_plots = [str(movie["Plot"]) for movie in batch]
    batch_metadata = [dict(plot=plot) for plot in batch_plots]

    # generate embeddings from the plot text
    batch_embeddings = embedding_model.encode(batch_plots)

    # upsert to chromadb
    collection.upsert(
        ids=batch_ids,
        metadatas=batch_metadata,
        documents=batch_titles,
        embeddings=batch_embeddings.tolist(),
    )
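
Chroma expects one unique ID per record, and `upsert` can only overwrite an existing entry on a rerun if the record gets the same ID again. One way to get IDs with both properties is to hash the record content. The following is a minimal sketch under that assumption: the helpers `stable_id` and `iter_batches` are hypothetical names, and the toy records reuse titles from the preview above with shortened plots.

```python
import hashlib


def stable_id(record):
    """Derive a deterministic ID from a record's title and plot."""
    text = f"{record['Title']}\n{record['Plot']}"
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def iter_batches(records, batch_size):
    """Yield consecutive slices of `records` with at most `batch_size` items."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Toy records for illustration only (plots shortened).
records = [
    {"Title": "Kansas Saloon Smashers", "Plot": "A bartender is working at a saloon"},
    {"Title": "Alice in Wonderland", "Plot": "Alice follows a large white rabbit"},
    {"Title": "The Great Train Robbery", "Plot": "The film opens with two bandits"},
]

batches = list(iter_batches(records, batch_size=2))
ids = [stable_id(r) for r in records]

print(len(batches))   # number of batches
print(ids[0][:8])     # first characters of a stable ID
```

Because the ID depends only on the title and plot, rerunning the indexing cell updates the same entries rather than adding new ones. Two records with an identical title and plot would share an ID, which may or may not be desirable.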


Now we can test the retriever:

In [None]:
# Query the collection
retriever_results = collection.query(
    query_texts=["Silent film about a Parisian star moving to Egypt, leaving her husband for a baron, and later reconciling after finding her family in poverty in Cairo."],
    n_results=5,
)

# Results are nested: one list of documents and metadatas per query text
documents = retriever_results["documents"]
metadatas = retriever_results["metadatas"]

# Print each retrieved movie title with its own metadata
for docs, metas in zip(documents, metadatas):
    for doc, meta in zip(docs, metas):
        print(f"{doc}: {meta}")
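
Conceptually, a query against the collection embeds the query text, compares that vector with the stored vectors, and returns the closest entries. The sketch below illustrates the idea in plain Python. It uses cosine similarity and made-up 3-dimensional vectors purely for illustration; the metric Chroma actually applies depends on how the collection is configured, and `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k):
    """Return the k titles whose vectors are most similar to query_vec."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]


# Toy 3-dimensional "embeddings"; real all-MiniLM-L6-v2 vectors have 384 dimensions.
toy_index = {
    "Nanook of the North": [0.9, 0.1, 0.0],
    "The Great Train Robbery": [0.1, 0.9, 0.1],
    "Alice in Wonderland": [0.0, 0.2, 0.9],
}
toy_query = [0.8, 0.2, 0.1]

print(top_k(toy_query, toy_index, k=2))
# ['Nanook of the North', 'The Great Train Robbery']
```

A vector store does the same ranking with an approximate nearest-neighbor index, so it does not have to compare the query with every stored vector.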

Now let's put together our final prompt:

In [None]:
# user query
user_query = "Documentaries showcasing indigenous peoples' survival and daily life in Arctic regions"

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=10,
)

# concatenate titles into a single string
movie_names = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to generate 5 different SUGGESTED_MOVIES selected from MOVIE_NAMES based on the MOVIE_PLOT and its similarities to the USER_QUERY

Give us the names of the movies

USER_QUERY: {user_query}

MOVIE_NAMES: {movie_names}

[/INST]
'''

# get_completion returns the generated text as a single string
suggested_titles = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)

# Print the suggestions.
print("Description:")
print(user_query)
print("Model Suggestions:")
print(suggested_titles)
# print("\n\n\nPrompt Template:")
# print(prompt_template)

Description:
Documentaries showcasing indigenous peoples' survival and daily life in Arctic regions
Model Suggestions:

Based on the provided movie descriptions and the user query, here are 5 suggested movies:

1. Nanook of the North
2. The Shriek of Araby
3. The Way of All Men
4. The Barker
5. Uncharted Seas
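
Every query in this section follows the same steps: retrieve titles, fill the prompt, call the model. As a sketch of how the prompt step could be factored out, the hypothetical helper `build_prompt` below assembles a prompt in the style of the template above from a query and a list of retrieved titles. The wording and the `[INST]` markers mirror the notebook's template; the sample inputs are stand-ins for `user_query` and `results['documents'][0]`.

```python
def build_prompt(user_query, movie_names):
    """Assemble an instruction prompt from the query and the retrieved titles."""
    names_block = "\n".join(movie_names)
    return (
        "[INST]\n\n"
        "Your main task is to generate 5 different SUGGESTED_MOVIES selected "
        "from MOVIE_NAMES based on their similarity to the USER_QUERY\n\n"
        "Give us the names of the movies\n\n"
        f"USER_QUERY: {user_query}\n\n"
        f"MOVIE_NAMES: {names_block}\n\n"
        "[/INST]\n"
    )


# Hypothetical stand-in for results['documents'][0]
retrieved = ["Nanook of the North", "Uncharted Seas"]
prompt = build_prompt("Documentaries about daily life in Arctic regions", retrieved)
print(prompt)
```

In the notebook, the result would be passed on as `get_completion(build_prompt(user_query, results['documents'][0]), model=mistral_llm, max_tokens=2000)`, so the remaining query cells would only differ in `user_query`.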


In [None]:
# user query
user_query = "Western romance"

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=10,
)

# concatenate titles into a single string
movie_names = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to generate 5 different SUGGESTED_MOVIES selected from MOVIE_NAMES based on the MOVIE_PLOT and its similarities to the USER_QUERY

Give us the names of the movies

USER_QUERY: {user_query}

MOVIE_NAMES: {movie_names}

[/INST]
'''

# get_completion returns the generated text as a single string
suggested_titles = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)

# Print the suggestions.
print("Description:")
print(user_query)
print("Model Suggestions:")
print(suggested_titles)
# print("\n\n\nPrompt Template:")
# print(prompt_template)

Description:
Western romance
Model Suggestions:

Based on the provided movie descriptions and the user query, here are 5 suggested movies:

1. Bucking Broadway
2. Golden Rule Kate
3. The Rogue Song
4. The Cavalier
5. Love Never Dies


In [None]:
# user query
user_query = "Silent film about a Parisian star moving to Egypt, leaving her husband for a baron, and later reconciling after finding her family in poverty in Cairo."

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=10,
)

# concatenate titles into a single string
movie_names = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to generate 5 different SUGGESTED_MOVIES selected from MOVIE_NAMES based on the MOVIE_PLOT and its similarities to the USER_QUERY

Give us the names of the movies

USER_QUERY: {user_query}

MOVIE_NAMES: {movie_names}

[/INST]
'''

# get_completion returns the generated text as a single string
suggested_titles = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)

# Print the suggestions.
print("Description:")
print(user_query)
print("Model Suggestions:")
print(suggested_titles)
# print("\n\n\nPrompt Template:")
# print(prompt_template)

Description:
Silent film about a Parisian star moving to Egypt, leaving her husband for a baron, and later reconciling after finding her family in poverty in Cairo.
Model Suggestions:

Based on the plot description, here are 5 suggested movies that may be similar to the USER_QUERY:

1. Sahara
2. Foolish Wives
3. Through the Back Door
4. Long Pants
5. A Child for Sale


In [None]:
# user query
user_query = "Comedy film, office disguises, boss's daughter, elopement."

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=10,
)

# concatenate titles into a single string
movie_names = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to generate 5 different SUGGESTED_MOVIES selected from MOVIE_NAMES based on the MOVIE_PLOT and its similarities to the USER_QUERY

Give us the names of the movies

USER_QUERY: {user_query}

MOVIE_NAMES: {movie_names}

[/INST]
'''

# get_completion returns the generated text as a single string
suggested_titles = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)

# Print the suggestions.
print("Description:")
print(user_query)
print("Model Suggestions:")
print(suggested_titles)
# print("\n\n\nPrompt Template:")
# print(prompt_template)

Description:
Comedy film, office disguises, boss's daughter, elopement.
Model Suggestions:

Based on the provided movie plot and user query, here are 5 suggested movies:

1. Amarilly of Clothes-Line Alley
2. Ask Father
3. The Guardsman
4. The Saturday Night Kid
5. The Front Page


In [None]:
# user query
user_query = "Lost film, Cleopatra charms Caesar, plots world rule, treasures from mummy, revels with Antony, tragic end with serpent in Alexandria."

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=10,
)

# concatenate titles into a single string
movie_names = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to generate 5 different SUGGESTED_MOVIES selected from MOVIE_NAMES based on the MOVIE_PLOT and its similarities to the USER_QUERY

Give us the names of the movies

USER_QUERY: {user_query}

MOVIE_NAMES: {movie_names}

[/INST]
'''

# get_completion returns the generated text as a single string
suggested_titles = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)

# Print the suggestions.
print("Description:")
print(user_query)
print("Model Suggestions:")
print(suggested_titles)
# print("\n\n\nPrompt Template:")
# print(prompt_template)

Description:
Lost film, Cleopatra charms Caesar, plots world rule, treasures from mummy, revels with Antony, tragic end with serpent in Alexandria.
Model Suggestions:

Based on the provided movie plot and user query, here are 5 suggested movies:

1. Cleopatra
2. General Crack
3. For Better, For Worse
4. Disraeli
5. Revenge


In [None]:
# user query
user_query = "Denis Gage Deane-Tanner"

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=10,
)

# concatenate titles into a single string
movie_names = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to generate 5 different SUGGESTED_MOVIES selected from MOVIE_NAMES based on the MOVIE_PLOT and its similarities to the USER_QUERY

Give us the names of the movies

USER_QUERY: {user_query}

MOVIE_NAMES: {movie_names}

[/INST]
'''

# get_completion returns the generated text as a single string
suggested_titles = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)

# Print the suggestions.
print("Description:")
print(user_query)
print("Model Suggestions:")
print(suggested_titles)
# print("\n\n\nPrompt Template:")
# print(prompt_template)

Description:
Denis Gage Deane-Tanner
Model Suggestions:

Based on the similarities between the USER_QUERY and the MOVIE_PLOT, here are 5 suggested movies:

1. Captain Alvarez
2. Branded
3. The Wolf Song
4. The Spoilers
5. West Point


## Model Evaluation

In [None]:
# Recall@1: 1 if the expected movie is the first suggestion, else 0, averaged over the six queries

(1 + 0 + 1 + 0 + 1 + 1)/6

0.6666666666666666

In [None]:
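# MRR (mean reciprocal rank): 1/rank of the expected movie (0 if it is missing), averaged over the six queries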

(1 + 0 + 1 + (1/2) + 1 + 1)/6

0.75
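
The two scores above are averaged by hand. As a sketch, the same numbers can be computed from a list holding the 1-based rank of the expected movie for each query. The ranks below are read off the arithmetic above (1, 0, 1, 1/2, 1, 1), with `None` standing for a query whose expected movie was not among the suggestions. The function names are hypothetical.

```python
def recall_at_k(ranks, k=1):
    """Fraction of queries whose expected item appears within the top k."""
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank, counting a missing item as 0."""
    return sum(1 / r for r in ranks if r is not None) / len(ranks)


# 1-based rank of the expected movie per query; None = not suggested
ranks = [1, None, 1, 2, 1, 1]

print(recall_at_k(ranks, k=1))       # 0.6666666666666666
print(mean_reciprocal_rank(ranks))   # 0.75
```

Recall@1 only rewards a correct first suggestion, while MRR also gives partial credit (here 1/2) when the expected movie appears further down the list.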

As the scores above show (a Recall@1 of about 0.67 and an MRR of 0.75 over six queries), the titles suggested by the LLM are only partly on target. This use case still needs more work and could potentially benefit from fine-tuning as well. For the purpose of this tutorial, we have provided a simple application of RAG using open-source models served on Fireworks' fast inference platform.

Try out other open-source models here: https://app.fireworks.ai/models

Read more about the Fireworks APIs here: https://readme.fireworks.ai/reference/createchatcompletion
