# Getting Started with RAG using Fireworks Fast Inference LLMs

<a href="https://colab.research.google.com/github/fw-ai/cookbook/blob/main/recipes/rag/rag-paper-titles.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Large language models (LLMs) enable advanced use cases, but they suffer from issues such as factual inconsistency and hallucination. Retrieval-augmented generation (RAG) improves their reliability by combining LLMs with external knowledge: the prompt context is enriched with relevant information that helps accomplish the task.

This tutorial shows how to get started with RAG using a vector store and open-source LLMs. To showcase RAG, this use case covers building a system that suggests short, easy-to-read ML paper titles from original ML paper titles. Paper titles can be too technical for a general audience, so using RAG to generate short titles based on previously created short titles can make research more accessible for science communication, for example in newsletters or blogs.
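
To make the retrieve-then-augment idea concrete before installing anything, here is a toy sketch. The word-overlap scoring and the names `retrieve` and `augment_prompt` are illustrative assumptions, not part of any library; the rest of this tutorial uses embeddings and a vector store for the retrieval step instead.

```python
def retrieve(query, documents, k=2):
    """Rank documents by how many lowercase words they share with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def augment_prompt(query, context_docs):
    """Place the retrieved documents in front of the question."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Context:\n{context}\n\nQuestion: {query}"


# Toy corpus (hypothetical; any list of strings works)
docs = ["The Frozen North", "Song of the West", "The Call of the Wild"]

top = retrieve("call of the north", docs, k=2)
prompt = augment_prompt("call of the north", top)
print(prompt)
```

The augmented prompt is what gets sent to the LLM, so the quality of the retrieval step directly bounds the quality of the answer.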

Before getting started, let's first install the libraries we will use:

In [1]:
%%capture
!pip install chromadb tqdm fireworks-ai python-dotenv pandas
!pip install sentence-transformers datasets

Let's download the dataset we will use:

In [2]:
!wget https://raw.githubusercontent.com/dair-ai/ML-Papers-of-the-Week/main/research/ml-potw-10232023.csv
!mkdir data
!mv ml-potw-10232023.csv data/

--2024-04-15 03:37:41--  https://raw.githubusercontent.com/dair-ai/ML-Papers-of-the-Week/main/research/ml-potw-10232023.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 664158 (649K) [text/plain]
Saving to: ‘ml-potw-10232023.csv’


2024-04-15 03:37:41 (35.0 MB/s) - ‘ml-potw-10232023.csv’ saved [664158/664158]



Before continuing, you need to obtain a Fireworks API key to use the Mistral 7B model.

Check out this quick guide to obtain your Fireworks API key: https://readme.fireworks.ai/docs

In [3]:
import fireworks.client
import os
import dotenv
import chromadb
import json
from tqdm.auto import tqdm
import pandas as pd
import random
from google.colab import userdata

# you can set envs using Colab secrets
fireworks.client.api_key = userdata.get('New_API_key')
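
Outside Colab, `userdata` is not available. One possible alternative is to read the key from an environment variable. The sketch below assumes a variable named `FIREWORKS_API_KEY` and a helper named `load_api_key`; both names are illustrative choices, not something the Fireworks client requires.

```python
import os


def load_api_key(env_var="FIREWORKS_API_KEY"):
    """Return the API key stored in `env_var`, or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before calling the API."
        )
    return key


# Usage (left commented out so the cell does not fail when the variable is unset):
# fireworks.client.api_key = load_api_key()
```

Failing early with an explicit message tends to be easier to debug than an authentication error raised later by the first API call.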

## Getting Started

Let's define a function to get completions from the Fireworks inference platform.

In [4]:
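def get_completion(prompt, model=None, max_tokens=50):
    """Return the completion text for `prompt` from a Fireworks-hosted model.

    `model` is a short model name appended to the Fireworks model path;
    it defaults to "llama-v2-7b".
    """
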
    fw_model_dir = "accounts/fireworks/models/"

    if model is None:
        model = fw_model_dir + "llama-v2-7b"
    else:
        model = fw_model_dir + model

    completion = fireworks.client.Completion.create(
        model=model,
        prompt=prompt,
        max_tokens=max_tokens,
        temperature=0
    )

    return completion.choices[0].text

Let's first try the function with a simple prompt:

In [5]:
get_completion("Hello, my name is")

' Katie and I am a 20 year old student at the University of Leeds. I am currently studying a BA in English Literature and Creative Writing. I have been working as a tutor for over 3 years now and I'

Now let's test with Mistral-7B-Instruct:

In [6]:
mistral_llm = "mistral-7b-instruct-4k"

get_completion("Hello, my name is", model=mistral_llm)

' [Your Name]. I am a [Your Profession/Occupation]. I am writing to [Purpose of Writing].\n\nI am writing to [Purpose of Writing] because [Reason for Writing]. I believe that ['

The Mistral 7B Instruct model needs to be instructed using special instruction tokens `[INST] <instruction> [/INST]` to get the right behavior. You can find more instructions on how to prompt Mistral 7B Instruct here: https://docs.mistral.ai/llm/mistral-instruct-v0.1
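
A small helper can apply these tokens consistently. This is a minimal sketch; the name `wrap_instruction` is hypothetical, and the spacing follows the `[INST] <instruction> [/INST]` pattern quoted above.

```python
def wrap_instruction(instruction):
    """Wrap a plain instruction in Mistral 7B Instruct's instruction tokens."""
    return f"[INST] {instruction.strip()} [/INST]"


print(wrap_instruction("Tell me 2 jokes"))
```

The wrapped string can then be passed as the `prompt` argument of `get_completion`.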

In [7]:
mistral_llm = "mistral-7b-instruct-4k"

get_completion("Tell me 2 jokes", model=mistral_llm)

".\n1. Why don't scientists trust atoms? Because they make up everything!\n2. Did you hear about the mathematician who’s afraid of negative numbers? He will stop at nothing to avoid them."

In [8]:
mistral_llm = "mistral-7b-instruct-4k"

get_completion("[INST]Tell me 2 jokes[/INST]", model=mistral_llm)

" Sure, here are two jokes for you:\n\n1. Why don't scientists trust atoms? Because they make up everything!\n2. Why did the tomato turn red? Because it saw the salad dressing!"

Now let's try with a more complex prompt that involves instructions:

In [9]:
prompt = """[INST]
Given the following wedding guest data, write a very short 3-sentences thank you letter:

{
  "name": "John Doe",
  "relationship": "Bride's cousin",
  "hometown": "New York, NY",
  "fun_fact": "Climbed Mount Everest in 2020",
  "attending_with": "Sophia Smith",
  "bride_groom_name": "Tom and Mary"
}

Use only the data provided in the JSON object above.

The senders of the letter is the bride and groom, Tom and Mary.
[/INST]"""

get_completion(prompt, model=mistral_llm, max_tokens=150)

" Dear John Doe,\n\nWe, Tom and Mary, would like to extend our heartfelt gratitude for your attendance at our wedding. It was a pleasure to have you there, and we truly appreciate the effort you made to be a part of our special day.\n\nWe were thrilled to learn about your fun fact - climbing Mount Everest is an incredible accomplishment! We hope you had a safe and memorable journey.\n\nThank you again for joining us on this special occasion. We hope to stay in touch and catch up on all the amazing things you've been up to.\n\nWith love,\n\nTom and Mary"

## RAG Use Case: Generating Short Paper Titles

For the RAG use case, we will be using [a dataset](https://github.com/dair-ai/ML-Papers-of-the-Week/tree/main/research) that contains a list of weekly top trending ML papers.

The user will provide an original paper title. We will then use the dataset to build a context of short and catchy paper titles that helps the model generate a catchy title for the original input.



### Step 1: Load the Dataset

Let's first load the dataset we will use:

In [10]:
# Load the first 1,000 rows of the movies dataset from the Hugging Face Hub
# (the dataset has 'Title' and 'Plot' columns)

from datasets import load_dataset
ds = load_dataset("Coder-Dragon/wikipedia-movies", split='train[:1000]')

# Collect [title, plot] pairs
passages = []
for row in ds:
    passages.append([row['Title'], row['Plot']])

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [11]:
passages[:3]

[['Kansas Saloon Smashers',
  "A bartender is working at a saloon, serving drinks to customers. After he fills a stereotypically Irish man's bucket with beer, Carrie Nation and her followers burst inside. They assault the Irish man, pulling his hat over his eyes and then dumping the beer over his head. The group then begin wrecking the bar, smashing the fixtures, mirrors, and breaking the cash register. The bartender then sprays seltzer water in Nation's face before a group of policemen appear and order everybody to leave.[1]"],
 ['Love by the Light of the Moon',
  "The moon, painted with a smiling face hangs over a park at night. A young couple walking past a fence learn on a railing and look up. The moon smiles. They embrace, and the moon's smile gets bigger. They then sit down on a bench by a tree. The moon's view is blocked, causing him to frown. In the last scene, the man fans the woman with his hat because the moon has left the sky and is perched over her shoulder to see everythi

We will use SentenceTransformer to generate embeddings and store them in a Chroma collection.
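
Conceptually, each title becomes a vector, and retrieval ranks the stored vectors by how close they are to the query vector. The toy sketch below uses hand-written 3-dimensional vectors and cosine similarity purely for illustration; real `all-MiniLM-L6-v2` embeddings have 384 dimensions, and the metric Chroma actually applies depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-written stand-ins for real embeddings (hypothetical values)
toy_index = {
    "The Frozen North": [0.9, 0.1, 0.0],
    "Song of the West": [0.1, 0.9, 0.1],
    "Romance": [0.0, 0.3, 0.9],
}
query_vector = [0.8, 0.2, 0.1]

# Most similar title first
ranked = sorted(
    toy_index,
    key=lambda title: cosine_similarity(query_vector, toy_index[title]),
    reverse=True,
)
print(ranked)
```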

In [12]:
from chromadb import Documents, EmbeddingFunction, Embeddings
from sentence_transformers import SentenceTransformer
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

class MyEmbeddingFunction(EmbeddingFunction):
    def __call__(self, input: Documents) -> Embeddings:
        batch_embeddings = embedding_model.encode(input)
        return batch_embeddings.tolist()

embed_fn = MyEmbeddingFunction()

# Initialize the chromadb directory, and client.
client = chromadb.PersistentClient(path="./chromadb")

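# Create the collection with the same embedding function that is used
# when querying it later
collection = client.get_or_create_collection(
    name="movie-titles",
    embedding_function=embed_fn
)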


We will now generate embeddings for batches:

In [13]:
# Generate embeddings and index titles in batches
batch_size = 50

# loop through batches, then generate and store embeddings
for i in tqdm(range(0, len(passages), batch_size)):

    i_end = min(i + batch_size, len(passages))
    batch = passages[i:i_end]

    # Replace title with "No Title" if empty string
    batch_titles = [str(movie[0]) if str(movie[0]) != "" else "No Title" for movie in batch]
    # Use the row position as a stable, unique ID so that re-running this
    # cell overwrites existing entries instead of adding duplicates
    batch_ids = [str(idx) for idx in range(i, i_end)]
    batch_metadata = [dict(title=title) for title in batch_titles]

    # generate embeddings
    batch_embeddings = embedding_model.encode(batch_titles)

    # upsert to chromadb
    collection.upsert(
        ids=batch_ids,
        metadatas=batch_metadata,
        documents=batch_titles,
        embeddings=batch_embeddings.tolist(),
    )

  0%|          | 0/20 [00:00<?, ?it/s]
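
The slicing logic in the loop above can be factored into a small generator, which makes the batching easy to check in isolation. This is a sketch; `iter_batches` is a hypothetical helper name, not part of Chroma or tqdm.

```python
def iter_batches(items, batch_size):
    """Yield (start_index, batch) pairs covering `items` in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield start, items[start:start + batch_size]


# With 5 items and a batch size of 2, the last batch is shorter
for start, batch in iter_batches(["a", "b", "c", "d", "e"], 2):
    print(start, batch)
```

With 1,000 passages and a batch size of 50, this yields the same 20 batches that the progress bar above reports.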

Now we can test the retriever:

In [14]:
collection = client.get_or_create_collection(
    name=f"movie-titles",
    embedding_function=embed_fn
)

retriever_results = collection.query(
    query_texts=["Documentaries showcasing indigenous peoples' survival and daily life in Arctic regions"],
    n_results=5,
)

print(retriever_results["documents"])

[['The Frozen North', 'From Leadville to Aspen: A Hold-Up in the Rockies', 'The Ghost of Slumber Mountain', 'The Viking', 'The Call of the Wild']]
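
When judging retrieval quality, it can help to look at the distance of each hit as well as its text. The sketch below assumes the query result is a dictionary with parallel `documents` and `distances` lists, one inner list per query; `pair_hits` is a hypothetical helper, and the sample values are made up rather than taken from a real run.

```python
def pair_hits(results, query_index=0):
    """Pair each retrieved document with its distance (None if unavailable)."""
    documents = results["documents"][query_index]
    distances = results.get("distances")
    if not distances:
        return [(doc, None) for doc in documents]
    return list(zip(documents, distances[query_index]))


# Hand-made stand-in shaped like a query result (values are illustrative)
fake_results = {
    "documents": [["The Frozen North", "The Viking"]],
    "distances": [[0.91, 1.07]],
}

for title, distance in pair_hits(fake_results):
    print(f"{distance}\t{title}")
```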


Now let's put together our final prompt:

In [15]:
# user query
user_query = "Documentaries showcasing indigenous peoples' survival and daily life in Arctic regions"

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=5,
)

# concatenate titles into a single string
short_titles = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to suggest movie titles and their plots based on the query that the user inputs.

You should return 5 results that best relate to the query and order them by relevance.

Query: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
'''

responses = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)
suggested_titles = ''.join([str(r) for r in responses])

# Print the suggestions.
print("Model Suggestions:")
print(suggested_titles)
print("\n\n\nPrompt Template:")
print(prompt_template)

Model Suggestions:

1. "The Frozen North" - This documentary explores the lives of indigenous peoples living in the Arctic regions, showcasing their survival techniques and daily routines.
2. "The Viking" - This film delves into the history and culture of the Vikings, including their survival skills and their impact on the Arctic regions.
3. "The Call of the Wild" - This classic novel by Jack London is adapted into a documentary, exploring the survival of a man stranded in the Arctic wilderness.
4. "The Ghost of Slumber Mountain" - This film tells the story of a group of indigenous people who live in a remote mountain village in the Arctic, and their struggles to maintain their way of life in the face of modernization.
5. "From Leadville to Aspen: A Hold-Up in the Rockies" - While not directly related to the Arctic, this film does explore the survival skills of a group of outlaws in the Rocky Mountains, which could be of interest to those looking for survival stories.



Prompt Templat

As you can see, the suggestions generated by the LLM are only somewhat reasonable. Because only titles are retrieved and passed as context, the plot descriptions come from the model alone and are not grounded in the dataset. This use case still needs a lot more work and could benefit from fine-tuning as well. For the purpose of this tutorial, we have provided a simple application of RAG using open-source models served on Fireworks' fast inference platform.

Try out other open-source models here: https://app.fireworks.ai/models

Read more about the Fireworks APIs here: https://readme.fireworks.ai/reference/createchatcompletion
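
The remaining cells repeat the same retrieve-and-prompt steps with different queries. As a sketch, those steps could be wrapped in one function. The name `rag_suggest` and the `complete` callback are assumptions introduced here so the function can be exercised without a live API; `collection` is expected to expose a `query` method like the one used above.

```python
PROMPT_TEMPLATE = """[INST]

Your main task is to suggest movie titles and their plots based on the query that the user inputs.

You should return 5 results that best relate to the query and order them by relevance.

Query: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
"""


def rag_suggest(user_query, collection, complete, n_results=5):
    """Retrieve titles for the query, build the prompt, and call the LLM.

    `complete` is any callable that maps a prompt string to a completion string.
    Returns the prompt together with the model output for inspection.
    """
    results = collection.query(query_texts=[user_query], n_results=n_results)
    short_titles = "\n".join(results["documents"][0])
    prompt = PROMPT_TEMPLATE.format(
        user_query=user_query, short_titles=short_titles
    )
    return prompt, complete(prompt)
```

In this notebook it could be called as `rag_suggest("Western romance", collection, lambda p: get_completion(p, model=mistral_llm, max_tokens=2000))`.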


In [16]:
# user query
user_query = "Western romance"

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=5,
)

# concatenate titles into a single string
short_titles = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to suggest movie titles and their plots based on the query that the user inputs.

You should return 5 results that best relate to the query and order them by relevance.

Query: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
'''

responses = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)
suggested_titles = ''.join([str(r) for r in responses])

# Print the suggestions.
print("Model Suggestions:")
print(suggested_titles)
print("\n\n\nPrompt Template:")
print(prompt_template)

Model Suggestions:

1. "The Road to Romance" - A young cowboy travels across the West to find his true love, facing challenges and obstacles along the way.
2. "Song of the West" - A group of cowboys journey through the West, searching for adventure and love in the vast frontier.
3. "A Romance of Happy Valley" - A romantic tale of a cowboy and a rancher who fall in love in the picturesque town of Happy Valley.
4. "Romance" - A classic Western romance story about a cowboy and a woman who fall in love in the rugged West.
5. "Romance" - A story of a cowboy and a woman who find love in the midst of danger and adventure in the Wild West.



Prompt Template:
[INST]

Your main task is to suggest movie titles and their plots based on the query that the user inputs.

You should return 5 results that best relate to the query and order them by relevance.

Query: Western romance

SHORT_TITLES: The Road to Romance
Song of the West
Romance
Romance
A Romance of Happy Valley

SUGGESTED_TITLES:

[/INST]
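
The retrieved list above contains `Romance` twice, and the model repeated it in its suggestions. The two entries may be distinct films that share a title, but since only titles are passed as context, the model cannot tell them apart. One option is to drop repeated titles before building the prompt; `dedupe_preserving_order` below is a hypothetical helper sketching that idea.

```python
def dedupe_preserving_order(titles):
    """Remove repeated titles while keeping the retriever's ranking order."""
    seen = set()
    unique = []
    for title in titles:
        if title not in seen:
            seen.add(title)
            unique.append(title)
    return unique


# Titles copied from the prompt printed above
retrieved = [
    "The Road to Romance",
    "Song of the West",
    "Romance",
    "Romance",
    "A Romance of Happy Valley",
]

short_titles = "\n".join(dedupe_preserving_order(retrieved))
print(short_titles)
```

Requesting a few extra results from the retriever before deduplicating would keep the context at five titles.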

In [17]:
# user query
user_query = "Silent film about a Parisian star moving to Egypt, leaving her husband for a baron, and later reconciling after finding her family in poverty in Cairo"

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=5,
)

# concatenate titles into a single string
short_titles = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to suggest movie titles and their plots based on the query that the user inputs.

You should return 5 results that best relate to the query and order them by relevance.

Query: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
'''

responses = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)
suggested_titles = ''.join([str(r) for r in responses])

# Print the suggestions.
print("Model Suggestions:")
print(suggested_titles)
print("\n\n\nPrompt Template:")
print(prompt_template)

Model Suggestions:

1. "A Parisian Star in Egypt" - This silent film follows the story of a Parisian actress who moves to Egypt and leaves her husband for a wealthy baron. However, she later discovers her family's poverty in Cairo and reconciles with her husband.
2. "The Parisian Star" - This film follows the story of a Parisian actress who moves to Egypt and leaves her husband for a wealthy baron. However, she later discovers her family's poverty in Cairo and reconciles with her husband.
3. "The Parisian Star in Hollywood" - This film follows the story of a Parisian actress who moves to Hollywood and leaves her husband for a wealthy baron. However, she later discovers her family's poverty in Cairo and reconciles with her husband.
4. "The Parisian Star in Egypt" - This film follows the story of a Parisian actress who moves to Egypt and leaves her husband for a wealthy baron. However, she later discovers her family's poverty in Cairo and reconciles with her husband.
5. "The Parisian Sta

In [18]:
# user query
user_query = "Comedy film, office disguises, boss's daughter, elopement"

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=5,
)

# concatenate titles into a single string
short_titles = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to suggest movie titles and their plots based on the query that the user inputs.

You should return 5 results that best relate to the query and order them by relevance.

Query: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
'''

responses = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)
suggested_titles = ''.join([str(r) for r in responses])

# Print the suggestions.
print("Model Suggestions:")
print(suggested_titles)
print("\n\n\nPrompt Template:")
print(prompt_template)

Model Suggestions:

1. "Behind Office Doors" - A comedy film about a group of coworkers who decide to play pranks on their boss, but things take a turn when they accidentally discover his daughter's elopement plans.
2. "A Rogue's Romance" - A romantic comedy about a rogue who disguises himself as an office worker to win the heart of his boss's daughter, but their relationship is put to the test when he is caught in a web of lies.
3. "Beauty and the Rogue" - A romantic comedy about a beautiful woman who falls for a rogue who is disguised as an office worker, but their relationship is put to the test when he is caught in a web of lies.
4. "Time, the Comedian" - A comedy film about a time traveler who goes back in time to prevent a tragic event from happening, but things take a turn when he falls in love with the boss's daughter and must decide whether to change the course of history or let fate take its course.
5. "The Probation Wife" - A romantic comedy about a woman who is on probation

In [19]:
# user query
user_query = "Lost film, Cleopatra charms Caesar, plots world rule, treasures from mummy, revels with Antony, tragic end with serpent in Alexandria."

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=5,
)

# concatenate titles into a single string
short_titles = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to suggest movie titles and their plots based on the query that the user inputs.

You should return 5 results that best relate to the query and order them by relevance.

Query: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
'''

responses = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)
suggested_titles = ''.join([str(r) for r in responses])

# Print the suggestions.
print("Model Suggestions:")
print(suggested_titles)
print("\n\n\nPrompt Template:")
print(prompt_template)

Model Suggestions:

1. Cleopatra: A Love Story - This movie follows the story of Cleopatra and her relationship with Julius Caesar, as well as her attempts to maintain her power and rule over Egypt.
2. The Mummy: Tomb of the Dragon - In this action-adventure film, a group of treasure hunters discover a hidden tomb filled with ancient artifacts and a deadly curse.
3. Antony and Cleopatra - This classic historical drama tells the story of the love triangle between Julius Caesar, Cleopatra, and Mark Antony, and their ultimate downfall.
4. The Lost City of Atlantis - In this science fiction film, a group of explorers discover the lost city of Atlantis and must navigate its treacherous terrain and ancient traps to uncover its secrets.
5. The Serpent Queen - This biographical drama tells the story of Cleopatra and her rise to power in Egypt, as well as her relationships with Julius Caesar and Mark Antony.



Prompt Template:
[INST]

Your main task is to suggest movie titles and their plots b

In [20]:
# user query
user_query = "Denis Gage Deane-Tanner"

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=5,
)

# concatenate titles into a single string
short_titles = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to suggest movie titles and their plots based on the query that the user inputs.

You should return 5 results that best relate to the query and order them by relevance.

Query: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
'''

responses = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)
suggested_titles = ''.join([str(r) for r in responses])

# Print the suggestions.
print("Model Suggestions:")
print(suggested_titles)
print("\n\n\nPrompt Template:")
print(prompt_template)

Model Suggestions:

1. Beau Geste - A group of thieves, led by Beau Geste, rob a train and must outsmart the authorities to escape.
2. Beau Brummel - A young man, Beau Brummel, rises to prominence in Parisian society but is eventually brought down by his own excesses.
3. The Bondman - A man, the bondman, is forced to become a criminal to pay off a debt and must navigate the dangerous world of organized crime.
4. Gentlemen of Nerve - A group of thieves, led by Beau Geste, plan a daring heist to steal a valuable diamond from a museum.
5. Beau Bandit - A bandit, Beau Bandit, robs a bank and must evade the police while trying to find a way to escape with his loot.



Prompt Template:
[INST]

Your main task is to suggest movie titles and their plots based on the query that the user inputs.

You should return 5 results that best relate to the query and order them by relevance.

Query: Denis Gage Deane-Tanner

SHORT_TITLES: Beau Bandit
Beau Geste
Beau Brummel
The Bondman
Gentlemen of Nerve

S