# Getting Started with RAG using Fireworks Fast Inference LLMs

<a href="https://colab.research.google.com/github/fw-ai/cookbook/blob/main/recipes/rag/rag-paper-titles.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

While large language models (LLMs) enable advanced use cases, they suffer from issues such as factual inconsistency and hallucination. Retrieval-augmented generation (RAG) is an approach that enriches LLM capabilities and improves their reliability. RAG combines LLMs with external knowledge by adding relevant retrieved information to the prompt context to help accomplish a task.

This tutorial shows how to get started with RAG using a vector store and open-source LLMs. To showcase RAG, the use case builds a system that suggests short, easy-to-read ML paper titles from original ML paper titles. Paper titles can be too technical for a general audience, so using RAG to generate short titles based on previously created short titles can make research paper titles more accessible for science communication, such as newsletters or blogs.

Before getting started, let's first install the libraries we will use:

In [None]:
#%%capture
#!pip install chromadb tqdm fireworks-ai python-dotenv pandas
#!pip install sentence-transformers
#!pip install datasets

Before continuing, you need to obtain a Fireworks API Key to use the Mistral 7B model.

Check out this quick guide to obtain your Fireworks API Key: https://readme.fireworks.ai/docs

In [None]:
import fireworks.client
import os
import dotenv
import chromadb
import json
from tqdm.auto import tqdm
import pandas as pd
import random
from google.colab import userdata

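# Read the API key from Colab secrets rather than hardcoding it in the notebook.
# "FIREWORKS_API_KEY" is an assumed secret name; use the name you stored the key under.
fireworks.client.api_key = userdata.get("FIREWORKS_API_KEY")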

In [None]:
from datasets import load_dataset
ds = load_dataset("Coder-Dragon/wikipedia-movies", split='train[:1000]')


## Getting Started

Let's define a function to get completions from the Fireworks inference platform.

In [None]:
def get_completion(prompt, model=None, max_tokens=50):

    fw_model_dir = "accounts/fireworks/models/"

    if model is None:
        model = fw_model_dir + "llama-v2-7b"
    else:
        model = fw_model_dir + model

    completion = fireworks.client.Completion.create(
        model=model,
        prompt=prompt,
        max_tokens=max_tokens,
        temperature=0
    )

    return completion.choices[0].text

We will use Mistral-7B-Instruct with this function for the rest of the tutorial:

In [None]:
mistral_llm = "mistral-7b-instruct-4k"

The Mistral 7B Instruct model needs to be instructed using special instruction tokens `[INST] <instruction> [/INST]` to get the right behavior. You can find more instructions on how to prompt Mistral 7B Instruct here: https://docs.mistral.ai/llm/mistral-instruct-v0.1
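As a minimal sketch of the instruction format described above, the hypothetical helper below (`format_mistral_instruction` is not part of the Fireworks client or of this notebook) wraps a plain instruction in the `[INST]` markers:

```python
def format_mistral_instruction(instruction: str) -> str:
    """Wrap a plain instruction in Mistral's [INST] ... [/INST] markers."""
    # Strip stray whitespace so the markers sit directly around the text.
    return f"[INST] {instruction.strip()} [/INST]"


prompt = format_mistral_instruction("Tell me 2 jokes")
print(prompt)
```

The resulting string could then be passed to the function defined earlier, for example `get_completion(prompt, model=mistral_llm)`.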

## RAG Use Case: Generating Short Paper Titles

The original recipe uses [a dataset](https://github.com/dair-ai/ML-Papers-of-the-Week/tree/main/research) that contains a list of weekly top trending ML papers. The code in this notebook instead indexes the first 1,000 rows of the `Coder-Dragon/wikipedia-movies` dataset loaded above, so the retrieved passages are movie titles with their plots.

The user provides an input title or description (the `PAPER_TITLE` slot in the prompt). We then query the indexed passages for similar entries and use them as context (the `SHORT_TITLES` slot) to help the model generate catchy titles for the input.



### Step 1: Load the Dataset

Let's first load the dataset we will use:

In [None]:
titles = ds['Title']
plots = ds['Plot']
# Combine each title with its plot into a single passage string for embedding
passages = [title + ": " + plot for title, plot in zip(titles, plots)]

We will use SentenceTransformer to generate embeddings, which we will store in a Chroma collection.

In [None]:
from chromadb import Documents, EmbeddingFunction, Embeddings
from sentence_transformers import SentenceTransformer
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

class MyEmbeddingFunction(EmbeddingFunction):
    def __call__(self, input: Documents) -> Embeddings:
        batch_embeddings = embedding_model.encode(input)
        return batch_embeddings.tolist()

embed_fn = MyEmbeddingFunction()

# Initialize the chromadb directory, and client.
client = chromadb.PersistentClient(path="./chromadb")

# Create the collection with the same embedding function that is used at query time
collection = client.get_or_create_collection(
    name="movie-titles",
    embedding_function=embed_fn
)
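Conceptually, the vector store ranks stored embeddings by their distance to the query embedding. The toy sketch below illustrates that idea with hand-written 3-dimensional vectors instead of real model embeddings; the names `squared_l2`, `nearest`, and `toy_store` are made up for illustration, and squared Euclidean distance is an assumption here (the space configured for a Chroma collection may differ).

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(query_vec, store, n_results=2):
    """Return the n_results (distance, document) pairs closest to query_vec."""
    scored = [(squared_l2(query_vec, vec), doc) for doc, vec in store.items()]
    return sorted(scored)[:n_results]


# Hand-written toy "embeddings"; real ones would come from embedding_model.encode().
toy_store = {
    "western": [1.0, 0.0, 0.0],
    "romance": [0.0, 1.0, 0.0],
    "documentary": [0.0, 0.0, 1.0],
}

# The query vector points mostly along the "western" axis.
top = nearest([0.9, 0.1, 0.0], toy_store, n_results=2)
print(top)
```

A real vector store does the same ranking over hundreds of dimensions, usually with an approximate index rather than the exhaustive scan shown here.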

We will now generate embeddings for batches:

In [None]:
# Generate embeddings, and index titles in batches
batch_size = 50

# loop through batches, generating and storing embeddings
for i in tqdm(range(0, len(passages), batch_size)):

    i_end = min(i + batch_size, len(passages))
    batch = passages[i:i_end]

    # Replace passage with "No Title" if empty string
    batch_titles = [title if title != "" else "No Title" for title in batch]
    # NOTE: these IDs include random offsets, so they change on every run;
    # re-running this cell adds new rows instead of overwriting existing ones
    batch_ids = [str(sum(ord(c) + random.randint(1, 10000) for c in title)) for title in batch]

    # generate embeddings
    batch_embeddings = embedding_model.encode(batch_titles)

    # upsert to chromadb
    collection.upsert(
        ids=batch_ids,
       # metadatas=batch_metadata,
        documents=batch_titles,
        embeddings=batch_embeddings.tolist(),
    )
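Because the IDs in the loop above include random offsets, they are not reproducible between runs. One possible alternative, sketched below under the assumption that each passage's text should determine its ID, is to hash the passage. `stable_id` is a hypothetical helper, and the sample passages are placeholders rather than rows from the dataset.

```python
import hashlib


def stable_id(text: str, length: int = 16) -> str:
    """Derive a reproducible ID from the passage text itself."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    # A prefix of the hex digest keeps IDs short while making collisions unlikely.
    return digest[:length]


# Placeholder passages in the same "title: plot" shape used by this notebook.
batch = ["Movie A: first plot summary", "Movie B: second plot summary"]
batch_ids = [stable_id(passage) for passage in batch]
print(batch_ids)
```

With IDs like these, an upsert of the same passage would overwrite the existing row, and identical passages would collapse into a single entry.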


Now we can test the retriever:

In [None]:
collection = client.get_or_create_collection(
    name=f"movie-titles",
    embedding_function=embed_fn
)

retriever_results = collection.query(
    query_texts=["Software Engineering"],
    n_results=2,
)

# 'retriever_results' is a dict of parallel lists ('ids', 'distances', 'documents', ...),
# each holding one inner list per query text; print it to inspect the structure
print(retriever_results)



{'ids': [['1381948', '671885']], 'distances': [[1.7183220386505127, 1.7184715270996094]], 'metadatas': [[None, None]], 'embeddings': None, 'documents': [["Captain Alvarez: A melodrama about an American who becomes a revolutionary leader battling evil government spies in Argentina. William Desmond Taylor portrays the title role, and Denis Gage Deane-Tanner, Taylor's younger brother, is thought to have played the small role of a blacksmith.", 'Feel My Pulse: Barbara Manning (Daniels) is a wealthy hypochondriac who inherits a sanatorium and finds love and adventure.']], 'uris': None, 'data': None}
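The printed dictionary holds parallel lists with one inner list per query. As a small sketch of how those lists could be unpacked, the example below works on a trimmed copy of the output above (documents shortened); `unpack_first_query` is a hypothetical helper, not a Chroma function.

```python
# Trimmed copy of the structure printed above (documents shortened).
sample_results = {
    "ids": [["1381948", "671885"]],
    "distances": [[1.7183220386505127, 1.7184715270996094]],
    "documents": [[
        "Captain Alvarez: A melodrama ...",
        "Feel My Pulse: Barbara Manning ...",
    ]],
}


def unpack_first_query(results):
    """Return (id, distance, document) tuples for the first query in a result dict."""
    return list(zip(results["ids"][0], results["distances"][0], results["documents"][0]))


for doc_id, distance, document in unpack_first_query(sample_results):
    print(f"{doc_id}  {distance:.4f}  {document}")
```

Index `0` selects the results for the first (and here only) entry in `query_texts`.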


Now let's put together our final prompt:

In [None]:
# user query
user_query = "Documentaries showcasing indigenous peoples' survival and daily life in Arctic regions"
# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=10,
)

# concatenate titles into a single string
short_titles = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to generate 5 SUGGESTED_TITLES based for the PAPER_TITLE

You should mimic a similar style and length as SHORT_TITLES but PLEASE DO NOT include titles from SHORT_TITLES in the SUGGESTED_TITLES, only generate versions of the PAPER_TILE.

PAPER_TITLE: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
'''

# get_completion returns the generated text as a single string
suggested_titles = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)

# Print the suggestions.
print("Model Suggestions:")
print(suggested_titles)
print("\n\n\nPrompt Template:")
print(prompt_template)

Model Suggestions:

1. "Arctic Survival: A Documentary"
2. "Life in the Arctic: A Visual Journey"
3. "The Arctic Frontier: A Documentary"
4. "Arctic Adventures: A Visual Guide"
5. "Arctic Survival: A Guide to Daily Life"



Prompt Template:
[INST]

Your main task is to generate 5 SUGGESTED_TITLES based for the PAPER_TITLE

You should mimic a similar style and length as SHORT_TITLES but PLEASE DO NOT include titles from SHORT_TITLES in the SUGGESTED_TITLES, only generate versions of the PAPER_TILE.

PAPER_TITLE: Documentaries showcasing indigenous peoples' survival and daily life in Arctic regions

SHORT_TITLES: Nanook of the North: The documentary follows the lives of an Inuk, Nanook, and his family as they travel, search for food, and trade in the Ungava Peninsula of northern Quebec, Canada. Nanook; his wife, Nyla; and their family are introduced as fearless heroes who endure rigors no other race could survive. The audience sees Nanook, often with his family, hunt a walrus, build an i
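The same prompt is rebuilt by hand for every query in the cells that follow. As a sketch of how that step could be factored out, the hypothetical helper `build_prompt` below fills the template from a query and a list of retrieved documents. The prompt wording is copied verbatim from the cell above so that results stay comparable.

```python
PROMPT_TEMPLATE = """[INST]

Your main task is to generate 5 SUGGESTED_TITLES based for the PAPER_TITLE

You should mimic a similar style and length as SHORT_TITLES but PLEASE DO NOT include titles from SHORT_TITLES in the SUGGESTED_TITLES, only generate versions of the PAPER_TILE.

PAPER_TITLE: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
"""


def build_prompt(user_query, documents):
    """Fill the prompt template with the query and the retrieved documents."""
    # One retrieved passage per line, as in the cells of this notebook.
    short_titles = "\n".join(documents)
    return PROMPT_TEMPLATE.format(user_query=user_query, short_titles=short_titles)


# Placeholder documents stand in for results['documents'][0].
example_prompt = build_prompt("Western Romance", ["Movie A: plot", "Movie B: plot"])
print(example_prompt)
```

In the notebook, this could be used as `get_completion(build_prompt(user_query, results['documents'][0]), model=mistral_llm, max_tokens=2000)`.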

In [None]:
user_query = "Western Romance"
# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=1,
)
short_titles = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to generate 5 SUGGESTED_TITLES based for the PAPER_TITLE

You should mimic a similar style and length as SHORT_TITLES but PLEASE DO NOT include titles from SHORT_TITLES in the SUGGESTED_TITLES, only generate versions of the PAPER_TILE.

PAPER_TITLE: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
'''

# get_completion returns the generated text as a single string
suggested_titles = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)

# Print the suggestions.
print("Model Suggestions:")
print(suggested_titles)

Model Suggestions:

1. "Love in the Wild West"
2. "The Italian Opera Singer's Priest"
3. "A Grandfather's Advice: A Western Romance"
4. "The Conflict of Love and Worldliness in the West"
5. "A Pathos-Filled Western Romance"


In [None]:
user_query = "Silent film about a Parisian star moving to Egypt, leaving her husband for a baron, and later reconciling after finding her family in poverty in Cairo."

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=1,
)
short_titles = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to generate 5 SUGGESTED_TITLES based for the PAPER_TITLE

You should mimic a similar style and length as SHORT_TITLES but PLEASE DO NOT include titles from SHORT_TITLES in the SUGGESTED_TITLES, only generate versions of the PAPER_TILE.

PAPER_TITLE: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
'''

# get_completion returns the generated text as a single string
suggested_titles = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)

# Print the suggestions.
print("Model Suggestions:")
print(suggested_titles)

Model Suggestions:

1. "The Parisian Star's Egyptian Odyssey"
2. "From Paris to Cairo: A Tale of Love and Loss"
3. "The Sahara Siren: A Silent Film Love Story"
4. "A Parisian Music Hall Star's Desert Dilemma"
5. "The Baroness of Cairo: A Silent Film Redemption"


In [None]:
user_query = "Comedy film, office disguises, boss's daughter,elopement."

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=1,
)
short_titles = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to generate 5 SUGGESTED_TITLES based for the PAPER_TITLE

You should mimic a similar style and length as SHORT_TITLES but PLEASE DO NOT include titles from SHORT_TITLES in the SUGGESTED_TITLES, only generate versions of the PAPER_TILE.

PAPER_TITLE: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
'''

# get_completion returns the generated text as a single string
suggested_titles = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)

# Print the suggestions.
print("Model Suggestions:")
print(suggested_titles)

Model Suggestions:

1. "Office Disguises: A Comedy of Errors"
2. "The Boss's Daughter and the Elopement"
3. "The Office Guardians: A Comedy of Clumsiness"
4. "The Medieval Armor Adventure: A Comedy of Errors"
5. "The Wall Climbing Sequence: A Comedy of Heights"


In [None]:
user_query = "Lost film, Cleopatra charms Caesar, plots world rule, treasures from mummy, revels with Antony, tragic end with serpent in Alexandria."

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=1,
)
short_titles = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to generate 5 SUGGESTED_TITLES based for the PAPER_TITLE

You should mimic a similar style and length as SHORT_TITLES but PLEASE DO NOT include titles from SHORT_TITLES in the SUGGESTED_TITLES, only generate versions of the PAPER_TILE.

PAPER_TITLE: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
'''

# get_completion returns the generated text as a single string
suggested_titles = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)

# Print the suggestions.
print("Model Suggestions:")
print(suggested_titles)

Model Suggestions:

1. Cleopatra: The Lost Film
2. Cleopatra's Charm: A Tale of Love and Power
3. Cleopatra's Quest for World Domination
4. Cleopatra's Treasure: A Story of Love and Betrayal
5. Cleopatra's Tragic End: A Love Story


In [None]:
user_query = "Denis Gage Deane-Tanner"

# query for user query
results = collection.query(
    query_texts=[user_query],
    n_results=1,
)
short_titles = '\n'.join(results['documents'][0])

prompt_template = f'''[INST]

Your main task is to generate 5 SUGGESTED_TITLES based for the PAPER_TITLE

You should mimic a similar style and length as SHORT_TITLES but PLEASE DO NOT include titles from SHORT_TITLES in the SUGGESTED_TITLES, only generate versions of the PAPER_TILE.

PAPER_TITLE: {user_query}

SHORT_TITLES: {short_titles}

SUGGESTED_TITLES:

[/INST]
'''

# get_completion returns the generated text as a single string
suggested_titles = get_completion(prompt_template, model=mistral_llm, max_tokens=2000)

# Print the suggestions.
print("Model Suggestions:")
print(suggested_titles)

Model Suggestions:

1. "The Blacksmith's Brother: A Tale of Revolution and Betrayal"
2. "The Secret Life of a Revolutionary"
3. "The Forgotten Brother: A Story of Loyalty and Sacrifice"
4. "The Shadow of the Blacksmith: A Mystery Unveiled"
5. "The Brotherhood of Revolution: A Family Torn Apart"


As you can see, the titles generated by the LLM are reasonable but uneven: for the "Western Romance" query, for instance, a suggestion such as "The Italian Opera Singer's Priest" has little to do with the query itself. This use case still needs more work and could potentially benefit from fine-tuning as well. For the purpose of this tutorial, we have provided a simple application of RAG using open-source models served on the Fireworks fast inference platform.

Try out other open-source models here: https://app.fireworks.ai/models

Read more about the Fireworks APIs here: https://readme.fireworks.ai/reference/createchatcompletion
