# Getting Started with RAG using Fireworks Fast Inference LLMs

While large language models (LLMs) show powerful capabilities that power advanced use cases, they suffer from issues such as factual inconsistency and hallucination. Retrieval-augmented generation (RAG) is a powerful approach to enrich LLM capabilities and improve their reliability. RAG involves combining LLMs with external knowledge by enriching the prompt context with relevant information that helps accomplish a task.

In a nutshell, retrieval-augmented generation is a mechanism in which the system fetches documents relevant to the input query from a bulk data source such as Wikipedia, then sends the query together with those documents to the LLM. The retrieved documents give the model the context and details it needs, so the results are typically more accurate than those produced from the query alone.
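The retrieve-then-prompt flow described above can be sketched in a few lines of plain Python. This is only a toy illustration: the word-overlap `score` function, the `retrieve` and `build_prompt` helpers, and the sample documents are all assumptions made for this sketch, not part of the notebook, which uses embeddings and ChromaDB for retrieval.

```python
def score(query, document):
    """Toy relevance score: number of distinct query words found in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words)


def retrieve(query, documents, k=2):
    """Return the k documents with the highest toy score (ties keep input order)."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, context_docs):
    """Place the retrieved documents ahead of the question in a single prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical mini corpus, used only to make the sketch runnable.
documents = [
    "Nanook of the North follows an Inuk family in northern Quebec",
    "Cleopatra charms Caesar and later revels with Antony",
    "A bartender is working at a saloon",
]
query = "film about an Inuk family"
top_docs = retrieve(query, documents, k=1)
prompt = build_prompt(query, top_docs)
print(prompt)
```

The prompt printed at the end is what would be sent to the LLM in place of the bare query.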

To showcase the power of RAG, this use case covers building a RAG system that suggests five relevant search results for a query containing a plot description, a movie title, or both.

In [None]:
%%capture
!pip install chromadb tqdm fireworks-ai python-dotenv pandas
!pip install sentence-transformers

We have installed all the required libraries. Let us move on to fetching the dataset. We tried to fetch the dataset from Kaggle, but for some reason it failed, so we upload the dataset manually using the Google Colab file upload feature.

In [None]:
import pandas as pd #importing the pandas module for loading the dataset into pandas dataframe

# Loading the CSV file into a DataFrame
df = pd.read_csv('/content/wiki_movie_plots_deduped.csv', nrows=1000)

# Display the first five rows of the DataFrame (head() defaults to five rows)
print(df.head())

   Release Year                             Title Origin/Ethnicity  \
0          1901            Kansas Saloon Smashers         American   
1          1901     Love by the Light of the Moon         American   
2          1901           The Martyred Presidents         American   
3          1901  Terrible Teddy, the Grizzly King         American   
4          1902            Jack and the Beanstalk         American   

                             Director Cast    Genre  \
0                             Unknown  NaN  unknown   
1                             Unknown  NaN  unknown   
2                             Unknown  NaN  unknown   
3                             Unknown  NaN  unknown   
4  George S. Fleming, Edwin S. Porter  NaN  unknown   

                                           Wiki Page  \
0  https://en.wikipedia.org/wiki/Kansas_Saloon_Sm...   
1  https://en.wikipedia.org/wiki/Love_by_the_Ligh...   
2  https://en.wikipedia.org/wiki/The_Martyred_Pre...   
3  https://en.wikipedia.

I have loaded the dataset, but it has a lot of columns, so I am dropping everything except two columns: Title and Plot.

In [None]:
import pandas as pd #loading pandas module for displaying with help of data frame

#Reading the first 1000 rows and only 'Title' and 'Plot' columns. We need those only so that we can query them and get the results
df = pd.read_csv('/content/wiki_movie_plots_deduped.csv', usecols=['Title', 'Plot'], nrows=1000)
df.head()

Unnamed: 0,Title,Plot
0,Kansas Saloon Smashers,"A bartender is working at a saloon, serving dr..."
1,Love by the Light of the Moon,"The moon, painted with a smiling face hangs ov..."
2,The Martyred Presidents,"The film, just over a minute long, is composed..."
3,"Terrible Teddy, the Grizzly King",Lasting just 61 seconds and consisting of two ...
4,Jack and the Beanstalk,The earliest known adaptation of the classic f...


Before continuing, you need to obtain a Fireworks API Key to use the Mistral 7B model.

Check out this quick guide to obtain your Fireworks API Key: https://readme.fireworks.ai/docs

In [None]:
import pandas as pd

# Specifying the path of the new CSV file with two columns and first thousand entries
output_file = "first_1000_entries_dataset.csv"

# Saving the DataFrame to a new CSV file
df.to_csv(output_file, index=False)

print("DataFrame saved to", output_file)

DataFrame saved to first_1000_entries_dataset.csv


Saving the two columns as a new dataset.

In [None]:
!pip install fireworks-ai  # Fireworks AI client; the PyPI package named FireWorks is an unrelated workflow library



Installing the Fireworks AI client library.

In [None]:
!pip install python-dotenv  # For dotenv
# For chromadb, you'll need the correct package name or installation method



Installing python-dotenv, which loads API keys and other settings from a .env file.

In [None]:
import os
from google.colab import auth

auth.authenticate_user()

os.environ['FIREWORKS_API_KEY'] = 'your_api_key_here'

In [None]:
import fireworks.client
import os
import dotenv
import chromadb
import json
from tqdm.auto import tqdm
import pandas as pd
import random
from google.colab import userdata

fireworks.client.api_key = userdata.get('FIREWORKS_API_KEY')

I created a secret named FIREWORKS_API_KEY, set its value to my API key, and stored it under the Secrets tab in Google Colab.
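Outside Colab, the same key could be read from an environment variable instead of the Secrets tab. The sketch below is a hypothetical helper (`load_api_key` is not part of the Fireworks client) that fails early when the key is missing or still set to the placeholder value; the key string in the usage line is a made-up stand-in.

```python
import os


def load_api_key(name="FIREWORKS_API_KEY", env=None):
    """Return the key stored under `name`, or raise if it is missing or a placeholder."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError(f"{name} is not set; add it as a secret or environment variable")
    return key


# Usage with a stand-in mapping so the example runs without a real key.
example_env = {"FIREWORKS_API_KEY": "fw-example-key"}
print(load_api_key(env=example_env))
```

With a real environment, the returned value could then be assigned to `fireworks.client.api_key` as in the cell above.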

In [None]:
def get_completion(prompt, model=None, max_tokens=50):

    fw_model_dir = "accounts/fireworks/models/"

    if model is None:
        model = fw_model_dir + "llama-v2-7b"
    else:
        model = fw_model_dir + model

    completion = fireworks.client.Completion.create(
        model=model,
        prompt=prompt,
        max_tokens=max_tokens,
        temperature=0
    )

    return completion.choices[0].text

In [None]:
get_completion("Hello, my name is")

' Katie and I am a 20 year old student at the University of Leeds. I am currently studying a BA in English Literature and Creative Writing. I have been working as a tutor for over 3 years now and I'

Testing whether the connection to Fireworks.ai works.

In [None]:
mistral_llm = "mistral-7b-instruct-4k"

get_completion("Hello, my name is", model=mistral_llm)

' [Your Name]. I am a [Your Profession/Occupation]. I am writing to [Purpose of Writing].\n\nI am writing to [Purpose of Writing] because [Reason for Writing]. I believe that ['

In [None]:
mistral_llm = "mistral-7b-instruct-4k"

get_completion("Tell me 2 jokes", model=mistral_llm)

".\n1. Why don't scientists trust atoms? Because they make up everything!\n2. Did you hear about the mathematician who’s afraid of negative numbers? He will stop at nothing to avoid them."

In [None]:
mistral_llm = "mistral-7b-instruct-4k"

get_completion("[INST]Tell me 2 jokes[/INST]", model=mistral_llm)

" Sure, here are two jokes for you:\n\n1. Why don't scientists trust atoms? Because they make up everything!\n2. Why did the tomato turn red? Because it saw the salad dressing!"

In [None]:
prompt = """[INST]
Given the following wedding guest data, write a very short 3-sentences thank you letter:

{
  "name": "John Doe",
  "relationship": "Bride's cousin",
  "hometown": "New York, NY",
  "fun_fact": "Climbed Mount Everest in 2020",
  "attending_with": "Sophia Smith",
  "bride_groom_name": "Tom and Mary"
}

Use only the data provided in the JSON object above.

The senders of the letter is the bride and groom, Tom and Mary.
[/INST]"""

get_completion(prompt, model=mistral_llm, max_tokens=150)

" Dear John Doe,\n\nWe, Tom and Mary, would like to extend our heartfelt gratitude for your attendance at our wedding. It was a pleasure to have you there, and we truly appreciate the effort you made to be a part of our special day.\n\nWe were thrilled to learn about your fun fact - climbing Mount Everest is an incredible accomplishment! We hope you had a safe and memorable journey.\n\nThank you again for joining us on this special occasion. We hope to stay in touch and catch up on all the amazing things you've been up to.\n\nWith love,\n\nTom and Mary"

Testing whether the Mistral model works by sending sample queries. Now that it is working, let us code the main functionality.
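The sample queries above suggest that wrapping the instruction in `[INST]` and `[/INST]` tags gives cleaner answers from the instruct model. A small helper can build such prompts consistently. This is a sketch only: `build_instruct_prompt` is a hypothetical name, and the exact whitespace inside the tags is an assumption rather than a documented requirement.

```python
import json


def build_instruct_prompt(instruction, data=None):
    """Wrap an instruction (and optional JSON data) in [INST] ... [/INST] tags."""
    parts = [instruction.strip()]
    if data is not None:
        # Pretty-print the data so the model sees one field per line.
        parts.append(json.dumps(data, indent=2))
    body = "\n\n".join(parts)
    return f"[INST]\n{body}\n[/INST]"


guest = {"name": "John Doe", "relationship": "Bride's cousin"}
prompt = build_instruct_prompt("Write a very short thank you letter for this guest:", guest)
print(prompt)
```

The resulting string can be passed to `get_completion` in place of a hand-written prompt.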

## RAG Use Case: Generating top five relevant search results

For the RAG use case, we will be using the movie plots dataset prepared above.

The user provides a query containing a plot description, a movie title, or both. We then take that input and search the dataset to retrieve the five most relevant movies.



In [None]:
!pip install transformers datasets torch

Collecting datasets
  Downloading datasets-2.18.0-py3-none-any.whl (510 kB)
Collecting dill<0.3.9,>=0.3.0 (from datasets)
  Downloading dill-0.3.8-py3-none-any.whl (116 kB)
Collecting xxhash (from datasets)
  Downloading xxhash-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (194 kB)
Collecting multiprocess (from datasets)
  Downloading multiprocess-0.70.16-py310-none-any.whl (134 kB)
Installing collected packages: xxhash, dill, multiprocess, datasets
Successfully installed dataset

In [None]:
!pip install sentence-transformers chromadb tqdm



Installing all the required libraries for implementing the RAG application

In [None]:
import pandas as pd

df2 = pd.read_csv('/content/first_1000_entries_dataset.csv')
df2.head()

Unnamed: 0,Title,Plot
0,Kansas Saloon Smashers,"A bartender is working at a saloon, serving dr..."
1,Love by the Light of the Moon,"The moon, painted with a smiling face hangs ov..."
2,The Martyred Presidents,"The film, just over a minute long, is composed..."
3,"Terrible Teddy, the Grizzly King",Lasting just 61 seconds and consisting of two ...
4,Jack and the Beanstalk,The earliest known adaptation of the classic f...


We have loaded the dataset with its two columns, Title and Plot. Everything is fine, so let's go on with the code.

In [None]:
df2['text'] = df2['Title'] + ' ' + df2['Plot']

Now we merge the 'Title' and 'Plot' columns into a single column so that the search operation is easier.

In [None]:
df2[['text']].to_csv('movies_dataset.csv', index=False)

Now that we have merged the two columns, we save the result as a new dataset, "movies_dataset.csv".

In [None]:
import pandas as pd

df = pd.read_csv('/content/movies_dataset.csv')

movies_dict = df[['text']].to_dict(orient="records")

We are converting the dataset entries into a list of dictionaries.

In [None]:
from sentence_transformers import SentenceTransformer
import chromadb
from chromadb import PersistentClient
from tqdm.auto import tqdm
import random

embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

class MovieEmbeddingFunction(chromadb.EmbeddingFunction):
    def __call__(self, input: chromadb.Documents) -> chromadb.Embeddings:
        batch_embeddings = embedding_model.encode(input)
        return batch_embeddings.tolist()

embed_fn = MovieEmbeddingFunction()

# Initialize the chromadb directory and client
client = PersistentClient(path="./movie_chromadb")

# Create collection
collection_name = "movies_collection_nov_2023"
collection = client.get_or_create_collection(
    name=collection_name,
    embedding_function=embed_fn
)

In this code, we load the sentence-transformers embedding model and wrap it in a custom embedding function for Chroma. We also initialize the chromadb directory and client, and create the "movies_collection_nov_2023" collection in which the embeddings will be stored.
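To give an intuition for what the vector store does with these embeddings, the sketch below ranks documents by similarity to a query vector. It is a toy under stated assumptions: the three-dimensional vectors are invented, `cosine_similarity` and `nearest` are hypothetical helpers, and cosine similarity is chosen for illustration only; the distance metric Chroma actually uses depends on the collection's configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query_vec, doc_vecs, k=2):
    """Return the ids of the k document vectors most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Invented embeddings; real ones from all-MiniLM-L6-v2 have far more dimensions.
doc_vecs = {
    "western": [0.9, 0.1, 0.0],
    "romance": [0.1, 0.9, 0.0],
    "documentary": [0.0, 0.1, 0.9],
}
print(nearest([0.8, 0.3, 0.0], doc_vecs, k=2))
```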

In [None]:
batch_size = 50

for i in tqdm(range(0, len(movies_dict), batch_size)):
    i_end = min(i + batch_size, len(movies_dict))
    batch = [movie['text'] for movie in movies_dict[i:i_end]]
    batch_ids = [str(idx) for idx in range(i, i_end)]  # One ID per document, even in a short final batch

    batch_embeddings = embedding_model.encode(batch)

    collection.upsert(
        ids=batch_ids,
        documents=batch,
        embeddings=batch_embeddings.tolist(),
    )

  0%|          | 0/20 [00:00<?, ?it/s]

We chose a batch size of 50. For each batch we generate the embeddings and upsert them into ChromaDB to support RAG retrieval.
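The batching step can also be written as a small generator that keeps the IDs and the texts in lockstep, so a final batch shorter than the batch size still gets exactly one ID per document. `iter_batches` is a hypothetical helper for illustration, not something the notebook defines.

```python
def iter_batches(documents, batch_size):
    """Yield (ids, texts) pairs; the final batch may be shorter than batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(documents), batch_size):
        texts = documents[start:start + batch_size]
        # IDs are the positions of the documents in the full list, as strings.
        ids = [str(start + offset) for offset in range(len(texts))]
        yield ids, texts


# Seven placeholder documents with a batch size of three give batches of 3, 3 and 1.
docs = [f"movie {n}" for n in range(7)]
for ids, texts in iter_batches(docs, batch_size=3):
    print(ids, texts)
```

Each `(ids, texts)` pair could be embedded and passed to `collection.upsert` in the same way as in the loop above.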

In [None]:
query_plot = input("Enter a query:")

results = collection.query(
    query_texts=[query_plot],
    n_results=5,
)

for doc in results['documents'][0]:
    print(doc, '\n')


Enter a query:“Documentaries showcasing indigenous peoples' survival and daily life in Arctic regions
Nanook of the North The documentary follows the lives of an Inuk, Nanook, and his family as they travel, search for food, and trade in the Ungava Peninsula of northern Quebec, Canada. Nanook; his wife, Nyla; and their family are introduced as fearless heroes who endure rigors no other race could survive. The audience sees Nanook, often with his family, hunt a walrus, build an igloo, go about his day, and perform other tasks. 

In the Land of the Head Hunters The following plot synopsis was published in conjunction with a 1915 showing of the film at Carnegie Hall: 

The Frozen North The film opens near the "last stop on the subway", a terminal in Alaska, which appears to be emerging from deep snow in the middle of nowhere. A tough-looking cowboy (Buster Keaton) emerges. He arrives at a small settlement, finding people gambling in a saloon. He tries to rob them by scaring them with the c

The code is done. We enter a query and run the cell to get the results. The cell is executed six times, once for each of the six queries, so that every output is visible.
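Instead of repeating the same cell for every query, the lookup could be wrapped in a function and called in a loop. The sketch below assumes only that the collection exposes `query(query_texts=..., n_results=...)` and returns a dictionary with a `documents` entry, as in the cell above. `top_documents` and `FakeCollection` are hypothetical names; the fake stands in for the real collection so the example runs on its own.

```python
def top_documents(collection, query, n_results=5):
    """Return the top documents for one query from a Chroma-style collection."""
    results = collection.query(query_texts=[query], n_results=n_results)
    return results["documents"][0]


class FakeCollection:
    """Stand-in for the real collection so the sketch runs without ChromaDB."""

    def __init__(self, documents):
        self._documents = documents

    def query(self, query_texts, n_results):
        # Ignores the query text and returns the first n_results documents per query.
        return {"documents": [self._documents[:n_results] for _ in query_texts]}


fake = FakeCollection(["doc A", "doc B", "doc C"])
for q in ["Western romance", "Denis Gage Deane-Tanner"]:
    print(q, "->", top_documents(fake, q, n_results=2))
```

With the real `collection` object in place of `fake`, the loop would print the top results for every query in one run.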

In [None]:
query_plot = input("Enter a query:")

results = collection.query(
    query_texts=[query_plot],
    n_results=5,
)

for doc in results['documents'][0]:
    print(doc, '\n')

Enter a query:Western romance
Romance As described in a film publication,[2] a youth (Arthur Rankin) in the prologue seeks advice from his grandfather (Sydney), who then recalls a romance of his own youth which is then shown as a flashback. A priest (Sydney) is in love with an Italian opera singer (Keane), and the drama involves the conflict between his efforts to rise above worldly things or to leave with her. The romance ends with a deep note of pathos. 

Wild and Woolly As described in a film magazine review,[1] Jeff Hillington (Fairbanks), son of railroad magnate Collis J. Hillington (Bytell), tires of the East and longs for the wild and woolly West. He has his apartment and office fixed up in his understanding of the accepted Western style, which he has gleaned from dime novels. A delegation from Bitter Creek comes to New York City seeking financial backing for the construction of a spur line, and go to Collis to explain their proposition. Collis sends Jeff to investigate. The cit

In [None]:
query_plot = input("Enter a query:")

results = collection.query(
    query_texts=[query_plot],
    n_results=5,
)

for doc in results['documents'][0]:
    print(doc, '\n')

Enter a query:Silent film about a Parisian star moving to Egypt, leaving her husband for a baron, and later reconciling after finding her family in poverty in Cairo
Sahara Silent film femme fatale, Louise Glaum, portrays the role of Mignon, a Parisian music hall celebrity. Mignon marries a young American civil engineer, John Stanley, portrayed by Matt Moore. Stanley is transferred to Egypt to work on an engineering project in the Sahara. Mignon and her son, portrayed by Pat Moore, join Stanley in the desert.[3][4] Unhappy with life in the desert, Mignon leaves Stanley and her son in the desert and moves to Cairo with the wealthy Baron Alexis, portrayed by Edwin Stevens. Mignon lives in Baron Alexis' palace while Stanley goes blind and becomes addicted to the drug hasheesh. Mignon later encounters Stanley and her son, who have become beggars in the streets of Cairo.[3][4] Mignon returns to the desert to care for her husband, and the two are reconciled. 

Mothers Cry The film is focused 

In [None]:
query_plot = input("Enter a query:")

results = collection.query(
    query_texts=[query_plot],
    n_results=5,
)

for doc in results['documents'][0]:
    print(doc, '\n')

Enter a query:"Comedy film, office disguises, boss's daughter, elopement
Ask Father Lloyd is a serious young middle-class guy on the make, who wants to marry the boss’ daughter. The problem is getting in to see the boss so that he can ask for her hand in marriage; the office is guarded by a bunch of comic, clumsy flunkies who throw everyone out who tries to get in. When Lloyd gets into the boss’ office, the latter uses trap doors and conveyor belts to expel him; Lloyd then goes to the costume company next door, tries to get in wearing drag (no success), and then in medieval armor – that works, since he bangs everyone over the head with his club, but then he finds out that the daughter has eloped with another suitor. Lloyd decides to be sensible and he settles for the cute switchboard operator (Daniels) instead. The film includes a brief wall climbing sequence. Light-hearted, short, fast-paced. 

Caught in a Cabaret Chaplin plays a waiter who fakes being a Greek Ambassador to impress a 

In [None]:
query_plot = input("Enter a query:")

results = collection.query(
    query_texts=[query_plot],
    n_results=5,
)

for doc in results['documents'][0]:
    print(doc, '\n')

Enter a query:"Lost film, Cleopatra charms Caesar, plots world rule, treasures from mummy, revels with Antony, tragic end with serpent in Alexandria.
Cleopatra Because the film has been lost, the following summary is reconstructed from a description in a contemporary film magazine.
Cleopatra (Bara), the Siren of Egypt, by a clever ruse reaches Caesar (Leiber) and he falls victim to her charms. They plan to rule the world together, but then Caesar falls. Cleopatra's life is desired by the church, as the wanton woman's rule has become intolerable. Pharon (Roscoe), a high priest, is given a sacred dagger to take her life. He gives her his love instead and, when she is in need of some money, leads her to the tomb of his ancestors, where she tears the treasure from the breast of the mummy. With this wealth she goes to Rome to meet Antony (Hall). He leaves the affairs of state and travels to Alexandria with her, where they revel. Antony is recalled to Rome and married to Octavia (Blinn), but

In [None]:
query_plot = input("Enter a query:")

results = collection.query(
    query_texts=[query_plot],
    n_results=5,
)

for doc in results['documents'][0]:
    print(doc, '\n')

Enter a query:Denis Gage Deane-Tanner
Captain Alvarez A melodrama about an American who becomes a revolutionary leader battling evil government spies in Argentina. William Desmond Taylor portrays the title role, and Denis Gage Deane-Tanner, Taylor's younger brother, is thought to have played the small role of a blacksmith. 

Near the Rainbow's End Rancher Tug Wilson (Alfred Hewston) discovers his mate's diabolical scheme, only to be killed instantly. The criminal rancher, Buck Rankin (Al Ferguson), is guilty of killing the Bledsoes' cattle. Buck blames Tug's death on Jim (Bob Steele), the son of Tom Bledsoe (Lafe McKee). Seeking revenge, Tug's daughter Ruth (Louise Lorraine) joins a movement led by Buck to kill Jim. Jim narrowly escapes his first capture attempt but knows he will not make it far. Luckily for him, a sheep herder has witnessed Buck killing Tug and the cattle. With the truth out, Sheriff Hank Bosley (Hank Bell), who was initially on Buck's side, promptly arrests the guilt

Let us now calculate the values of Recall@k and MRR.

**Calculating the Recall values**

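Recall is the fraction of all relevant results for a query that actually appear in the top k results. Mathematically, this is given by:

Recall@k = true positives@k / (true positives@k + false negatives@k)

Here the set of relevant results for each query is taken to be the results judged relevant among the top five, so the false negatives at k are the relevant results ranked below position k.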
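The same calculation can be expressed as a short function over a list of relevance labels (1 for relevant, 0 for irrelevant) in ranked order. This is a minimal sketch: `recall_at_k` is a hypothetical helper, and by default it assumes that the total number of relevant results equals the number of relevant items in the list, which matches the manual calculations in this section.

```python
def recall_at_k(relevance, k, total_relevant=None):
    """Recall@k for a ranked list of 0/1 relevance labels."""
    if total_relevant is None:
        # Assumption: every relevant result appears somewhere in the ranked list.
        total_relevant = sum(relevance)
    if total_relevant == 0:
        return 0.0
    return sum(relevance[:k]) / total_relevant


# Labels for the first query: positions 1, 3 and 5 are relevant.
query1 = [1, 0, 1, 0, 1]
for k in range(1, 6):
    print(f"Recall@{k} = {recall_at_k(query1, k):.2f}")
```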

1. Documentaries showcasing indigenous peoples' survival and daily life in Arctic regions

The top 5 results for this query are:
Nanook of the North
In the Land of the Head Hunters
The Frozen North
From Leadville to the Aspen
Chang: A Drama of the Wilderness

Out of these five results,

Nanook of the North- RELEVANT
In the Land of the Head Hunters- IRRELEVANT
The Frozen North- RELEVANT
From Leadville to the Aspen- IRRELEVANT
Chang: A Drama of the Wilderness- RELEVANT

Out of five results, three are relevant (1,3,5- relevant and 2,4- irrelevant)

Recall@1 = 1/(1+2) = 1/3 = 0.33
Recall@2 = 1/(1+2) = 1/3 = 0.33
Recall@3 = 2/(2+1) = 2/3 = 0.67
Recall@4 = 2/(2+1) = 2/3 = 0.67
Recall@5 = 3/(3+0) = 3/3 = 1

2. Western romance

The top 5 results for this query are: Romance, Wild and Woolly, Bucking Broadway, The Enchanted Cottage, A Romance of Happy Valley

Out of these five results,
Romance- RELEVANT
Wild and Woolly- IRRELEVANT
Bucking Broadway- RELEVANT
The Enchanted Cottage- RELEVANT
A Romance of Happy Valley- IRRELEVANT (Out of five results, 1,3,4- relevant and 2,5- irrelevant)

Recall@1 = 1/(1+2) = 1/3 = 0.33
Recall@2 = 1/(1+2) = 1/3 = 0.33
Recall@3 = 2/(2+1) = 2/3 = 0.67
Recall@4 = 3/(3+0) = 3/3 = 1
Since we got 1 at the fourth step, we do not need to continue.

3. Silent film about a Parisian star moving to Egypt, leaving her husband for a baron, and later reconciling after finding her family in poverty in Cairo

The top 5 results for this query are: Sahara, Mothers Cry, The House with Closed Shutters, A Busy Day, The Suburbanite

Out of these five results,
Sahara- RELEVANT
Mothers Cry- IRRELEVANT
The House with Closed Shutters- IRRELEVANT
A Busy Day- IRRELEVANT
The Suburbanite- IRRELEVANT       (1- relevant, 2,3,4,5- irrelevant)

Recall@1 = 1/(1+0) = 1/1 = 1
Since we got 1 at first step, we do not need to continue.

4. Comedy film, office disguises, boss's daughter, elopement

The top 5 results for this query are: Ask Father, Caught in a Cabaret, The Extra Girl, Mabel's Blunder, Amarilly of Clothes-Line Alley   

Out of these five results,
Ask Father- RELEVANT
Caught in a Cabaret- RELEVANT
The Extra Girl- IRRELEVANT
Mabel's Blunder- RELEVANT
Amarilly of Clothes-Line Alley- IRRELEVANT       (1,2,4- relevant, 3,5- irrelevant)

Recall@1 = 1/(1+2) = 1/3 = 0.33
Recall@2 = 2/(2+1) = 2/3 = 0.67
Recall@3 = 2/(2+1) = 2/3 = 0.67
Recall@4 = 3/(3+0) = 3/3 = 1
Since we got 1 at fourth step, we do not need to continue.

5. Lost film, Cleopatra charms Caesar, plots world rule, treasures from mummy, revels with Antony, tragic end with serpent in Alexandria.

The top 5 results for this query are: Cleopatra, A Daughter of the Gods, Disraeli, A Splendid Hazard, The Sorrows of Saturn

Out of these five results,
Cleopatra- RELEVANT
A Daughter of the Gods- IRRELEVANT
Disraeli- IRRELEVANT
A Splendid Hazard- IRRELEVANT
The Sorrows of Saturn- IRRELEVANT     (1 -relevant, 2,3,4,5 -irrelevant)

Recall@1 = 1/(1+0) = 1/1 = 1
Since we got 1 at the first step, we do not need to continue.

6. Denis Gage Deane-Tanner

The top 5 results for this query are: Captain Alvarez, Near the Rainbow's End, A Man from Wyoming, The Wolf Song, Tenderloin
Out of these five results,

Captain Alvarez- RELEVANT
Near the Rainbow's End- RELEVANT
A Man from Wyoming- IRRELEVANT
The Wolf Song- IRRELEVANT
Tenderloin- IRRELEVANT              (1,2 -relevant, 3,4,5 -irrelevant)

Recall@1 = 1/(1+1) = 1/2 = 0.5
Recall@2 = 2/(2+0) = 2/2 = 1
Since we got 1 at second step, we do not need to continue.

**Calculating the MRR**

Mean reciprocal rank (MRR) is useful when we want the system to return the most relevant item as high in the ranking as possible. To calculate it, we first compute the reciprocal rank for each query: the reciprocal of the rank of the first relevant result, which ranges from 0 to 1. MRR is the mean of these reciprocal ranks over all N queries:

MRR = (1/N) × [1/rank(Query 1) + 1/rank(Query 2) + ... + 1/rank(Query N)]
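As a cross-check on the manual calculation, reciprocal rank and MRR can be computed from the same 0/1 relevance labels used for recall. The helper names below are hypothetical, and the label lists encode the relevance judgments made for the six queries in this notebook.

```python
def reciprocal_rank(relevance):
    """1 / position of the first relevant result, or 0.0 if none is relevant."""
    for position, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(all_relevance):
    """Mean of the reciprocal ranks over all queries (0.0 for an empty list)."""
    if not all_relevance:
        return 0.0
    return sum(reciprocal_rank(labels) for labels in all_relevance) / len(all_relevance)


# Relevance labels for the six queries, in the order they were run.
judgments = [
    [1, 0, 1, 0, 1],  # Arctic documentaries
    [1, 0, 1, 1, 0],  # Western romance
    [1, 0, 0, 0, 0],  # Parisian star in Egypt
    [1, 1, 0, 1, 0],  # Office comedy
    [1, 0, 0, 0, 0],  # Cleopatra
    [1, 1, 0, 0, 0],  # Denis Gage Deane-Tanner
]
print(mean_reciprocal_rank(judgments))
```

Because every query has a relevant result in first position, each reciprocal rank is 1 and so is the mean.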

1. Documentaries showcasing indigenous peoples' survival and daily life in Arctic regions

The top 5 results for this query are:
Nanook of the North
In the Land of the Head Hunters
The Frozen North
From Leadville to the Aspen
Chang: A Drama of the Wilderness

Out of these five results,

Nanook of the North- RELEVANT
In the Land of the Head Hunters- IRRELEVANT
The Frozen North- RELEVANT
From Leadville to the Aspen- IRRELEVANT
Chang: A Drama of the Wilderness- RELEVANT

For this query, the first relevant result is at position 1, so the reciprocal rank is 1/1 = 1.

2. Western romance

The top 5 results for this query are: Romance, Wild and Woolly, Bucking Broadway, The Enchanted Cottage, A Romance of Happy Valley

Out of these five results,
Romance- RELEVANT
Wild and Woolly- IRRELEVANT
Bucking Broadway- RELEVANT
The Enchanted Cottage- RELEVANT
A Romance of Happy Valley- IRRELEVANT (Out of five results, 1,3,4- relevant and 2,5- irrelevant)

For this query, the first relevant result is at position 1, so the reciprocal rank is 1/1 = 1.


3. Silent film about a Parisian star moving to Egypt, leaving her husband for a baron, and later reconciling after finding her family in poverty in Cairo

The top 5 results for this query are: Sahara, Mothers Cry, The House with Closed Shutters, A Busy Day, The Suburbanite

Out of these five results,
Sahara- RELEVANT
Mothers Cry- IRRELEVANT
The House with Closed Shutters- IRRELEVANT
A Busy Day- IRRELEVANT
The Suburbanite- IRRELEVANT       (1- relevant, 2,3,4,5- irrelevant)

For this query, the first relevant result is at position 1, so the reciprocal rank is 1/1 = 1.


4. Comedy film, office disguises, boss's daughter, elopement

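The top 5 results for this query are: Ask Father, Caught in a Cabaret, The Extra Girl, Mabel's Blunder, Amarilly of Clothes-Line Alley

Out of these five results,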
Ask Father- RELEVANT
Caught in a Cabaret- RELEVANT
The Extra Girl- IRRELEVANT
Mabel's Blunder- RELEVANT
Amarilly of Clothes-Line Alley- IRRELEVANT       (1,2,4- relevant, 3,5- irrelevant)

For this query, the first relevant result is at position 1, so the reciprocal rank is 1/1 = 1.


5. Lost film, Cleopatra charms Caesar, plots world rule, treasures from mummy, revels with Antony, tragic end with serpent in Alexandria.

The top 5 results for this query are: Cleopatra, A Daughter of the Gods, Disraeli, A Splendid Hazard, The Sorrows of Saturn

Out of these five results,
Cleopatra- RELEVANT
A Daughter of the Gods- IRRELEVANT
Disraeli- IRRELEVANT
A Splendid Hazard- IRRELEVANT
The Sorrows of Saturn- IRRELEVANT     (1 -relevant, 2,3,4,5 -irrelevant)

For this query, the first relevant result is at position 1, so the reciprocal rank is 1/1 = 1.


6. Denis Gage Deane-Tanner

The top 5 results for this query are: Captain Alvarez, Near the Rainbow's End, A Man from Wyoming, The Wolf Song, Tenderloin
Out of these five results,

Captain Alvarez- RELEVANT
Near the Rainbow's End- RELEVANT
A Man from Wyoming- IRRELEVANT
The Wolf Song- IRRELEVANT
Tenderloin- IRRELEVANT              (1,2 -relevant, 3,4,5 -irrelevant)

For this query, the first relevant result is at position 1, so the reciprocal rank is 1/1 = 1.

After calculating the individual reciprocal ranks (RR), we take their mean to get the MRR for the whole evaluation. Here we have 6 queries, so

MRR = [RR(Query1) + RR(Query2) + RR(Query3) + RR(Query4) + RR(Query5) + RR(Query6)] / total queries
= (1+1+1+1+1+1)/6
= 1

The MRR for this data is 1.