# Simple Self RAG Notebook

<a href="https://colab.research.google.com/github/run-llama/llama-hub/blob/main/llama_hub/llama_packs/self_rag/self_rag.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This LlamaPack implements the short-form variant of the [Self-RAG paper by Asai et al.](https://arxiv.org/pdf/2310.11511.pdf).

Self-Reflective Retrieval-Augmented Generation (SELF-RAG) is a framework that aims to enhance the quality and factuality of large language models (LLMs) by combining retrieval and self-reflection mechanisms.

The implementation is adapted from the authors' [implementation](https://github.com/AkariAsai/self-rag).
A full notebook guide can be found [here](https://github.com/run-llama/llama-hub/blob/main/llama_hub/llama_packs/self_rag/self_rag.ipynb).
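
To make the combination of retrieval and self-reflection more concrete, here is a minimal sketch of the short-form control flow: decide whether retrieval is needed, generate one candidate answer per retrieved paragraph, and keep the candidate with the highest critique score. It is a toy illustration, not the pack's or the authors' code. `Candidate`, `self_rag_short_form`, and the `toy_*` stand-ins are all hypothetical names, and the scores are made up.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    """One answer generated from a single retrieved paragraph (hypothetical type)."""
    text: str
    score: float


def self_rag_short_form(
    question: str,
    needs_retrieval: Callable[[str], bool],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, Optional[str]], Candidate],
) -> str:
    """Toy control flow: retrieve only if needed, then keep the best-scored answer."""
    if not needs_retrieval(question):
        # No retrieval: answer directly from the model.
        return generate(question, None).text
    # One candidate per retrieved paragraph, each with its own critique score.
    candidates = [generate(question, paragraph) for paragraph in retrieve(question)]
    if not candidates:
        # Nothing was retrieved, so fall back to a direct answer.
        return generate(question, None).text
    return max(candidates, key=lambda c: c.score).text


# Hypothetical stand-ins for the model and the retriever, for illustration only.
def toy_needs_retrieval(question: str) -> bool:
    return "penguin" in question.lower()


def toy_retrieve(question: str) -> List[str]:
    return ["Little Blue Penguins stand about 40 cm tall.", "Penguins cannot fly."]


def toy_generate(question: str, paragraph: Optional[str]) -> Candidate:
    if paragraph is None:
        return Candidate("answer without retrieval", 0.0)
    # Pretend that a paragraph mentioning a height supports the answer better.
    score = 1.0 if "cm" in paragraph else 0.5
    return Candidate(f"answer based on: {paragraph}", score)


print(
    self_rag_short_form(
        "How tall is the smallest penguin?",
        toy_needs_retrieval,
        toy_retrieve,
        toy_generate,
    )
)
```

In the real pack the retrieval decision, the answers, and the scores all come from the Self-RAG model itself; the callables here only mark where those steps would plug in.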


## Setup

In [1]:
from llama_index import Document, VectorStoreIndex
from llama_index.retrievers import VectorIndexRetriever

# Create documents
documents = [
    Document(
        text="A group of penguins, known as a 'waddle' on land, shuffled across the Antarctic ice, their tuxedo-like plumage standing out against the snow."
    ),
    Document(
        text="Emperor penguins, the tallest of all penguin species, can dive deeper than any other bird, reaching depths of over 500 meters."
    ),
    Document(
        text="Penguins' black and white coloring is a form of camouflage called countershading; from above, their black back blends with the ocean depths, and from below, their white belly matches the bright surface."
    ),
    Document(
        text="Despite their upright stance, penguins are birds that cannot fly; their wings have evolved into flippers, making them expert swimmers."
    ),
    Document(
        text="The fastest species, the Gentoo penguin, can swim up to 36 kilometers per hour, using their flippers and streamlined bodies to slice through the water."
    ),
    Document(
        text="Penguins are social birds; many species form large colonies for breeding, which can number in the tens of thousands."
    ),
    Document(
        text="Intriguingly, penguins have excellent hearing and rely on distinct calls to identify their mates and chicks amidst the noisy colonies."
    ),
    Document(
        text="The smallest penguin species, the Little Blue Penguin, stands just about 40 cm tall and is found along the coastlines of southern Australia and New Zealand."
    ),
    Document(
        text="During the breeding season, male Emperor penguins endure the harsh Antarctic winter for months, fasting and incubating their eggs, while females hunt at sea."
    ),
    Document(
        text="Penguins consume a variety of seafood; their diet mainly consists of fish, squid, and krill, which they catch on their diving expeditions."
    ),
]

index = VectorStoreIndex.from_documents(documents)

# Set up a simple retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,
)

## Load Pack / Setup

Now we use `download_llama_pack` to load the Self-RAG LlamaPack (you can also import the module directly if you are using the llama-hub package).

We also enable verbose output (`verbose=True`) on the query engine so we can observe the intermediate steps.

In [16]:
# Option: if developing with the llama_hub package
# from base import SelfRAGQueryEngine


# Option: download llama_pack
from llama_index.llama_pack import download_llama_pack

download_llama_pack(
    "SelfRAGPack",
    "./self_rag_pack",
    skip_load=True,
)
from self_rag_pack.base import SelfRAGQueryEngine

In [10]:
# Download the self-RAG model
download_dir = "/home/mmaatouk/tmp"  # Replace with your own download directory
!pip3 install -q huggingface-hub
!huggingface-cli download m4r1/selfrag_llama2_7b-GGUF selfrag_llama2_7b.q4_k_m.gguf --local-dir {download_dir} --local-dir-use-symlinks False

Consider using `hf_transfer` for faster downloads. This solution comes with some limitations. See https://huggingface.co/docs/huggingface_hub/hf_transfer for more details.
downloading https://huggingface.co/m4r1/selfrag_llama2_7b-GGUF/resolve/main/selfrag_llama2_7b.q4_k_m.gguf to /home/mmaatouk/.cache/huggingface/hub/tmpdqmfpera
selfrag_llama2_7b.q4_k_m.gguf: 100%|███████| 4.08G/4.08G [02:37<00:00, 25.9MB/s]
/home/mmaatouk/tmp/selfrag_llama2_7b.q4_k_m.gguf


In [18]:
from pathlib import Path

model_path = Path(download_dir) / "selfrag_llama2_7b.q4_k_m.gguf"
query_engine = SelfRAGQueryEngine(str(model_path), retriever, verbose=True)

llama_model_loader: loaded meta data with 21 key-value pairs and 291 tensors from /home/mmaatouk/tmp/selfrag_llama2_7b.q4_k_m.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = LLaMA v2
llama_model_loader: - kv   2:                       llama.context_length u32              = 4096
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   4:                          llama.block_count u32              = 32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 11008
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv   7:                 llama.attention.

## Try out some Queries

Now let's try out our `SelfRAGQueryEngine`!


In [19]:
# No retrieval example
response = query_engine.query("Which genre the book pride and prejudice?")


llama_print_timings:        load time =    1582.98 ms
llama_print_timings:      sample time =       6.87 ms /    22 runs   (    0.31 ms per token,  3201.40 tokens per second)
llama_print_timings: prompt eval time =    1582.02 ms /    24 tokens (   65.92 ms per token,    15.17 tokens per second)
llama_print_timings:        eval time =    2685.22 ms /    21 runs   (  127.87 ms per token,     7.82 tokens per second)
llama_print_timings:       total time =    4364.67 ms /    45 tokens


Final answer: The book "Pride and Prejudice" is a romantic novel by Jane Austen.

In [20]:
# Retrieval example
response = query_engine.query("How tall is the smallest penguins?")

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1582.98 ms
llama_print_timings:      sample time =      16.08 ms /    50 runs   (    0.32 ms per token,  3108.68 tokens per second)
llama_print_timings: prompt eval time =    1005.45 ms /    16 tokens (   62.84 ms per token,    15.91 tokens per second)
llama_print_timings:        eval time =    6345.52 ms /    49 runs   (  129.50 ms per token,     7.72 tokens per second)
llama_print_timings:       total time =    7517.03 ms /    65 tokens


Retreival required
Received: 10 documents
Start evaluation

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1582.98 ms
llama_print_timings:      sample time =      13.51 ms /    43 runs   (    0.31 ms per token,  3183.53 tokens per second)
llama_print_timings: prompt eval time =    2447.83 ms /    39 tokens (   62.76 ms per token,    15.93 tokens per second)
llama_print_timings:        eval time =    5438.94 ms /    42 runs   (  129.50 ms per token,     7.72 tokens per second)
llama_print_timings:       total time =    8188.26 ms /    81 tokens


Input: ### Instruction:
How tall is the smallest penguins?

### Response:
[Retrieval]<paragraph>The smallest penguin species, the Little Blue Penguin, stands just about 40 cm tall and is found along the coastlines of southern Australia and New Zealand.</paragraph>
Prediction: [Relevant]The smallest penguin species is the Little Blue Penguin (also known as the Fairy Penguin), which can grow to be around 40 centimeters in height.[Fully supported][Utility:5]
Score: 2.4709723458196087
[0m[1;3;34m1/10 paragraphs done

[0m

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1582.98 ms
llama_print_timings:      sample time =       8.51 ms /    26 runs   (    0.33 ms per token,  3054.15 tokens per second)
llama_print_timings: prompt eval time =    2431.51 ms /    37 tokens (   65.72 ms per token,    15.22 tokens per second)
llama_print_timings:        eval time =    3271.24 ms /    25 runs   (  130.85 ms per token,     7.64 tokens per second)
llama_print_timings:       total time =    5901.59 ms /    62 tokens


Input: ### Instruction:
How tall is the smallest penguins?

### Response:
[Retrieval]<paragraph>Emperor penguins, the tallest of all penguin species, can dive deeper than any other bird, reaching depths of over 500 meters.</paragraph>
Prediction: [Relevant]The smallest penguin species is the Emperor Penguin (Aptenodytes forsteri).[Fully supported][Utility:5]
Score: 2.1767850110288887
[0m[1;3;34m2/10 paragraphs done

[0m

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1582.98 ms
llama_print_timings:      sample time =       8.62 ms /    26 runs   (    0.33 ms per token,  3016.59 tokens per second)
llama_print_timings: prompt eval time =    2846.05 ms /    43 tokens (   66.19 ms per token,    15.11 tokens per second)
llama_print_timings:        eval time =    3340.62 ms /    25 runs   (  133.62 ms per token,     7.48 tokens per second)
llama_print_timings:       total time =    6433.70 ms /    68 tokens


Input: ### Instruction:
How tall is the smallest penguins?

### Response:
[Retrieval]<paragraph>A group of penguins, known as a 'waddle' on land, shuffled across the Antarctic ice, their tuxedo-like plumage standing out against the snow.</paragraph>
Prediction: [Relevant]The smallest penguin species is the African or little penguin (Eudyptula minor).[No support / Contradictory][Utility:5]
Score: 1.5998614571701189
[0m[1;3;34m3/10 paragraphs done

[0m

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1582.98 ms
llama_print_timings:      sample time =       6.24 ms /    18 runs   (    0.35 ms per token,  2885.54 tokens per second)
llama_print_timings: prompt eval time =    2461.25 ms /    37 tokens (   66.52 ms per token,    15.03 tokens per second)
llama_print_timings:        eval time =    2272.68 ms /    17 runs   (  133.69 ms per token,     7.48 tokens per second)
llama_print_timings:       total time =    4892.65 ms /    54 tokens


Input: ### Instruction:
How tall is the smallest penguins?

### Response:
[Retrieval]<paragraph>Despite their upright stance, penguins are birds that cannot fly; their wings have evolved into flippers, making them expert swimmers.</paragraph>
Prediction: [Relevant]The height of a penguin varies depending on the species.[No support / Contradictory][Utility:5]
Score: 1.4486356991581153
[0m[1;3;34m4/10 paragraphs done

[0m

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1582.98 ms
llama_print_timings:      sample time =      13.34 ms /    39 runs   (    0.34 ms per token,  2923.10 tokens per second)
llama_print_timings: prompt eval time =    2735.91 ms /    41 tokens (   66.73 ms per token,    14.99 tokens per second)
llama_print_timings:        eval time =    5088.15 ms /    38 runs   (  133.90 ms per token,     7.47 tokens per second)
llama_print_timings:       total time =    8140.45 ms /    79 tokens


Input: ### Instruction:
How tall is the smallest penguins?

### Response:
[Retrieval]<paragraph>The fastest species, the Gentoo penguin, can swim up to 36 kilometers per hour, using their flippers and streamlined bodies to slice through the water.</paragraph>
Prediction: [Relevant]The smallest penguin species is the African or little penguin (also known as the jackass penguin).[No support / Contradictory][Continue to Use Evidence]They are about 17 inches tall.[Utility:5]
Score: 1.4687150930489146
[0m[1;3;34m5/10 paragraphs done

[0m

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1582.98 ms
llama_print_timings:      sample time =      13.66 ms /    38 runs   (    0.36 ms per token,  2781.64 tokens per second)
llama_print_timings: prompt eval time =    3413.28 ms /    50 tokens (   68.27 ms per token,    14.65 tokens per second)
llama_print_timings:        eval time =    4859.28 ms /    37 runs   (  131.33 ms per token,     7.61 tokens per second)
llama_print_timings:       total time =    8526.62 ms /    87 tokens


Input: ### Instruction:
How tall is the smallest penguins?

### Response:
[Retrieval]<paragraph>Penguins' black and white coloring is a form of camouflage called countershading; from above, their black back blends with the ocean depths, and from below, their white belly matches the bright surface.</paragraph>
Prediction: [Relevant]The smallest penguin species is the African or little penguin (Eudyptula minor), which can grow to be about 17 inches tall.[No support / Contradictory][Utility:5]
Score: 1.6357659323827645
[0m[1;3;34m6/10 paragraphs done

[0m

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1582.98 ms
llama_print_timings:      sample time =       8.49 ms /    25 runs   (    0.34 ms per token,  2943.25 tokens per second)
llama_print_timings: prompt eval time =    1744.16 ms /    24 tokens (   72.67 ms per token,    13.76 tokens per second)
llama_print_timings:        eval time =    3862.56 ms /    24 runs   (  160.94 ms per token,     6.21 tokens per second)
llama_print_timings:       total time =    5867.77 ms /    48 tokens


Input: ### Instruction:
How tall is the smallest penguins?

### Response:
[Retrieval]<paragraph>Penguins are social birds; many species form large colonies for breeding, which can number in the tens of thousands.</paragraph>
Prediction: [Relevant]The smallest penguin is the African or little penguin (Eudyptula minor).[No support / Contradictory][Utility:5]
Score: 1.4935304338695037
[0m[1;3;34m7/10 paragraphs done

[0m

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1582.98 ms
llama_print_timings:      sample time =       9.34 ms /    26 runs   (    0.36 ms per token,  2782.83 tokens per second)
llama_print_timings: prompt eval time =    2438.93 ms /    33 tokens (   73.91 ms per token,    13.53 tokens per second)
llama_print_timings:        eval time =    3521.54 ms /    25 runs   (  140.86 ms per token,     7.10 tokens per second)
llama_print_timings:       total time =    6158.74 ms /    58 tokens


Input: ### Instruction:
How tall is the smallest penguins?

### Response:
[Retrieval]<paragraph>Intriguingly, penguins have excellent hearing and rely on distinct calls to identify their mates and chicks amidst the noisy colonies.</paragraph>
Prediction: [Relevant]The smallest penguin species is the African or little penguin (Eudyptula minor).[No support / Contradictory][Utility:5]
Score: 1.4074488783945505
[0m[1;3;34m8/10 paragraphs done

[0m

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1582.98 ms
llama_print_timings:      sample time =       9.03 ms /    26 runs   (    0.35 ms per token,  2878.02 tokens per second)
llama_print_timings: prompt eval time =    2850.73 ms /    41 tokens (   69.53 ms per token,    14.38 tokens per second)
llama_print_timings:        eval time =    3430.31 ms /    25 runs   (  137.21 ms per token,     7.29 tokens per second)
llama_print_timings:       total time =    6558.69 ms /    66 tokens


Input: ### Instruction:
How tall is the smallest penguins?

### Response:
[Retrieval]<paragraph>During the breeding season, male Emperor penguins endure the harsh Antarctic winter for months, fasting and incubating their eggs, while females hunt at sea.</paragraph>
Prediction: [Relevant]The smallest penguin species is the Emperor Penguin (Aptenodytes forsteri).[No support / Contradictory][Utility:5]
Score: 1.415058228804781
[0m[1;3;34m9/10 paragraphs done

[0m

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1582.98 ms
llama_print_timings:      sample time =       7.27 ms /    20 runs   (    0.36 ms per token,  2752.55 tokens per second)
llama_print_timings: prompt eval time =    2766.95 ms /    37 tokens (   74.78 ms per token,    13.37 tokens per second)
llama_print_timings:        eval time =    2538.61 ms /    19 runs   (  133.61 ms per token,     7.48 tokens per second)
llama_print_timings:       total time =    5471.43 ms /    56 tokens


Input: ### Instruction:
How tall is the smallest penguins?

### Response:
[Retrieval]<paragraph>Penguins consume a variety of seafood; their diet mainly consists of fish, squid, and krill, which they catch on their diving expeditions.</paragraph>
Prediction: [Relevant]The height of the smallest penguin species can vary depending on the species.[No support / Contradictory][Utility:5]
Score: 1.4213598342974365
[0m[1;3;34m10/10 paragraphs done

[0m[1;3;34mEnd evaluation
[0m[1;3;34mSelected the best answer: [Relevant]The smallest penguin species is the Little Blue Penguin (also known as the Fairy Penguin), which can grow to be around 40 centimeters in height.[Fully supported][Utility:5]
[0m[1;3;32mFinal answer: The smallest penguin species is the Little Blue Penguin (also known as the Fairy Penguin), which can grow to be around 40 centimeters in height.
[0m
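
The verbose log shows each candidate carrying bracketed reflection tokens and a score, with the top-scoring candidate becoming the final answer. The sketch below reproduces that last step on three of the logged predictions. It is an illustration rather than the pack's actual code: `strip_reflection_tokens` and `utility_of` are hypothetical helpers, and the regular expression removes any bracketed span, so it would also drop legitimate brackets inside an answer.

```python
import re
from typing import Optional

# Matches bracketed reflection tokens such as [Relevant] or [Utility:5].
REFLECTION_TOKEN = re.compile(r"\[[^\[\]]*\]")


def strip_reflection_tokens(prediction: str) -> str:
    """Remove every bracketed token and trim surrounding whitespace."""
    return REFLECTION_TOKEN.sub("", prediction).strip()


def utility_of(prediction: str) -> Optional[int]:
    """Return the digit inside a [Utility:N] token, or None if there is none."""
    match = re.search(r"\[Utility:(\d)\]", prediction)
    return int(match.group(1)) if match else None


# (prediction, score) pairs copied from the log for paragraphs 1 to 3.
scored = [
    (
        "[Relevant]The smallest penguin species is the Little Blue Penguin "
        "(also known as the Fairy Penguin), which can grow to be around "
        "40 centimeters in height.[Fully supported][Utility:5]",
        2.4709723458196087,
    ),
    (
        "[Relevant]The smallest penguin species is the Emperor Penguin "
        "(Aptenodytes forsteri).[Fully supported][Utility:5]",
        2.1767850110288887,
    ),
    (
        "[Relevant]The smallest penguin species is the African or little "
        "penguin (Eudyptula minor).[No support / Contradictory][Utility:5]",
        1.5998614571701189,
    ),
]

# Pick the candidate with the highest score, then clean it up for display.
best_prediction, best_score = max(scored, key=lambda pair: pair[1])
final_answer = strip_reflection_tokens(best_prediction)
print(final_answer)
```

Note that the second candidate is marked `[Fully supported]` even though its answer is wrong, which is why ranking by the combined score, rather than by a single token, matters here.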