In [1]:
# Retrieval-Augmented Generation (RAG)

import os
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
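# load_dotenv() has already copied .env values into os.environ.
# Fail early with a clear message instead of a TypeError when the key is missing.
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")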

In [3]:
from llama_index import VectorStoreIndex, SimpleDirectoryReader
# Read every file in ./data; each PDF page becomes one Document
documents = SimpleDirectoryReader("data").load_data()


In [4]:
documents

[Document(id_='26d40bba-0160-4fc3-a7fe-83888bd2e65c', embedding=None, metadata={'page_label': '1', 'file_name': 'ROUGE.pdf', 'file_path': 'data\\ROUGE.pdf', 'file_type': 'application/pdf', 'file_size': 77520, 'creation_date': '2024-01-30', 'last_modified_date': '2023-12-30', 'last_accessed_date': '2024-01-30'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text='ROUGE : A Package for Automatic Evaluation of Summaries  \nChin -Yew Lin  \nInformation Sciences Institute  \nUniversity of Southern California  \n4676 Admiralty Way  \nMarina del Rey, CA  90292  \ncyl@isi.edu  \n \nAbstract  \nROUGE  stands for Recall -Oriented Unde rstudy for \nGisting Evaluation. It includes  measures  to aut o-\nmatically determine the quality of a summary by \ncomparing it to ot

In [5]:
index = VectorStoreIndex.from_documents(documents, show_progress=True)

Parsing nodes: 100%|██████████| 38/38 [00:00<00:00, 50.03it/s]
Generating embeddings: 100%|██████████| 61/61 [00:06<00:00,  9.15it/s]
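
The progress bars above report 38 parsed documents but 61 embedded nodes, which suggests that longer pages were split into more than one chunk before embedding. The sketch below illustrates that idea with a plain word-window splitter. It is an assumption-laden illustration, not LlamaIndex's actual splitter: `chunk_words`, `chunk_size`, and `overlap` are hypothetical names, and the real node parser works on tokens and sentence boundaries.

```python
def chunk_words(text, chunk_size=8, overlap=2):
    """Split text into word windows of chunk_size that share `overlap` words.

    Hypothetical helper for illustration only; not part of llama_index.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks


# A toy 20-word "page": one document in, several nodes out
page = " ".join(f"w{i}" for i in range(20))
nodes = chunk_words(page, chunk_size=8, overlap=2)
print(len(nodes), "chunks")
```

The overlap keeps a little shared context between neighbouring chunks, so a sentence cut at a boundary is still partly visible in both.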


In [6]:
index

<llama_index.indices.vector_store.base.VectorStoreIndex at 0x2924fa45d80>

In [7]:
query_engine = index.as_query_engine()

In [8]:
query_engine

<llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine at 0x2924fa47d60>

In [9]:
response = query_engine.query("What is ROUGE ?")

In [10]:
print(response)

ROUGE stands for Recall-Oriented Understudy for Gisting Evaluation. It is a package used for the automatic evaluation of summaries. ROUGE includes measures that compare a computer-generated summary to ideal summaries created by humans to determine the quality of the summary. These measures count the number of overlapping units such as n-grams, word sequences, and word pairs. ROUGE introduces four different measures: ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S. These measures have been used in large-scale summarization evaluations sponsored by NIST.


In [11]:
response = query_engine.query("What is SuperGLUE ?")

In [12]:
print(response)

SuperGLUE is a benchmark that is designed to evaluate the performance of language understanding systems. It consists of a set of difficult language understanding tasks, a software toolkit, and a public leaderboard. The goal of SuperGLUE is to provide a more challenging benchmark compared to the previous GLUE benchmark, which has been surpassed by the performance of non-expert humans. SuperGLUE aims to test a system's ability to understand and reason about English texts, and the tasks included in the benchmark are designed to be solvable by most college-educated English speakers without requiring domain-specific knowledge.


In [14]:
from llama_index.response.pprint_utils import pprint_response
pprint_response(response, show_source=True)

Final Response: SuperGLUE is a benchmark that is designed to evaluate
the performance of language understanding systems. It consists of a
set of difficult language understanding tasks, a software toolkit, and
a public leaderboard. The goal of SuperGLUE is to provide a more
challenging benchmark compared to the previous GLUE benchmark, which
has been surpassed by the performance of non-expert humans. SuperGLUE
aims to test a system's ability to understand and reason about English
texts, and the tasks included in the benchmark are designed to be
solvable by most college-educated English speakers without requiring
domain-specific knowledge.
______________________________________________________________________
Source Node 1/2
Node ID: 16621d78-75f8-4aa4-b9f1-5579d68d7bad
Similarity: 0.8214322531137586
Text: SuperGLUE: A Stickier Benchmark for General-Purpose Language
Understanding Systems Alex Wang∗ New York UniversityYada
Pruksachatkun∗ New York UniversityNikita Nangia∗ New York Universi

In [15]:
from llama_index.retrievers import VectorIndexRetriever
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.indices.postprocessor import SimilarityPostprocessor

# Retrieve the 3 most similar nodes per query
retriever = VectorIndexRetriever(index=index, similarity_top_k=3)

query_engine = RetrieverQueryEngine(retriever=retriever)
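
To make `similarity_top_k` concrete, here is a minimal sketch of top-k retrieval by cosine similarity over toy vectors. The node names and 3-dimensional embeddings are invented for illustration, and `cosine_similarity` and `top_k` are hypothetical helpers; the real retriever uses the embeddings stored in the index and may score them differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, node_vecs, k=3):
    """Return the k (node_id, score) pairs with the highest similarity."""
    scored = [
        (node_id, cosine_similarity(query_vec, vec))
        for node_id, vec in node_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (assumed values, not real model output)
toy_nodes = {
    "rouge_p1": [1.0, 0.0, 0.0],
    "superglue_p1": [0.0, 1.0, 0.0],
    "superglue_p2": [0.0, 0.8, 0.6],
}
hits = top_k([0.0, 1.0, 0.0], toy_nodes, k=2)
for node_id, score in hits:
    print(node_id, round(score, 3))
```

Raising `k` gives the LLM more context to synthesize from, at the cost of a longer prompt and possibly less relevant nodes.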

In [16]:
response = query_engine.query("What is SuperGLUE ?")

In [17]:
pprint_response(response, show_source=True)
print(response)

Final Response: SuperGLUE is a new benchmark that is designed to
provide a more rigorous test of language understanding. It consists of
a public leaderboard with eight language understanding tasks,
including tasks like coreference resolution and question answering.
SuperGLUE aims to pose more challenging tasks that go beyond the scope
of current state-of-the-art systems, but are solvable by most college-
educated English speakers. It also includes comprehensive human
baselines for all benchmark tasks and comes with an improved code
support toolkit for pretraining, multi-task learning, and transfer
learning in NLP.
______________________________________________________________________
Source Node 1/3
Node ID: 16621d78-75f8-4aa4-b9f1-5579d68d7bad
Similarity: 0.8214322531137586
Text: SuperGLUE: A Stickier Benchmark for General-Purpose Language
Understanding Systems Alex Wang∗ New York UniversityYada
Pruksachatkun∗ New York UniversityNikita Nangia∗ New York University
Amanpreet Singh∗ Face

In [18]:
postprocessor = SimilarityPostprocessor(similarity_cutoff=0.80)
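
Conceptually, a similarity cutoff drops retrieved nodes whose score falls below the threshold before the response is synthesized. The sketch below shows that filtering step in isolation. `ScoredNode` and `apply_cutoff` are hypothetical stand-ins, the scores are made up, and treating a score equal to the cutoff as kept is an assumption about the library's behaviour.

```python
from dataclasses import dataclass


@dataclass
class ScoredNode:
    """Hypothetical stand-in for a retrieved node and its similarity score."""
    node_id: str
    score: float


def apply_cutoff(nodes, cutoff):
    """Keep only nodes whose score is at least `cutoff` (assumed inclusive)."""
    return [node for node in nodes if node.score >= cutoff]


# Made-up scores for three retrieved nodes
candidates = [
    ScoredNode("node-a", 0.82),
    ScoredNode("node-b", 0.80),
    ScoredNode("node-c", 0.74),
]
kept = apply_cutoff(candidates, cutoff=0.80)
print([node.node_id for node in kept])
```

If every retrieved node scores below the cutoff, nothing is passed on, so a cutoff that is too strict can leave the query engine with no context at all.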


In [19]:
query_engine = RetrieverQueryEngine(retriever=retriever, node_postprocessors=[postprocessor])

In [20]:
response = query_engine.query("What is use of SuperGLUE ?")

In [21]:
pprint_response(response, show_source=True)
print(response)

Final Response: SuperGLUE is a benchmark that is designed to provide a
simple and robust evaluation metric for any method capable of being
applied to a broad range of language understanding tasks. It offers a
new set of more difficult language understanding tasks, along with a
software toolkit and a public leaderboard. The goal of SuperGLUE is to
test a system's ability to understand and reason about texts written
in English, while also being beyond the scope of current state-of-the-
art systems. It excludes tasks that require domain-specific knowledge
and aims to be solvable by most college-educated English speakers. The
SuperGLUE benchmark can be used to evaluate custom models and training
methods on the benchmark tasks, and it supports popular pretrained
models such as OpenAI GPT and BERT.
______________________________________________________________________
Source Node 1/3
Node ID: 16621d78-75f8-4aa4-b9f1-5579d68d7bad
Similarity: 0.8123687903359365
Text: SuperGLUE: A Stickier Benc

In [None]:
import os.path
from llama_index import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

# Check whether a persisted index already exists
PERSIST_DIR = "./storage"
if not os.path.exists(PERSIST_DIR):
    # Load the documents and build the index
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    # Persist it so later runs can skip re-embedding
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # Load the existing index from disk
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)

# query the index
query_engine = index.as_query_engine()
response = query_engine.query("What is SuperGLUE and what is it used for?")
print(response)