[Tutorial link](https://docs.llamaindex.ai/en/stable/understanding/querying/querying/)

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import sys
import os
import openai
from dotenv import load_dotenv, find_dotenv

In [12]:
def navigate_up(path, levels):
    """Navigate up `levels` directories from the given path."""
    for _ in range(levels):
        path = os.path.dirname(path)
    return path

In [13]:
# Get the current working directory
llamaindex_dir = os.getcwd()
# Get the parent directory
llamaindex_dir = os.path.dirname(llamaindex_dir)

sys.path.append(os.path.join(llamaindex_dir, "utils"))
sys.path.append(os.path.join(navigate_up(llamaindex_dir, 2), "law-sec-insights", "backend"))
# sys.path

_ = load_dotenv(find_dotenv())  # read local .env file

openai.api_key = os.getenv("OPENAI_API_KEY")

In [14]:
from llamaindex_utils import *

In [15]:
import logging
import sys

# logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.basicConfig(stream=sys.stdout, level=logging.WARNING)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

# Load data and build an index + Storing your index + Query your data

In [6]:
# from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index import VectorStoreIndex, SimpleDirectoryReader, StorageContext, load_index_from_storage, get_response_synthesizer
from llama_index.retrievers import VectorIndexRetriever
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.postprocessor import SimilarityPostprocessor

# documents = SimpleDirectoryReader(llamaindex_dir + "/data").load_data()
# index = VectorStoreIndex.from_documents(documents)

In [7]:
# check if storage already exists
PERSIST_DIR = "./storage"
if not os.path.exists(PERSIST_DIR):
    # load the documents and create the index
    documents = SimpleDirectoryReader(llamaindex_dir + "/data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    # persist the index (including its embeddings) to disk for later runs;
    # persist() saves to ./storage by default, or to persist_dir if one is passed
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # load the existing index
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)

```python
# Either way we can now query the index
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
```

How to get a lot of data when there are relevant results, but potentially no data when nothing is relevant:
- Customize the retriever to use a different value for `similarity_top_k`.
  - For a custom retriever, we use `RetrieverQueryEngine`.
- Add a post-processing step that requires retrieved nodes to reach a minimum similarity score to be included.
  - For the post-processing step, we use `SimilarityPostprocessor`.
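
The effect of combining a large top-k with a similarity cutoff can be sketched in plain Python. This is a toy illustration, not LlamaIndex's implementation: the 2-D "embeddings", the helper names (`cosine`, `retrieve`, `apply_cutoff`), and the texts are all made up for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, nodes, top_k):
    """Return up to top_k (score, text) pairs, best score first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in nodes]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

def apply_cutoff(scored, cutoff):
    """Keep only the pairs whose score reaches the cutoff."""
    return [(score, text) for score, text in scored if score >= cutoff]

# Hypothetical nodes with made-up 2-D embeddings.
nodes = [
    ("writing before college", [1.0, 0.1]),
    ("programming before college", [0.9, 0.3]),
    ("unrelated passage", [0.0, 1.0]),
]

relevant_query = [1.0, 0.2]     # points roughly the same way as the first two nodes
unrelated_query = [-1.0, -0.1]  # points away from every node

kept = apply_cutoff(retrieve(relevant_query, nodes, top_k=10), cutoff=0.7)
none_kept = apply_cutoff(retrieve(unrelated_query, nodes, top_k=10), cutoff=0.7)

print([text for _, text in kept])
print(none_kept)
```

With a generous `top_k`, the cutoff alone decides how much survives: both close nodes are kept for the relevant query, and the list is empty for the unrelated one.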
 

[Response Synthesizer](https://docs.llamaindex.ai/en/stable/module_guides/querying/response_synthesizers/): A Response Synthesizer generates a response from an LLM, using a user query and a given set of text chunks. The output of a response synthesizer is a Response object. When used in a query engine, the response synthesizer runs after nodes are retrieved from a retriever and after any node postprocessors are run.

In [8]:
# build index
# index = VectorStoreIndex.from_documents(documents)

# configure retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,
)

# configure response synthesizer
response_synthesizer = get_response_synthesizer()

# assemble query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
)

# query
response = query_engine.query("What did the author do growing up?")
print(response)

The author worked on writing and programming before college.


In [9]:
# help(RetrieverQueryEngine)

# Try in agent

In [49]:
from llama_index.tools import QueryEngineTool, ToolMetadata

query_engine_tools = [
    QueryEngineTool(
        query_engine=query_engine,
        metadata=ToolMetadata(
            name="lyft_10k",
            description=(
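                # Did not work: this description is about Lyft financials,
                # not the indexed essay.
                # "Provides information about Lyft financials for year 2021. "
                # "Use a detailed plain text question as input to the tool."
                # Works: the description matches the indexed content.
                "Paul Graham essay on What I Worked On"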
            ),
        ),
    )
]

In [50]:
from llama_index.agent import OpenAIAgent

In [51]:
agent = OpenAIAgent.from_tools(query_engine_tools, verbose=True)
# Problem: the agent can fail when choosing the tool

In [53]:
# agent.chat_repl()
response = agent.chat("What did Paul Graham do growing up?")

STARTING TURN 1
---------------

=== Calling Function ===
Calling function: lyft_10k with args: {
  "input": "Paul Graham"
}
Got output: Paul Graham is recognized for his contributions to software development and entrepreneurship, including projects like Viaweb and the development of Bel, a Lisp dialect. Additionally, Graham is known for his thought-provoking essays covering a wide range of topics in the tech industry and startup ecosystem.

STARTING TURN 2
---------------



In [54]:
print(str(response))

Growing up, Paul Graham had a diverse range of interests and experiences. He developed a passion for programming at a young age and became proficient in languages like BASIC, Pascal, and Lisp. He also excelled in mathematics and participated in math competitions. Graham was an avid reader and writer, which led him to pursue a degree in English at Cornell University. He also engaged in entrepreneurial ventures, starting small businesses and running a summer camp. During his academic years at Harvard University, he studied philosophy and computer science, conducting research and innovation projects. Graham's notable contributions include projects like Viaweb, the first web-based application, and the development of Bel, a Lisp dialect. He is also known for his thought-provoking essays on various topics in the tech industry and entrepreneurship.


In [56]:
# response

# Configuring retriever

There is a wide variety of retrievers, described in the [module guide on retrievers](https://docs.llamaindex.ai/en/stable/module_guides/querying/retriever/).

# Configuring node postprocessors

The full list of node postprocessors is documented in the [Node Postprocessor Reference](https://docs.llamaindex.ai/en/stable/api_reference/postprocessor/).

# Configuring response synthesis

After a retriever fetches relevant nodes, a BaseSynthesizer synthesizes the final response by combining the information.
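
To make the synthesis step concrete, here is a minimal sketch of the idea in plain Python. It assumes the simplest possible strategy (pack every chunk into one prompt and call the LLM once); real synthesizers may spread the work over several LLM calls. The `Response` dataclass, `synthesize`, and `fake_llm` are hypothetical names for this example, not LlamaIndex classes or functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Response:
    """Hypothetical stand-in for the object a synthesizer returns."""
    text: str
    source_chunks: List[str] = field(default_factory=list)

def synthesize(query: str, chunks: List[str], llm: Callable[[str], str]) -> Response:
    """Pack the retrieved chunks and the query into one prompt and call the LLM once."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    prompt = (
        "Context:\n"
        f"{context}\n"
        f"Question: {query}\n"
        "Answer using only the context."
    )
    return Response(text=llm(prompt), source_chunks=list(chunks))

def fake_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; reports how many context chunks it saw."""
    n_chunks = sum(1 for line in prompt.splitlines() if line.startswith("- "))
    return f"(answer based on {n_chunks} chunks)"

response = synthesize(
    "What did the author do growing up?",
    ["writing before college", "programming before college"],
    fake_llm,
)
print(response.text)
```

Passing the LLM in as a callable keeps the sketch runnable without an API key; swapping `fake_llm` for a real client call would be the only change needed to try it against a model.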

# Structured Outputs

# Creating your own Query Pipeline