[Tutorial link](https://docs.llamaindex.ai/en/stable/understanding/querying/querying/)

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import sys
import os
import openai
from dotenv import load_dotenv, find_dotenv

In [3]:
# Get the current working directory
llamaindex_dir = os.getcwd()
# Get the parent directory
llamaindex_dir = os.path.dirname(llamaindex_dir)

sys.path.append(os.path.join(llamaindex_dir, "utils"))

_ = load_dotenv(find_dotenv())  # read the local .env file

openai.api_key = os.getenv('OPENAI_API_KEY')

In [4]:
from llamaindex_utils import *

In [5]:
import logging
import sys

# logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
# basicConfig already attaches a stdout handler to the root logger;
# adding a second StreamHandler here would print every record twice.
logging.basicConfig(stream=sys.stdout, level=logging.WARNING)

# Load data and build an index + Storing your index + Query your data

In [6]:
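# The imports below use the legacy paths (llama_index < 0.10).
# From 0.10 on, the same names live under llama_index.core, e.g.:
# from llama_index.core import VectorStoreIndex, SimpleDirectoryReader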
from llama_index import VectorStoreIndex, SimpleDirectoryReader, StorageContext, load_index_from_storage, get_response_synthesizer
from llama_index.retrievers import VectorIndexRetriever
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.postprocessor import SimilarityPostprocessor

# documents = SimpleDirectoryReader(llamaindex_dir + "/data").load_data()
# index = VectorStoreIndex.from_documents(documents)

In [7]:
# check if storage already exists
PERSIST_DIR = "./storage"
if not os.path.exists(PERSIST_DIR):
    # load the documents and create the index
    documents = SimpleDirectoryReader(llamaindex_dir + "/data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    # Save the index (including its embeddings) to disk for later runs.
    # By default persist() writes to the directory "storage"; pass persist_dir to change that.
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # load the existing index
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)

```python
# Either way we can now query the index
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
```

How to return many nodes when there are relevant results, but possibly none when nothing is relevant:
- we customize our retriever to use a different number for top_k
  - For a custom retriever, we use `RetrieverQueryEngine`.
- and add a post-processing step that requires that the retrieved nodes reach a minimum similarity score to be included
  - For the post-processing step, we use `SimilarityPostprocessor`
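
The combined effect of a large top_k and a similarity cutoff can be sketched in plain Python. This is only an illustration of the idea, not LlamaIndex's implementation: the `retrieve` helper, the 2-d vectors, and the texts are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, nodes, top_k, similarity_cutoff):
    """Hypothetical helper: keep the top_k best matches, then drop weak ones."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in nodes]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:top_k]  # retriever step: at most top_k candidates
    # post-processing step: only candidates that reach the cutoff survive
    return [(score, text) for score, text in top if score >= similarity_cutoff]


# Made-up "embeddings" for illustration only
nodes = [
    ("wrote short stories", [1.0, 0.1]),
    ("wrote programs", [0.9, 0.3]),
    ("painted still lifes", [0.1, 1.0]),
]

relevant = retrieve([1.0, 0.2], nodes, top_k=10, similarity_cutoff=0.7)
unrelated = retrieve([-1.0, 0.0], nodes, top_k=10, similarity_cutoff=0.7)
print(len(relevant), len(unrelated))  # 2 0
```

With a generous top_k the first query keeps every node that clears the cutoff, while the second query, which matches nothing well, ends up with an empty list.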
 

[Response Synthesizer](https://docs.llamaindex.ai/en/stable/module_guides/querying/response_synthesizers/): A Response Synthesizer generates a response from an LLM, using a user query and a given set of text chunks. Its output is a `Response` object. In a query engine, the response synthesizer runs after the retriever has returned nodes and after any node postprocessors have run.

In [8]:
# build index
# index = VectorStoreIndex.from_documents(documents)

# configure retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,
)

# configure response synthesizer
response_synthesizer = get_response_synthesizer()

# assemble query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
)

# query
response = query_engine.query("What did the author do growing up?")
print(response)

The author engaged in writing and programming activities during their formative years.


In [9]:
# help(RetrieverQueryEngine)

# Configuring retriever

LlamaIndex offers a wide variety of retrievers; see the [module guide on retrievers](https://docs.llamaindex.ai/en/stable/module_guides/querying/retriever/).

# Configuring node postprocessors

The full list of node postprocessors is documented in the [Node Postprocessor Reference](https://docs.llamaindex.ai/en/stable/api_reference/postprocessor/).

# Configuring response synthesis

After a retriever fetches relevant nodes, a `BaseSynthesizer` synthesizes the final response by combining the text of those nodes with the query and sending it to the LLM.
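
As a rough illustration of that step, the sketch below stuffs all retrieved chunks into a single prompt and calls the model once. It is a toy under stated assumptions, not the library's code: `synthesize_compact`, `fake_llm`, the prompt wording, and the placeholder message for an empty result are all hypothetical.

```python
from typing import Callable, List


def synthesize_compact(query: str, chunks: List[str], llm: Callable[[str], str]) -> str:
    """Hypothetical synthesizer: one prompt containing every chunk, one LLM call."""
    if not chunks:
        # Nothing survived retrieval/post-processing, so skip the LLM call
        return "No relevant context found."
    context = "\n---\n".join(chunks)
    prompt = (
        "Context information is below.\n"
        f"{context}\n"
        f"Given the context, answer the query: {query}\n"
    )
    return llm(prompt)


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM; records the prompt it receives."""
    calls.append(prompt)
    return f"answer #{len(calls)}"


answer = synthesize_compact(
    "What did the author do growing up?",
    ["wrote short stories", "wrote programs"],
    fake_llm,
)
print(answer)  # answer #1
```

Passing the LLM in as a callable keeps the sketch testable without an API key; a real synthesizer would also need to handle context that does not fit in one prompt.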

# Structured Outputs

# Creating your own Query Pipeline