# Self-Correcting Query Engines - Evaluation & Retry

In this notebook, we showcase several self-correcting query engines.  
They use an LLM to evaluate the query engine's output and then retry to produce a better response.

In [1]:
# Uncomment to add your OpenAI API key
# import os
# os.environ['OPENAI_API_KEY'] = "INSERT OPENAI KEY"

In [2]:
# Uncomment for debug level logging
# import logging
# import sys

# logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
# logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

## Setup

First we ingest the document.

In [1]:
from llama_index.indices.vector_store.base import VectorStoreIndex
from llama_index.readers.file.base import SimpleDirectoryReader

In [2]:
documents = SimpleDirectoryReader("../data/paul_graham/").load_data()
index = VectorStoreIndex.from_documents(documents)
query = "What did the author do growing up?"

Let's see what the response from the default query engine looks like:

In [3]:
base_query_engine = index.as_query_engine()
response = base_query_engine.query(query)
print(response)


The author grew up writing essays, learning Italian, exploring Florence, painting people, learning about computers, attending RISD, living in a rent-stabilized apartment, building an online store builder, editing Lisp expressions, publishing essays online, writing essays, painting still life, working on spam filters, cooking for groups, and buying a building in Cambridge.


## Retry Query Engine

The retry query engine uses an evaluator to improve the response from a base query engine.  

It does the following:
1. Queries the base query engine.
2. Uses the evaluator to decide whether the response passes.
3. If the response passes, returns it.
4. Otherwise, transforms the original query with the evaluation result (query, response, and feedback) into a new query.
5. Repeats up to `max_retries` times.
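
The steps above can be sketched in plain Python. This is a minimal illustration of the control flow, not the library's implementation: `Evaluation`, `retry_query`, `toy_answer`, and `toy_evaluate` are hypothetical names introduced here, and the wording of the rewritten query mirrors the transformed query printed later in this notebook.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Evaluation:
    """Hypothetical evaluation result: a pass/fail flag plus free-text feedback."""

    passing: bool
    feedback: str


def retry_query(
    query: str,
    answer_fn: Callable[[str], str],
    evaluate_fn: Callable[[str, str], Evaluation],
    max_retries: int = 3,
) -> str:
    """Query, evaluate, and re-query with feedback until the answer passes."""
    response = answer_fn(query)
    for _ in range(max_retries):
        evaluation = evaluate_fn(query, response)
        if evaluation.passing:
            return response
        # Fold the failed answer and the evaluator's feedback into a new query.
        new_query = (
            "Here is a previous bad answer.\n"
            f"{response}\n"
            "Here is some feedback from the evaluator about the response given.\n"
            f"{evaluation.feedback}\n"
            "Now answer the question.\n"
            f"{query}"
        )
        response = answer_fn(new_query)
    # Out of retries: return the last response, whether or not it passed.
    return response


# Toy stand-ins for the base query engine and the evaluator (assumptions).
def toy_answer(q: str) -> str:
    return "short answer" if "feedback" in q else "a very long rambling answer"


def toy_evaluate(q: str, r: str) -> Evaluation:
    ok = len(r) <= 12
    return Evaluation(ok, "" if ok else "The response is too long.")


result = retry_query("What did the author do growing up?", toy_answer, toy_evaluate)
print(result)
```

With the toy functions, the first answer fails the length check, the rewritten query triggers the shorter answer, and that answer passes on the next evaluation.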

In [7]:
from llama_index.query_engine import RetryQueryEngine
from llama_index.evaluation import QueryResponseEvaluator

query_response_evaluator = QueryResponseEvaluator()
retry_query_engine = RetryQueryEngine(base_query_engine, query_response_evaluator)
retry_response = retry_query_engine.query(query)
print(retry_response)


The author grew up writing essays, learning Italian, exploring Florence, painting people, working with computers, attending RISD, living in a rent-controlled apartment, building an online store builder, editing code, launching software, publishing essays online, writing essays, painting still life, working on spam filters, cooking for groups, and buying a building in Cambridge.


## Retry Source Query Engine

The retry source query engine modifies the source nodes used for the query: it filters the existing source nodes based on an LLM evaluation of each node.
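
As a rough sketch of the filtering idea, assuming each source node can be judged independently: the helper names (`filter_sources`, `answer_from_sources`), the keyword-based relevance check, and the sample strings below are all hypothetical stand-ins. In the real engine the judgment comes from the LLM evaluator, and the fallback to the unfiltered sources is a design choice of this sketch only.

```python
from typing import Callable, List, Tuple


def filter_sources(
    query: str,
    sources: List[str],
    is_relevant: Callable[[str, str], bool],
) -> List[str]:
    """Keep only the source texts the evaluator judges relevant to the query."""
    return [text for text in sources if is_relevant(query, text)]


def answer_from_sources(
    query: str,
    sources: List[str],
    is_relevant: Callable[[str, str], bool],
    synthesize: Callable[[str, List[str]], str],
) -> Tuple[str, List[str]]:
    """Filter the sources, then synthesize an answer from what remains."""
    kept = filter_sources(query, sources, is_relevant)
    if not kept:
        # Assumption of this sketch: fall back to all sources if none pass.
        kept = list(sources)
    return synthesize(query, kept), kept


# Toy stand-ins for the LLM node evaluation and the response synthesizer.
def toy_is_relevant(query: str, text: str) -> bool:
    return "author" in text.lower()


def toy_synthesize(query: str, texts: List[str]) -> str:
    return " ".join(texts)


sources = [
    "The author wrote short stories as a kid.",
    "Florence is a city in Italy.",
    "The author programmed an IBM 1401.",
]
answer, kept = answer_from_sources(
    "What did the author do growing up?", sources, toy_is_relevant, toy_synthesize
)
print(kept)
print(answer)
```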

In [8]:
from llama_index.query_engine import RetrySourceQueryEngine

retry_source_query_engine = RetrySourceQueryEngine(
    base_query_engine, query_response_evaluator
)
retry_source_response = retry_source_query_engine.query(query)
print(retry_source_response)


The author grew up writing essays, learning Italian, exploring Florence, painting people, working with computers, attending RISD, living in a rent-stabilized apartment, building an online store builder, editing Lisp expressions, publishing essays online, writing essays, painting still life, working on spam filters, cooking for groups, and buying a building in Cambridge.


## Retry Guideline Query Engine

This module uses guidelines to direct the evaluator's behavior. You can supply your own guidelines.

In [13]:
from llama_index.evaluation.guideline_eval import GuidelineEvaluator, DEFAULT_GUIDELINES
from llama_index.response.schema import Response
from llama_index.indices.query.query_transform.feedback_transform import (
    FeedbackQueryTransformation,
)
from llama_index.query_engine.retry_query_engine import (
    RetryGuidelineQueryEngine,
)

# Guideline eval
guideline_eval = GuidelineEvaluator(
    guidelines=DEFAULT_GUIDELINES + "\nThe response should not be overly long.\n"
    "The response should try to summarize where possible.\n"
)  # just for example
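
The `guidelines` argument in the cell above is a single newline-separated string. Python joins the two adjacent string literals first and then applies `+`. The sketch below shows the same composition with a small helper that avoids stray blank lines. `BASE_GUIDELINES` is a hypothetical placeholder, not the actual value of `DEFAULT_GUIDELINES`, and `build_guidelines` is a name introduced only for this example.

```python
from typing import List

# Hypothetical placeholder; the real DEFAULT_GUIDELINES text is not reproduced here.
BASE_GUIDELINES = "The response should answer the question.\n"

EXTRA_GUIDELINES = [
    "The response should not be overly long.",
    "The response should try to summarize where possible.",
]


def build_guidelines(base: str, extra: List[str]) -> str:
    """Combine base and extra guidelines, one per line, with no blank lines."""
    lines = [line.strip() for line in base.splitlines() if line.strip()]
    lines.extend(extra)
    return "\n".join(lines) + "\n"


# Same pattern as in the notebook: adjacent literals are concatenated
# before the + operator is applied.
implicit = BASE_GUIDELINES + "\nThe response should not be overly long.\n" (
    "The response should try to summarize where possible.\n"
) if False else BASE_GUIDELINES + (
    "\nThe response should not be overly long.\n"
    "The response should try to summarize where possible.\n"
)

guidelines = build_guidelines(BASE_GUIDELINES, EXTRA_GUIDELINES)
print(guidelines)
```

Because the placeholder base ends with a newline and the appended text starts with one, the direct concatenation contains a blank line between the base and the extra guidelines; the helper normalizes that away.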

Let's look at what happens under the hood.

In [14]:
typed_response = response if isinstance(response, Response) else response.get_response()
# Named `evaluation` rather than `eval` to avoid shadowing the Python built-in
evaluation = guideline_eval.evaluate_response(query, typed_response)
print(f"Guideline eval evaluation result: {evaluation.feedback}")

# Rewrite the query using the evaluator's feedback on the previous response
feedback_query_transform = FeedbackQueryTransformation(resynthesize_query=True)
transformed_query = feedback_query_transform.run(query, {"evaluation": evaluation})
print(f"Transformed query: {transformed_query.query_str}")

Guideline eval evaluation result: The response is too long and should be summarized. It should also include specific numbers or statistics when possible.
Transformed query: Here is a previous bad answer.

The author grew up writing essays, learning Italian, exploring Florence, painting people, learning about computers, attending RISD, living in a rent-stabilized apartment, building an online store builder, editing Lisp expressions, publishing essays online, writing essays, painting still life, working on spam filters, cooking for groups, and buying a building in Cambridge.
Here is some feedback from the evaluator about the response given.
The response is too long and should be summarized. It should also include specific numbers or statistics when possible.
Now answer the question.

What experiences did the author have growing up?


Now let's run the full query engine.

In [18]:
retry_guideline_query_engine = RetryGuidelineQueryEngine(
    base_query_engine, guideline_eval, resynthesize_query=True
)
retry_guideline_response = retry_guideline_query_engine.query(query)
print(retry_guideline_response)


The author gained a wide range of skills and experiences growing up, from creative pursuits such as painting and writing, to technical skills such as coding and launching software. They also gained an understanding of Italian language and culture, explored the city of Florence, and gained experience living in a rent-stabilized apartment. These experiences enabled the author to develop the skills necessary to launch a successful online store builder and publish essays online, which allowed them to reach a wider audience and gain recognition for their work. Specifically, they gained an understanding of the web infrastructure, the ability to write code in Lisp, and the ability to market their products and services. They also developed an understanding of the importance of working on projects that weren't prestigious, as well as the ability to cook for large groups of people. Finally, they gained the confidence to take risks and pursue their passions, which ultimately led to their success