# LLM and RAG Evaluation

Sources: [1](https://docs.llamaindex.ai/en/stable/module_guides/evaluating/)  

LLMs are trained on enormous bodies of data but they aren’t trained on your data. Retrieval-Augmented Generation (RAG) solves this problem by adding your data to the data LLMs already have access to. You will see references to RAG frequently in this documentation.  
In RAG, your data is loaded and prepared for queries or “indexed”. User queries act on the index, which filters your data down to the most relevant context. This context and your query then go to the LLM along with a prompt, and the LLM provides a response.  
Even if what you’re building is a chatbot or an agent, you’ll want to know RAG techniques for getting data into your application.  
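
The flow described above (index, retrieve, then prompt) can be sketched in a few lines. This is a toy illustration only: word-overlap scoring stands in for a real embedding-based index, and the helper names `tokenize`, `retrieve`, and `build_prompt` are hypothetical, not part of LlamaIndex.

```python
# Toy illustration of the RAG flow: retrieve relevant context, then build a prompt.
# Assumption: word overlap stands in for a real embedding-based index.

def tokenize(text: str) -> set[str]:
    """Lowercase the text, drop basic punctuation, and return its set of words."""
    return set(text.lower().replace("?", " ").replace(".", " ").split())

def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks that share the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: len(query_words & tokenize(chunk)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved context and the query into a single LLM prompt."""
    context_block = "\n".join(context)
    return f"Context:\n{context_block}\n\nQuestion: {query}\nAnswer:"

chunks = [
    "New York City was renamed after the Duke of York in 1664.",
    "Rayleigh scattering makes the sky appear blue.",
]
query = "Why is the sky blue?"
context = retrieve(query, chunks)
prompt = build_prompt(query, context)
print(prompt)
```

In a real application the prompt would then be sent to the LLM; the evaluators in this notebook judge the quality of that final response and of the retrieved context.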

Evaluation and benchmarking are crucial concepts in LLM development. To improve the performance of an LLM app (RAG, agents), you must have a way to measure it.

LlamaIndex offers key modules to measure both the quality of generated results and the quality of retrieval.

## 1. Response Evaluation:  
Does the response match the retrieved context? Does it also match the query? Does it match the reference answer or guidelines?

## 2. Retrieval Evaluation:  
Are the retrieved sources relevant to the query?

This section describes how the evaluation components within LlamaIndex work.

---  
## 1. Response Evaluation

Evaluating generated results can be difficult: unlike in traditional machine learning, the predicted result isn't a single number, and it can be hard to define quantitative metrics for this problem. LlamaIndex offers LLM-based evaluation modules to measure the quality of results. These modules use a "gold" LLM (e.g. GPT-4) to decide whether the predicted answer is correct in a variety of ways. Note that many of these evaluation modules do not require ground-truth labels. Evaluation can be done with some combination of the query, context, and response, combined with LLM calls.

These evaluation modules are in the following forms:

+ #### Correctness: Whether the generated answer matches that of the reference answer given the query (requires labels).
+ #### Semantic Similarity: Whether the predicted answer is semantically similar to the reference answer (requires labels).
+ #### Faithfulness: Evaluates if the answer is faithful to the retrieved contexts (in other words, whether there's hallucination).
+ #### Context Relevancy: Whether retrieved context is relevant to the query.
+ #### Answer Relevancy: Whether the generated answer is relevant to the query.
+ #### Guideline Adherence: Whether the predicted answer adheres to specific guidelines.

+ #### Question Generation: In addition to evaluating queries, LlamaIndex can also use your data to generate questions to evaluate on. This means that you can automatically generate questions, and then run an evaluation pipeline to test if the LLM can actually answer questions accurately using your data.

---
## 2. Retrieval Evaluation (TBD)

We also provide modules to help evaluate retrieval independently.

The concept of retrieval evaluation is not new; given a dataset of questions and ground-truth rankings, we can evaluate retrievers using ranking metrics like mean reciprocal rank (MRR), hit rate, precision, and more.
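
As a minimal sketch of two of these metrics, the snippet below computes hit rate and MRR by hand. It assumes each question has exactly one relevant node; the node ids and the helper names `hit_rate` and `mean_reciprocal_rank` are made up for illustration and are not the LlamaIndex API.

```python
# Hand-rolled hit rate and MRR, assuming one relevant node id per question.

def hit_rate(retrieved: list[list[str]], expected: list[str]) -> float:
    """Fraction of questions whose relevant node appears anywhere in the results."""
    hits = sum(1 for ids, gold in zip(retrieved, expected) if gold in ids)
    return hits / len(expected)

def mean_reciprocal_rank(retrieved: list[list[str]], expected: list[str]) -> float:
    """Average of 1/rank of the relevant node; a miss contributes 0."""
    total = 0.0
    for ids, gold in zip(retrieved, expected):
        if gold in ids:
            total += 1.0 / (ids.index(gold) + 1)  # ranks are 1-based
    return total / len(expected)

# Ranked node ids returned for three questions (hypothetical data).
retrieved = [["n1", "n2", "n3"], ["n4", "n5", "n6"], ["n7", "n8", "n9"]]
# The single relevant node id for each question.
expected = ["n1", "n5", "n0"]

hr = hit_rate(retrieved, expected)
mrr = mean_reciprocal_rank(retrieved, expected)
print(f"hit rate: {hr:.3f}, MRR: {mrr:.3f}")
```

Here the relevant node is ranked first for the first question, second for the second, and missing for the third, so the hit rate is 2/3 and the MRR is (1 + 1/2 + 0) / 3 = 0.5.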

The core retrieval evaluation steps revolve around the following:

+ #### Dataset generation: Given an unstructured text corpus, synthetically generate (question, context) pairs.  
+ #### Retrieval Evaluation: Given a retriever and a set of questions, evaluate retrieved results using ranking metrics.  
--- 

#### Installing Packages

In [1]:
!pip install -q openai
!pip install -q llama-index
!pip install -q llama-index-experimental
!pip install -q llama-index-llms-openai

#### Importing Packages

In [2]:
import os
import openai

#os.environ["OPENAI_API_KEY"] = "<the key>"
openai.api_key = os.environ["OPENAI_API_KEY"]

import sys
import shutil
import glob
from pathlib import Path

import warnings
warnings.filterwarnings('ignore')

import pandas as pd


import llama_index

## Llamaindex readers
from llama_index.core import SimpleDirectoryReader

## LlamaIndex Index Types
from llama_index.core import ListIndex
from llama_index.core import VectorStoreIndex
from llama_index.core import TreeIndex
from llama_index.core import KeywordTableIndex
from llama_index.core import SimpleKeywordTableIndex
from llama_index.core import DocumentSummaryIndex
from llama_index.core import KnowledgeGraphIndex
from llama_index.experimental.query_engine import PandasQueryEngine

## LlamaIndex Context Managers
from llama_index.core import StorageContext
from llama_index.core import load_index_from_storage
from llama_index.core.response_synthesizers import get_response_synthesizer
from llama_index.core.response_synthesizers import ResponseMode
from llama_index.core.schema import Node

## LlamaIndex Callbacks
from llama_index.core.callbacks import CallbackManager
from llama_index.core.callbacks import LlamaDebugHandler

In [3]:
import logging

#logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
#logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

#### Defining Models

In [4]:
models = """gpt-4, gpt-4-32k, gpt-4-1106-preview, gpt-4-0125-preview, gpt-4-turbo-preview, 
gpt-4-vision-preview, gpt-4-1106-vision-preview, gpt-4-turbo-2024-04-09, gpt-4-turbo, gpt-4-0613, 
gpt-4-32k-0613, gpt-4-0314, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0125, 
gpt-3.5-turbo-1106, gpt-3.5-turbo-0613, gpt-3.5-turbo-16k-0613, gpt-3.5-turbo-0301, text-davinci-003, 
text-davinci-002, gpt-3.5-turbo-instruct, text-ada-001, text-babbage-001, text-curie-001, ada, babbage, 
curie, davinci, gpt-35-turbo-16k, gpt-35-turbo, gpt-35-turbo-0125, gpt-35-turbo-1106, gpt-35-turbo-0613, 
gpt-35-turbo-16k-0613""".split()
models = [m.strip(", ") for m in models]
models

['gpt-4',
 'gpt-4-32k',
 'gpt-4-1106-preview',
 'gpt-4-0125-preview',
 'gpt-4-turbo-preview',
 'gpt-4-vision-preview',
 'gpt-4-1106-vision-preview',
 'gpt-4-turbo-2024-04-09',
 'gpt-4-turbo',
 'gpt-4-0613',
 'gpt-4-32k-0613',
 'gpt-4-0314',
 'gpt-4-32k-0314',
 'gpt-3.5-turbo',
 'gpt-3.5-turbo-16k',
 'gpt-3.5-turbo-0125',
 'gpt-3.5-turbo-1106',
 'gpt-3.5-turbo-0613',
 'gpt-3.5-turbo-16k-0613',
 'gpt-3.5-turbo-0301',
 'text-davinci-003',
 'text-davinci-002',
 'gpt-3.5-turbo-instruct',
 'text-ada-001',
 'text-babbage-001',
 'text-curie-001',
 'ada',
 'babbage',
 'curie',
 'davinci',
 'gpt-35-turbo-16k',
 'gpt-35-turbo',
 'gpt-35-turbo-0125',
 'gpt-35-turbo-1106',
 'gpt-35-turbo-0613',
 'gpt-35-turbo-16k-0613']

In [6]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

#model="gpt-4o"
model="gpt-4o-mini"

Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
embed_model = Settings.embed_model
Settings.llm = OpenAI(temperature=0, model=model)
llm = Settings.llm

In [7]:
import nest_asyncio
nest_asyncio.apply()

# 1. Response Evaluation

## 1.1 Correctness  
The CorrectnessEvaluator evaluates the relevance and correctness of a generated answer against a reference answer.

In [8]:
from llama_index.core.evaluation import CorrectnessEvaluator
evaluator = CorrectnessEvaluator(llm=llm)

In [9]:
query = ("Can you explain the theory of relativity proposed by Albert Einstein in detail?")

reference = """
Certainly! Albert Einstein's theory of relativity consists of two main components: special relativity and general relativity. Special relativity, 
published in 1905, introduced the concept that the laws of physics are the same for all non-accelerating observers and that the speed of light in a 
vacuum is a constant, regardless of the motion of the source or observer. It also gave rise to the famous equation E=mc², which relates energy (E) and mass (m).
General relativity, published in 1915, extended these ideas to include the effects of gravity. According to general relativity, gravity is not a force between 
masses, as described by Newton's theory of gravity, but rather the result of the warping of space and time by mass and energy. Massive objects, such as 
planets and stars, cause a curvature in spacetime, and smaller objects follow curved paths in response to this curvature. This concept is often illustrated 
using the analogy of a heavy ball placed on a rubber sheet, causing it to create a depression that other objects (representing smaller masses) naturally move 
towards.
In essence, general relativity provided a new understanding of gravity, explaining phenomena like the bending of light by gravity (gravitational lensing) and the precession of the orbit of Mercury. It has been confirmed through numerous experiments and observations and has become a fundamental theory in modern physics.
"""

response = """
Certainly! Albert Einstein's theory of relativity consists of two main components: special relativity and general relativity. Special relativity, 
published in 1905, introduced the concept that the laws of physics are the same for all non-accelerating observers and that the speed of light in a 
vacuum is a constant, regardless of the motion of the source or observer. It also gave rise to the famous equation E=mc², which relates energy (E) 
and mass (m).
However, general relativity, published in 1915, extended these ideas to include the effects of magnetism. According to general relativity, 
gravity is not a force between masses but rather the result of the warping of space and time by magnetic fields generated by massive objects. 
Massive objects, such as planets and stars, create magnetic fields that cause a curvature in spacetime, and smaller objects follow curved paths 
in response to this magnetic curvature. This concept is often illustrated using the analogy of a heavy ball placed on a rubber sheet with magnets 
underneath, causing it to create a depression that other objects (representing smaller masses) naturally move towards due to magnetic attraction.
"""

In [10]:
result = evaluator.evaluate(query=query, response=response, reference=reference,)
print(result.score)
print(result.feedback)

None
4.0
The generated answer provides a detailed explanation of Albert Einstein's theory of relativity, covering both special relativity and general relativity. It correctly mentions the key concepts such as the constancy of the speed of light, the equation E=mc², and the warping of space and time by mass and energy. However, there is a mistake in stating that general relativity includes the effects of magnetism instead of gravity. This error impacts the overall correctness of the answer, but the relevance and coverage of the topic warrant a score of 4.


## 1.2 Semantic Similarity  
The SemanticSimilarityEvaluator evaluates the quality of a question answering system via semantic similarity.  
Concretely, it calculates the similarity score between embeddings of the generated answer and the reference answer.  
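
To make the scoring concrete, here is a small sketch of the idea, assuming the similarity measure is cosine similarity and that a score at or above the threshold counts as passing. The two-dimensional vectors are made-up stand-ins for real embeddings, and `cosine_similarity` is a local helper, not the evaluator's own code.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 2-D "embeddings"; real embedding vectors have many more dimensions.
response_vec = [1.0, 0.0]
reference_vec = [1.0, 1.0]

similarity_threshold = 0.8  # same default threshold mentioned in the cells below
score = cosine_similarity(response_vec, reference_vec)
passing = score >= similarity_threshold

print("Score: ", score)
print("Passing: ", passing)
```

The two vectors are 45 degrees apart, so the score is about 0.707 and falls below the 0.8 threshold.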

In [11]:
from llama_index.core.evaluation import SemanticSimilarityEvaluator
evaluator = SemanticSimilarityEvaluator()

In [12]:
# This evaluator only uses `response` and `reference`; passing in a query does not influence the evaluation
# query = 'What is the color of the sky'

response = "The sky is typically blue"
reference = """The color of the sky can vary depending on several factors, including time of day, weather conditions, and location.

During the day, when the sun is in the sky, the sky often appears blue. 
This is because of a phenomenon called Rayleigh scattering, where molecules and particles in the Earth's atmosphere scatter sunlight in all directions, and blue light is scattered more than other colors because it travels as shorter, smaller waves. 
This is why we perceive the sky as blue on a clear day.
"""

result = await evaluator.aevaluate(response=response, reference=reference,)
print("Score: ", result.score)
print("Passing: ", result.passing)  # default similarity threshold is 0.8

Score:  0.8741614884630503
Passing:  True


In [13]:
response = "Sorry, I do not have sufficient context to answer this question."
reference = """The color of the sky can vary depending on several factors, including time of day, weather conditions, and location.

During the day, when the sun is in the sky, the sky often appears blue. 
This is because of a phenomenon called Rayleigh scattering, where molecules and particles in the Earth's atmosphere scatter sunlight in all directions, and blue light is scattered more than other colors because it travels as shorter, smaller waves. 
This is why we perceive the sky as blue on a clear day.
"""

result = await evaluator.aevaluate(response=response, reference=reference,)
print("Score: ", result.score)
print("Passing: ", result.passing)  # default similarity threshold is 0.8

Score:  0.7213441101430746
Passing:  False


#### Customization

In [14]:
from llama_index.core.embeddings import resolve_embed_model
evaluator = SemanticSimilarityEvaluator(embed_model=embed_model, similarity_threshold=0.6,)

In [15]:
response = "The sky is yellow."
reference = "The sky is blue."

result = await evaluator.aevaluate(response=response, reference=reference,)
print("Score: ", result.score)
print("Passing: ", result.passing)

Score:  0.9406303029427779
Passing:  True


Note that a high score does not imply that the answer is correct.   
Embedding similarity primarily captures the notion of "relevancy". Since both the response and the reference discuss "the sky" and colors, they are semantically similar even though the response is wrong.

## 1.3 Faithfulness

The `FaithfulnessEvaluator` module measures if the response from a query engine matches any source nodes.  
This is useful for measuring if the response was hallucinated.  
The data is extracted from the [New York City](https://en.wikipedia.org/wiki/New_York_City) Wikipedia page.

In [16]:
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    Response,
)
from llama_index.core.evaluation import FaithfulnessEvaluator
from llama_index.core.node_parser import SentenceSplitter
import pandas as pd
pd.set_option("display.max_colwidth", 0)

In [31]:
evaluator = FaithfulnessEvaluator(llm=llm)

In [18]:
documents = SimpleDirectoryReader(input_files=["../Data/nyc_text.txt"]).load_data()
splitter = SentenceSplitter(chunk_size=512)
vector_index = VectorStoreIndex.from_documents(documents, transformations=[splitter])

In [19]:
from llama_index.core.evaluation import EvaluationResult

In [20]:
# define jupyter display function
def display_eval_df(response: Response, eval_result: EvaluationResult) -> None:
    if response.source_nodes == []:
        print("no response!")
        return
    eval_df = pd.DataFrame(
        {
            "Response": str(response),
            "Source": response.source_nodes[0].node.text[:1000] + "...",
            "Evaluation Result": "Pass" if eval_result.passing else "Fail",
            "Reasoning": eval_result.feedback,
        },
        index=[0],
    )
    eval_df = eval_df.style.set_properties(
        **{
            "inline-size": "600px",
            "overflow-wrap": "break-word",
        },
        subset=["Response", "Source"]
    )
    display(eval_df)

To run an evaluation, call the evaluator's `.evaluate_response()` method on the `Response` object returned from the query.  
Let's evaluate the outputs of the `vector_index`.

In [32]:
query_engine = vector_index.as_query_engine()
response_vector = query_engine.query("How did New York City get its name?")
eval_result = evaluator.evaluate_response(response=response_vector)
display_eval_df(response_vector, eval_result)

| | Response | Source | Evaluation Result | Reasoning |
|---|---|---|---|---|
| 0 | New York City was named after King Charles II of England granted the lands to his brother, the Duke of York. | The city came under British control in 1664 and was renamed New York after King Charles II of England granted the lands to his brother, the Duke of York. The city was regained by the Dutch in July 1673 and was renamed New Orange for one year and three months; the city has been continuously named New York since November 1674. New York City was the capital of the United States from 1785 until 1790, and has been the largest U.S. city since 1790. The Statue of Liberty greeted millions of immigrants as they came to the U.S. by ship in the late 19th and early 20th centuries, and is a symbol of the U.S. and its ideals of liberty and peace. In the 21st century, New York City has emerged as a global node of creativity, entrepreneurship, and as a symbol of freedom and cultural diversity. The New York Times has won the most Pulitzer Prizes for journalism and remains the U.S. media's "newspaper of record". In 2019, New York City was voted the greatest city in the world in a survey of over 30,000 p... | Pass | YES |


#### Benchmark on Generated Question

Now let's generate a few more questions so that we have more to evaluate with and run a small benchmark.

In [25]:
#from llama_index.core.evaluation import DatasetGenerator
from llama_index.core.llama_dataset.generator import RagDatasetGenerator

#question_generator = DatasetGenerator.from_documents(documents)
question_generator = RagDatasetGenerator.from_documents(documents)
eval_questions = question_generator.generate_questions_from_nodes()

In [30]:
eval_questions.to_pandas().head(2)

Unnamed: 0,query,reference_contexts,reference_answer,reference_answer_by,query_by
0,What are the five boroughs of New York City and which counties do they correspond to?,"[New York, often called New York City or NYC, is the most populous city in the United States. With a 2020 population of 8,804,190 distributed over 300.46 square miles (778.2 km2), New York City is the most densely populated major city in the United States and more than twice as populous as Los Angeles, the nation's second-largest city. New York City is located at the southern tip of New York State. It constitutes the geographical and demographic center of both the Northeast megalopolis and the New York metropolitan area, the largest metropolitan area in the U.S. by both population and urban area. With over 20.1 million people in its metropolitan statistical area and 23.5 million in its combined statistical area as of 2020, New York is one of the world's most populous megacities, and over 58 million people live within 250 mi (400 km) of the city. New York City is a global cultural, financial, entertainment, and media center with a significant influence on commerce, health care and life sciences, research, technology, education, politics, tourism, dining, art, fashion, and sports. Home to the headquarters of the United Nations, New York is an important center for international diplomacy, and is sometimes described as the capital of the world.Situated on one of the world's largest natural harbors and extending into the Atlantic Ocean, New York City comprises five boroughs, each of which is coextensive with a respective county of the state of New York. The five boroughs, which were created in 1898 when local governments were consolidated into a single municipal entity, are: Brooklyn (in Kings County), Queens (in Queens County), Manhattan (in New York County), The Bronx (in Bronx County), and Staten Island (in Richmond County).As of 2021, the New York metropolitan area is the largest metropolitan economy in the world with a gross metropolitan product of over $2.4 trillion. 
If the New York metropolitan area were a sovereign state, it would have the eighth-largest economy in the world. New York City is an established safe haven for global investors. New York is home to the highest number of billionaires, individuals of ultra-high net worth (greater than US$30 million), and millionaires of any city in the world.\nThe city and its metropolitan area constitute the premier gateway for legal immigration to the United States. As many as 800 languages are spoken in New York, making it the most linguistically diverse city in the world. New York City is home to more than 3.2 million residents born outside the U.S., the largest foreign-born population of any city in the world as of 2016.New York City traces its origins to a trading post founded on the southern tip of Manhattan Island by Dutch colonists in approximately 1624. The settlement was named New Amsterdam (Dutch: Nieuw Amsterdam) in 1626 and was chartered as a city in 1653. The city came under British control in 1664 and was renamed New York after King Charles II of England granted the lands to his brother, the Duke of York. The city was regained by the Dutch in July 1673 and was renamed New Orange for one year and three months; the city has been continuously named New York since November 1674. New York City was the capital of the United States from 1785 until 1790, and has been the largest U.S. city since 1790. The Statue of Liberty greeted millions of immigrants as they came to the U.S. by ship in the late 19th and early 20th centuries, and is a symbol of the U.S. and its ideals of liberty and peace. In the 21st century, New York City has emerged as a global node of creativity, entrepreneurship, and as a symbol of freedom and cultural diversity. The New York Times has won the most Pulitzer Prizes for journalism and remains the U.S. media's ""newspaper of record"". 
In 2019, New York City was voted the greatest city in the world in a survey of over 30,000 people from 48 cities worldwide, citing its cultural diversity.Many districts and monuments in New York City are major landmarks, including three of the world's ten most visited tourist attractions in 2013. A record 66.6 million tourists visited New York City in 2019. Times Square is the brightly illuminated hub of the Broadway Theater District, one of the world's busiest pedestrian intersections and a major center of the world's entertainment industry. Many of the city's landmarks, skyscrapers, and parks are known around the world, and the city's fast pace led to the phrase New York minute.]",,,ai (gpt-3.5-turbo-0125)
1,How did New York City originate and what were its different names throughout history?,"[New York, often called New York City or NYC, is the most populous city in the United States. With a 2020 population of 8,804,190 distributed over 300.46 square miles (778.2 km2), New York City is the most densely populated major city in the United States and more than twice as populous as Los Angeles, the nation's second-largest city. New York City is located at the southern tip of New York State. It constitutes the geographical and demographic center of both the Northeast megalopolis and the New York metropolitan area, the largest metropolitan area in the U.S. by both population and urban area. With over 20.1 million people in its metropolitan statistical area and 23.5 million in its combined statistical area as of 2020, New York is one of the world's most populous megacities, and over 58 million people live within 250 mi (400 km) of the city. New York City is a global cultural, financial, entertainment, and media center with a significant influence on commerce, health care and life sciences, research, technology, education, politics, tourism, dining, art, fashion, and sports. Home to the headquarters of the United Nations, New York is an important center for international diplomacy, and is sometimes described as the capital of the world.Situated on one of the world's largest natural harbors and extending into the Atlantic Ocean, New York City comprises five boroughs, each of which is coextensive with a respective county of the state of New York. The five boroughs, which were created in 1898 when local governments were consolidated into a single municipal entity, are: Brooklyn (in Kings County), Queens (in Queens County), Manhattan (in New York County), The Bronx (in Bronx County), and Staten Island (in Richmond County).As of 2021, the New York metropolitan area is the largest metropolitan economy in the world with a gross metropolitan product of over $2.4 trillion. 
If the New York metropolitan area were a sovereign state, it would have the eighth-largest economy in the world. New York City is an established safe haven for global investors. New York is home to the highest number of billionaires, individuals of ultra-high net worth (greater than US$30 million), and millionaires of any city in the world.\nThe city and its metropolitan area constitute the premier gateway for legal immigration to the United States. As many as 800 languages are spoken in New York, making it the most linguistically diverse city in the world. New York City is home to more than 3.2 million residents born outside the U.S., the largest foreign-born population of any city in the world as of 2016.New York City traces its origins to a trading post founded on the southern tip of Manhattan Island by Dutch colonists in approximately 1624. The settlement was named New Amsterdam (Dutch: Nieuw Amsterdam) in 1626 and was chartered as a city in 1653. The city came under British control in 1664 and was renamed New York after King Charles II of England granted the lands to his brother, the Duke of York. The city was regained by the Dutch in July 1673 and was renamed New Orange for one year and three months; the city has been continuously named New York since November 1674. New York City was the capital of the United States from 1785 until 1790, and has been the largest U.S. city since 1790. The Statue of Liberty greeted millions of immigrants as they came to the U.S. by ship in the late 19th and early 20th centuries, and is a symbol of the U.S. and its ideals of liberty and peace. In the 21st century, New York City has emerged as a global node of creativity, entrepreneurship, and as a symbol of freedom and cultural diversity. The New York Times has won the most Pulitzer Prizes for journalism and remains the U.S. media's ""newspaper of record"". 
In 2019, New York City was voted the greatest city in the world in a survey of over 30,000 people from 48 cities worldwide, citing its cultural diversity.Many districts and monuments in New York City are major landmarks, including three of the world's ten most visited tourist attractions in 2013. A record 66.6 million tourists visited New York City in 2019. Times Square is the brightly illuminated hub of the Broadway Theater District, one of the world's busiest pedestrian intersections and a major center of the world's entertainment industry. Many of the city's landmarks, skyscrapers, and parks are known around the world, and the city's fast pace led to the phrase New York minute.]",,,ai (gpt-3.5-turbo-0125)


In [42]:
[e.query for e in eval_questions.examples[0:5]]

['What are the five boroughs of New York City and which counties do they correspond to?',
 'How did New York City originate and what were its different names throughout history?',
 'What factors contribute to New York City being considered a global cultural, financial, and media center?',
 'What factors contribute to New York City being considered the "greatest city in the world" according to a survey of over 30,000 people?',
 'How did New York City get its name, and who was it named in honor of?']

In [33]:
import asyncio

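def evaluate_query_engine(query_engine, questions):
    # Run all queries concurrently. asyncio.run needs a coroutine, so wrap gather in one;
    # nest_asyncio (applied above) allows this call inside a running Jupyter event loop.
    async def run_queries():
        return await asyncio.gather(*(query_engine.aquery(q) for q in questions))

    results = asyncio.run(run_queries())
    print("finished query")

    total_correct = 0
    for r in results:
        # Count a response as correct when the FaithfulnessEvaluator defined above passes it
        eval_result = 1 if evaluator.evaluate_response(response=r).passing else 0
        total_correct += eval_result

    return total_correct, len(results)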

In [43]:
vector_query_engine = vector_index.as_query_engine()
correct, total = evaluate_query_engine(vector_query_engine, [e.query for e in eval_questions.examples[0:5]])

print(f"score: {correct}/{total}")

finished query
score: 5/5


## 1.4 Guideline Adherence  
The `GuidelineEvaluator` evaluates a question-answering system against user-specified guidelines.

In [44]:
from llama_index.core.evaluation import GuidelineEvaluator
import nest_asyncio
nest_asyncio.apply()

In [45]:
GUIDELINES = ["The response should fully answer the query.",
              "The response should avoid being vague or ambiguous.",
              "The response should be specific and use statistics or numbers when possible.",
             ]

evaluators = [GuidelineEvaluator(llm=llm, guidelines=guideline) for guideline in GUIDELINES]

In [46]:
sample_data = {
    "query": "Tell me about global warming.",
    "contexts": [
        (
            "Global warming refers to the long-term increase in Earth's"
            " average surface temperature due to human activities such as the"
            " burning of fossil fuels and deforestation."
        ),
        (
            "It is a major environmental issue with consequences such as"
            " rising sea levels, extreme weather events, and disruptions to"
            " ecosystems."
        ),
        (
            "Efforts to combat global warming include reducing carbon"
            " emissions, transitioning to renewable energy sources, and"
            " promoting sustainable practices."
        ),
    ],
    "response": (
        "Global warming is a critical environmental issue caused by human"
        " activities that lead to a rise in Earth's temperature. It has"
        " various adverse effects on the planet."
    ),
}

In [47]:
for guideline, evaluator in zip(GUIDELINES, evaluators):
    eval_result = evaluator.evaluate(
        query=sample_data["query"],
        contexts=sample_data["contexts"],
        response=sample_data["response"],
    )
    print("=====")
    print(f"Guideline: {guideline}")
    print(f"Pass: {eval_result.passing}")
    print(f"Feedback: {eval_result.feedback}")

=====
Guideline: The response should fully answer the query.
Pass: False
Feedback: The response does not fully answer the query. It briefly mentions that global warming is caused by human activities and has adverse effects, but it lacks depth and detail. More information about the causes, impacts, and potential solutions to global warming should be included to fully address the query.
=====
Guideline: The response should avoid being vague or ambiguous.
Pass: False
Feedback: The response is too vague and lacks specific details about the causes and effects of global warming. It would be helpful to provide more in-depth information to fully address the query.
=====
Guideline: The response should be specific and use statistics or numbers when possible.
Pass: False
Feedback: The response does not provide specific information or use statistics to support the statement about global warming. It would be more effective to include data on temperature increases, greenhouse gas emissions, or other

## 1.5 Question Generation  

We walk through the process of generating a list of questions that could be asked about your data.  
This is useful for setting up an evaluation pipeline using the FaithfulnessEvaluator and RelevancyEvaluator evaluation tools.

In [48]:
#import logging
#import sys
#import pandas as pd

#logging.basicConfig(stream=sys.stdout, level=logging.INFO)
#logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

In [49]:
#from llama_index.core.evaluation import DatasetGenerator, 
from llama_index.core.evaluation import RelevancyEvaluator
from llama_index.core.llama_dataset.generator import RagDatasetGenerator
from llama_index.core.llama_dataset import LabelledRagDataset
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Response

In [50]:
reader = SimpleDirectoryReader(input_files=["../Data/paul_graham_essay.txt"])
documents = reader.load_data()

In [51]:
data_generator = RagDatasetGenerator.from_documents(documents)

In [52]:
#eval_questions = data_generator.generate_questions_from_nodes()
eval_questions = LabelledRagDataset.from_json("../Data/rag_dataset.json")

In [54]:
eval_questions.to_pandas().head(5)

Unnamed: 0,query,reference_contexts,reference_answer,reference_answer_by,query_by
0,"""Describe the author's early experiences with programming. What was the first machine he used and what challenges did he face while trying to write programs on it?""","[What I Worked On\n\nFebruary 2021\n\nBefore college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep.\n\nThe first programs I tried writing were on the IBM 1401 that our school district used for what was then called ""data processing."" This was in 9th grade, so I was 13 or 14. The school district's 1401 happened to be in the basement of our junior high school, and my friend Rich Draves and I got permission to use it. It was like a mini Bond villain's lair down there, with all these alien-looking machines — CPU, disk drives, printer, card reader — sitting up on a raised floor under bright fluorescent lights.\n\nThe language we used was an early version of Fortran. You had to type programs on punch cards, then stack them in the card reader and press a button to load the program into memory and run it. The result would ordinarily be to print something on the spectacularly loud printer.\n\nI was puzzled by the 1401. I couldn't figure out what to do with it. And in retrospect there's not much I could have done with it. The only form of input to programs was data stored on punched cards, and I didn't have any data stored on punched cards. The only other option was to do things that didn't rely on any input, like calculate approximations of pi, but I didn't know enough math to do anything interesting of that type. So I'm not surprised I can't remember any programs I wrote, because they can't have done much. My clearest memory is of the moment I learned it was possible for programs not to terminate, when one of mine didn't. 
On a machine without time-sharing, this was a social as well as a technical error, as the data center manager's expression made clear.\n\nWith microcomputers, everything changed. Now you could have a computer sitting right in front of you, on a desk, that could respond to your keystrokes as it was running instead of just churning through a stack of punch cards and then stopping. [1]\n\nThe first of my friends to get a microcomputer built it himself. It was sold as a kit by Heathkit. I remember vividly how impressed and envious I felt watching him sitting in front of it, typing programs right into the computer.\n\nComputers were expensive in those days and it took me years of nagging before I convinced my father to buy one, a TRS-80, in about 1980. The gold standard then was the Apple II, but a TRS-80 was good enough. This was when I really started programming. I wrote simple games, a program to predict how high my model rockets would fly, and a word processor that my father used to write at least one book. There was only room in memory for about 2 pages of text, so he'd write 2 pages at a time and then print them out, but it was a lot better than a typewriter.\n\nThough I liked programming, I didn't plan to study it in college. In college I was going to study philosophy, which sounded much more powerful. It seemed, to my naive high school self, to be the study of the ultimate truths, compared to which the things studied in other fields would be mere domain knowledge. What I discovered when I got to college was that the other fields took up so much of the space of ideas that there wasn't much left for these supposed ultimate truths. All that seemed left for philosophy were edge cases that people in other fields felt could safely be ignored.\n\nI couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. 
So I decided to switch to AI.\n\nAI was in the air in the mid 1980s, but there were two things especially that made me want to work on it: a novel by Heinlein called The Moon is a Harsh Mistress, which featured an intelligent computer called Mike, and a PBS documentary that showed Terry Winograd using SHRDLU. I haven't tried rereading The Moon is a Harsh Mistress, so I don't know how well it has aged, but when I read it I was drawn entirely into its world. It seemed only a matter of time before we'd have Mike, and when I saw Winograd using SHRDLU, it seemed like that time would be a few years at most.]",,,ai (gpt-4)
1,"""What were the author's initial career aspirations before college and how did his experiences in college change his perspective and career path?""","[What I Worked On\n\nFebruary 2021\n\nBefore college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep.\n\nThe first programs I tried writing were on the IBM 1401 that our school district used for what was then called ""data processing."" This was in 9th grade, so I was 13 or 14. The school district's 1401 happened to be in the basement of our junior high school, and my friend Rich Draves and I got permission to use it. It was like a mini Bond villain's lair down there, with all these alien-looking machines — CPU, disk drives, printer, card reader — sitting up on a raised floor under bright fluorescent lights.\n\nThe language we used was an early version of Fortran. You had to type programs on punch cards, then stack them in the card reader and press a button to load the program into memory and run it. The result would ordinarily be to print something on the spectacularly loud printer.\n\nI was puzzled by the 1401. I couldn't figure out what to do with it. And in retrospect there's not much I could have done with it. The only form of input to programs was data stored on punched cards, and I didn't have any data stored on punched cards. The only other option was to do things that didn't rely on any input, like calculate approximations of pi, but I didn't know enough math to do anything interesting of that type. So I'm not surprised I can't remember any programs I wrote, because they can't have done much. My clearest memory is of the moment I learned it was possible for programs not to terminate, when one of mine didn't. 
On a machine without time-sharing, this was a social as well as a technical error, as the data center manager's expression made clear.\n\nWith microcomputers, everything changed. Now you could have a computer sitting right in front of you, on a desk, that could respond to your keystrokes as it was running instead of just churning through a stack of punch cards and then stopping. [1]\n\nThe first of my friends to get a microcomputer built it himself. It was sold as a kit by Heathkit. I remember vividly how impressed and envious I felt watching him sitting in front of it, typing programs right into the computer.\n\nComputers were expensive in those days and it took me years of nagging before I convinced my father to buy one, a TRS-80, in about 1980. The gold standard then was the Apple II, but a TRS-80 was good enough. This was when I really started programming. I wrote simple games, a program to predict how high my model rockets would fly, and a word processor that my father used to write at least one book. There was only room in memory for about 2 pages of text, so he'd write 2 pages at a time and then print them out, but it was a lot better than a typewriter.\n\nThough I liked programming, I didn't plan to study it in college. In college I was going to study philosophy, which sounded much more powerful. It seemed, to my naive high school self, to be the study of the ultimate truths, compared to which the things studied in other fields would be mere domain knowledge. What I discovered when I got to college was that the other fields took up so much of the space of ideas that there wasn't much left for these supposed ultimate truths. All that seemed left for philosophy were edge cases that people in other fields felt could safely be ignored.\n\nI couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. 
So I decided to switch to AI.\n\nAI was in the air in the mid 1980s, but there were two things especially that made me want to work on it: a novel by Heinlein called The Moon is a Harsh Mistress, which featured an intelligent computer called Mike, and a PBS documentary that showed Terry Winograd using SHRDLU. I haven't tried rereading The Moon is a Harsh Mistress, so I don't know how well it has aged, but when I read it I was drawn entirely into its world. It seemed only a matter of time before we'd have Mike, and when I saw Winograd using SHRDLU, it seemed like that time would be a few years at most.]",,,ai (gpt-4)
2,"""Discuss the influence of the novel 'The Moon is a Harsh Mistress' and the PBS documentary featuring Terry Winograd on the author's decision to switch to AI. Why did these two sources inspire him?""","[What I Worked On\n\nFebruary 2021\n\nBefore college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep.\n\nThe first programs I tried writing were on the IBM 1401 that our school district used for what was then called ""data processing."" This was in 9th grade, so I was 13 or 14. The school district's 1401 happened to be in the basement of our junior high school, and my friend Rich Draves and I got permission to use it. It was like a mini Bond villain's lair down there, with all these alien-looking machines — CPU, disk drives, printer, card reader — sitting up on a raised floor under bright fluorescent lights.\n\nThe language we used was an early version of Fortran. You had to type programs on punch cards, then stack them in the card reader and press a button to load the program into memory and run it. The result would ordinarily be to print something on the spectacularly loud printer.\n\nI was puzzled by the 1401. I couldn't figure out what to do with it. And in retrospect there's not much I could have done with it. The only form of input to programs was data stored on punched cards, and I didn't have any data stored on punched cards. The only other option was to do things that didn't rely on any input, like calculate approximations of pi, but I didn't know enough math to do anything interesting of that type. So I'm not surprised I can't remember any programs I wrote, because they can't have done much. 
My clearest memory is of the moment I learned it was possible for programs not to terminate, when one of mine didn't. On a machine without time-sharing, this was a social as well as a technical error, as the data center manager's expression made clear.\n\nWith microcomputers, everything changed. Now you could have a computer sitting right in front of you, on a desk, that could respond to your keystrokes as it was running instead of just churning through a stack of punch cards and then stopping. [1]\n\nThe first of my friends to get a microcomputer built it himself. It was sold as a kit by Heathkit. I remember vividly how impressed and envious I felt watching him sitting in front of it, typing programs right into the computer.\n\nComputers were expensive in those days and it took me years of nagging before I convinced my father to buy one, a TRS-80, in about 1980. The gold standard then was the Apple II, but a TRS-80 was good enough. This was when I really started programming. I wrote simple games, a program to predict how high my model rockets would fly, and a word processor that my father used to write at least one book. There was only room in memory for about 2 pages of text, so he'd write 2 pages at a time and then print them out, but it was a lot better than a typewriter.\n\nThough I liked programming, I didn't plan to study it in college. In college I was going to study philosophy, which sounded much more powerful. It seemed, to my naive high school self, to be the study of the ultimate truths, compared to which the things studied in other fields would be mere domain knowledge. What I discovered when I got to college was that the other fields took up so much of the space of ideas that there wasn't much left for these supposed ultimate truths. All that seemed left for philosophy were edge cases that people in other fields felt could safely be ignored.\n\nI couldn't have put this into words when I was 18. 
All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI.\n\nAI was in the air in the mid 1980s, but there were two things especially that made me want to work on it: a novel by Heinlein called The Moon is a Harsh Mistress, which featured an intelligent computer called Mike, and a PBS documentary that showed Terry Winograd using SHRDLU. I haven't tried rereading The Moon is a Harsh Mistress, so I don't know how well it has aged, but when I read it I was drawn entirely into its world. It seemed only a matter of time before we'd have Mike, and when I saw Winograd using SHRDLU, it seemed like that time would be a few years at most.]",,,ai (gpt-4)
3,"""In the context, the author mentions a novel by Heinlein that influenced his interest in AI. What is the name of this novel and how did it inspire the author's interest in AI?""","[I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI.\n\nAI was in the air in the mid 1980s, but there were two things especially that made me want to work on it: a novel by Heinlein called The Moon is a Harsh Mistress, which featured an intelligent computer called Mike, and a PBS documentary that showed Terry Winograd using SHRDLU. I haven't tried rereading The Moon is a Harsh Mistress, so I don't know how well it has aged, but when I read it I was drawn entirely into its world. It seemed only a matter of time before we'd have Mike, and when I saw Winograd using SHRDLU, it seemed like that time would be a few years at most. All you had to do was teach SHRDLU more words.\n\nThere weren't any classes in AI at Cornell then, not even graduate classes, so I started trying to teach myself. Which meant learning Lisp, since in those days Lisp was regarded as the language of AI. The commonly used programming languages then were pretty primitive, and programmers' ideas correspondingly so. The default language at Cornell was a Pascal-like language called PL/I, and the situation was similar elsewhere. Learning Lisp expanded my concept of a program so fast that it was years before I started to have a sense of where the new limits were. This was more like it; this was what I had expected college to do. It wasn't happening in a class, like it was supposed to, but that was ok. For the next couple years I was on a roll. I knew what I was going to do.\n\nFor my undergraduate thesis, I reverse-engineered SHRDLU. My God did I love working on that program. 
It was a pleasing bit of code, but what made it even more exciting was my belief — hard to imagine now, but not unique in 1985 — that it was already climbing the lower slopes of intelligence.\n\nI had gotten into a program at Cornell that didn't make you choose a major. You could take whatever classes you liked, and choose whatever you liked to put on your degree. I of course chose ""Artificial Intelligence."" When I got the actual physical diploma, I was dismayed to find that the quotes had been included, which made them read as scare-quotes. At the time this bothered me, but now it seems amusingly accurate, for reasons I was about to discover.\n\nI applied to 3 grad schools: MIT and Yale, which were renowned for AI at the time, and Harvard, which I'd visited because Rich Draves went there, and was also home to Bill Woods, who'd invented the type of parser I used in my SHRDLU clone. Only Harvard accepted me, so that was where I went.\n\nI don't remember the moment it happened, or if there even was a specific moment, but during the first year of grad school I realized that AI, as practiced at the time, was a hoax. By which I mean the sort of AI in which a program that's told ""the dog is sitting on the chair"" translates this into some formal representation and adds it to the list of things it knows.\n\nWhat these programs really showed was that there's a subset of natural language that's a formal language. But a very proper subset. It was clear that there was an unbridgeable gap between what they could do and actually understanding natural language. It was not, in fact, simply a matter of teaching SHRDLU more words. That whole way of doing AI, with explicit data structures representing concepts, was not going to work. 
Its brokenness did, as so often happens, generate a lot of opportunities to write papers about various band-aids that could be applied to it, but it was never going to get us Mike.\n\nSo I looked around to see what I could salvage from the wreckage of my plans, and there was Lisp. I knew from experience that Lisp was interesting for its own sake and not just for its association with AI, even though that was the main reason people cared about it at the time. So I decided to focus on Lisp. In fact, I decided to write a book about Lisp hacking. It's scary to think how little I knew about Lisp hacking when I started writing that book. But there's nothing like writing a book about something to help you learn it. The book, On Lisp, wasn't published till 1993, but I wrote much of it in grad school.\n\nComputer Science is an uneasy alliance between two halves, theory and systems. The theory people prove things, and the systems people build things. I wanted to build things.]",,,ai (gpt-4)
4,"""The author discusses his experience with learning Lisp and how it expanded his concept of a program. Can you explain what Lisp is and why it was considered the language of AI during that time?""","[I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI.\n\nAI was in the air in the mid 1980s, but there were two things especially that made me want to work on it: a novel by Heinlein called The Moon is a Harsh Mistress, which featured an intelligent computer called Mike, and a PBS documentary that showed Terry Winograd using SHRDLU. I haven't tried rereading The Moon is a Harsh Mistress, so I don't know how well it has aged, but when I read it I was drawn entirely into its world. It seemed only a matter of time before we'd have Mike, and when I saw Winograd using SHRDLU, it seemed like that time would be a few years at most. All you had to do was teach SHRDLU more words.\n\nThere weren't any classes in AI at Cornell then, not even graduate classes, so I started trying to teach myself. Which meant learning Lisp, since in those days Lisp was regarded as the language of AI. The commonly used programming languages then were pretty primitive, and programmers' ideas correspondingly so. The default language at Cornell was a Pascal-like language called PL/I, and the situation was similar elsewhere. Learning Lisp expanded my concept of a program so fast that it was years before I started to have a sense of where the new limits were. This was more like it; this was what I had expected college to do. It wasn't happening in a class, like it was supposed to, but that was ok. For the next couple years I was on a roll. I knew what I was going to do.\n\nFor my undergraduate thesis, I reverse-engineered SHRDLU. My God did I love working on that program. 
It was a pleasing bit of code, but what made it even more exciting was my belief — hard to imagine now, but not unique in 1985 — that it was already climbing the lower slopes of intelligence.\n\nI had gotten into a program at Cornell that didn't make you choose a major. You could take whatever classes you liked, and choose whatever you liked to put on your degree. I of course chose ""Artificial Intelligence."" When I got the actual physical diploma, I was dismayed to find that the quotes had been included, which made them read as scare-quotes. At the time this bothered me, but now it seems amusingly accurate, for reasons I was about to discover.\n\nI applied to 3 grad schools: MIT and Yale, which were renowned for AI at the time, and Harvard, which I'd visited because Rich Draves went there, and was also home to Bill Woods, who'd invented the type of parser I used in my SHRDLU clone. Only Harvard accepted me, so that was where I went.\n\nI don't remember the moment it happened, or if there even was a specific moment, but during the first year of grad school I realized that AI, as practiced at the time, was a hoax. By which I mean the sort of AI in which a program that's told ""the dog is sitting on the chair"" translates this into some formal representation and adds it to the list of things it knows.\n\nWhat these programs really showed was that there's a subset of natural language that's a formal language. But a very proper subset. It was clear that there was an unbridgeable gap between what they could do and actually understanding natural language. It was not, in fact, simply a matter of teaching SHRDLU more words. That whole way of doing AI, with explicit data structures representing concepts, was not going to work. 
Its brokenness did, as so often happens, generate a lot of opportunities to write papers about various band-aids that could be applied to it, but it was never going to get us Mike.\n\nSo I looked around to see what I could salvage from the wreckage of my plans, and there was Lisp. I knew from experience that Lisp was interesting for its own sake and not just for its association with AI, even though that was the main reason people cared about it at the time. So I decided to focus on Lisp. In fact, I decided to write a book about Lisp hacking. It's scary to think how little I knew about Lisp hacking when I started writing that book. But there's nothing like writing a book about something to help you learn it. The book, On Lisp, wasn't published till 1993, but I wrote much of it in grad school.\n\nComputer Science is an uneasy alliance between two halves, theory and systems. The theory people prove things, and the systems people build things. I wanted to build things.]",,,ai (gpt-4)


In [55]:
[e.query for e in eval_questions.examples[0:5]]

['"Describe the author\'s early experiences with programming. What was the first machine he used and what challenges did he face while trying to write programs on it?"',
 '"What were the author\'s initial career aspirations before college and how did his experiences in college change his perspective and career path?"',
 '"Discuss the influence of the novel \'The Moon is a Harsh Mistress\' and the PBS documentary featuring Terry Winograd on the author\'s decision to switch to AI. Why did these two sources inspire him?"',
 '"In the context, the author mentions a novel by Heinlein that influenced his interest in AI. What is the name of this novel and how did it inspire the author\'s interest in AI?"',
 '"The author discusses his experience with learning Lisp and how it expanded his concept of a program. Can you explain what Lisp is and why it was considered the language of AI during that time?"']

#### Saving our Dataset

In [None]:
eval_questions.save_json("../Data/rag_dataset.json")

#### Reading the saved Dataset

In [None]:
eval_questions = LabelledRagDataset.from_json("../Data/rag_dataset.json")

#### Using the Dataset for Evaluation (creating an index from the same source as the questions)

In [56]:
evaluator = RelevancyEvaluator(llm=llm)

In [57]:
# create vector index
vector_index = VectorStoreIndex.from_documents(documents)

In [58]:
# define jupyter display function
def display_eval_df(query: str, response: Response, eval_result: str) -> None:
    eval_df = pd.DataFrame(
        {
            "Query": query,
            "Response": str(response.response),
            "Source": (response.source_nodes[0].node.get_content()[:1000] + "..."),
            "Evaluation Result": eval_result,
        },
        index=[0],
    )
    eval_df = eval_df.style.set_properties(
        **{
            "inline-size": "600px",
            "overflow-wrap": "break-word",
        },
        subset=["Response", "Source"]
    )
    display(eval_df)

In [59]:
eval_questions[1].query

'"What were the author\'s initial career aspirations before college and how did his experiences in college change his perspective and career path?"'

In [76]:
#eval_result.passing
#eval_result.score
#eval_result.feedback

In [77]:
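query_engine = vector_index.as_query_engine()
# NOTE: the engine is queried with example 2, while the evaluator and the display
# use the query from example 1, so the output pairs a question with the answer
# to a different question. Use the same index in all three places to evaluate a
# matching query/response pair.
response_vector = query_engine.query(eval_questions[2].query)
eval_result = evaluator.evaluate_response(query=eval_questions[1].query, response=response_vector)
display_eval_df(eval_questions[1].query, response_vector, eval_result.feedback)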

Unnamed: 0,Query,Response,Source,Evaluation Result
0,"""What were the author's initial career aspirations before college and how did his experiences in college change his perspective and career path?""","The novel 'The Moon is a Harsh Mistress' and the PBS documentary featuring Terry Winograd influenced the author's decision to switch to AI by presenting compelling visions of intelligent computers. The novel showcased an intelligent computer named Mike, drawing the author into its world and sparking the belief that such technology was on the horizon. Additionally, the PBS documentary demonstrated Winograd using SHRDLU, further fueling the author's enthusiasm for AI by suggesting that advancements in the field were imminent. These sources inspired the author by portraying the potential for intelligent machines and the exciting possibilities that AI held for the future.","I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI. AI was in the air in the mid 1980s, but there were two things especially that made me want to work on it: a novel by Heinlein called The Moon is a Harsh Mistress, which featured an intelligent computer called Mike, and a PBS documentary that showed Terry Winograd using SHRDLU. I haven't tried rereading The Moon is a Harsh Mistress, so I don't know how well it has aged, but when I read it I was drawn entirely into its world. It seemed only a matter of time before we'd have Mike, and when I saw Winograd using SHRDLU, it seemed like that time would be a few years at most. All you had to do was teach SHRDLU more words. There weren't any classes in AI at Cornell then, not even graduate classes, so I started trying to teach myself. Which meant learning Lisp, since in those days Lisp was regarded as the language of AI. ...",YES
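
The cell above evaluates a single query. To benchmark a pipeline, one would typically repeat this over many questions and summarize the outcomes. The sketch below shows only that aggregation step. `ToyEvalResult` is a hypothetical stand-in that mirrors the `passing`, `score`, and `feedback` attributes referenced above, and `summarize` is an illustrative helper, not part of LlamaIndex; the results fed into it are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyEvalResult:
    """Hypothetical stand-in exposing the attributes used above."""
    passing: Optional[bool]
    score: Optional[float]
    feedback: Optional[str]


def summarize(results: List[ToyEvalResult]) -> dict:
    """Compute the pass rate and mean score, ignoring results with no verdict."""
    judged = [r for r in results if r.passing is not None]
    scored = [r.score for r in results if r.score is not None]
    return {
        "n_judged": len(judged),
        "pass_rate": sum(r.passing for r in judged) / len(judged) if judged else None,
        "mean_score": sum(scored) / len(scored) if scored else None,
    }


# Made-up results, purely to exercise the helper.
toy_results = [
    ToyEvalResult(True, 1.0, "YES"),
    ToyEvalResult(False, 0.0, "NO"),
    ToyEvalResult(True, 1.0, "YES"),
    ToyEvalResult(None, None, None),  # e.g. an evaluation that produced no verdict
]
summary = summarize(toy_results)
print(summary)
```

In a real run, the list would be filled by calling the evaluator once per question, with each response generated from that same question.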


# 2 Dataset generation  
## 2.1 Benchmarking RAG Pipelines With A `LabelledRagDataset`

The `LabelledRagDataset` is meant to be used for evaluating any given RAG pipeline, for which there could be several configurations (e.g. the choice of `LLM` and the values of `similarity_top_k`, `chunk_size`, and others). 
This is analogous to traditional machine learning datasets, where features `X` are used to predict a ground-truth label `y`. Here, the `query` and the retrieved `contexts` play the role of the features, and the answer to the query, called the `reference_answer`, is the ground-truth label.

As with any dataset, it is composed of observations, or examples. In the case of `LabelledRagDataset`, these are a set of `LabelledRagDataExample` objects.
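
To make the analogy concrete, the sketch below splits a couple of examples into "features" and "labels" the way one would for a classical dataset. It uses a plain dataclass, `ToyRagExample`, as a hypothetical stand-in for `LabelledRagDataExample` so that it runs without LlamaIndex installed. The helper `split_features_and_labels` is likewise an illustration, not a library function.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyRagExample:
    """Hypothetical stand-in for LabelledRagDataExample (illustration only)."""
    query: str
    reference_contexts: List[str]
    reference_answer: str


def split_features_and_labels(
    examples: List[ToyRagExample],
) -> Tuple[List[Tuple[str, List[str]]], List[str]]:
    """Return (X, y): X holds (query, contexts) pairs, y the reference answers."""
    X = [(ex.query, ex.reference_contexts) for ex in examples]
    y = [ex.reference_answer for ex in examples]
    return X, y


toy_examples = [
    ToyRagExample("This is a test query, is it not?", ["This is a sample context"], "Yes it is."),
    ToyRagExample("This is a test query, is it so?", ["This is a second sample context"], "I think yes, it is."),
]
X, y = split_features_and_labels(toy_examples)
print(X[0])  # the "features" of the first example
print(y[0])  # its ground-truth label
```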

Let's construct a `LabelledRagDataset` from scratch. Note that the alternative would be to download a community-supplied `LabelledRagDataset` from `llama-hub` and evaluate or benchmark your own RAG pipeline on it.

In [80]:
#%pip install -q llama-index-llms-openai
#%pip install -q llama-index-readers-wikipedia

In [81]:
from llama_index.core.llama_dataset import (
    LabelledRagDataExample,
    CreatedByType,
    CreatedBy,
)

# constructing a LabelledRagDataExample
query = "This is a test query, is it not?"
query_by = CreatedBy(type=CreatedByType.AI, model_name="gpt-4")
reference_answer = "Yes it is."
reference_answer_by = CreatedBy(type=CreatedByType.HUMAN)
reference_contexts = ["This is a sample context"]

rag_example = LabelledRagDataExample(
    query=query,
    query_by=query_by,
    reference_contexts=reference_contexts,
    reference_answer=reference_answer,
    reference_answer_by=reference_answer_by,
)

The `LabelledRagDataExample` is a Pydantic model, so converting to and from `json` or `dict` is possible.

In [82]:
print(rag_example.json())

{"query": "This is a test query, is it not?", "query_by": {"model_name": "gpt-4", "type": "ai"}, "reference_contexts": ["This is a sample context"], "reference_answer": "Yes it is.", "reference_answer_by": {"model_name": "", "type": "human"}}


In [83]:
LabelledRagDataExample.parse_raw(rag_example.json())

LabelledRagDataExample(query='This is a test query, is it not?', query_by=CreatedBy(model_name='gpt-4', type=<CreatedByType.AI: 'ai'>), reference_contexts=['This is a sample context'], reference_answer='Yes it is.', reference_answer_by=CreatedBy(model_name='', type=<CreatedByType.HUMAN: 'human'>))

In [84]:
rag_example.dict()

{'query': 'This is a test query, is it not?',
 'query_by': {'model_name': 'gpt-4', 'type': <CreatedByType.AI: 'ai'>},
 'reference_contexts': ['This is a sample context'],
 'reference_answer': 'Yes it is.',
 'reference_answer_by': {'model_name': '',
  'type': <CreatedByType.HUMAN: 'human'>}}

In [85]:
LabelledRagDataExample.parse_obj(rag_example.dict())

LabelledRagDataExample(query='This is a test query, is it not?', query_by=CreatedBy(model_name='gpt-4', type=<CreatedByType.AI: 'ai'>), reference_contexts=['This is a sample context'], reference_answer='Yes it is.', reference_answer_by=CreatedBy(model_name='', type=<CreatedByType.HUMAN: 'human'>))

#### Let's create a second example, so we can have a (slightly) more interesting `LabelledRagDataset`.

In [86]:
query = "This is a test query, is it so?"
reference_answer = "I think yes, it is."
reference_contexts = ["This is a second sample context"]

rag_example_2 = LabelledRagDataExample(
    query=query,
    query_by=query_by,
    reference_contexts=reference_contexts,
    reference_answer=reference_answer,
    reference_answer_by=reference_answer_by,
)

### The `LabelledRagDataset` Class

In [88]:
from llama_index.core.llama_dataset import LabelledRagDataset

new_rag_dataset = LabelledRagDataset(examples=[rag_example, rag_example_2])
new_rag_dataset.to_pandas()

Unnamed: 0,query,reference_contexts,reference_answer,reference_answer_by,query_by
0,"This is a test query, is it not?",[This is a sample context],Yes it is.,human,ai (gpt-4)
1,"This is a test query, is it so?",[This is a second sample context],"I think yes, it is.",human,ai (gpt-4)


To persist and load the dataset to and from disk, there are the `save_json` and `from_json` methods.

In [89]:
new_rag_dataset.save_json("../Data/new_rag_dataset.json")
# reload the file that was just written
reload_rag_dataset = LabelledRagDataset.from_json("../Data/new_rag_dataset.json")
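
A quick sanity check after persisting is to reload the file and compare it with what was saved. The sketch below illustrates that check with the standard `json` module and plain dictionaries shaped like the JSON output shown earlier. The file name and the `save_examples` and `load_examples` helpers are hypothetical; they stand in for `save_json` and `from_json` so the example runs without LlamaIndex.

```python
import json
import tempfile
from pathlib import Path


def save_examples(examples, path):
    """Write a list of example dicts to disk as JSON."""
    Path(path).write_text(json.dumps({"examples": examples}, indent=2))


def load_examples(path):
    """Read the list of example dicts back from disk."""
    return json.loads(Path(path).read_text())["examples"]


toy_examples = [
    {
        "query": "This is a test query, is it not?",
        "reference_contexts": ["This is a sample context"],
        "reference_answer": "Yes it is.",
    },
    {
        "query": "This is a test query, is it so?",
        "reference_contexts": ["This is a second sample context"],
        "reference_answer": "I think yes, it is.",
    },
]

with tempfile.TemporaryDirectory() as tmp_dir:
    toy_path = Path(tmp_dir) / "toy_rag_dataset.json"  # hypothetical file name
    save_examples(toy_examples, toy_path)
    reloaded = load_examples(toy_path)

# True when the reloaded examples match what was saved
round_trip_ok = reloaded == toy_examples
print(round_trip_ok)
```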

#### Building a synthetic `LabelledRagDataset` over Wikipedia 

For this section, we'll first create a `LabelledRagDataset` using a synthetic generator. Ultimately, we will use GPT-4 to produce both the `query` and `reference_answer` for the synthetic `LabelledRagDataExample` objects.

NOTE: If you already have queries, reference answers, and contexts over a text corpus, data synthesis is not necessary; you can make predictions and evaluate them directly.

In [104]:
from tqdm.asyncio import tqdm_asyncio
import nest_asyncio
nest_asyncio.apply()

In [91]:
!pip install -q wikipedia

In [92]:
# wikipedia pages
from llama_index.readers.wikipedia import WikipediaReader
from llama_index.core import VectorStoreIndex

cities = ["Vienna",]

documents = WikipediaReader().load_data(
    pages=[f"History of {x}" for x in cities]
)
index = VectorStoreIndex.from_documents(documents)

The `RagDatasetGenerator` can be built over a set of documents to generate `LabelledRagDataExample`s.

In [93]:
# generate questions against chunks
from llama_index.core.llama_dataset.generator import RagDatasetGenerator
from llama_index.llms.openai import OpenAI

# LLM used to generate the questions and reference answers
llm = OpenAI(model="gpt-3.5-turbo", temperature=0.3)

# instantiate a RagDatasetGenerator
dataset_generator = RagDatasetGenerator.from_documents(
    documents,
    llm=llm,
    num_questions_per_chunk=2,  # number of questions to generate per node
    show_progress=True,
)


In [94]:
len(dataset_generator.nodes)

8

In [95]:
# since there are 8 nodes, there should be a total of 16 questions
rag_dataset = dataset_generator.generate_dataset_from_nodes()


In [97]:
rag_dataset.to_pandas().head(5)

Unnamed: 0,query,reference_contexts,reference_answer,reference_answer_by,query_by
0,"What were the significant events surrounding the Third Crusade that took place near Vienna, and how did they impact the city's development?","[The history of Vienna has been long and varied, beginning when the Roman Empire created a military camp in the area now covered by Vienna's city centre. Vienna grew from the Roman settlement known as Vindobona to be an important trading site in the 11th century. It became the capital of the Babenberg dynasty and subsequently of the Austrian Habsburgs, under whom it became one of Europe's cultural hubs. During the 19th century as the capital of the Austrian Empire and later Austria-Hungary, it temporarily became one of Europe's biggest cities. Since the end of World War I, Vienna has been the capital of the Republic of Austria.\n\n\n== Beginnings and early Middle Ages ==\n\nThe name Vindobona derives from a Celtic language, suggesting that the region must have been inhabited before Roman times. The Romans created a military camp (occupied by Legio X Gemina) during the 1st century on the site of the city centre of present-day Vienna. The settlement was raised to the status of a municipium in 212. Even today, the streets of the First District show where the encampment placed its walls and moats. The Romans stayed until the 5th century.\nRoman Vindobona was located in the outskirts of the empire and thus fell prey to the chaos of the Migration Period. There are some indications that a catastrophic fire occurred around the beginning of the 5th century. However, the remains of the encampment were not deserted, and a small settlement remained. The streets and houses of early medieval Vienna followed the former Roman walls, which gives rise to the conclusion that parts of the fortification were still in place and used by the settlers.\nByzantine copper coins from the 6th century have been found several times in the area of today's city centre, indicating considerable trade activity. 
Graves from the 6th century were found during excavations next to the Berghof, in an area around Salvatorgasse. At that time, the Lombards controlled the area, with Slavs and Avars following later. Early Vienna was centred on the Berghof.\nThe first documented mention of the city during the Middle Ages is within the Salzburg Annals, dating to 881, when a battle apud Weniam was fought against the Magyars. However, it is unclear whether this refers to the city or the River Wien.\n\n\n== Babenberg rule ==\n\nIn 976, the Margraviate of Ostarrîchi was given to the Babenberg family. Vienna lay at its border with Hungary.\nVienna was an important site of trade as early as the 11th century. In the Exchange of Mautern between the Bishop of Passau and Margrave Leopold IV, Vienna is mentioned as a Civitas for the first time, which indicates the existence of a well-ordered settlement. \nIn 1155, Margrave Henry II of Austria made Vienna his capital. In 1156, Austria was raised to a duchy in the Privilegium Minus, with Vienna becoming the seat of all future dukes. During that time, the Schottenstift was founded.\nThe events surrounding the Third Crusade, during which King Richard the Lionheart was discovered and captured by Duke Leopold V the Virtuous two days before Christmas of 1192 in Erdberg near Vienna, brought an enormous ransom of 50,000 Silver Marks (about 10 to 12 tons of silver, about a third of the emperor's claims against the English. Richard had been extradited to him in March 1193). This allowed the creation of a mint and the construction of city walls around the year 1200. At the U-Bahn station Stubentor, some remains of the city walls can still be seen today. Because he had abused a protected crusader, Leopold V was excommunicated by Pope Celestine III, and died (without having been absolved) after falling from a horse in a tournament.\nIn 1221, Vienna received the rights of a city and as a staple port (Stapelrecht). 
This meant that all traders passing through Vienna had to offer their goods in the city. This allowed the Viennese to act as middlemen in trade, so that Vienna soon created a network of far-reaching trade relations, particularly along the Danube basin and to Venice, and to become one of the most important cities in the Holy Roman Empire.\n\nHowever, it was considered embarrassing that Vienna did not have its own bishop. It is known that Duke Frederick II negotiated about the creation of a bishopric in Vienna, and the same is suspected of Ottokar Přemysl.]","The significant events surrounding the Third Crusade that took place near Vienna included King Richard the Lionheart being discovered and captured by Duke Leopold V the Virtuous two days before Christmas of 1192 in Erdberg near Vienna. This event led to an enormous ransom of 50,000 Silver Marks, which allowed for the creation of a mint and the construction of city walls around the year 1200. The construction of city walls helped in fortifying Vienna and enhancing its defenses. Additionally, the ransom money likely contributed to the city's economic development and growth.",ai (gpt-3.5-turbo-0125),ai (gpt-3.5-turbo-0125)
1,"How did Vienna establish itself as an important trading site in the 11th century, and what privileges did the city receive that contributed to its economic growth during the Middle Ages?","[The history of Vienna has been long and varied, beginning when the Roman Empire created a military camp in the area now covered by Vienna's city centre. Vienna grew from the Roman settlement known as Vindobona to be an important trading site in the 11th century. It became the capital of the Babenberg dynasty and subsequently of the Austrian Habsburgs, under whom it became one of Europe's cultural hubs. During the 19th century as the capital of the Austrian Empire and later Austria-Hungary, it temporarily became one of Europe's biggest cities. Since the end of World War I, Vienna has been the capital of the Republic of Austria.\n\n\n== Beginnings and early Middle Ages ==\n\nThe name Vindobona derives from a Celtic language, suggesting that the region must have been inhabited before Roman times. The Romans created a military camp (occupied by Legio X Gemina) during the 1st century on the site of the city centre of present-day Vienna. The settlement was raised to the status of a municipium in 212. Even today, the streets of the First District show where the encampment placed its walls and moats. The Romans stayed until the 5th century.\nRoman Vindobona was located in the outskirts of the empire and thus fell prey to the chaos of the Migration Period. There are some indications that a catastrophic fire occurred around the beginning of the 5th century. However, the remains of the encampment were not deserted, and a small settlement remained. The streets and houses of early medieval Vienna followed the former Roman walls, which gives rise to the conclusion that parts of the fortification were still in place and used by the settlers.\nByzantine copper coins from the 6th century have been found several times in the area of today's city centre, indicating considerable trade activity. 
Graves from the 6th century were found during excavations next to the Berghof, in an area around Salvatorgasse. At that time, the Lombards controlled the area, with Slavs and Avars following later. Early Vienna was centred on the Berghof.\nThe first documented mention of the city during the Middle Ages is within the Salzburg Annals, dating to 881, when a battle apud Weniam was fought against the Magyars. However, it is unclear whether this refers to the city or the River Wien.\n\n\n== Babenberg rule ==\n\nIn 976, the Margraviate of Ostarrîchi was given to the Babenberg family. Vienna lay at its border with Hungary.\nVienna was an important site of trade as early as the 11th century. In the Exchange of Mautern between the Bishop of Passau and Margrave Leopold IV, Vienna is mentioned as a Civitas for the first time, which indicates the existence of a well-ordered settlement. \nIn 1155, Margrave Henry II of Austria made Vienna his capital. In 1156, Austria was raised to a duchy in the Privilegium Minus, with Vienna becoming the seat of all future dukes. During that time, the Schottenstift was founded.\nThe events surrounding the Third Crusade, during which King Richard the Lionheart was discovered and captured by Duke Leopold V the Virtuous two days before Christmas of 1192 in Erdberg near Vienna, brought an enormous ransom of 50,000 Silver Marks (about 10 to 12 tons of silver, about a third of the emperor's claims against the English. Richard had been extradited to him in March 1193). This allowed the creation of a mint and the construction of city walls around the year 1200. At the U-Bahn station Stubentor, some remains of the city walls can still be seen today. Because he had abused a protected crusader, Leopold V was excommunicated by Pope Celestine III, and died (without having been absolved) after falling from a horse in a tournament.\nIn 1221, Vienna received the rights of a city and as a staple port (Stapelrecht). 
This meant that all traders passing through Vienna had to offer their goods in the city. This allowed the Viennese to act as middlemen in trade, so that Vienna soon created a network of far-reaching trade relations, particularly along the Danube basin and to Venice, and to become one of the most important cities in the Holy Roman Empire.\n\nHowever, it was considered embarrassing that Vienna did not have its own bishop. It is known that Duke Frederick II negotiated about the creation of a bishopric in Vienna, and the same is suspected of Ottokar Přemysl.]","Vienna established itself as an important trading site in the 11th century through its strategic location at the border with Hungary and its well-ordered settlement status as indicated in the Exchange of Mautern. The city received the rights of a city and a staple port (Stapelrecht) in 1221, which required all traders passing through Vienna to offer their goods in the city. This privilege allowed the Viennese to act as middlemen in trade, creating a network of far-reaching trade relations along the Danube basin and to Venice. Additionally, Vienna's role as a staple port contributed to its economic growth during the Middle Ages by making it one of the most important cities in the Holy Roman Empire.",ai (gpt-3.5-turbo-0125),ai (gpt-3.5-turbo-0125)
2,Describe the significance of the First Turkish Siege of Vienna in 1529 and its impact on the city's fortifications.,"[== Habsburg rule ==\n\nIn 1278, Rudolf I took control over the Austrian lands after his victory over Ottokar II of Bohemia and began to establish Habsburg rule. In Vienna, it took a relatively long time for the Habsburgs to establish their control, because partisans of Ottokar remained strong for a long time. There were several uprisings against Albert I. The family of the Paltrams vom Stephansfreithof was foremost among the insurgents.\nIn 1280, Jans der Enikel wrote the ""Fürstenbuch"", a first history of the city.\nWith the Luxembourg emperors, Prague became the imperial residence and Vienna stood in its shadow. The early Habsburgs attempted to extend it in order to keep up. Duke Albert II, for example, had the gothic choir of the Stephansdom built. In 1327, Frederick the Handsome published his edict allowing the city to maintain an Eisenbuch (iron book) listing its privileges.\n\nThe combination of the heraldic eagle with the city coat of arms showing a white cross in a red field is found on a seal dated 1327.\nThis heraldic emblem was in use throughout the 14th century in different variants.\nRudolf IV of Austria deserves credit for his prudent economic policy, which raised the level of prosperity. His epithet the Founder is due to two things: first, he founded the University of Vienna in 1365, and second, he began the construction of the gothic nave in the Stephansdom. The latter is connected to the creation of a metropolitan chapter, as a symbolic substitute for a bishop.\nThere was a period of inheritance disputes among the Habsburgs resulting not only in confusion, but also in an economic decline and social unrest, with disputes between the parties of patricians and artisans. While the patricians supported Ernest the Iron, the artisans supported Leopold IV. 
In 1408, the mayor Konrad Vorlauf, an exponent of the patrician party, was executed.\nAfter the election of Duke Albert V as German King Albert II, Vienna became the capital of the Holy Roman Empire. Albert's name is remembered for his expulsion of the Jewish population of Vienna in 1421/22.\nEventually, in 1469, Vienna was given its own bishop, and the Stephansdom became a cathedral. During the upheavals of the era of Emperor Frederick III, Vienna remained on the side of his opponents (first Albert VI, then Matthias Corvinus), as Frederick proved unable to maintain peace in the land vis-à-vis rampaging gangs of mercenaries (often remaining from the Hussite Wars).\nIn 1485, the Hungarian King Matthias Corvinus and the Black Army of Hungary conquered the city and Vienna became the king's seat that served as the capital of Hungary until 1490.\nIn 1522, under Ferdinand I, Holy Roman Emperor the Blood Judgment of Wiener Neustadt led to the execution of leading members of the opposition within the city, and thus a destruction of the political structures. From then on, the city stood under direct imperial control.\nIn 1556, Vienna became the seat of the Emperor, with Bohemia having been added to the Habsburg realm in 1526.\nDuring this time, the city was also recatholicised after having become Protestant rather quickly. In 1551, the Jesuits were brought to town and soon gained a large influence in court. The leader of the Counter-Reformation here was Melchior Khlesl, Bishop of Vienna from 1600.\n\n\n=== Turkish sieges ===\n\nIn 1529, Vienna was besieged by the Ottoman Turks for the first time (the First Turkish Siege), although unsuccessfully. The city, protected by medieval walls, only barely withstood the attacks, until epidemics and an early winter forced the Turks to retreat. The siege had shown that new fortifications were needed. Following plans by Sebastian Schrantz, Vienna was expanded to a fortress in 1548. 
The city was furnished with eleven bastions and surrounded by a moat. A glacis was created around Vienna, a broad strip without any buildings, which allowed defenders to fire freely. These fortifications, which accounted for the major part of building activities well into the 17th century, became decisive in the Second Turkish Siege of 1683, as they allowed the city to maintain itself for two months, until the Turkish army was defeated by the army led by the Polish King John III Sobieski. This was the turning point in the Turkish Wars, as the Ottoman Empire was pushed back more and more during the following decades.]","The First Turkish Siege of Vienna in 1529 was significant as it marked the first time the city was besieged by the Ottoman Turks. Despite the unsuccessful siege, it highlighted the need for new fortifications to protect the city. Following the siege, Vienna was expanded into a fortress in 1548, with the addition of eleven bastions, a moat, and a glacis. These new fortifications played a crucial role in the Second Turkish Siege of 1683, allowing the city to withstand the attack for two months until the Turkish army was defeated. The fortifications ultimately helped in pushing back the Ottoman Empire in the following decades, making the First Turkish Siege a turning point in the Turkish Wars and emphasizing the importance of strong defenses for the city of Vienna.",ai (gpt-3.5-turbo-0125),ai (gpt-3.5-turbo-0125)
3,"How did Rudolf IV of Austria contribute to the development of Vienna, particularly in terms of education and architecture?","[== Habsburg rule ==\n\nIn 1278, Rudolf I took control over the Austrian lands after his victory over Ottokar II of Bohemia and began to establish Habsburg rule. In Vienna, it took a relatively long time for the Habsburgs to establish their control, because partisans of Ottokar remained strong for a long time. There were several uprisings against Albert I. The family of the Paltrams vom Stephansfreithof was foremost among the insurgents.\nIn 1280, Jans der Enikel wrote the ""Fürstenbuch"", a first history of the city.\nWith the Luxembourg emperors, Prague became the imperial residence and Vienna stood in its shadow. The early Habsburgs attempted to extend it in order to keep up. Duke Albert II, for example, had the gothic choir of the Stephansdom built. In 1327, Frederick the Handsome published his edict allowing the city to maintain an Eisenbuch (iron book) listing its privileges.\n\nThe combination of the heraldic eagle with the city coat of arms showing a white cross in a red field is found on a seal dated 1327.\nThis heraldic emblem was in use throughout the 14th century in different variants.\nRudolf IV of Austria deserves credit for his prudent economic policy, which raised the level of prosperity. His epithet the Founder is due to two things: first, he founded the University of Vienna in 1365, and second, he began the construction of the gothic nave in the Stephansdom. The latter is connected to the creation of a metropolitan chapter, as a symbolic substitute for a bishop.\nThere was a period of inheritance disputes among the Habsburgs resulting not only in confusion, but also in an economic decline and social unrest, with disputes between the parties of patricians and artisans. While the patricians supported Ernest the Iron, the artisans supported Leopold IV. 
In 1408, the mayor Konrad Vorlauf, an exponent of the patrician party, was executed.\nAfter the election of Duke Albert V as German King Albert II, Vienna became the capital of the Holy Roman Empire. Albert's name is remembered for his expulsion of the Jewish population of Vienna in 1421/22.\nEventually, in 1469, Vienna was given its own bishop, and the Stephansdom became a cathedral. During the upheavals of the era of Emperor Frederick III, Vienna remained on the side of his opponents (first Albert VI, then Matthias Corvinus), as Frederick proved unable to maintain peace in the land vis-à-vis rampaging gangs of mercenaries (often remaining from the Hussite Wars).\nIn 1485, the Hungarian King Matthias Corvinus and the Black Army of Hungary conquered the city and Vienna became the king's seat that served as the capital of Hungary until 1490.\nIn 1522, under Ferdinand I, Holy Roman Emperor the Blood Judgment of Wiener Neustadt led to the execution of leading members of the opposition within the city, and thus a destruction of the political structures. From then on, the city stood under direct imperial control.\nIn 1556, Vienna became the seat of the Emperor, with Bohemia having been added to the Habsburg realm in 1526.\nDuring this time, the city was also recatholicised after having become Protestant rather quickly. In 1551, the Jesuits were brought to town and soon gained a large influence in court. The leader of the Counter-Reformation here was Melchior Khlesl, Bishop of Vienna from 1600.\n\n\n=== Turkish sieges ===\n\nIn 1529, Vienna was besieged by the Ottoman Turks for the first time (the First Turkish Siege), although unsuccessfully. The city, protected by medieval walls, only barely withstood the attacks, until epidemics and an early winter forced the Turks to retreat. The siege had shown that new fortifications were needed. Following plans by Sebastian Schrantz, Vienna was expanded to a fortress in 1548. 
The city was furnished with eleven bastions and surrounded by a moat. A glacis was created around Vienna, a broad strip without any buildings, which allowed defenders to fire freely. These fortifications, which accounted for the major part of building activities well into the 17th century, became decisive in the Second Turkish Siege of 1683, as they allowed the city to maintain itself for two months, until the Turkish army was defeated by the army led by the Polish King John III Sobieski. This was the turning point in the Turkish Wars, as the Ottoman Empire was pushed back more and more during the following decades.]","Rudolf IV of Austria contributed to the development of Vienna by founding the University of Vienna in 1365 and beginning the construction of the gothic nave in the Stephansdom. This shows his dedication to education and architecture, as he established a prestigious institution and initiated the building of a significant architectural structure in the city.",ai (gpt-3.5-turbo-0125),ai (gpt-3.5-turbo-0125)
4,"How did the population of Vienna change during the 18th century, and what were some of the key developments in terms of infrastructure and urban planning during this time?","[=== 18th century ===\n\nThe following period was characterised by extensive building activities. In the course of reconstruction, Vienna was largely turned into a baroque city. The most important architects were Johann Bernhard Fischer von Erlach and Johann Lukas von Hildebrandt. Most construction happened in the suburbs (Vorstädte), as the nobility began to cover the surrounding land with garden palaces, known as Palais. The best known are the Palais Liechtenstein, Palais Modena, Schönbrunn Palace, Palais Schwarzenberg, and the Belvedere (the garden palais of Prince Eugene of Savoy). In 1704, an outer fortification, the Linienwall, was built around the Vorstädte.\nAfter the extensive plague epidemics of 1679 and 1713, the population began to grow steadily. It is estimated that 150,000 people lived in Vienna in 1724, and 200,000 in 1790. At that time, the first factories were built, starting in Leopoldstadt. Leopoldstadt also became a site where many Jews lived, as they had been driven out of their 50-year-old ghetto in 1670. Hygienic problems began to become noticeable: sewers and street cleaning began to develop. Also in this time, the first house numbers (the Konskriptionsnummern) were issued, and the government postal system began to develop.\nUnder Emperor Joseph II, the city administration was modernized in 1783: officials in charge of only the city were introduced, and the Magistrate was created (More information about the Magistrate of the City of Vienna specifically can be found in German at de:Magistrat der Stadt Wien.). At the same time, the graveyards within the city were closed.\n\n\n=== 19th century ===\n\nDuring the Napoleonic Wars, Vienna was taken by the French twice, in 1805 and 1809. The first conquest happened without a battle. 
Three French marshals crossed the strongly defended Taborbrücke (Tábor bridge), the only Danube bridge at that time, and convinced the Austrian commander that the war was already over. In the meantime, the French army easily entered the city and was greeted by the population with interest rather than rejection. Napoleon allowed 10,000 men of the Vienna national guard to remain armed and left the arsenal to them when he left, as complete as he had found it.\nHowever, the second occupation happened only after heavy fire. Shortly after, Napoleon suffered his first large defeat at Aspern, nearby. Less than two months later, his army crossed the Danube again and fought the Battle of Wagram on the same terrain as the previous Battle of Aspern. This second battle resulted in a victory for the French, and Austria soon surrendered, ending the War of the Fifth Coalition. In 1810, Salomon Mayer Rothschild arrived in Vienna from Frankfurt and sets up a bank named ""Mayer von Rothschild und Söhne"". The Emperor of Austria in 1823, made the five Rothschild brothers barons. The Rothschild family became famous as bankers in the major countries of Europe, and the Rothschild banking family of Austria remained prominent until the Creditanstalt bank in Vienna was confiscated by the Nazis in 1938.\nAfter Napoleon's final defeat, the Congress of Vienna took place from September 18, 1814 to June 9, 1815, in which the political map of Europe was redrawn. The congress members indulged in many social events, which induced the witty Charles Joseph, Prince de Ligne to famously say: Le congres danse beaucoup, mais il ne marche pas (""The congress dances, but does not progress""). 
The events cost Austria a great deal of money, which was reflected in mockery about the major participants:\n\nAlexander of Russia: loves for all\nFrederick William of Prussia: thinks for all\nFrederick of Denmark: speaks for all\nMaximilian of Bavaria: drinks for all\nFrederick of Württemberg: eats for all\nEmperor Francis of Austria: pays for all\nThe first half of the century was characterised by intensive industrialization, with Vienna being the center of the railway network after 1837.\nThe French February Revolution of 1848 had an effect as far away as Vienna: on March 13, the March Revolution, which forced long-serving chancellor Metternich to resign.\nDuring the 19th century, Vienna, along with Budapest, became one of the main centers of the Aromanian diaspora. The Aromanian population of these cities stands out for one of the first ones to develop a strictly Aromanian identity.]","During the 18th century, the population of Vienna grew steadily, estimated to be 150,000 in 1724 and 200,000 in 1790. This growth was accompanied by key developments in infrastructure and urban planning. The city saw extensive building activities, with Vienna being largely transformed into a baroque city. The nobility began constructing garden palaces in the suburbs, such as Palais Liechtenstein, Schönbrunn Palace, and the Belvedere. Additionally, the Linienwall, an outer fortification, was built around the Vorstädte in 1704. Hygienic improvements began to be implemented, including the development of sewers and street cleaning. The city administration was modernized under Emperor Joseph II in 1783, with the introduction of officials specifically in charge of the city and the creation of the Magistrate. This period also saw the closure of graveyards within the city and the issuance of the first house numbers.",ai (gpt-3.5-turbo-0125),ai (gpt-3.5-turbo-0125)


In [98]:
rag_dataset.save_json("../Data/Vienna_dataset.json")

### 1.7 Context Relevancy and Answer Relevancy  
`AnswerRelevancyEvaluator` and `ContextRelevancyEvaluator` measure how relevant a generated answer and the retrieved contexts, respectively, are to a given user query. Both evaluators return a `score` between 0 and 1 along with generated `feedback` explaining the score; a higher score means higher relevancy. We prompt the judge LLM to work step by step toward a relevancy score by answering the following two questions about a generated answer to a query (for context relevancy the questions are slightly adjusted):

1. Does the provided response match the subject matter of the user's query?
2. Does the provided response attempt to address the focus or perspective on the subject matter taken on by the user's query?

Each question is worth 1 point, so for answer relevancy a perfect evaluation yields 2/2, which corresponds to the maximum `score` of 1.
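To make the relationship between the point total and the 0 to 1 `score` concrete, here is a minimal sketch. It assumes the reported score is simply the points awarded divided by the maximum number of points; the function name `normalized_relevancy_score` is hypothetical and not part of LlamaIndex.

```python
def normalized_relevancy_score(points_awarded: float, max_points: float = 2.0) -> float:
    """Scale a raw point total onto the [0, 1] range (assumed normalization)."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    if not 0 <= points_awarded <= max_points:
        raise ValueError("points_awarded must lie in [0, max_points]")
    return points_awarded / max_points


# Two questions, each worth one point.
both_satisfied = normalized_relevancy_score(2)  # judge answered "yes" to both
one_satisfied = normalized_relevancy_score(1)   # judge answered "yes" to one
print(both_satisfied, one_satisfied)
```

Under this assumption, an answer that matches the subject matter but misses the focus of the query would land at the midpoint of the scale.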

First, we build a RAG over the same source documents used to create the `rag_dataset`.

In [99]:
index = VectorStoreIndex.from_documents(documents=documents)
query_engine = index.as_query_engine()

With our RAG (i.e., `query_engine`) defined, we can make predictions (i.e., generate responses to the queries) with it over the `rag_dataset`.

In [100]:
prediction_dataset = await rag_dataset.amake_predictions_with(
    predictor=query_engine, batch_size=100, show_progress=True
)



### Evaluating Answer and Context Relevancy Separately

We first need to define our evaluators (i.e. `AnswerRelevancyEvaluator` & `ContextRelevancyEvaluator`):

In [101]:
# instantiate the judge LLMs (gpt-3.5-turbo for answer relevancy, gpt-4 for context relevancy)
from llama_index.llms.openai import OpenAI
from llama_index.core.evaluation import (
    AnswerRelevancyEvaluator,
    ContextRelevancyEvaluator,
)

judges = {}

judges["answer_relevancy"] = AnswerRelevancyEvaluator(
    llm=OpenAI(temperature=0, model="gpt-3.5-turbo"),
)

judges["context_relevancy"] = ContextRelevancyEvaluator(
    llm=OpenAI(temperature=0, model="gpt-4"),
)

Now we can use our evaluators to make evaluations by looping through all of the <example, prediction> pairs.

In [118]:
eval_tasks = []
for example, prediction in zip(
    rag_dataset.examples, prediction_dataset.predictions
):
    eval_tasks.append(
        judges["answer_relevancy"].aevaluate(
            query=example.query,
            response=prediction.response,
            sleep_time_in_seconds=10.0,
        )
    )
    eval_tasks.append(
        judges["context_relevancy"].aevaluate(
            query=example.query,
            contexts=prediction.contexts,
            sleep_time_in_seconds=10.0,
        )
    )

In [119]:
# run the evaluation tasks in batches (8 per batch; the last slice takes the remainder)
eval_results1 = await tqdm_asyncio.gather(*eval_tasks[:8])
eval_results2 = await tqdm_asyncio.gather(*eval_tasks[8:16])
eval_results3 = await tqdm_asyncio.gather(*eval_tasks[16:24])
eval_results4 = await tqdm_asyncio.gather(*eval_tasks[24:])
eval_results = eval_results1 + eval_results2 + eval_results3 + eval_results4
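The fixed slices above can be generalized into a small helper that awaits the tasks one batch at a time. The sketch below uses only the standard library `asyncio.gather`; `gather_in_batches` and `fake_eval` are hypothetical names introduced for illustration, and `fake_eval` merely stands in for an `aevaluate` call.

```python
import asyncio
from typing import Any, Awaitable, List, Sequence


async def gather_in_batches(
    tasks: Sequence[Awaitable[Any]], batch_size: int
) -> List[Any]:
    """Await the tasks batch by batch, preserving their original order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results: List[Any] = []
    for start in range(0, len(tasks), batch_size):
        batch = tasks[start : start + batch_size]
        # asyncio.gather returns results in the order the awaitables were passed.
        results.extend(await asyncio.gather(*batch))
    return results


async def fake_eval(i: int) -> int:
    """Hypothetical stand-in for an evaluator call."""
    await asyncio.sleep(0)
    return i * i


async def main() -> List[int]:
    tasks = [fake_eval(i) for i in range(10)]
    return await gather_in_batches(tasks, batch_size=4)


batched_results = asyncio.run(main())
print(batched_results)
```

Inside a notebook, where an event loop is already running, one would `await gather_in_batches(eval_tasks, batch_size=8)` directly instead of calling `asyncio.run`.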





In [120]:
evals = {
    "answer_relevancy": eval_results[::2],
    "context_relevancy": eval_results[1::2],
}
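
The slicing works because the loop appended the tasks in alternating order (answer relevancy first, then context relevancy), and gathering preserves that order. A small sketch with placeholder strings, which are stand-ins for illustration and not real `EvaluationResult` objects:

```python
# Placeholder labels standing in for EvaluationResult objects (illustrative only).
interleaved = []
for i in range(3):
    interleaved.append(f"answer_{i}")   # appended first, lands on even indices
    interleaved.append(f"context_{i}")  # appended second, lands on odd indices

answer_results = interleaved[::2]    # indices 0, 2, 4
context_results = interleaved[1::2]  # indices 1, 3, 5

print(answer_results)   # ['answer_0', 'answer_1', 'answer_2']
print(context_results)  # ['context_0', 'context_1', 'context_2']
```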

### Taking a look at the evaluation results

Here we use a utility function to convert the list of `EvaluationResult` objects into something more notebook-friendly. The utility returns two DataFrames: a "deep" one containing every evaluation result, and one that aggregates the scores by taking their mean per evaluation method.

In [121]:
from llama_index.core.evaluation.notebook_utils import get_eval_results_df
import pandas as pd

deep_dfs = {}
mean_dfs = {}
for metric in evals.keys():
    deep_df, mean_df = get_eval_results_df(
        names=["baseline"] * len(evals[metric]),
        results_arr=evals[metric],
        metric=metric,
    )
    deep_dfs[metric] = deep_df
    mean_dfs[metric] = mean_df

In [122]:
mean_scores_df = pd.concat(
    [mdf.reset_index() for _, mdf in mean_dfs.items()],
    axis=0,
    ignore_index=True,
)
mean_scores_df = mean_scores_df.set_index("index")
mean_scores_df.index = mean_scores_df.index.set_names(["metrics"])
mean_scores_df

| metrics | rag: baseline |
|---|---|
| mean_answer_relevancy_score | 1.0 |
| mean_context_relevancy_score | 0.664062 |


The above utility also provides the mean score across all of the evaluations in `mean_df`.

We can get a look at the raw distribution of the scores by invoking `value_counts()` on the `deep_df`.

In [123]:
deep_dfs["answer_relevancy"]["scores"].value_counts()

1.0    16
Name: scores, dtype: int64

In [124]:
deep_dfs["context_relevancy"]["scores"].value_counts()

0.750    7
0.375    2
1.000    2
0.625    2
0.875    1
0.000    1
0.500    1
Name: scores, dtype: int64
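
As a sanity check, the mean context relevancy score can be recomputed by hand from these counts. The sketch below uses only the standard library and the numbers printed above. The variable names are introduced here for illustration.

```python
# Score -> count, copied from the value_counts() output above.
context_score_counts = {
    0.750: 7,
    0.375: 2,
    1.000: 2,
    0.625: 2,
    0.875: 1,
    0.000: 1,
    0.500: 1,
}

n_examples = sum(context_score_counts.values())
weighted_total = sum(
    score * count for score, count in context_score_counts.items()
)
mean_context_relevancy = weighted_total / n_examples

print(n_examples)              # 16
print(mean_context_relevancy)  # 0.6640625
```

The result agrees with the `mean_context_relevancy_score` of 0.664062 reported in the aggregated DataFrame.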

For the most part, the default RAG pipeline does well at generating answers that are relevant to the query: all 16 answer relevancy scores are 1.0. Context relevancy is more mixed, with a mean of about 0.66. To take a closer look, view the records of any of the `deep_df` DataFrames.

In [139]:
deep_dfs["context_relevancy"].head(2)

Unnamed: 0,rag,query,answer,contexts,scores,feedbacks
0,baseline,"What were the significant events surrounding the Third Crusade that took place near Vienna, and how did they impact the city's development?",,"[The history of Vienna has been long and varied, beginning when the Roman Empire created a military camp in the area now covered by Vienna's city centre. Vienna grew from the Roman settlement known as Vindobona to be an important trading site in the 11th century. It became the capital of the Babenberg dynasty and subsequently of the Austrian Habsburgs, under whom it became one of Europe's cultural hubs. During the 19th century as the capital of the Austrian Empire and later Austria-Hungary, it temporarily became one of Europe's biggest cities. Since the end of World War I, Vienna has been the capital of the Republic of Austria.\n\n\n== Beginnings and early Middle Ages ==\n\nThe name Vindobona derives from a Celtic language, suggesting that the region must have been inhabited before Roman times. The Romans created a military camp (occupied by Legio X Gemina) during the 1st century on the site of the city centre of present-day Vienna. The settlement was raised to the status of a municipium in 212. Even today, the streets of the First District show where the encampment placed its walls and moats. The Romans stayed until the 5th century.\nRoman Vindobona was located in the outskirts of the empire and thus fell prey to the chaos of the Migration Period. There are some indications that a catastrophic fire occurred around the beginning of the 5th century. However, the remains of the encampment were not deserted, and a small settlement remained. The streets and houses of early medieval Vienna followed the former Roman walls, which gives rise to the conclusion that parts of the fortification were still in place and used by the settlers.\nByzantine copper coins from the 6th century have been found several times in the area of today's city centre, indicating considerable trade activity. 
Graves from the 6th century were found during excavations next to the Berghof, in an area around Salvatorgasse. At that time, the Lombards controlled the area, with Slavs and Avars following later. Early Vienna was centred on the Berghof.\nThe first documented mention of the city during the Middle Ages is within the Salzburg Annals, dating to 881, when a battle apud Weniam was fought against the Magyars. However, it is unclear whether this refers to the city or the River Wien.\n\n\n== Babenberg rule ==\n\nIn 976, the Margraviate of Ostarrîchi was given to the Babenberg family. Vienna lay at its border with Hungary.\nVienna was an important site of trade as early as the 11th century. In the Exchange of Mautern between the Bishop of Passau and Margrave Leopold IV, Vienna is mentioned as a Civitas for the first time, which indicates the existence of a well-ordered settlement. \nIn 1155, Margrave Henry II of Austria made Vienna his capital. In 1156, Austria was raised to a duchy in the Privilegium Minus, with Vienna becoming the seat of all future dukes. During that time, the Schottenstift was founded.\nThe events surrounding the Third Crusade, during which King Richard the Lionheart was discovered and captured by Duke Leopold V the Virtuous two days before Christmas of 1192 in Erdberg near Vienna, brought an enormous ransom of 50,000 Silver Marks (about 10 to 12 tons of silver, about a third of the emperor's claims against the English. Richard had been extradited to him in March 1193). This allowed the creation of a mint and the construction of city walls around the year 1200. At the U-Bahn station Stubentor, some remains of the city walls can still be seen today. Because he had abused a protected crusader, Leopold V was excommunicated by Pope Celestine III, and died (without having been absolved) after falling from a horse in a tournament.\nIn 1221, Vienna received the rights of a city and as a staple port (Stapelrecht). 
This meant that all traders passing through Vienna had to offer their goods in the city. This allowed the Viennese to act as middlemen in trade, so that Vienna soon created a network of far-reaching trade relations, particularly along the Danube basin and to Venice, and to become one of the most important cities in the Holy Roman Empire.\n\nHowever, it was considered embarrassing that Vienna did not have its own bishop. It is known that Duke Frederick II negotiated about the creation of a bishopric in Vienna, and the same is suspected of Ottokar Přemysl., == Habsburg rule ==\n\nIn 1278, Rudolf I took control over the Austrian lands after his victory over Ottokar II of Bohemia and began to establish Habsburg rule. In Vienna, it took a relatively long time for the Habsburgs to establish their control, because partisans of Ottokar remained strong for a long time. There were several uprisings against Albert I. The family of the Paltrams vom Stephansfreithof was foremost among the insurgents.\nIn 1280, Jans der Enikel wrote the ""Fürstenbuch"", a first history of the city.\nWith the Luxembourg emperors, Prague became the imperial residence and Vienna stood in its shadow. The early Habsburgs attempted to extend it in order to keep up. Duke Albert II, for example, had the gothic choir of the Stephansdom built. In 1327, Frederick the Handsome published his edict allowing the city to maintain an Eisenbuch (iron book) listing its privileges.\n\nThe combination of the heraldic eagle with the city coat of arms showing a white cross in a red field is found on a seal dated 1327.\nThis heraldic emblem was in use throughout the 14th century in different variants.\nRudolf IV of Austria deserves credit for his prudent economic policy, which raised the level of prosperity. His epithet the Founder is due to two things: first, he founded the University of Vienna in 1365, and second, he began the construction of the gothic nave in the Stephansdom. 
The latter is connected to the creation of a metropolitan chapter, as a symbolic substitute for a bishop.\nThere was a period of inheritance disputes among the Habsburgs resulting not only in confusion, but also in an economic decline and social unrest, with disputes between the parties of patricians and artisans. While the patricians supported Ernest the Iron, the artisans supported Leopold IV. In 1408, the mayor Konrad Vorlauf, an exponent of the patrician party, was executed.\nAfter the election of Duke Albert V as German King Albert II, Vienna became the capital of the Holy Roman Empire. Albert's name is remembered for his expulsion of the Jewish population of Vienna in 1421/22.\nEventually, in 1469, Vienna was given its own bishop, and the Stephansdom became a cathedral. During the upheavals of the era of Emperor Frederick III, Vienna remained on the side of his opponents (first Albert VI, then Matthias Corvinus), as Frederick proved unable to maintain peace in the land vis-à-vis rampaging gangs of mercenaries (often remaining from the Hussite Wars).\nIn 1485, the Hungarian King Matthias Corvinus and the Black Army of Hungary conquered the city and Vienna became the king's seat that served as the capital of Hungary until 1490.\nIn 1522, under Ferdinand I, Holy Roman Emperor the Blood Judgment of Wiener Neustadt led to the execution of leading members of the opposition within the city, and thus a destruction of the political structures. From then on, the city stood under direct imperial control.\nIn 1556, Vienna became the seat of the Emperor, with Bohemia having been added to the Habsburg realm in 1526.\nDuring this time, the city was also recatholicised after having become Protestant rather quickly. In 1551, the Jesuits were brought to town and soon gained a large influence in court. 
The leader of the Counter-Reformation here was Melchior Khlesl, Bishop of Vienna from 1600.\n\n\n=== Turkish sieges ===\n\nIn 1529, Vienna was besieged by the Ottoman Turks for the first time (the First Turkish Siege), although unsuccessfully. The city, protected by medieval walls, only barely withstood the attacks, until epidemics and an early winter forced the Turks to retreat. The siege had shown that new fortifications were needed. Following plans by Sebastian Schrantz, Vienna was expanded to a fortress in 1548. The city was furnished with eleven bastions and surrounded by a moat. A glacis was created around Vienna, a broad strip without any buildings, which allowed defenders to fire freely. These fortifications, which accounted for the major part of building activities well into the 17th century, became decisive in the Second Turkish Siege of 1683, as they allowed the city to maintain itself for two months, until the Turkish army was defeated by the army led by the Polish King John III Sobieski. This was the turning point in the Turkish Wars, as the Ottoman Empire was pushed back more and more during the following decades.]",0.75,"The retrieved context does match the subject matter of the user's query. It provides a detailed history of Vienna, including the significant events surrounding the Third Crusade. The context mentions that during the Third Crusade, King Richard the Lionheart was captured near Vienna, which led to a large ransom that funded the construction of city walls and a mint. This event significantly impacted the city's development. However, the context does not provide a comprehensive answer to the user's query as it does not detail other significant events surrounding the Third Crusade that took place near Vienna. \n\n1. Does the retrieved context match the subject matter of the user's query? Yes (2/2)\n2. Can the retrieved context be used exclusively to provide a full answer to the user's query? Partially (1/2)\n\n[RESULT] 3/4"
1,baseline,"How did Vienna establish itself as an important trading site in the 11th century, and what privileges did the city receive that contributed to its economic growth during the Middle Ages?",,"[The history of Vienna has been long and varied, beginning when the Roman Empire created a military camp in the area now covered by Vienna's city centre. Vienna grew from the Roman settlement known as Vindobona to be an important trading site in the 11th century. It became the capital of the Babenberg dynasty and subsequently of the Austrian Habsburgs, under whom it became one of Europe's cultural hubs. During the 19th century as the capital of the Austrian Empire and later Austria-Hungary, it temporarily became one of Europe's biggest cities. Since the end of World War I, Vienna has been the capital of the Republic of Austria.\n\n\n== Beginnings and early Middle Ages ==\n\nThe name Vindobona derives from a Celtic language, suggesting that the region must have been inhabited before Roman times. The Romans created a military camp (occupied by Legio X Gemina) during the 1st century on the site of the city centre of present-day Vienna. The settlement was raised to the status of a municipium in 212. Even today, the streets of the First District show where the encampment placed its walls and moats. The Romans stayed until the 5th century.\nRoman Vindobona was located in the outskirts of the empire and thus fell prey to the chaos of the Migration Period. There are some indications that a catastrophic fire occurred around the beginning of the 5th century. However, the remains of the encampment were not deserted, and a small settlement remained. 
The streets and houses of early medieval Vienna followed the former Roman walls, which gives rise to the conclusion that parts of the fortification were still in place and used by the settlers.\nByzantine copper coins from the 6th century have been found several times in the area of today's city centre, indicating considerable trade activity. Graves from the 6th century were found during excavations next to the Berghof, in an area around Salvatorgasse. At that time, the Lombards controlled the area, with Slavs and Avars following later. Early Vienna was centred on the Berghof.\nThe first documented mention of the city during the Middle Ages is within the Salzburg Annals, dating to 881, when a battle apud Weniam was fought against the Magyars. However, it is unclear whether this refers to the city or the River Wien.\n\n\n== Babenberg rule ==\n\nIn 976, the Margraviate of Ostarrîchi was given to the Babenberg family. Vienna lay at its border with Hungary.\nVienna was an important site of trade as early as the 11th century. In the Exchange of Mautern between the Bishop of Passau and Margrave Leopold IV, Vienna is mentioned as a Civitas for the first time, which indicates the existence of a well-ordered settlement. \nIn 1155, Margrave Henry II of Austria made Vienna his capital. In 1156, Austria was raised to a duchy in the Privilegium Minus, with Vienna becoming the seat of all future dukes. During that time, the Schottenstift was founded.\nThe events surrounding the Third Crusade, during which King Richard the Lionheart was discovered and captured by Duke Leopold V the Virtuous two days before Christmas of 1192 in Erdberg near Vienna, brought an enormous ransom of 50,000 Silver Marks (about 10 to 12 tons of silver, about a third of the emperor's claims against the English. Richard had been extradited to him in March 1193). This allowed the creation of a mint and the construction of city walls around the year 1200. 
At the U-Bahn station Stubentor, some remains of the city walls can still be seen today. Because he had abused a protected crusader, Leopold V was excommunicated by Pope Celestine III, and died (without having been absolved) after falling from a horse in a tournament.\nIn 1221, Vienna received the rights of a city and as a staple port (Stapelrecht). This meant that all traders passing through Vienna had to offer their goods in the city. This allowed the Viennese to act as middlemen in trade, so that Vienna soon created a network of far-reaching trade relations, particularly along the Danube basin and to Venice, and to become one of the most important cities in the Holy Roman Empire.\n\nHowever, it was considered embarrassing that Vienna did not have its own bishop. It is known that Duke Frederick II negotiated about the creation of a bishopric in Vienna, and the same is suspected of Ottokar Přemysl., == Habsburg rule ==\n\nIn 1278, Rudolf I took control over the Austrian lands after his victory over Ottokar II of Bohemia and began to establish Habsburg rule. In Vienna, it took a relatively long time for the Habsburgs to establish their control, because partisans of Ottokar remained strong for a long time. There were several uprisings against Albert I. The family of the Paltrams vom Stephansfreithof was foremost among the insurgents.\nIn 1280, Jans der Enikel wrote the ""Fürstenbuch"", a first history of the city.\nWith the Luxembourg emperors, Prague became the imperial residence and Vienna stood in its shadow. The early Habsburgs attempted to extend it in order to keep up. Duke Albert II, for example, had the gothic choir of the Stephansdom built. 
In 1327, Frederick the Handsome published his edict allowing the city to maintain an Eisenbuch (iron book) listing its privileges.\n\nThe combination of the heraldic eagle with the city coat of arms showing a white cross in a red field is found on a seal dated 1327.\nThis heraldic emblem was in use throughout the 14th century in different variants.\nRudolf IV of Austria deserves credit for his prudent economic policy, which raised the level of prosperity. His epithet the Founder is due to two things: first, he founded the University of Vienna in 1365, and second, he began the construction of the gothic nave in the Stephansdom. The latter is connected to the creation of a metropolitan chapter, as a symbolic substitute for a bishop.\nThere was a period of inheritance disputes among the Habsburgs resulting not only in confusion, but also in an economic decline and social unrest, with disputes between the parties of patricians and artisans. While the patricians supported Ernest the Iron, the artisans supported Leopold IV. In 1408, the mayor Konrad Vorlauf, an exponent of the patrician party, was executed.\nAfter the election of Duke Albert V as German King Albert II, Vienna became the capital of the Holy Roman Empire. Albert's name is remembered for his expulsion of the Jewish population of Vienna in 1421/22.\nEventually, in 1469, Vienna was given its own bishop, and the Stephansdom became a cathedral. 
During the upheavals of the era of Emperor Frederick III, Vienna remained on the side of his opponents (first Albert VI, then Matthias Corvinus), as Frederick proved unable to maintain peace in the land vis-à-vis rampaging gangs of mercenaries (often remaining from the Hussite Wars).\nIn 1485, the Hungarian King Matthias Corvinus and the Black Army of Hungary conquered the city and Vienna became the king's seat that served as the capital of Hungary until 1490.\nIn 1522, under Ferdinand I, Holy Roman Emperor the Blood Judgment of Wiener Neustadt led to the execution of leading members of the opposition within the city, and thus a destruction of the political structures. From then on, the city stood under direct imperial control.\nIn 1556, Vienna became the seat of the Emperor, with Bohemia having been added to the Habsburg realm in 1526.\nDuring this time, the city was also recatholicised after having become Protestant rather quickly. In 1551, the Jesuits were brought to town and soon gained a large influence in court. The leader of the Counter-Reformation here was Melchior Khlesl, Bishop of Vienna from 1600.\n\n\n=== Turkish sieges ===\n\nIn 1529, Vienna was besieged by the Ottoman Turks for the first time (the First Turkish Siege), although unsuccessfully. The city, protected by medieval walls, only barely withstood the attacks, until epidemics and an early winter forced the Turks to retreat. The siege had shown that new fortifications were needed. Following plans by Sebastian Schrantz, Vienna was expanded to a fortress in 1548. The city was furnished with eleven bastions and surrounded by a moat. A glacis was created around Vienna, a broad strip without any buildings, which allowed defenders to fire freely. 
These fortifications, which accounted for the major part of building activities well into the 17th century, became decisive in the Second Turkish Siege of 1683, as they allowed the city to maintain itself for two months, until the Turkish army was defeated by the army led by the Polish King John III Sobieski. This was the turning point in the Turkish Wars, as the Ottoman Empire was pushed back more and more during the following decades.]",0.75,"The retrieved context does match the subject matter of the user's query. It provides a detailed history of Vienna, including its establishment as an important trading site in the 11th century and the privileges it received that contributed to its economic growth during the Middle Ages. The context mentions that Vienna was an important site of trade as early as the 11th century and that in 1221, Vienna received the rights of a city and as a staple port. This meant that all traders passing through Vienna had to offer their goods in the city, allowing the Viennese to act as middlemen in trade. \n\nHowever, the context does not provide a full answer to the user's query. While it does mention that Vienna became an important trading site in the 11th century and received certain privileges, it does not explain how Vienna established itself as a trading site. The context also does not provide a comprehensive list of the privileges that the city received during the Middle Ages that contributed to its economic growth. \n\n[RESULT] 3/4"
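
Each feedback string ends with a `[RESULT] x/4` marker, and the reported scores are consistent with dividing x by 4 (3/4 gives 0.75). The sketch below illustrates that relationship. `parse_result_line` is a hypothetical helper written for this example. It is not claimed to be how the LlamaIndex evaluators parse their output internally.

```python
import re

# Matches markers such as "[RESULT] 3/4" or "[RESULT] 1.5/4".
RESULT_PATTERN = re.compile(
    r"\[RESULT\]\s*([0-9]*\.?[0-9]+)\s*/\s*([0-9]*\.?[0-9]+)"
)


def parse_result_line(feedback):
    """Return earned/possible from a '[RESULT] a/b' marker, or None if absent."""
    match = RESULT_PATTERN.search(feedback)
    if match is None:
        return None
    earned, possible = float(match.group(1)), float(match.group(2))
    if possible == 0:
        return None  # avoid dividing by zero on a malformed marker
    return earned / possible


print(parse_result_line("... Partially (1/2)\n\n[RESULT] 3/4"))  # 0.75
print(parse_result_line("...\n\n[RESULT] 1.5/4"))                # 0.375
```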


You can, of course, apply any filters you like. For example, to look at the examples that yielded less-than-perfect results (here, context relevancy scores below 0.6):

In [138]:
cond = deep_dfs["context_relevancy"]["scores"] < 0.6
deep_dfs["context_relevancy"][cond].head(5)

Unnamed: 0,rag,query,answer,contexts,scores,feedbacks
3,baseline,"How did Rudolf IV of Austria contribute to the development of Vienna, particularly in terms of education and architecture?",,"[== Habsburg rule ==\n\nIn 1278, Rudolf I took control over the Austrian lands after his victory over Ottokar II of Bohemia and began to establish Habsburg rule. In Vienna, it took a relatively long time for the Habsburgs to establish their control, because partisans of Ottokar remained strong for a long time. There were several uprisings against Albert I. The family of the Paltrams vom Stephansfreithof was foremost among the insurgents.\nIn 1280, Jans der Enikel wrote the ""Fürstenbuch"", a first history of the city.\nWith the Luxembourg emperors, Prague became the imperial residence and Vienna stood in its shadow. The early Habsburgs attempted to extend it in order to keep up. Duke Albert II, for example, had the gothic choir of the Stephansdom built. In 1327, Frederick the Handsome published his edict allowing the city to maintain an Eisenbuch (iron book) listing its privileges.\n\nThe combination of the heraldic eagle with the city coat of arms showing a white cross in a red field is found on a seal dated 1327.\nThis heraldic emblem was in use throughout the 14th century in different variants.\nRudolf IV of Austria deserves credit for his prudent economic policy, which raised the level of prosperity. His epithet the Founder is due to two things: first, he founded the University of Vienna in 1365, and second, he began the construction of the gothic nave in the Stephansdom. The latter is connected to the creation of a metropolitan chapter, as a symbolic substitute for a bishop.\nThere was a period of inheritance disputes among the Habsburgs resulting not only in confusion, but also in an economic decline and social unrest, with disputes between the parties of patricians and artisans. While the patricians supported Ernest the Iron, the artisans supported Leopold IV. 
In 1408, the mayor Konrad Vorlauf, an exponent of the patrician party, was executed.\nAfter the election of Duke Albert V as German King Albert II, Vienna became the capital of the Holy Roman Empire. Albert's name is remembered for his expulsion of the Jewish population of Vienna in 1421/22.\nEventually, in 1469, Vienna was given its own bishop, and the Stephansdom became a cathedral. During the upheavals of the era of Emperor Frederick III, Vienna remained on the side of his opponents (first Albert VI, then Matthias Corvinus), as Frederick proved unable to maintain peace in the land vis-à-vis rampaging gangs of mercenaries (often remaining from the Hussite Wars).\nIn 1485, the Hungarian King Matthias Corvinus and the Black Army of Hungary conquered the city and Vienna became the king's seat that served as the capital of Hungary until 1490.\nIn 1522, under Ferdinand I, Holy Roman Emperor the Blood Judgment of Wiener Neustadt led to the execution of leading members of the opposition within the city, and thus a destruction of the political structures. From then on, the city stood under direct imperial control.\nIn 1556, Vienna became the seat of the Emperor, with Bohemia having been added to the Habsburg realm in 1526.\nDuring this time, the city was also recatholicised after having become Protestant rather quickly. In 1551, the Jesuits were brought to town and soon gained a large influence in court. The leader of the Counter-Reformation here was Melchior Khlesl, Bishop of Vienna from 1600.\n\n\n=== Turkish sieges ===\n\nIn 1529, Vienna was besieged by the Ottoman Turks for the first time (the First Turkish Siege), although unsuccessfully. The city, protected by medieval walls, only barely withstood the attacks, until epidemics and an early winter forced the Turks to retreat. The siege had shown that new fortifications were needed. Following plans by Sebastian Schrantz, Vienna was expanded to a fortress in 1548. 
The city was furnished with eleven bastions and surrounded by a moat. A glacis was created around Vienna, a broad strip without any buildings, which allowed defenders to fire freely. These fortifications, which accounted for the major part of building activities well into the 17th century, became decisive in the Second Turkish Siege of 1683, as they allowed the city to maintain itself for two months, until the Turkish army was defeated by the army led by the Polish King John III Sobieski. This was the turning point in the Turkish Wars, as the Ottoman Empire was pushed back more and more during the following decades., === 18th century ===\n\nThe following period was characterised by extensive building activities. In the course of reconstruction, Vienna was largely turned into a baroque city. The most important architects were Johann Bernhard Fischer von Erlach and Johann Lukas von Hildebrandt. Most construction happened in the suburbs (Vorstädte), as the nobility began to cover the surrounding land with garden palaces, known as Palais. The best known are the Palais Liechtenstein, Palais Modena, Schönbrunn Palace, Palais Schwarzenberg, and the Belvedere (the garden palais of Prince Eugene of Savoy). In 1704, an outer fortification, the Linienwall, was built around the Vorstädte.\nAfter the extensive plague epidemics of 1679 and 1713, the population began to grow steadily. It is estimated that 150,000 people lived in Vienna in 1724, and 200,000 in 1790. At that time, the first factories were built, starting in Leopoldstadt. Leopoldstadt also became a site where many Jews lived, as they had been driven out of their 50-year-old ghetto in 1670. Hygienic problems began to become noticeable: sewers and street cleaning began to develop. 
Also in this time, the first house numbers (the Konskriptionsnummern) were issued, and the government postal system began to develop.\nUnder Emperor Joseph II, the city administration was modernized in 1783: officials in charge of only the city were introduced, and the Magistrate was created (More information about the Magistrate of the City of Vienna specifically can be found in German at de:Magistrat der Stadt Wien.). At the same time, the graveyards within the city were closed.\n\n\n=== 19th century ===\n\nDuring the Napoleonic Wars, Vienna was taken by the French twice, in 1805 and 1809. The first conquest happened without a battle. Three French marshals crossed the strongly defended Taborbrücke (Tábor bridge), the only Danube bridge at that time, and convinced the Austrian commander that the war was already over. In the meantime, the French army easily entered the city and was greeted by the population with interest rather than rejection. Napoleon allowed 10,000 men of the Vienna national guard to remain armed and left the arsenal to them when he left, as complete as he had found it.\nHowever, the second occupation happened only after heavy fire. Shortly after, Napoleon suffered his first large defeat at Aspern, nearby. Less than two months later, his army crossed the Danube again and fought the Battle of Wagram on the same terrain as the previous Battle of Aspern. This second battle resulted in a victory for the French, and Austria soon surrendered, ending the War of the Fifth Coalition. In 1810, Salomon Mayer Rothschild arrived in Vienna from Frankfurt and sets up a bank named ""Mayer von Rothschild und Söhne"". The Emperor of Austria in 1823, made the five Rothschild brothers barons. 
The Rothschild family became famous as bankers in the major countries of Europe, and the Rothschild banking family of Austria remained prominent until the Creditanstalt bank in Vienna was confiscated by the Nazis in 1938.\nAfter Napoleon's final defeat, the Congress of Vienna took place from September 18, 1814 to June 9, 1815, in which the political map of Europe was redrawn. The congress members indulged in many social events, which induced the witty Charles Joseph, Prince de Ligne to famously say: Le congres danse beaucoup, mais il ne marche pas (""The congress dances, but does not progress""). The events cost Austria a great deal of money, which was reflected in mockery about the major participants:\n\nAlexander of Russia: loves for all\nFrederick William of Prussia: thinks for all\nFrederick of Denmark: speaks for all\nMaximilian of Bavaria: drinks for all\nFrederick of Württemberg: eats for all\nEmperor Francis of Austria: pays for all\nThe first half of the century was characterised by intensive industrialization, with Vienna being the center of the railway network after 1837.\nThe French February Revolution of 1848 had an effect as far away as Vienna: on March 13, the March Revolution, which forced long-serving chancellor Metternich to resign.\nDuring the 19th century, Vienna, along with Budapest, became one of the main centers of the Aromanian diaspora. The Aromanian population of these cities stands out for one of the first ones to develop a strictly Aromanian identity.]",0.375,"The retrieved context does match the subject matter of the user's query. It provides information about Rudolf IV of Austria, his economic policy, and his contributions to the development of Vienna, particularly in terms of education and architecture. Rudolf IV is credited for founding the University of Vienna in 1365 and beginning the construction of the gothic nave in the Stephansdom. 
However, the context does not provide a full answer to the user's query as it does not delve into the details of how these contributions impacted the development of Vienna. It also does not provide a comprehensive overview of Rudolf IV's contributions to education and architecture in Vienna. \n\n[RESULT] 1.5/4"
12,baseline,"In what ways did the cultural and political landscape of Vienna change during the late nineteenth and early twentieth centuries, as discussed in the readings?",,"[Schorske, Carl E. Fin-de-siècle Vienna: politics and culture (1979)\nSilverman, Lisa. Becoming Austrians: Jews and Culture between the World Wars (Oxford UP, 2012), focus on Vienna.\nUhl, Heidemarie. ""Museums as Engines of Identity: 'Vienna around 1900' and Exhibitionary Cultures in Vienna—A Comment."" Austrian History Yearbook 46 (2015): 97-105.\nWagner-Trenkwitz, Christoph. A Sound Tradition: A Short History of the Vienna Philharmonic Orchestra (Amalthea Signum Verlag, 2017).\nWasserman, Janek. ""The Austro-Marxist struggle for 'intellectual workers': the lost debate on the question of intellectuals in interwar Vienna."" Modern Intellectual History 9.2 (2012): 361-388.\nWistrich, Robert S. ""Karl Lueger and the Ambiguities of Viennese Antisemitism."" Jewish Social Studies 45.3/4 (1983): 251-262. online\nYales, W. E. Theatre in Vienna: A Critical History, 1776-1995 (Cambridge University Press, 1996)\n\n\n=== Historiography and Memory ===\nArens, Katherine. Belle Necropolis: Ghosts of Imperial Vienna (2014), art and memory\nBeller, Steven. Rethinking Vienna 1900 (2001)\nJovanović, Miloš. ""Whitewashed empire: Historical narrative and place marketing in Vienna."" History and Anthropology 30.4 (2019): 460-476.\nPirker, Peter, Johannes Kramer, and Mathias Lichtenwagner. ""Transnational memory spaces in the making: World War II and holocaust remembrance in Vienna."" International Journal of Politics, Culture, and Society 32.4 (2019): 439-458. online\n\n\n== External links ==\n Media related to History of Vienna at Wikimedia Commons\n\nGeschichtewiki.wien.gv.at - Vienna History Wiki operated by the city of Vienna, == See also ==\nTimeline of Vienna\nHistory of Austria\nDistricts of Vienna\n\n\n== References ==\n\n\n== Further reading ==\n\nBaranello, Micaela. 
The Operetta Empire: Music Theater in Early Twentieth-Century Vienna (U of California Press, 2021).\nBeller, Steven. Vienna and the Jews 1867-1938: A Cultural History (Cambridge, 1989).\nBowman, William D. Priest and Parish in Vienna, 1780 to 1880 (2000).\nBoyer, John W. Culture and Political Crisis in Vienna: Christian Socialism in Power, 1897-1918 (U of Chicago Press, 1995).\nBoyer, John. Political Radicalism in Late Imperial Vienna: Origins of the Christian Social Movement, 1848-1897 (U of Chicago Press, 1981).\nBuklijas, Tatjana. ""Surgery and national identity in late nineteenth-century Vienna."" Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences 38.4 (2007): 756-774. online\nCoen, Deborah R. Vienna in the age of uncertainty: science, liberalism, and private life (U of Chicago Press, 2008).\nCsendes, Peter. Historical Dictionary of Vienna (Scarecrow Press, 1999).\nEmerson, Charles. 1913: In Search of the World Before the Great War (2013) compares Vienna to 20 major world cities on the eve of World War I; pp 87–109.\nGeehr, Richard S. Karl Lueger: Mayor of Fin de Siècle Vienna (Wayne State University Press, 1990)\nHamann, Brigette. Hitler's Vienna: A Dictator's Apprenticeship (Oxford P, 1999).\nHanák, Péter. The garden and the workshop: essays on the cultural history of Vienna and Budapest (Princeton University Press, 2014)\nHealy, Maureen. Vienna and the Fall of the Habsburg Empire: Total War and Everyday Life in World War I (2004).\nKarnes, Kevin C. ""Wagner, Klimt, and the Metaphysics of Creativity in fin-de-siècle Vienna."" Journal of the American Musicological Society 62.3 (2009): 647-697. online\nKarnes, Kevin. Music, criticism, and the challenge of history: Shaping modern musical thought in late nineteenth century Vienna (Oxford UP, 2008).\nKarnes, Kevin. A kingdom not of this world: Wagner, the arts, and utopian visions in fin-de-siècle Vienna (Oxford UP, 2013).\nMay, A.J. 
Vienna in the Age of Franz Joseph (U of Oklahoma Press, 1968).\nMillar, Simon and Peter Dennis. Vienna 1683: Christian Europe Repels the Ottomans (Osprey, 2008)\nMorton, Frederik. A Nervous Splendour: Vienna 1888-1889 (Little, Brown, 1979).\nOffenberger, Ilana Fritz. The Jews of Nazi Vienna, 1938-1945: Rescue and Destruction (Springer, 2017).\nParsons, Nicholas. Vienna: A Cultural History (2008).\nRampley, Matthew. The Vienna School of Art History: Empire and the Politics of Scholarship, 1847-1918 (Penn State Press, 2013).\nRegal, Wolfgang and Michael Nanut. Vienna A Doctor’s Guide: 15 walking tours through Vienna’s medical history (2007)\nRozenblit, Marsha. The Jews of Vienna, 1867-1914: Assimilation and Identity (State University of New York Press, 1984).\nSchorske, Carl E. Fin-de-siècle Vienna: politics and culture (1979)\nSilverman, Lisa. Becoming Austrians: Jews and Culture between the World Wars (Oxford UP, 2012), focus on Vienna.\nUhl, Heidemarie. ""Museums as Engines of Identity: 'Vienna around 1900' and Exhibitionary Cultures in Vienna—A Comment."" Austrian History Yearbook 46 (2015): 97-105.\nWagner-Trenkwitz, Christoph. A Sound Tradition: A Short History of the Vienna Philharmonic Orchestra (Amalthea Signum Verlag, 2017).\nWasserman, Janek. ""The Austro-Marxist struggle for 'intellectual workers': the lost debate on the question of intellectuals in interwar Vienna.""]",0.375,"1. The retrieved context does match the subject matter of the user's query. The context provides a list of readings and resources that discuss the cultural and political landscape of Vienna during the late nineteenth and early twentieth centuries. The titles of the readings suggest they cover topics such as politics, culture, art, and history, which are relevant to the user's query. \n\n2. However, the retrieved context cannot be used exclusively to provide a full answer to the user's query. 
While the context provides a list of readings that discuss the subject matter, it does not provide any specific information or details about how the cultural and political landscape of Vienna changed during the specified time period. The user would need to read or research these sources to find the answer to their query.\n\n[RESULT] 1.5"
14,baseline,"How did the Vienna Philharmonic Orchestra contribute to the cultural history of Vienna, as discussed in Wagner-Trenkwitz's book ""A Sound Tradition: A Short History of the Vienna Philharmonic Orchestra""?",,"[Schorske, Carl E. Fin-de-siècle Vienna: politics and culture (1979)\nSilverman, Lisa. Becoming Austrians: Jews and Culture between the World Wars (Oxford UP, 2012), focus on Vienna.\nUhl, Heidemarie. ""Museums as Engines of Identity: 'Vienna around 1900' and Exhibitionary Cultures in Vienna—A Comment."" Austrian History Yearbook 46 (2015): 97-105.\nWagner-Trenkwitz, Christoph. A Sound Tradition: A Short History of the Vienna Philharmonic Orchestra (Amalthea Signum Verlag, 2017).\nWasserman, Janek. ""The Austro-Marxist struggle for 'intellectual workers': the lost debate on the question of intellectuals in interwar Vienna."" Modern Intellectual History 9.2 (2012): 361-388.\nWistrich, Robert S. ""Karl Lueger and the Ambiguities of Viennese Antisemitism."" Jewish Social Studies 45.3/4 (1983): 251-262. online\nYales, W. E. Theatre in Vienna: A Critical History, 1776-1995 (Cambridge University Press, 1996)\n\n\n=== Historiography and Memory ===\nArens, Katherine. Belle Necropolis: Ghosts of Imperial Vienna (2014), art and memory\nBeller, Steven. Rethinking Vienna 1900 (2001)\nJovanović, Miloš. ""Whitewashed empire: Historical narrative and place marketing in Vienna."" History and Anthropology 30.4 (2019): 460-476.\nPirker, Peter, Johannes Kramer, and Mathias Lichtenwagner. ""Transnational memory spaces in the making: World War II and holocaust remembrance in Vienna."" International Journal of Politics, Culture, and Society 32.4 (2019): 439-458. 
online\n\n\n== External links ==\n Media related to History of Vienna at Wikimedia Commons\n\nGeschichtewiki.wien.gv.at - Vienna History Wiki operated by the city of Vienna, == See also ==\nTimeline of Vienna\nHistory of Austria\nDistricts of Vienna\n\n\n== References ==\n\n\n== Further reading ==\n\nBaranello, Micaela. The Operetta Empire: Music Theater in Early Twentieth-Century Vienna (U of California Press, 2021).\nBeller, Steven. Vienna and the Jews 1867-1938: A Cultural History (Cambridge, 1989).\nBowman, William D. Priest and Parish in Vienna, 1780 to 1880 (2000).\nBoyer, John W. Culture and Political Crisis in Vienna: Christian Socialism in Power, 1897-1918 (U of Chicago Press, 1995).\nBoyer, John. Political Radicalism in Late Imperial Vienna: Origins of the Christian Social Movement, 1848-1897 (U of Chicago Press, 1981).\nBuklijas, Tatjana. ""Surgery and national identity in late nineteenth-century Vienna."" Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences 38.4 (2007): 756-774. online\nCoen, Deborah R. Vienna in the age of uncertainty: science, liberalism, and private life (U of Chicago Press, 2008).\nCsendes, Peter. Historical Dictionary of Vienna (Scarecrow Press, 1999).\nEmerson, Charles. 1913: In Search of the World Before the Great War (2013) compares Vienna to 20 major world cities on the eve of World War I; pp 87–109.\nGeehr, Richard S. Karl Lueger: Mayor of Fin de Siècle Vienna (Wayne State University Press, 1990)\nHamann, Brigette. Hitler's Vienna: A Dictator's Apprenticeship (Oxford P, 1999).\nHanák, Péter. The garden and the workshop: essays on the cultural history of Vienna and Budapest (Princeton University Press, 2014)\nHealy, Maureen. Vienna and the Fall of the Habsburg Empire: Total War and Everyday Life in World War I (2004).\nKarnes, Kevin C. 
""Wagner, Klimt, and the Metaphysics of Creativity in fin-de-siècle Vienna."" Journal of the American Musicological Society 62.3 (2009): 647-697. online\nKarnes, Kevin. Music, criticism, and the challenge of history: Shaping modern musical thought in late nineteenth century Vienna (Oxford UP, 2008).\nKarnes, Kevin. A kingdom not of this world: Wagner, the arts, and utopian visions in fin-de-siècle Vienna (Oxford UP, 2013).\nMay, A.J. Vienna in the Age of Franz Joseph (U of Oklahoma Press, 1968).\nMillar, Simon and Peter Dennis. Vienna 1683: Christian Europe Repels the Ottomans (Osprey, 2008)\nMorton, Frederik. A Nervous Splendour: Vienna 1888-1889 (Little, Brown, 1979).\nOffenberger, Ilana Fritz. The Jews of Nazi Vienna, 1938-1945: Rescue and Destruction (Springer, 2017).\nParsons, Nicholas. Vienna: A Cultural History (2008).\nRampley, Matthew. The Vienna School of Art History: Empire and the Politics of Scholarship, 1847-1918 (Penn State Press, 2013).\nRegal, Wolfgang and Michael Nanut. Vienna A Doctor’s Guide: 15 walking tours through Vienna’s medical history (2007)\nRozenblit, Marsha. The Jews of Vienna, 1867-1914: Assimilation and Identity (State University of New York Press, 1984).\nSchorske, Carl E. Fin-de-siècle Vienna: politics and culture (1979)\nSilverman, Lisa. Becoming Austrians: Jews and Culture between the World Wars (Oxford UP, 2012), focus on Vienna.\nUhl, Heidemarie. ""Museums as Engines of Identity: 'Vienna around 1900' and Exhibitionary Cultures in Vienna—A Comment."" Austrian History Yearbook 46 (2015): 97-105.\nWagner-Trenkwitz, Christoph. A Sound Tradition: A Short History of the Vienna Philharmonic Orchestra (Amalthea Signum Verlag, 2017).\nWasserman, Janek. ""The Austro-Marxist struggle for 'intellectual workers': the lost debate on the question of intellectuals in interwar Vienna.""]",0.0,"1. The retrieved context does not match the subject matter of the user's query. 
The user's query is specifically about the Vienna Philharmonic Orchestra's contribution to the cultural history of Vienna as discussed in Wagner-Trenkwitz's book ""A Sound Tradition: A Short History of the Vienna Philharmonic Orchestra"". The context provided does not provide any information about the orchestra's contribution to Vienna's cultural history. It only mentions the book title and author, but does not provide any content or discussion from the book. Therefore, the context is not relevant to the user's query. Score: 0/2.\n\n2. The retrieved context cannot be used exclusively to provide a full answer to the user's query. As mentioned above, the context does not provide any information about the Vienna Philharmonic Orchestra's contribution to the cultural history of Vienna. It only lists the book title and author, but does not provide any content or discussion from the book. Therefore, the context cannot be used to answer the user's query. Score: 0/2.\n\n[RESULT] 0.0"
15,baseline,"In what ways do scholars like Jovanović and Pirker explore the themes of historical narrative, memory, and remembrance in relation to Vienna's past, as seen in their respective articles on place marketing and transnational memory spaces?",,"[Schorske, Carl E. Fin-de-siècle Vienna: politics and culture (1979)\nSilverman, Lisa. Becoming Austrians: Jews and Culture between the World Wars (Oxford UP, 2012), focus on Vienna.\nUhl, Heidemarie. ""Museums as Engines of Identity: 'Vienna around 1900' and Exhibitionary Cultures in Vienna—A Comment."" Austrian History Yearbook 46 (2015): 97-105.\nWagner-Trenkwitz, Christoph. A Sound Tradition: A Short History of the Vienna Philharmonic Orchestra (Amalthea Signum Verlag, 2017).\nWasserman, Janek. ""The Austro-Marxist struggle for 'intellectual workers': the lost debate on the question of intellectuals in interwar Vienna."" Modern Intellectual History 9.2 (2012): 361-388.\nWistrich, Robert S. ""Karl Lueger and the Ambiguities of Viennese Antisemitism."" Jewish Social Studies 45.3/4 (1983): 251-262. online\nYales, W. E. Theatre in Vienna: A Critical History, 1776-1995 (Cambridge University Press, 1996)\n\n\n=== Historiography and Memory ===\nArens, Katherine. Belle Necropolis: Ghosts of Imperial Vienna (2014), art and memory\nBeller, Steven. Rethinking Vienna 1900 (2001)\nJovanović, Miloš. ""Whitewashed empire: Historical narrative and place marketing in Vienna."" History and Anthropology 30.4 (2019): 460-476.\nPirker, Peter, Johannes Kramer, and Mathias Lichtenwagner. ""Transnational memory spaces in the making: World War II and holocaust remembrance in Vienna."" International Journal of Politics, Culture, and Society 32.4 (2019): 439-458. 
online\n\n\n== External links ==\n Media related to History of Vienna at Wikimedia Commons\n\nGeschichtewiki.wien.gv.at - Vienna History Wiki operated by the city of Vienna, == See also ==\nTimeline of Vienna\nHistory of Austria\nDistricts of Vienna\n\n\n== References ==\n\n\n== Further reading ==\n\nBaranello, Micaela. The Operetta Empire: Music Theater in Early Twentieth-Century Vienna (U of California Press, 2021).\nBeller, Steven. Vienna and the Jews 1867-1938: A Cultural History (Cambridge, 1989).\nBowman, William D. Priest and Parish in Vienna, 1780 to 1880 (2000).\nBoyer, John W. Culture and Political Crisis in Vienna: Christian Socialism in Power, 1897-1918 (U of Chicago Press, 1995).\nBoyer, John. Political Radicalism in Late Imperial Vienna: Origins of the Christian Social Movement, 1848-1897 (U of Chicago Press, 1981).\nBuklijas, Tatjana. ""Surgery and national identity in late nineteenth-century Vienna."" Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences 38.4 (2007): 756-774. online\nCoen, Deborah R. Vienna in the age of uncertainty: science, liberalism, and private life (U of Chicago Press, 2008).\nCsendes, Peter. Historical Dictionary of Vienna (Scarecrow Press, 1999).\nEmerson, Charles. 1913: In Search of the World Before the Great War (2013) compares Vienna to 20 major world cities on the eve of World War I; pp 87–109.\nGeehr, Richard S. Karl Lueger: Mayor of Fin de Siècle Vienna (Wayne State University Press, 1990)\nHamann, Brigette. Hitler's Vienna: A Dictator's Apprenticeship (Oxford P, 1999).\nHanák, Péter. The garden and the workshop: essays on the cultural history of Vienna and Budapest (Princeton University Press, 2014)\nHealy, Maureen. Vienna and the Fall of the Habsburg Empire: Total War and Everyday Life in World War I (2004).\nKarnes, Kevin C. 
""Wagner, Klimt, and the Metaphysics of Creativity in fin-de-siècle Vienna."" Journal of the American Musicological Society 62.3 (2009): 647-697. online\nKarnes, Kevin. Music, criticism, and the challenge of history: Shaping modern musical thought in late nineteenth century Vienna (Oxford UP, 2008).\nKarnes, Kevin. A kingdom not of this world: Wagner, the arts, and utopian visions in fin-de-siècle Vienna (Oxford UP, 2013).\nMay, A.J. Vienna in the Age of Franz Joseph (U of Oklahoma Press, 1968).\nMillar, Simon and Peter Dennis. Vienna 1683: Christian Europe Repels the Ottomans (Osprey, 2008)\nMorton, Frederik. A Nervous Splendour: Vienna 1888-1889 (Little, Brown, 1979).\nOffenberger, Ilana Fritz. The Jews of Nazi Vienna, 1938-1945: Rescue and Destruction (Springer, 2017).\nParsons, Nicholas. Vienna: A Cultural History (2008).\nRampley, Matthew. The Vienna School of Art History: Empire and the Politics of Scholarship, 1847-1918 (Penn State Press, 2013).\nRegal, Wolfgang and Michael Nanut. Vienna A Doctor’s Guide: 15 walking tours through Vienna’s medical history (2007)\nRozenblit, Marsha. The Jews of Vienna, 1867-1914: Assimilation and Identity (State University of New York Press, 1984).\nSchorske, Carl E. Fin-de-siècle Vienna: politics and culture (1979)\nSilverman, Lisa. Becoming Austrians: Jews and Culture between the World Wars (Oxford UP, 2012), focus on Vienna.\nUhl, Heidemarie. ""Museums as Engines of Identity: 'Vienna around 1900' and Exhibitionary Cultures in Vienna—A Comment."" Austrian History Yearbook 46 (2015): 97-105.\nWagner-Trenkwitz, Christoph. A Sound Tradition: A Short History of the Vienna Philharmonic Orchestra (Amalthea Signum Verlag, 2017).\nWasserman, Janek. ""The Austro-Marxist struggle for 'intellectual workers': the lost debate on the question of intellectuals in interwar Vienna.""]",0.5,"1. The retrieved context does match the subject matter of the user's query. 
The user's query is about how scholars like Jovanović and Pirker explore the themes of historical narrative, memory, and remembrance in relation to Vienna's past. The context provided includes the titles of the articles written by Jovanović and Pirker, which are ""Whitewashed empire: Historical narrative and place marketing in Vienna"" and ""Transnational memory spaces in the making: World War II and holocaust remembrance in Vienna"" respectively. These titles suggest that the articles do indeed explore the themes of historical narrative, memory, and remembrance in relation to Vienna's past. Therefore, the context is relevant to the user's query. (2 points)\n\n2. However, the retrieved context cannot be used exclusively to provide a full answer to the user's query. While the context does provide the titles of the articles written by Jovanović and Pirker, it does not provide any information about the ways in which these scholars explore the themes of historical narrative, memory, and remembrance in their articles. Therefore, the context does not provide a full answer to the user's query. (0 points)\n\n[RESULT] 2.0"


## 2.2 Retrieval Evaluation  

Given a retriever and a set of questions, evaluate the retrieved results using ranking metrics such as hit rate and mean reciprocal rank (MRR).
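
To make the idea of ranking metrics concrete, here is a minimal sketch in plain Python of two common ones, hit rate and mean reciprocal rank (MRR). It does not call LlamaIndex. The function names, the `(retrieved_ids, expected_ids)` input shape, and the node IDs in the usage example are all assumptions made for illustration, not the library's API; the library's own retrieval evaluators are described in the usage pattern pages linked from this section.

```python
from typing import Dict, List, Sequence, Set, Tuple


def hit_rate(retrieved_ids: Sequence[str], expected_ids: Set[str]) -> float:
    """Return 1.0 if any expected ID appears among the retrieved IDs, else 0.0."""
    return 1.0 if any(doc_id in expected_ids for doc_id in retrieved_ids) else 0.0


def reciprocal_rank(retrieved_ids: Sequence[str], expected_ids: Set[str]) -> float:
    """Return 1/rank of the first relevant result (ranks start at 1), or 0.0 if none."""
    for rank, doc_id in enumerate(retrieved_ids, start=1):
        if doc_id in expected_ids:
            return 1.0 / rank
    return 0.0


def evaluate_retrieval(
    results: List[Tuple[Sequence[str], Set[str]]]
) -> Dict[str, float]:
    """Average both metrics over a list of (retrieved_ids, expected_ids) pairs."""
    if not results:
        return {"hit_rate": 0.0, "mrr": 0.0}
    n = len(results)
    return {
        "hit_rate": sum(hit_rate(r, e) for r, e in results) / n,
        "mrr": sum(reciprocal_rank(r, e) for r, e in results) / n,
    }


# Hypothetical example: three queries with made-up node IDs.
example_results = [
    (["n3", "n1", "n7"], {"n1"}),  # relevant node found at rank 2
    (["n2", "n5"], {"n2"}),        # relevant node found at rank 1
    (["n4", "n6"], {"n9"}),        # relevant node not retrieved
]
scores = evaluate_retrieval(example_results)
print(scores)
```

Hit rate only asks whether a relevant node was retrieved at all, while MRR also rewards placing it near the top of the ranking, so the two are usually read together.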

Sources:  
https://docs.llamaindex.ai/en/stable/module_guides/evaluating/  
https://docs.llamaindex.ai/en/stable/module_guides/evaluating/usage_pattern_retrieval/  
https://docs.llamaindex.ai/en/stable/module_guides/evaluating/usage_pattern/