# LLM and RAG Evaluation

Sources: [1](https://docs.llamaindex.ai/en/stable/module_guides/evaluating/), [2](https://docs.llamaindex.ai/en/stable/examples/evaluation/faithfulness_eval/)  

LLMs are trained on enormous bodies of data but they aren’t trained on your data. Retrieval-Augmented Generation (RAG) solves this problem by adding your data to the data LLMs already have access to. You will see references to RAG frequently in this documentation.  
In RAG, your data is loaded and prepared for queries or “indexed”. User queries act on the index, which filters your data down to the most relevant context. This context and your query then go to the LLM along with a prompt, and the LLM provides a response.  
Even if what you’re building is a chatbot or an agent, you’ll want to know RAG techniques for getting data into your application.  
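
As a rough illustration of the flow described above, the sketch below indexes a few text chunks, retrieves the chunk that best matches a query, and assembles the prompt that would go to the LLM. It is a toy under stated assumptions: word overlap stands in for embedding similarity, the function names (`build_index`, `retrieve`, `build_prompt`) and the prompt wording are invented for this example, and no LLM is called. It is not how LlamaIndex implements indexing or retrieval.

```python
import re

def tokenize(text):
    """Lowercased word set; a crude stand-in for real embeddings."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def build_index(chunks):
    """'Indexing': precompute a searchable representation of each chunk."""
    return [(chunk, tokenize(chunk)) for chunk in chunks]

def retrieve(index, query, top_k=1):
    """Rank chunks by the number of words they share with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(index, key=lambda item: len(query_tokens & item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

def build_prompt(query, context_chunks):
    """Combine the retrieved context and the user query into one prompt string."""
    context = "\n".join(context_chunks)
    return (
        "Context information is below.\n"
        f"{context}\n"
        "Given the context, answer the query.\n"
        f"Query: {query}\n"
    )

# Hypothetical mini-corpus, used only for this illustration
chunks = [
    "New York City comprises five boroughs.",
    "The Statue of Liberty greeted millions of immigrants.",
    "Times Square is a hub of the Broadway Theater District.",
]
index = build_index(chunks)
query = "How many boroughs does New York City have?"
prompt = build_prompt(query, retrieve(index, query))
print(prompt)
```

In a real application the overlap score would be replaced by embedding similarity, and the prompt would be sent to the LLM to produce the response.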

Evaluation and benchmarking are crucial concepts in LLM development. To improve the performance of an LLM app (RAG, agents), you must have a way to measure it.

LlamaIndex offers key modules to measure the quality of generated results. We also offer key modules to measure retrieval quality.

## 1. Response Evaluation:  
Does the response match the retrieved context? Does it also match the query? Does it match the reference answer or guidelines?

## 2. Retrieval Evaluation:  
Are the retrieved sources relevant to the query?

This section describes how the evaluation components within LlamaIndex work.

---  
## 1. Response Evaluation

Evaluating generated results is difficult because, unlike in traditional machine learning, the predicted result isn't a single number, and quantitative metrics are hard to define. LlamaIndex offers LLM-based evaluation modules to measure the quality of results. These use a "gold" LLM (e.g. GPT-4) to decide, in a variety of ways, whether the predicted answer is correct. Note that many of these evaluation modules do not require ground-truth labels: evaluation can be done with some combination of the query, context, and response, combined with LLM calls.

These evaluation modules are in the following forms:

+ #### Correctness: Whether the generated answer matches that of the reference answer given the query (requires labels).
+ #### Semantic Similarity: Whether the predicted answer is semantically similar to the reference answer (requires labels).
+ #### Faithfulness: Evaluates if the answer is faithful to the retrieved contexts (in other words, whether there's hallucination).
+ #### Context Relevancy: Whether retrieved context is relevant to the query.
+ #### Answer Relevancy: Whether the generated answer is relevant to the query.
+ #### Guideline Adherence: Whether the predicted answer adheres to specific guidelines.

+ #### Question Generation: In addition to evaluating queries, LlamaIndex can also use your data to generate questions to evaluate on. This means that you can automatically generate questions, and then run an evaluation pipeline to test if the LLM can actually answer questions accurately using your data.

---
## 2. Retrieval Evaluation (TBD)

We also provide modules to help evaluate retrieval independently.

The concept of retrieval evaluation is not new; given a dataset of questions and ground-truth rankings, we can evaluate retrievers using ranking metrics like mean-reciprocal rank (MRR), hit-rate, precision, and more.
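
To make two of these ranking metrics concrete, here is a minimal sketch that computes hit rate and MRR by hand. It assumes each question has exactly one ground-truth node id, and the ids (`n1`, `n2`, ...) are made up for the example; it is not the LlamaIndex implementation.

```python
def hit_rate(retrieved, expected):
    """Fraction of questions whose ground-truth id appears among the retrieved ids."""
    hits = sum(1 for ids, gold in zip(retrieved, expected) if gold in ids)
    return hits / len(expected)

def mean_reciprocal_rank(retrieved, expected):
    """Average of 1/rank of the ground-truth id; a miss contributes 0."""
    total = 0.0
    for ids, gold in zip(retrieved, expected):
        if gold in ids:
            total += 1.0 / (ids.index(gold) + 1)  # ranks are 1-based
    return total / len(expected)

# Hypothetical retrieval results for three questions (ranked best first)
retrieved = [
    ["n1", "n2", "n3"],  # ground truth at rank 1
    ["n4", "n5", "n6"],  # ground truth at rank 2
    ["n7", "n8", "n9"],  # ground truth not retrieved
]
expected = ["n1", "n5", "n0"]

print("hit rate:", hit_rate(retrieved, expected))          # 2 of 3 questions hit
print("MRR:", mean_reciprocal_rank(retrieved, expected))   # (1 + 1/2 + 0) / 3
```

Hit rate only asks whether the right node was retrieved at all, while MRR also rewards placing it near the top of the ranking.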

The core retrieval evaluation steps revolve around the following:

+ #### Dataset generation: Given an unstructured text corpus, synthetically generate (question, context) pairs.  
+ #### Retrieval Evaluation: Given a retriever and a set of questions, evaluate retrieved results using ranking metrics.  
--- 

#### Installing Packages

In [104]:
import warnings
warnings.filterwarnings("ignore")

In [105]:
!pip install -qU openai
!pip install -qU llama-index
!pip install -qU llama-index-experimental
!pip install -qU llama-index-llms-openai

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
llama-index 0.12.10 requires llama-index-core<0.13.0,>=0.12.10, but you have llama-index-core 0.11.23 which is incompatible.
llama-index 0.12.10 requires llama-index-llms-openai<0.4.0,>=0.3.0, but you have llama-index-llms-openai 0.2.16 which is incompatible.
llama-index-retrievers-bm25 0.5.0 requires llama-index-core<0.13.0,>=0.12.0, but you have llama-index-core 0.11.23 which is incompatible.
llama-index-readers-wikipedia 0.3.0 requires llama-index-core<0.13.0,>=0.12.0, but you have llama-index-core 0.11.23 which is incompatible.
llama-index-readers-llama-parse 0.4.0 requires llama-index-core<0.13.0,>=0.12.0, but you have llama-index-core 0.11.23 which is incompatible.
llama-index-readers-file 0.4.0 requires llama-index-core<0.13.0,>=0.12.0, but you have llama-index-core 0.11.23 which is incompatible.
llama-inde

#### Importing Packages

In [2]:
import os
import openai

#os.environ["OPENAI_API_KEY"] = "<the key>"
openai.api_key = os.environ["OPENAI_API_KEY"]

import sys
import shutil
import glob
from pathlib import Path

import warnings
warnings.filterwarnings('ignore')

import pandas as pd


import llama_index

## LlamaIndex readers
from llama_index.core import SimpleDirectoryReader

## LlamaIndex Index Types
from llama_index.core import ListIndex
from llama_index.core import VectorStoreIndex
from llama_index.core import TreeIndex
from llama_index.core import KeywordTableIndex
from llama_index.core import SimpleKeywordTableIndex
from llama_index.core import DocumentSummaryIndex
from llama_index.core import KnowledgeGraphIndex
from llama_index.experimental.query_engine import PandasQueryEngine

## LlamaIndex Context Managers
from llama_index.core import StorageContext
from llama_index.core import load_index_from_storage
from llama_index.core.response_synthesizers import get_response_synthesizer
from llama_index.core.response_synthesizers import ResponseMode
from llama_index.core.schema import Node

## LlamaIndex Callbacks
from llama_index.core.callbacks import CallbackManager
from llama_index.core.callbacks import LlamaDebugHandler

In [3]:
import logging

#logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
#logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

#### Defining Models

In [4]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

#model="gpt-3.5-turbo"
model="gpt-4o-mini"
#model="gpt-4o"

Settings.llm = OpenAI(temperature=0, model=model)
llm = Settings.llm

Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
embed_model = Settings.embed_model

In [5]:
import nest_asyncio
nest_asyncio.apply()

# 1. Response Evaluation

## 1.1 Correctness  
The CorrectnessEvaluator evaluates the relevance and correctness of a generated answer against a reference answer.
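
The evaluator returns a numeric score together with free-text feedback. The sketch below shows one way such a judge reply could be turned into a pass/fail verdict. It assumes the judge answers with the score on the first line followed by the feedback, and that a score of at least 4.0 counts as passing; both are assumptions of this sketch rather than confirmed library behavior, and `parse_judge_output` is a hypothetical helper.

```python
from typing import NamedTuple

class Verdict(NamedTuple):
    score: float
    feedback: str
    passing: bool

def parse_judge_output(raw, score_threshold=4.0):
    """Split 'score\\nfeedback' text and compare the score to a threshold."""
    score_line, _, feedback = raw.strip().partition("\n")
    score = float(score_line.strip())
    return Verdict(score, feedback.strip(), score >= score_threshold)

# Hypothetical judge reply, shaped like the score and feedback printed by the evaluator
raw_reply = "2.0\nThe answer wrongly attributes spacetime curvature to magnetic fields."
verdict = parse_judge_output(raw_reply)
print(verdict.score, verdict.passing)
print(verdict.feedback)
```

Keeping the threshold as a parameter makes it easy to tighten or relax what counts as a correct answer for a given application.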

In [6]:
from llama_index.core.evaluation import CorrectnessEvaluator
evaluator = CorrectnessEvaluator(llm=llm)

In [7]:
query = ("Can you explain the theory of relativity proposed by Albert Einstein in detail?")

reference = """
Certainly! Albert Einstein's theory of relativity consists of two main components: special relativity and general relativity. Special relativity, 
published in 1905, introduced the concept that the laws of physics are the same for all non-accelerating observers and that the speed of light in a 
vacuum is a constant, regardless of the motion of the source or observer. It also gave rise to the famous equation E=mc², which relates energy (E) and mass (m).
General relativity, published in 1915, extended these ideas to include the effects of gravity. According to general relativity, gravity is not a force between 
masses, as described by Newton's theory of gravity, but rather the result of the warping of space and time by mass and energy. Massive objects, such as 
planets and stars, cause a curvature in spacetime, and smaller objects follow curved paths in response to this curvature. This concept is often illustrated 
using the analogy of a heavy ball placed on a rubber sheet, causing it to create a depression that other objects (representing smaller masses) naturally move 
towards.
In essence, general relativity provided a new understanding of gravity, explaining phenomena like the bending of light by gravity (gravitational lensing) and the precession of the orbit of Mercury. It has been confirmed through numerous experiments and observations and has become a fundamental theory in modern physics.
"""

response = """
Certainly! Albert Einstein's theory of relativity consists of two main components: special relativity and general relativity. Special relativity, 
published in 1905, introduced the concept that the laws of physics are the same for all non-accelerating observers and that the speed of light in a 
vacuum is a constant, regardless of the motion of the source or observer. It also gave rise to the famous equation E=mc², which relates energy (E) 
and mass (m).
However, general relativity, published in 1915, extended these ideas to include the effects of magnetism. According to general relativity, 
gravity is not a force between masses but rather the result of the warping of space and time by magnetic fields generated by massive objects. 
Massive objects, such as planets and stars, create magnetic fields that cause a curvature in spacetime, and smaller objects follow curved paths 
in response to this magnetic curvature. This concept is often illustrated using the analogy of a heavy ball placed on a rubber sheet with magnets 
underneath, causing it to create a depression that other objects (representing smaller masses) naturally move towards due to magnetic attraction.
"""

In [8]:
result = evaluator.evaluate(query=query, 
                            response=response, 
                            reference=reference,)
print(result.score)
print(result.feedback)

2.0
The generated answer is relevant to the user query as it discusses the theory of relativity, but it contains significant inaccuracies, particularly in its explanation of general relativity, which incorrectly attributes the warping of spacetime to magnetic fields instead of mass and energy. This misrepresentation affects the correctness of the answer.


## 1.2 Semantic Similarity  
The SemanticSimilarityEvaluator evaluates the quality of a question answering system via semantic similarity.  
Concretely, it calculates the similarity score between embeddings of the generated answer and the reference answer.  

In [9]:
from llama_index.core.evaluation import SemanticSimilarityEvaluator
evaluator = SemanticSimilarityEvaluator()

In [10]:
# This evaluator only uses `response` and `reference`; passing in a query does not influence the evaluation
# query = 'What is the color of the sky'

response = "The sky is typically blue"

reference = """The color of the sky can vary depending on several factors, including time of day, weather conditions, 
and location. During the day, when the sun is in the sky, the sky often appears blue. 
This is because of a phenomenon called Rayleigh scattering, where molecules and particles in the Earth's atmosphere 
scatter sunlight in all directions, and blue light is scattered more than other colors because it travels as shorter,
smaller waves. This is why we perceive the sky as blue on a clear day.
"""

result = await evaluator.aevaluate(response=response, 
                                   reference=reference,)
print("Score: ", result.score)
print("Passing: ", result.passing)  # default similarity threshold is 0.8

Score:  0.8739485436676929
Passing:  True


In [11]:
response = "Sorry, I do not have sufficient context to answer this question."

reference = """The color of the sky can vary depending on several factors, including time of day, weather conditions, 
and location. During the day, when the sun is in the sky, the sky often appears blue. 
This is because of a phenomenon called Rayleigh scattering, where molecules and particles in the Earth's atmosphere 
scatter sunlight in all directions, and blue light is scattered more than other colors because it travels as shorter,
smaller waves. This is why we perceive the sky as blue on a clear day.
"""

result = await evaluator.aevaluate(response=response, reference=reference,)
print("Score: ", result.score)
print("Passing: ", result.passing)  # default similarity threshold is 0.8

Score:  0.7216314937446562
Passing:  False


### Customization of matching function and threshold:

In [13]:
from llama_index.core.embeddings import resolve_embed_model
evaluator = SemanticSimilarityEvaluator(embed_model=embed_model, 
                                        similarity_threshold=0.6,)

In [14]:
response = "The sky is yellow."
reference = "The sky is blue."

result = await evaluator.aevaluate(response=response, reference=reference,)
print("Score: ", result.score)
print("Passing: ", result.passing)

Score:  0.9405787817542284
Passing:  True


We note here that a high score does not imply the answer is always correct.   
Embedding similarity primarily captures the notion of "relevancy". 

Since both the response and reference discuss "the sky" and colors, they are semantically similar.
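
The effect can be reproduced with a deliberately simple stand-in for embeddings. The sketch below uses bag-of-words counts instead of a learned embedding model (an assumption made purely for illustration) and computes cosine similarity by hand. Because the two sentences share three of their four words, they score well above the 0.6 threshold used in the cell above, even though they contradict each other.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Word-count vector; a toy replacement for a real embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)

score = cosine_similarity(
    bag_of_words("The sky is yellow."),
    bag_of_words("The sky is blue."),
)
passing = score >= 0.6  # same threshold as the customized evaluator
print("Score:", score)
print("Passing:", passing)
```

Real embedding models are far richer than word counts, but the same caveat applies: similarity measures closeness of topic and wording, not factual agreement.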

## 1.3 Faithfulness

The `FaithfulnessEvaluator` module checks whether the response from a query engine is supported by any of its source nodes.  
This is useful for measuring if the response was hallucinated.  
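
Conceptually, a faithfulness check asks a judge whether the response is supported by the retrieved text and reduces the answer to pass or fail. The sketch below is a simplified illustration of that idea, not the library's implementation: `is_faithful` and `toy_judge` are hypothetical names, and the judge is a substring test standing in for an LLM call that would answer YES or NO.

```python
from typing import Callable, List

def is_faithful(response: str, contexts: List[str], judge: Callable[[str, str], str]) -> bool:
    """Pass if the judge answers YES for at least one retrieved context."""
    for context in contexts:
        verdict = judge(response, context)
        if "yes" in verdict.lower():
            return True
    return False

def toy_judge(statement: str, context: str) -> str:
    # Hypothetical stand-in for an LLM call: exact substring match only.
    # A real judge would format both texts into a prompt and query a model.
    return "YES" if statement.lower() in context.lower() else "NO"

contexts = [
    "The city came under British control in 1664 and was renamed New York "
    "after King Charles II of England granted the lands to his brother, the Duke of York.",
]

supported = "renamed New York after King Charles II"
hallucinated = "New York was named after a river"

print(is_faithful(supported, contexts, toy_judge))
print(is_faithful(hallucinated, contexts, toy_judge))
```

Passing the judge in as a function keeps the pass/fail logic separate from the model call, which makes the logic easy to test without an API key.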

In [20]:
from llama_index.core import VectorStoreIndex
from llama_index.core import SimpleDirectoryReader
from llama_index.core import Response

from llama_index.core.evaluation import FaithfulnessEvaluator
from llama_index.core.evaluation import EvaluationResult
from llama_index.core.node_parser import SentenceSplitter
import pandas as pd

pd.set_option("display.max_colwidth", 0)

#### The data is extracted from the [New York City](https://en.wikipedia.org/wiki/New_York_City) Wikipedia page.

In [22]:
documents = SimpleDirectoryReader(input_files=["../../Data/nyc_text.txt"]).load_data()
splitter = SentenceSplitter(chunk_size=512)
vector_index = VectorStoreIndex.from_documents(documents, transformations=[splitter])



#### Defining a Jupyter display function

In [23]:
def display_eval_df(response: Response, eval_result: EvaluationResult) -> None:
    if response.source_nodes == []:
        print("no response!")
        return
    eval_df = pd.DataFrame(
        {
            "Response": str(response),
            "Source": response.source_nodes[0].node.text[:1000] + "...",
            "Evaluation Result": "Pass" if eval_result.passing else "Fail",
            "Reasoning": eval_result.feedback,
        },
        index=[0],
    )
    eval_df = eval_df.style.set_properties(
        **{
            "inline-size": "600px",
            "overflow-wrap": "break-word",
        },
        subset=["Response", "Source"]
    )
    display(eval_df)

To run an evaluation, call the evaluator's `.evaluate_response()` method on the `Response` object returned from the query.  
Let's evaluate the outputs of the `vector_index`.

In [21]:
evaluator = FaithfulnessEvaluator(llm=llm)

In [37]:
query_engine = vector_index.as_query_engine()
response_vector = query_engine.query("How did New York City get its name?")
eval_result = evaluator.evaluate_response(response=response_vector)
display_eval_df(response_vector, eval_result)



Unnamed: 0,Response,Source,Evaluation Result,Reasoning
0,"New York City was named in honor of the Duke of York, who would later become King James II of England. The name was given in 1664 when England seized the territory from Dutch control, and the Duke was appointed as the proprietor of the former territory of New Netherland, which included the city of New Amsterdam.","The city came under British control in 1664 and was renamed New York after King Charles II of England granted the lands to his brother, the Duke of York. The city was regained by the Dutch in July 1673 and was renamed New Orange for one year and three months; the city has been continuously named New York since November 1674. New York City was the capital of the United States from 1785 until 1790, and has been the largest U.S. city since 1790. The Statue of Liberty greeted millions of immigrants as they came to the U.S. by ship in the late 19th and early 20th centuries, and is a symbol of the U.S. and its ideals of liberty and peace. In the 21st century, New York City has emerged as a global node of creativity, entrepreneurship, and as a symbol of freedom and cultural diversity. The New York Times has won the most Pulitzer Prizes for journalism and remains the U.S. media's ""newspaper of record"". In 2019, New York City was voted the greatest city in the world in a survey of over 30,000 p...",Pass,YES


#### Benchmark on Generated Questions

Now let's generate a few more questions so that we have more to evaluate, and run a small benchmark.

In [25]:
#from llama_index.core.evaluation import DatasetGenerator
from llama_index.core.llama_dataset.generator import RagDatasetGenerator

#question_generator = DatasetGenerator.from_documents(documents)
question_generator = RagDatasetGenerator.from_documents(documents)
eval_questions = question_generator.generate_questions_from_nodes()


In [26]:
eval_questions.to_pandas().head(2)

Unnamed: 0,query,reference_contexts,reference_answer,reference_answer_by,query_by
0,"What are the five boroughs of New York City, and how are they related to the counties of New York State?","[New York, often called New York City or NYC, is the most populous city in the United States. With a 2020 population of 8,804,190 distributed over 300.46 square miles (778.2 km2), New York City is the most densely populated major city in the United States and more than twice as populous as Los Angeles, the nation's second-largest city. New York City is located at the southern tip of New York State. It constitutes the geographical and demographic center of both the Northeast megalopolis and the New York metropolitan area, the largest metropolitan area in the U.S. by both population and urban area. With over 20.1 million people in its metropolitan statistical area and 23.5 million in its combined statistical area as of 2020, New York is one of the world's most populous megacities, and over 58 million people live within 250 mi (400 km) of the city. New York City is a global cultural, financial, entertainment, and media center with a significant influence on commerce, health care and life sciences, research, technology, education, politics, tourism, dining, art, fashion, and sports. Home to the headquarters of the United Nations, New York is an important center for international diplomacy, and is sometimes described as the capital of the world.Situated on one of the world's largest natural harbors and extending into the Atlantic Ocean, New York City comprises five boroughs, each of which is coextensive with a respective county of the state of New York. 
The five boroughs, which were created in 1898 when local governments were consolidated into a single municipal entity, are: Brooklyn (in Kings County), Queens (in Queens County), Manhattan (in New York County), The Bronx (in Bronx County), and Staten Island (in Richmond County).As of 2021, the New York metropolitan area is the largest metropolitan economy in the world with a gross metropolitan product of over $2.4 trillion. If the New York metropolitan area were a sovereign state, it would have the eighth-largest economy in the world. New York City is an established safe haven for global investors. New York is home to the highest number of billionaires, individuals of ultra-high net worth (greater than US$30 million), and millionaires of any city in the world.\r\nThe city and its metropolitan area constitute the premier gateway for legal immigration to the United States. As many as 800 languages are spoken in New York, making it the most linguistically diverse city in the world. New York City is home to more than 3.2 million residents born outside the U.S., the largest foreign-born population of any city in the world as of 2016.New York City traces its origins to a trading post founded on the southern tip of Manhattan Island by Dutch colonists in approximately 1624. The settlement was named New Amsterdam (Dutch: Nieuw Amsterdam) in 1626 and was chartered as a city in 1653. The city came under British control in 1664 and was renamed New York after King Charles II of England granted the lands to his brother, the Duke of York. The city was regained by the Dutch in July 1673 and was renamed New Orange for one year and three months; the city has been continuously named New York since November 1674. New York City was the capital of the United States from 1785 until 1790, and has been the largest U.S. city since 1790. The Statue of Liberty greeted millions of immigrants as they came to the U.S. by ship in the late 19th and early 20th centuries, and is a symbol of the U.S. 
and its ideals of liberty and peace. In the 21st century, New York City has emerged as a global node of creativity, entrepreneurship, and as a symbol of freedom and cultural diversity. The New York Times has won the most Pulitzer Prizes for journalism and remains the U.S. media's ""newspaper of record"". In 2019, New York City was voted the greatest city in the world in a survey of over 30,000 people from 48 cities worldwide, citing its cultural diversity.Many districts and monuments in New York City are major landmarks, including three of the world's ten most visited tourist attractions in 2013. A record 66.6 million tourists visited New York City in 2019. Times Square is the brightly illuminated hub of the Broadway Theater District, one of the world's busiest pedestrian intersections and a major center of the world's entertainment industry. Many of the city's landmarks, skyscrapers, and parks are known around the world, and the city's fast pace led to the phrase New York minute.]",,,ai (gpt-4o-mini)
1,Discuss the historical significance of the Statue of Liberty in relation to immigration in the United States.,"[New York, often called New York City or NYC, is the most populous city in the United States. With a 2020 population of 8,804,190 distributed over 300.46 square miles (778.2 km2), New York City is the most densely populated major city in the United States and more than twice as populous as Los Angeles, the nation's second-largest city. New York City is located at the southern tip of New York State. It constitutes the geographical and demographic center of both the Northeast megalopolis and the New York metropolitan area, the largest metropolitan area in the U.S. by both population and urban area. With over 20.1 million people in its metropolitan statistical area and 23.5 million in its combined statistical area as of 2020, New York is one of the world's most populous megacities, and over 58 million people live within 250 mi (400 km) of the city. New York City is a global cultural, financial, entertainment, and media center with a significant influence on commerce, health care and life sciences, research, technology, education, politics, tourism, dining, art, fashion, and sports. Home to the headquarters of the United Nations, New York is an important center for international diplomacy, and is sometimes described as the capital of the world.Situated on one of the world's largest natural harbors and extending into the Atlantic Ocean, New York City comprises five boroughs, each of which is coextensive with a respective county of the state of New York. 
The five boroughs, which were created in 1898 when local governments were consolidated into a single municipal entity, are: Brooklyn (in Kings County), Queens (in Queens County), Manhattan (in New York County), The Bronx (in Bronx County), and Staten Island (in Richmond County).As of 2021, the New York metropolitan area is the largest metropolitan economy in the world with a gross metropolitan product of over $2.4 trillion. If the New York metropolitan area were a sovereign state, it would have the eighth-largest economy in the world. New York City is an established safe haven for global investors. New York is home to the highest number of billionaires, individuals of ultra-high net worth (greater than US$30 million), and millionaires of any city in the world.\r\nThe city and its metropolitan area constitute the premier gateway for legal immigration to the United States. As many as 800 languages are spoken in New York, making it the most linguistically diverse city in the world. New York City is home to more than 3.2 million residents born outside the U.S., the largest foreign-born population of any city in the world as of 2016.New York City traces its origins to a trading post founded on the southern tip of Manhattan Island by Dutch colonists in approximately 1624. The settlement was named New Amsterdam (Dutch: Nieuw Amsterdam) in 1626 and was chartered as a city in 1653. The city came under British control in 1664 and was renamed New York after King Charles II of England granted the lands to his brother, the Duke of York. The city was regained by the Dutch in July 1673 and was renamed New Orange for one year and three months; the city has been continuously named New York since November 1674. New York City was the capital of the United States from 1785 until 1790, and has been the largest U.S. city since 1790. The Statue of Liberty greeted millions of immigrants as they came to the U.S. by ship in the late 19th and early 20th centuries, and is a symbol of the U.S. 
and its ideals of liberty and peace. In the 21st century, New York City has emerged as a global node of creativity, entrepreneurship, and as a symbol of freedom and cultural diversity. The New York Times has won the most Pulitzer Prizes for journalism and remains the U.S. media's ""newspaper of record"". In 2019, New York City was voted the greatest city in the world in a survey of over 30,000 people from 48 cities worldwide, citing its cultural diversity.Many districts and monuments in New York City are major landmarks, including three of the world's ten most visited tourist attractions in 2013. A record 66.6 million tourists visited New York City in 2019. Times Square is the brightly illuminated hub of the Broadway Theater District, one of the world's busiest pedestrian intersections and a major center of the world's entertainment industry. Many of the city's landmarks, skyscrapers, and parks are known around the world, and the city's fast pace led to the phrase New York minute.]",,,ai (gpt-4o-mini)


In [30]:
for q in [e.query for e in eval_questions.examples[0:5]]:
    print(f"- {q}")

- What are the five boroughs of New York City, and how are they related to the counties of New York State?
- Discuss the historical significance of the Statue of Liberty in relation to immigration in the United States.
- How does New York City's metropolitan economy compare to that of other global cities, and what would its gross metropolitan product rank if it were a sovereign state?
- What significant cultural and economic roles does New York City play on a global scale, as mentioned in the context information?
- Describe the early history of New York City, including the indigenous people who inhabited the area and the first documented European exploration.


In [38]:
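import asyncio

def evaluate_query_engine(query_engine, questions):
    async def run_queries():
        # asyncio.run() expects a coroutine, so gather the queries inside one
        return await asyncio.gather(*(query_engine.aquery(q) for q in questions))

    # Runs inside a notebook because nest_asyncio.apply() was called earlier
    results = asyncio.run(run_queries())
    print("finished query")

    total_correct = 0
    for r in results:
        # Check faithfulness with the judge LLM configured for `evaluator`
        eval_result = 1 if evaluator.evaluate_response(response=r).passing else 0
        total_correct += eval_result

    return total_correct, len(results)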
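
The function above starts all queries at once, which is fine for five questions. For a larger question set, one option is to cap how many requests are in flight. The sketch below shows that pattern with plain `asyncio`; `run_bounded` and `fake_aquery` are hypothetical names, and `fake_aquery` merely simulates an asynchronous query so the example runs without an API key.

```python
import asyncio

async def run_bounded(coro_factories, max_concurrency=2):
    """Run coroutines with at most `max_concurrency` active at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(factory):
        async with semaphore:
            return await factory()

    # gather() returns results in the same order as its inputs
    return await asyncio.gather(*(guarded(f) for f in coro_factories))

async def fake_aquery(question):
    # Hypothetical stand-in for `query_engine.aquery(question)`
    await asyncio.sleep(0.01)
    return f"answer to: {question}"

questions = ["q1", "q2", "q3"]
# Factories (not coroutines) are passed so each query starts only when a slot is free
factories = [lambda q=q: fake_aquery(q) for q in questions]
answers = asyncio.run(run_bounded(factories))
print(answers)
```

Inside a notebook cell, `await run_bounded(factories)` can be used instead of `asyncio.run(...)`.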

In [39]:
vector_query_engine = vector_index.as_query_engine()
correct, total = evaluate_query_engine(vector_query_engine, [e.query for e in eval_questions.examples[0:5]])

print(f"score: {correct}/{total}")


#### The responses were faithful to the retrieved context 80% of the time.

## 1.4 Guideline Adherence  
`GuidelineEvaluator` evaluates a question-answering system against user-specified guidelines.

In [40]:
from llama_index.core.evaluation import GuidelineEvaluator
import nest_asyncio
nest_asyncio.apply()

In [41]:
GUIDELINES = [
    "The response should fully answer the query.",
    "The response should avoid being vague or ambiguous.",
    "The response should be specific and use statistics or numbers when possible.",
]

# One evaluator per guideline, so each guideline gets its own pass/fail verdict.
evaluators = [GuidelineEvaluator(llm=llm, guidelines=guideline) for guideline in GUIDELINES]

In [42]:
sample_data = {
    "query": "Tell me about global warming.",
    "contexts": [
        (
            "Global warming refers to the long-term increase in Earth's"
            " average surface temperature due to human activities such as the"
            " burning of fossil fuels and deforestation."
        ),
        (
            "It is a major environmental issue with consequences such as"
            " rising sea levels, extreme weather events, and disruptions to"
            " ecosystems."
        ),
        (
            "Efforts to combat global warming include reducing carbon"
            " emissions, transitioning to renewable energy sources, and"
            " promoting sustainable practices."
        ),
    ],
    "response": (
        "Global warming is a critical environmental issue caused by human"
        " activities that lead to a rise in Earth's temperature. It has"
        " various adverse effects on the planet."
    ),
}

In [43]:
for guideline, evaluator in zip(GUIDELINES, evaluators):
    eval_result = evaluator.evaluate(
        query=sample_data["query"],
        contexts=sample_data["contexts"],
        response=sample_data["response"],
    )
    print("=====")
    print(f"Guideline: {guideline}")
    print(f"Pass: {eval_result.passing}")
    print(f"Feedback: {eval_result.feedback}")

=====
Guideline: The response should fully answer the query.
Pass: False
Feedback: The response provides a brief overview of global warming but lacks depth and detail. It mentions that global warming is caused by human activities and has adverse effects, but it does not specify what those activities are or what the effects might be. A more comprehensive answer would include information on greenhouse gas emissions, the role of fossil fuels, specific impacts on ecosystems and weather patterns, and potential solutions or mitigation strategies.
=====
Guideline: The response should avoid being vague or ambiguous.
Pass: False
Feedback: The response is some
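
Since each guideline is checked by its own evaluator, it can help to collect the per-guideline verdicts into one summary. The sketch below assumes only that each result exposes `passing` and `feedback`, as in the output above. `GuidelineVerdict` and `summarize_guidelines` are hypothetical names, and the guideline labels and feedback strings are placeholders rather than real evaluator output.

```python
from dataclasses import dataclass


@dataclass
class GuidelineVerdict:
    """Hypothetical stand-in for an evaluation result with the two fields used here."""

    passing: bool
    feedback: str


def summarize_guidelines(verdicts: dict[str, GuidelineVerdict]) -> dict:
    """Return the pass count, the total, and the guidelines that failed."""
    failed = [name for name, verdict in verdicts.items() if not verdict.passing]
    return {
        "passed": len(verdicts) - len(failed),
        "total": len(verdicts),
        "failed_guidelines": failed,
    }


# Placeholder verdicts for illustration only (not real evaluator output).
verdicts = {
    "guideline A": GuidelineVerdict(False, "placeholder feedback"),
    "guideline B": GuidelineVerdict(False, "placeholder feedback"),
    "guideline C": GuidelineVerdict(True, "placeholder feedback"),
}
summary = summarize_guidelines(verdicts)
print(f"{summary['passed']}/{summary['total']} guidelines passed")
```

Keeping the failed guideline names makes it easy to print the matching feedback afterwards.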

## 1.5 Question Generation  

This section again walks through generating a list of questions that could be asked about your data.  
Such a list is useful for setting up an evaluation pipeline with the **FaithfulnessEvaluator** and **RelevancyEvaluator**.

In [48]:
from llama_index.core.evaluation import RelevancyEvaluator
from llama_index.core.llama_dataset.generator import RagDatasetGenerator
from llama_index.core.llama_dataset import LabelledRagDataset
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Response

#import logging
#import sys
#import pandas as pd

#logging.basicConfig(stream=sys.stdout, level=logging.INFO)
#logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

In [49]:
reader = SimpleDirectoryReader(input_files=["../../Data/paul_graham_essay.txt"])
documents = reader.load_data()

In [50]:
data_generator = RagDatasetGenerator.from_documents(documents)

In [51]:
# Generating questions requires LLM calls, so a previously saved dataset is loaded instead:
# eval_questions = data_generator.generate_questions_from_nodes()
eval_questions = LabelledRagDataset.from_json("../../Data/rag_dataset.json")

In [52]:
eval_questions.to_pandas().head(5)

| | query | reference_contexts | reference_answer | reference_answer_by | query_by |
|---|---|---|---|---|---|
| 0 | "Describe the author's early experiences with programming. What was the first machine he used and what challenges did he face while trying to write programs on it?" | [What I Worked On\n\nFebruary 2021\n\nBefore college the two main things I worked on, outside of school, were writing and programming. ...] | | | ai (gpt-4) |
| 1 | "What were the author's initial career aspirations before college and how did his experiences in college change his perspective and career path?" | [What I Worked On\n\nFebruary 2021\n\nBefore college the two main things I worked on, outside of school, were writing and programming. ...] | | | ai (gpt-4) |
| 2 | "Discuss the influence of the novel 'The Moon is a Harsh Mistress' and the PBS documentary featuring Terry Winograd on the author's decision to switch to AI. Why did these two sources inspire him?" | [What I Worked On\n\nFebruary 2021\n\nBefore college the two main things I worked on, outside of school, were writing and programming. ...] | | | ai (gpt-4) |
| 3 | "In the context, the author mentions a novel by Heinlein that influenced his interest in AI. What is the name of this novel and how did it inspire the author's interest in AI?" | [I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. ...] | | | ai (gpt-4) |
| 4 | "The author discusses his experience with learning Lisp and how it expanded his concept of a program. Can you explain what Lisp is and why it was considered the language of AI during that time?" | [I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. ...] | | | ai (gpt-4) |


In [54]:
for q in [e.query for e in eval_questions.examples[0:5]]:
    print(f"- {q}")

- "Describe the author's early experiences with programming. What was the first machine he used and what challenges did he face while trying to write programs on it?"
- "What were the author's initial career aspirations before college and how did his experiences in college change his perspective and career path?"
- "Discuss the influence of the novel 'The Moon is a Harsh Mistress' and the PBS documentary featuring Terry Winograd on the author's decision to switch to AI. Why did these two sources inspire him?"
- "In the context, the author mentions a novel by Heinlein that influenced his interest in AI. What is the name of this novel and how did it inspire the author's interest in AI?"
- "The author discusses his experience with learning Lisp and how it expanded his concept of a program. Can you explain what Lisp is and why it was considered the language of AI during that time?"


#### Saving our Dataset

In [55]:
eval_questions.save_json("../../Data/rag_dataset.json")

#### Reading the saved Dataset

In [57]:
eval_questions = LabelledRagDataset.from_json("../../Data/rag_dataset.json")
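
Because generating questions requires LLM calls, a common pattern is to generate the dataset once and reload it from disk afterwards. The following is a minimal sketch of that load-or-generate pattern using plain JSON. `load_or_generate` and `expensive_generate` are hypothetical and only stand in for the `save_json` / `from_json` round trip shown above; the sample question is a placeholder.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_generate(path: Path, generate: Callable[[], list[str]]) -> list[str]:
    """Load cached questions from `path`, or generate them and write the cache."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    questions = generate()
    path.write_text(json.dumps(questions), encoding="utf-8")
    return questions


calls = []  # records how many times the generator actually ran


def expensive_generate() -> list[str]:
    # Hypothetical stand-in for an LLM-backed question generator.
    calls.append(1)
    return ["placeholder question about the source document"]


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "questions.json"
    first = load_or_generate(cache, expensive_generate)   # generates and saves
    second = load_or_generate(cache, expensive_generate)  # served from the cache

print(first == second, len(calls))
```

The second call never invokes the generator, which is the point of saving the dataset before reusing it for evaluation.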

#### Using the Dataset for Evaluation (creating an index from the same source as the questions)

In [58]:
evaluator = RelevancyEvaluator(llm=llm)

In [59]:
# create vector index
vector_index = VectorStoreIndex.from_documents(documents)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


#### Defining another Jupyter display function

In [70]:
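import pandas as pd
from IPython.display import display


def display_eval_df(query: str, response: Response, eval_result: str) -> None:
    """Show the query, response, first source node, and evaluation result as a styled table."""
    eval_df = pd.DataFrame(
        {
            "Query": query,
            "Response": str(response.response),
            # Truncate the first source node to 1000 characters for display.
            "Source": (response.source_nodes[0].node.get_content()[:1000] + "..."),
            "Evaluation Result": eval_result,
        },
        index=[0],
    )
    # Wrap long text in the two widest columns.
    eval_df = eval_df.style.set_properties(
        **{
            "inline-size": "600px",
            "overflow-wrap": "break-word",
        },
        subset=["Response", "Source"],
    )
    display(eval_df)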

In [66]:
eval_questions[1].query

'"What were the author\'s initial career aspirations before college and how did his experiences in college change his perspective and career path?"'

In [68]:
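query_engine = vector_index.as_query_engine()
# Note: the response is generated for question 2, while the evaluator and the display
# use question 1. To evaluate a matched pair, use the same index in all three calls.
response_vector = query_engine.query(eval_questions[2].query)
eval_result = evaluator.evaluate_response(query=eval_questions[1].query, response=response_vector)
display_eval_df(eval_questions[1].query, response_vector, eval_result.feedback)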



| | Query | Response | Source | Evaluation Result |
|---|---|---|---|---|
| 0 | "What were the author's initial career aspirations before college and how did his experiences in college change his perspective and career path?" | The novel "The Moon is a Harsh Mistress" and the PBS documentary featuring Terry Winograd had a significant impact on the author's decision to switch to AI. The novel, which features an intelligent computer named Mike, captivated the author and created a sense of anticipation about the future of AI. It painted a vivid picture of a world where intelligent machines could exist, leading him to believe that such advancements were imminent. Similarly, the PBS documentary showcasing Winograd using SHRDLU, an early natural language processing program, further fueled his enthusiasm. The demonstration suggested that the development of intelligent systems was not just a distant dream but something that could be realized in the near future. The author was inspired by the idea that with enough knowledge and effort, it was possible to teach machines to understand and process human language, which aligned with his growing interest in AI. Together, these sources instilled a sense of excitement and possibility, prompting the author to shift his focus from philosophy, which he found unfulfilling, to the burgeoning field of artificial intelligence. | I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI. AI was in the air in the mid 1980s, but there were two things especially that made me want to work on it: a novel by Heinlein called The Moon is a Harsh Mistress, which featured an intelligent computer called Mike, and a PBS documentary that showed Terry Winograd using SHRDLU. I haven't tried rereading The Moon is a Harsh Mistress, so I don't know how well it has aged, but when I read it I was drawn entirely into its world. It seemed only a matter of time before we'd have Mike, and when I saw Winograd using SHRDLU, it seemed like that time would be a few years at most. All you had to do was teach SHRDLU more words. There weren't any classes in AI at Cornell then, not even graduate classes, so I started trying to teach myself. Which meant learning Lisp, since in those days Lisp was regarded as the language of ... | YES |


In [69]:
print(eval_result.passing)
print(eval_result.score)
print(eval_result.feedback)

True
1.0
YES
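
The three fields printed above (`passing`, `score`, `feedback`) are easy to aggregate once more than one response has been evaluated. The sketch below uses a hypothetical stand-in dataclass, `EvalRecord`, rather than the library's own result class, and the records are made-up values for illustration only.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EvalRecord:
    """Hypothetical stand-in exposing the same three fields printed above."""
    passing: bool
    score: float
    feedback: str


def summarize(records):
    """Return the pass rate and mean score for a batch of evaluation records."""
    if not records:
        raise ValueError("need at least one record")
    pass_rate = sum(r.passing for r in records) / len(records)
    return {"pass_rate": pass_rate, "mean_score": mean(r.score for r in records)}


# Made-up records, for illustration only.
records = [
    EvalRecord(passing=True, score=1.0, feedback="YES"),
    EvalRecord(passing=False, score=0.0, feedback="NO"),
    EvalRecord(passing=True, score=1.0, feedback="YES"),
    EvalRecord(passing=True, score=1.0, feedback="YES"),
]
summary = summarize(records)
print(summary)
```

Reporting the pass rate alongside the mean score is a design choice: the two can diverge when an evaluator returns graded scores instead of a binary verdict.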


# 2 Dataset generation  
## 2.1 Benchmarking RAG Pipelines with a `LabelledRagDataset`

The `LabelledRagDataset` is meant for evaluating any given RAG pipeline, which may have several configurations (e.g. the choice of `LLM`, or the values of `similarity_top_k`, `chunk_size`, and others).
It is analogous to a traditional machine learning dataset, where `X` features are used to predict a ground-truth label `y`. Here, the `query` and the retrieved `contexts` are the "features", and the answer to the query, called `reference_answer`, is the ground-truth label.

Such datasets are composed of observations, or examples. A `LabelledRagDataset` is made up of a set of `LabelledRagDataExample` objects.
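
To make the features/label analogy concrete, here is a minimal sketch using a hypothetical stand-in dataclass, `RagExample`. It is not the library class; it only shows how one example splits into `X` (query plus contexts) and `y` (reference answer).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RagExample:
    """Hypothetical stand-in for one labelled example (not the library class)."""
    query: str
    reference_contexts: List[str]
    reference_answer: str


def to_features_and_label(example: RagExample) -> Tuple[Tuple[str, List[str]], str]:
    """Split an example into X = (query, contexts) and y = reference answer."""
    features = (example.query, example.reference_contexts)
    label = example.reference_answer
    return features, label


example = RagExample(
    query="This is a test query, is it not?",
    reference_contexts=["This is a sample context"],
    reference_answer="Yes it is.",
)
X, y = to_features_and_label(example)
print(X)
print(y)
```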

Let's construct a `LabelledRagDataset` from scratch. The alternative is to download a community-supplied `LabelledRagDataset` from `llama-hub` and evaluate or benchmark your own RAG pipeline on it.

In [72]:
from llama_index.core.llama_dataset import (
    LabelledRagDataExample,
    CreatedByType,
    CreatedBy,
)

# constructing a LabelledRagDataExample
query = "This is a test query, is it not?"
query_by = CreatedBy(type=CreatedByType.AI, model_name="gpt-4")
reference_answer = "Yes it is."
reference_answer_by = CreatedBy(type=CreatedByType.HUMAN)
reference_contexts = ["This is a sample context"]

rag_example = LabelledRagDataExample(
    query=query,
    query_by=query_by,
    reference_contexts=reference_contexts,
    reference_answer=reference_answer,
    reference_answer_by=reference_answer_by,
)

The `LabelledRagDataExample` is a Pydantic model, so converting it to and from `json` or `dict` is possible.

JSON view

In [73]:
print(rag_example.json())

{"query":"This is a test query, is it not?","query_by":{"model_name":"gpt-4","type":"ai"},"reference_contexts":["This is a sample context"],"reference_answer":"Yes it is.","reference_answer_by":{"model_name":"","type":"human"}}


In [74]:
LabelledRagDataExample.parse_raw(rag_example.json())

LabelledRagDataExample(query='This is a test query, is it not?', query_by=CreatedBy(model_name='gpt-4', type=<CreatedByType.AI: 'ai'>), reference_contexts=['This is a sample context'], reference_answer='Yes it is.', reference_answer_by=CreatedBy(model_name='', type=<CreatedByType.HUMAN: 'human'>))

Dictionary view

In [75]:
rag_example.dict()

{'query': 'This is a test query, is it not?',
 'query_by': {'model_name': 'gpt-4', 'type': <CreatedByType.AI: 'ai'>},
 'reference_contexts': ['This is a sample context'],
 'reference_answer': 'Yes it is.',
 'reference_answer_by': {'model_name': '',
  'type': <CreatedByType.HUMAN: 'human'>}}

In [76]:
LabelledRagDataExample.parse_obj(rag_example.dict())

LabelledRagDataExample(query='This is a test query, is it not?', query_by=CreatedBy(model_name='gpt-4', type=<CreatedByType.AI: 'ai'>), reference_contexts=['This is a sample context'], reference_answer='Yes it is.', reference_answer_by=CreatedBy(model_name='', type=<CreatedByType.HUMAN: 'human'>))

#### Let's create a second example, so we can have a (slightly) more interesting `LabelledRagDataset`.

In [77]:
query = "This is a test query, is it so?"
reference_answer = "I think yes, it is."
reference_contexts = ["This is a second sample context"]

rag_example_2 = LabelledRagDataExample(
    query=query,
    query_by=query_by,
    reference_contexts=reference_contexts,
    reference_answer=reference_answer,
    reference_answer_by=reference_answer_by,
)

### The `LabelledRagDataset` Class

In [78]:
from llama_index.core.llama_dataset import LabelledRagDataset

new_rag_dataset = LabelledRagDataset(examples=[rag_example, rag_example_2])
new_rag_dataset.to_pandas()

Unnamed: 0,query,reference_contexts,reference_answer,reference_answer_by,query_by
0,"This is a test query, is it not?",[This is a sample context],Yes it is.,human,ai (gpt-4)
1,"This is a test query, is it so?",[This is a second sample context],"I think yes, it is.",human,ai (gpt-4)


To persist and load the dataset to and from disk, there are the `save_json` and `from_json` methods.

In [79]:
new_rag_dataset.save_json("../../Data/new_rag_dataset.json")
# reload from the same path that was just written
reload_rag_dataset = LabelledRagDataset.from_json("../../Data/new_rag_dataset.json")
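
A round trip through disk should be lossless. The sketch below illustrates that check with plain dictionaries and the standard `json` module rather than the `save_json` and `from_json` methods themselves; the temporary file and the `examples` key are assumptions made for the illustration.

```python
import json
import tempfile
from pathlib import Path

# Plain-dict stand-ins for two examples; the "examples" key is an assumption.
dataset = {
    "examples": [
        {"query": "This is a test query, is it not?", "reference_answer": "Yes it is."},
        {"query": "This is a test query, is it so?", "reference_answer": "I think yes, it is."},
    ]
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "new_rag_dataset.json"
    # Save and reload through the same path.
    path.write_text(json.dumps(dataset), encoding="utf-8")
    reloaded = json.loads(path.read_text(encoding="utf-8"))

round_trip_ok = reloaded == dataset
print(round_trip_ok)
```

The same idea applies to the real dataset: after reloading, compare the reloaded examples with the originals before relying on the file.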

#### Building a synthetic `LabelledRagDataset` over Wikipedia 

For this section, we'll create a `LabelledRagDataset` using a synthetic generator. An LLM produces both the `query` and the `reference_answer` for each synthetic `LabelledRagDataExample` (GPT-4 is the suggested choice; the run shown below was generated with gpt-4o-mini).

NOTE: if you already have queries, reference answers, and contexts over a text corpus, data synthesis is not necessary; you can run your pipeline on them and evaluate its predictions directly.

In [80]:
from tqdm.asyncio import tqdm_asyncio
import nest_asyncio
nest_asyncio.apply()

In [81]:
#%pip install -q llama-index-llms-openai
%pip install -qU llama-index-readers-wikipedia
%pip install -q wikipedia

In [85]:
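# `llm` is assumed to be defined earlier in the notebook; uncomment the next line if OpenAI is not imported yet.
#from llama_index.llms.openai import OpenAI
from llama_index.core.llama_dataset.generator import RagDatasetGenerator  # needed by the generator cell
from llama_index.readers.wikipedia import WikipediaReader
from llama_index.core import VectorStoreIndex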

#### Creating a small corpus from Wikipedia

In [86]:
cities = ["Vienna",]
documents = WikipediaReader().load_data(pages=[f"History of {x}" for x in cities])
index = VectorStoreIndex.from_documents(documents)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


The `RagDatasetGenerator` can be built over a set of documents to generate `LabelledRagDataExample` objects.

#### Generating questions against chunks  
#### Instantiating a DatasetGenerator

In [87]:
dataset_generator = RagDatasetGenerator.from_documents(
    documents,
    llm=llm,
    num_questions_per_chunk=2,  # number of questions to generate per node
    show_progress=True,
)

Parsing nodes:   0%|          | 0/1 [00:00<?, ?it/s]

In [88]:
len(dataset_generator.nodes)

8
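
With `num_questions_per_chunk=2` and 8 nodes, the generator is asked for 16 questions in total. The sketch below mimics that per-node loop with a stub question writer in place of the LLM. The names `generate_questions` and `stub_writer` and the placeholder node texts are hypothetical, and this is not how `RagDatasetGenerator` is implemented internally.

```python
from typing import Callable, List


def generate_questions(
    nodes: List[str],
    num_questions_per_chunk: int,
    write_questions: Callable[[str, int], List[str]],
) -> List[dict]:
    """Collect one (query, reference_contexts) record per generated question."""
    examples = []
    for node_text in nodes:
        for question in write_questions(node_text, num_questions_per_chunk):
            examples.append({"query": question, "reference_contexts": [node_text]})
    return examples


def stub_writer(node_text: str, n: int) -> List[str]:
    """Stub standing in for an LLM call: returns n placeholder questions."""
    return [f"Question {i + 1} about: {node_text}" for i in range(n)]


nodes = [f"chunk {i}" for i in range(8)]  # 8 placeholder nodes
examples = generate_questions(nodes, num_questions_per_chunk=2, write_questions=stub_writer)
print(len(examples))
```

A real LLM may return fewer questions than requested for a given chunk, so treat the product of nodes and questions per chunk as an upper bound and check the length of the generated dataset.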

In [89]:
# since there are 8 nodes, there should be a total of 16 questions
rag_dataset = dataset_generator.generate_dataset_from_nodes()

100%|██████████| 8/8 [00:04<00:00,  1.83it/s]


In [90]:
rag_dataset.to_pandas().head(5)

Unnamed: 0,query,reference_contexts,reference_answer,reference_answer_by,query_by
0,Discuss the significance of the year 1155 in the history of Vienna and explain how it contributed to the city's development under the Babenberg dynasty.,"[The history of Vienna has been long and varied, beginning when the Roman Empire created a military camp in the area now covered by Vienna's city centre. Vienna grew from the Roman settlement known as Vindobona to be an important trading site in the 11th century. It became the capital of the Babenberg dynasty and subsequently of the Austrian Habsburgs, under whom it became one of Europe's cultural hubs. During the 19th century as the capital of the Austrian Empire and later Austria-Hungary, it temporarily became one of Europe's biggest cities. Since the end of World War I, Vienna has been the capital of the Republic of Austria.\n\n\n== Beginnings and early Middle Ages ==\n\nThe name Vindobona derives from a Celtic language, suggesting that the region must have been inhabited before Roman times. The Romans created a military camp (occupied by Legio X Gemina) during the 1st century on the site of the city centre of present-day Vienna. The settlement was raised to the status of a municipium in 212. Even today, the streets of the First District show where the encampment placed its walls and moats. The Romans stayed until the 5th century.\nRoman Vindobona was located in the outskirts of the empire and thus fell prey to the chaos of the Migration Period. There are some indications that a catastrophic fire occurred around the beginning of the 5th century. However, the remains of the encampment were not deserted, and a small settlement remained. The streets and houses of early medieval Vienna followed the former Roman walls, which gives rise to the conclusion that parts of the fortification were still in place and used by the settlers.\nByzantine copper coins from the 6th century have been found several times in the area of today's city centre, indicating considerable trade activity. 
Graves from the 6th century were found during excavations next to the Berghof, in an area around Salvatorgasse. At that time, the Lombards controlled the area, with Slavs and Avars following later. Early Vienna was centred on the Berghof.\nThe first documented mention of the city during the Middle Ages is within the Salzburg Annals, dating to 881, when a battle apud Weniam was fought against the Magyars. However, it is unclear whether this refers to the city or the River Wien.\n\n\n== Babenberg rule ==\n\nIn 976, the Margraviate of Ostarrîchi was given to the Babenberg family. Vienna lay at its border with Hungary.\nVienna was an important site of trade as early as the 11th century. In the Exchange of Mautern between the Bishop of Passau and Margrave Leopold IV, Vienna is mentioned as a Civitas for the first time, which indicates the existence of a well-ordered settlement. \nIn 1155, Margrave Henry II of Austria made Vienna his capital. In 1156, Austria was raised to a duchy in the Privilegium Minus, with Vienna becoming the seat of all future dukes. During that time, the Schottenstift was founded.\nThe events surrounding the Third Crusade, during which King Richard the Lionheart was discovered and captured by Duke Leopold V the Virtuous two days before Christmas of 1192 in Erdberg near Vienna, brought an enormous ransom of 50,000 Silver Marks (about 10 to 12 tons of silver, about a third of the emperor's claims against the English. Richard had been extradited to him in March 1193). This allowed the creation of a mint and the construction of city walls around the year 1200. At the U-Bahn station Stubentor, some remains of the city walls can still be seen today. Because he had abused a protected crusader, Leopold V was excommunicated by Pope Celestine III, and died (without having been absolved) after falling from a horse in a tournament.\nIn 1221, Vienna received the rights of a city and as a staple port (Stapelrecht). 
This meant that all traders passing through Vienna had to offer their goods in the city. This allowed the Viennese to act as middlemen in trade, so that Vienna soon created a network of far-reaching trade relations, particularly along the Danube basin and to Venice, and to become one of the most important cities in the Holy Roman Empire.\n\nHowever, it was considered embarrassing that Vienna did not have its own bishop. It is known that Duke Frederick II negotiated about the creation of a bishopric in Vienna, and the same is suspected of Ottokar Přemysl.]","The year 1155 is significant in the history of Vienna as it marks the moment when Margrave Henry II of Austria made Vienna his capital. This decision was pivotal for the city's development under the Babenberg dynasty for several reasons.\n\nFirstly, by establishing Vienna as the capital, Henry II elevated its status from a regional settlement to a central political hub. This move not only solidified the Babenberg family's power but also positioned Vienna as a focal point for governance and administration in the region. As the seat of the dukes, Vienna became the center of political activity, which attracted nobility, merchants, and artisans, fostering economic growth and urban development.\n\nSecondly, the elevation of Vienna to capital status coincided with the raising of Austria to a duchy in 1156 through the Privilegium Minus. This change further enhanced Vienna's importance, as it became the administrative heart of the newly established duchy. The presence of the ducal court and the associated administrative functions stimulated the local economy and encouraged the establishment of institutions that would support trade and commerce.\n\nAdditionally, the Babenberg dynasty's focus on Vienna led to significant infrastructural developments. 
The city began to grow in size and complexity, with the construction of city walls and other fortifications around the year 1200, which were essential for protecting the burgeoning population and trade activities. The establishment of a mint following the capture of King Richard the Lionheart during the Third Crusade also contributed to the city's economic prosperity, allowing Vienna to engage more actively in trade.\n\nIn summary, the year 1155 was a turning point for Vienna, as it became the capital of the Babenberg dynasty. This status not only enhanced its political significance but also laid the groundwork for economic growth, urban development, and the establishment of vital trade networks, ultimately shaping Vienna into one of the most important cities in the Holy Roman Empire.",ai (gpt-4o-mini),ai (gpt-4o-mini)
1,"What were the implications of Vienna receiving the rights of a city and staple port in 1221, and how did this status affect its role in trade within the Holy Roman Empire?","[The history of Vienna has been long and varied, beginning when the Roman Empire created a military camp in the area now covered by Vienna's city centre. Vienna grew from the Roman settlement known as Vindobona to be an important trading site in the 11th century. It became the capital of the Babenberg dynasty and subsequently of the Austrian Habsburgs, under whom it became one of Europe's cultural hubs. During the 19th century as the capital of the Austrian Empire and later Austria-Hungary, it temporarily became one of Europe's biggest cities. Since the end of World War I, Vienna has been the capital of the Republic of Austria.\n\n\n== Beginnings and early Middle Ages ==\n\nThe name Vindobona derives from a Celtic language, suggesting that the region must have been inhabited before Roman times. The Romans created a military camp (occupied by Legio X Gemina) during the 1st century on the site of the city centre of present-day Vienna. The settlement was raised to the status of a municipium in 212. Even today, the streets of the First District show where the encampment placed its walls and moats. The Romans stayed until the 5th century.\nRoman Vindobona was located in the outskirts of the empire and thus fell prey to the chaos of the Migration Period. There are some indications that a catastrophic fire occurred around the beginning of the 5th century. However, the remains of the encampment were not deserted, and a small settlement remained. The streets and houses of early medieval Vienna followed the former Roman walls, which gives rise to the conclusion that parts of the fortification were still in place and used by the settlers.\nByzantine copper coins from the 6th century have been found several times in the area of today's city centre, indicating considerable trade activity. 
Graves from the 6th century were found during excavations next to the Berghof, in an area around Salvatorgasse. At that time, the Lombards controlled the area, with Slavs and Avars following later. Early Vienna was centred on the Berghof.\nThe first documented mention of the city during the Middle Ages is within the Salzburg Annals, dating to 881, when a battle apud Weniam was fought against the Magyars. However, it is unclear whether this refers to the city or the River Wien.\n\n\n== Babenberg rule ==\n\nIn 976, the Margraviate of Ostarrîchi was given to the Babenberg family. Vienna lay at its border with Hungary.\nVienna was an important site of trade as early as the 11th century. In the Exchange of Mautern between the Bishop of Passau and Margrave Leopold IV, Vienna is mentioned as a Civitas for the first time, which indicates the existence of a well-ordered settlement. \nIn 1155, Margrave Henry II of Austria made Vienna his capital. In 1156, Austria was raised to a duchy in the Privilegium Minus, with Vienna becoming the seat of all future dukes. During that time, the Schottenstift was founded.\nThe events surrounding the Third Crusade, during which King Richard the Lionheart was discovered and captured by Duke Leopold V the Virtuous two days before Christmas of 1192 in Erdberg near Vienna, brought an enormous ransom of 50,000 Silver Marks (about 10 to 12 tons of silver, about a third of the emperor's claims against the English. Richard had been extradited to him in March 1193). This allowed the creation of a mint and the construction of city walls around the year 1200. At the U-Bahn station Stubentor, some remains of the city walls can still be seen today. Because he had abused a protected crusader, Leopold V was excommunicated by Pope Celestine III, and died (without having been absolved) after falling from a horse in a tournament.\nIn 1221, Vienna received the rights of a city and as a staple port (Stapelrecht). 
This meant that all traders passing through Vienna had to offer their goods in the city. This allowed the Viennese to act as middlemen in trade, so that Vienna soon created a network of far-reaching trade relations, particularly along the Danube basin and to Venice, and to become one of the most important cities in the Holy Roman Empire.\n\nHowever, it was considered embarrassing that Vienna did not have its own bishop. It is known that Duke Frederick II negotiated about the creation of a bishopric in Vienna, and the same is suspected of Ottokar Přemysl.]","The granting of city rights and staple port status to Vienna in 1221 had significant implications for its development and role in trade within the Holy Roman Empire. \n\nFirstly, receiving city rights meant that Vienna was recognized as a self-governing entity with certain privileges, which included the ability to establish its own laws and governance structures. This autonomy allowed Vienna to develop a more organized and structured urban environment, fostering a sense of community and stability that was attractive to traders and settlers.\n\nThe staple port status (Stapelrecht) was particularly crucial for Vienna's economic growth. It mandated that all traders passing through the city had to offer their goods for sale there before continuing on their journeys. This requirement positioned Vienna as a central hub in trade networks, allowing it to act as a middleman in the exchange of goods. As a result, the city became a vital point for commerce, facilitating trade not only along the Danube basin but also extending to Venice and beyond.\n\nThis enhanced trade activity contributed to Vienna's emergence as one of the most important cities in the Holy Roman Empire. The influx of goods and merchants led to increased economic prosperity, which in turn attracted more people to the city, further boosting its population and economic significance. 
The establishment of a network of far-reaching trade relations helped Vienna to solidify its status as a cultural and economic center in the region.\n\nIn summary, the rights granted to Vienna in 1221 transformed it into a key player in the trade dynamics of the Holy Roman Empire, fostering economic growth, urban development, and a strategic position in regional commerce.",ai (gpt-4o-mini),ai (gpt-4o-mini)
2,"Discuss the significance of Rudolf IV of Austria's contributions to Vienna during the Habsburg rule, particularly in relation to the establishment of the University of Vienna and the construction of the Stephansdom.","[== Habsburg rule ==\n\nIn 1278, Rudolf I took control over the Austrian lands after his victory over Ottokar II of Bohemia and began to establish Habsburg rule. In Vienna, it took a relatively long time for the Habsburgs to establish their control, because partisans of Ottokar remained strong for a long time. There were several uprisings against Albert I. The family of the Paltrams vom Stephansfreithof was foremost among the insurgents.\nIn 1280, Jans der Enikel wrote the ""Fürstenbuch"", a first history of the city.\nWith the Luxembourg emperors, Prague became the imperial residence and Vienna stood in its shadow. The early Habsburgs attempted to extend it in order to keep up. Duke Albert II, for example, had the gothic choir of the Stephansdom built. In 1327, Frederick the Handsome published his edict allowing the city to maintain an Eisenbuch (iron book) listing its privileges.\n\nThe combination of the heraldic eagle with the city coat of arms showing a white cross in a red field is found on a seal dated 1327.\nThis heraldic emblem was in use throughout the 14th century in different variants.\nRudolf IV of Austria deserves credit for his prudent economic policy, which raised the level of prosperity. His epithet the Founder is due to two things: first, he founded the University of Vienna in 1365, and second, he began the construction of the gothic nave in the Stephansdom. The latter is connected to the creation of a metropolitan chapter, as a symbolic substitute for a bishop.\nThere was a period of inheritance disputes among the Habsburgs resulting not only in confusion, but also in an economic decline and social unrest, with disputes between the parties of patricians and artisans. 
While the patricians supported Ernest the Iron, the artisans supported Leopold IV. In 1408, the mayor Konrad Vorlauf, an exponent of the patrician party, was executed.\nAfter the election of Duke Albert V as German King Albert II, Vienna became the capital of the Holy Roman Empire. Albert's name is remembered for his expulsion of the Jewish population of Vienna in 1421/22.\nEventually, in 1469, Vienna was given its own bishop, and the Stephansdom became a cathedral. During the upheavals of the era of Emperor Frederick III, Vienna remained on the side of his opponents (first Albert VI, then Matthias Corvinus), as Frederick proved unable to maintain peace in the land vis-à-vis rampaging gangs of mercenaries (often remaining from the Hussite Wars).\nIn 1485, the Hungarian King Matthias Corvinus and the Black Army of Hungary conquered the city and Vienna became the king's seat that served as the capital of Hungary until 1490.\nIn 1522, under Ferdinand I, Holy Roman Emperor the Blood Judgment of Wiener Neustadt led to the execution of leading members of the opposition within the city, and thus a destruction of the political structures. From then on, the city stood under direct imperial control.\n\nIn 1556, Vienna became the seat of the Emperor, with Bohemia having been added to the Habsburg realm in 1526.\nDuring this time, the city was also recatholicised after having become Protestant rather quickly. In 1551, the Jesuits were brought to town and soon gained a large influence in court. The leader of the Counter-Reformation here was Melchior Khlesl, Bishop of Vienna from 1600.\n\n\n=== Turkish sieges ===\n\nIn 1529, Vienna was besieged by the Ottoman Turks for the first time (the First Turkish Siege), although unsuccessfully. The city, protected by medieval walls, only barely withstood the attacks, until epidemics and an early winter forced the Turks to retreat. The siege had shown that new fortifications were needed. 
Following plans by Sebastian Schrantz, Vienna was expanded to a fortress in 1548. The city was furnished with eleven bastions and surrounded by a moat. A glacis was created around Vienna, a broad strip without any buildings, which allowed defenders to fire freely. These fortifications, which accounted for the major part of building activities well into the 17th century, became decisive in the Second Turkish Siege of 1683, as they allowed the city to maintain itself for two months, until the Turkish army was defeated by the army led by the Polish King John III Sobieski. This was the turning point in the Turkish Wars, as the Ottoman Empire was pushed back more and more during the following decades.]","Rudolf IV of Austria, known as ""the Founder,"" played a pivotal role in shaping Vienna during the Habsburg rule through his significant contributions to the city's cultural and architectural landscape. His establishment of the University of Vienna in 1365 marked a crucial step in the development of higher education in the region, positioning Vienna as an intellectual center in Central Europe. This institution not only fostered academic growth but also attracted scholars and students, enhancing the city's reputation and contributing to its prosperity.\n\nIn addition to his educational reforms, Rudolf IV initiated the construction of the gothic nave of the Stephansdom, which became a symbol of Vienna's religious and cultural identity. This project was significant not only for its architectural grandeur but also for its role in establishing a metropolitan chapter, effectively elevating the status of the church in Vienna. The cathedral became a focal point for the community, reflecting the city's aspirations and ambitions during a time when it was striving to assert its importance within the Holy Roman Empire.\n\nRudolf IV's prudent economic policies further supported these initiatives, leading to increased prosperity in Vienna. 
His contributions laid the groundwork for the city's development as a major urban center, influencing its social, cultural, and political landscape for generations to come. Overall, Rudolf IV's legacy is marked by his efforts to enhance Vienna's educational and architectural stature, which were instrumental in the city's evolution during the Habsburg era.",ai (gpt-4o-mini),ai (gpt-4o-mini)
3,"Analyze the impact of the First Turkish Siege of 1529 on Vienna's fortifications and military strategy, and explain how these developments influenced the outcome of the Second Turkish Siege in 1683.","[== Habsburg rule ==\n\nIn 1278, Rudolf I took control over the Austrian lands after his victory over Ottokar II of Bohemia and began to establish Habsburg rule. In Vienna, it took a relatively long time for the Habsburgs to establish their control, because partisans of Ottokar remained strong for a long time. There were several uprisings against Albert I. The family of the Paltrams vom Stephansfreithof was foremost among the insurgents.\nIn 1280, Jans der Enikel wrote the ""Fürstenbuch"", a first history of the city.\nWith the Luxembourg emperors, Prague became the imperial residence and Vienna stood in its shadow. The early Habsburgs attempted to extend it in order to keep up. Duke Albert II, for example, had the gothic choir of the Stephansdom built. In 1327, Frederick the Handsome published his edict allowing the city to maintain an Eisenbuch (iron book) listing its privileges.\n\nThe combination of the heraldic eagle with the city coat of arms showing a white cross in a red field is found on a seal dated 1327.\nThis heraldic emblem was in use throughout the 14th century in different variants.\nRudolf IV of Austria deserves credit for his prudent economic policy, which raised the level of prosperity. His epithet the Founder is due to two things: first, he founded the University of Vienna in 1365, and second, he began the construction of the gothic nave in the Stephansdom. The latter is connected to the creation of a metropolitan chapter, as a symbolic substitute for a bishop.\nThere was a period of inheritance disputes among the Habsburgs resulting not only in confusion, but also in an economic decline and social unrest, with disputes between the parties of patricians and artisans. 
While the patricians supported Ernest the Iron, the artisans supported Leopold IV. In 1408, the mayor Konrad Vorlauf, an exponent of the patrician party, was executed.\nAfter the election of Duke Albert V as German King Albert II, Vienna became the capital of the Holy Roman Empire. Albert's name is remembered for his expulsion of the Jewish population of Vienna in 1421/22.\nEventually, in 1469, Vienna was given its own bishop, and the Stephansdom became a cathedral. During the upheavals of the era of Emperor Frederick III, Vienna remained on the side of his opponents (first Albert VI, then Matthias Corvinus), as Frederick proved unable to maintain peace in the land vis-à-vis rampaging gangs of mercenaries (often remaining from the Hussite Wars).\nIn 1485, the Hungarian King Matthias Corvinus and the Black Army of Hungary conquered the city and Vienna became the king's seat that served as the capital of Hungary until 1490.\nIn 1522, under Ferdinand I, Holy Roman Emperor the Blood Judgment of Wiener Neustadt led to the execution of leading members of the opposition within the city, and thus a destruction of the political structures. From then on, the city stood under direct imperial control.\n\nIn 1556, Vienna became the seat of the Emperor, with Bohemia having been added to the Habsburg realm in 1526.\nDuring this time, the city was also recatholicised after having become Protestant rather quickly. In 1551, the Jesuits were brought to town and soon gained a large influence in court. The leader of the Counter-Reformation here was Melchior Khlesl, Bishop of Vienna from 1600.\n\n\n=== Turkish sieges ===\n\nIn 1529, Vienna was besieged by the Ottoman Turks for the first time (the First Turkish Siege), although unsuccessfully. The city, protected by medieval walls, only barely withstood the attacks, until epidemics and an early winter forced the Turks to retreat. The siege had shown that new fortifications were needed. 
Following plans by Sebastian Schrantz, Vienna was expanded to a fortress in 1548. The city was furnished with eleven bastions and surrounded by a moat. A glacis was created around Vienna, a broad strip without any buildings, which allowed defenders to fire freely. These fortifications, which accounted for the major part of building activities well into the 17th century, became decisive in the Second Turkish Siege of 1683, as they allowed the city to maintain itself for two months, until the Turkish army was defeated by the army led by the Polish King John III Sobieski. This was the turning point in the Turkish Wars, as the Ottoman Empire was pushed back more and more during the following decades.]","The First Turkish Siege of Vienna in 1529 had a profound impact on the city's fortifications and military strategy, which ultimately influenced the outcome of the Second Turkish Siege in 1683.\n\nDuring the First Siege, Vienna's medieval walls were put to the test against the Ottoman forces. Although the city managed to withstand the siege, it became evident that the existing fortifications were inadequate to repel a determined and well-equipped enemy. The siege highlighted the need for modernized defenses, as the city's defenses barely held against the Turkish attacks, which were exacerbated by epidemics and harsh winter conditions that forced the Ottomans to retreat.\n\nIn response to the vulnerabilities exposed during the First Siege, significant improvements were made to Vienna's fortifications. By 1548, under the guidance of military architect Sebastian Schrantz, the city was transformed into a formidable fortress. This included the construction of eleven bastions, a surrounding moat, and a glacis—a clear area that allowed defenders to fire upon attackers without obstruction. 
These enhancements were crucial in creating a more defensible position for the city, allowing it to withstand future sieges more effectively.\n\nWhen the Second Turkish Siege occurred in 1683, the improvements made to Vienna's fortifications played a critical role in the city's defense. The fortified walls and bastions allowed the defenders to hold out against the Ottoman forces for two months, demonstrating the effectiveness of the new military strategy and fortifications. The city's ability to maintain its defenses until the arrival of reinforcements, particularly the Polish army led by King John III Sobieski, was pivotal in the eventual defeat of the Ottoman forces.\n\nIn summary, the First Turkish Siege of 1529 served as a catalyst for significant advancements in Vienna's military strategy and fortifications. The lessons learned from that siege directly influenced the city's preparedness and resilience during the Second Turkish Siege in 1683, ultimately contributing to the successful defense of Vienna and marking a turning point in the conflict between the Habsburgs and the Ottoman Empire.",ai (gpt-4o-mini),ai (gpt-4o-mini)
4,"Discuss the impact of Emperor Joseph II's administration reforms in 1783 on the city of Vienna, particularly in relation to urban management and public health.","[=== 18th century ===\n\nThe following period was characterised by extensive building activities. In the course of reconstruction, Vienna was largely turned into a baroque city. The most important architects were Johann Bernhard Fischer von Erlach and Johann Lukas von Hildebrandt. Most construction happened in the suburbs (Vorstädte), as the nobility began to cover the surrounding land with garden palaces, known as Palais. The best known are the Palais Liechtenstein, Palais Modena, Schönbrunn Palace, Palais Schwarzenberg, and the Belvedere (the garden palais of Prince Eugene of Savoy). In 1704, an outer fortification, the Linienwall, was built around the Vorstädte.\nAfter the extensive plague epidemics of 1679 and 1713, the population began to grow steadily. It is estimated that 150,000 people lived in Vienna in 1724, and 200,000 in 1790. At that time, the first factories were built, starting in Leopoldstadt. Leopoldstadt also became a site where many Jews lived, as they had been driven out of their 50-year-old ghetto in 1670. Hygienic problems began to become noticeable: sewers and street cleaning began to develop. Also in this time, the first house numbers (the Konskriptionsnummern) were issued, and the government postal system began to develop.\nUnder Emperor Joseph II, the city administration was modernized in 1783: officials in charge of only the city were introduced, and the Magistrate was created (More information about the Magistrate of the City of Vienna specifically can be found in German at de:Magistrat der Stadt Wien.). At the same time, the graveyards within the city were closed.\n\n\n=== 19th century ===\n\nDuring the Napoleonic Wars, Vienna was taken by the French twice, in 1805 and 1809. The first conquest happened without a battle. 
Three French marshals crossed the strongly defended Taborbrücke (Tábor bridge), the only Danube bridge at that time, and convinced the Austrian commander that the war was already over. In the meantime, the French army easily entered the city and was greeted by the population with interest rather than rejection. Napoleon allowed 10,000 men of the Vienna national guard to remain armed and left the arsenal to them when he left, as complete as he had found it.\nHowever, the second occupation happened only after heavy fire. Shortly after, Napoleon suffered his first large defeat at Aspern, nearby. Less than two months later, his army crossed the Danube again and fought the Battle of Wagram on the same terrain as the previous Battle of Aspern. This second battle resulted in a victory for the French, and Austria soon surrendered, ending the War of the Fifth Coalition. In 1810, Salomon Mayer Rothschild arrived in Vienna from Frankfurt and sets up a bank named ""Mayer von Rothschild und Söhne"". The Emperor of Austria in 1823, made the five Rothschild brothers barons. The Rothschild family became famous as bankers in the major countries of Europe, and the Rothschild banking family of Austria remained prominent until the Creditanstalt bank in Vienna was confiscated by the Nazis in 1938.\nAfter Napoleon's final defeat, the Congress of Vienna took place from September 18, 1814 to June 9, 1815, in which the political map of Europe was redrawn. The congress members indulged in many social events, which induced the witty Charles Joseph, Prince de Ligne to famously say: Le congres danse beaucoup, mais il ne marche pas (""The congress dances, but does not progress""). 
The events cost Austria a great deal of money, which was reflected in mockery about the major participants:\n\nAlexander of Russia: loves for all\nFrederick William of Prussia: thinks for all\nFrederick of Denmark: speaks for all\nMaximilian of Bavaria: drinks for all\nFrederick of Württemberg: eats for all\nEmperor Francis of Austria: pays for all\n\nThe first half of the century was characterised by intensive industrialization, with Vienna being the center of the railway network after 1837.\nThe French February Revolution of 1848 had an effect as far away as Vienna: on March 13, the March Revolution, which forced long-serving chancellor Metternich to resign.\nDuring the 19th century, Vienna, along with Budapest, became one of the main centers of the Aromanian diaspora. The Aromanian population of these cities stands out for one of the first ones to develop a strictly Aromanian identity.]","Emperor Joseph II's administration reforms in 1783 had a significant impact on the city of Vienna, particularly in the areas of urban management and public health. One of the key changes was the modernization of the city administration, which involved the introduction of officials specifically responsible for managing the city. This restructuring led to the establishment of the Magistrate, an administrative body that played a crucial role in overseeing various aspects of urban life.\n\nThe reforms aimed to improve the efficiency and effectiveness of governance in Vienna, which was becoming increasingly important as the population grew and urban issues became more pronounced. With an estimated population of 200,000 by 1790, the city faced challenges related to infrastructure, sanitation, and public health. The introduction of a dedicated city administration allowed for more focused attention on these issues.\n\nIn terms of public health, the reforms coincided with the growing recognition of hygienic problems in the city. 
The previous extensive plague epidemics had highlighted the need for better sanitation and health measures. Under Joseph II's administration, efforts began to develop sewers and improve street cleaning, which were essential for reducing the spread of disease and improving the overall living conditions in Vienna.\n\nAdditionally, the closure of graveyards within the city limits was a significant public health measure. This action not only addressed the health risks associated with overcrowded burial sites but also reflected a shift towards more modern urban planning practices that prioritized the well-being of residents.\n\nOverall, Emperor Joseph II's reforms laid the groundwork for a more organized and health-conscious approach to urban management in Vienna, addressing the pressing needs of a growing population and setting the stage for further developments in public health and city planning in the years to come.",ai (gpt-4o-mini),ai (gpt-4o-mini)


In [91]:
rag_dataset.save_json("../../Data/Vienna_dataset.json")

### 1.7 Context Relevancy and Answer Relevancy  
`AnswerRelevancyEvaluator` and `ContextRelevancyEvaluator` measure the relevancy of a generated answer and of the retrieved contexts, respectively, to a given user query. Both evaluators return a `score` between 0 and 1 as well as generated `feedback` explaining the score; a higher score means higher relevancy. The judge LLM is prompted to take a step-by-step approach when producing a relevancy score. For answer relevancy, it is asked the following two questions about the generated answer (for context relevancy these are slightly adjusted):

1. Does the provided response match the subject matter of the user's query?
2. Does the provided response attempt to address the focus or perspective on the subject matter taken on by the user's query?

Each question is worth 1 point, so a perfect evaluation yields 2/2, which corresponds to the maximum reported `score` of 1.0.
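To see how a 2-point rubric can line up with a `score` between 0 and 1, here is a minimal sketch. It assumes the raw points are simply divided by the maximum achievable points; `normalize_score` and `max_points` are illustrative names, not part of the LlamaIndex API.

```python
def normalize_score(raw_points: float, max_points: float = 2.0) -> float:
    """Map rubric points onto the 0-1 range (hypothetical helper, assumed behavior)."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    return raw_points / max_points


# Both rubric questions satisfied: 2 of 2 points.
perfect = normalize_score(2)
# Only one rubric question satisfied: 1 of 2 points.
partial = normalize_score(1)
print(perfect, partial)
```

Under this assumption, an answer that satisfies both questions is reported with the maximum score, and one that satisfies a single question lands at the midpoint.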

First, we build a RAG over the same source documents used to create the `rag_dataset`.

In [92]:
index = VectorStoreIndex.from_documents(documents=documents)
query_engine = index.as_query_engine()

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


With our RAG (i.e., `query_engine`) defined, we can use it to make predictions (that is, generate a response to each query) over the `rag_dataset`.

In [93]:
prediction_dataset = await rag_dataset.amake_predictions_with(
    predictor=query_engine, batch_size=100, show_progress=True
)

Batch processing of predictions: 100%|██████████| 16/16 [00:10<00:00,  1.58it/s]


### Evaluating Answer and Context Relevancy Separately

We first need to define our evaluators (i.e. `AnswerRelevancyEvaluator` & `ContextRelevancyEvaluator`):

In [94]:
# instantiate the judge LLMs (gpt-3.5-turbo for answer relevancy, gpt-4 for context relevancy)
from llama_index.llms.openai import OpenAI
from llama_index.core.evaluation import (
    AnswerRelevancyEvaluator,
    ContextRelevancyEvaluator,
)

judges = {}

judges["answer_relevancy"] = AnswerRelevancyEvaluator(
    llm=OpenAI(temperature=0, model="gpt-3.5-turbo"),
)

judges["context_relevancy"] = ContextRelevancyEvaluator(
    llm=OpenAI(temperature=0, model="gpt-4"),
)

Now we can use our evaluators to score every <example, prediction> pair. Each `aevaluate` call returns a coroutine, so the loop only collects the evaluation tasks; they are awaited afterwards.

In [95]:
eval_tasks = []
for example, prediction in zip(
    rag_dataset.examples, prediction_dataset.predictions
):
    eval_tasks.append(
        judges["answer_relevancy"].aevaluate(
            query=example.query,
            response=prediction.response,
            sleep_time_in_seconds=10.0,
        )
    )
    eval_tasks.append(
        judges["context_relevancy"].aevaluate(
            query=example.query,
            contexts=prediction.contexts,
            sleep_time_in_seconds=10.0,
        )
    )

In [96]:
from tqdm.asyncio import tqdm_asyncio

# await the 32 tasks in four batches of 8 to limit concurrent requests to the API
eval_results1 = await tqdm_asyncio.gather(*eval_tasks[:8])
eval_results2 = await tqdm_asyncio.gather(*eval_tasks[8:16])
eval_results3 = await tqdm_asyncio.gather(*eval_tasks[16:24])
eval_results4 = await tqdm_asyncio.gather(*eval_tasks[24:])
eval_results = eval_results1 + eval_results2 + eval_results3 + eval_results4

100%|██████████| 8/8 [00:23<00:00,  2.95s/it]
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 7.932000 seconds
100%|██████████| 8/8 [00:38<00:00,  4.79s/it]
100%|██████████| 8/8 [00:55<00:00,  6.98s/it]
100%|██████████| 8/8 [00:44<00:00,  5.58s/it]
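The manual slicing into batches of 8 can be generalized. The following is a minimal sketch of a helper that awaits coroutines in fixed-size batches and preserves their order; `gather_in_batches`, `batch_size`, `pause_seconds`, and `_fake_eval` are hypothetical names introduced for illustration, and plain `asyncio.gather` stands in for `tqdm_asyncio.gather`.

```python
import asyncio


async def gather_in_batches(coros, batch_size=8, pause_seconds=0.0):
    """Await coroutines batch by batch, returning results in input order."""
    results = []
    for start in range(0, len(coros), batch_size):
        batch = coros[start:start + batch_size]
        # gather preserves the order of its arguments within a batch
        results.extend(await asyncio.gather(*batch))
        # optionally pause between batches to ease pressure on rate limits
        if pause_seconds and start + batch_size < len(coros):
            await asyncio.sleep(pause_seconds)
    return results


async def _fake_eval(i):
    """Stand-in for an evaluator call; returns a deterministic value."""
    await asyncio.sleep(0)
    return i * 2


# In a script, drive the event loop with asyncio.run; in a notebook cell,
# use `await gather_in_batches(...)` directly instead.
demo_results = asyncio.run(
    gather_in_batches([_fake_eval(i) for i in range(5)], batch_size=2)
)
print(demo_results)
```

Because results stay in input order, the interleaved answer and context evaluations can still be separated by position afterwards.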


In [97]:
evals = {
    "answer_relevancy": eval_results[::2],
    "context_relevancy": eval_results[1::2],
}
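The stride-2 slices work because the tasks were appended in alternating order: an answer relevancy task, then a context relevancy task, for each example. A small sketch with placeholder strings (the real entries would be `EvaluationResult` objects) shows the de-interleaving:

```python
# Placeholder labels standing in for the interleaved evaluation results.
interleaved = [
    "answer_0", "context_0",
    "answer_1", "context_1",
    "answer_2", "context_2",
]

split = {
    "answer_relevancy": interleaved[::2],    # even positions: 0, 2, 4, ...
    "context_relevancy": interleaved[1::2],  # odd positions: 1, 3, 5, ...
}
print(split)
```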

### Taking a look at the evaluation results

Here we use a utility function to convert the list of `EvaluationResult` objects into something more notebook-friendly. This utility returns two DataFrames: a "deep" one containing all of the evaluation results, and a second one that aggregates them by taking the mean of all the scores per evaluation method.

In [98]:
from llama_index.core.evaluation.notebook_utils import get_eval_results_df
import pandas as pd

# Per-metric DataFrames: one row per evaluation (deep) and the mean score (mean).
deep_dfs = {}
mean_dfs = {}
for metric, results in evals.items():
    deep_df, mean_df = get_eval_results_df(
        names=["baseline"] * len(results),  # label every result as the "baseline" RAG
        results_arr=results,
        metric=metric,
    )
    deep_dfs[metric] = deep_df
    mean_dfs[metric] = mean_df

In [99]:
mean_scores_df = pd.concat(
    [mdf.reset_index() for _, mdf in mean_dfs.items()],
    axis=0,
    ignore_index=True,
)
mean_scores_df = mean_scores_df.set_index("index")
mean_scores_df.index = mean_scores_df.index.set_names(["metrics"])
mean_scores_df

| metrics | rag = baseline |
| --- | --- |
| mean_answer_relevancy_score | 1.0 |
| mean_context_relevancy_score | 0.507812 |


The above utility also provides the mean score across all of the evaluations in `mean_df`.

We can get a look at the raw distribution of the scores by invoking `value_counts()` on the `deep_df`.

In [100]:
deep_dfs["answer_relevancy"]["scores"].value_counts()

scores
1.0    16
Name: count, dtype: int64

In [101]:
deep_dfs["context_relevancy"]["scores"].value_counts()

scores
0.750    4
0.250    3
0.625    2
0.375    2
0.000    2
1.000    1
0.875    1
0.500    1
Name: count, dtype: int64
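
As a sanity check, the mean context relevancy score can be recomputed by hand from the counts printed above. The sketch below copies the score/count pairs into a plain dictionary (the variable names are illustrative only) and takes the count-weighted average.

```python
# Score -> count pairs copied from the value_counts() output above.
context_relevancy_counts = {
    0.750: 4,
    0.250: 3,
    0.625: 2,
    0.375: 2,
    0.000: 2,
    1.000: 1,
    0.875: 1,
    0.500: 1,
}

# Total number of evaluated examples.
total = sum(context_relevancy_counts.values())

# Count-weighted sum of scores, then the mean.
weighted_sum = sum(score * count for score, count in context_relevancy_counts.items())
mean_score = weighted_sum / total

print(total)       # 16
print(mean_score)  # 0.5078125
```

The 16 examples match the 16 answer relevancy scores, and the mean agrees with the `mean_context_relevancy_score` of 0.507812 reported earlier, up to display rounding.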

For the most part, the default RAG does well at generating answers that are relevant to the query: all 16 answer relevancy scores are 1.0. Context relevancy is more mixed, with a mean of about 0.51. For a closer look, view the individual records in any of the `deep_df` DataFrames.

In [102]:
deep_dfs["context_relevancy"].head(2)

Unnamed: 0,rag,query,answer,contexts,scores,feedbacks
0,baseline,Discuss the significance of the year 1155 in the history of Vienna and explain how it contributed to the city's development under the Babenberg dynasty.,,"[The history of Vienna has been long and varied, beginning when the Roman Empire created a military camp in the area now covered by Vienna's city centre. Vienna grew from the Roman settlement known as Vindobona to be an important trading site in the 11th century. It became the capital of the Babenberg dynasty and subsequently of the Austrian Habsburgs, under whom it became one of Europe's cultural hubs. During the 19th century as the capital of the Austrian Empire and later Austria-Hungary, it temporarily became one of Europe's biggest cities. Since the end of World War I, Vienna has been the capital of the Republic of Austria.\n\n\n== Beginnings and early Middle Ages ==\n\nThe name Vindobona derives from a Celtic language, suggesting that the region must have been inhabited before Roman times. The Romans created a military camp (occupied by Legio X Gemina) during the 1st century on the site of the city centre of present-day Vienna. The settlement was raised to the status of a municipium in 212. Even today, the streets of the First District show where the encampment placed its walls and moats. The Romans stayed until the 5th century.\nRoman Vindobona was located in the outskirts of the empire and thus fell prey to the chaos of the Migration Period. There are some indications that a catastrophic fire occurred around the beginning of the 5th century. However, the remains of the encampment were not deserted, and a small settlement remained. The streets and houses of early medieval Vienna followed the former Roman walls, which gives rise to the conclusion that parts of the fortification were still in place and used by the settlers.\nByzantine copper coins from the 6th century have been found several times in the area of today's city centre, indicating considerable trade activity. 
Graves from the 6th century were found during excavations next to the Berghof, in an area around Salvatorgasse. At that time, the Lombards controlled the area, with Slavs and Avars following later. Early Vienna was centred on the Berghof.\nThe first documented mention of the city during the Middle Ages is within the Salzburg Annals, dating to 881, when a battle apud Weniam was fought against the Magyars. However, it is unclear whether this refers to the city or the River Wien.\n\n\n== Babenberg rule ==\n\nIn 976, the Margraviate of Ostarrîchi was given to the Babenberg family. Vienna lay at its border with Hungary.\nVienna was an important site of trade as early as the 11th century. In the Exchange of Mautern between the Bishop of Passau and Margrave Leopold IV, Vienna is mentioned as a Civitas for the first time, which indicates the existence of a well-ordered settlement. \nIn 1155, Margrave Henry II of Austria made Vienna his capital. In 1156, Austria was raised to a duchy in the Privilegium Minus, with Vienna becoming the seat of all future dukes. During that time, the Schottenstift was founded.\nThe events surrounding the Third Crusade, during which King Richard the Lionheart was discovered and captured by Duke Leopold V the Virtuous two days before Christmas of 1192 in Erdberg near Vienna, brought an enormous ransom of 50,000 Silver Marks (about 10 to 12 tons of silver, about a third of the emperor's claims against the English. Richard had been extradited to him in March 1193). This allowed the creation of a mint and the construction of city walls around the year 1200. At the U-Bahn station Stubentor, some remains of the city walls can still be seen today. Because he had abused a protected crusader, Leopold V was excommunicated by Pope Celestine III, and died (without having been absolved) after falling from a horse in a tournament.\nIn 1221, Vienna received the rights of a city and as a staple port (Stapelrecht). 
This meant that all traders passing through Vienna had to offer their goods in the city. This allowed the Viennese to act as middlemen in trade, so that Vienna soon created a network of far-reaching trade relations, particularly along the Danube basin and to Venice, and to become one of the most important cities in the Holy Roman Empire.\n\nHowever, it was considered embarrassing that Vienna did not have its own bishop. It is known that Duke Frederick II negotiated about the creation of a bishopric in Vienna, and the same is suspected of Ottokar Přemysl., == Habsburg rule ==\n\nIn 1278, Rudolf I took control over the Austrian lands after his victory over Ottokar II of Bohemia and began to establish Habsburg rule. In Vienna, it took a relatively long time for the Habsburgs to establish their control, because partisans of Ottokar remained strong for a long time. There were several uprisings against Albert I. The family of the Paltrams vom Stephansfreithof was foremost among the insurgents.\nIn 1280, Jans der Enikel wrote the ""Fürstenbuch"", a first history of the city.\nWith the Luxembourg emperors, Prague became the imperial residence and Vienna stood in its shadow. The early Habsburgs attempted to extend it in order to keep up. Duke Albert II, for example, had the gothic choir of the Stephansdom built. In 1327, Frederick the Handsome published his edict allowing the city to maintain an Eisenbuch (iron book) listing its privileges.\n\nThe combination of the heraldic eagle with the city coat of arms showing a white cross in a red field is found on a seal dated 1327.\nThis heraldic emblem was in use throughout the 14th century in different variants.\nRudolf IV of Austria deserves credit for his prudent economic policy, which raised the level of prosperity. His epithet the Founder is due to two things: first, he founded the University of Vienna in 1365, and second, he began the construction of the gothic nave in the Stephansdom. 
The latter is connected to the creation of a metropolitan chapter, as a symbolic substitute for a bishop.\nThere was a period of inheritance disputes among the Habsburgs resulting not only in confusion, but also in an economic decline and social unrest, with disputes between the parties of patricians and artisans. While the patricians supported Ernest the Iron, the artisans supported Leopold IV. In 1408, the mayor Konrad Vorlauf, an exponent of the patrician party, was executed.\nAfter the election of Duke Albert V as German King Albert II, Vienna became the capital of the Holy Roman Empire. Albert's name is remembered for his expulsion of the Jewish population of Vienna in 1421/22.\nEventually, in 1469, Vienna was given its own bishop, and the Stephansdom became a cathedral. During the upheavals of the era of Emperor Frederick III, Vienna remained on the side of his opponents (first Albert VI, then Matthias Corvinus), as Frederick proved unable to maintain peace in the land vis-à-vis rampaging gangs of mercenaries (often remaining from the Hussite Wars).\nIn 1485, the Hungarian King Matthias Corvinus and the Black Army of Hungary conquered the city and Vienna became the king's seat that served as the capital of Hungary until 1490.\nIn 1522, under Ferdinand I, Holy Roman Emperor the Blood Judgment of Wiener Neustadt led to the execution of leading members of the opposition within the city, and thus a destruction of the political structures. From then on, the city stood under direct imperial control.\n\nIn 1556, Vienna became the seat of the Emperor, with Bohemia having been added to the Habsburg realm in 1526.\nDuring this time, the city was also recatholicised after having become Protestant rather quickly. In 1551, the Jesuits were brought to town and soon gained a large influence in court. 
The leader of the Counter-Reformation here was Melchior Khlesl, Bishop of Vienna from 1600.\n\n\n=== Turkish sieges ===\n\nIn 1529, Vienna was besieged by the Ottoman Turks for the first time (the First Turkish Siege), although unsuccessfully. The city, protected by medieval walls, only barely withstood the attacks, until epidemics and an early winter forced the Turks to retreat. The siege had shown that new fortifications were needed. Following plans by Sebastian Schrantz, Vienna was expanded to a fortress in 1548. The city was furnished with eleven bastions and surrounded by a moat. A glacis was created around Vienna, a broad strip without any buildings, which allowed defenders to fire freely. These fortifications, which accounted for the major part of building activities well into the 17th century, became decisive in the Second Turkish Siege of 1683, as they allowed the city to maintain itself for two months, until the Turkish army was defeated by the army led by the Polish King John III Sobieski. This was the turning point in the Turkish Wars, as the Ottoman Empire was pushed back more and more during the following decades.]",0.625,"The retrieved context does match the subject matter of the user's query. It provides a detailed history of Vienna, including the period of the Babenberg rule. It specifically mentions the year 1155, when Margrave Henry II of Austria made Vienna his capital, and how this contributed to the city's development. However, the context does not provide a comprehensive explanation of the significance of the year 1155 in the history of Vienna and how it contributed to the city's development under the Babenberg dynasty. It does not delve into the specific changes or developments that occurred in the city as a result of this event. Therefore, the context cannot be used exclusively to provide a full answer to the user's query.\n\n1. Does the retrieved context match the subject matter of the user's query? Yes (2/2)\n2. 
Can the retrieved context be used exclusively to provide a full answer to the user's query? No (0.5/2)\n\n[RESULT] 2.5/4"
1,baseline,"What were the implications of Vienna receiving the rights of a city and staple port in 1221, and how did this status affect its role in trade within the Holy Roman Empire?",,"[The history of Vienna has been long and varied, beginning when the Roman Empire created a military camp in the area now covered by Vienna's city centre. Vienna grew from the Roman settlement known as Vindobona to be an important trading site in the 11th century. It became the capital of the Babenberg dynasty and subsequently of the Austrian Habsburgs, under whom it became one of Europe's cultural hubs. During the 19th century as the capital of the Austrian Empire and later Austria-Hungary, it temporarily became one of Europe's biggest cities. Since the end of World War I, Vienna has been the capital of the Republic of Austria.\n\n\n== Beginnings and early Middle Ages ==\n\nThe name Vindobona derives from a Celtic language, suggesting that the region must have been inhabited before Roman times. The Romans created a military camp (occupied by Legio X Gemina) during the 1st century on the site of the city centre of present-day Vienna. The settlement was raised to the status of a municipium in 212. Even today, the streets of the First District show where the encampment placed its walls and moats. The Romans stayed until the 5th century.\nRoman Vindobona was located in the outskirts of the empire and thus fell prey to the chaos of the Migration Period. There are some indications that a catastrophic fire occurred around the beginning of the 5th century. However, the remains of the encampment were not deserted, and a small settlement remained. The streets and houses of early medieval Vienna followed the former Roman walls, which gives rise to the conclusion that parts of the fortification were still in place and used by the settlers.\nByzantine copper coins from the 6th century have been found several times in the area of today's city centre, indicating considerable trade activity. 
Graves from the 6th century were found during excavations next to the Berghof, in an area around Salvatorgasse. At that time, the Lombards controlled the area, with Slavs and Avars following later. Early Vienna was centred on the Berghof.\nThe first documented mention of the city during the Middle Ages is within the Salzburg Annals, dating to 881, when a battle apud Weniam was fought against the Magyars. However, it is unclear whether this refers to the city or the River Wien.\n\n\n== Babenberg rule ==\n\nIn 976, the Margraviate of Ostarrîchi was given to the Babenberg family. Vienna lay at its border with Hungary.\nVienna was an important site of trade as early as the 11th century. In the Exchange of Mautern between the Bishop of Passau and Margrave Leopold IV, Vienna is mentioned as a Civitas for the first time, which indicates the existence of a well-ordered settlement. \nIn 1155, Margrave Henry II of Austria made Vienna his capital. In 1156, Austria was raised to a duchy in the Privilegium Minus, with Vienna becoming the seat of all future dukes. During that time, the Schottenstift was founded.\nThe events surrounding the Third Crusade, during which King Richard the Lionheart was discovered and captured by Duke Leopold V the Virtuous two days before Christmas of 1192 in Erdberg near Vienna, brought an enormous ransom of 50,000 Silver Marks (about 10 to 12 tons of silver, about a third of the emperor's claims against the English. Richard had been extradited to him in March 1193). This allowed the creation of a mint and the construction of city walls around the year 1200. At the U-Bahn station Stubentor, some remains of the city walls can still be seen today. Because he had abused a protected crusader, Leopold V was excommunicated by Pope Celestine III, and died (without having been absolved) after falling from a horse in a tournament.\nIn 1221, Vienna received the rights of a city and as a staple port (Stapelrecht). 
This meant that all traders passing through Vienna had to offer their goods in the city. This allowed the Viennese to act as middlemen in trade, so that Vienna soon created a network of far-reaching trade relations, particularly along the Danube basin and to Venice, and to become one of the most important cities in the Holy Roman Empire.\n\nHowever, it was considered embarrassing that Vienna did not have its own bishop. It is known that Duke Frederick II negotiated about the creation of a bishopric in Vienna, and the same is suspected of Ottokar Přemysl., == Habsburg rule ==\n\nIn 1278, Rudolf I took control over the Austrian lands after his victory over Ottokar II of Bohemia and began to establish Habsburg rule. In Vienna, it took a relatively long time for the Habsburgs to establish their control, because partisans of Ottokar remained strong for a long time. There were several uprisings against Albert I. The family of the Paltrams vom Stephansfreithof was foremost among the insurgents.\nIn 1280, Jans der Enikel wrote the ""Fürstenbuch"", a first history of the city.\nWith the Luxembourg emperors, Prague became the imperial residence and Vienna stood in its shadow. The early Habsburgs attempted to extend it in order to keep up. Duke Albert II, for example, had the gothic choir of the Stephansdom built. In 1327, Frederick the Handsome published his edict allowing the city to maintain an Eisenbuch (iron book) listing its privileges.\n\nThe combination of the heraldic eagle with the city coat of arms showing a white cross in a red field is found on a seal dated 1327.\nThis heraldic emblem was in use throughout the 14th century in different variants.\nRudolf IV of Austria deserves credit for his prudent economic policy, which raised the level of prosperity. His epithet the Founder is due to two things: first, he founded the University of Vienna in 1365, and second, he began the construction of the gothic nave in the Stephansdom. 
The latter is connected to the creation of a metropolitan chapter, as a symbolic substitute for a bishop.\nThere was a period of inheritance disputes among the Habsburgs resulting not only in confusion, but also in an economic decline and social unrest, with disputes between the parties of patricians and artisans. While the patricians supported Ernest the Iron, the artisans supported Leopold IV. In 1408, the mayor Konrad Vorlauf, an exponent of the patrician party, was executed.\nAfter the election of Duke Albert V as German King Albert II, Vienna became the capital of the Holy Roman Empire. Albert's name is remembered for his expulsion of the Jewish population of Vienna in 1421/22.\nEventually, in 1469, Vienna was given its own bishop, and the Stephansdom became a cathedral. During the upheavals of the era of Emperor Frederick III, Vienna remained on the side of his opponents (first Albert VI, then Matthias Corvinus), as Frederick proved unable to maintain peace in the land vis-à-vis rampaging gangs of mercenaries (often remaining from the Hussite Wars).\nIn 1485, the Hungarian King Matthias Corvinus and the Black Army of Hungary conquered the city and Vienna became the king's seat that served as the capital of Hungary until 1490.\nIn 1522, under Ferdinand I, Holy Roman Emperor the Blood Judgment of Wiener Neustadt led to the execution of leading members of the opposition within the city, and thus a destruction of the political structures. From then on, the city stood under direct imperial control.\n\nIn 1556, Vienna became the seat of the Emperor, with Bohemia having been added to the Habsburg realm in 1526.\nDuring this time, the city was also recatholicised after having become Protestant rather quickly. In 1551, the Jesuits were brought to town and soon gained a large influence in court. 
The leader of the Counter-Reformation here was Melchior Khlesl, Bishop of Vienna from 1600.\n\n\n=== Turkish sieges ===\n\nIn 1529, Vienna was besieged by the Ottoman Turks for the first time (the First Turkish Siege), although unsuccessfully. The city, protected by medieval walls, only barely withstood the attacks, until epidemics and an early winter forced the Turks to retreat. The siege had shown that new fortifications were needed. Following plans by Sebastian Schrantz, Vienna was expanded to a fortress in 1548. The city was furnished with eleven bastions and surrounded by a moat. A glacis was created around Vienna, a broad strip without any buildings, which allowed defenders to fire freely. These fortifications, which accounted for the major part of building activities well into the 17th century, became decisive in the Second Turkish Siege of 1683, as they allowed the city to maintain itself for two months, until the Turkish army was defeated by the army led by the Polish King John III Sobieski. This was the turning point in the Turkish Wars, as the Ottoman Empire was pushed back more and more during the following decades.]",0.75,"The retrieved context does match the subject matter of the user's query. It provides a detailed history of Vienna, including its rise as a trading site, its status as a city and staple port in 1221, and its role in trade within the Holy Roman Empire. The context explains that the rights of a city and staple port meant that all traders passing through Vienna had to offer their goods in the city, which allowed Vienna to act as middlemen in trade and establish a network of far-reaching trade relations. \n\nHowever, the context does not fully answer the user's query. While it explains what the rights of a city and staple port entailed and how this status affected Vienna's role in trade, it does not delve into the broader implications of this status for Vienna. 
For example, it does not discuss how this status may have affected Vienna's political standing, its relationship with other cities and regions within the Holy Roman Empire, or its economic development beyond trade. \n\n1. Does the retrieved context match the subject matter of the user's query? Yes (2/2)\n2. Can the retrieved context be used exclusively to provide a full answer to the user's query? Partially (1/2)\n\n[RESULT] 3/4"
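
In the two rows above, the `scores` value equals the raw total on the `[RESULT]` line of the feedback divided by 4 (2.5/4 gives 0.625, 3/4 gives 0.75). The sketch below reproduces that relationship with a hypothetical helper. The maximum of 4 and the parsing rule are inferred from these rows, not taken from the evaluator's source, so treat this as an illustration rather than the library's method.

```python
import re
from typing import Optional

# Assumed maximum raw score, inferred from the "x/4" totals in the feedback above.
MAX_RAW_SCORE = 4.0


def normalized_score_from_feedback(feedback: str) -> Optional[float]:
    """Hypothetical helper: read the number after "[RESULT]" and scale it to 0..1."""
    # Capture the first number after the marker; a trailing "/4" is ignored.
    match = re.search(r"\[RESULT\]\s*(\d+(?:\.\d+)?)", feedback)
    if match is None:
        return None  # no parsable result line in this feedback
    return float(match.group(1)) / MAX_RAW_SCORE


print(normalized_score_from_feedback("... No (0.5/2)\n\n[RESULT] 2.5/4"))     # 0.625
print(normalized_score_from_feedback("... Partially (1/2)\n\n[RESULT] 3/4"))  # 0.75
```

A helper like this can be handy for spot-checking that a stored score is consistent with the judge's written feedback.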


You can also apply any filter you like. For example, to look only at the examples that yielded less-than-perfect results, select the rows whose context relevancy score falls below 0.6.

In [103]:
cond = deep_dfs["context_relevancy"]["scores"] < 0.6
deep_dfs["context_relevancy"][cond].head(5)

Unnamed: 0,rag,query,answer,contexts,scores,feedbacks
3,baseline,"Analyze the impact of the First Turkish Siege of 1529 on Vienna's fortifications and military strategy, and explain how these developments influenced the outcome of the Second Turkish Siege in 1683.",,"[== Habsburg rule ==\n\nIn 1278, Rudolf I took control over the Austrian lands after his victory over Ottokar II of Bohemia and began to establish Habsburg rule. In Vienna, it took a relatively long time for the Habsburgs to establish their control, because partisans of Ottokar remained strong for a long time. There were several uprisings against Albert I. The family of the Paltrams vom Stephansfreithof was foremost among the insurgents.\nIn 1280, Jans der Enikel wrote the ""Fürstenbuch"", a first history of the city.\nWith the Luxembourg emperors, Prague became the imperial residence and Vienna stood in its shadow. The early Habsburgs attempted to extend it in order to keep up. Duke Albert II, for example, had the gothic choir of the Stephansdom built. In 1327, Frederick the Handsome published his edict allowing the city to maintain an Eisenbuch (iron book) listing its privileges.\n\nThe combination of the heraldic eagle with the city coat of arms showing a white cross in a red field is found on a seal dated 1327.\nThis heraldic emblem was in use throughout the 14th century in different variants.\nRudolf IV of Austria deserves credit for his prudent economic policy, which raised the level of prosperity. His epithet the Founder is due to two things: first, he founded the University of Vienna in 1365, and second, he began the construction of the gothic nave in the Stephansdom. The latter is connected to the creation of a metropolitan chapter, as a symbolic substitute for a bishop.\nThere was a period of inheritance disputes among the Habsburgs resulting not only in confusion, but also in an economic decline and social unrest, with disputes between the parties of patricians and artisans. 
While the patricians supported Ernest the Iron, the artisans supported Leopold IV. In 1408, the mayor Konrad Vorlauf, an exponent of the patrician party, was executed.\nAfter the election of Duke Albert V as German King Albert II, Vienna became the capital of the Holy Roman Empire. Albert's name is remembered for his expulsion of the Jewish population of Vienna in 1421/22.\nEventually, in 1469, Vienna was given its own bishop, and the Stephansdom became a cathedral. During the upheavals of the era of Emperor Frederick III, Vienna remained on the side of his opponents (first Albert VI, then Matthias Corvinus), as Frederick proved unable to maintain peace in the land vis-à-vis rampaging gangs of mercenaries (often remaining from the Hussite Wars).\nIn 1485, the Hungarian King Matthias Corvinus and the Black Army of Hungary conquered the city and Vienna became the king's seat that served as the capital of Hungary until 1490.\nIn 1522, under Ferdinand I, Holy Roman Emperor the Blood Judgment of Wiener Neustadt led to the execution of leading members of the opposition within the city, and thus a destruction of the political structures. From then on, the city stood under direct imperial control.\n\nIn 1556, Vienna became the seat of the Emperor, with Bohemia having been added to the Habsburg realm in 1526.\nDuring this time, the city was also recatholicised after having become Protestant rather quickly. In 1551, the Jesuits were brought to town and soon gained a large influence in court. The leader of the Counter-Reformation here was Melchior Khlesl, Bishop of Vienna from 1600.\n\n\n=== Turkish sieges ===\n\nIn 1529, Vienna was besieged by the Ottoman Turks for the first time (the First Turkish Siege), although unsuccessfully. The city, protected by medieval walls, only barely withstood the attacks, until epidemics and an early winter forced the Turks to retreat. The siege had shown that new fortifications were needed. 
Following plans by Sebastian Schrantz, Vienna was expanded to a fortress in 1548. The city was furnished with eleven bastions and surrounded by a moat. A glacis was created around Vienna, a broad strip without any buildings, which allowed defenders to fire freely. These fortifications, which accounted for the major part of building activities well into the 17th century, became decisive in the Second Turkish Siege of 1683, as they allowed the city to maintain itself for two months, until the Turkish army was defeated by the army led by the Polish King John III Sobieski. This was the turning point in the Turkish Wars, as the Ottoman Empire was pushed back more and more during the following decades., === Expansion under Emperor Franz Joseph I ===\n\nThe city was expanded in 1850, mostly to include the area within the Linienwall. The Vorstädte thus became the 2nd to 9th districts, with the old city becoming the first. In 1858, the fortifications were demolished, and the broad Ringstraße boulevard was built in their place. Many monumental buildings were built alongside it. The Ringstraße Style (Historicism) characterises the architecture of Vienna to this day. The period peaked in the World Exhibition of 1873, immediately before the stock market crash, which ended the Gründerzeit (""foundation era"").\nIn 1861, the Liberals won the first (relatively) free elections after the end of neoabsolutism.\nAfter the great flood of 1830, Regulation of the Danube was frequently considered. It was finally put into practice during the 1860s. The many branches of the Danube were removed, and a straight course was created away from the city centre. The branch near the central city was made narrower and has been known under the somewhat misleading name Donaukanal (Danube Canal) ever since.\nDuring that period, the population of Vienna increased sharply, mostly because of immigration. 
Censuses were conducted regularly from 1869 onwards, which showed an all-time high of population in 1910, with 2,031,000 inhabitants.\nAround 1900, Vienna became a centre of the Jugendstil (Art Nouveau), most of all with Otto Wagner and the association of artists known as Vienna Secession (after which the characteristic building on Karlsplatz is named).\nIn 1890, the city was expanded for a second time: the (Vororte) suburbs beyond the old Linienwall were incorporated into the town as the districts 11 to 19 (the 10th district had been created in 1874 by the division of the fourth). Leopoldstadt was divided in 1900, with the northern part becoming the 20th district (Brigittenau). In 1904, Floridsdorf became part of Vienna as 21st district.\nDuring those years, Karl Lueger was the leading figure of city politics. Neither his dedication to social policy can be denied, nor other works for the municipality (such as the Wiener Hochquellwasserleitung, bringing fresh water from the mountains to Vienna and the creation of a belt of meadows and forests around the city). However, these positive aspects were coupled with his raving and rhetorically well presented anti-Semitism, which enjoyed popular support.\n\n\n=== World War I ===\n\nWorld War I (1914–1918) did not result in an immediate threat to Vienna, but it led to a lack of supplies because of the economic embargo imposed by the Entente powers, which resulted in a shortage of food and clothes. The skyrocketing costs of the war, mostly financed by borrowing, resulted in tremendous inflation that—as in Germany—wiped out the savings of many middle-class Viennese. One interesting quirk is that the 1916 Rent Control was never officially repealed; even today there are buildings where the 1916 rent—by now essentially meaningless—applies.\nThese food shortages led to the 1918 Austro-Hungarian January Strike which started in Vienna. 
The Bolshevik seizure of power in Russia in November 1917 inspired the youth wing of the Social Democratic Workers' Party of Austria to organise meetings in November 1917 opposing the war.\nThe Flight over Vienna propaganda flight, an air raid inspired by Gabriele d'Annunzio, was carried out on August 9, 1918, with 11 Ansaldo SVA. He flew undisturbed for over 1,200 km in a roundtrip to Vienna to drop about 400,000 propaganda leaflets, written in Italian and German, asking the people of Vienna the end of the alliance between Austria-Hungary and Germany.]",0.375,"1. The retrieved context does match the subject matter of the user's query. It provides information about the First Turkish Siege of 1529 and the Second Turkish Siege in 1683, as well as the impact of these events on Vienna's fortifications and military strategy. However, the context does not provide a detailed analysis of these impacts and how they influenced the outcome of the Second Turkish Siege. It only briefly mentions that the fortifications allowed the city to maintain itself for two months until the Turkish army was defeated. \n\n2. The retrieved context cannot be used exclusively to provide a full answer to the user's query. While it does provide some relevant information, it lacks a detailed analysis of the impact of the First Turkish Siege on Vienna's fortifications and military strategy, and how these developments influenced the outcome of the Second Turkish Siege. The context also includes a lot of information that is not relevant to the user's query, such as the history of Habsburg rule in Vienna and the city's expansion under Emperor Franz Joseph I.\n\n[RESULT] 1.5"
4,baseline,"Discuss the impact of Emperor Joseph II's administration reforms in 1783 on the city of Vienna, particularly in relation to urban management and public health.",,"[=== 18th century ===\n\nThe following period was characterised by extensive building activities. In the course of reconstruction, Vienna was largely turned into a baroque city. The most important architects were Johann Bernhard Fischer von Erlach and Johann Lukas von Hildebrandt. Most construction happened in the suburbs (Vorstädte), as the nobility began to cover the surrounding land with garden palaces, known as Palais. The best known are the Palais Liechtenstein, Palais Modena, Schönbrunn Palace, Palais Schwarzenberg, and the Belvedere (the garden palais of Prince Eugene of Savoy). In 1704, an outer fortification, the Linienwall, was built around the Vorstädte.\nAfter the extensive plague epidemics of 1679 and 1713, the population began to grow steadily. It is estimated that 150,000 people lived in Vienna in 1724, and 200,000 in 1790. At that time, the first factories were built, starting in Leopoldstadt. Leopoldstadt also became a site where many Jews lived, as they had been driven out of their 50-year-old ghetto in 1670. Hygienic problems began to become noticeable: sewers and street cleaning began to develop. Also in this time, the first house numbers (the Konskriptionsnummern) were issued, and the government postal system began to develop.\nUnder Emperor Joseph II, the city administration was modernized in 1783: officials in charge of only the city were introduced, and the Magistrate was created (More information about the Magistrate of the City of Vienna specifically can be found in German at de:Magistrat der Stadt Wien.). At the same time, the graveyards within the city were closed.\n\n\n=== 19th century ===\n\nDuring the Napoleonic Wars, Vienna was taken by the French twice, in 1805 and 1809. The first conquest happened without a battle. 
Three French marshals crossed the strongly defended Taborbrücke (Tábor bridge), the only Danube bridge at that time, and convinced the Austrian commander that the war was already over. In the meantime, the French army easily entered the city and was greeted by the population with interest rather than rejection. Napoleon allowed 10,000 men of the Vienna national guard to remain armed and left the arsenal to them when he left, as complete as he had found it.\nHowever, the second occupation happened only after heavy fire. Shortly after, Napoleon suffered his first large defeat at Aspern, nearby. Less than two months later, his army crossed the Danube again and fought the Battle of Wagram on the same terrain as the previous Battle of Aspern. This second battle resulted in a victory for the French, and Austria soon surrendered, ending the War of the Fifth Coalition. In 1810, Salomon Mayer Rothschild arrived in Vienna from Frankfurt and sets up a bank named ""Mayer von Rothschild und Söhne"". The Emperor of Austria in 1823, made the five Rothschild brothers barons. The Rothschild family became famous as bankers in the major countries of Europe, and the Rothschild banking family of Austria remained prominent until the Creditanstalt bank in Vienna was confiscated by the Nazis in 1938.\nAfter Napoleon's final defeat, the Congress of Vienna took place from September 18, 1814 to June 9, 1815, in which the political map of Europe was redrawn. The congress members indulged in many social events, which induced the witty Charles Joseph, Prince de Ligne to famously say: Le congres danse beaucoup, mais il ne marche pas (""The congress dances, but does not progress""). 
The events cost Austria a great deal of money, which was reflected in mockery about the major participants:\n\nAlexander of Russia: loves for all\nFrederick William of Prussia: thinks for all\nFrederick of Denmark: speaks for all\nMaximilian of Bavaria: drinks for all\nFrederick of Württemberg: eats for all\nEmperor Francis of Austria: pays for all\n\nThe first half of the century was characterised by intensive industrialization, with Vienna being the center of the railway network after 1837.\nThe French February Revolution of 1848 had an effect as far away as Vienna: on March 13, the March Revolution, which forced long-serving chancellor Metternich to resign.\nDuring the 19th century, Vienna, along with Budapest, became one of the main centers of the Aromanian diaspora. The Aromanian population of these cities stands out for one of the first ones to develop a strictly Aromanian identity., === Expansion under Emperor Franz Joseph I ===\n\nThe city was expanded in 1850, mostly to include the area within the Linienwall. The Vorstädte thus became the 2nd to 9th districts, with the old city becoming the first. In 1858, the fortifications were demolished, and the broad Ringstraße boulevard was built in their place. Many monumental buildings were built alongside it. The Ringstraße Style (Historicism) characterises the architecture of Vienna to this day. The period peaked in the World Exhibition of 1873, immediately before the stock market crash, which ended the Gründerzeit (""foundation era"").\nIn 1861, the Liberals won the first (relatively) free elections after the end of neoabsolutism.\nAfter the great flood of 1830, Regulation of the Danube was frequently considered. It was finally put into practice during the 1860s. The many branches of the Danube were removed, and a straight course was created away from the city centre. 
The branch near the central city was made narrower and has been known under the somewhat misleading name Donaukanal (Danube Canal) ever since.\nDuring that period, the population of Vienna increased sharply, mostly because of immigration. Censuses were conducted regularly from 1869 onwards, which showed an all-time high of population in 1910, with 2,031,000 inhabitants.\nAround 1900, Vienna became a centre of the Jugendstil (Art Nouveau), most of all with Otto Wagner and the association of artists known as Vienna Secession (after which the characteristic building on Karlsplatz is named).\nIn 1890, the city was expanded for a second time: the (Vororte) suburbs beyond the old Linienwall were incorporated into the town as the districts 11 to 19 (the 10th district had been created in 1874 by the division of the fourth). Leopoldstadt was divided in 1900, with the northern part becoming the 20th district (Brigittenau). In 1904, Floridsdorf became part of Vienna as 21st district.\nDuring those years, Karl Lueger was the leading figure of city politics. Neither his dedication to social policy can be denied, nor other works for the municipality (such as the Wiener Hochquellwasserleitung, bringing fresh water from the mountains to Vienna and the creation of a belt of meadows and forests around the city). However, these positive aspects were coupled with his raving and rhetorically well presented anti-Semitism, which enjoyed popular support.\n\n\n=== World War I ===\n\nWorld War I (1914–1918) did not result in an immediate threat to Vienna, but it led to a lack of supplies because of the economic embargo imposed by the Entente powers, which resulted in a shortage of food and clothes. The skyrocketing costs of the war, mostly financed by borrowing, resulted in tremendous inflation that—as in Germany—wiped out the savings of many middle-class Viennese. 
One interesting quirk is that the 1916 Rent Control was never officially repealed; even today there are buildings where the 1916 rent—by now essentially meaningless—applies.\nThese food shortages led to the 1918 Austro-Hungarian January Strike which started in Vienna. The Bolshevik seizure of power in Russia in November 1917 inspired the youth wing of the Social Democratic Workers' Party of Austria to organise meetings in November 1917 opposing the war.\nThe Flight over Vienna propaganda flight, an air raid inspired by Gabriele d'Annunzio, was carried out on August 9, 1918, with 11 Ansaldo SVA. He flew undisturbed for over 1,200 km in a roundtrip to Vienna to drop about 400,000 propaganda leaflets, written in Italian and German, asking the people of Vienna the end of the alliance between Austria-Hungary and Germany.]",0.25,"1. The retrieved context does match the subject matter of the user's query to some extent. It mentions the modernization of the city administration under Emperor Joseph II in 1783, including the introduction of officials in charge of only the city and the creation of the Magistrate. It also mentions the closure of graveyards within the city during the same period. However, it does not provide specific details about the impact of these reforms on urban management and public health in Vienna. Therefore, it partially matches the subject matter of the user's query. (1/2)\n2. The retrieved context cannot be used exclusively to provide a full answer to the user's query. While it does provide some information about the reforms implemented by Emperor Joseph II in 1783, it does not discuss their impact on the city of Vienna, particularly in relation to urban management and public health. Therefore, additional information would be needed to fully answer the user's query. (0/2)\n\n[RESULT] 1.0"
5,baseline,"Analyze the significance of the Congress of Vienna (1814-1815) in reshaping the political landscape of Europe, and explain how it affected Austria's financial situation.",,"[=== 18th century ===\n\nThe following period was characterised by extensive building activities. In the course of reconstruction, Vienna was largely turned into a baroque city. The most important architects were Johann Bernhard Fischer von Erlach and Johann Lukas von Hildebrandt. Most construction happened in the suburbs (Vorstädte), as the nobility began to cover the surrounding land with garden palaces, known as Palais. The best known are the Palais Liechtenstein, Palais Modena, Schönbrunn Palace, Palais Schwarzenberg, and the Belvedere (the garden palais of Prince Eugene of Savoy). In 1704, an outer fortification, the Linienwall, was built around the Vorstädte.\nAfter the extensive plague epidemics of 1679 and 1713, the population began to grow steadily. It is estimated that 150,000 people lived in Vienna in 1724, and 200,000 in 1790. At that time, the first factories were built, starting in Leopoldstadt. Leopoldstadt also became a site where many Jews lived, as they had been driven out of their 50-year-old ghetto in 1670. Hygienic problems began to become noticeable: sewers and street cleaning began to develop. Also in this time, the first house numbers (the Konskriptionsnummern) were issued, and the government postal system began to develop.\nUnder Emperor Joseph II, the city administration was modernized in 1783: officials in charge of only the city were introduced, and the Magistrate was created (More information about the Magistrate of the City of Vienna specifically can be found in German at de:Magistrat der Stadt Wien.). At the same time, the graveyards within the city were closed.\n\n\n=== 19th century ===\n\nDuring the Napoleonic Wars, Vienna was taken by the French twice, in 1805 and 1809. The first conquest happened without a battle. 
Three French marshals crossed the strongly defended Taborbrücke (Tábor bridge), the only Danube bridge at that time, and convinced the Austrian commander that the war was already over. In the meantime, the French army easily entered the city and was greeted by the population with interest rather than rejection. Napoleon allowed 10,000 men of the Vienna national guard to remain armed and left the arsenal to them when he left, as complete as he had found it.\nHowever, the second occupation happened only after heavy fire. Shortly after, Napoleon suffered his first large defeat at Aspern, nearby. Less than two months later, his army crossed the Danube again and fought the Battle of Wagram on the same terrain as the previous Battle of Aspern. This second battle resulted in a victory for the French, and Austria soon surrendered, ending the War of the Fifth Coalition. In 1810, Salomon Mayer Rothschild arrived in Vienna from Frankfurt and sets up a bank named ""Mayer von Rothschild und Söhne"". The Emperor of Austria in 1823, made the five Rothschild brothers barons. The Rothschild family became famous as bankers in the major countries of Europe, and the Rothschild banking family of Austria remained prominent until the Creditanstalt bank in Vienna was confiscated by the Nazis in 1938.\nAfter Napoleon's final defeat, the Congress of Vienna took place from September 18, 1814 to June 9, 1815, in which the political map of Europe was redrawn. The congress members indulged in many social events, which induced the witty Charles Joseph, Prince de Ligne to famously say: Le congres danse beaucoup, mais il ne marche pas (""The congress dances, but does not progress""). 
The events cost Austria a great deal of money, which was reflected in mockery about the major participants:\n\nAlexander of Russia: loves for all\nFrederick William of Prussia: thinks for all\nFrederick of Denmark: speaks for all\nMaximilian of Bavaria: drinks for all\nFrederick of Württemberg: eats for all\nEmperor Francis of Austria: pays for all\n\nThe first half of the century was characterised by intensive industrialization, with Vienna being the center of the railway network after 1837.\nThe French February Revolution of 1848 had an effect as far away as Vienna: on March 13, the March Revolution, which forced long-serving chancellor Metternich to resign.\nDuring the 19th century, Vienna, along with Budapest, became one of the main centers of the Aromanian diaspora. The Aromanian population of these cities stands out for one of the first ones to develop a strictly Aromanian identity., == See also ==\nTimeline of Vienna\nHistory of Austria\nDistricts of Vienna\n\n\n== References ==\n\n\n== Further reading ==\n\nBaranello, Micaela. The Operetta Empire: Music Theater in Early Twentieth-Century Vienna (U of California Press, 2021).\nBeller, Steven. Vienna and the Jews 1867-1938: A Cultural History (Cambridge, 1989).\nBowman, William D. Priest and Parish in Vienna, 1780 to 1880 (2000).\nBoyer, John W. Culture and Political Crisis in Vienna: Christian Socialism in Power, 1897-1918 (U of Chicago Press, 1995).\nBoyer, John. Political Radicalism in Late Imperial Vienna: Origins of the Christian Social Movement, 1848-1897 (U of Chicago Press, 1981).\nBuklijas, Tatjana. ""Surgery and national identity in late nineteenth-century Vienna."" Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences 38.4 (2007): 756-774. online\nCoen, Deborah R. Vienna in the age of uncertainty: science, liberalism, and private life (U of Chicago Press, 2008).\nCsendes, Peter. 
Historical Dictionary of Vienna (Scarecrow Press, 1999).\nEmerson, Charles. 1913: In Search of the World Before the Great War (2013) compares Vienna to 20 major world cities on the eve of World War I; pp 87–109.\nGeehr, Richard S. Karl Lueger: Mayor of Fin de Siècle Vienna (Wayne State University Press, 1990)\nHamann, Brigette. Hitler's Vienna: A Dictator's Apprenticeship (Oxford P, 1999).\nHanák, Péter. The garden and the workshop: essays on the cultural history of Vienna and Budapest (Princeton University Press, 2014)\nHealy, Maureen. Vienna and the Fall of the Habsburg Empire: Total War and Everyday Life in World War I (2004).\nKarnes, Kevin C. ""Wagner, Klimt, and the Metaphysics of Creativity in fin-de-siècle Vienna."" Journal of the American Musicological Society 62.3 (2009): 647-697. online\nKarnes, Kevin. Music, criticism, and the challenge of history: Shaping modern musical thought in late nineteenth century Vienna (Oxford UP, 2008).\nKarnes, Kevin. A kingdom not of this world: Wagner, the arts, and utopian visions in fin-de-siècle Vienna (Oxford UP, 2013).\nMay, A.J. Vienna in the Age of Franz Joseph (U of Oklahoma Press, 1968).\nMillar, Simon and Peter Dennis. Vienna 1683: Christian Europe Repels the Ottomans (Osprey, 2008)\nMorton, Frederik. A Nervous Splendour: Vienna 1888-1889 (Little, Brown, 1979).\nOffenberger, Ilana Fritz. The Jews of Nazi Vienna, 1938-1945: Rescue and Destruction (Springer, 2017).\nParsons, Nicholas. Vienna: A Cultural History (2008).\nRampley, Matthew. The Vienna School of Art History: Empire and the Politics of Scholarship, 1847-1918 (Penn State Press, 2013).\nRegal, Wolfgang and Michael Nanut. Vienna A Doctor’s Guide: 15 walking tours through Vienna’s medical history (2007)\nRozenblit, Marsha. The Jews of Vienna, 1867-1914: Assimilation and Identity (State University of New York Press, 1984).\nSchorske, Carl E. Fin-de-siècle Vienna: politics and culture (1979)\nSilverman, Lisa. 
Becoming Austrians: Jews and Culture between the World Wars (Oxford UP, 2012), focus on Vienna.\nUhl, Heidemarie. ""Museums as Engines of Identity: 'Vienna around 1900' and Exhibitionary Cultures in Vienna—A Comment."" Austrian History Yearbook 46 (2015): 97-105.\nWagner-Trenkwitz, Christoph. A Sound Tradition: A Short History of the Vienna Philharmonic Orchestra (Amalthea Signum Verlag, 2017).\nWasserman, Janek. ""The Austro-Marxist struggle for 'intellectual workers': the lost debate on the question of intellectuals in interwar Vienna.""]",0.25,"The retrieved context does match the subject matter of the user's query to some extent. It provides information about the Congress of Vienna (1814-1815) and its role in reshaping the political landscape of Europe. However, it does not provide a detailed analysis of the significance of the Congress of Vienna in reshaping the political landscape of Europe, which is what the user asked for. \n\nThe context also does not provide any information on how the Congress of Vienna affected Austria's financial situation. The only mention of Austria's financial situation is a brief mention of the cost of the congress to Austria, but this does not provide a comprehensive answer to the user's query.\n\n1. Does the retrieved context match the subject matter of the user's query? Partially, the context does mention the Congress of Vienna and its role in reshaping Europe's political landscape, but it does not provide a detailed analysis of its significance. (1/2)\n2. Can the retrieved context be used exclusively to provide a full answer to the user's query? No, the context does not provide any information on how the Congress of Vienna affected Austria's financial situation. (0/2)\n\n[RESULT] 1/4"
11,baseline,Explain the significance of the Austrian State Treaty of 1955 and its implications for Austria's political status and neutrality in the context of post-war Europe.,,"[== The First Republic ==\n\nThe end of the war was also the end of Austria-Hungary. On November 12, 1918, the Republic of Deutsch-Österreich, or German-Austria, was proclaimed in front of the parliament. The population was concentrated in the capital, which was often called a hydrocephalus because of this; articles in the international press had doubts about the viability of Vienna as a major European metropolis after the dissolution of Austria-Hungary.\nIn 1921, Vienna was separated from surrounding Lower Austria and became a state of its own, with the mayor now equivalent in rank to a state governor. However, it remained the nominal capital of Lower Austria until 1986. The left-wing Social Democrats, who had dominated since the end of the war, were now in charge of the city administration. ""Red Vienna"" was considered an international model. Many notable Gemeindebauten (low-cost residential estates) were built during that period.\nHowever, the increasing economic difficulties resulted in a political radicalization and polarisation of the political parties. On the social democratic side, the left-wing Republikanische Schutzbund (Republican Protective Alliance) was formed in 1923/24, which was a well-organised and well-equipped paramilitary group. 
It was opposed by the right-wing Heimwehr (""Home Guard""), which had been formed after the end of the war from local guards and similar combat units.\nIn May 1923, in the presence of President Michael Hainisch, the First World Congress of Jewish Women was inaugurated at the Hofburg.\n\n\n== Austrofascism ==\n\nThe fire of the Justizpalast (Palace of Justice) in 1927 after violent demonstrations, the collapse of the Creditanstalt, the largest bank of the country, and finally the dissolution of parliament in 1933, marked the way to the Civil War in February 1934. After Engelbert Dollfuß, who had been Chancellor of Austria and foreign minister since 1932, had forbidden the Nazi Party, the Communist Party and the Schutzbund in 1933, he extended the ban to the Social Democratic Party in 1934 after the February Uprising. The only legal political organization was the new movement of his own creation, the Vaterländische Front. Dollfuß created an authoritarian regime called Ständestaat and ruled without parliamentary approval (also see Austrofascism).\n\n\n== Annexation by Nazi Germany and Second World War ==\n\nIn March 1938 Nazi Germany occupied and annexed Austria in a process known as the Anschluss. Adolf Hitler was rapturously received in Vienna by large crowds of admirers and famously gave a speech at Heldenplatz in which he welcomed his homeland into the Reich. Hitler's anti-Jewish policies fell on fertile soil in Vienna, where latent anti-Semitism had increased during the early 20th century. Immediately after the Anschluss the Jews of Vienna were subject to violence from the State as well as from Antisemites acting out of their own sadism. During the Reichskristallnacht on November 9, 1938, the synagogues, the Jewish centres of not only religious, but also social life, were destroyed. In August, the KZ Oberlanzendorf Wien (Central Office for Jewish Emigration) was created. Its head was Adolf Eichmann. 
On the whole, and despite some instances of anti-Semitism, the city of Vienna was less supportive of the Nazi regime than was the rest of Austria. This however did not reach any extent of an organised resistance. Hitler himself hated Vienna and was determined to build up Linz, his childhood hometown, and relegate Vienna to backwater status.\nIn the course of the expansion of the city in 1938, 91 adjoining municipalities were incorporated into the city, from which the 22nd (Groß-Enzersdorf), the 23rd (Schwechat), the 24th (Mödling), the 25th (Liesing) and the 26th (Klosterneuburg) districts were created. With an area of 1,224 km2, this made Vienna the city with the largest territory in the Third Reich., === World War II ===\nDuring the war, the city housed 12 sub-camps of the infamous Mauthausen-Gusen concentration camp, located in various parts of the city.\nAlthough originally out of range of Allied bombers operating from England, by 1943 the city began being attacked by bombers based in Italy. As in other major cities of Nazi Germany, large flak towers were constructed in Vienna as defensive structures against these bombers. These structures remain in place today, as they are so thick and their foundations so deep that to destroy them would damage surrounding buildings.\nThe U.S. bombings of 1944 and 1945 and the vicious fighting during the subsequent conquest of Vienna by Soviet troops in April 1945 caused much destruction within the city. However, some historic buildings survived the bombardment; many more were reconstructed after the war.\n\n\n== The Second Republic ==\n\n\n=== Allied occupation ===\n\nOnly a few days after the war, a provisional city government and administration was created. Also, the political parties were recreated. On April 29, 1945, the parliament building passed from the occupation force to the new Austrian government, and Karl Renner announced the reinstitution of the democratic Republic of Austria. 
Vienna was divided into five occupation zones between the Soviet Union, the United States, the UK, France, and with the first district (city centre) being patrolled by all four.\nThe first municipal elections were held in November 1945. Of the 100 seats in the municipal council, the left-wing Social Democratic Party captured 58, the right-wing Austrian People's Party 36, and the Communists 6. In 1946, it was decided that the expansion of city territory of 1938 should be reversed, but this law was delayed by a veto of the occupying powers and was not put into practice until 1954. Two districts remained with Vienna, namely the 22nd one (Donaustadt) north of the Danube and the 23rd one (Liesing) in the south (some other districts gained some Lower Austrian territory).\n\n\n=== Modern history since independence (1955) ===\n\nOn May 15, 1955, the country regained its political independence and sovereignty with the ""Austrian State Treaty"". The Austrian Parliament immediately amended the treaty to establish Austria's future neutrality and non-alignment (similar to that of Switzerland). This peace treaty was called a state treaty because Austria had temporarily ceased to exist in 1938.\nAfter the war, as everywhere in Western Europe, there was an enormous economic boom, among other things because of the economic aid resulting from the Marshall Plan.\nPublic transport in Vienna was improved by the introduction of the new U-Bahn network, the first part of which was opened in 1978. In 1979, the second Strategic Arms Limitation Talks treaty was signed in Vienna. During the 1970s, Vienna became the third official seat of the United Nations, and the UNO-City was built. At the end of the 20th century, a skyline consisting of several skyscrapers was created with, among others, the Andromeda Tower and Millennium Tower on the left and right sides of the Danube. 
Furthermore, a complex of skyscrapers was planned at the site of the Wien Mitte railway station, which might have endangered the position of Vienna's centre as a UNESCO World Heritage Site. The project has since been redesigned.\nVienna was the capital of the surrounding Bundesland of Lower Austria (German: Niederösterreich) until 1986, when it was replaced by Sankt Pölten. As Vienna was not a geographical part of Lower Austria it was a capital outside of the territory it served.\nIn the municipal elections of 2001, the Social Democrats regained an absolute majority. With the Liberal Forum not gaining enough votes, only four parties have been represented in the municipal council since then. In the 2005 elections, the Social Democrats further increased their majority.\n\nEspionage in Austria is legal if the country itself is not the target. The city has about 17,000 diplomats, many assigned to international organizations. Because of their presence and Austria's neutrality, Vienna became an important center for espionage, as depicted in The Third Man; during the Cold War the city allegedly had more spies than Austrian soldiers. Despite the 2022 Russian invasion of Ukraine, as of July 2023 there are more than 180 accredited Russian diplomats in Vienna, and reportedly more than one third are spies using diplomatic cover; Chinese, Iranian, Israeli, Saudi, American, British, and other European spies and counterspies are also in the country. Because of the tolerance of espionage the country is excluded from the Club de Berne.\n\n\n== See also ==\nTimeline of Vienna\nHistory of Austria\nDistricts of Vienna\n\n\n== References ==\n\n\n== Further reading ==\n\nBaranello, Micaela.]",0.375,"1. The retrieved context does match the subject matter of the user's query to some extent. It provides a detailed history of Austria, including the period leading up to and following the Austrian State Treaty of 1955. 
It mentions the treaty and its implications for Austria's political status and neutrality. However, the context does not focus specifically on the significance of the treaty and its implications for Austria's political status and neutrality in the context of post-war Europe. It provides a broader historical context, which includes but is not limited to the specific subject matter of the user's query. Therefore, it partially matches the subject matter of the user's query. (1/2 points)\n\n2. The retrieved context cannot be used exclusively to provide a full answer to the user's query. While it does mention the Austrian State Treaty of 1955 and its implications for Austria's political status and neutrality, it does not provide a detailed explanation of the significance of the treaty and its implications in the context of post-war Europe. The user's query requires a more focused and detailed explanation of these specific aspects, which the retrieved context does not provide. Therefore, it cannot be used exclusively to provide a full answer to the user's query. (0.5/2 points)\n\n[RESULT] 1.5/4"
12,baseline,Discuss the significance of Karl Lueger's mayoralty in fin-de-siècle Vienna as presented in Richard S. Geehr's work. How did his leadership influence the political and cultural landscape of the city during that period?,,"[Schorske, Carl E. Fin-de-siècle Vienna: politics and culture (1979)\nSilverman, Lisa. Becoming Austrians: Jews and Culture between the World Wars (Oxford UP, 2012), focus on Vienna.\nUhl, Heidemarie. ""Museums as Engines of Identity: 'Vienna around 1900' and Exhibitionary Cultures in Vienna—A Comment."" Austrian History Yearbook 46 (2015): 97-105.\nWagner-Trenkwitz, Christoph. A Sound Tradition: A Short History of the Vienna Philharmonic Orchestra (Amalthea Signum Verlag, 2017).\nWasserman, Janek. ""The Austro-Marxist struggle for 'intellectual workers': the lost debate on the question of intellectuals in interwar Vienna."" Modern Intellectual History 9.2 (2012): 361-388.\nWistrich, Robert S. ""Karl Lueger and the Ambiguities of Viennese Antisemitism."" Jewish Social Studies 45.3/4 (1983): 251-262. online\nYales, W. E. Theatre in Vienna: A Critical History, 1776-1995 (Cambridge University Press, 1996)\n\n\n=== Historiography and Memory ===\nArens, Katherine. Belle Necropolis: Ghosts of Imperial Vienna (2014), art and memory\nBeller, Steven. Rethinking Vienna 1900 (2001)\nJovanović, Miloš. ""Whitewashed empire: Historical narrative and place marketing in Vienna."" History and Anthropology 30.4 (2019): 460-476.\nPirker, Peter, Johannes Kramer, and Mathias Lichtenwagner. ""Transnational memory spaces in the making: World War II and holocaust remembrance in Vienna."" International Journal of Politics, Culture, and Society 32.4 (2019): 439-458. 
online\n\n\n== External links ==\n Media related to History of Vienna at Wikimedia Commons\n\nGeschichtewiki.wien.gv.at - Vienna History Wiki operated by the city of Vienna\nThe History of Jewish Vienna, === Expansion under Emperor Franz Joseph I ===\n\nThe city was expanded in 1850, mostly to include the area within the Linienwall. The Vorstädte thus became the 2nd to 9th districts, with the old city becoming the first. In 1858, the fortifications were demolished, and the broad Ringstraße boulevard was built in their place. Many monumental buildings were built alongside it. The Ringstraße Style (Historicism) characterises the architecture of Vienna to this day. The period peaked in the World Exhibition of 1873, immediately before the stock market crash, which ended the Gründerzeit (""foundation era"").\nIn 1861, the Liberals won the first (relatively) free elections after the end of neoabsolutism.\nAfter the great flood of 1830, Regulation of the Danube was frequently considered. It was finally put into practice during the 1860s. The many branches of the Danube were removed, and a straight course was created away from the city centre. The branch near the central city was made narrower and has been known under the somewhat misleading name Donaukanal (Danube Canal) ever since.\nDuring that period, the population of Vienna increased sharply, mostly because of immigration. Censuses were conducted regularly from 1869 onwards, which showed an all-time high of population in 1910, with 2,031,000 inhabitants.\nAround 1900, Vienna became a centre of the Jugendstil (Art Nouveau), most of all with Otto Wagner and the association of artists known as Vienna Secession (after which the characteristic building on Karlsplatz is named).\nIn 1890, the city was expanded for a second time: the (Vororte) suburbs beyond the old Linienwall were incorporated into the town as the districts 11 to 19 (the 10th district had been created in 1874 by the division of the fourth). 
Leopoldstadt was divided in 1900, with the northern part becoming the 20th district (Brigittenau). In 1904, Floridsdorf became part of Vienna as 21st district.\nDuring those years, Karl Lueger was the leading figure of city politics. Neither his dedication to social policy can be denied, nor other works for the municipality (such as the Wiener Hochquellwasserleitung, bringing fresh water from the mountains to Vienna and the creation of a belt of meadows and forests around the city). However, these positive aspects were coupled with his raving and rhetorically well presented anti-Semitism, which enjoyed popular support.\n\n\n=== World War I ===\n\nWorld War I (1914–1918) did not result in an immediate threat to Vienna, but it led to a lack of supplies because of the economic embargo imposed by the Entente powers, which resulted in a shortage of food and clothes. The skyrocketing costs of the war, mostly financed by borrowing, resulted in tremendous inflation that—as in Germany—wiped out the savings of many middle-class Viennese. One interesting quirk is that the 1916 Rent Control was never officially repealed; even today there are buildings where the 1916 rent—by now essentially meaningless—applies.\nThese food shortages led to the 1918 Austro-Hungarian January Strike which started in Vienna. The Bolshevik seizure of power in Russia in November 1917 inspired the youth wing of the Social Democratic Workers' Party of Austria to organise meetings in November 1917 opposing the war.\nThe Flight over Vienna propaganda flight, an air raid inspired by Gabriele d'Annunzio, was carried out on August 9, 1918, with 11 Ansaldo SVA. He flew undisturbed for over 1,200 km in a roundtrip to Vienna to drop about 400,000 propaganda leaflets, written in Italian and German, asking the people of Vienna the end of the alliance between Austria-Hungary and Germany.]",0.5,"1. The retrieved context does match the subject matter of the user's query. 
It provides information about Karl Lueger's mayoralty in fin-de-siècle Vienna, discussing his influence on the city's politics and culture. It also mentions his anti-Semitism, which was a significant aspect of his leadership. However, the context does not specifically reference Richard S. Geehr's work, which the user's query asked for. Therefore, it only partially matches the subject matter. (1.5 points)\n2. The retrieved context cannot be used exclusively to provide a full answer to the user's query. While it does provide some information about Karl Lueger's leadership and its influence on Vienna, it does not provide a comprehensive analysis of his significance during this period. Furthermore, it does not discuss Richard S. Geehr's interpretation of Lueger's mayoralty, which the user specifically asked for. Therefore, additional information would be needed to fully answer the user's query. (0.5 points)\n\n[RESULT] 2.0"


## 2.2 Retrieval Evaluation  

Given a retriever and a set of questions, evaluate the retrieved results against the expected sources using ranking metrics such as hit rate and mean reciprocal rank (MRR).
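To make the idea of ranking metrics concrete, here is a minimal sketch in plain Python, not the LlamaIndex implementation. It assumes each question comes with a ranked list of retrieved node ids and a set of expected node ids. Hit rate is 1 when any expected id appears among the retrieved ids. Reciprocal rank is 1 divided by the position of the first expected id, or 0 if none was retrieved. The node ids, the question labels, and the function names are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple


def hit_rate(retrieved_ids: List[str], expected_ids: List[str]) -> float:
    """Return 1.0 if any expected id was retrieved, else 0.0."""
    expected = set(expected_ids)
    return 1.0 if any(doc_id in expected for doc_id in retrieved_ids) else 0.0


def reciprocal_rank(retrieved_ids: List[str], expected_ids: List[str]) -> float:
    """Return 1/rank of the first expected id in the ranked list, else 0.0."""
    expected = set(expected_ids)
    for rank, doc_id in enumerate(retrieved_ids, start=1):
        if doc_id in expected:
            return 1.0 / rank
    return 0.0


def evaluate_retrieval(
    results: Dict[str, Tuple[List[str], List[str]]]
) -> Dict[str, float]:
    """Average both metrics over all questions."""
    n = len(results)
    if n == 0:
        return {"hit_rate": 0.0, "mrr": 0.0}
    hits = sum(hit_rate(r, e) for r, e in results.values())
    rr = sum(reciprocal_rank(r, e) for r, e in results.values())
    return {"hit_rate": hits / n, "mrr": rr / n}


# Hypothetical data: question -> (ranked retrieved ids, expected ids)
results = {
    "q1": (["n3", "n1", "n7"], ["n1"]),        # expected id at rank 2
    "q2": (["n2", "n5"], ["n2"]),              # expected id at rank 1
    "q3": (["n4", "n6"], ["n9"]),              # expected id not retrieved
    "q4": (["n8", "n0", "n1", "n2"], ["n2"]),  # expected id at rank 4
}

summary = evaluate_retrieval(results)
print(summary)
```

Hit rate only records whether a relevant source was found at all, while MRR also rewards ranking it near the top, so the two are usually read together.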

Sources: [1](https://docs.llamaindex.ai/en/stable/module_guides/evaluating/), [2](https://docs.llamaindex.ai/en/stable/module_guides/evaluating/usage_pattern_retrieval/), [3](https://docs.llamaindex.ai/en/stable/module_guides/evaluating/usage_pattern/)