# CitationQueryEngine

This notebook walks through how to use the CitationQueryEngine.

The CitationQueryEngine can be used with any existing index.

## Setup

In [1]:
import os
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from llama_index.query_engine import CitationQueryEngine
from llama_index.retrievers import VectorIndexRetriever
from llama_index import (
    VectorStoreIndex,
    ResponseSynthesizer,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
    LLMPredictor,
    ServiceContext,
)



In [11]:
service_context = ServiceContext.from_defaults(
    llm_predictor=LLMPredictor(llm=ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0))
)

In [12]:
if not os.path.exists("./citation"):
    documents = SimpleDirectoryReader("../data/paul_graham").load_data()
    index = VectorStoreIndex.from_documents(documents, service_context=service_context)
    index.storage_context.persist(persist_dir="./citation")
else:
    index = load_index_from_storage(
        StorageContext.from_defaults(persist_dir="./citation"), 
        service_context=service_context
    )


## Create the CitationQueryEngine with Default Arguments

In [24]:
query_engine = CitationQueryEngine.from_args(
    index, 
    similarity_top_k=3,
    # citation_chunk_size controls how granular the citation sources are (default: 512)
    citation_chunk_size=512,
)

In [25]:
response = query_engine.query("What did the author do growing up?")

In [26]:
print(response)

Before college, the author worked on writing short stories and programming on an IBM 1401 using an early version of Fortran [11]. Later, the author got a TRS-80 and wrote simple games, a program to predict how high model rockets would fly, and a word processor that their father used to write at least one book [14].


### Inspecting the Actual Source
Source numbers start at 1, but Python lists are zero-indexed, so source N is found at `response.source_nodes[N - 1]`.

Let's confirm the source makes sense.
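
The off-by-one mapping between citation numbers and list positions can be sketched as a small helper. This is only an illustration: `cited_source_indices` is a hypothetical name that is not part of llama_index, and the sample answer string is abbreviated from the response printed above.

```python
import re


def cited_source_indices(answer: str) -> list[int]:
    """Return the zero-based list index for every [N] citation in the answer.

    Hypothetical helper for illustration; it only parses the answer text.
    """
    # Collect the distinct citation numbers that appear as [N] markers.
    numbers = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    # Citation numbers are 1-based, list positions are 0-based.
    return sorted(n - 1 for n in numbers)


answer = (
    "Before college, the author worked on writing short stories and "
    "programming [11]. Later, the author wrote a word processor [14]."
)
print(cited_source_indices(answer))  # [10, 13]
```

With a real response, `str(response)` would supply the answer text, and each returned index could be used to look up an entry in `response.source_nodes`.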

In [29]:
print(response.source_nodes[10].node.get_text())

Source 11:
		

What I Worked On

February 2021

Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep.

The first programs I tried writing were on the IBM 1401 that our school district used for what was then called "data processing." This was in 9th grade, so I was 13 or 14. The school district's 1401 happened to be in the basement of our junior high school, and my friend Rich Draves and I got permission to use it. It was like a mini Bond villain's lair down there, with all these alien-looking machines — CPU, disk drives, printer, card reader — sitting up on a raised floor under bright fluorescent lights.

The language we used was an early version of Fortran. You had to type programs on punch cards, then st

In [30]:
print(response.source_nodes[13].node.get_text())

Source 14:
used to write at least one book. There was only room in memory for about 2 pages of text, so he'd write 2 pages at a time and then print them out, but it was a lot better than a typewriter.

Though I liked programming, I didn't plan to study it in college. In college I was going to study philosophy, which sounded much more powerful. It seemed, to my naive high school self, to be the study of the ultimate truths, compared to which the things studied in other fields would be mere domain knowledge. What I discovered when I got to college was that the other fields took up so much of the space of ideas that there wasn't much left for these supposed ultimate truths. All that seemed left for philosophy were edge cases that people in other fields felt could safely be ignored.

I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI.

AI was in the air in the mid 1980s

## Adjusting Settings

In [31]:
query_engine = CitationQueryEngine.from_args(
    index, 
    # increase the citation chunk size!
    citation_chunk_size=1024,
    similarity_top_k=3
)
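
As a rough intuition for what `citation_chunk_size` changes, the sketch below splits a text into fixed-size pieces and numbers them the way the printed sources are labelled ("Source N:"). It is a simplified stand-in, not the engine's implementation: it assumes character-based splitting, and `split_into_sources` is a hypothetical name. The real engine may split by tokens and at different boundaries.

```python
def split_into_sources(text: str, chunk_size: int) -> list[str]:
    """Split text into numbered chunks of at most chunk_size characters.

    Hypothetical, character-based illustration of citation chunking.
    """
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    # Number the chunks starting at 1, matching the "Source N:" labels.
    return [f"Source {n}:\n{chunk}" for n, chunk in enumerate(chunks, start=1)]


text = "x" * 2048  # placeholder standing in for a retrieved passage
print(len(split_into_sources(text, 512)))   # 4
print(len(split_into_sources(text, 1024)))  # 2
```

Under this simplification, doubling the chunk size halves the number of citable sources, so each citation points at a larger, less specific passage.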

In [32]:
response = query_engine.query("What did the author do growing up?")

In [33]:
print(response)

Before college, the author worked on writing short stories and programming on an IBM 1401 using an early version of Fortran [11]. They later got a TRS-80 microcomputer and wrote simple games, a program to predict how high model rockets would fly, and a word processor that their father used to write at least one book [14].


### Inspecting the Actual Source
Source numbers start at 1, but Python lists are zero-indexed, so source N is found at `response.source_nodes[N - 1]`.

Let's confirm the source makes sense.

In [34]:
print(response.source_nodes[10].node.get_text())

Source 11:
		

What I Worked On

February 2021

Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep.

The first programs I tried writing were on the IBM 1401 that our school district used for what was then called "data processing." This was in 9th grade, so I was 13 or 14. The school district's 1401 happened to be in the basement of our junior high school, and my friend Rich Draves and I got permission to use it. It was like a mini Bond villain's lair down there, with all these alien-looking machines — CPU, disk drives, printer, card reader — sitting up on a raised floor under bright fluorescent lights.

The language we used was an early version of Fortran. You had to type programs on punch cards, then st

In [35]:
print(response.source_nodes[13].node.get_text())

Source 14:
used to write at least one book. There was only room in memory for about 2 pages of text, so he'd write 2 pages at a time and then print them out, but it was a lot better than a typewriter.

Though I liked programming, I didn't plan to study it in college. In college I was going to study philosophy, which sounded much more powerful. It seemed, to my naive high school self, to be the study of the ultimate truths, compared to which the things studied in other fields would be mere domain knowledge. What I discovered when I got to college was that the other fields took up so much of the space of ideas that there wasn't much left for these supposed ultimate truths. All that seemed left for philosophy were edge cases that people in other fields felt could safely be ignored.

I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI.

AI was in the air in the mid 1980s