# An Introduction to LlamaIndex Query Pipelines

## Overview
LlamaIndex provides a declarative query API that allows you to chain together different modules in order to orchestrate simple-to-advanced workflows over your data.

This is centered around our `QueryPipeline` abstraction. Load in a variety of modules (from LLMs to prompts to retrievers to other pipelines), connect them into a sequential chain or DAG, and run it end-to-end.

**NOTE**: You can orchestrate all these workflows without the declarative pipeline abstraction (by using the modules imperatively and writing your own functions). So what are the advantages of `QueryPipeline`? 

- Express common workflows with fewer lines of code/boilerplate
- Greater readability
- Greater parity / better integration points with common low-code / no-code solutions (e.g. LangFlow)
- [In the future] A declarative interface allows easy serializability of pipeline components, providing portability of pipelines/easier deployment to different systems.

## Cookbook

In this cookbook we give you an introduction to our `QueryPipeline` interface and show you some basic workflows you can tackle.

- Chain together prompt and LLM
- Chain together query rewriting (prompt + LLM) with retrieval
- Chain together a full RAG query pipeline (query rewriting, retrieval, reranking, response synthesis)
- Set up a custom query component

## Setup

Here we set up some data and indexes (from Paul Graham's essay) that we'll use in the rest of the cookbook.

In [1]:
# setup Arize Phoenix for logging/observability
import phoenix as px

px.launch_app()
import llama_index

llama_index.set_global_handler("arize_phoenix")

🌍 To view the Phoenix app in your browser, visit http://127.0.0.1:6006/
📺 To view the Phoenix app in a notebook, run `px.active_session().view()`
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix


In [2]:
from llama_index.query_pipeline.query import QueryPipeline
from llama_index.llms import OpenAI
from llama_index.prompts import PromptTemplate
from llama_index import (
    VectorStoreIndex,
    ServiceContext,
    SimpleDirectoryReader,
    load_index_from_storage,
)

In [3]:
reader = SimpleDirectoryReader("../data/paul_graham")

In [4]:
docs = reader.load_data()

In [5]:
import os
from llama_index.storage import StorageContext

if not os.path.exists("storage"):
    index = VectorStoreIndex.from_documents(docs)
    # save index to disk
    index.set_index_id("vector_index")
    index.storage_context.persist("./storage")
else:
    # rebuild storage context
    storage_context = StorageContext.from_defaults(persist_dir="storage")
    # load index
    index = load_index_from_storage(storage_context, index_id="vector_index")

## Chain Together Prompt and LLM

This section shows a minimal workflow: chaining a prompt together with an LLM.

We simply define `chain` on initialization. This is a special case of a query pipeline where the components are purely sequential, and we automatically convert outputs into the right format for the next inputs.
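To make the idea of a purely sequential chain concrete, here is a minimal standard-library sketch. It is a toy model and not how LlamaIndex implements `QueryPipeline`: `run_chain`, `format_prompt`, and `fake_llm` are hypothetical names, and the "LLM" is a stub that echoes its prompt.

```python
from typing import Any, Callable, List


def run_chain(steps: List[Callable[[Any], Any]], first_input: Any) -> Any:
    """Pass first_input to the first step, then feed each output to the next step."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


# Hypothetical stand-in for a prompt template with one variable.
def format_prompt(movie_name: str) -> str:
    return f"Please generate related movies to {movie_name}"


# Hypothetical stand-in for an LLM call; it only echoes the prompt.
def fake_llm(prompt: str) -> str:
    return f"assistant: (response to: {prompt})"


result = run_chain([format_prompt, fake_llm], "The Departed")
print(result)
```

The real pipeline also converts between module-specific input and output types; this sketch skips that and assumes every step takes and returns a single value.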

In [6]:
# try chaining basic prompts
prompt_str = "Please generate related movies to {movie_name}"
prompt_tmpl = PromptTemplate(prompt_str)
llm = OpenAI(model="gpt-3.5-turbo")

p = QueryPipeline(chain=[prompt_tmpl, llm], verbose=True)

In [7]:
output = p.run(movie_name="The Departed")

> Running module 78ba745c-3ef7-45f3-b72e-81142d5c2bef with input:
movie_name: The Departed

> Running module 528c1828-2484-4a54-8f2c-b4241360fe9c with input:
messages: Please generate related movies to The Departed

In [8]:
print(str(output))

assistant: 1. Infernal Affairs (2002) - This is the Hong Kong crime thriller that inspired The Departed. It follows a similar storyline of undercover cops infiltrating the criminal underworld.

2. Internal Affairs (1990) - This American crime thriller, starring Richard Gere and Andy Garcia, revolves around a corrupt cop and an internal affairs officer determined to expose him.

3. The Town (2010) - Directed by Ben Affleck, this crime drama follows a group of bank robbers in Boston who find themselves in a dangerous situation when they take a hostage during a heist.

4. Heat (1995) - Directed by Michael Mann, this crime thriller features Al Pacino and Robert De Niro as a detective and a professional thief, respectively, whose paths cross in a high-stakes game of cat and mouse.

5. The Departed (2006) - Although it is the movie that inspired this list, it's worth mentioning The Departed itself. Directed by Martin Scorsese, it tells the story of an undercover cop and a mole in the police 

### Try Output Parsing

Let's parse the outputs into a structured Pydantic object.

In [9]:
from typing import List
from pydantic import BaseModel, Field
from llama_index.output_parsers import PydanticOutputParser

class Movie(BaseModel):
    """Object representing a single movie."""
    name: str = Field(..., description="Name of the movie.")
    year: int = Field(..., description="Year of the movie.")

class Movies(BaseModel):
    """Object representing a list of movies."""
    movies: List[Movie] = Field(..., description="List of movies.")


llm = OpenAI(model="gpt-3.5-turbo")
output_parser = PydanticOutputParser(Movies)
prompt_str = """\
Please generate related movies to {movie_name}. Output with the following JSON format: 
"""
prompt_str = prompt_str + output_parser.format_string

In [10]:
# add JSON spec to prompt template
prompt_tmpl = PromptTemplate(prompt_str)

p = QueryPipeline(chain=[prompt_tmpl, llm, output_parser], verbose=True)
output = p.run(movie_name="Toy Story")

> Running module 526a8682-3e06-49f2-841c-79ab8c5db208 with input:
movie_name: Toy Story

> Running module 9d3b027b-2258-4793-9383-a015e3f3e912 with input:
messages: Please generate related movies to Toy Story. Output with the following JSON format: 

Here's a JSON schema to follow:
{"title": "Movies", "description": "Object representing a list of movies.", "type"...

[0m[1;3;38;2;155;135;227m> Running module 5397732d-62fb-470a-97df-366593904c94 with input: 
input: assistant: {
  "movies": [
    {
      "name": "Finding Nemo",
      "year": 2003
    },
    {
      "name": "Cars",
      "year": 2006
    },
    {
      "name": "Up",
      "year": 2009
    },
    {...


In [11]:
output

Movies(movies=[Movie(name='Finding Nemo', year=2003), Movie(name='Cars', year=2006), Movie(name='Up', year=2009), Movie(name='Inside Out', year=2015), Movie(name='Coco', year=2017)])
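The parser's input in the trace above starts with an `assistant: ` prefix before the JSON, so the parsing step has to locate the JSON object inside the text before validating it. The sketch below approximates that step with the standard library only. It is an illustration under assumptions, not the `PydanticOutputParser` implementation: `parse_movies` is a hypothetical helper, dataclasses stand in for the Pydantic models, and the raw string is a shortened, hand-written example.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Movie:
    name: str
    year: int


@dataclass
class Movies:
    movies: List[Movie]


def parse_movies(llm_text: str) -> Movies:
    """Extract the outermost JSON object from llm_text and build a Movies object."""
    start = llm_text.find("{")
    end = llm_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in LLM output")
    payload = json.loads(llm_text[start : end + 1])
    return Movies(
        movies=[Movie(name=str(m["name"]), year=int(m["year"])) for m in payload["movies"]]
    )


# Hand-written example of a chat-style LLM response (not real model output).
raw = 'assistant: {"movies": [{"name": "Finding Nemo", "year": 2003}, {"name": "Cars", "year": 2006}]}'
parsed = parse_movies(raw)
print(parsed)
```

A missing field raises `KeyError` here and malformed JSON raises `json.JSONDecodeError`; a production parser would normally turn both into a clearer validation error.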

## Chain Together Query Rewriting Workflow (prompts + LLM) with Retrieval

Here we try a slightly more complex workflow where we send the input through two prompts before initiating retrieval.

1. Generate question about given topic.
2. Hallucinate answer given question, for better retrieval.

Because each prompt takes only one input, the `QueryPipeline` automatically passes each LLM output into the next prompt, and each formatted prompt into the next LLM call.

You'll see how to define links more explicitly in the next section.
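To illustrate why step 2 (writing a hypothetical answer before retrieving) can help, here is a toy sketch using word overlap as the similarity measure. Everything in it is an assumption made for illustration: the two documents are invented, the "hypothetical answer" is hard-coded rather than produced by an LLM, and `overlap_score` and `retrieve` are hypothetical helpers, not the embedding-based retrieval the index actually performs.

```python
from typing import List


def overlap_score(query: str, doc: str) -> int:
    """Count the distinct lowercase words shared by query and doc."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def retrieve(query: str, docs: List[str], top_k: int = 1) -> List[str]:
    """Return the top_k docs with the highest word overlap with the query."""
    return sorted(docs, key=lambda d: overlap_score(query, d), reverse=True)[:top_k]


# Invented toy corpus.
docs = [
    "he studied philosophy then switched to artificial intelligence at cornell",
    "the startup sold online store software to small businesses",
]

question = "what did he do in college"
# Hard-coded stand-in for the passage an LLM would write in the second step.
hypothetical_answer = "in college he studied philosophy and later artificial intelligence"

question_score = overlap_score(question, docs[0])
answer_score = overlap_score(hypothetical_answer, docs[0])
print(question_score, answer_score)
print(retrieve(hypothetical_answer, docs))
```

The hypothetical answer shares more vocabulary with the relevant document than the short question does, which is the intuition behind this style of query rewriting; with real embeddings the effect depends on the model and the data.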

In [12]:
# generate question regarding topic
prompt_str1 = "Please generate a concise question about Paul Graham's life regarding the following topic {topic}"
prompt_tmpl1 = PromptTemplate(prompt_str1)
# use HyDE to hallucinate answer.
prompt_str2 = (
    "Please write a passage to answer the question\n"
    "Try to include as many key details as possible.\n"
    "\n"
    "\n"
    "{query_str}\n"
    "\n"
    "\n"
    'Passage:"""\n'
)
prompt_tmpl2 = PromptTemplate(prompt_str2)

llm = OpenAI(model="gpt-3.5-turbo")
retriever = index.as_retriever(similarity_top_k=5)
p = QueryPipeline(
    chain=[prompt_tmpl1, llm, prompt_tmpl2, llm, retriever], verbose=True
)

In [13]:
nodes = p.run(topic="college")
len(nodes)

> Running module cd0adb52-0d2a-469e-88f5-de305698fe92 with input:
topic: college

> Running module 90eeb27f-53fb-4ff5-8a08-7457f2bbffd0 with input:
messages: Please generate a concise question about Paul Graham's life regarding the following topic college

> Running module 20b8e244-5e43-477e-8b19-02176229a342 with input:
query_str: assistant: How did Paul Graham's college experience shape his career and entrepreneurial mindset?

> Running module e5d357b7-4299-4eed-ac27-54287cfa3543 with input:
messages: Please write a passage to answer the question
Try to include as many key details as possible.


assistant: How did Paul Graham's college experience shape his career and entrepreneurial mindset?


Pass...

[0m[1;3;38;2;155;135;227m> Running module c90e1117-7eae-43e8-9bbb-21c773dee1fc with input: 
input: assistant: Paul Graham's college experience played a pivotal role in shaping 

5

## Chain Together a Full RAG Pipeline

Here we chain together a full RAG pipeline consisting of query rewriting, retrieval, reranking, and response synthesis.

Here we can't use `chain` syntax because certain modules depend on multiple inputs (for instance, response synthesis expects both the retrieved nodes and the original question). Instead we'll construct a DAG explicitly, through `add_modules` and then `add_link`.
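As a rough mental model of what an explicit DAG involves, the sketch below runs modules in topological order and routes each upstream output to a named argument of the downstream module. It is a toy under stated assumptions, not the LlamaIndex implementation: `ToyPipeline` and the stub modules are hypothetical, the graph is assumed to have one root and one leaf, and the default destination key `"input"` is chosen only for this example.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List, Tuple


class ToyPipeline:
    """Toy DAG runner with add_modules/add_link-style wiring (illustration only)."""

    def __init__(self) -> None:
        self.modules: Dict[str, Callable[..., Any]] = {}
        # Maps each destination module to its incoming (source, dest_key) pairs.
        self.links: Dict[str, List[Tuple[str, str]]] = {}

    def add_modules(self, modules: Dict[str, Callable[..., Any]]) -> None:
        self.modules.update(modules)

    def add_link(self, src: str, dest: str, dest_key: str = "input") -> None:
        self.links.setdefault(dest, []).append((src, dest_key))

    def run(self, **root_kwargs: Any) -> Any:
        # TopologicalSorter expects a mapping of node -> set of predecessors.
        graph = {n: {s for s, _ in self.links.get(n, [])} for n in self.modules}
        order = list(TopologicalSorter(graph).static_order())
        outputs: Dict[str, Any] = {}
        for name in order:
            incoming = self.links.get(name, [])
            # Root modules receive the caller's kwargs; others receive linked outputs.
            kwargs = {key: outputs[src] for src, key in incoming} if incoming else root_kwargs
            outputs[name] = self.modules[name](**kwargs)
        return outputs[order[-1]]  # the single leaf runs last


# Stub modules standing in for the prompt, LLM, retriever, reranker, and summarizer.
def prompt_tmpl(topic: str) -> str:
    return f"Question about {topic}?"

def llm(input: str) -> str:
    return input.upper()

def retriever(input: str) -> List[str]:
    return [f"doc about {input}", "unrelated doc"]

def reranker(nodes: List[str], query_str: str) -> List[str]:
    return sorted(nodes, key=lambda n: query_str in n, reverse=True)[:1]

def summarizer(nodes: List[str], query_str: str) -> str:
    return f"{query_str} -> {len(nodes)} node(s)"


p = ToyPipeline()
p.add_modules({"llm": llm, "prompt_tmpl": prompt_tmpl, "retriever": retriever,
               "summarizer": summarizer, "reranker": reranker})
p.add_link("prompt_tmpl", "llm")
p.add_link("llm", "retriever")
p.add_link("retriever", "reranker", dest_key="nodes")
p.add_link("llm", "reranker", dest_key="query_str")
p.add_link("reranker", "summarizer", dest_key="nodes")
p.add_link("llm", "summarizer", dest_key="query_str")
response = p.run(topic="YC")
print(response)
```

The point of the sketch is the fan-out from `llm`: its single output feeds three downstream modules under different argument names, which a purely sequential chain cannot express.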

In [14]:
from llama_index.postprocessor import CohereRerank
from llama_index.response_synthesizers import TreeSummarize
from llama_index import ServiceContext

# define modules
prompt_str = "Please generate a question about Paul Graham's life regarding the following topic {topic}"
prompt_tmpl = PromptTemplate(prompt_str)
llm = OpenAI(model="gpt-3.5-turbo")
retriever = index.as_retriever(similarity_top_k=3)
reranker = CohereRerank()
summarizer = TreeSummarize(
    service_context=ServiceContext.from_defaults(llm=llm)
)

In [15]:
# define query pipeline
p = QueryPipeline(verbose=True)
p.add_modules(
    {
        "llm": llm,
        "prompt_tmpl": prompt_tmpl,
        "retriever": retriever,
        "summarizer": summarizer,
        "reranker": reranker,
    }
)
p.add_link("prompt_tmpl", "llm")
p.add_link("llm", "retriever")
p.add_link("retriever", "reranker", dest_key="nodes")
p.add_link("llm", "reranker", dest_key="query_str")
p.add_link("reranker", "summarizer", dest_key="nodes")
p.add_link("llm", "summarizer", dest_key="query_str")

In [16]:
response = p.run(topic="YC")

> Running module prompt_tmpl with input:
topic: YC

> Running module llm with input:
messages: Please generate a question about Paul Graham's life regarding the following topic YC

> Running module retriever with input:
input: assistant: What role did Paul Graham play in the founding and development of Y Combinator (YC)?

[0m[1;3;38;2;155;135;227m> Running module reranker with input: 
query_str: assistant: What role did Paul Graham play in the founding and development of Y Combinator (YC)?
nodes: [NodeWithScore(node=TextNode(id_='543f958b-2c46-4c0f-b046-22e0a60ea950', embedding=None, metadata={'file_path': '../data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file...

[0m[1;3;38;2;155;135;227m> Running module summarizer with input: 
query_str: assistant: What role did Paul Graham play in the founding and development of Y Combinator (YC)?
nodes: [NodeWithScore(node=TextNode(

In [17]:
print(str(response))

Paul Graham played a significant role in the founding and development of Y Combinator (YC). He was one of the co-founders of YC and was actively involved in its early stages. He helped establish the structure and funding model for YC, including the Summer Founders Program. Graham also played a key role in selecting and supporting the startups that were part of YC's batches. As YC grew, Graham's involvement shifted, and he focused more on writing essays and working on YC, while also being involved in other projects such as Hacker News. Eventually, Graham decided to hand over the reins of YC to Sam Altman and retire from his active role in the organization.


In [18]:
# you can do async too
response = await p.arun(topic="YC")
print(str(response))

> Running module prompt_tmpl with input:
topic: YC

> Running module llm with input:
messages: Please generate a question about Paul Graham's life regarding the following topic YC

> Running module retriever with input:
input: assistant: What role did Paul Graham play in the founding and development of Y Combinator (YC)?

[0m[1;3;38;2;155;135;227m> Running module reranker with input: 
query_str: assistant: What role did Paul Graham play in the founding and development of Y Combinator (YC)?
nodes: [NodeWithScore(node=TextNode(id_='543f958b-2c46-4c0f-b046-22e0a60ea950', embedding=None, metadata={'file_path': '../data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file...

[0m[1;3;38;2;155;135;227m> Running module summarizer with input: 
query_str: assistant: What role did Paul Graham play in the founding and development of Y Combinator (YC)?
nodes: [NodeWithScore(node=TextNode(

## Defining a Custom Component in a Query Pipeline

You can easily define a custom component. Subclass `CustomQueryComponent`, implement the validation and run functions plus a few helpers (the input and output keys), and plug it in.

Let's wrap the related movie generation prompt+LLM chain from the first example into a custom component.
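Before the real implementation, here is a rough sketch of the contract such a component follows: it declares its required input keys and its output keys, and a wrapper checks both around the actual run function. This is a toy model, not the LlamaIndex base class: `ToyComponent` and `RelatedMovieToy` are hypothetical names, and the exact checks the library performs may differ.

```python
from typing import Any, Dict, Set


class ToyComponent:
    """Hypothetical base class: checks declared keys, then delegates to _run_component."""

    @property
    def _input_keys(self) -> Set[str]:
        raise NotImplementedError

    @property
    def _output_keys(self) -> Set[str]:
        raise NotImplementedError

    def _run_component(self, **kwargs: Any) -> Dict[str, Any]:
        raise NotImplementedError

    def run_component(self, **kwargs: Any) -> Dict[str, Any]:
        # Reject calls that omit a required input.
        missing = self._input_keys - set(kwargs)
        if missing:
            raise ValueError(f"missing required inputs: {sorted(missing)}")
        output = self._run_component(**kwargs)
        # Reject results that do not match the declared output keys.
        if set(output) != self._output_keys:
            raise ValueError(f"unexpected output keys: {sorted(output)}")
        return output


class RelatedMovieToy(ToyComponent):
    """Stub version of the related-movie component; no LLM is called."""

    @property
    def _input_keys(self) -> Set[str]:
        return {"movie"}

    @property
    def _output_keys(self) -> Set[str]:
        return {"output"}

    def _run_component(self, **kwargs: Any) -> Dict[str, Any]:
        return {"output": f"Movies related to {kwargs['movie']}"}


component = RelatedMovieToy()
result = component.run_component(movie="Love Actually")
print(result)
```

Declaring keys up front is what lets a pipeline wire components together by name and fail early when a link is missing.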

In [19]:
from llama_index.query_pipeline import CustomQueryComponent, InputKeys, OutputKeys
from typing import Dict, Any
from llama_index.llms.llm import BaseLLM
from pydantic import Field


class RelatedMovieComponent(CustomQueryComponent):
    """Related movie component."""

    llm: BaseLLM = Field(..., description="OpenAI LLM")

    def _validate_component_inputs(self, input: Dict[str, Any]) -> Dict[str, Any]:
        """Validate component inputs during run_component."""
        # NOTE: this is OPTIONAL but we show you here how to do validation as an example
        return input

    @property
    def _input_keys(self) -> set:
        """Input keys dict."""
        # NOTE: These are required inputs. If you have optional inputs please override 
        # `optional_input_keys_dict`
        return {"movie"}

    @property
    def _output_keys(self) -> set:
        return {"output"}

    def _run_component(self, **kwargs) -> Dict[str, Any]:
        """Run the component."""
        # use QueryPipeline itself here for convenience
        prompt_str = "Please generate related movies to {movie_name}"
        prompt_tmpl = PromptTemplate(prompt_str)
        # use the component's own LLM field rather than a global `llm` variable
        p = QueryPipeline(chain=[prompt_tmpl, self.llm])
        return {"output": p.run(movie_name=kwargs["movie"])}

Let's try the custom component out! We'll also add a step that rewrites the output in the voice of Shakespeare.

In [20]:
llm = OpenAI(model="gpt-3.5-turbo")
component = RelatedMovieComponent(llm=llm)

# let's add some subsequent prompts for fun
prompt_str = """\
Here's some text:

{text}

Can you rewrite this in the voice of Shakespeare?
"""
prompt_tmpl = PromptTemplate(prompt_str)

p = QueryPipeline(chain=[component, prompt_tmpl, llm], verbose=True)

In [21]:
output = p.run(movie="Love Actually")

> Running module 3c12a0fd-9028-4146-9ae5-16d0ac3b6357 with input:
movie: Love Actually

> Running module 6340e3f9-d784-44a8-a31e-e15cfe868cc9 with input:
text: assistant: 1. "Valentine's Day" (2010)
2. "New Year's Eve" (2011)
3. "The Holiday" (2006)
4. "Crazy, Stupid, Love" (2011)
5. "Notting Hill" (1999)
6. "Four Weddings and a Funeral" (1994)
7. "Bridget J...

[0m[1;3;38;2;155;135;227m> Running module c5caa17c-c05b-42e9-9b65-5d1637553695 with input: 
messages: Here's some text:

assistant: 1. "Valentine's Day" (2010)
2. "New Year's Eve" (2011)
3. "The Holiday" (2006)
4. "Crazy, Stupid, Love" (2011)
5. "Notting Hill" (1999)
6. "Four Weddings and a Funeral" (...


In [22]:
print(str(output))

assistant: 1. "Valentine's Day" (2010) - "A day of love, where hearts entwine,
   And Cupid's arrows pierce the soul divine."

2. "New Year's Eve" (2011) - "When old year fades, and new year's dawn,
   We gather 'round, to celebrate reborn."

3. "The Holiday" (2006) - "Two souls, adrift in search of cheer,
   Find solace in a holiday so dear."

4. "Crazy, Stupid, Love" (2011) - "Love's madness, folly, and delight,
   A tale of hearts entwined, both day and night."

5. "Notting Hill" (1999) - "In London's streets, where love may bloom,
   A humble man finds love, dispelling gloom."

6. "Four Weddings and a Funeral" (1994) - "Four unions blessed, with joy and tears,
   And one farewell, that stirs our deepest fears."

7. "Bridget Jones's Diary" (2001) - "A maiden's tale, of love and strife,
   As she records her journey, through love's life."

8. "About Time" (2013) - "A man, bestowed with time's embrace,
   Seeks love's solace, in each fleeting space."

9. "The Best Exotic Marigold Hote