# An Introduction to LlamaIndex Query Pipelines

## Overview
LlamaIndex provides a declarative query API that allows you to chain together different modules in order to orchestrate simple-to-advanced workflows over your data.

This is centered around our `QueryPipeline` abstraction. Load in a variety of modules (from LLMs to prompts to retrievers to other pipelines), connect them all together into a sequential chain or DAG, and run it end-to-end.

**NOTE**: You can orchestrate all these workflows without the declarative pipeline abstraction (by using the modules imperatively and writing your own functions). So what are the advantages of `QueryPipeline`?

- Express common workflows with fewer lines of code/boilerplate
- Greater readability
- Greater parity / better integration points with common low-code / no-code solutions (e.g. LangFlow)
- [In the future] A declarative interface allows easy serializability of pipeline components, providing portability of pipelines/easier deployment to different systems.

## Cookbook

In this cookbook we give you an introduction to our `QueryPipeline` interface and show you some basic workflows you can tackle.

- Chain together prompt and LLM
- Chain together query rewriting (prompt + LLM) with retrieval
- Chain together a full RAG query pipeline (query rewriting, retrieval, reranking, response synthesis)
- Set up a custom query component

In [None]:
!pip install llama-index==0.9.45.post1 arize-phoenix==2.2.1

## Setup

Here we set up some data and indexes (from Paul Graham's essay) that we'll be using in the rest of the cookbook.

In [None]:
# setup Arize Phoenix for logging/observability
import phoenix as px
px.launch_app()
import llama_index
llama_index.set_global_handler("arize_phoenix")

In [None]:
from llama_index.query_pipeline import QueryPipeline
from llama_index.llms import OpenAI
from llama_index.prompts import PromptTemplate
from llama_index import (
    VectorStoreIndex,
    ServiceContext,
    SimpleDirectoryReader,
    load_index_from_storage,
)

In [None]:
reader = SimpleDirectoryReader("../data/paul_graham")

In [None]:
docs = reader.load_data()
print(docs[0].get_content())



What I Worked On

February 2021

Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep.

The first programs I tried writing were on the IBM 1401 that our school district used for what was then called "data processing." This was in 9th grade, so I was 13 or 14. The school district's 1401 happened to be in the basement of our junior high school, and my friend Rich Draves and I got permission to use it. It was like a mini Bond villain's lair down there, with all these alien-looking machines — CPU, disk drives, printer, card reader — sitting up on a raised floor under bright fluorescent lights.

The language we used was an early version of Fortran. You had to type programs on punch cards, then stack them in t

In [None]:
import os
from llama_index.storage import StorageContext

if not os.path.exists("storage"):
    index = VectorStoreIndex.from_documents(docs)
    # save index to disk
    index.set_index_id("vector_index")
    index.storage_context.persist("./storage")
else:
    # rebuild storage context
    storage_context = StorageContext.from_defaults(persist_dir="storage")
    # load index
    index = load_index_from_storage(storage_context, index_id="vector_index")

## Chain Together Prompt and LLM

In this section we show a super simple workflow of chaining together a prompt with an LLM.

We simply define `chain` on initialization. This is a special case of a query pipeline where the components are purely sequential, and we automatically convert outputs into the right format for the next inputs.
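To make the chaining behaviour concrete, here is a minimal plain-Python sketch of a purely sequential chain. It is not LlamaIndex's implementation: `run_chain`, `format_prompt`, and `fake_llm` are hypothetical names, and the LLM is stubbed so the example runs offline. The first step receives the keyword arguments, and each later step receives the previous output as its only input.

```python
from typing import Any, Callable, List


def run_chain(steps: List[Callable[..., Any]], **kwargs: Any) -> Any:
    """Run steps in order: the first gets kwargs, each later one gets the previous output."""
    if not steps:
        raise ValueError("chain must contain at least one step")
    output = steps[0](**kwargs)
    for step in steps[1:]:
        output = step(output)
    return output


def format_prompt(movie_name: str) -> str:
    # hypothetical stand-in for formatting a prompt template
    return f"Please generate related movies to {movie_name}"


def fake_llm(prompt: str) -> str:
    # hypothetical stand-in for an LLM call; echoes the prompt so no API key is needed
    return f"assistant: (response to: {prompt})"


result = run_chain([format_prompt, fake_llm], movie_name="The Departed")
print(result)
```

The real `QueryPipeline` additionally converts each output into the format the next module expects, which this sketch leaves out.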

In [None]:
# try chaining basic prompts
prompt_str = "Please generate related movies to {movie_name}"
prompt_tmpl = PromptTemplate(prompt_str)
llm = OpenAI(model="gpt-3.5-turbo")

p = QueryPipeline(chain=[prompt_tmpl, llm], verbose=True)

In [None]:
output = p.run(movie_name="The Departed")

> Running module b3254005-5bd2-4e1e-bdab-ce1d1aa97252 with input:
movie_name: The Departed

> Running module 96d0d711-3838-4b16-a74d-bfe4303ac060 with input:
messages: Please generate related movies to The Departed

In [None]:
print(str(output))

assistant: 1. Infernal Affairs (2002) - The Departed is actually a remake of this Hong Kong crime thriller. It follows a similar storyline of undercover cops infiltrating a criminal organization.

2. The Town (2010) - Directed by Ben Affleck, this crime drama revolves around a group of bank robbers in Boston. It explores themes of loyalty, betrayal, and the blurred lines between law enforcement and criminals.

3. American Gangster (2007) - Based on a true story, this crime film follows the rise and fall of a Harlem drug lord and the efforts of a dedicated detective to bring him down. It delves into the corrupt underworld of organized crime and law enforcement.

4. The Departed 2 (hypothetical) - Although there is no official sequel to The Departed, a hypothetical continuation of the story could explore the aftermath of the events in the original film, with new characters navigating the treacherous world of undercover work and criminal activity.

5. Training Day (2001) - This crime thri

In [None]:
# you can still implement this imperatively if you prefer;
# just make sure you format the prompt and call the right method on the LLM
from llama_index.llms import ChatMessage, MessageRole

# try chaining basic prompts
prompt_str = "Please generate related movies to {movie_name}"
prompt_tmpl = PromptTemplate(prompt_str)
llm = OpenAI(model="gpt-3.5-turbo")


# format prompt, pass to LLM
movie_name = "The Departed"
full_prompt_tmpl = prompt_tmpl.format(movie_name=movie_name)
response = llm.chat([ChatMessage(content=full_prompt_tmpl, role=MessageRole.USER)])
print(str(response))

assistant: 1. Infernal Affairs (2002) - The Departed is actually a remake of this Hong Kong crime thriller, which follows a similar storyline of undercover cops infiltrating a criminal organization.

2. The Town (2010) - Directed by Ben Affleck, this crime drama revolves around a group of bank robbers in Boston and the FBI agent determined to bring them down.

3. Heat (1995) - Directed by Michael Mann, this classic crime film features an intense cat-and-mouse game between a skilled detective and a professional thief in Los Angeles.

4. American Gangster (2007) - Based on a true story, this crime drama explores the rise and fall of a Harlem drug lord and the detective who is determined to bring him to justice.

5. Training Day (2001) - Denzel Washington won an Academy Award for his role in this crime thriller, which follows a corrupt narcotics detective and his rookie partner as they navigate the dangerous streets of Los Angeles.

6. The Departed (2006) - Although it's the movie in ques

### Try Output Parsing

Let's parse the outputs into a structured Pydantic object.

In [None]:
from typing import List
from pydantic import BaseModel, Field
from llama_index.output_parsers import PydanticOutputParser


class Movie(BaseModel):
    """Object representing a single movie."""

    name: str = Field(..., description="Name of the movie.")
    year: int = Field(..., description="Year of the movie.")


class Movies(BaseModel):
    """Object representing a list of movies."""

    movies: List[Movie] = Field(..., description="List of movies.")


llm = OpenAI(model="gpt-3.5-turbo")
output_parser = PydanticOutputParser(Movies)
json_prompt_str = """\
Please generate related movies to {movie_name}.
"""
json_prompt_str = output_parser.format(json_prompt_str)

In [None]:
print(json_prompt_str)

Please generate related movies to {movie_name}.  



Here's a JSON schema to follow:
{{"title": "Movies", "description": "Object representing a list of movies.", "type": "object", "properties": {{"movies": {{"title": "Movies", "description": "List of movies.", "type": "array", "items": {{"$ref": "#/definitions/Movie"}}}}}}, "required": ["movies"], "definitions": {{"Movie": {{"title": "Movie", "description": "Object representing a single movie.", "type": "object", "properties": {{"name": {{"title": "Name", "description": "Name of the movie.", "type": "string"}}, "year": {{"title": "Year", "description": "Year of the movie.", "type": "integer"}}}}, "required": ["name", "year"]}}}}}}

Output a valid JSON object but do not repeat the schema.



In [None]:
# add JSON spec to prompt template
json_prompt_tmpl = PromptTemplate(json_prompt_str)

p = QueryPipeline(chain=[json_prompt_tmpl, llm, output_parser], verbose=True)
output = p.run(movie_name="Toy Story")

> Running module 68648b49-9c0a-42fb-adff-0acdcd032f16 with input:
movie_name: Toy Story

> Running module 7148919b-6a0d-4e8a-aaf3-65ff00025b38 with input:
messages: Please generate related movies to Toy Story.



Here's a JSON schema to follow:
{"title": "Movies", "description": "Object representing a list of movies.", "type": "object", "properties": {"movies":...

> Running module 932fd79c-37d7-4eb1-a9d7-142d5f2210f5 with input:
input: assistant: {
  "movies": [
    {
      "name": "Finding Nemo",
      "year": 2003
    },
    {
      "name": "Cars",
      "year": 2006
    },
    {
      "name": "Monsters, Inc.",
      "year": 2001
...

In [None]:
output

Movies(movies=[Movie(name='Finding Nemo', year=2003), Movie(name='Cars', year=2006), Movie(name='Monsters, Inc.', year=2001), Movie(name='The Incredibles', year=2004), Movie(name='Ratatouille', year=2007)])
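The log above shows the parser receiving the chat response as a string that starts with `assistant: `. As a rough illustration of what a parsing step has to do (find the JSON object, load it, build typed objects), here is a standard-library sketch. `MovieRecord` and `parse_movies` are hypothetical names; this is not how `PydanticOutputParser` is implemented, and Pydantic's validation is considerably richer.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class MovieRecord:
    name: str
    year: int


def parse_movies(raw: str) -> List[MovieRecord]:
    """Extract the outermost JSON object from raw text and build typed records."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in LLM output")
    payload = json.loads(raw[start : end + 1])
    return [
        MovieRecord(name=str(m["name"]), year=int(m["year"]))
        for m in payload["movies"]
    ]


# hypothetical LLM output, shaped like the logged response above
raw_output = (
    'assistant: {"movies": [{"name": "Finding Nemo", "year": 2003}, '
    '{"name": "Cars", "year": 2006}]}'
)
movies = parse_movies(raw_output)
print(movies)
```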

### Streaming Support

The query pipelines have LLM streaming support (simply do `as_query_component(streaming=True)`). Intermediate outputs will get autoconverted, and the final output can be a streaming output. Here are some examples.

**1. Chain multiple Prompts with Streaming**

In [None]:
prompt_str = "Please generate related movies to {movie_name}"
prompt_tmpl = PromptTemplate(prompt_str)
# let's add some subsequent prompts for fun
prompt_str2 = """\
Here's some text:

{text}

Can you rewrite this with a summary of each movie?
"""
prompt_tmpl2 = PromptTemplate(prompt_str2)
llm = OpenAI(model="gpt-3.5-turbo")
llm_c = llm.as_query_component(streaming=True)

p = QueryPipeline(
    chain=[prompt_tmpl, llm_c, prompt_tmpl2, llm_c], verbose=True
)
# p = QueryPipeline(chain=[prompt_tmpl, llm_c], verbose=True)

In [None]:
output = p.run(movie_name="The Dark Knight")
for o in output:
    print(o.delta, end="")

> Running module 3fbc7286-6261-4bf4-bbf2-c9d3c45def50 with input:
movie_name: The Dark Knight

> Running module c244429b-b8f2-4700-83e4-a80798732d9f with input:
messages: Please generate related movies to The Dark Knight

> Running module a11460e9-ce4d-4afd-8a88-d2db13a4aff1 with input:
text: <generator object llm_chat_callback.<locals>.wrap.<locals>.wrapped_llm_chat.<locals>.wrapped_gen at 0x12d22cf90>

> Running module 13802c56-b3cb-421b-844e-f5b9f0e7740f with input:
messages: Here's some text:

1. Batman Begins (2005)
2. The Dark Knight Rises (2012)
3. Batman v Superman: Dawn of Justice (2016)
4. Man of Steel (2013)
5. The Avengers (2012)
6. Iron Man (2008)
7. Captain Amer...

1. Batman Begins (2005): A young Bruce Wayne becomes Batman to protect Gotham City from corruption and crime, facing his fears and training under the guidance of Ra's al Ghul.
2. The Dark Knight Rise
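In the log above, the second prompt is handed a `<generator object ...>`, yet the LLM call after it sees the full text. The plain-Python sketch below illustrates that idea under our own assumptions: an intermediate stream is drained into a string before the next step uses it, and only the final stream is returned for the caller to iterate. `fake_stream` and `drain` are hypothetical helpers, not LlamaIndex code.

```python
from typing import Iterator, List


def fake_stream(text: str) -> Iterator[str]:
    """Stand-in for a streaming LLM: yields the response one word at a time."""
    for word in text.split():
        yield word + " "


def drain(stream: Iterator[str]) -> str:
    """Consume a stream fully so a downstream prompt can use the complete text."""
    return "".join(stream).strip()


# step 1: intermediate streaming output is drained before it is formatted into the next prompt
first = drain(fake_stream("Batman Begins (2005)"))
second_prompt = (
    f"Here's some text:\n\n{first}\n\n"
    "Can you rewrite this with a summary of each movie?"
)

# step 2: the final stream is left as-is, so the caller can print deltas as they arrive
final_stream = fake_stream("Batman Begins (2005): an origin story.")
chunks: List[str] = []
for delta in final_stream:
    chunks.append(delta)
    print(delta, end="")
```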

**2. Feed streaming output to output parser**

In [None]:
p = QueryPipeline(
    chain=[
        json_prompt_tmpl,
        llm.as_query_component(streaming=True),
        output_parser,
    ],
    verbose=True,
)
output = p.run(movie_name="Toy Story")
print(output)

> Running module bf36ac87-0722-44e3-947a-aba13c52ea5e with input:
movie_name: Toy Story

> Running module 7b42f632-ea70-4059-b393-99eeda5abe88 with input:
messages: Please generate related movies to Toy Story.



Here's a JSON schema to follow:
{"title": "Movies", "description": "Object representing a list of movies.", "type": "object", "properties": {"movies":...

> Running module 94e9a64e-9540-4717-a81a-0cedfe870b43 with input:
input: <generator object llm_chat_callback.<locals>.wrap.<locals>.wrapped_llm_chat.<locals>.wrapped_gen at 0x12d22c9e0>

movies=[Movie(name='Finding Nemo', year=2003), Movie(name='Monsters, Inc.', year=2001), Movie(name='Cars', year=2006), Movie(name='The Incredibles', year=2004), Movie(name='Ratatouille', year=2007)]


## Chain Together Query Rewriting Workflow (prompts + LLM) with Retrieval

Here we try a slightly more complex workflow where we send the input through two prompts before initiating retrieval.

1. Generate question about given topic.
2. Hallucinate answer given question, for better retrieval.

Since each prompt only takes in one input, note that the `QueryPipeline` will automatically chain LLM outputs into the prompt and then into the LLM.

You'll see how to define links more explicitly in the next section.

In [None]:
# from llama_index.postprocessor import CohereRerank

# generate question regarding topic
prompt_str1 = "Please generate a concise question about Paul Graham's life regarding the following topic {topic}"
prompt_tmpl1 = PromptTemplate(prompt_str1)
# use HyDE to hallucinate answer.
prompt_str2 = (
    "Please write a passage to answer the question\n"
    "Try to include as many key details as possible.\n"
    "\n"
    "\n"
    "{query_str}\n"
    "\n"
    "\n"
    'Passage:"""\n'
)
prompt_tmpl2 = PromptTemplate(prompt_str2)

llm = OpenAI(model="gpt-3.5-turbo")
retriever = index.as_retriever(similarity_top_k=5)
p = QueryPipeline(
    chain=[prompt_tmpl1, llm, prompt_tmpl2, llm, retriever], verbose=True
)

In [None]:
nodes = p.run(topic="college")
len(nodes)

> Running module df8b4807-c289-4692-9892-a90b42cfecb7 with input:
topic: college

> Running module 2259b032-c87f-4349-8e70-6ec4781f7b0b with input:
messages: Please generate a concise question about Paul Graham's life regarding the following topic college

> Running module 28e84454-0820-47d5-91f4-dcce1b08bb33 with input:
query_str: assistant: How did Paul Graham's college experience shape his career and entrepreneurial mindset?

> Running module d483e667-d081-4f87-b4f3-f9eab6402ec4 with input:
messages: Please write a passage to answer the question
Try to include as many key details as possible.


How did Paul Graham's college experience shape his career and entrepreneurial mindset?


Passage:"""


> Running module b7b3e563-247d-42d4-bc10-47bdd18d792c with input:
input: assistant: Paul Graham's college experience played a pivotal role in shaping his ca

5

## Create a Full RAG Pipeline as a DAG

Here we chain together a full RAG pipeline consisting of query rewriting, retrieval, reranking, and response synthesis.

Here we can't use `chain` syntax because certain modules depend on multiple inputs (for instance, response synthesis expects both the retrieved nodes and the original question). Instead we'll construct a DAG explicitly, through `add_modules` and then `add_link`.

### 1. RAG Pipeline with Query Rewriting

We use an LLM to rewrite the query first before passing it to our downstream modules - retrieval/reranking/synthesis.

In [None]:
from llama_index.postprocessor import CohereRerank
from llama_index.response_synthesizers import TreeSummarize
from llama_index import ServiceContext

# define modules
prompt_str = "Please generate a question about Paul Graham's life regarding the following topic {topic}"
prompt_tmpl = PromptTemplate(prompt_str)
llm = OpenAI(model="gpt-3.5-turbo")
retriever = index.as_retriever(similarity_top_k=3)
reranker = CohereRerank()
## NOTE: we are deprecating ServiceContext soon in v0.10 and letting you pass in `llm` directly.
summarizer = TreeSummarize(
    service_context=ServiceContext.from_defaults(llm=llm)
)

In [None]:
# define query pipeline
p = QueryPipeline(verbose=True)
p.add_modules(
    {
        "llm": llm,
        "prompt_tmpl": prompt_tmpl,
        "retriever": retriever,
        "summarizer": summarizer,
        "reranker": reranker,
    }
)

Next we draw links between modules with `add_link`. `add_link` takes in the source/destination module ids, and optionally the `source_key` and `dest_key`. Specify the `source_key` or `dest_key` if there are multiple outputs/inputs respectively.

You can view the set of input/output keys for each module through `module.as_query_component().input_keys` and `module.as_query_component().output_keys`.

Here we explicitly specify `dest_key` for the `reranker` and `summarizer` modules because they take in two inputs (query_str and nodes).

In [None]:
p.add_link("prompt_tmpl", "llm")
p.add_link("llm", "retriever")
p.add_link("retriever", "reranker", dest_key="nodes")
p.add_link("llm", "reranker", dest_key="query_str")
p.add_link("reranker", "summarizer", dest_key="nodes")
p.add_link("llm", "summarizer", dest_key="query_str")

# look at summarizer input keys
print(summarizer.as_query_component().input_keys)

required_keys={'query_str', 'nodes'} optional_keys=set()
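To show what `add_link` with a `dest_key` amounts to, here is a toy DAG runner built on the standard library's `graphlib`. It is a sketch under our own assumptions, not the `QueryPipeline` implementation: `TinyDAG` and the lambda modules are hypothetical, the default destination key `"input"` is chosen for illustration, and the graph is assumed to have exactly one leaf.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List, Tuple


class TinyDAG:
    """Toy DAG runner: modules are callables that take keyword inputs."""

    def __init__(self) -> None:
        self.modules: Dict[str, Callable[..., Any]] = {}
        # destination module -> list of (source module, destination key)
        self.links: Dict[str, List[Tuple[str, str]]] = {}

    def add_modules(self, modules: Dict[str, Callable[..., Any]]) -> None:
        self.modules.update(modules)

    def add_link(self, src: str, dest: str, dest_key: str = "input") -> None:
        self.links.setdefault(dest, []).append((src, dest_key))

    def run(self, **root_inputs: Any) -> Any:
        # each module depends on the sources of its incoming links
        deps = {n: {s for s, _ in self.links.get(n, [])} for n in self.modules}
        outputs: Dict[str, Any] = {}
        last = ""
        for name in TopologicalSorter(deps).static_order():
            if deps[name]:
                kwargs = {key: outputs[src] for src, key in self.links[name]}
            else:
                kwargs = root_inputs  # a root module receives the pipeline inputs
            outputs[name] = self.modules[name](**kwargs)
            last = name
        # with a single leaf, that leaf is always last in topological order
        return outputs[last]


dag = TinyDAG()
dag.add_modules(
    {
        "prompt_tmpl": lambda topic: f"Question about {topic}?",
        "llm": lambda input: input.upper(),
        "retriever": lambda input: [f"node for {input}"],
        "summarizer": lambda query_str, nodes: f"{query_str} -> {len(nodes)} node(s)",
    }
)
dag.add_link("prompt_tmpl", "llm")
dag.add_link("llm", "retriever")
dag.add_link("llm", "summarizer", dest_key="query_str")
dag.add_link("retriever", "summarizer", dest_key="nodes")
answer = dag.run(topic="YC")
print(answer)
```

The `summarizer` stand-in takes two named inputs, so both of its incoming links need an explicit `dest_key`, mirroring the reranker and summarizer links above.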


We use `networkx` to store the graph representation. This gives us an easy way to view the DAG!

In [None]:
## create graph
from pyvis.network import Network

net = Network(notebook=True, cdn_resources="in_line", directed=True)
net.from_nx(p.dag)
net.show("rag_dag.html")

rag_dag.html


In [None]:
response = p.run(topic="YC")

> Running module prompt_tmpl with input:
topic: YC

> Running module llm with input:
messages: Please generate a question about Paul Graham's life regarding the following topic YC

> Running module retriever with input:
input: assistant: What role did Paul Graham play in the founding and development of Y Combinator (YC)?

> Running module reranker with input:
query_str: assistant: What role did Paul Graham play in the founding and development of Y Combinator (YC)?
nodes: [NodeWithScore(node=TextNode(id_='543f958b-2c46-4c0f-b046-22e0a60ea950', embedding=None, metadata={'file_path': '../data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file...

> Running module summarizer with input:
query_str: assistant: What role did Paul Graham play in the founding and development of Y Combinator (YC)?
nodes: [NodeWithScore(node=TextNode(

In [None]:
print(str(response))

Paul Graham played a significant role in the founding and development of Y Combinator (YC). He was one of the co-founders of YC and provided the initial funding for the investment firm. Along with his partners, he implemented the ideas they had been discussing and started their own investment firm. Paul Graham also played a key role in shaping the unique batch model of YC, where a group of startups is funded at once and receives intensive support and guidance for a period of three months. He was actively involved in selecting and helping the founders and worked on various projects related to YC, including writing essays and developing internal software.


In [None]:
# you can do async too
response = await p.arun(topic="YC")
print(str(response))

### 2. RAG Pipeline without Query Rewriting

Here we set up a RAG pipeline without the query rewriting step.

We need a way to link the input query to both the retriever and the summarizer. We can do this by defining a special `InputComponent`, allowing us to link the inputs to multiple downstream modules. Note that the pipeline below does not include the reranker, even though one is instantiated.

In [None]:
from llama_index.postprocessor import CohereRerank
from llama_index.response_synthesizers import TreeSummarize
from llama_index import ServiceContext
from llama_index.query_pipeline import InputComponent

retriever = index.as_retriever(similarity_top_k=5)
summarizer = TreeSummarize(
    service_context=ServiceContext.from_defaults(
        llm=OpenAI(model="gpt-3.5-turbo")
    )
)
reranker = CohereRerank()

In [None]:
p = QueryPipeline(verbose=True)
p.add_modules(
    {
        "input": InputComponent(),
        "retriever": retriever,
        "summarizer": summarizer,
    }
)
p.add_link("input", "retriever")
p.add_link("input", "summarizer", dest_key="query_str")
p.add_link("retriever", "summarizer", dest_key="nodes")

In [None]:
output = p.run(input="what did the author do in YC")

> Running module input with input:
input: what did the author do in YC

> Running module retriever with input:
input: what did the author do in YC

> Running module summarizer with input:
query_str: what did the author do in YC
nodes: [NodeWithScore(node=TextNode(id_='d2fc236f-2120-4f43-92d9-6c8e8725b806', embedding=None, metadata={'file_path': '../data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file...

In [None]:
print(str(output))

The author worked on YC, wrote essays, ran Hacker News (HN), funded startups through YC's Summer Founders Program, provided support and guidance to founders, and worked on YC's internal software in Arc.


## Defining a Custom Component in a Query Pipeline

You can easily define a custom component. Simply subclass `CustomQueryComponent`, implement the validation and run functions plus a few helpers (the input and output keys), and plug it in.

Let's wrap the related movie generation prompt+LLM chain from the first example into a custom component.
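Before the real component, here is a library-free sketch of the contract such a component fulfils: it declares its required input keys, validates the inputs it receives, and returns a dict of outputs. `TinyComponent` and `Shout` are hypothetical classes written for illustration; they are not part of LlamaIndex, and the actual base class does more than this.

```python
from typing import Any, Dict, Set


class TinyComponent:
    """Toy component: declares required input keys and checks them before running."""

    input_keys: Set[str] = set()

    def validate(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        missing = self.input_keys - set(inputs)
        if missing:
            raise ValueError(f"missing required inputs: {sorted(missing)}")
        return inputs

    def run(self, **kwargs: Any) -> Dict[str, Any]:
        # validate first, then delegate to the subclass-specific logic
        return self._run(**self.validate(kwargs))

    def _run(self, **kwargs: Any) -> Dict[str, Any]:
        raise NotImplementedError


class Shout(TinyComponent):
    """Example subclass: one required input, one output key."""

    input_keys = {"movie"}

    def _run(self, **kwargs: Any) -> Dict[str, Any]:
        return {"output": kwargs["movie"].upper()}


result = Shout().run(movie="Love Actually")
print(result)
```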

In [None]:
from llama_index.query_pipeline import (
    CustomQueryComponent,
    InputKeys,
    OutputKeys,
)
from typing import Dict, Any
from llama_index.llms.llm import BaseLLM
from pydantic import Field


class RelatedMovieComponent(CustomQueryComponent):
    """Related movie component."""

    llm: BaseLLM = Field(..., description="OpenAI LLM")

    def _validate_component_inputs(
        self, input: Dict[str, Any]
    ) -> Dict[str, Any]:
        """Validate component inputs during run_component."""
        # NOTE: this is OPTIONAL but we show you here how to do validation as an example
        return input

    @property
    def _input_keys(self) -> set:
        """Input keys dict."""
        # NOTE: These are required inputs. If you have optional inputs please override
        # `_optional_input_keys`
        return {"movie"}

    @property
    def _output_keys(self) -> set:
        return {"output"}

    def _run_component(self, **kwargs) -> Dict[str, Any]:
        """Run the component."""
        # use QueryPipeline itself here for convenience
        prompt_str = "Please generate related movies to {movie_name}"
        prompt_tmpl = PromptTemplate(prompt_str)
        # use the component's own LLM rather than a global variable
        p = QueryPipeline(chain=[prompt_tmpl, self.llm])
        return {"output": p.run(movie_name=kwargs["movie"])}


# from llama_index.query_pipeline import FunctionComponent

# def foo(x: str) -> str:
#     return x + ":hello"

# component = FunctionComponent(fn=foo)

Let's try the custom component out! We'll also add a step to convert the output to Shakespeare.

In [None]:
llm = OpenAI(model="gpt-3.5-turbo")
component = RelatedMovieComponent(llm=llm)

# let's add some subsequent prompts for fun
prompt_str = """\
Here's some text:

{text}

Can you rewrite this in the voice of Shakespeare?
"""
prompt_tmpl = PromptTemplate(prompt_str)

p = QueryPipeline(chain=[component, prompt_tmpl, llm], verbose=True)

In [None]:
output = p.run(movie="Love Actually")

> Running module 4cf0fedf-04e6-4f75-8f12-8882d549b8c6 with input:
movie: Love Actually

> Running module dddfc2bf-93ca-4434-90cd-6e20f71d36c7 with input:
text: assistant: 1. "Valentine's Day" (2010) - This romantic comedy follows the lives of several interconnected couples and singles in Los Angeles as they navigate love and relationships on Valentine's Day....

> Running module ff76fde2-503b-42da-a748-7ea70c223a94 with input:
messages: Here's some text:

1. "Valentine's Day" (2010) - This romantic comedy follows the lives of several interconnected couples and singles in Los Angeles as they navigate love and relationships on Valentin...

In [None]:
print(str(output))

assistant: 1. "Valentine's Daye" (2010) - Thise romantic comedy doth follow the lives of several interconnected couples and singles in Los Angeles as they doth navigate love and relationships on Valentine's Daye.

2. "New Year's Eve" (2011) - Similar to "Love Actually," thise film doth tell the tale of multiple characters whose lives doth intertwine on New Year's Eve in New York City, exploring themes of love, hope, and second chances.

3. "Crazy, Stupid, Love" (2011) - Thise romantic comedy-drama doth revolve around a middle-aged man who, after his wife doth ask him for a divorce, seeketh guidance from a young bachelor on how to navigate the dating scene and win back his wife.

4. "The Holiday" (2006) - In thise heartwarming film, two women from different countries doth swap homes during the Christmas season to escape their personal troubles. Whilst on their respective vacations, they unexpectedly doth find love and learn valuable life lessons.

5. "Notting Hill" (1999) - Starring Hug

## Async / Parallel Execution

Here we showcase our query pipeline with async + parallel execution.

We do this by setting up a RAG pipeline that does the following:
1. Send query to multiple RAG query engines.
2. Combine results.

In the process we'll also show some nice abstractions for joining results (e.g. our `ArgPackComponent()`)
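As a rough mental model of the fan-out and join pattern, the sketch below runs several independent branches concurrently with `asyncio.gather` and then packs the keyword results into one list, converting each value on the way. `fake_engine`, `arg_pack`, and `fan_out` are hypothetical names, the engines are stubbed with a short sleep, and the packing function only approximates how the pipeline in this section uses `ArgPackComponent` with a `convert_fn`.

```python
import asyncio
from typing import Any, Callable, Dict, List


async def fake_engine(name: str, query: str) -> str:
    """Stand-in for an async query engine call."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return f"[chunk_size={name}] answer to: {query}"


def arg_pack(convert_fn: Callable[[Any], Any], **kwargs: Any) -> List[Any]:
    """Pack keyword inputs into a single list, converting each value."""
    return [convert_fn(value) for value in kwargs.values()]


async def fan_out(query: str, names: List[str]) -> List[Dict[str, str]]:
    # independent branches run concurrently; gather returns results in argument order
    answers = await asyncio.gather(*(fake_engine(n, query) for n in names))
    return arg_pack(lambda text: {"text": text}, **dict(zip(names, answers)))


# in a notebook with a running event loop, use `await fan_out(...)` instead
packed = asyncio.run(fan_out("What did the author do?", ["128", "256"]))
print(packed)
```

Because the branches only wait on I/O, running them concurrently takes roughly as long as the slowest branch rather than the sum of all branches.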

### Define Multiple Query Engines (One per Chunk Size)

In [None]:
from llama_index.query_pipeline import (
    QueryPipeline,
    InputComponent,
    ArgPackComponent,
)
from typing import Dict, Any, List, Optional
from llama_index.llama_pack.base import BaseLlamaPack
from llama_index.llms.llm import LLM
from llama_index.llms.openai import OpenAI
from llama_index import Document, VectorStoreIndex, ServiceContext
from llama_index.response_synthesizers import TreeSummarize
from llama_index.schema import NodeWithScore, TextNode
from llama_index.node_parser import SentenceSplitter


llm = OpenAI(model="gpt-3.5-turbo")
chunk_sizes = [128, 256, 512, 1024]
query_engines = {}
for chunk_size in chunk_sizes:
    splitter = SentenceSplitter(chunk_size=chunk_size, chunk_overlap=0)
    nodes = splitter.get_nodes_from_documents(docs)
    vector_index = VectorStoreIndex(nodes)
    query_engines[str(chunk_size)] = vector_index.as_query_engine()

### Construct a Query Pipeline

In [None]:
# construct query pipeline
p = QueryPipeline(verbose=True)
module_dict = {
    **query_engines,
    "input": InputComponent(),
    "summarizer": TreeSummarize(),
    "join": ArgPackComponent(
        convert_fn=lambda x: NodeWithScore(node=TextNode(text=str(x)))
    ),
}
p.add_modules(module_dict)
# add links from input to query engine (id'ed by chunk_size)
for chunk_size in chunk_sizes:
    p.add_link("input", str(chunk_size))
    p.add_link(str(chunk_size), "join", dest_key=str(chunk_size))
p.add_link("join", "summarizer", dest_key="nodes")
p.add_link("input", "summarizer", dest_key="query_str")

#### Visualize

In [None]:
## create graph
from pyvis.network import Network

net = Network(notebook=True, cdn_resources="in_line", directed=True)
net.from_nx(p.dag)
net.show("rag_dag.html")

rag_dag.html


### Run Pipeline

In [None]:
import time

start_time = time.time()
response = await p.arun(input="What did the author do during his time in YC?")
print(str(response))
end_time = time.time()
print(f"Time taken: {end_time - start_time}")

> Running modules and inputs in parallel:
Module key: input. Input:
input: What did the author do during his time in YC?


> Running modules and inputs in parallel:
Module key: 128. Input:
input: What did the author do during his time in YC?

Module key: 256. Input:
input: What did the author do during his time in YC?

Module key: 512. Input:
input: What did the author do during his time in YC?

Module key: 1024. Input:
input: What did the author do during his time in YC?


> Running modules and inputs in parallel:
Module key: join. Input:
128: The author spent his time in YC doing three main things: hacking, writing essays, and working on YC. However, as YC grew and became more exciting to the author, it started to take up more of his atten...
256: During his time in YC, the author worked on YC's internal software in Arc and also wrote essays.
512: During his time in YC, the author worked on various 

In [None]:
# compare with sync method

start_time = time.time()
response = p.run(input="What did the author do during his time in YC?")
print(str(response))
end_time = time.time()
print(f"Time taken: {end_time - start_time}")

> Running module input with input:
input: What did the author do during his time in YC?

> Running module 128 with input:
input: What did the author do during his time in YC?

> Running module 256 with input:
input: What did the author do during his time in YC?

> Running module 512 with input:
input: What did the author do during his time in YC?

> Running module 1024 with input:
input: What did the author do during his time in YC?

> Running module join with input:
128: The author spent his time in YC working on various tasks, including hacking, writing essays, and working on YC itself. However, as YC grew and became more exciting to the author, it started to take up...
256: During his time in YC, the author worked on YC's internal software in Arc and also wrote essays.
512: During his time in YC, the author worked on vari