# Lesson 3: Adding RAG

**Lesson objective**: Add a document database to a workflow

In this lab, you’ll parse a resume and load it into a vector store, and use the agent to run basic queries against the documents. You’ll use LlamaParse to parse the documents.

<div style="background-color:#fff1d7; padding:15px;"> <b> Note</b>: Make sure to run the notebook cell by cell. Please try to avoid running all cells at once.</div>

In [None]:
!pip install llama-index-core
!pip install llama-index-utils-workflow
!pip install llama-index-llms-openai
!pip install llama-parse
!pip install llama-index-embeddings-openai
!pip install llama-index-readers-whisper
!pip install gradio

Collecting llama-parse
  Downloading llama_parse-0.6.50-py3-none-any.whl.metadata (6.9 kB)
Collecting llama-cloud-services>=0.6.49 (from llama-parse)
  Downloading llama_cloud_services-0.6.51-py3-none-any.whl.metadata (3.5 kB)
Collecting llama-cloud==0.1.34 (from llama-cloud-services>=0.6.49->llama-parse)
  Downloading llama_cloud-0.1.34-py3-none-any.whl.metadata (1.2 kB)
Collecting python-dotenv<2.0.0,>=1.0.1 (from llama-cloud-services>=0.6.49->llama-parse)
  Downloading python_dotenv-1.1.1-py3-none-any.whl.metadata (24 kB)
Downloading llama_parse-0.6.50-py3-none-any.whl (4.9 kB)
Downloading llama_cloud_services-0.6.51-py3-none-any.whl (48 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m48.9/48.9 kB[0m [31m2.0 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading llama_cloud-0.1.34-py3-none-any.whl (289 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m289.9/289.9 kB[0m [31m8.7 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading python_dotenv-1.1.1-py3-none

Collecting llama-index-readers-whisper
  Downloading llama_index_readers_whisper-0.1.0-py3-none-any.whl.metadata (1.3 kB)
Downloading llama_index_readers_whisper-0.1.0-py3-none-any.whl (3.3 kB)
Installing collected packages: llama-index-readers-whisper
Successfully installed llama-index-readers-whisper-0.1.0




## Importing Libraries

In [None]:
from IPython.display import display, HTML
from llama_index.utils.workflow import draw_all_possible_flows
import os

You need nested async for this to work, so let's enable it here. It allows you to nest asyncio event loops within each other.

*Note:* In asynchronous programming, the event loop is like a continuous cycle that manages the execution of code.

In [None]:
import nest_asyncio
nest_asyncio.apply()

You also need two API keys:
- OpenAI like you used earlier;
- LlamaCloud API key to use LlamaParse to parse the PDFs. In this notebook, you are provided with such a key. For your personal project, you can get a key at cloud.llamaindex.ai for free.

LlamaParse is an advanced document parser that can read PDFs, Word files, Powerpoints, Excel spreadsheets, and extract information out of complicated PDFs into a form LLMs find easy to understand.

In [None]:
os.environ["OPENAI_API_KEY"] = "your_key"
os.environ["LLAMA_CLOUD_API_KEY"]="your_key"

In [None]:
def extract_html_content(filename):
    try:
        with open(filename, 'r') as file:
            html_content = file.read()
            html_content = f""" <div style="width: 100%; height: 800px; overflow: hidden;"> {html_content} </div>"""
            return html_content
    except Exception as e:
        raise Exception(f"Error reading file: {str(e)}")

## Performing Retrieval-Augmented Generation (RAG) on a Resume Document

### 1. Parsing the Resume Document

Let's start by parsing a resume.

<img width="400" src="images/parsing_res.png">

Using LLamaParse, you will transform the resume into a list of Document objects. By default, a Document object stores text along with some other attributes:
- metadata: a dictionary of annotations that can be appended to the text.
- relationships: a dictionary containing relationships to other Documents.
  

You can tell LlamaParse what kind of document it's parsing, so that it will parse the contents more intelligently. In this case, you tell it that it's reading a resume.

In [None]:
from llama_parse import LlamaParse

In [None]:
documents = LlamaParse(
    api_key=os.getenv("LLAMA_CLOUD_API_KEY"),
    base_url=os.getenv("LLAMA_CLOUD_BASE_URL"),
    result_type="markdown",
    content_guideline_instruction="This is a resume, gather related facts together and format it as bullet points with headers"
).load_data(
    "/content/fake_resume.pdf",
)

Started parsing the file under job_id 106d82f1-67c4-430b-95f0-d2a263019861


This gives you a list of Document objects you can feed to a VectorStoreIndex.

In [None]:
print(documents[2].text)

# Projects

# EcoTrack | GitHub

- Built full-stack application for tracking carbon footprint using React, Node.js, and MongoDB
- Implemented machine learning algorithm for providing personalized sustainability recommendations
- Featured in TechCrunch's "Top 10 Environmental Impact Apps of 2023"

# ChatFlow | Demo

- Developed real-time chat application using WebSocket protocol and React
- Implemented end-to-end encryption and message persistence
- Serves 5000+ monthly active users

# Certifications

- AWS Certified Solutions Architect (2023)
- Google Cloud Professional Developer (2022)
- MongoDB Certified Developer (2021)

# Languages

- English (Native)
- Mandarin Chinese (Fluent)
- Spanish (Intermediate)

# Interests

- Open source contribution
- Tech blogging (15K+ Medium followers)
- Hackathon mentoring
- Rock climbing


### 2. Creating a Vector Store Index


<img width="400" src="images/vector_store_index.png">

You'll now feed the Document objects to `VectorStoreIndex`. The `VectorStoreIndex` will use an embedding model to embed the text, i.e. turn it into vectors that you can search. You'll be using an embedding model provided by OpenAI, which is why we needed an OpenAI key.

The `VectorStoreIndex` will return an index object, which is a data structure that allows you to quickly retrieve relevant context for your query. It's the core foundation for RAG use-cases. You can use indexes to build Query Engines and Chat Engines which enables question & answer and chat over your data.


In [None]:
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import VectorStoreIndex

In [None]:
index = VectorStoreIndex.from_documents(
    documents,
    embed_model=OpenAIEmbedding(model_name="text-embedding-3-small",
                                api_key= os.getenv("OPENAI_API_KEY"))
)

### 3. Creating a Query Engine with the Index

With an index, you can create a query engine and ask questions. Let's try it out! Asking questions requires an LLM, so let's use OpenAI again.

In [None]:
from llama_index.llms.openai import OpenAI

In [None]:
llm = OpenAI(model="gpt-4o-mini")

In [None]:
query_engine = index.as_query_engine(llm=llm, similarity_top_k=5)
response = query_engine.query("What is this person's name and what was their most recent job?")
print(response)

The person's name is Sarah Chen, and their most recent job is Senior Full Stack Developer at TechFlow Solutions in San Francisco, CA.


### 4. Storing the Index to Disk

Indexes can be persisted to disk. This is useful in a notebook that you might run several times! In a production setting, you would probably use a hosted vector store of some kind. Let's save your index to disk.

In [None]:
storage_dir = "/content/storage"

index.storage_context.persist(persist_dir=storage_dir)

In [None]:
from llama_index.core import StorageContext, load_index_from_storage

You can check if your index has already been stored, and if it has, you can reload an index from disk using the `load_index_from_storage` method, like this:

In [None]:
# Check if the index is stored on disk
if os.path.exists(storage_dir):
    # Load the index from disk
    storage_context = StorageContext.from_defaults(persist_dir=storage_dir)
    restored_index = load_index_from_storage(storage_context)
else:
    print("Index not found on disk.")

Loading llama_index.core.storage.kvstore.simple_kvstore from /content/storage/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from /content/storage/index_store.json.


In [None]:
response = restored_index.as_query_engine().query("What is this person's name and what was their most recent job?")
print(response)

This person's name is Sarah Chen and their most recent job was as a Senior Full Stack Developer at TechFlow Solutions in San Francisco, CA.


Congratulations! You have performed retrieval augmented generation (RAG) on a resume document. With proper scaling, this technique can work across databases of thousands of documents!

## Making RAG Agentic

With a RAG pipeline in hand, let's turn it into a tool that can be used by an agent to answer questions. This is a stepping-stone towards creating an agentic system that can perform your larger goal.

In [None]:
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import FunctionCallingAgent

First, create a regular python function that performs a RAG query. It's important to give this function a descriptive name, to mark its input and output types, and to include a docstring (that's the thing in triple quotes) which describes what it does. The framework will give all this metadata to the LLM, which will use it to decide what a tool does and whether to use it.

In [None]:
def query_resume(q: str) -> str:
    """Answers questions about a specific resume."""
    # we're using the query engine we already created above
    response = query_engine.query(f"This is a question about the specific resume we have in our database: {q}")
    return response.response

The next step is to create the actual tool. There's a utility function, `FunctionTool.from_defaults`, to do this for you.

In [None]:
resume_tool = FunctionTool.from_defaults(fn=query_resume)

Now you can instantiate a `FunctionCallingAgent` using that tool. There are a number of different agent types supported by LlamaIndex; this one is particularly capable and efficient.

You pass it an array of tools (just one in this case), you give it the same LLM we instantiated earlier, and you set Verbose to true so you get a little more info on what your agent is up to.

In [None]:
agent = FunctionCallingAgent.from_tools(
    tools=[resume_tool],
    llm=llm,
    verbose=True
)


This implementation will be removed in a v0.13.0.

See the docs for more information on updated usage: https://docs.llamaindex.ai/en/stable/understanding/agent/)
  return cls(

This implementation will be removed in a v0.13.0.

See the docs for more information on updated agent usage: https://docs.llamaindex.ai/en/stable/understanding/agent/)
  return old_new1(cls, *args, **kwargs)


Now you can chat to the agent! Let's ask it a quick question about our applicant.

In [None]:
response = agent.chat("How many years of experience does the applicant have?")
print(response)

> Running step 09da25cd-4168-4c0a-899c-5acffa6244fd. Step input: How many years of experience does the applicant have?
Added user message to memory: How many years of experience does the applicant have?
=== Calling Function ===
Calling function: query_resume with args: {"q": "How many years of experience does the applicant have?"}
=== Function Output ===
The applicant has over 6 years of experience in web development.
> Running step f7cda5e8-0032-453e-88d2-815039f30da7. Step input: None
=== LLM Response ===
The applicant has over 6 years of experience in web development.
The applicant has over 6 years of experience in web development.


You can see the agent getting the question, adding it to its memory, picking a tool, calling it with appropriate arguments, and getting the output back.

## Wrapping the Agentic RAG into a Workflow

You've now got a RAG pipeline and an agent. Let's now create a similar agentic RAG from scratch using a workflow, which you'll extend in later lessons. You won't rely on any of the things you've already created.

Here's the workflow you will create:
<img width="400" src="images/rag_workflow.png">

It consists of two steps:
1. `set_up` which is triggered by `StartEvent` and emits `QueryEvent`: at this step, the RAG system is set up and the query is passed to the second step;
2. `ask_question` which is triggered by `QueryEvent` and emits `StopEvent`: here the response to the query is generated using the RAG query engine.

In [None]:
from llama_index.core.workflow import (
    StartEvent,
    StopEvent,
    Workflow,
    step,
    Event,
    Context
)

In [None]:
class QueryEvent(Event):
    query: str

In [None]:
class RAGWorkflow(Workflow):
    storage_dir = "/content/storage"
    llm: OpenAI
    query_engine: VectorStoreIndex

    # the first step will be setup
    @step
    async def set_up(self, ctx: Context, ev: StartEvent) -> QueryEvent:

        if not ev.resume_file:
            raise ValueError("No resume file provided")

        # define an LLM to work with
        self.llm = OpenAI(model="gpt-4o-mini")

        # ingest the data and set up the query engine
        if os.path.exists(self.storage_dir):
            # you've already ingested your documents
            storage_context = StorageContext.from_defaults(persist_dir=self.storage_dir)
            index = load_index_from_storage(storage_context)
        else:
            # parse and load your documents
            documents = LlamaParse(
                result_type="markdown",
                content_guideline_instruction="This is a resume, gather related facts together and format it as bullet points with headers"
            ).load_data(ev.resume_file)
            # embed and index the documents
            index = VectorStoreIndex.from_documents(
                documents,
                embed_model=OpenAIEmbedding(model_name="text-embedding-3-small")
            )
            index.storage_context.persist(persist_dir=self.storage_dir)

        # either way, create a query engine
        self.query_engine = index.as_query_engine(llm=self.llm, similarity_top_k=5)

        # now fire off a query event to trigger the next step
        return QueryEvent(query=ev.query)

    # the second step will be to ask a question and return a result immediately
    @step
    async def ask_question(self, ctx: Context, ev: QueryEvent) -> StopEvent:
        response = self.query_engine.query(f"This is a question about the specific resume we have in our database: {ev.query}")
        return StopEvent(result=response.response)

You run it like before, giving it a fake resume we created for you.

In [None]:
w = RAGWorkflow(timeout=120, verbose=False)
result = await w.run(
    resume_file="/content/fake_resume.pdf",
    query="Where is the first place the applicant worked?"
)
print(result)

Loading llama_index.core.storage.kvstore.simple_kvstore from /content/storage/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from /content/storage/index_store.json.
The first place the applicant worked is StartupHub in San Jose, CA.


There's nothing in this workflow you haven't done before, it's just making things neat and encapsulated.

If you're particularly suspicious, you might notice there's a small bug here: if you run this a second time, with a new resume, this code will find the old resume and not bother to parse it. You don't need to fix that now, but think about how you might fix that.

## Workflow Visualization

You can visualize the workflow you just created.

In [None]:
WORKFLOW_FILE = "/content/workflows/rag_workflow.html"
draw_all_possible_flows(w, filename=WORKFLOW_FILE)
html_content = extract_html_content(WORKFLOW_FILE)
display(HTML(html_content), metadata=dict(isolated=True))

/content/workflows/rag_workflow.html


## Congratulations!

You've successfully created an agent with RAG tools. In the next lesson, you'll give your agent more complicated tasks.

# **FORM PARSING**

In [None]:
class ParseFormEvent(Event):
    application_form: str

class QueryEvent(Event):
    query: str
    field: str

# new!
class ResponseEvent(Event):
    response: str

In [None]:
import json

In [None]:
class RAGWorkflow(Workflow):

    storage_dir = "/content/storage"
    llm: OpenAI
    query_engine: VectorStoreIndex

    @step
    async def set_up(self, ctx: Context, ev: StartEvent) -> ParseFormEvent:

        if not ev.resume_file:
            raise ValueError("No resume file provided")

        if not ev.application_form:
            raise ValueError("No application form provided")

        # define the LLM to work with
        self.llm = OpenAI(model="gpt-4o-mini")

        # ingest the data and set up the query engine
        if os.path.exists(self.storage_dir):
            # you've already ingested the resume document
            storage_context = StorageContext.from_defaults(persist_dir=self.storage_dir)
            index = load_index_from_storage(storage_context)
        else:
            # parse and load the resume document
            documents = LlamaParse(
                api_key=os.getenv("LLAMA_CLOUD_API_KEY"),
                base_url=os.getenv("LLAMA_CLOUD_BASE_URL"),
                result_type="markdown",
                content_guideline_instruction="This is a resume, gather related facts together and format it as bullet points with headers"
            ).load_data(ev.resume_file)
            # embed and index the documents
            index = VectorStoreIndex.from_documents(
                documents,
                embed_model=OpenAIEmbedding(model_name="text-embedding-3-small")
            )
            index.storage_context.persist(persist_dir=self.storage_dir)

        # create a query engine
        self.query_engine = index.as_query_engine(llm=self.llm, similarity_top_k=5)

        # you no longer need a query to be passed in,
        # you'll be generating the queries instead
        # let's pass the application form to a new step to parse it
        return ParseFormEvent(application_form=ev.application_form)

    @step
    async def parse_form(self, ctx: Context, ev: ParseFormEvent) -> QueryEvent:
        print("Parsing the form...")
        parser = LlamaParse(
            api_key=os.getenv("LLAMA_CLOUD_API_KEY"),
            base_url=os.getenv("LLAMA_CLOUD_BASE_URL"),
            result_type="markdown",
            content_guideline_instruction="This is a job application form. Create a list of all the fields that need to be filled in.",
            formatting_instruction="Return a bulleted list of the fields ONLY."
        )

        # get the LLM to convert the parsed form into JSON
        result = parser.load_data(ev.application_form)[0]
        raw_json = self.llm.complete(
            f"""
            This is a parsed form.
            Convert it into a JSON object containing only the list
            of fields to be filled in, in the form {{ fields: [...] }}.
            <form>{result.text}</form>.
            Return JSON ONLY, no markdown.
            """)
        fields = json.loads(raw_json.text)["fields"]
        print(f"Fields: {fields}")
        # new!
        # generate one query for each of the fields, and fire them off
        for field in fields:
            ctx.send_event(QueryEvent(
                field=field,
                query=f"How would you answer this question about the candidate? {field}"
            ))

        # store the number of fields so we know how many to wait for later
        await ctx.set("total_fields", len(fields))
        return

    @step
    async def ask_question(self, ctx: Context, ev: QueryEvent) -> ResponseEvent:
        response = self.query_engine.query(f"This is a question about the specific resume we have in our database: {ev.query}")
        return ResponseEvent(field=ev.field, response=response.response)

    # new!
    @step
    async def fill_in_application(self, ctx: Context, ev: ResponseEvent) -> StopEvent:
        # get the total number of fields to wait for
        total_fields = await ctx.get("total_fields")

        responses = ctx.collect_events(ev, [ResponseEvent] * total_fields)
        if responses is None:
            return None # do nothing if there's nothing to do yet
        print(f"Responses: {responses}")
        # we've got all the responses!
        responseList = "\n".join("Field: " + r.field + "\n" + "Response: " + r.response for r in responses)
        print(f"Response list: {responseList}")
        result = self.llm.complete(f"""
            You are given a list of fields in an application form and responses to
            questions about those fields from a resume. Combine the two into a list of
            fields and fill in the details for me.

            <responses>
            {responseList}
            </responses>
        """)
        return StopEvent(result=result)

In [None]:
w = RAGWorkflow(timeout=120, verbose=False)
result = await w.run(
    resume_file="/content/fake_resume.pdf",
    application_form="/content/fake_application_form.pdf"
)
print(result)

Loading llama_index.core.storage.kvstore.simple_kvstore from /content/storage/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from /content/storage/index_store.json.
Parsing the form...
Started parsing the file under job_id 29c87be8-c0fa-4027-b26d-f84fdb59f165
Fields: ['First Name', 'Last Name', 'Email', 'Phone', 'Linkedin', 'Project Portfolio', 'Degree', 'Graduation Date', 'Current Job Title', 'Current Employer', 'Technical Skills', 'Describe why you’re a good fit for this position', 'Do you have 5 years of experience in React?']


  await ctx.set("total_fields", len(fields))
  total_fields = await ctx.get("total_fields")


Responses: [ResponseEvent(response='Sarah'), ResponseEvent(response='Chen'), ResponseEvent(response='sarah.chen@email.com'), ResponseEvent(response="The candidate's phone number is not provided in the information available."), ResponseEvent(response="The candidate's LinkedIn profile can be found at linkedin.com/in/sarahchen."), ResponseEvent(response='The candidate has a project portfolio that includes notable projects such as EcoTrack and ChatFlow. EcoTrack is a full-stack application designed for tracking carbon footprints, utilizing React, Node.js, and MongoDB, and features a machine learning algorithm for personalized sustainability recommendations. It was recognized in TechCrunch\'s "Top 10 Environmental Impact Apps of 2023." ChatFlow is a real-time chat application developed with the WebSocket protocol and React, featuring end-to-end encryption and message persistence, serving over 5000 monthly active users. These projects highlight the candidate\'s skills in full-stack developme

In [None]:
WORKFLOW_FILE = "/content/workflows/FormParser_workflow.html"
draw_all_possible_flows(w, filename=WORKFLOW_FILE)
html_content = extract_html_content(WORKFLOW_FILE)
# display(HTML(html_content), metadata=dict(isolated=True))

/content/workflows/FormParser_workflow.html


# **Human In The Loop** (Event Driven)

In [63]:
from llama_index.core.workflow import InputRequiredEvent, HumanResponseEvent

In [64]:
class ParseFormEvent(Event):
    application_form: str

class QueryEvent(Event):
    query: str
    field: str

class ResponseEvent(Event):
    response: str

# class HumanResponseEvent(Event):
#     response: str

# class InputRequiredEvent(Event):
#     prefix: str
#     result: str

class FeedbackEvent(Event):
    feedback: str

class GenerateQuestionsEvent(Event):
    pass

In [65]:
class RAGWorkflow(Workflow):

    storage_dir = "/content/storage"
    llm: OpenAI
    query_engine: VectorStoreIndex

    @step
    async def set_up(self, ctx: Context, ev: StartEvent) -> ParseFormEvent:

        if not ev.resume_file:
            raise ValueError("No resume file provided")

        if not ev.application_form:
            raise ValueError("No application form provided")

        # define the LLM to work with
        self.llm = OpenAI(model="gpt-4o-mini")

        # ingest the data and set up the query engine
        if os.path.exists(self.storage_dir):
            # you've already ingested the resume document
            storage_context = StorageContext.from_defaults(persist_dir=
                                                           self.storage_dir)
            index = load_index_from_storage(storage_context)
        else:
            # parse and load the resume document
            documents = LlamaParse(
                api_key=os.getenv("LLAMA_CLOUD_API_KEY"),
                base_url=os.getenv("LLAMA_CLOUD_BASE_URL"),
                result_type="markdown",
                content_guideline_instruction="This is a resume, gather related facts together and format it as bullet points with headers"
            ).load_data(ev.resume_file)
            # embed and index the documents
            index = VectorStoreIndex.from_documents(
                documents,
                embed_model=OpenAIEmbedding(model_name="text-embedding-3-small")
            )
            index.storage_context.persist(persist_dir=self.storage_dir)

        # create a query engine
        self.query_engine = index.as_query_engine(llm=self.llm, similarity_top_k=5)

        # let's pass the application form to a new step to parse it
        return ParseFormEvent(application_form=ev.application_form)

    # form parsing
    @step
    async def parse_form(self, ctx: Context, ev: ParseFormEvent) -> GenerateQuestionsEvent:
        parser = LlamaParse(
            api_key=os.getenv("LLAMA_CLOUD_API_KEY"),
            base_url=os.getenv("LLAMA_CLOUD_BASE_URL"),
            result_type="markdown",
            content_guideline_instruction="This is a job application form. Create a list of all the fields that need to be filled in.",
            formatting_instruction="Return a bulleted list of the fields ONLY."
        )

        # get the LLM to convert the parsed form into JSON
        result = parser.load_data(ev.application_form)[0]
        raw_json = self.llm.complete(
            f"This is a parsed form. Convert it into a JSON object containing only the list of fields to be filled in, in the form {{ fields: [...] }}. <form>{result.text}</form>. Return JSON ONLY, no markdown.")
        fields = json.loads(raw_json.text)["fields"]

        await ctx.set("fields_to_fill", fields)

        return GenerateQuestionsEvent()

    # generate questions
    @step
    async def generate_questions(self, ctx: Context, ev: GenerateQuestionsEvent | FeedbackEvent) -> QueryEvent:

        # get the list of fields to fill in
        fields = await ctx.get("fields_to_fill")

        # generate one query for each of the fields, and fire them off
        for field in fields:
            question = f"How would you answer this question about the candidate? <field>{field}</field>"

            # new! Is there feedback? If so, add it to the query:
            if hasattr(ev,"feedback"):
                question += f"""
                    \nWe previously got feedback about how we answered the questions.
                    It might not be relevant to this particular field, but here it is:
                    <feedback>{ev.feedback}</feedback>
                """

            ctx.send_event(QueryEvent(
                field=field,
                query=question
            ))

        # store the number of fields so we know how many to wait for later
        await ctx.set("total_fields", len(fields))
        return

    @step
    async def ask_question(self, ctx: Context, ev: QueryEvent) -> ResponseEvent:
        response = self.query_engine.query(f"This is a question about the specific resume we have in our database: {ev.query}")
        return ResponseEvent(field=ev.field, response=response.response)


    # Get feedback from the human
    @step
    async def fill_in_application(self, ctx: Context, ev: ResponseEvent) -> InputRequiredEvent:
        # get the total number of fields to wait for
        total_fields = await ctx.get("total_fields")

        responses = ctx.collect_events(ev, [ResponseEvent] * total_fields)
        if responses is None:
            return None # do nothing if there's nothing to do yet

        # we've got all the responses!
        responseList = "\n".join("Field: " + r.field + "\n" + "Response: " + r.response for r in responses)

        result = self.llm.complete(f"""
            You are given a list of fields in an application form and responses to
            questions about those fields from a resume. Combine the two into a list of
            fields and succinct, factual answers to fill in those fields.

            <responses>
            {responseList}
            </responses>
        """)

        # save the result for later
        await ctx.set("filled_form", str(result))

        # Fire off the feedback request
        return InputRequiredEvent(
            prefix="How does this look? Give me any feedback you have on any of the answers.",
            result=result
        )

    # Accept the feedback when a HumanResponseEvent fires
    @step
    async def get_feedback(self, ctx: Context, ev: HumanResponseEvent) -> FeedbackEvent | StopEvent:

        result = self.llm.complete(f"""
            You have received some human feedback on the form-filling task you've done.
            Does everything look good, or is there more work to be done?
            <feedback>
            {ev.response}
            </feedback>
            If everything is fine, respond with just the word 'OKAY'.
            If there's any other feedback, respond with just the word 'FEEDBACK'.
        """)

        verdict = result.text.strip()

        print(f"LLM says the verdict was {verdict}")
        if (verdict == "OKAY"):
            return StopEvent(result=await ctx.get("filled_form"))
        else:
            return FeedbackEvent(feedback=ev.response)


In [66]:
w = RAGWorkflow(timeout=600, verbose=False)
handler = w.run(
    resume_file="/content/fake_resume.pdf",
    application_form="/content/fake_application_form.pdf"
)

async for event in handler.stream_events():
    if isinstance(event, InputRequiredEvent):
        print("We've filled in your form! Here are the results:\n")
        print(event.result)
        # now ask for input from the keyboard
        response = input(event.prefix)
        handler.ctx.send_event(
            HumanResponseEvent(
                response=response
            )
        )

response = await handler
print("Agent complete! Here's your final result:")
print(str(response))

Loading llama_index.core.storage.kvstore.simple_kvstore from /content/storage/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from /content/storage/index_store.json.
Started parsing the file under job_id ffb7efb7-00ed-4b53-88a2-709e77e45dfd


  await ctx.set("fields_to_fill", fields)
  fields = await ctx.get("fields_to_fill")
  await ctx.set("total_fields", len(fields))
  total_fields = await ctx.get("total_fields")
  await ctx.set("filled_form", str(result))


We've filled in your form! Here are the results:

Here is the combined list of fields and succinct, factual answers:

1. **First Name**: Sarah
2. **Last Name**: Chen
3. **Email**: sarah.chen@email.com
4. **Phone**: Not provided
5. **LinkedIn**: linkedin.com/in/sarahchen
6. **Project Portfolio**: Notable projects include EcoTrack, a full-stack application for tracking carbon footprints recognized in TechCrunch's "Top 10 Environmental Impact Apps of 2023," and ChatFlow, a real-time chat application with end-to-end encryption serving over 5000 monthly active users.
7. **Degree**: Bachelor of Science in Computer Science
8. **Graduation Date**: 2017
9. **Current Job Title**: Senior Full Stack Developer
10. **Current Employer**: TechFlow Solutions
11. **Technical Skills**: Proficient in frontend technologies (React.js, Redux, Next.js, TypeScript, Vue.js, HTML5, CSS3, SASS/SCSS) and backend technologies (Node.js, Express.js, Python, Django). Experienced with GraphQL, REST APIs, PostgreSQL, an

  return StopEvent(result=await ctx.get("filled_form"))


In [None]:
WORKFLOW_FILE = "/content/workflows/FormParser_Human_in_loop_workflow.html"
draw_all_possible_flows(w, filename=WORKFLOW_FILE)
# html_content = extract_html_content(WORKFLOW_FILE)

/content/workflows/FormParser_Human_in_loop_workflow.html
