In [1]:
! python --version

Python 3.12.2


## Resume Question Answering with LlamaIndex

In this demo, we explore answering complex queries by **decomposing them into simpler sub-queries**. 

### <span style="color:orange">Note: We will be using a commercial LLM for this particular notebook.</span>

**Simple questions** can be answered by our local LLM. More **complex questions** need a more capable LLM, such as **Claude 3**, which is available through Amazon Bedrock. Complex questions are those that require answers from multiple indexes or from multiple disparate parts of the same source document.

Please use a capable LLM that you have access to, such as those offered through Bedrock, OpenAI, Meta, or Anthropic.

We will be asking simple as well as complex questions about one or more resumes that are in the dataset. Simple questions are those that can be answered by a single resume, while complex questions require information from multiple resumes.

## Install the required packages
- `%%capture` is used to suppress the output of the installation commands.

In [2]:
%%capture
%pip install llama-index-readers-file pymupdf
%pip install llama-index-vector-stores-postgres
%pip install llama-index-embeddings-huggingface
%pip install llama-index-llms-bedrock
%pip install llama-index-llms-lmstudio
%pip install llama-index-llms-anthropic
%pip install llama-index-embeddings-bedrock
%pip install psycopg2-binary
%pip install ipywidgets
%pip install SQLAlchemy
%pip install python-dotenv

In [3]:
import nest_asyncio

nest_asyncio.apply()

## Import the required libraries
- The `load_dotenv` function loads environment variables from the `.env` file. We use it to supply the credentials needed to access the more capable generator model in Bedrock.
- This notebook is based on this [notebook from LlamaIndex - Sub Question querying](https://docs.llamaindex.ai/en/stable/examples/usecases/10k_sub_question/).

In [4]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine
from dotenv import load_dotenv
load_dotenv(verbose=True, dotenv_path=".env")

True

## Setup the Retriever and Generator models
- Pass `mode="local"` to the `setup_models` function to use our local LM Studio models.
- Pass `mode="remote"` to the `setup_models` function to use AWS Bedrock.
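
The remote branch of the setup cell reads its AWS settings directly from `os.environ`, so a missing variable surfaces as a `KeyError` in the middle of the setup. As a minimal sketch, a check like the one below could be run first. The variable names are the ones looked up in the setup cell; the helpers `missing_env_vars` and `choose_mode` are hypothetical and not part of the original notebook.

```python
import os

# Names taken from the os.environ lookups in this notebook's setup cell.
REQUIRED_REMOTE_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_DEFAULT_REGION",
)


def missing_env_vars(names=REQUIRED_REMOTE_VARS, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


def choose_mode(preferred="remote", env=None):
    """Hypothetical helper: fall back to "local" when remote credentials are incomplete."""
    if preferred == "remote" and missing_env_vars(env=env):
        return "local"
    return preferred


print(f"Missing variables: {missing_env_vars()}")
print(f"Mode to use: {choose_mode('remote')}")
```

Whether to fall back silently or stop with an error is a design choice; for a demo notebook, printing the missing names is usually enough.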

In [5]:
from llama_index.core import Settings
from llama_index.llms.bedrock import Bedrock
from llama_index.llms.lmstudio import LMStudio
from llama_index.embeddings.bedrock import BedrockEmbedding
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.anthropic import Anthropic
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler, CBEventType
import os

token_counter = TokenCountingHandler(tokenizer=Anthropic().tokenizer.encode)

def setup_models(mode="local"):
    Settings.callback_manager = CallbackManager([token_counter])
    if mode == "local":
        # Setup Retriever model
        embedding_model = "BAAI/bge-large-en-v1.5" # "BAAI/bge-base-en-v1.5"
        print(f"Setting up local Retriever model (embedding: {embedding_model})...")
        Settings.embed_model = HuggingFaceEmbedding(model_name=embedding_model)

        Settings.chunk_size = 512
        Settings.chunk_overlap = 20
        
        # Setup Generator model
        llm_model = "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF"
        print(f"Setting up local Generator model (main LLM: {llm_model})...")
        Settings.llm = LMStudio(
            model_name=llm_model,
            base_url="http://localhost:1234/v1",
            temperature=0,
            context_window=8192,
            request_timeout=240,
        )
    elif mode == "remote":
        # Setup Retriever model
        embedding_model = "cohere.embed-multilingual-v3"
        print(f"Setting up remote Retriever model (embedding: {embedding_model})...")
        Settings.embed_model = BedrockEmbedding(
            model_name=embedding_model,
            region_name=os.environ["AWS_DEFAULT_REGION"],
        )
        Settings.chunk_size = 1024
        Settings.chunk_overlap = 20
                
        # Setup Generator model
        llm_model = "anthropic.claude-3-sonnet-20240229-v1:0"
        print(f"Setting up remote Generator model (main LLM: {llm_model})...")
        Settings.llm = Bedrock(
            model=llm_model,
            aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
            aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
            aws_session_token=os.environ["AWS_SESSION_TOKEN"],
            region_name=os.environ["AWS_DEFAULT_REGION"],
            context_window=8192,
            request_timeout=120,
        )
    else:
        raise ValueError(f"Unknown mode: {mode}")
    
llm_mode = "remote" # <=== Change this to "local" or "remote"
setup_models(mode=llm_mode) 

text_embedding = Settings.embed_model.get_text_embedding("Once upon a time, there was a cat.")
print(text_embedding[:5])
print(f"Embedding length: {len(text_embedding)}")
vector_size = len(text_embedding)


Setting up remote Retriever model (embedding: cohere.embed-multilingual-v3)...
Setting up remote Generator model (main LLM: anthropic.claude-3-sonnet-20240229-v1:0)...
[-0.043518066, -0.010955811, -0.00032567978, 0.0057792664, -0.016540527]
Embedding length: 1024


## Set up the pgvector database in PostgreSQL
- In the code below, we drop the database every time, just to ensure that we are starting from scratch. This is not recommended in production.

In [6]:
import psycopg2
import nest_asyncio

try:
    pg_pw = "mysecretpassword"
    pg_db = "vector_store"
    connection_string = f"postgresql://postgres:{pg_pw}@localhost:5432"
    db_name = pg_db
    conn = psycopg2.connect(connection_string)
    conn.autocommit = True

    with conn.cursor() as c:
        # IF EXISTS keeps the first run from failing when the database does not exist yet
        c.execute(f"DROP DATABASE IF EXISTS {db_name} WITH (FORCE);")
        c.execute(f"CREATE DATABASE {db_name};")

    conn.commit()
    conn.close()
    
    nest_asyncio.apply()
    
except Exception as e:
    print(e)
    

In [7]:
from IPython.display import Markdown
from llama_index.core import SimpleDirectoryReader
from sqlalchemy import make_url
from llama_index.core import VectorStoreIndex
from llama_index.core import StorageContext
from llama_index.vector_stores.postgres import PGVectorStore

BASE_DIR = "./data/resume/"

def simple_RAG(vector_size):
    """
    Simple Retrieval Augmented Generation (RAG) using Llama Index.
    """
    url = make_url(connection_string)
    print(f"Url {url}")
    
    vector_store = PGVectorStore.from_params(
        database=db_name,
        host=url.host,
        password=url.password,
        port=url.port,
        user=url.username,
        table_name="basic_rag",
        embed_dim=vector_size
    )

    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    
    nodes = ingest_directory(BASE_DIR)
    
    print(f"Number of nodes: {len(nodes)}")

    index = VectorStoreIndex.from_documents(nodes, storage_context=storage_context, show_progress=True)
    return index

def advanced_RAG(vector_size, input_file):
    """
    Build a separate vector index for a single document, so that each resume can be queried on its own.
    """

    print(f"Ingesting document: {input_file}...")
    url = make_url(connection_string)
    # print(f"Url {url}")
    
    vector_store = PGVectorStore.from_params(
        database=db_name,
        host=url.host,
        password=url.password,
        port=url.port,
        user=url.username,
        table_name="advanced_rag",
        embed_dim=vector_size
    )

    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    
    nodes = ingest_document(input_file)
    
    print(f"Number of nodes: {len(nodes)}")

    index = VectorStoreIndex.from_documents(nodes, storage_context=storage_context, show_progress=True)
    return index

def ingest_document(input_file):
    """
    Ingest a document into the vector store. 
    """
    reader = SimpleDirectoryReader(input_files=[input_file])
    return reader.load_data(show_progress=True)

def ingest_directory(directory):
    """
    Ingest documents from a directory into the vector store. 
    """
    reader = SimpleDirectoryReader(input_dir=directory)
    return reader.load_data(show_progress=True)

def markdown(question, response, display_nodes=False):
    """
    Display a question and response in markdown format.
    """
    nodes = []
    if display_nodes:
        nodes = response.source_nodes
    return Markdown(
f"""
## Question:
{question}

## Answer:
{response.response}

## Num Nodes:
{len(nodes)}
""")
    
def calculate_call_cost(token_counter):
    """
    Calculate the cost of the call based on the number of tokens used.
    """
    # Cost (for Claude 3 Sonnet)
    usd_cost_per_k_input_tokens = 0.003
    usd_cost_per_k_output_tokens = 0.015
    
    # Cost (for Claude 3 Haiku)
    # usd_cost_per_k_input_tokens = 0.00025
    # usd_cost_per_k_output_tokens = 0.00125
    
    # Cost for Cohere Multilingual Embedding
    usd_cost_per_k_tokens = 0.0001
    
    if llm_mode == "local":
        cost_input_tokens = 0
        cost_output_tokens = 0
        cost_embedding_tokens = 0
        print(f"LLM Model     : {llm_mode} model ==> {Settings.llm.model_name}")
    else:
        cost_input_tokens = (token_counter.prompt_llm_token_count * usd_cost_per_k_input_tokens) / 1000
        cost_output_tokens = (token_counter.completion_llm_token_count * usd_cost_per_k_output_tokens) / 1000
        cost_embedding_tokens = (token_counter.total_embedding_token_count * usd_cost_per_k_tokens) / 1000
        print(f"LLM Model     : {llm_mode} model ==> {Settings.llm.model}")
    
    
    print(f"Input tokens  : {token_counter.prompt_llm_token_count}, cost: USD {round(cost_input_tokens, 6)}")
    print(f"Output tokens : {token_counter.completion_llm_token_count}, cost: USD {round(cost_output_tokens, 6)}")
    print(f"Embedding tokens : {token_counter.total_embedding_token_count}, cost: USD {round(cost_embedding_tokens, 6)}")
    print(f"Total cost    : USD {round(cost_input_tokens + cost_output_tokens + cost_embedding_tokens, 6)} for this call.")
    token_counter.reset_counts()

## Using naive LlamaIndex RAG
- We build a single index over all the resumes in the directory and set up the query engine with `similarity_top_k=3`.
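
To see what a top-`k` setting means for a single shared index, here is a toy sketch of top-k retrieval in plain Python. It assumes cosine similarity as the ranking measure and uses made-up three-dimensional vectors and chunk labels; it does not call LlamaIndex or the vector store used in this notebook.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=3):
    """Return the labels of the k chunks most similar to the query.

    chunks is a list of (label, vector) pairs.
    """
    scored = [(cosine_similarity(query_vec, vec), label) for label, vec in chunks]
    scored.sort(reverse=True)
    return [label for _, label in scored[:k]]


# Made-up embeddings: three chunks from one resume sit close together.
toy_chunks = [
    ("jane-smith: summary", [0.9, 0.1, 0.0]),
    ("jane-smith: projects", [0.8, 0.2, 0.1]),
    ("jane-smith: experience", [0.85, 0.15, 0.05]),
    ("john-doe: summary", [0.1, 0.9, 0.0]),
    ("emma-roberts: summary", [0.0, 0.2, 0.9]),
]

result = top_k([1.0, 0.0, 0.0], toy_chunks, k=3)
print(result)
```

In this toy data all three retrieved chunks come from the same resume, so the generator never sees the other candidates. With one shared index and a small `k`, a question about "all the applicants" can run into the same limitation.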

In [8]:
index = simple_RAG(vector_size=vector_size)
query_engine = index.as_query_engine(similarity_top_k=3, verbose=True)

Url postgresql://postgres:***@localhost:5432


Loading files: 100%|██████████| 5/5 [00:00<00:00, 226.47file/s]

Number of nodes: 83





Parsing nodes:   0%|          | 0/83 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/83 [00:00<?, ?it/s]

In [9]:
question = "Give me the names of all the applicants and their summaries."
response = query_engine.query(question)

display(markdown(question, response, display_nodes=True))

calculate_call_cost(token_counter)


## Question:
Give me the names of all the applicants and their summaries.

## Answer:
The context information provided appears to be a resume or portfolio for a person named Jane Smith. It details her professional experience as a Junior Full Stack Developer and Full Stack Developer, along with key projects she has worked on. The summary section highlights her overall experience and skills as a Full Stack Developer with expertise in various technologies and methodologies.

Unfortunately, the context does not contain information about multiple applicants or their summaries. It focuses solely on the background and qualifications of Jane Smith.

## Num Nodes:
3


LLM Model     : remote model ==> anthropic.claude-3-sonnet-20240229-v1:0
Input tokens  : 1519, cost: USD 0.004557
Output tokens : 192, cost: USD 0.00288
Embedding tokens : 9807, cost: USD 0.000981
Total cost    : USD 0.008418 for this call.
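
The reported total can be reproduced by hand. The sketch below repeats the arithmetic of `calculate_call_cost` for the token counts printed above, using the Claude 3 Sonnet and Cohere embedding rates hard-coded in that function. The helper name `call_cost_usd` is ours, for illustration only.

```python
def call_cost_usd(input_tokens, output_tokens, embedding_tokens,
                  input_rate=0.003, output_rate=0.015, embedding_rate=0.0001):
    """Total cost in USD; rates are USD per 1,000 tokens, as in calculate_call_cost."""
    return (
        input_tokens * input_rate
        + output_tokens * output_rate
        + embedding_tokens * embedding_rate
    ) / 1000


# Token counts copied from the output of the call above.
total = round(call_cost_usd(1519, 192, 9807), 6)
print(f"Total cost: USD {total}")
```

Note that the embedding tokens for this first call include the one-off cost of indexing the documents, which is why later calls report far fewer embedding tokens.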


In [10]:
question = "Tell me more about the candidate Jane Smith."
response = query_engine.query(question)

display(markdown(question, response, display_nodes=True))
calculate_call_cost(token_counter)


## Question:
Tell me more about the candidate Jane Smith.

## Answer:
Jane Smith is an experienced and versatile Full Stack Developer with over 7 years of expertise. She is proficient in both frontend and backend technologies, including JavaScript, Python, React, Node.js, and Django. Jane has a strong background in database management, cloud services, and DevOps practices. She is adept at working in Agile environments, leading development teams, and delivering high-quality, scalable software solutions.

Jane is passionate about continuous learning and applying new technologies to solve complex problems. In her free time, she actively contributes to open-source projects on GitHub, collaborating with developers worldwide to improve and maintain software. Jane also enjoys tech blogging, where she writes technical articles and tutorials on full stack development, sharing her knowledge and insights with the developer community.

In addition to her professional pursuits, Jane has a keen interest in travel, using her experiences to gain new perspectives and inspiration for creative solutions. She is also an avid cyclist, participating in local cycling events and exploring new trails and routes.

Jane's professional experience includes working as a Full Stack Developer at Innovative Solutions Ltd., where she developed and maintained web applications using Django for the backend and Angular for the frontend, enhancing performance and user experience. She designed and implemented RESTful APIs, optimized database queries, and automated deployment processes using Docker and Kubernetes. Jane led the development of an Inventory Management System and a Customer Relationship Management (CRM) Tool, demonstrating her ability to deliver high-quality software solutions.

## Num Nodes:
3


LLM Model     : remote model ==> anthropic.claude-3-sonnet-20240229-v1:0
Input tokens  : 1326, cost: USD 0.003978
Output tokens : 614, cost: USD 0.00921
Embedding tokens : 9, cost: USD 1e-06
Total cost    : USD 0.013189 for this call.


## Using Sub Question decomposition method
- Given the same question, `Give me the names of all the applicants and their summaries.`, the Sub Question method produces a clearer and more complete answer than the naive RAG method. Also notice the sub-questions generated by the system.
- Below, we are building multiple indices, one for each candidate, so that they can be compared and contrasted more effectively.
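
Conceptually, the sub-question approach splits the user question into one or more questions per tool, answers each against its own index, and then combines the partial answers. The toy sketch below shows only that control flow. The lambdas stand in for per-resume query engines and the two helper functions stand in for LLM calls; all names and canned answers are hypothetical, and none of this is the LlamaIndex implementation.

```python
def answer_with_sub_questions(question, tools, make_sub_questions, synthesize):
    """Decompose the question, answer each part with its tool, then combine the parts.

    tools maps a tool name to a callable that answers a question string.
    """
    sub_questions = make_sub_questions(question, list(tools))
    sub_answers = [
        (tool_name, sub_q, tools[tool_name](sub_q))
        for tool_name, sub_q in sub_questions
    ]
    return synthesize(question, sub_answers)


# Toy stand-ins for per-resume query engines (canned answers, no retrieval).
toy_tools = {
    "Jane Smith resume": lambda q: "Full Stack Developer, 7+ years",
    "John Doe resume": lambda q: "Frontend Developer, 1.5 years",
}


def one_question_per_tool(question, tool_names):
    """Stand-in for the LLM step that generates sub-questions."""
    return [(name, f"{question} (answer from: {name})") for name in tool_names]


def join_answers(question, sub_answers):
    """Stand-in for the LLM step that synthesizes the final response."""
    return "\n".join(f"{tool}: {answer}" for tool, _, answer in sub_answers)


result = answer_with_sub_questions(
    "Summarise each applicant.", toy_tools, one_question_per_tool, join_answers
)
print(result)
```

Because every tool is consulted, each candidate contributes to the final answer, at the price of more LLM calls per question.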

In [11]:
indices = []
candidates = []
for file in os.listdir(BASE_DIR):
    print(f"Processing {file}...")
    index = advanced_RAG(vector_size=vector_size, input_file=os.path.join(BASE_DIR, file))
    indices.append(index)
    candidates.append(file.replace(".md", "").replace("-", " ").replace("_", " ").title())
print(f"Candidates: {candidates}")

Processing emma-roberts.md...
Ingesting document: ./data/resume/emma-roberts.md...


Loading files: 100%|██████████| 1/1 [00:00<00:00, 647.27file/s]

Number of nodes: 16





Parsing nodes:   0%|          | 0/16 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/16 [00:00<?, ?it/s]

Processing jane-smith.md...
Ingesting document: ./data/resume/jane-smith.md...


Loading files: 100%|██████████| 1/1 [00:00<00:00, 530.79file/s]

Number of nodes: 17





Parsing nodes:   0%|          | 0/17 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/17 [00:00<?, ?it/s]

Processing michael-johnson.md...
Ingesting document: ./data/resume/michael-johnson.md...


Loading files: 100%|██████████| 1/1 [00:00<00:00, 552.75file/s]

Number of nodes: 18





Parsing nodes:   0%|          | 0/18 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/18 [00:00<?, ?it/s]

Processing john-doe.md...
Ingesting document: ./data/resume/john-doe.md...


Loading files: 100%|██████████| 1/1 [00:00<00:00, 491.94file/s]

Number of nodes: 16





Parsing nodes:   0%|          | 0/16 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/16 [00:00<?, ?it/s]

Processing alex-thompson.md...
Ingesting document: ./data/resume/alex-thompson.md...


Loading files: 100%|██████████| 1/1 [00:00<00:00, 556.42file/s]

Number of nodes: 16





Parsing nodes:   0%|          | 0/16 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/16 [00:00<?, ?it/s]

Candidates: ['Emma Roberts', 'Jane Smith', 'Michael Johnson', 'John Doe', 'Alex Thompson']


In [12]:
query_engine_tools = []

for i, idx in enumerate(indices):
    query_engine = idx.as_query_engine(similarity_top_k=3, verbose=True)
    query_engine_tools.append(
        QueryEngineTool(
            query_engine=query_engine,
            metadata=ToolMetadata(
                name=f"{candidates[i]} resume",
                description=(
                    f"This document is the resume of {candidates[i]}"
                ),
            ),
        )
    )
    
s_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
    verbose=True
)    

In [13]:
question = "Give me the names of all the applicants and their summaries."
response = s_engine.query(question)

display(markdown(question, response, display_nodes=True))
calculate_call_cost(token_counter)

Generated 5 sub questions.
[Emma Roberts resume] Q: What is the name and summary of Emma Roberts?
[Jane Smith resume] Q: What is the name and summary of Jane Smith?
[Michael Johnson resume] Q: What is the name and summary of Michael Johnson?
[John Doe resume] Q: What is the name and summary of John Doe?
[Alex Thompson resume] Q: What is the name and summary of Alex Thompson?
[Jane Smith resume] A: Jane Smith is an experienced and versatile Full Stack Developer with over 7 years of expertise. She is proficient in both frontend and backend technologies, including JavaScript, Python, React, Node.js, and Django. Jane has a strong background in database management, cloud services, and DevOps practices. She is adept at working in Agile environments, leading development teams, and delivering high-quality, scalable software solutions. Jane is 


## Question:
Give me the names of all the applicants and their summaries.

## Answer:
Here are the names and summaries of all the applicants based on the provided context:

Emma Roberts - An enthusiastic and driven Platform Developer with over 1.5 years of experience, proficient in Python, Java, SQL, cloud computing, microservices, and DevOps practices. Skilled in Agile environments and delivering high-quality software solutions.

Jane Smith - An experienced and versatile Full Stack Developer with over 7 years of expertise in frontend and backend technologies like JavaScript, Python, React, Node.js, and Django. Strong background in databases, cloud services, DevOps, leading teams, and delivering scalable software solutions.

Michael Johnson - A highly skilled and results-oriented Principal Full Stack Developer with over 12 years of experience in frontend and backend technologies such as JavaScript, TypeScript, Python, React, Node.js, and Django. Expertise in architecting scalable solutions, mentoring teams, and driving project success.

John Doe - A creative and detail-oriented Frontend Developer with 1.5 years of experience in HTML, CSS, JavaScript, React, and Vue.js. Skilled in collaborating with cross-functional teams to deliver high-quality products that enhance user experience.

Alex Thompson - A detail-oriented and proactive Backend Developer with over 1.5 years of experience in Python, Java, SQL, RESTful API design, database management, and cloud services. Proven ability to collaborate effectively and deliver robust, scalable, and efficient backend solutions.

## Num Nodes:
20


LLM Model     : remote model ==> anthropic.claude-3-sonnet-20240229-v1:0
Input tokens  : 8315, cost: USD 0.024945
Output tokens : 1909, cost: USD 0.028635
Embedding tokens : 8590, cost: USD 0.000859
Total cost    : USD 0.054439 for this call.


In [14]:
question = "Give me the names of developers with less than 5 years experience. Give me summaries too."
response = s_engine.query(question)

display(markdown(question, response, display_nodes=True))
calculate_call_cost(token_counter)

Generated 10 sub questions.
[Emma Roberts resume] Q: What is the work experience of Emma Roberts?
[Emma Roberts resume] Q: What is a summary of Emma Roberts' work experience?
[Jane Smith resume] Q: What is the work experience of Jane Smith?
[Jane Smith resume] Q: What is a summary of Jane Smith's work experience?
[Michael Johnson resume] Q: What is the work experience of Michael Johnson?
[Michael Johnson resume] Q: What is a summary of Michael Johnson's work experience?
[John Doe resume] Q: What is the work experience of John Doe?
[John Doe resume] Q: What is a summary of John Doe's work experience?
[Alex Thompson resume] Q: What is the work experience of Alex Thompson?
[Alex Thompson resume] Q: What is a summary of Alex Thompson's 


## Question:
Give me the names of developers with less than 5 years experience. Give me summaries too.

## Answer:
Based on the context information provided, the developers with less than 5 years of experience are:

1. Emma Roberts - Platform Developer (over 1.5 years of experience)
Summary: Emma Roberts is an enthusiastic and driven Platform Developer with over 1.5 years of professional experience. She is proficient in Python, Java, and SQL, with a strong understanding of cloud computing, microservices architecture, and DevOps practices. Emma has proven ability to work collaboratively in Agile environments, delivering high-quality software solutions that meet business needs. She is committed to continuous learning and leveraging new technologies to enhance platform performance and functionality.

2. John Doe - Junior Frontend Developer (1.5 years of experience)
Summary: John Doe has 1.5 years of professional experience as a Frontend Developer. He worked as a Junior Frontend Developer at ABC Digital Agency from June 2022 to December 2022. During his time there, he contributed to the development of various client projects, including landing pages, corporate websites, and single-page applications. He was responsible for translating wireframes and mockups into interactive, responsive web pages using HTML, CSS, and JavaScript. Additionally, he implemented animations and interactive elements, integrated APIs with backend systems, resolved cross-browser compatibility issues, and contributed to code maintenance and refactoring.

3. Alex Thompson - Backend Developer (over 1.5 years of experience)
Summary: Alex Thompson is a detail-oriented and proactive Backend Developer with over 1.5 years of professional experience. He is proficient in Python, Java, and SQL, with a strong understanding of RESTful API design, database management, and cloud services. Alex has proven ability to collaborate effectively with cross-functional teams to deliver robust, scalable, and efficient backend solutions. In his current role at Tech Innovations Ltd., he has developed and maintained server-side logic for web applications using Django and Flask, designed and implemented RESTful APIs, managed and optimized database schemas, and implemented CI/CD pipelines using Jenkins.

## Num Nodes:
40


LLM Model     : remote model ==> anthropic.claude-3-sonnet-20240229-v1:0
Input tokens  : 17609, cost: USD 0.052827
Output tokens : 5091, cost: USD 0.076365
Embedding tokens : 100, cost: USD 1e-05
Total cost    : USD 0.129202 for this call.


In [15]:
question = "Give me the names of senior developers and their summaries."
response = s_engine.query(question)

display(markdown(question, response, display_nodes=True))
calculate_call_cost(token_counter)

Generated 5 sub questions.
[Emma Roberts resume] Q: What is Emma Roberts' job title and summary?
[Jane Smith resume] Q: What is Jane Smith's job title and summary?
[Michael Johnson resume] Q: What is Michael Johnson's job title and summary?
[John Doe resume] Q: What is John Doe's job title and summary?
[Alex Thompson resume] Q: What is Alex Thompson's job title and summary?
[Michael Johnson resume] A: Michael Johnson's current job title is Principal Full Stack Developer. His summary highlights that he is a highly skilled and results-oriented professional with over 12 years of experience in designing, developing, and leading complex web applications. He is proficient in both frontend and backend technologies, including JavaScript, TypeScript, Python, React, Node.js, and Django. Michael has demonstrated expertise in architecting scalable


## Question:
Give me the names of senior developers and their summaries.

## Answer:
Based on the context information provided, the senior developers and their summaries are:

Michael Johnson - Principal Full Stack Developer
Michael Johnson is a highly skilled and results-oriented professional with over 12 years of experience in designing, developing, and leading complex web applications. He is proficient in both frontend and backend technologies, including JavaScript, TypeScript, Python, React, Node.js, and Django. Michael has demonstrated expertise in architecting scalable solutions, mentoring development teams, and driving project success from inception to deployment. He is committed to delivering high-quality software solutions and continuously improving processes and technologies.

Jane Smith - Experienced and versatile Full Stack Developer with over 7 years of expertise in designing, developing, and deploying robust web applications. Proficient in both frontend and backend technologies, including JavaScript, Python, React, Node.js, and Django. Strong background in database management, cloud services, and DevOps practices. Adept at working in Agile environments, leading development teams, and delivering high-quality, scalable software solutions. Passionate about continuous learning and applying new technologies to solve complex problems.

## Num Nodes:
20


LLM Model     : remote model ==> anthropic.claude-3-sonnet-20240229-v1:0
Input tokens  : 8905, cost: USD 0.026715
Output tokens : 1991, cost: USD 0.029865
Embedding tokens : 50, cost: USD 5e-06
Total cost    : USD 0.056585 for this call.


In [16]:
question = "Give me the names of developers specialising in platform. Give me their summaries too."
response = s_engine.query(question)

display(markdown(question, response, display_nodes=True))
calculate_call_cost(token_counter)

Generated 10 sub questions.
[Emma Roberts resume] Q: Does Emma Roberts' resume indicate she is a developer specializing in platform?
[Emma Roberts resume] Q: If so, what is Emma Roberts' summary?
[Jane Smith resume] Q: Does Jane Smith's resume indicate she is a developer specializing in platform?
[Jane Smith resume] Q: If so, what is Jane Smith's summary?
[Michael Johnson resume] Q: Does Michael Johnson's resume indicate he is a developer specializing in platform?
[Michael Johnson resume] Q: If so, what is Michael Johnson's summary?
[John Doe resume] Q: Does John Doe's resume indicate he is a developer specializing in platform?
[John Doe resume] Q: If so, what is John Doe's summary?
[Alex Thompson resume] Q: Does Alex Thompson's resume indicate he is a devel


## Question:
Give me the names of developers specialising in platform. Give me their summaries too.

## Answer:
Based on the context information provided, the only developer whose resume indicates specialization in platform development is Emma Roberts. Here is her name and summary:

Emma Roberts
"Enthusiastic and driven Platform Developer with over 1.5 years of professional experience in designing, developing, and maintaining scalable platform solutions. Proficient in Python, Java, and SQL with a strong understanding of cloud computing, microservices architecture, and DevOps practices. Proven ability to work collaboratively in Agile environments, delivering high-quality software solutions that meet business needs. Committed to continuous learning and leveraging new technologies to enhance platform performance and functionality."

The resumes of Jane Smith, Michael Johnson, John Doe, and Alex Thompson do not explicitly mention specialization in platform development.

## Num Nodes:
40


LLM Model     : remote model ==> anthropic.claude-3-sonnet-20240229-v1:0
Input tokens  : 16229, cost: USD 0.048687
Output tokens : 3047, cost: USD 0.045705
Embedding tokens : 125, cost: USD 1.3e-05
Total cost    : USD 0.094404 for this call.


In [17]:
question = "Compare and contrast the senior candidates."
response = s_engine.query(question)

display(markdown(question, response, display_nodes=True))
calculate_call_cost(token_counter)

Generated 5 sub questions.
[Emma Roberts resume] Q: What is the work experience and qualifications of Emma Roberts?
[Jane Smith resume] Q: What is the work experience and qualifications of Jane Smith?
[Michael Johnson resume] Q: What is the work experience and qualifications of Michael Johnson?
[John Doe resume] Q: What is the work experience and qualifications of John Doe?
[Alex Thompson resume] Q: What is the work experience and qualifications of Alex Thompson?
[Jane Smith resume] A: Based on the provided context information, Jane Smith is an experienced and versatile Full Stack Developer with over 7 years of expertise. She is proficient in both frontend and backend technologies, including JavaScript, Python, React, Node.js, and Django. Jane has a strong background in database management, cloud services, and DevOps practices. She is 


## Question:
Compare and contrast the senior candidates.

## Answer:
Based on the provided context information, the senior candidates with the most extensive experience are Michael Johnson and Jane Smith.

Michael Johnson is a Principal Full Stack Developer with over 12 years of professional experience. He has expertise in both frontend and backend technologies, including JavaScript, TypeScript, Python, React, Node.js, and Django. Michael has led the development of complex web applications, architected scalable solutions, mentored development teams, and driven project success from inception to deployment. He has worked on projects involving Angular, Django, RESTful APIs, GraphQL, Docker, Kubernetes, and real-time collaboration tools.

Jane Smith is an experienced Full Stack Developer with over 7 years of expertise. She is proficient in frontend technologies like JavaScript, React, and backend technologies like Python, Node.js, and Django. Jane has a strong background in database management, cloud services, and DevOps practices. She has led development teams and delivered high-quality, scalable software solutions. Jane has a diverse skill set spanning various programming languages, frameworks, tools, and platforms.

Both Michael and Jane have extensive experience in full-stack development, working with modern technologies, and leading teams in Agile environments. However, Michael has a more senior role and longer experience, suggesting he may have more expertise in architecting complex solutions and leading large-scale projects.

In contrast, Emma Roberts and Alex Thompson are relatively junior candidates with around 1.5 years of professional experience each. Emma is a Platform Developer proficient in Python, Java, and SQL, with a focus on cloud computing, microservices, and DevOps practices. Alex is a Backend Developer skilled in Python, Java, SQL, RESTful APIs, and database management.

While Emma and Alex have relevant skills and qualifications, their experience is significantly less compared to Michael and Jane, who have over a decade of industry experience and have likely worked on more complex and diverse projects.

## Num Nodes:
20


LLM Model     : remote model ==> anthropic.claude-3-sonnet-20240229-v1:0
Input tokens  : 8935, cost: USD 0.026805
Output tokens : 3376, cost: USD 0.05064
Embedding tokens : 55, cost: USD 6e-06
Total cost    : USD 0.077451 for this call.


## Let's inspect the default template used by SubQuestion query engine
- Note that it is a longer and more detailed prompt template that supplies an example for the Generator LLM. 

In [18]:
from IPython.display import Markdown, display

def display_prompt(prompt_template, prompt_key):
    text_md = f"**Prompt Key**: {prompt_key}<br>" f"**Text:** <br>"
    display(Markdown(text_md))
    print(prompt_template)
    display(Markdown("<br><br>"))
        
prompt_key = "question_gen:question_gen_prompt" # prompt used internally by LlamaIndex for the SubQuestion query engine
prompt = s_engine.get_prompts()[prompt_key]
display_prompt(prompt.template, prompt_key)


**Prompt Key**: question_gen:question_gen_prompt<br>**Text:** <br>

Given a user question, and a list of tools, output a list of relevant sub-questions in json markdown that when composed can help answer the full user question:

# Example 1
<Tools>
```json
{{
    "uber_10k": "Provides information about Uber financials for year 2021",
    "lyft_10k": "Provides information about Lyft financials for year 2021"
}}
```

<User Question>
Compare and contrast the revenue growth and EBITDA of Uber and Lyft for year 2021


<Output>
```json
{{
    "items": [
        {{
            "sub_question": "What is the revenue growth of Uber",
            "tool_name": "uber_10k"
        }},
        {{
            "sub_question": "What is the EBITDA of Uber",
            "tool_name": "uber_10k"
        }},
        {{
            "sub_question": "What is the revenue growth of Lyft",
            "tool_name": "lyft_10k"
        }},
        {{
            "sub_question": "What is the EBITDA of Lyft",
            "tool_name": "lyft_10k"
        }}
    ]
}}
```

# Example 2
<Tool

<br><br>
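
Two details of the template above are worth a small illustration. First, the doubled braces `{{` and `}}` are how literal braces are written in a Python format string, so they come out as single braces once the template is formatted. Second, the model is asked to reply with a JSON object containing an `items` list. The sketch below demonstrates both points with a toy template and a toy reply. The function `parse_sub_questions` is a hypothetical helper, not the parser LlamaIndex uses.

```python
import json

# Doubled braces become literal braces after str.format; single braces are placeholders.
toy_template = '{{"tool": "{tool_name}"}}'
rendered = toy_template.format(tool_name="uber_10k")
print(rendered)


def parse_sub_questions(llm_output):
    """Extract (tool_name, sub_question) pairs from a reply that wraps JSON in a code fence."""
    start = llm_output.index("{")
    end = llm_output.rindex("}") + 1
    payload = json.loads(llm_output[start:end])
    return [(item["tool_name"], item["sub_question"]) for item in payload["items"]]


# Build the fence marker programmatically to keep this example readable.
fence = "`" * 3
sample_reply = (
    fence + "json\n"
    '{"items": [{"sub_question": "What is the EBITDA of Uber", "tool_name": "uber_10k"}]}\n'
    + fence
)
print(parse_sub_questions(sample_reply))
```

Slicing from the first `{` to the last `}` is a deliberately simple way to skip the fence; a production parser would need to handle malformed or missing JSON.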