# RAG Post Retrieval Optimization - Reranking featuring Amazon SageMaker, Bedrock and llamaindex

In this tutorial, we showcase how to use a sub question query engine to tackle the problem of answering a complex query using multiple data sources.
It first breaks down the complex query into sub questions for each relevant data source, then gather all the intermediate reponses and synthesizes a final response.

- Vector Database (Faiss / local)
- LLM (Amazon Bedrock - Claude3 Sonnet)
- Embeddings Model (Bedrock Titan Text Embedding v2.0)
- Datasets ( Amazon SEC 10-k statements for year 2022 and 2023 )

In [1]:
!pip install llama-index
%pip install llama-index-llms-bedrock
%pip install llama-index-embeddings-bedrock
!pip uninstall pydantic -y
!pip install pydantic
%pip install sqlalchemy==2.0.21 --force-reinstall --quiet
%pip install llama-index-embeddings-instructor
%pip install llama-index-embeddings-huggingface

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
Found existing installation: pydantic 2.8.2
Uninstalling pydantic-2.8.2:
  Successfully uninstalled pydantic-2.8.2
Collecting pydantic
  Using cached pydantic-2.8.2-py3-none-any.whl.metadata (125 kB)
Using cached pydantic-2.8.2-py3-none-any.whl (423 kB)
Installing collected packages: pydantic
Successfully installed pydantic-2.8.2
[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jupyter-server 2.14.1 requires packaging>=22.0, but you have packaging 21.3 which is incompatible.[0m[31m
[0mNote: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


In [2]:
from llama_index.embeddings.bedrock import BedrockEmbedding

In [3]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler
from llama_index.core import Settings

In [4]:
import json
from typing import Sequence, List
from llama_index.core.settings import Settings
from llama_index.llms.bedrock import Bedrock
from llama_index.embeddings.bedrock import BedrockEmbedding, Models

llm = Bedrock(model = "anthropic.claude-3-sonnet-20240229-v1:0")
embed_model = BedrockEmbedding(model = "amazon.titan-embed-text-v2:0")

Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512

from llama_index.core.llms import ChatMessage
from llama_index.core.tools import BaseTool, FunctionTool
import nest_asyncio
nest_asyncio.apply()

In [5]:
%pip install torch sentence-transformers
from llama_index.core.postprocessor import SentenceTransformerRerank
postprocessor = SentenceTransformerRerank(
    model="mixedbread-ai/mxbai-rerank-large-v1", top_n=3
)

Note: you may need to restart the kernel to use updated packages.


In [6]:
!mkdir -p 'data/amazon/'
!wget 'https://s2.q4cdn.com/299287126/files/doc_financials/2023/q4/c7c14359-36fa-40c3-b3ca-5bf7f3fa0b96.pdf' -O 'data/amazon/amazon_2023.pdf'
!wget 'https://s2.q4cdn.com/299287126/files/doc_financials/2022/q4/d2fde7ee-05f7-419d-9ce8-186de4c96e25.pdf' -O 'data/amazon/amazon_2022.pdf'

--2024-07-26 18:38:32--  https://s2.q4cdn.com/299287126/files/doc_financials/2023/q4/c7c14359-36fa-40c3-b3ca-5bf7f3fa0b96.pdf
Resolving s2.q4cdn.com (s2.q4cdn.com)... 68.70.205.4, 68.70.205.3, 68.70.205.1, ...
Connecting to s2.q4cdn.com (s2.q4cdn.com)|68.70.205.4|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 800598 (782K) [application/pdf]
Saving to: ‘data/amazon/amazon_2023.pdf’


2024-07-26 18:38:32 (13.8 MB/s) - ‘data/amazon/amazon_2023.pdf’ saved [800598/800598]

--2024-07-26 18:38:33--  https://s2.q4cdn.com/299287126/files/doc_financials/2022/q4/d2fde7ee-05f7-419d-9ce8-186de4c96e25.pdf
Resolving s2.q4cdn.com (s2.q4cdn.com)... 68.70.205.2, 68.70.205.1, 68.70.205.3, ...
Connecting to s2.q4cdn.com (s2.q4cdn.com)|68.70.205.2|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 712683 (696K) [application/pdf]
Saving to: ‘data/amazon/amazon_2022.pdf’


2024-07-26 18:38:33 (11.0 MB/s) - ‘data/amazon/amazon_2022.pdf’ saved [712683/712683]



In [7]:
# load data
amazon_secfiles = SimpleDirectoryReader(input_dir="./data/amazon/").load_data()

# build index
amazon_index = VectorStoreIndex.from_documents(
    amazon_secfiles,
    use_async=True,
)

# Retrieval - Sentence Transformers Rerank.

In [8]:
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core import QueryBundle
import pandas as pd
from IPython.display import display, HTML


pd.set_option("display.max_colwidth", -1)


def get_retrieved_nodes(
    query_str, vector_top_k=10, reranker_top_n=3, with_reranker=False
):
    query_bundle = QueryBundle(query_str)
    # configure retriever
    retriever = VectorIndexRetriever(
        index=amazon_index,
        similarity_top_k=vector_top_k,
    )
    retrieved_nodes = retriever.retrieve(query_bundle)

    if with_reranker:
        # configure reranker
        reranker = SentenceTransformerRerank(model="mixedbread-ai/mxbai-rerank-large-v1", top_n=3)
        retrieved_nodes = reranker.postprocess_nodes(
            retrieved_nodes, query_bundle
        )

    return retrieved_nodes


def pretty_print(df):
    return display(HTML(df.to_html().replace("\\n", "")))


def visualize_retrieved_nodes(nodes) -> None:
    result_dicts = []
    for node in nodes:
        result_dict = {"Score": node.score, "Text": node.node.get_text()}
        result_dicts.append(result_dict)

    pretty_print(pd.DataFrame(result_dicts))

  pd.set_option("display.max_colwidth", -1)


In [9]:
new_nodes = get_retrieved_nodes(
    "What were key challenges faced by Amazon in year 2022?",
    vector_top_k=10,
    with_reranker=False,
)

In [10]:
visualize_retrieved_nodes(new_nodes)

Unnamed: 0,Score,Text
0,0.671755,"which has been critical to ourgrowth and success;•disruptions from natural or human-caused disasters (including public health crises) or extreme weather (including as a result of climate change),geopolitical events and security issues (including terrorist attacks, armed hostilities, and political conflicts, including those involving China), labor ortrade disputes (including restrictive governmental actions impacting us, our customers, and our third-party sellers and suppliers in China or otherforeign countries), and similar events; and•potential negative impacts of climate change, including: increased operating costs due to more frequent extreme weather events or climate-relatedchanges, such as rising temperatures and water scarcity; increased investment requirements associated with the transition to a low-carbon economy;decreased demand for our products and services as a result of changes in customer behavior; increased compliance costs due to more extensive andglobal regulations and third-party requirements; and reputational damage resulting from perceptions of our environmental impact.We Face Risks Related to Successfully Optimizing and Operating Our Fulfillment Network and Data CentersFailures to adequately predict customer demand and consumer spending patterns or otherwise optimize and operate our fulfillment network and datacenters successfully from time to time result in excess or insufficient fulfillment or data center capacity, service interruptions, increased costs, and impairmentcharges, any of which could materially harm our business. As we continue to add fulfillment and data center capability or add new businesses with differentrequirements, our fulfillment and data center networks become increasingly complex and operating them becomes more challenging. There can be no assurancethat we will be able to operate our networks effectively.10"
1,0.660597,"resources such as land, water, and energy, commodities like paper andpacking supplies and hardware products, and technology infrastructure products, including as a result of inflationary pressures;•constrained labor markets, which increase our payroll costs;•the extent to which operators of the networks between our customers and our stores successfully charge fees to grant our customers unimpaired andunconstrained access to our online services;•our ability to collect amounts owed to us when they become due;•the extent to which new and existing technologies, or industry trends, restrict online advertising or affect our ability to customize advertising orotherwise tailor our product and service offerings;•the extent to which use of our services is affected by spyware, viruses, phishing and other spam emails, denial of service attacks, data theft, computerintrusions, outages, and similar events;•the extent to which we fail to maintain our unique culture of innovation, customer obsession, and long-term thinking, which has been critical to ourgrowth and success;•disruptions from natural or human-caused disasters (including public health crises) or extreme weather (including as a result of climate change),geopolitical events and security issues (including terrorist attacks, armed hostilities, and political conflicts, including those involving China), labor ortrade disputes (including restrictive governmental actions impacting us, our customers, and our third-party sellers and suppliers in China or otherforeign countries), and similar events; and•potential negative impacts of climate change, including: increased operating costs due to more frequent extreme weather events or climate-relatedchanges, such as rising temperatures and water scarcity; increased investment requirements associated with the transition to a low-carbon economy;decreased demand for our products and services as a result of changes in customer behavior; increased compliance costs due to more extensive andglobal regulations and third-party requirements; and reputational damage resulting from perceptions of our environmental impact."
2,0.644169,"Table of Contentstransaction costs, our level of productivity and accuracy, changes in volume, size, and weight of units received and fulfilled, the extent to which third-partysellers utilize Fulfillment by Amazon services, timing of fulfillment network and physical store expansion, the extent we utilize fulfillment services providedby third parties, mix of products and services sold, and our ability to affect customer service contacts per unit by implementing improvements in our operationsand enhancements to our customer self-service features. Additionally, sales by our sellers have higher payment processing and related transaction costs as apercentage of net sales compared to our retail sales because payment processing costs are based on the gross purchase price of underlying transactions.The increase in fulfillment costs in absolute dollars in 2022, compared to the prior year, is primarily due to increased investments in our fulfillmentnetwork and variable costs corresponding with increased product and service sales volume and inventory levels, and increased wage rates and incentives.Changes in foreign exchange rates reduced fulfillment costs by $2.5 billion in 2022.We seek to expand our fulfillment network to accommodate a greater selection and in-stock inventory levels and to meet anticipated shipment volumesfrom sales of our own products as well as sales by third parties for which we provide the fulfillment services. We regularly evaluate our facility requirements.Technology and ContentTechnology and content costs include payroll and related expenses for employees involved in the research and development of new and existing productsand services, development, design, and maintenance of our stores, curation and display of products and services made available in our online stores, andinfrastructure costs. Infrastructure costs include servers, networking equipment, and data center related depreciation and amortization, rent, utilities, and otherexpenses necessary to support AWS and other Amazon businesses. Collectively, these costs reflect the investments we make in order to offer a wide variety ofproducts and services to our customers, including expenditures related to initiatives to build and deploy innovative and efficient software and electronic devicesand the development of a satellite network for global broadband service and autonomous vehicles for ride-hailing services.We seek to invest efficiently in numerous areas of technology and content so we may continue to enhance the customer experience and improve ourprocess efficiency through rapid technology developments, while operating at an ever increasing scale."
3,0.642127,"Table of Contentsfulfilled, the extent to which third-party sellers utilize Fulfillment by Amazon services, timing of fulfillment network and physical store expansion, the extentwe utilize fulfillment services provided by third parties, mix of products and services sold, and our ability to affect customer service contacts per unit byimplementing improvements in our operations and enhancements to our customer self-service features. Additionally, sales by our sellers have higher paymentprocessing and related transaction costs as a percentage of net sales compared to our retail sales because payment processing costs are based on the grosspurchase price of underlying transactions.The increase in fulfillment costs in absolute dollars in 2023, compared to the prior year, is primarily due to increased sales and investments in ourfulfillment network, partially offset by fulfillment network efficiencies. Changes in foreign exchange rates increased fulfillment costs by $52 million in 2023.We seek to expand our fulfillment network to accommodate a greater selection and in-stock inventory levels and to meet anticipated shipment volumesfrom sales of our own products as well as sales by third parties for which we provide the fulfillment services. We regularly evaluate our facility requirements.Technology and InfrastructureTechnology and infrastructure costs include payroll and related expenses for employees involved in the research and development of new and existingproducts and services, development, design, and maintenance of our stores, curation and display of products and services made available in our online stores,and infrastructure costs. Infrastructure costs include servers, networking equipment, and data center related depreciation and amortization, rent, utilities, andother expenses necessary to support AWS and other Amazon businesses. Collectively, these costs reflect the investments we make in order to offer a widevariety of products and services to our customers, including expenditures related to initiatives to build and deploy innovative and efficient software andelectronic devices and the development of a satellite network for global broadband service and autonomous vehicles for ride-hailing services.We seek to invest efficiently in numerous areas of technology and infrastructure so we may continue to enhance the customer experience and improveour process efficiency through rapid technology developments, while operating at an ever increasing scale. Our technology and infrastructure investment andcapital spending projects often support a variety of product and service offerings due to geographic expansion and the cross-functionality of our systems andoperations."
4,0.639565,"They may secure better terms from vendors, adopt more aggressive pricing, and devote more resources to technology, infrastructure,fulfillment, and marketing.Competition continues to intensify, including with the development of new business models and the entry of new and well-funded competitors, and asour competitors enter into business combinations or alliances and established companies in other market segments expand to become competitive with ourbusiness. In addition, new and enhanced technologies, including search, web and infrastructure computing services, digital content, and electronic devicescontinue to increase our competition. The Internet facilitates competitive entry and comparison shopping, which enhances the ability of new, smaller, or lesserknown businesses to compete against us. As a result of competition, our product and service offerings may not be successful, we may fail to gain or may losebusiness, and we may be required to increase our spending or lower prices, any of which could materially reduce our sales and profits.Our Expansion into New Products, Services, Technologies, and Geographic Regions Subjects Us to Additional RisksWe may have limited or no experience in our newer market segments, and our customers may not adopt our product or service offerings. These offerings,which can present new and difficult technology challenges, may subject us to claims if customers of these offerings experience, or are otherwise impacted by,service disruptions, delays, setbacks, or failures or quality issues. In addition, profitability, if any, in our newer activities may not meet our expectations, and wemay not be successful enough in these newer activities to recoup our investments in them, which investments are often significant. Failure to realize thebenefits of amounts we invest in new technologies, products, or services could result in the value of those investments being written down or written off. Inaddition, our sustainability initiatives may be unsuccessful for a variety of6"
5,0.625723,"In addition, rising fuel, utility, and food costs, rising interest rates, and recessionary fears may impact customer demandand our ability to forecast consumer spending patterns. We also expect the current macroeconomic environment and enterprise customer cost optimizationefforts to impact our AWS revenue growth rates. We expect some or all of these factors to continue to impact our operations into Q1 2023.Net SalesNet sales include product and service sales. Product sales represent revenue from the sale of products and related shipping fees and digital media contentwhere we record revenue gross. Service sales primarily represent third-party seller fees, which includes commissions and any related fulfillment and shippingfees, AWS sales, advertising services, Amazon Prime membership fees, and certain digital content subscriptions. Net sales information is as follows (inmillions): Year Ended December 31, 2021 2022Net Sales:North America $ 279,833 $ 315,880 International 127,787 118,007 AWS 62,202 80,096 Consolidated $ 469,822 $ 513,983 Year-over-year Percentage Growth (Decline):North America 18 % 13 %International 22 (8)AWS 37 29 Consolidated 22 9 Year-over-year Percentage Growth, excluding the effect of foreign exchange rates:North America 18 % 13 %International 20 4 AWS 37 29 Consolidated 21 13 Net sales mix:North America 60 % 61 %International 27 23 AWS 13 16 Consolidated 100 % 100 %Sales increased 9% in 2022, compared to the prior year. Changes in foreign currency exchange rates reduced net sales by $15.5 billion in 2022. For adiscussion of the effect of foreign exchange rates on sales growth, see “Effect of Foreign Exchange Rates” below.North America sales increased 13% in 2022, compared to the prior year. The sales growth primarily reflects increased unit sales, including sales by third-party sellers, advertising sales, and subscription services."
6,0.624265,"Table of Contents•the outcomes of legal proceedings and claims, which may include significant monetary damages or injunctive relief and could have a materialadverse impact on our operating results;•variations in the mix of products and services we sell;•variations in our level of merchandise and vendor returns;•the extent to which we offer fast and free delivery, continue to reduce prices worldwide, and provide additional benefits to our customers;•factors affecting our reputation or brand image (including any actual or perceived inability to achieve our goals or commitments, whether related tosustainability, customers, employees, or other topics);•the extent to which we invest in technology and content, fulfillment, and other expense categories;•increases in the prices of transportation (including fuel), energy products, commodities like paper and packing supplies and hardware products, andtechnology infrastructure products, including as a result of inflationary pressures;•constrained labor markets, which increase our payroll costs;•the extent to which operators of the networks between our customers and our stores successfully charge fees to grant our customers unimpaired andunconstrained access to our online services;•our ability to collect amounts owed to us when they become due;•the extent to which new and existing technologies, or industry trends, restrict online advertising or affect our ability to customize advertising orotherwise tailor our product and service offerings;•the extent to which use of our services is affected by spyware, viruses, phishing and other spam emails, denial of service attacks, data theft, computerintrusions, outages, and similar events; and•disruptions from natural or human-caused disasters (including public health crises) or extreme weather (including as a result of climate change),geopolitical events and security issues (including terrorist attacks and armed hostilities), labor or trade disputes (including restrictive governmentalactions impacting us and our third-party sellers in China or other foreign countries), and similar events.We Face Risks Related to Successfully Optimizing and Operating Our Fulfillment Network and Data CentersFailures to adequately predict customer demand or otherwise optimize and operate our fulfillment network and data centers successfully from time totime result in excess or insufficient fulfillment or data center capacity, service interruptions, increased costs, and impairment charges, any of which couldmaterially harm our business."
7,0.622228,"Table of Contentsultimately take a view contrary to ours. In addition, our Chinese and Indian businesses and operations may be unable to continue to operate if we or ouraffiliates are unable to access sufficient funding or, in China, enforce contractual relationships we or our affiliates have in place. Violation of any existing orfuture PRC, Indian, or other laws or regulations or changes in the interpretations of those laws and regulations could result in our businesses in those countriesbeing subject to fines and other financial penalties, having licenses revoked, or being forced to restructure our operations or shut down entirely.The Variability in Our Retail Business Places Increased Strain on Our OperationsDemand for our products and services can fluctuate significantly for many reasons, including as a result of seasonality, promotions, product launches, orunforeseeable events, such as in response to global economic conditions such as recessionary fears or rising inflation, natural or human-caused disasters(including public health crises) or extreme weather (including as a result of climate change), or geopolitical events. For example, we expect a disproportionateamount of our retail sales to occur during our fourth quarter. Our failure to stock or restock popular products in sufficient amounts such that we fail to meetcustomer demand could significantly affect our revenue and our future growth. When we overstock products, we may be required to take significant inventorymarkdowns or write-offs and incur commitment costs, which could materially reduce profitability. We regularly experience increases in our net shipping costdue to complimentary upgrades, split-shipments, and additional long-zone shipments necessary to ensure timely delivery for the holiday season. If too manycustomers access our websites within a short period of time due to increased demand, we may experience system interruptions that make our websitesunavailable or prevent us from efficiently fulfilling orders, which may reduce the volume of goods we offer or sell and the attractiveness of our products andservices. In addition, we may be unable to adequately staff our fulfillment network and customer service centers during these peak periods and delivery andother fulfillment companies and customer service co-sourcers may be unable to meet the seasonal demand. Risks described elsewhere in this Item 1A relatingto fulfillment network optimization and inventory are magnified during periods of high demand."
8,0.619883,";•lower levels of credit card usage and increased payment risk;•difficulty in staffing, developing, and managing foreign operations as a result of distance, language, and cultural differences;•different employee/employer relationships and the existence of works councils and labor unions;•compliance with the U.S. Foreign Corrupt Practices Act and other applicable U.S. and foreign laws prohibiting corrupt payments to governmentofficials and other third parties;•laws and policies of the U.S. and other jurisdictions affecting trade, foreign investment, loans, and taxes; and•geopolitical events, including war and terrorism.As international physical, e-commerce, and omnichannel retail, cloud services, and other services grow, competition will intensify, including throughadoption of evolving business models. Local companies may have a substantial competitive advantage because of their greater understanding of, and focus on,the local customer, as well as their more established local brand names. The inability to hire, train, retain, and manage sufficient required personnel may limitour international growth.The People’s Republic of China (“PRC”) and India regulate Amazon’s and its affiliates’ businesses and operations in country through regulations andlicense requirements that may restrict (i) foreign investment in and operation of the internet, IT infrastructure, data centers, retail, delivery, and other sectors,(ii) internet content, and (iii) the sale of media and other products and services. For example, in order to meet local ownership, regulatory licensing, andcybersecurity requirements, we provide certain technology services in China through contractual relationships with third parties that hold PRC licenses toprovide services. In India, the government restricts the ownership or control of Indian companies by foreign entities involved in online multi-brand retailtrading activities. For www.amazon.in, we provide certain marketing tools and logistics services to third-party sellers to enable them to sell online and deliverto customers, and we hold an indirect minority interest in an entity that is a third-party seller on the www.amazon.in marketplace. Although we believe thesestructures and activities comply with existing laws, they involve unique risks, and the PRC and India may from time to time consider and implement additionalchanges in7"
9,0.618867,"Table of ContentsIn addition, failure to optimize inventory management or staffing in our fulfillment network increases our net shipping cost by increasing the distanceproducts are shipped and reducing the number of units per shipment or delivery. We and our co-sourcers may be unable to adequately staff our fulfillmentnetwork and customer service centers. For example, productivity across our fulfillment network is affected by regional labor market constraints, which increasepayroll costs and make it difficult to hire, train, and deploy a sufficient number of people to operate our fulfillment network as efficiently as we would like.Under some of our commercial agreements, we maintain the inventory of other companies, thereby increasing the complexity of tracking inventory andoperating our fulfillment network. Our failure to adequately predict seller demand for storage or to properly handle such inventory or the inability of the otherbusinesses on whose behalf we perform inventory fulfillment services to accurately forecast product demand may result in us being unable to secure sufficientstorage space or to optimize our fulfillment network or cause other unexpected costs and other harm to our business and reputation.We rely on a limited number of shipping companies to deliver inventory to us and completed orders to our customers. An inability to negotiate acceptableterms with these companies or performance problems, staffing limitations, or other difficulties experienced by these companies or by our own transportationsystems, including as a result of labor market constraints and related costs, could negatively impact our operating results and customer experience. In addition,our ability to receive inbound inventory efficiently and ship completed orders to customers also may be negatively affected by natural or human-causeddisasters (including public health crises) or extreme weather (including as a result of climate change), geopolitical events and security issues, labor or tradedisputes, and similar events."


In [11]:
new_nodes = get_retrieved_nodes(
    "What were key challenges faced by Amazon in year 2022?",
    vector_top_k=10,
    reranker_top_n=3,
    with_reranker=True,
)

In [12]:
visualize_retrieved_nodes(new_nodes)

Unnamed: 0,Score,Text
0,0.795661,"which has been critical to ourgrowth and success;•disruptions from natural or human-caused disasters (including public health crises) or extreme weather (including as a result of climate change),geopolitical events and security issues (including terrorist attacks, armed hostilities, and political conflicts, including those involving China), labor ortrade disputes (including restrictive governmental actions impacting us, our customers, and our third-party sellers and suppliers in China or otherforeign countries), and similar events; and•potential negative impacts of climate change, including: increased operating costs due to more frequent extreme weather events or climate-relatedchanges, such as rising temperatures and water scarcity; increased investment requirements associated with the transition to a low-carbon economy;decreased demand for our products and services as a result of changes in customer behavior; increased compliance costs due to more extensive andglobal regulations and third-party requirements; and reputational damage resulting from perceptions of our environmental impact.We Face Risks Related to Successfully Optimizing and Operating Our Fulfillment Network and Data CentersFailures to adequately predict customer demand and consumer spending patterns or otherwise optimize and operate our fulfillment network and datacenters successfully from time to time result in excess or insufficient fulfillment or data center capacity, service interruptions, increased costs, and impairmentcharges, any of which could materially harm our business. As we continue to add fulfillment and data center capability or add new businesses with differentrequirements, our fulfillment and data center networks become increasingly complex and operating them becomes more challenging. There can be no assurancethat we will be able to operate our networks effectively.10"
1,0.770071,"In addition, rising fuel, utility, and food costs, rising interest rates, and recessionary fears may impact customer demandand our ability to forecast consumer spending patterns. We also expect the current macroeconomic environment and enterprise customer cost optimizationefforts to impact our AWS revenue growth rates. We expect some or all of these factors to continue to impact our operations into Q1 2023.Net SalesNet sales include product and service sales. Product sales represent revenue from the sale of products and related shipping fees and digital media contentwhere we record revenue gross. Service sales primarily represent third-party seller fees, which includes commissions and any related fulfillment and shippingfees, AWS sales, advertising services, Amazon Prime membership fees, and certain digital content subscriptions. Net sales information is as follows (inmillions): Year Ended December 31, 2021 2022Net Sales:North America $ 279,833 $ 315,880 International 127,787 118,007 AWS 62,202 80,096 Consolidated $ 469,822 $ 513,983 Year-over-year Percentage Growth (Decline):North America 18 % 13 %International 22 (8)AWS 37 29 Consolidated 22 9 Year-over-year Percentage Growth, excluding the effect of foreign exchange rates:North America 18 % 13 %International 20 4 AWS 37 29 Consolidated 21 13 Net sales mix:North America 60 % 61 %International 27 23 AWS 13 16 Consolidated 100 % 100 %Sales increased 9% in 2022, compared to the prior year. Changes in foreign currency exchange rates reduced net sales by $15.5 billion in 2022. For adiscussion of the effect of foreign exchange rates on sales growth, see “Effect of Foreign Exchange Rates” below.North America sales increased 13% in 2022, compared to the prior year. The sales growth primarily reflects increased unit sales, including sales by third-party sellers, advertising sales, and subscription services."
2,0.726103,"They may secure better terms from vendors, adopt more aggressive pricing, and devote more resources to technology, infrastructure,fulfillment, and marketing.Competition continues to intensify, including with the development of new business models and the entry of new and well-funded competitors, and asour competitors enter into business combinations or alliances and established companies in other market segments expand to become competitive with ourbusiness. In addition, new and enhanced technologies, including search, web and infrastructure computing services, digital content, and electronic devicescontinue to increase our competition. The Internet facilitates competitive entry and comparison shopping, which enhances the ability of new, smaller, or lesserknown businesses to compete against us. As a result of competition, our product and service offerings may not be successful, we may fail to gain or may losebusiness, and we may be required to increase our spending or lower prices, any of which could materially reduce our sales and profits.Our Expansion into New Products, Services, Technologies, and Geographic Regions Subjects Us to Additional RisksWe may have limited or no experience in our newer market segments, and our customers may not adopt our product or service offerings. These offerings,which can present new and difficult technology challenges, may subject us to claims if customers of these offerings experience, or are otherwise impacted by,service disruptions, delays, setbacks, or failures or quality issues. In addition, profitability, if any, in our newer activities may not meet our expectations, and wemay not be successful enough in these newer activities to recoup our investments in them, which investments are often significant. Failure to realize thebenefits of amounts we invest in new technologies, products, or services could result in the value of those investments being written down or written off. Inaddition, our sustainability initiatives may be unsuccessful for a variety of6"


# Seeing reranking in action in final results

In [13]:
query_engine_naive = amazon_index.as_query_engine(similarity_top_k=10)

In [14]:
response = query_engine_naive.query(
    "What were key challenges faced by Amazon in year 2022?"
)

In [15]:
print(response)

Based on the context information provided, some of the key challenges faced by Amazon in 2022 included:

1. Inflationary pressures leading to increased costs for resources like land, water, energy, commodities, and technology infrastructure products.

2. Constrained labor markets resulting in higher payroll costs.

3. Disruptions from natural disasters, public health crises, extreme weather events (potentially due to climate change), geopolitical tensions, labor disputes, and similar events impacting operations.

4. Potential negative impacts of climate change, such as increased operating costs due to extreme weather, investment requirements for transitioning to a low-carbon economy, decreased demand due to changes in customer behavior, and reputational damage.

5. Challenges in optimizing and operating the fulfillment network and data centers, leading to excess or insufficient capacity, service interruptions, and increased costs.

6. Variability in retail demand, especially during pea

In [16]:
query_engine_rerank = amazon_index.as_query_engine(similarity_top_k=10, node_postprocessor =[postprocessor])

In [17]:
response = query_engine_rerank.query(
    "What were key challenges faced by Amazon in year 2022?"
)

In [18]:
print(response)

Based on the context information provided, some of the key challenges faced by Amazon in 2022 included:

1. Increased operating costs due to factors like rising fuel, utility, and food costs, as well as higher payroll costs from constrained labor markets.

2. Impacts from macroeconomic factors like rising inflation, recessionary fears, and enterprise customers' cost optimization efforts, which affected consumer spending patterns and AWS revenue growth rates.

3. Supply chain disruptions and fulfillment network challenges caused by factors like natural disasters, extreme weather events related to climate change, geopolitical conflicts, labor disputes, and similar events.

4. Difficulties in optimizing inventory management and staffing levels across the fulfillment network, leading to higher shipping costs and operational inefficiencies.

5. Dependence on a limited number of shipping companies, whose performance issues or capacity constraints could negatively impact order fulfillment and

# With LLMRerank Module

In [61]:
from llama_index.core.postprocessor import LLMRerank

In [62]:
llm = Bedrock(model = "anthropic.claude-3-haiku-20240307-v1:0")

In [63]:
Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512

In [66]:
#from llama_index.core import ServiceContext
#service_context = ServiceContext.from_defaults(llm=Settings.llm, 
 #                                              embed_model=Settings.embed_model, 
#                                               chunk_size=Settings.chunk_size,
#                                               chunk_overlap=Settings.chunk_overlap,
#                                            )

In [67]:
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core import QueryBundle
import pandas as pd
from IPython.display import display, HTML


pd.set_option("display.max_colwidth", -1)


def get_retrieved_nodes(
    query_str, vector_top_k=10, reranker_top_n=3, with_reranker=False
):
    query_bundle = QueryBundle(query_str)
    # configure retriever
    retriever = VectorIndexRetriever(
        index=amazon_index,
        similarity_top_k=vector_top_k,
    )
    retrieved_nodes = retriever.retrieve(query_bundle)

    if with_reranker:
        # configure reranker
        reranker = LLMRerank(choice_batch_size=3, top_n=reranker_top_n)
        retrieved_nodes = reranker.postprocess_nodes(
            retrieved_nodes, query_bundle
        )

    return retrieved_nodes


def pretty_print(df):
    return display(HTML(df.to_html().replace("\\n", "")))


def visualize_retrieved_nodes(nodes) -> None:
    result_dicts = []
    for node in nodes:
        result_dict = {"Score": node.score, "Text": node.node.get_text()}
        result_dicts.append(result_dict)

    pretty_print(pd.DataFrame(result_dicts))

  pd.set_option("display.max_colwidth", -1)


In [68]:
new_nodes = get_retrieved_nodes(
    "What were key challenges faced by Amazon in year 2022?",
    vector_top_k=5,
    with_reranker=False,
)

In [69]:
visualize_retrieved_nodes(new_nodes)

Unnamed: 0,Score,Text
0,0.671755,"which has been critical to ourgrowth and success;•disruptions from natural or human-caused disasters (including public health crises) or extreme weather (including as a result of climate change),geopolitical events and security issues (including terrorist attacks, armed hostilities, and political conflicts, including those involving China), labor ortrade disputes (including restrictive governmental actions impacting us, our customers, and our third-party sellers and suppliers in China or otherforeign countries), and similar events; and•potential negative impacts of climate change, including: increased operating costs due to more frequent extreme weather events or climate-relatedchanges, such as rising temperatures and water scarcity; increased investment requirements associated with the transition to a low-carbon economy;decreased demand for our products and services as a result of changes in customer behavior; increased compliance costs due to more extensive andglobal regulations and third-party requirements; and reputational damage resulting from perceptions of our environmental impact.We Face Risks Related to Successfully Optimizing and Operating Our Fulfillment Network and Data CentersFailures to adequately predict customer demand and consumer spending patterns or otherwise optimize and operate our fulfillment network and datacenters successfully from time to time result in excess or insufficient fulfillment or data center capacity, service interruptions, increased costs, and impairmentcharges, any of which could materially harm our business. As we continue to add fulfillment and data center capability or add new businesses with differentrequirements, our fulfillment and data center networks become increasingly complex and operating them becomes more challenging. There can be no assurancethat we will be able to operate our networks effectively.10"
1,0.660597,"resources such as land, water, and energy, commodities like paper andpacking supplies and hardware products, and technology infrastructure products, including as a result of inflationary pressures;•constrained labor markets, which increase our payroll costs;•the extent to which operators of the networks between our customers and our stores successfully charge fees to grant our customers unimpaired andunconstrained access to our online services;•our ability to collect amounts owed to us when they become due;•the extent to which new and existing technologies, or industry trends, restrict online advertising or affect our ability to customize advertising orotherwise tailor our product and service offerings;•the extent to which use of our services is affected by spyware, viruses, phishing and other spam emails, denial of service attacks, data theft, computerintrusions, outages, and similar events;•the extent to which we fail to maintain our unique culture of innovation, customer obsession, and long-term thinking, which has been critical to ourgrowth and success;•disruptions from natural or human-caused disasters (including public health crises) or extreme weather (including as a result of climate change),geopolitical events and security issues (including terrorist attacks, armed hostilities, and political conflicts, including those involving China), labor ortrade disputes (including restrictive governmental actions impacting us, our customers, and our third-party sellers and suppliers in China or otherforeign countries), and similar events; and•potential negative impacts of climate change, including: increased operating costs due to more frequent extreme weather events or climate-relatedchanges, such as rising temperatures and water scarcity; increased investment requirements associated with the transition to a low-carbon economy;decreased demand for our products and services as a result of changes in customer behavior; increased compliance costs due to more extensive andglobal regulations and third-party requirements; and reputational damage resulting from perceptions of our environmental impact."
2,0.644169,"Table of Contentstransaction costs, our level of productivity and accuracy, changes in volume, size, and weight of units received and fulfilled, the extent to which third-partysellers utilize Fulfillment by Amazon services, timing of fulfillment network and physical store expansion, the extent we utilize fulfillment services providedby third parties, mix of products and services sold, and our ability to affect customer service contacts per unit by implementing improvements in our operationsand enhancements to our customer self-service features. Additionally, sales by our sellers have higher payment processing and related transaction costs as apercentage of net sales compared to our retail sales because payment processing costs are based on the gross purchase price of underlying transactions.The increase in fulfillment costs in absolute dollars in 2022, compared to the prior year, is primarily due to increased investments in our fulfillmentnetwork and variable costs corresponding with increased product and service sales volume and inventory levels, and increased wage rates and incentives.Changes in foreign exchange rates reduced fulfillment costs by $2.5 billion in 2022.We seek to expand our fulfillment network to accommodate a greater selection and in-stock inventory levels and to meet anticipated shipment volumesfrom sales of our own products as well as sales by third parties for which we provide the fulfillment services. We regularly evaluate our facility requirements.Technology and ContentTechnology and content costs include payroll and related expenses for employees involved in the research and development of new and existing productsand services, development, design, and maintenance of our stores, curation and display of products and services made available in our online stores, andinfrastructure costs. Infrastructure costs include servers, networking equipment, and data center related depreciation and amortization, rent, utilities, and otherexpenses necessary to support AWS and other Amazon businesses. Collectively, these costs reflect the investments we make in order to offer a wide variety ofproducts and services to our customers, including expenditures related to initiatives to build and deploy innovative and efficient software and electronic devicesand the development of a satellite network for global broadband service and autonomous vehicles for ride-hailing services.We seek to invest efficiently in numerous areas of technology and content so we may continue to enhance the customer experience and improve ourprocess efficiency through rapid technology developments, while operating at an ever increasing scale."
3,0.642127,"Table of Contentsfulfilled, the extent to which third-party sellers utilize Fulfillment by Amazon services, timing of fulfillment network and physical store expansion, the extentwe utilize fulfillment services provided by third parties, mix of products and services sold, and our ability to affect customer service contacts per unit byimplementing improvements in our operations and enhancements to our customer self-service features. Additionally, sales by our sellers have higher paymentprocessing and related transaction costs as a percentage of net sales compared to our retail sales because payment processing costs are based on the grosspurchase price of underlying transactions.The increase in fulfillment costs in absolute dollars in 2023, compared to the prior year, is primarily due to increased sales and investments in ourfulfillment network, partially offset by fulfillment network efficiencies. Changes in foreign exchange rates increased fulfillment costs by $52 million in 2023.We seek to expand our fulfillment network to accommodate a greater selection and in-stock inventory levels and to meet anticipated shipment volumesfrom sales of our own products as well as sales by third parties for which we provide the fulfillment services. We regularly evaluate our facility requirements.Technology and InfrastructureTechnology and infrastructure costs include payroll and related expenses for employees involved in the research and development of new and existingproducts and services, development, design, and maintenance of our stores, curation and display of products and services made available in our online stores,and infrastructure costs. Infrastructure costs include servers, networking equipment, and data center related depreciation and amortization, rent, utilities, andother expenses necessary to support AWS and other Amazon businesses. Collectively, these costs reflect the investments we make in order to offer a widevariety of products and services to our customers, including expenditures related to initiatives to build and deploy innovative and efficient software andelectronic devices and the development of a satellite network for global broadband service and autonomous vehicles for ride-hailing services.We seek to invest efficiently in numerous areas of technology and infrastructure so we may continue to enhance the customer experience and improveour process efficiency through rapid technology developments, while operating at an ever increasing scale. Our technology and infrastructure investment andcapital spending projects often support a variety of product and service offerings due to geographic expansion and the cross-functionality of our systems andoperations."
4,0.639565,"They may secure better terms from vendors, adopt more aggressive pricing, and devote more resources to technology, infrastructure,fulfillment, and marketing.Competition continues to intensify, including with the development of new business models and the entry of new and well-funded competitors, and asour competitors enter into business combinations or alliances and established companies in other market segments expand to become competitive with ourbusiness. In addition, new and enhanced technologies, including search, web and infrastructure computing services, digital content, and electronic devicescontinue to increase our competition. The Internet facilitates competitive entry and comparison shopping, which enhances the ability of new, smaller, or lesserknown businesses to compete against us. As a result of competition, our product and service offerings may not be successful, we may fail to gain or may losebusiness, and we may be required to increase our spending or lower prices, any of which could materially reduce our sales and profits.Our Expansion into New Products, Services, Technologies, and Geographic Regions Subjects Us to Additional RisksWe may have limited or no experience in our newer market segments, and our customers may not adopt our product or service offerings. These offerings,which can present new and difficult technology challenges, may subject us to claims if customers of these offerings experience, or are otherwise impacted by,service disruptions, delays, setbacks, or failures or quality issues. In addition, profitability, if any, in our newer activities may not meet our expectations, and wemay not be successful enough in these newer activities to recoup our investments in them, which investments are often significant. Failure to realize thebenefits of amounts we invest in new technologies, products, or services could result in the value of those investments being written down or written off. Inaddition, our sustainability initiatives may be unsuccessful for a variety of6"


In [72]:
new_nodes = get_retrieved_nodes(
    "What were key challenges faced by Amazon in year 2022?",
    vector_top_k=10,
    reranker_top_n=3,
    with_reranker=True,
)

IndexError: list index out of range

In [None]:
visualize_retrieved_nodes(new_nodes)