In [11]:
%pip install --upgrade --quiet azure-search-documents
%pip install --upgrade --quiet azure-identity
%pip install --upgrade --quiet pypdf
%pip install --upgrade --quiet langchain langchain-community langchain-openai python-dotenv tiktoken

Note: you may need to restart the kernel to use updated packages.


In [14]:
import os

from langchain_community.vectorstores.azuresearch import AzureSearch
from langchain_openai import AzureOpenAIEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [8]:
from dotenv import load_dotenv

# Load credentials such as VECTOR_STORE_ADDRESS and VECTOR_STORE_PASSWORD from a local .env file
load_dotenv()

True

In [4]:
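# The Azure OpenAI endpoint, API key, and API version are not passed here,
# so they are expected to come from the environment loaded above.
embeddings = AzureOpenAIEmbeddings(
    model="text-embedding-ada-002"
)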

In [10]:
index_name: str = "company-reports-analysis"
vector_store: AzureSearch = AzureSearch(
    azure_search_endpoint=os.getenv('VECTOR_STORE_ADDRESS'),
    azure_search_key=os.getenv('VECTOR_STORE_PASSWORD'),
    index_name=index_name,
    embedding_function=embeddings.embed_query,
    additional_search_client_options={"retry_total": 4}
)
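
The constructor above reads the search endpoint and key with os.getenv, which quietly returns None when a variable is missing, so a misconfigured .env file only surfaces later as a confusing connection error. A minimal sketch of a pre-flight check follows, assuming the two variable names used in this notebook. The helper name missing_env_vars is hypothetical and not part of LangChain or the Azure SDK.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env_vars(
    names: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the names that are unset or empty in the given environment.

    Hypothetical helper: `env` defaults to os.environ and exists mainly
    so the function can be exercised with a plain dict.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Variable names taken from the AzureSearch cell above.
required = ["VECTOR_STORE_ADDRESS", "VECTOR_STORE_PASSWORD"]
missing = missing_env_vars(required)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running this before building the vector store makes it clear whether a failure comes from configuration or from the service itself.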


In [16]:
pdf_files = [
    "AAPL_Report.pdf",
    "GOOG_Report.pdf",
    "MSFT_Report.pdf",
]

# Load PDF documents
docs = [PyPDFLoader(file).load() for file in pdf_files]
docs_list = [page for pdf in docs for page in pdf]  # Flatten the list of pages

# Split documents into chunks
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=500,
    chunk_overlap=0
)
doc_splits = text_splitter.split_documents(docs_list)

vector_store.add_documents(documents=doc_splits)

['ZWYyOTU4MjktZGFiMC00YjZmLTk0YjItMTk5NzMzM2Y3MmJk',
 'NmExNDNjNjAtZjFhYi00YTM1LWIyZjYtNjUwMWM2NjIzNTli',
 'ZGU2ZmFlMmMtOGQ3NC00MTIxLThlZTgtZTZmOTc1YTY2ZTE3',
 'MDhhNGQ0YTMtNjM3OS00ZjQxLTg0ZjItODkyNTJiYjEzNjcz',
 'MDUyYTgxZDUtM2MxZi00NmJkLTk1MzItMjQxMzU0YTJiN2Mw',
 'MzUxMGRjYjUtYTBmYS00N2VlLWJmYWQtMDRlODlhZTM2ZGE1',
 'YjY0MWIxNjEtOTBmYy00MGQzLThkMjEtMjgxNDUyMzM4NjFl',
 'MDg4MmNhYWItZTYyMy00OGQyLTgyNDUtM2MxYmY0YWY4NzYx',
 'ZWJhMWVhNzYtNjllOS00ZGQzLThhMjgtNmQyYjU0NGRhODNl',
 'ODA0MWNjM2UtMWNkNi00NzYzLTljY2MtMGEyYzMzODVjNzY4',
 'N2ViNTYyOTYtZWM3My00YzFmLWE0MzctNzRjY2EwZWY1ZTg0',
 'M2MxMmZmYjUtMTM2ZC00NjYzLTlmNWEtNjBiODU2NTEwNzJj',
 'NWY2MWJiNzUtYTk5OS00N2VhLTljYjMtODI5ODFlOTAyYWNl',
 'ZDZmNDI3ZTktMmY5MC00OGQ0LWEzOWYtMmY5YmZkMThiNDNk',
 'MWViZDc5YTQtZDk2Yy00NWE5LTkyYzktNjFkODRlNTFkOGVj',
 'OGQ5MDgxZjMtZTkzYS00ZTM2LTkzM2UtMzBlYTk4OWQwNWIz',
 'M2VjZTQ4ZWEtMzc0ZS00MjgwLWIyZjQtN2VlOWRlMmFkOGU2',
 'NGVlYTZkZGMtMzE5NC00ZTM5LTg2ZjAtNjY2OGE3Nzg3ZmM2',
 'NDBiYzBhZDQtY2E1My00ZGQ3LTkzMTItM2U3NThhNTZl
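
All three reports end up in one index, so it can help to know which company a chunk came from. A minimal sketch, assuming the file names keep the TICKER_Report.pdf pattern used above; ticker_from_filename is a hypothetical helper, not a LangChain function. In the loading cell, the resulting value could be written into each page's metadata dict before splitting, since split_documents carries metadata over to the chunks.

```python
from pathlib import Path


def ticker_from_filename(filename: str) -> str:
    """Extract the leading ticker from names like 'AAPL_Report.pdf'.

    Assumes the ticker is everything before the first underscore.
    """
    return Path(filename).stem.split("_", 1)[0].upper()


pdf_files = ["AAPL_Report.pdf", "GOOG_Report.pdf", "MSFT_Report.pdf"]
tickers = {name: ticker_from_filename(name) for name in pdf_files}
print(tickers)

# Possible use in the loading cell (assumption, not run here):
#   for page in PyPDFLoader(file).load():
#       page.metadata["ticker"] = ticker_from_filename(file)
```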

In [17]:
# Perform a pure vector similarity search using the search_type parameter
docs = vector_store.similarity_search(
    query="How well do you think AAPL manages its liquidity",
    k=3,
    search_type="similarity",
)
print(docs[0].page_content)

Liquidity analysis  Continued from Page 4
is unusual among U.S. non-financial corporate issuers but Apple maintains substantial cash balances. At F3Q'23, Apple had $25.3 billion of cash and $1.1 billion 
in money market funds. We expect Apple to maintain strong balances of funds with same day access relative to commercial paper maturities. The principal uses 
of excess cash and liquidity are share repurchases. Apple spent $81 billion on share repurchases in the L TM F3Q '23 period. It had $7.2 billion of term debt and 
$38 billion of non-cancelable manufacturing purchase obligations due over the next 12 months.
Profile
Apple Inc. designs, manufactures and markets mobile communication and media devices and personal computers, and sells related software, accessories and 
third-party digital content and applications. Its fiscal year ends on the last Saturday of September.
MOODY'S INVESTORS SERVICEMOODY'S INVESTORS SERVICE
APPLE INC.  AAPL
Page 5 NOVEMBER 10, 2024 Moody's Bond Report: APPL
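
Conceptually, a similarity search embeds the query and returns the chunks whose vectors lie closest to it. The sketch below illustrates that idea with cosine similarity on toy three-dimensional vectors. It is an assumption-laden illustration: the real service uses an approximate nearest-neighbor index and whatever metric the index is configured with, and the names cosine_similarity, top_k, and the toy vectors are made up for this example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    chunk_vecs: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Brute-force ranking of chunks by similarity to the query vector."""
    scored = [(cid, cosine_similarity(query_vec, v)) for cid, v in chunk_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy stand-ins for embedded chunks; real embeddings have many more dimensions.
toy_chunks = {
    "liquidity": [0.9, 0.1, 0.0],
    "profile": [0.1, 0.9, 0.0],
    "risks": [0.0, 0.2, 0.9],
}
print(top_k([1.0, 0.2, 0.0], toy_chunks, k=2))
```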

In [18]:
# Perform a hybrid search using the search_type parameter
docs = vector_store.similarity_search(
    query="How well do you think AAPL manages its liquidity",
    k=3,
    search_type="hybrid",
)
print(docs[0].page_content)

Liquidity analysis  Continued from Page 4
is unusual among U.S. non-financial corporate issuers but Apple maintains substantial cash balances. At F3Q'23, Apple had $25.3 billion of cash and $1.1 billion 
in money market funds. We expect Apple to maintain strong balances of funds with same day access relative to commercial paper maturities. The principal uses 
of excess cash and liquidity are share repurchases. Apple spent $81 billion on share repurchases in the L TM F3Q '23 period. It had $7.2 billion of term debt and 
$38 billion of non-cancelable manufacturing purchase obligations due over the next 12 months.
Profile
Apple Inc. designs, manufactures and markets mobile communication and media devices and personal computers, and sells related software, accessories and 
third-party digital content and applications. Its fiscal year ends on the last Saturday of September.
MOODY'S INVESTORS SERVICEMOODY'S INVESTORS SERVICE
APPLE INC.  AAPL
Page 5 NOVEMBER 10, 2024 Moody's Bond Report: APPL
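
In this run the hybrid search returns the same top chunk as the pure vector search. Hybrid mode combines a keyword ranking with a vector ranking, and Azure AI Search documents Reciprocal Rank Fusion (RRF) as the way it merges the two result sets. The sketch below shows the general RRF idea on made-up chunk ids. The function name, the toy rankings, and the constant k=60 (a commonly used default) are assumptions for illustration and do not reproduce the service's internals.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists: each list adds 1 / (k + rank) to an item's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical rankings of the same three chunks by keyword and by vector search.
keyword_ranking = ["liquidity", "profile", "risks"]
vector_ranking = ["profile", "risks", "liquidity"]
print(reciprocal_rank_fusion([keyword_ranking, vector_ranking]))
```

A chunk that ranks reasonably well in both lists can beat one that is first in only one of them, which is why hybrid results may differ from either ranking alone.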

In [19]:
# Run another hybrid search, this time with a question about Microsoft
docs = vector_store.similarity_search(
    query="How well do you think Microsoft is profitable",
    k=3,
    search_type="hybrid",
)
print(docs[0].page_content)

investments in AI infrastructure. 
The company had $242 billion of RPOs at quarter ended March '24, of which it expects to recognize approximately 45% into revenues in the next 12 months. 
This large revenue backlog provide very high visibility into its revenues and operating cash flow in the intermediate term. Microsoft has maintained shareholder 
returns (dividends and share repurchases) at about 75% of its pre-dividend free cash flow over time. We expect the company to maintain a highly conservative 
financial policy with a robust cash position, low financial leverage, and capital allocation largely consistent with operating cash flow generation. Commercial paper 
borrowings were $20.5 billion at F3Q '24 (down from $25.8 billion at F1Q '24), and we expect these borrowings to decline sharply over the next 12 to 18 months. 
Rapidly evolving technologies, strong competition, and regulatory challenges represent key business risks 
Microsoft's end markets are characterized by rapidly evo