## Dependencies

In [7]:
!pip install -r requirements.txt

Collecting azure.ai.formrecognizer
  Obtaining dependency information for azure.ai.formrecognizer from https://files.pythonhosted.org/packages/60/96/9496960475a578b5bd688eee6be8ec96d9a2673925fc447be9214d73a547/azure_ai_formrecognizer-3.3.2-py3-none-any.whl.metadata
  Downloading azure_ai_formrecognizer-3.3.2-py3-none-any.whl.metadata (63 kB)
Collecting msrest>=0.6.21 (from azure.ai.formrecognizer)
  Downloading msrest-0.7.1-py3-none-any.whl (85 kB)
Collecting azure-common~=1.1 (from azure.ai.formrecognizer)
  Downloading azure_common-1.1

In [27]:
from dotenv import load_dotenv
load_dotenv()

True

## Azure Storage Access 

Use `DefaultAzureCredential` to authenticate to Blob Storage. The Service Principal behind the credential must have access rights on the Azure Storage account.

In [23]:
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient

import os

account_url = os.getenv('AZURE_STORAGE_BLOB_URL')

credential = DefaultAzureCredential()

# Create the BlobServiceClient object
blob_service_client = BlobServiceClient(account_url, credential=credential)
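
`os.getenv` returns `None` when a variable is not set, so a missing entry in `.env` only surfaces later as a less obvious client error. A minimal sketch of an up-front check follows; `missing_env` is a hypothetical helper, not part of any Azure SDK, and the variable names are the ones read elsewhere in this notebook.

```python
import os


def missing_env(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Variable names used by the cells in this notebook.
REQUIRED = [
    "AZURE_STORAGE_BLOB_URL",
    "STORAGE_CONTAINER",
    "STORAGE_BLOB_1",
    "OCR_KEY",
    "OCR_ENDPOINT",
    "OPENAI_API_KEY",
    "OPENAI_API_BASE",
]

missing = missing_env(REQUIRED)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running this right after `load_dotenv()` shows which settings still need to be added before any client is created.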

### Download Blob Contents

In [69]:
import io

container = os.getenv("STORAGE_CONTAINER")
blob = os.getenv("STORAGE_BLOB_1")

blob_client = blob_service_client.get_blob_client(container=container, blob=blob)

# readinto() downloads the blob contents to a stream
stream = io.BytesIO()
blob_client.download_blob().readinto(stream)

494965
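
The number printed above is the byte count reported by `readinto()`. One thing to watch when reusing `stream` afterwards: assuming `readinto()` writes into the buffer the way an ordinary file-like write does, the stream position is left at the end, so a plain `read()` returns nothing until the stream is rewound. The sketch below illustrates this with a stand-in write rather than a real blob download.

```python
import io

# Stand-in for blob_client.download_blob().readinto(stream):
# writing into the buffer leaves the position at the end.
stream = io.BytesIO()
written = stream.write(b"%PDF-1.4 example bytes")

at_end = stream.read()      # position is at the end, so this is empty
stream.seek(0)              # rewind before handing the stream to a reader
from_start = stream.read()  # now the full contents come back
whole = stream.getvalue()   # getvalue() ignores the position entirely
```

Either `stream.seek(0)` or `stream.getvalue()` would make the downloaded bytes available to a later step without downloading the blob again.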

## Azure Document Intelligence

Send the document bytes to the Document Analysis API. On the free tier of Azure Document Intelligence, only the first 2 pages are returned.

In [70]:
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

key = os.getenv('OCR_KEY')
endpoint = os.getenv('OCR_ENDPOINT')

document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
)

model_id = "prebuilt-read"

document = blob_client.download_blob()

poller = document_analysis_client.begin_analyze_document(model_id, document.readall())
result = poller.result()

In [71]:
for page in result.pages:
    print("----Analyzing layout from page #{}----".format(page.page_number))
    print(
        "Page has width: {} and height: {}, measured with unit: {}".format(
            page.width, page.height, page.unit
        )
    )

    for line in page.lines:
        print(line.content)

----Analyzing layout from page #1----
Page has width: 8.2639 and height: 11.6944, measured with unit: inch
Chapter 10
Introduction to quantum
mechanics
David Morin, morin@physics.harvard.edu
This chapter gives a brief introduction to quantum mechanics. Quantum mechanics can be
thought of roughly as the study of physics on very small length scales, although there are
also certain macroscopic systems it directly applies to. The descriptor "quantum" arises
because in contrast with classical mechanics, certain quantities take on only discrete values.
However, some quantities still take on continuous values, as we'll see.
In quantum mechanics, particles have wavelike properties, and a particular wave equa-
tion, the Schrodinger equation, governs how these waves behave. The Schrodinger equation
is different in a few ways from the other wave equations we've seen in this book. But these
differences won't keep us from applying all of our usual strategies for solving a wave equation
and dealing 
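
The output above keeps the line breaks of the page, including words split by a hyphen ("equa-" / "tion"). If the text is reassembled line by line before further processing, those breaks can be merged. A minimal sketch, assuming a trailing hyphen always marks a word broken across lines (which would wrongly merge genuine compounds such as "well-" / "known"); `join_lines` is a hypothetical helper.

```python
def join_lines(lines):
    """Join OCR lines into one string, merging words split by a trailing hyphen."""
    parts = []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if parts and parts[-1].endswith("-"):
            # Assumption: a trailing hyphen marks a word broken across lines.
            parts[-1] = parts[-1][:-1] + text
        else:
            parts.append(text)
    return " ".join(parts)


sample = [
    "In quantum mechanics, particles have wavelike properties, and a particular wave equa-",
    "tion, the Schrodinger equation, governs how these waves behave.",
]
joined = join_lines(sample)
```

With the analysis result from the cell above, it could be called as `join_lines(line.content for line in page.lines)`.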

## Create Embeddings

From the Document Analysis API result, we create embeddings with Azure OpenAI.

In [None]:
import openai

openai_api_key = os.getenv('OPENAI_API_KEY')
openai_api_base = os.getenv('OPENAI_API_BASE')

openai.api_type = "azure"
openai.api_key = openai_api_key
openai.api_base = openai_api_base
openai.api_version = "2023-05-15"

response = openai.Embedding.create(
    input=result.content,
    engine="text-embedding-ada-002"
)
embeddings = response['data'][0]['embedding']
print(embeddings)
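
The call above sends the whole document text as a single input. `text-embedding-ada-002` accepts at most 8191 tokens per input, so a longer document would need to be split first. The sketch below splits by character count with a small overlap; `chunk_text` is a hypothetical helper, and the default of 8000 characters is an assumed conservative stand-in for a real token count, not a value from the API.

```python
def chunk_text(text, max_chars=8000, overlap=200):
    """Split text into character windows that overlap by `overlap` characters."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    step = max_chars - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


example = chunk_text("abcdefghij", max_chars=6, overlap=2)
```

Each chunk could then be passed as `input` in its own embedding request, giving one vector per chunk instead of one per document. Counting tokens with a tokenizer would be more precise than counting characters.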