# Function calling and structured information extraction

Function calling, also known as tool use, extends the capabilities of a generative AI model. It lets the model draw on external sources of information, or perform a specific task in a structured way (for example, structuring its output to match a predefined JSON schema).

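In Amazon Bedrock, the model doesn't call the tool directly. Rather, when you send a message to a model, you also supply definitions for one or more tools that could help the model generate a response. If the model decides a tool is needed, it responds with a request to use that tool, including the input parameters; your application runs the tool and sends the result back to the model.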

Here is an example of a Tool Definition that would be used to get the current weather for a given location:

```
{
    "toolSpec": {
        "name": "get_weather",
        "description": "Get the current weather for a given location, based on its WGS84 coordinates.",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "latitude": {
                        "type": "string",
                        "description": "Geographical WGS84 latitude of the location."
                    },
                    "longitude": {
                        "type": "string",
                        "description": "Geographical WGS84 longitude of the location."
                    }
                },
                "required": ["latitude", "longitude"]
            }
        }
    }
}
```

Multiple tools can be used together to perform a more complex task. For example, you could use the `get_weather` tool in conjunction with a `search_weather_news` tool to get a comprehensive view of the weather at the desired location. When multiple tools are orchestrated together, we refer to them as a tool chain. When we combine tool chains with the reasoning capabilities of a generative AI model, we refer to the result as agentic behavior.
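
When several tools are available, the application needs a way to route each tool request from the model to the right function. The following is a minimal sketch of one way to do that, assuming each request arrives as a `toolUse` block with `name`, `toolUseId`, and `input` fields. The two stub functions, `TOOL_REGISTRY`, and `dispatch_tool` are hypothetical names used for illustration; the stubs return placeholder data and do not call any real service.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-ins for real tool implementations (placeholder data only).
def get_weather_stub(tool_input: Dict[str, Any]) -> Dict[str, Any]:
    return {
        "latitude": tool_input["latitude"],
        "longitude": tool_input["longitude"],
        "summary": "placeholder",
    }

def search_weather_news_stub(tool_input: Dict[str, Any]) -> Dict[str, Any]:
    return {"query": tool_input.get("query", ""), "headlines": []}

# Map each tool name from the tool specs to the function that implements it.
TOOL_REGISTRY: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    "get_weather": get_weather_stub,
    "search_weather_news": search_weather_news_stub,
}

def dispatch_tool(tool_use: Dict[str, Any]) -> Dict[str, Any]:
    """Run the function named in a toolUse block and wrap the outcome as a toolResult."""
    tool_use_id = tool_use["toolUseId"]
    handler = TOOL_REGISTRY.get(tool_use["name"])
    if handler is None:
        message = f"Unknown tool: {tool_use['name']}"
        return {"toolUseId": tool_use_id, "content": [{"text": message}], "status": "error"}
    try:
        result = handler(tool_use.get("input", {}))
    except Exception as exc:
        # Report the failure to the model instead of crashing the application.
        message = f"{type(exc).__name__}: {exc}"
        return {"toolUseId": tool_use_id, "content": [{"text": message}], "status": "error"}
    return {"toolUseId": tool_use_id, "content": [{"json": result}], "status": "success"}
```

Keeping the routing in a dictionary means adding a tool only requires a new tool spec and a new registry entry.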

We explore tool use in Amazon Bedrock below.

In [None]:
!pip install pdf2image
!pip install pillow
!sudo yum install -y poppler-utils

In [None]:
# restart kernel
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

In [None]:
import requests
import boto3
import json
import os
from pdf2image import convert_from_path
from PIL import Image

session = boto3.Session()
region = session.region_name
bedrock = session.client(service_name='bedrock-runtime', region_name=region)

### Lab 1: Tool use in Amazon Bedrock

We first define the tool specification and then implement the logic to retrieve the weather data from the Open-Meteo API.

In [None]:
get_weather_tool_spec = {
    "toolSpec": {
        "name": "get_weather",
        "description": "Get the current weather for a given location, based on its WGS84 coordinates.",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "latitude": {
                        "type": "string",
                        "description": "Geographical WGS84 latitude of the location.",
                    },
                    "longitude": {
                        "type": "string",
                        "description": "Geographical WGS84 longitude of the location.",
                    },
                },
                "required": ["latitude", "longitude"],
            }
        },
    }
}

In [None]:
def get_weather(tool_use_id, input_data):
    """
    Fetches weather data for the given latitude and longitude using the Open-Meteo API.
    Returns a toolResult dict holding the weather data, or an error message if the request fails.

    :param tool_use_id: The toolUseId of the model's tool request, echoed back in the result.
    :param input_data: The input data containing the latitude and longitude.
    :return: A toolResult dict with status "success" or "error".
    """
    endpoint = "https://api.open-meteo.com/v1/forecast"
    latitude = input_data.get("latitude")
    longitude = input_data.get("longitude")
    params = {"latitude": latitude, "longitude": longitude, "current_weather": True}

    try:
        response = requests.get(endpoint, params=params, timeout=10)
        # Check the HTTP status before parsing the body.
        response.raise_for_status()
        weather_data = {"weather_data": response.json()}
        return {"toolUseId": tool_use_id, "content": [{"json": weather_data}], "status": "success"}
    except Exception as e:
        # Return a JSON-serializable toolResult so the error can be sent back to the model.
        error_text = f"{type(e).__name__}: {e}"
        return {"toolUseId": tool_use_id, "content": [{"text": error_text}], "status": "error"}

Then, we can orchestrate the tool use through the Amazon Bedrock Converse API with the function and tool spec we just created.

In [None]:
query = "What's the weather like in Seattle now?"

In [None]:
message = [
    {
        "role": "user",
        "content": [
            {"text": query}
        ]
    }
]

system_prompt = """You are a weather assistant that provides current weather data for user-specified locations using only
the get_weather tool, which expects latitude and longitude. Infer the coordinates from the location yourself."""

try:
    response = bedrock.converse(
        modelId="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
        messages=message,
        inferenceConfig={"maxTokens": 4096, "temperature": 0.0},
        toolConfig={"tools": [get_weather_tool_spec]},
        system=[{"text": system_prompt}]
    )
except Exception as e:
    print("Error invoking Bedrock Converse API:", str(e))
    raise

response_message = response.get("output", {}).get("message", {})
print("\nResponse from the model:")
print(json.dumps(response_message, indent=4))

message.append(response_message)

for content_block in response_message.get("content", []):
    if "toolUse" in content_block:
        tool_use_block = content_block["toolUse"]
        print("\nTool use triggered:")
        print(json.dumps(tool_use_block, indent=4))

        tool_input = tool_use_block.get("input", {})
        tool_result = get_weather(tool_use_block.get("toolUseId", "id"), tool_input)
        print("\nTool result:")
        print(json.dumps(tool_result, indent=4))
        break

In [None]:
tool_user_return = {"role": "user", "content": [{"toolResult": tool_result}]}
message.append(tool_user_return)

print(json.dumps(message, indent=2))

In [None]:
try:
    response = bedrock.converse(
        modelId="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
        messages=message,
        inferenceConfig={"maxTokens": 4096, "temperature": 0.0},
        toolConfig={"tools": [get_weather_tool_spec]},
        system=[{"text": system_prompt}]
    )
except Exception as e:
    print("Error invoking Bedrock Converse API:", str(e))
    raise

print(response['output']['message']['content'][0]['text'])
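
The cells above walk through a single round trip by hand: call the model, run the tool, send the result back, and call the model again. The same pattern can be sketched as a loop that repeats until the model stops asking for tools. This is a minimal sketch and not part of the lab code. It assumes the response carries a `stopReason` field that equals `"tool_use"` when the model requests a tool. `run_tool_loop`, `converse_fn`, `handle_tool_use`, and `max_turns` are hypothetical names introduced here.

```python
from typing import Any, Callable, Dict, List

def run_tool_loop(
    converse_fn: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
    messages: List[Dict[str, Any]],
    handle_tool_use: Callable[[Dict[str, Any]], Dict[str, Any]],
    max_turns: int = 5,
) -> Dict[str, Any]:
    """Call the model repeatedly, answering tool requests, until it returns a final message."""
    for _ in range(max_turns):
        response = converse_fn(messages)
        assistant_message = response["output"]["message"]
        messages.append(assistant_message)

        # Any stop reason other than "tool_use" means the model is done with tools.
        if response.get("stopReason") != "tool_use":
            return assistant_message

        # Answer every toolUse block in one user message of toolResult blocks.
        tool_results = [
            {"toolResult": handle_tool_use(block["toolUse"])}
            for block in assistant_message["content"]
            if "toolUse" in block
        ]
        messages.append({"role": "user", "content": tool_results})

    raise RuntimeError("Tool loop did not finish within max_turns")
```

In this notebook, `converse_fn` could be a small wrapper that calls `bedrock.converse` with the model ID, tool config, and system prompt used above, and `handle_tool_use` could be `lambda tu: get_weather(tu["toolUseId"], tu["input"])`. The `max_turns` cap is a safety measure so a misbehaving exchange cannot loop forever.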

### Lab 2: Structured Information Extraction out of Unstructured Data (text)

Note that tool use in Amazon Bedrock first extracts parameters from your query based on the tool spec. In the example above, given the location `Seattle` in the user's query, the model produced the two values the tool spec asks for: latitude and longitude.
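
Because the parameters are generated by the model, it can be worth checking them against the tool spec before using them. The sketch below shows one possible check, under the assumption that the tool input arrives as a plain dictionary. `weather_tool_spec` is a trimmed copy of the spec repeated so the example runs on its own; `missing_required_fields` and `parse_coordinates` are hypothetical helpers, and the coordinates in the usage lines are hand-written illustrative values, not model output.

```python
from typing import Any, Dict, List, Tuple

# Trimmed copy of the weather tool spec, repeated here so the sketch runs on its own.
weather_tool_spec = {
    "toolSpec": {
        "name": "get_weather",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "latitude": {"type": "string"},
                    "longitude": {"type": "string"},
                },
                "required": ["latitude", "longitude"],
            }
        },
    }
}

def missing_required_fields(tool_spec: Dict[str, Any], tool_input: Dict[str, Any]) -> List[str]:
    """List the required fields of the tool spec that are absent from the tool input."""
    schema = tool_spec["toolSpec"]["inputSchema"]["json"]
    return [name for name in schema.get("required", []) if name not in tool_input]

def parse_coordinates(tool_input: Dict[str, Any]) -> Tuple[float, float]:
    """Convert the string coordinates to floats and reject out-of-range values."""
    latitude = float(tool_input["latitude"])
    longitude = float(tool_input["longitude"])
    if not -90.0 <= latitude <= 90.0:
        raise ValueError(f"Latitude out of range: {latitude}")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError(f"Longitude out of range: {longitude}")
    return latitude, longitude

# Illustrative, hand-written input shaped like a toolUse "input" field.
example_input = {"latitude": "47.6062", "longitude": "-122.3321"}
print(missing_required_fields(weather_tool_spec, example_input))
print(parse_coordinates(example_input))
```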

We can use this property to extract structured information from unstructured data. In this part, we extract data from a text passage: the Attending Physician Statement (APS) of a patient named Michael Cho.

In [None]:
aps_statement = """I am writing in response to your request for an Attending Physician Statement (APS) regarding 
Mr. Michael Cho's medical history, current condition, and treatment during his recent hospital 
admission for Acute Coronary Syndrome (ACS).  
Patient Information:  
• Patient Name:  Michael Cho  
• Date of Birth:  August 12, 1967  
• Gender:  Male  
• Medical Record Number:  202300123  
Medical History:  
Mr. Cho has a well -documented medical history that includes the following:  
• Hypertension:  Diagnosed in 2010, and his blood pressure has been well -controlled with 
Lisinopril 10 mg daily.  
• Hyperlipidemia:  Diagnosed in 2015, and managed with Atorvastatin 20 mg daily.  
• Type 2 Diabetes:  Diagnosed in 2013, controlled with Metformin 1000 mg twice daily.  
Mr. Cho has also undergone two surgeries in the past:  
• Appendectomy in 2005  
• Cholecystectomy in 2010  
Current Condition:  
Mr. Cho was admitted to St. John's Medical Center on November 8, 2023, with a chief 
complaint of severe chest pain. The pain, described as a crushing sensation, was rated at 8/10 and 
had been ongoing for approximately 2 hours. He also reported symptoms of shortness of breath 
and diaphoresis.  
Upon admission, Mr. Cho's vital signs were as follows:  
• Blood Pressure:  150/90 mmHg  
• Heart Rate:  110 bpm  
• Respiratory Rate:  18 bpm  
• Temperature:  98.6°F (37°C)  
Assessment and Plan:  
Upon admission, Mr. Cho underwent the following assessments and interventions:  
• Continuous cardiac monitoring to assess for arrhythmias or ST -segment changes.  
• Supplemental oxygen to maintain oxygen saturation above 95%.  
• Analgesia for chest pain relief.  
• Medications administered included aspirin 325 mg chewed immediately and nitroglycerin 
for chest pain relief as needed.  
Laboratory tests included cardiac enzyme (troponin) measurements, complete blood count 
(CBC), and comprehensive metabolic panel (CMP). A chest X -ray was performed to rule out 
other potential causes of chest pain.  
Consultation with the cardiology team was requested for further evaluation. The need for a 
coronary angiogram will be determined based on the initial assessment.  
Follow -up Plan:  
Mr. Cho's progress will be closely monitored, and further interventions will be determined 
based on test results and clinical assessment. The patient's family, particularly his spouse Priya 
Cho, will be kept informed of any changes in his condition.  
For any additional details or clarification, please do not hesitate to contact me at [Your Contact 
Information].  
Sincerely,  
Dr. John A. Smith, MD Attending Physician St. John's Medical Center  
 
Financial Information Report  
Applicant Information:  
• Applicant Name:  Michael Cho  
• Date of Birth:  August 12, 1967  
• Occupation:  Accountant  
• Insurance Requested:  Life Insurance  
Income and Tax Information (Last 3 Years):  
1. Tax Year:  2022  
o Gross Income:  $95,000  
o Adjusted Gross Income:  $85,000  
o Taxable Income:  $75,000  
o Total Tax Paid:  $12,000  
o Tax Filing Status:  Married Filing Jointly  
2. Tax Year:  2021  
o Gross Income:  $92,000  
o Adjusted Gross Income:  $83,000  
o Taxable Income:  $72,000  
o Total Tax Paid:  $11,500  
o Tax Filing Status:  Married Filing Jointly  
3. Tax Year:  2020  
o Gross Income:  $90,000  
o Adjusted Gross Income:  $80,000  
o Taxable Income:  $70,000  
o Total Tax Paid:  $11,000  
o Tax Filing Status:  Married Filing Jointly  
Financial Statements (Last 2 Years):  
• Year:  2022  
o Total Assets:  $550,000  
▪ Cash and Savings:  $100,000  
▪ Investments (e.g., stocks, bonds):  $250,000  
▪ Real Estate (market value):  $200,000  
o Total Liabilities:  $200,000  
▪ Mortgage:  $150,000  
▪ Outstanding Loans:  $50,000  
o Net Worth:  $350,000  
• Year:  2021  
o Total Assets:  $520,000  
▪ Cash and Savings:  $95,000  
▪ Investments (e.g., stocks, bonds):  $245,000  
▪ Real Estate (market value):  $180,000  
o Total Liabilities:  $190,000  
▪ Mortgage:  $145,000  
▪ Outstanding Loans:  $45,000  
o Net Worth:  $330,000  
Additional Financial Information:  
• Employer:  ABC Accounting Firm  
• Occupational Status:  Full-time employee  
• Annual Employee Benefits:  Includes health insurance coverage through the employer  
• Other Sources of Income:  None reported  
• Monthly Mortgage/Rent Payment:  $1,200  
Comments and Recommendations:  
Mr. Michael Cho is an accountant with a stable and documented employment history. His annual 
income and financial statements indicate a consistent ability to cover insurance premiums and 
fulfill financial obligations.  
Based on his financial stability, the applicant appears well -positioned to meet the financial 
responsibilities associated with the requested life insurance policy. The net worth and absence of 
significant liabilities further suggest financial prudence.  
The underwriter may consider offering the requested life insurance policy based on the 
applicant's strong financial profile.  
 
Main Street Life Insurance - Inspection Report  
Applicant Information:  
• Full Name:  Michael Cho  
• Date of Birth:  August 12, 1967  
• Gender:  Male  
• Occupation:  Accountant  
• Address:  123 Main Street, Anytown, USA  
• Email Address:  Michael.Cho@email.com  
• Phone Number:  (555) 123 -4567  
Inspection Date:  
• Date of Inspection:  [Date of Inspection]  
Inspector Information:  
• Inspector Name:  John Davis  
• Inspector ID:  12345  
• Contact Information:  [Inspector's Contact Information]  
Observations:  
Physical Measurements:  
• Height:  5 feet 10 inches (178 cm)  
• Weight:  175 lbs (79.4 kg)  
• Body Mass Index (BMI):  25.1 (Within the normal range)  
• Blood Pressure:  130/80 mmHg (Within normal range)  
• Resting Heart Rate:  70 bpm  
Health Observations:  
• Mr. Cho appears to be in good general health, with no apparent signs of acute illness or 
distress.  
• No visible skin conditions, wounds, or scars were observed.  
• He does not exhibit any mobility issues or physical limitations.  
Lifestyle Observations:  
• Mr. Cho appears to maintain a healthy lifestyle. He does not smoke or use tobacco 
products.  
• He reports consuming alcohol occasionally but not to excess.  
• The applicant does not appear to engage in high -risk hobbies or activities.  
Medical History Verification:  
• Mr. Cho confirmed that he has been diagnosed and treated for hypertension, 
hyperlipidemia, and type 2 diabetes as per his application.  
• He takes Lisinopril, Atorvastatin, and Metformin regularly, which is consistent with his 
reported medications.  
Samples Collection:  
• Blood and urine samples were collected from the applicant for further medical 
assessment.  
Applicant's Comments:  
• Mr. Cho had no additional comments or concerns to raise during the inspection.  
Inspector's Recommendations:  
Based on the inspection and observations, Mr. Michael Cho appears to be in good health and 
adheres to a healthy lifestyle. His stated medical history and medication usage align with the 
information provided in the application.  
Inspector's Signature:  [Inspector's Signature]  
Date of Inspection Report:  [Date of Inspection]  
 
Main Street Life Insurance - Life Insurance Application  
Applicant Information:  
• Full Name:  Michael Cho  
• Date of Birth:  August 12, 1967  
• Gender:  Male  
• Occupation:  Accountant  
• Address:  123 Main Street, Anytown, USA  
• Email Address:  Michael.Cho@email.com  
• Phone Number:  (555) 123 -4567  
Financial Information:  
• Annual Gross Income:  $95,000  
• Existing Insurance Coverage:  None  
Insurance Requested:  
• Type of Insurance:  Life Insurance  
• Coverage Amount:  $500,000  
Reason for Purchasing Insurance:  
• Primary Beneficiary:  Spouse, Priya Cho  
• Secondary Beneficiary:  Children, Aiden and Maya Cho  
Health Questions:  
1. Have you been diagnosed with or treated for any heart condition, stroke, or cancer 
in the past 5 years?  No 
2. Do you currently have any chronic or major medical conditions, such as diabetes or 
hypertension?  No 
3. Are you taking any prescription medications regularly?  Yes (Lisinopril, Atorvastatin, 
and Metformin)  
4. In the past 5 years, have you been hospitalized or had surgery?  Yes (Appendectomy 
in 2020)  
5. Do you smoke or use tobacco products?  No 
6. Do you participate in high -risk hobbies or activities, such as extreme sports?  No 
Additional Information:  
• Preferred Payment Method:  Monthly bank auto -debit  
• Preferred Contact Time:  Anytime  
• Preferred Language for Policy Documents:  English  
• Any Additional Comments or Preferences:  None  
Declaration:  
I hereby declare that the information provided in this application is true and accurate to the best 
of my knowledge. I understand that any false statements or omissions may affect the validity of 
my insurance policy.  
Applicant's Signature:  [Signature]  
Date of Application:  [Date of Application]  
 
Medical Information Bureau (MIB) Report  
Applicant Information:  
• Full Name:  Michael Cho  
• Date of Birth:  August 12, 1967  
• Gender:  Male  
• Address:  123 Main Street, Anytown, USA  
Report Date:  
• Date of Report:  [Date of MIB Report]  
Summary:  
This MIB report provides information on prior life, health, or disability insurance applications 
made by the applicant, Michael Cho.  
Details:  
1. Insurance Company:  ABC Insurance Company  
o Application Date:  January 15, 2021  
o Policy Type:  Term Life Insurance  
o Policy Amount:  $500,000  
o Outcome:  Approved  
2. Insurance Company:  XYZ Health Insurance  
o Application Date:  March 22, 2019  
o Policy Type:  Health Insurance  
o Outcome:  Approved  
3. Insurance Company:  LMN Disability Insurance  
o Application Date:  June 7, 2018  
o Policy Type:  Disability Insurance  
o Outcome:  Approved  
Notes:  
• Mr. Cho has a history of successfully applying for insurance policies with different 
insurance companies. All prior applications were approved, suggesting no major adverse 
underwriting findings.  
• No records of declined or pending insurance applications were found in the MIB database 
for this applicant.  
MIB Reference Number:  [MIB Reference Number]  
Report Prepared By:  
• MIB Investigator:  Jane Smith  
• Contact Information:  [Investigator's Contact Information]  
 
Medical Records for Michael Cho  
Patient Information:  
• Patient Name:  Michael Cho  
• Date of Birth:  August 12, 1967  
• Gender:  Male  
• Medical Record Number:  202300123  
Admission Details:  
• Date of Admission:  November 8, 2023  
• Admitting Physician:  Dr. Sarah Patel  
• Admitting Diagnosis:  Acute Coronary Syndrome (ACS)  
• Room Number:  305 
Medical History:  
Mr. Cho's medical history includes:  
• Hypertension: Diagnosed in 2010, well -controlled with Lisinopril 10 mg daily.  
• Hyperlipidemia: Diagnosed in 2015, managed with Atorvastatin 20 mg daily.  
• Type 2 Diabetes: Diagnosed in 2013, controlled with Metformin 1000 mg twice daily.  
• Surgical history includes an appendectomy in 2005 and a cholecystectomy in 2010.  
Presenting Complaint:  
Mr. Michael Cho presented to the emergency department on November 8, 2023, with a severe, 
crushing, substernal chest pain radiating to the left arm, associated with diaphoresis and 
shortness of breath. The pain began approximately 2 hours before admission.  
Vital Signs on Admission:  
• Blood Pressure:  150/90 mmHg  
• Heart Rate:  110 bpm  
• Respiratory Rate:  18 bpm  
• Temperature:  98.6°F (37°C)  
Assessment and Plan:  
Upon admission, Mr. Cho underwent a comprehensive evaluation:  
1. Cardiac Monitoring:  
o Continuous ECG monitoring: Sinus rhythm with occasional premature ventricular 
contractions (PVCs).  
o QT interval monitoring.  
2. Oxygen Therapy:  
o Supplemental oxygen at 2 liters per minute to maintain oxygen saturation above 
95%.  
3. Pain Management:  
o Intravenous morphine sulfate: 2 mg administered to alleviate chest pain.  
4. Medications:  
o Aspirin 325 mg: Chewed immediately upon arrival.  
o Nitroglycerin: Administered sublingually as needed for chest pain.  
5. Laboratory Tests:  
o Cardiac enzymes (troponin): Ordered at admission and to be repeated in 3 -hour 
intervals.  
o Complete blood count (CBC).  
o Comprehensive metabolic panel (CMP).  
6. Imaging:  
o Chest X -ray: Performed to assess for acute pulmonary or other pathology related 
to chest pain.  
7. Consultations:  
o Cardiology consultation requested for further evaluation.  
o Discussion regarding potential coronary angiogram.  
Treatment Summary:  
Mr. Cho was admitted for further evaluation of his chest pain, with a primary concern for 
ACS. Initial interventions included supplemental oxygen, pain management, and administration 
of aspirin and nitroglycerin. Continuous cardiac monitoring and serial  cardiac enzyme testing 
were implemented to assess cardiac status.  
Follow -up Plan:  
The patient's progress was closely monitored, and further interventions were planned based on 
test results and clinical assessment. Patient and family education regarding the diagnosis, 
treatment plan, and potential need for a coronary angiogram was provided. 
 
Motor Vehicle Record (MVR)  
Driver Information:  
• Driver Name:  Michael Cho  
• Date of Birth:  August 12, 1967  
• Driver's License Number:  DL123456789  
• State of Issuance:  Illinois  
Driving History:  
• Driving Experience:  30 years  
• Date of MVR Request:  [Date of MVR Request]  
License Status:  
• Current License Status:  Valid  
• License Expiration Date:  August 12, [Year of Expiration]  
• License Class:  Class C  
Traffic Violations and Convictions (Past 5 Years):  
1. Date of Violation:  [Date of Violation]  
o Violation:  Speeding (Exceeding posted speed limit by 10 -15 mph)  
o Citation Number:  12345  
o Penalty:  Fine of $150  
o Points:  3 
2. Date of Violation:  [Date of Violation]  
o Violation:  Failure to Stop at a Red Light  
o Citation Number:  54321  
o Penalty:  Fine of $200  
o Points:  5 
Accidents (Past 5 Years):  
1. Date of Accident:  [Date of Accident]  
o Type of Accident:  Rear -end Collision  
o At Fault:  Yes 
o Property Damage:  $3,000  
o Injuries:  No 
2. Date of Accident:  [Date of Accident]  
o Type of Accident:  Not-at-Fault  
o Property Damage:  $2,500  
o Injuries:  Minor, No Hospitalization  
DUI/DWI Convictions (Lifetime):  
• DUI/DWI Convictions:  None  
License Suspensions/Revocations (Lifetime):  
• License Suspensions:  None  
Driving Record Notes:  
Mr. Michael Cho holds a valid Class C driver's license issued by the state of Illinois. His driving 
history for the past 5 years includes two traffic violations. The most recent violation was for 
failure to stop at a red light, resulting in a fine of $200 and 5 points on his license. He also had a 
prior speeding violation with a fine of $150 and 3 points.  
In the past 5 years, Mr. Cho has been involved in two accidents. One was a rear -end collision 
for which he was deemed at fault, resulting in $3,000 in property damage and no injuries. The 
second accident was not at fault, with minor property damage of $ 2,500 and minor injuries that 
did not require hospitalization.  
Mr. Cho has no DUI/DWI convictions and no license suspensions or revocations in his 
lifetime.  
Comments and Recommendations:  
While Mr. Cho has a couple of traffic violations and accidents in his recent driving history, it 
is noted that he has no DUI/DWI convictions or license suspensions/revocations. We recommend 
monitoring his driving record for future incidents and periodically reviewing his insurance rates 
based on his driving history.  
 
Prescription History  
Patient Information:  
• Patient Name:  Michael Cho  
• Date of Birth:  August 12, 1967  
• Patient ID:  202300123  
Prescription Medications (Last 5 Years):  
1. Medication:  Lisinopril  
o Prescribing Physician:  Dr. Jane Miller  
o Prescription Date:  January 15, 2018  
o Indication:  Hypertension  
o Dosage:  10 mg daily  
o Duration:  Ongoing  
o Pharmacy:  ABC Pharmacy  
2. Medication:  Atorvastatin  
o Prescribing Physician:  Dr. Emily Clark  
o Prescription Date:  March 22, 2018  
o Indication:  Hyperlipidemia  
o Dosage:  20 mg daily  
o Duration:  Ongoing  
o Pharmacy:  XYZ Pharmacy  
3. Medication:  Metformin  
o Prescribing Physician:  Dr. Michael Brown  
o Prescription Date:  June 7, 2018  
o Indication:  Type 2 Diabetes  
o Dosage:  1000 mg twice daily  
o Duration:  Ongoing  
o Pharmacy:  Medicare Pharmacy  
4. Medication:  Aspirin  
o Prescribing Physician:  Dr. Emily Clark  
o Prescription Date:  January 15, 2018  
o Indication:  Cardiovascular prophylaxis  
o Dosage:  81 mg daily  
o Duration:  Ongoing  
o Pharmacy:  XYZ Pharmacy  
5. Medication:  Nitroglycerin  
o Prescribing Physician:  Dr. Sarah Patel (ER)  
o Prescription Date:  November 8, 2023  
o Indication:  Chest pain  
o Dosage:  As needed  
o Duration:  As needed  
o Pharmacy:  St. John's Hospital Pharmacy  
Prescription History Notes:  
Mr. Michael Cho has a documented prescription history, with the most recent addition being 
nitroglycerin, which was prescribed on November 8, 2023, during his emergency room visit for 
chest pain. This medication is intended for as-needed use to relieve chest pain.  
In the past five years, Mr. Cho has consistently taken the following medications:  
• Lisinopril for hypertension  
• Atorvastatin for hyperlipidemia  
• Metformin for type 2 diabetes  
• Aspirin for cardiovascular prophylaxis  
These medications have been prescribed by various healthcare providers and are being managed 
by different pharmacies.  
Comments and Recommendations:  
The patient's prescription history indicates adherence to medications for chronic conditions such 
as hypertension, hyperlipidemia, and type 2 diabetes. It is essential to ensure continued 
compliance with these medications for the management of his medical conditions.  
The recent addition of nitroglycerin, prescribed during his emergency room visit, may indicate 
an acute cardiovascular concern. It is recommended to monitor the use of nitroglycerin and 
coordinate with the patient's healthcare providers for a comprehensive  evaluation of his cardiac 
health.  """

We first need to define the tool spec, based on the information we need to extract.

In [None]:
extract_aps_tool_spec = {
    "toolSpec": {
        "name": "extract_aps",
        "description": "Extract information out of Attending Physician Statement (APS).",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "patient_name": {
                        "type": "string",
                        "description": "Name of the patient",
                    },
                    "gender": {
                        "type": "string",
                        "description": "Gender of patient",
                    },
                    "birthday": {
                        "type": "string",
                        "description": "Birthday of patient in MM/DD/YYYY format",
                    },
                    "occupation": {
                        "type": "string",
                        "description": "Current occupation of patient",
                    },
                    "marital_status": {
                        "type": "string",
                        "description": "Marital status of patient",
                        "enum": ["Married", "Single", "Divorced"],
                    },
                    "medical_record_number": {
                        "type": "integer",
                        "description": "Medical record number of APS form",
                    },
                    "physician_name": {
                        "type": "string",
                        "description": "Name of the physician who wrote the APS form",
                    },
                    "life_insurance_provider": {
                        "type": "string",
                        "description": "Name of the life insurance provider",
                    },
                    "email_address": {
                        "type": "string",
                        "description": "Patient's email address",
                    },
                    "height": {
                        "type": "string",
                        "description": "Patient's height in inches",
                    },
                    "weight": {
                        "type": "string",
                        "description": "Patient's weight in lbs",
                    },
                    "income": {
                        "type": "number",
                        "description": "Patient's annual income",
                    },
                    "health_insurance_provider": {
                        "type": "string",
                        "description": "Name of health insurance provider",
                    },
                    "disability_insurance_provider": {
                        "type": "string",
                        "description": "Name of disability insurance provider",
                    },
                    "diagnosis": {
                        "type": "string",
                        "description": "Patient's diagnosis",
                    },
                    "medication_history": {
                        "type": "array",
                        "description": "List of medications",
                        "items": {
                            "type": "object",
                            "properties": {
                                "medication_name": {
                                    "type": "string",
                                    "description": "Name of the medication",
                                },
                                "prescribing_physician": {
                                    "type": "string",
                                    "description": "Name of the prescribing physician",
                                },
                                "prescription_date": {
                                    "type": "string",
                                    "description": "Date when medication was prescribed",
                                },
                                "indication": {
                                    "type": "string",
                                    "description": "Medical condition for which medication was prescribed",
                                },
                                "dosage": {
                                    "type": "string",
                                    "description": "Medication dosage and frequency",
                                },
                                "duration": {
                                    "type": "string",
                                    "description": "Duration of medication treatment",
                                },
                                "pharmacy": {
                                    "type": "string",
                                    "description": "Pharmacy where medication was filled",
                                },
                            },
                            "required": [
                                "medication_name",
                                "prescribing_physician",
                                "prescription_date",
                                "indication",
                                "dosage",
                                "duration",
                                "pharmacy",
                            ],
                        },
                    },
                },
                "required": [
                    "patient_name",
                    "gender",
                    "birthday",
                    "occupation",
                    "marital_status",
                    "medical_record_number",
                    "physician_name",
                    "life_insurance_provider",
                    "email_address",
                    "height",
                    "weight",
                    "income",
                    "health_insurance_provider",
                    "disability_insurance_provider",
                    "diagnosis",
                    "medication_history",
                ],
            }
        },
    }
}


Now, we use the Converse API with the tool spec we defined to extract structured information from the text passage.

In [None]:
message = [
    {
        "role": "user",
        "content": [
            {"text": aps_statement}
        ]
    }
]

system_prompt = "Accurately extract all the information based on the given context."

try:
    response = bedrock.converse(
        modelId="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
        messages=message,
        inferenceConfig={"maxTokens": 4096, "temperature": 0.0},
        toolConfig={"tools": [extract_aps_tool_spec]},
        system=[{"text": system_prompt}]
    )
except Exception as e:
    print("Error invoking Bedrock Converse API:", str(e))
    raise

In [None]:
response_message = response.get("output", {}).get("message", {})

for content_block in response_message.get("content", []):
    if "toolUse" in content_block:
        tool_use_block = content_block["toolUse"]
        print("\nTool use triggered:")
        print(json.dumps(tool_use_block, indent=4))

Now, we can easily use the extracted information for further processing if needed, as all the information is already structured in JSON format.
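
As one way to start that further processing, the sketch below pulls the `input` dictionary out of the first `toolUse` block of a Converse response. The helper name `get_tool_input`, the tool name `extract_aps`, and the hand-written sample response are hypothetical; the sample only mimics the fields that the cell above reads.

```python
from typing import Optional


def get_tool_input(response: dict, tool_name: Optional[str] = None) -> Optional[dict]:
    """Return the `input` dict of the first matching toolUse block, or None."""
    message = response.get("output", {}).get("message", {})
    for block in message.get("content", []):
        tool_use = block.get("toolUse")
        if tool_use is None:
            continue  # skip text blocks and other non-tool content
        if tool_name is None or tool_use.get("name") == tool_name:
            return tool_use.get("input")
    return None


# Hypothetical response with the same shape as a Converse result (values are made up)
sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {"text": "Extracting the requested fields."},
                {
                    "toolUse": {
                        "toolUseId": "example-id",
                        "name": "extract_aps",
                        "input": {"patient_name": "Jane Doe", "gender": "Female"},
                    }
                },
            ],
        }
    },
    "stopReason": "tool_use",
}

record = get_tool_input(sample_response)
print(record["patient_name"])  # Jane Doe
```

Returning `None` when no tool call is present lets the caller decide whether to retry or fall back to the model's plain-text answer.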

### Lab 3: Structured Information Extraction out of Unstructured Data (Image)

Next, let's apply the same method to extract information from a single image.

#### Information extraction from a single image

![Sample Image](assets/sample_amzn_10k.PNG)

In [None]:
form_10k_tool_spec = {
    "toolSpec": {
        "name": "extract_10k",
        "description": "Extract information from a 10-K document.",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "current_year": {
                        "type": "integer",
                        "description": "Year when the 10-K form is filed",
                    },
                    "current_year_net_sales": {
                        "type": "number",
                        "description": "Current year's net sales, in millions",
                    },
                    "previous_year_net_sales": {
                        "type": "number",
                        "description": "Previous year's net sales, in millions",
                    },
                    "current_year_sales_growth": {
                        "type": "number",
                        "description": "Current year's consolidated percentage growth in net sales",
                    },
                    "previous_year_sales_growth": {
                        "type": "number",
                        "description": "Previous year's consolidated percentage growth in net sales",
                    },
                },
                "required": [
                    "current_year",
                    "current_year_net_sales",
                    "previous_year_net_sales",
                    "current_year_sales_growth",
                    "previous_year_sales_growth",
                ],
            }
        },
    }
}
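
Because a tool spec is a plain dictionary, it can be checked before it is sent to the model. The sketch below lists any name under `required` that has no matching entry in `properties`. The helper name `undefined_required_fields` and the miniature `demo_tool_spec` are hypothetical, and the check looks only at the top level of the schema, not at nested objects.

```python
def undefined_required_fields(tool_spec: dict) -> list:
    """Return names listed under `required` that are missing from `properties`."""
    schema = tool_spec["toolSpec"]["inputSchema"]["json"]
    properties = schema.get("properties", {})
    return [name for name in schema.get("required", []) if name not in properties]


# Hypothetical miniature spec: "total" is required but never defined
demo_tool_spec = {
    "toolSpec": {
        "name": "demo_tool",
        "description": "Demo tool used only to illustrate the check.",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {"year": {"type": "integer"}},
                "required": ["year", "total"],
            }
        },
    }
}

print(undefined_required_fields(demo_tool_spec))  # ['total']
```

An empty list means every required field is defined at the top level.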

In [None]:
message = [{"role": "user", "content": []}]

with open("assets/sample_amzn_10k.PNG", "rb") as image_file:
    image_bytes = image_file.read()

message[0]["content"].append(
    {"image": {"format": "png", "source": {"bytes": image_bytes}}}
)

system_prompt = "Accurately extract all the information based on the given context."

try:
    response = bedrock.converse(
        modelId="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
        messages=message,
        inferenceConfig={"maxTokens": 4096, "temperature": 0.0},
        toolConfig={"tools": [form_10k_tool_spec]},
        system=[{"text": system_prompt}]
    )
except Exception as e:
    print("Error invoking Bedrock Converse API:", str(e))
    raise

In [None]:
response_message = response.get("output", {}).get("message", {})

for content_block in response_message.get("content", []):
    if "toolUse" in content_block:
        tool_use_block = content_block["toolUse"]
        print("\nTool use triggered:")
        print(json.dumps(tool_use_block, indent=4))
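
Numeric fields extracted from a financial statement can be sanity-checked against each other. The sketch below recomputes year-over-year growth from the two net sales figures and compares it with the extracted growth value. The function name `growth_is_consistent`, the one-percentage-point tolerance, and the sample numbers are all assumptions made for illustration; none of them come from the sample 10-K.

```python
def growth_is_consistent(extracted: dict, tolerance: float = 1.0) -> bool:
    """Check that reported growth matches growth computed from net sales.

    `tolerance` is in percentage points and allows for rounding in the filing.
    """
    current = extracted["current_year_net_sales"]
    previous = extracted["previous_year_net_sales"]
    reported = extracted["current_year_sales_growth"]
    if previous == 0:
        return False  # growth is undefined without a prior-year baseline
    computed = (current - previous) / previous * 100
    return abs(computed - reported) <= tolerance


# Made-up values shaped like the tool input defined above
sample_extraction = {
    "current_year": 2024,
    "current_year_net_sales": 110.0,
    "previous_year_net_sales": 100.0,
    "current_year_sales_growth": 10.0,
}

print(growth_is_consistent(sample_extraction))  # True
```

A failed check does not say which field is wrong, only that the extraction deserves a second look.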

#### Information extraction from a PDF file / multiple images

Lastly, let's apply the same method to a PDF file. The Converse API's current size limit for attaching documents (PDF, CSV, and so on) to a message is much stricter than its limit for images, so we first split the PDF into page images and then pass those images to the Converse API.

In [None]:
form_4_tool_spec = {
    "toolSpec": {
        "name": "extract_4",
        "description": "Extract information from a Form 4 filing.",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "name": {
                        "type": "string",
                        "description": "Name of the reporting person",
                    },
                    "address": {
                        "type": "string",
                        "description": "Street address of the reporting person",
                    },
                    "city": {
                        "type": "string",
                        "description": "City of address of the reporting person",
                    },
                    "state": {
                        "type": "string",
                        "description": "State of address of the reporting person",
                    },
                    "zip": {
                        "type": "string",
                        "description": "Zip code of address of the reporting person",
                    },
                    "issuer_name": {
                        "type": "string",
                        "description": "Name of issuer",
                    },
                    "ticker": {
                        "type": "string",
                        "description": "Ticker or trading symbol of issuer",
                    },
                    "date_of_earliest_transaction": {
                        "type": "string",
                        "description": "Date of the earliest transaction, in the format of MM/DD/YYYY",
                    },
                    "relationship": {
                        "type": "string",
                        "description": "Relationship of reporting person to issuer",
                        "enum": ["Director", "10% owner", "Officer", "Other"]
                    },
                    "filing_type": {
                        "type": "string",
                        "description": "Filing type of the form",
                        "enum": ["Individual", "Joint/Group"]
                    },
                },
                "required": [
                    "name",
                    "address",
                    "city",
                    "state",
                    "zip",
                    "issuer_name",
                    "ticker",
                    "date_of_earliest_transaction",
                    "relationship",
                    "filing_type",
                ],
            }
        },
    }
}

In [None]:
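# Convert a PDF file to PNG page images. `quality` is a scale factor applied to each
# page's width and height (for example, 0.5 halves both) to make processing more efficient.
import os

from pdf2image import convert_from_path
from PIL import Image

def pdf_to_images(pdf_path, output_folder, quality):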
    os.makedirs(output_folder, exist_ok=True)
    images = convert_from_path(pdf_path)
    image_paths = []
    
    scaled_images = []
    for img in images:
        new_width = int(img.width * quality)
        new_height = int(img.height * quality)
        scaled_images.append(img.resize((new_width, new_height), Image.Resampling.LANCZOS))
    images = scaled_images
    
    # With 20 pages or fewer, save one image per page
    if len(images) <= 20:
        for i, image in enumerate(images):
            output_path = os.path.join(output_folder, f"page_{i+1}.png")
            image.save(output_path, "PNG")
            image_paths.append(output_path)
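    else:
        # Otherwise stack consecutive pages vertically so that at most 20 images are produced
        pages_per_image = (len(images) + 19) // 20  # ceiling of len(images) / 20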
        for i in range(0, len(images), pages_per_image):
            pages_to_combine = images[i:i + pages_per_image]
            total_height = sum(img.height for img in pages_to_combine)
            max_width = max(img.width for img in pages_to_combine)
            
            combined_image = Image.new('RGB', (max_width, total_height), 'white')
            y_offset = 0
            
            for img in pages_to_combine:
                combined_image.paste(img, (0, y_offset))
                y_offset += img.height
            
            output_path = os.path.join(output_folder, f"page_{i//pages_per_image + 1}.png")
            combined_image.save(output_path, "PNG")
            image_paths.append(output_path)
            
    return image_paths
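
The page-grouping arithmetic in `pdf_to_images` is easier to follow in isolation. The sketch below reproduces only that arithmetic: it returns the page index ranges that would end up in each output image, without touching any files. The helper name `plan_page_groups` and its `max_images` parameter are hypothetical; the default of 20 mirrors the constant used above.

```python
def plan_page_groups(num_pages: int, max_images: int = 20) -> list:
    """Return (start, end) page index ranges, one range per output image."""
    if num_pages <= max_images:
        # Short document: one page per image
        return [(i, i + 1) for i in range(num_pages)]
    # Ceiling division: the fewest pages per image that keeps the count within max_images
    pages_per_image = (num_pages + max_images - 1) // max_images
    return [
        (start, min(start + pages_per_image, num_pages))
        for start in range(0, num_pages, pages_per_image)
    ]


groups = plan_page_groups(45)
print(len(groups), groups[0], groups[-1])  # 15 (0, 3) (42, 45)
```

For a 45-page PDF, three pages are stacked into each image, giving 15 images. The last group can hold fewer pages when the page count is not an exact multiple.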

In [None]:
message = [{"role": "user", "content": []}]

image_files = pdf_to_images("assets/sample_amzn_form_4.pdf", "output", 0.5)  # Scale each page to 50% of its width and height for more efficient processing

for image_path in image_files:
    with open(image_path, "rb") as image_file:
        image_bytes = image_file.read()
    message[0]["content"].append(
        {"image": {"format": "png", "source": {"bytes": image_bytes}}}
    )

system_prompt = "Accurately extract all the information based on the given context."

try:
    response = bedrock.converse(
        modelId="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
        messages=message,
        inferenceConfig={"maxTokens": 4096, "temperature": 0.0},
        toolConfig={"tools": [form_4_tool_spec]},
        system=[{"text": system_prompt}]
    )
except Exception as e:
    print("Error invoking Bedrock Converse API:", str(e))
    raise

In [None]:
response_message = response.get("output", {}).get("message", {})

for content_block in response_message.get("content", []):
    if "toolUse" in content_block:
        tool_use_block = content_block["toolUse"]
        print("\nTool use triggered:")
        print(json.dumps(tool_use_block, indent=4))
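
Before passing the extracted fields downstream, it can help to check them against the constraints written into the tool spec. The sketch below is a minimal, standard-library-only check of required fields, `enum` values, and the MM/DD/YYYY date format. The function name `check_extraction`, the trimmed `demo_spec`, and the deliberately faulty `demo_output` are hypothetical; this is not a full JSON Schema validator.

```python
from datetime import datetime


def check_extraction(extracted: dict, tool_spec: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    schema = tool_spec["toolSpec"]["inputSchema"]["json"]
    problems = []
    for name in schema.get("required", []):
        if name not in extracted:
            problems.append(f"missing required field: {name}")
    for name, prop in schema.get("properties", {}).items():
        allowed = prop.get("enum")
        if allowed and name in extracted and extracted[name] not in allowed:
            problems.append(f"unexpected value for {name}: {extracted[name]!r}")
    date_text = extracted.get("date_of_earliest_transaction")
    if date_text is not None:
        try:
            datetime.strptime(date_text, "%m/%d/%Y")
        except (ValueError, TypeError):
            problems.append(f"date is not MM/DD/YYYY: {date_text!r}")
    return problems


# Trimmed, hypothetical spec with a few of the Form 4 fields
demo_spec = {
    "toolSpec": {
        "name": "extract_4",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "ticker": {"type": "string"},
                    "date_of_earliest_transaction": {"type": "string"},
                    "relationship": {
                        "type": "string",
                        "enum": ["Director", "10% owner", "Officer", "Other"],
                    },
                },
                "required": ["ticker", "date_of_earliest_transaction", "relationship"],
            }
        },
    }
}

# Made-up output with three deliberate problems
demo_output = {"date_of_earliest_transaction": "2024-01-31", "relationship": "Employee"}

for problem in check_extraction(demo_output, demo_spec):
    print(problem)
```

Returning a list of messages rather than raising on the first failure makes it possible to log every issue for a document in one pass.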