# **Acme Global Solutions - Invoice Text Extraction Project**

## **Overview**
Acme Global Solutions processes hundreds of invoices from vendors every month. To improve efficiency, the company is developing an **automated document processing system** that uses **computer vision** to extract useful text from scanned invoice images.

## **Objective**
The system must extract **critical information** from invoice images, including:
- **Vendor email addresses** (e.g., `alice.brown@acmeglobal.com`)
- **Invoice or transaction numbers** (e.g., `34921`)
- Other **embedded text details** within invoices

Your task is to **integrate OpenAI’s vision model (`gpt-4o-mini`)** into this workflow.

## **Implementation Details**
The automation system will:
1. **Send a POST request** to OpenAI's API.
2. **Pass two inputs** in a single user message:
   - **Text instruction:** `"Extract text from this image."`
   - **Image URL:** A **Base64-encoded data URL** (the image bytes prefixed with `data:image/png;base64,`) representing the invoice image.
3. **Receive extracted text** and populate the **vendor management system**.

## **Expected Request Format**
Your request must: <br>
✅ Use **OpenAI's GPT-4o-mini**  
✅ Send a single user message containing both **text and image URL**  
✅ Ensure the **image_url remains unchanged** (no modifications)  

## **Required JSON Body Structure**
Write a JSON body that:
- Uses **`gpt-4o-mini`** as the model.
- Sends **two types of content** (`text` and `image_url`).
- Ensures the **text instruction appears first**, followed by the **image URL**.
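
The three requirements above can be sketched as a small helper. This is a minimal illustration, not the required solution: the function name `build_vision_request` and the shortened `data:` URL are hypothetical placeholders, and the body only covers the fields named in this brief.

```python
def build_vision_request(instruction: str, image_data_url: str) -> dict:
    """Build a chat-completions body with the text part first, then the image."""
    return {
        "model": "gpt-4o-mini",
        "messages": [
            {
                "role": "user",
                "content": [
                    # The text instruction must come first.
                    {"type": "text", "text": instruction},
                    # The image URL is passed through unchanged.
                    {"type": "image_url", "image_url": {"url": image_data_url}},
                ],
            }
        ],
    }


# Placeholder data URL for illustration only; a real one carries the full image.
body = build_vision_request(
    "Extract text from this image.",
    "data:image/png;base64,AAAA",
)
```

Keeping both parts inside one `content` list is what makes this a single user message rather than two separate messages.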

## **Example Invoice Image**
An invoice image contains structured details such as:
- Vendor name and email
- Transaction number
- Invoice date and amount

Your integration must ensure precise extraction of **all embedded text**.
Here is the image: [Invoice.png](../resources/invoice.png)



In [1]:
import base64
import httpx
import os 
from dotenv import load_dotenv

In [None]:
# Convert an image file into a Base64 string
def image_to_base64(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

# Prefix the Base64 payload with the data-URL header expected for image_url
encoded_string = "data:image/png;base64," + image_to_base64("../resources/invoice.png")
print(encoded_string[:80])  # preview only; the full string is very long
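
The cell above hard-codes `image/png` in the data-URL prefix. If invoices might arrive in other formats, one possible variation is to guess the MIME type from the file name. The helper name `bytes_to_data_url` is hypothetical, and it takes raw bytes so the sketch runs without a file on disk.

```python
import base64
import mimetypes


def bytes_to_data_url(data: bytes, filename: str) -> str:
    """Encode raw image bytes as a data URL, guessing the MIME type from the name."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"{filename!r} does not look like an image file")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Tiny stand-in payload; a real call would pass the bytes read from the invoice file.
example_url = bytes_to_data_url(b"abc", "invoice.png")
```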




In [15]:
load_dotenv()  # load OPENAI_API_KEY from a local .env file, if one exists
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
url = "https://aiproxy.sanand.workers.dev/openai/v1/chat/completions"


In [None]:
# create a request to the OpenAI API
headers={
    "Authorization":f"Bearer {OPENAI_API_KEY}",
    "Content-Type":"application/json"
}
json={
    "model":"gpt-4o-mini",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Extract text from this image as \"vendor_email\" and \"transaction_id\" "
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "detail": "low",
                        "url": encoded_string
                    }
                }
            ]
        }
    ]

}

In [22]:
# Send the request to the OpenAI API and fail fast on HTTP errors
response = httpx.post(url, headers=headers, json=json, timeout=30)
response.raise_for_status()
result = response.json()
print(result)

{'id': 'chatcmpl-BdCez8SQjmo2le8CVbzvRxdY6dLyG', 'object': 'chat.completion', 'created': 1748683157, 'model': 'gpt-4o-mini-2024-07-18', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': '```json\n{\n  "vendor_email": "2410094202@gst.study.jitm.ac.in",\n  "transaction_id": "23814230"\n}\n```', 'refusal': None, 'annotations': []}, 'logprobs': None, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 2856, 'completion_tokens': 34, 'total_tokens': 2890, 'prompt_tokens_details': {'cached_tokens': 0, 'audio_tokens': 0}, 'completion_tokens_details': {'reasoning_tokens': 0, 'audio_tokens': 0, 'accepted_prediction_tokens': 0, 'rejected_prediction_tokens': 0}}, 'service_tier': 'default', 'system_fingerprint': 'fp_62a23a81ef', 'monthlyCost': 0.10394699999999998, 'cost': 0.008772, 'monthlyRequests': 91, 'costError': 'crypto.createHash is not a function'}


In [23]:
# extract the response text
print(result['choices'][0]['message']['content'])

```json
{
  "vendor_email": "2410094202@gst.study.jitm.ac.in",
  "transaction_id": "23814230"
}
```
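
The model returned its answer wrapped in a Markdown `json` fence, so the content is not directly usable as a dictionary. A minimal sketch of one way to unwrap it follows. The helper name `parse_model_json` is hypothetical, the sample string is copied from the output above, and the `json` module is imported under an alias because this notebook already binds the name `json` to the request payload.

```python
import json as json_lib  # aliased: the notebook uses `json` for the request body
import re

# Matches an optional ```json ... ``` wrapper around the whole reply.
FENCE_RE = re.compile(r"^```(?:json)?\s*\n(.*?)\n```\s*$", re.DOTALL)


def parse_model_json(content: str) -> dict:
    """Strip a surrounding Markdown code fence, if present, and parse the JSON."""
    text = content.strip()
    match = FENCE_RE.match(text)
    if match:
        text = match.group(1)
    return json_lib.loads(text)


sample = '```json\n{\n  "vendor_email": "2410094202@gst.study.jitm.ac.in",\n  "transaction_id": "23814230"\n}\n```'
fields = parse_model_json(sample)
print(fields["vendor_email"], fields["transaction_id"])
```

In the notebook, the same call would take `result['choices'][0]['message']['content']` as its argument. The model is not guaranteed to return valid JSON, so a production version would likely catch `json_lib.JSONDecodeError`.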
