In [1]:
from PIL import Image
import pytesseract

In [2]:
from openai import OpenAI
client = OpenAI(api_key="<OPENAI_KEY>")

In [3]:
def extract_image(image_path):
    """Run Tesseract OCR on the image at image_path and return the extracted text."""
    image = Image.open(image_path)

    # Use Tesseract to do OCR on the image
    text = pytesseract.image_to_string(image)
    return text

In [4]:
def ask_openai_vqa(extracted_text, question):
    """Ask GPT-4o a question about OCR-extracted table text and return its answer."""
    # System prompt sets the task; the user message carries the OCR text and the question
    messages = [
        {"role": "system", "content": "You are a helpful AI assistant. You will be given an extracted tabular data. You are tasked to answer the user questions"},
        {"role": "user", "content": f"Extracted text: {extracted_text}\n Question: {question}"}
    ]

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        max_tokens=512,
        temperature=1,
        top_p=1,
        frequency_penalty=0,
        presence_penalty=0
    )
    answer = response.choices[0].message.content
    return answer
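
The system/user message pair built in `ask_openai_vqa` is repeated verbatim for each model queried later in this notebook. A minimal sketch of a shared helper is shown below; `SYSTEM_PROMPT` and `build_messages` are hypothetical names introduced here for illustration and are not part of the original notebook. The prompt wording is copied unchanged so every model would receive the same input.

```python
# Hypothetical helper: builds the same two-message prompt used in ask_openai_vqa.
SYSTEM_PROMPT = (
    "You are a helpful AI assistant. You will be given an extracted tabular data. "
    "You are tasked to answer the user questions"
)


def build_messages(extracted_text, question):
    """Return the chat messages (system + user) for a question about a table."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Extracted text: {extracted_text}\n Question: {question}"},
    ]


# Example with placeholder inputs (not real OCR output)
messages = build_messages("Salesperson Amount\nA 10.00", "Who sold the most?")
print(messages[1]["content"])
```

The returned list uses the same role/content dictionaries that this notebook passes as `messages` to both `client.chat.completions.create` and `ollama.chat`.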

In [5]:
image_path = 'tabular_data_1.png'  # Update with your image path
question = 'Can you tell me which salesperson made highest order amount. You need to add all order amount and tell me who made the highest order amount'

extracted_text = extract_image(image_path)
answer = ask_openai_vqa(extracted_text, question)
answer

"To determine which salesperson made the highest order amount, we need to aggregate the amounts for each salesperson. Here are the order amounts by salesperson aggregated from the extracted data:\n\nFuller:\n- 3/01/2021: 1,440.00\n- 8/01/2021: 1,591.25\n- 9/01/2021: 2,505.60\n**Total: 5,536.85**\n\nGloucester:\n- 2/01/2021: 716.72\n**Total: 716.72**\n\nBromley:\n- 2/01/2021: 344.00\n- 10/01/2021: 2,713.50\n**Total: 3,057.50**\n\nFinchley:\n- 3/01/2021: 2,556.95\n- 3/01/2021: 442.00\n- 6/01/2021: 1,903.80\n- 10/01/2021: 3,868.60\n**Total: 8,771.35**\n\nGillingham:\n- 3/01/2021: 2,122.92\n**Total: 2,122.92**\n\nCallahan:\n- 8/01/2021: 1,765.60\n- 13/01/2021: 1,622.40\n- 15/01/2021: 334.80\n**Total: 3,722.80**\n\nCoghill:\n- 9/01/2021: 855.01\n**Total: 855.01**\n\nRayleigh:\n- 13/01/2021: 1,830.78\n**Total: 1,830.78**\n\nFarnham:\n- 14/01/2021: 319.20\n- 15/01/2021: 802.00 \n**Total: 1,121.20**\n\nAfter adding the order amounts for each salesperson, it's clear that Finchley has the highes
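
Language models can slip on arithmetic, so the totals are worth recomputing in plain Python. The sketch below hard-codes the line items as they appear in the answer above (transcribed by hand, not parsed from the OCR text, so it assumes the model read the table correctly) and uses `Decimal` to avoid floating-point rounding. The name `orders` is introduced here for illustration.

```python
from collections import defaultdict
from decimal import Decimal

# (salesperson, order amount) pairs transcribed from the GPT-4o answer above
orders = [
    ("Fuller", "1440.00"), ("Fuller", "1591.25"), ("Fuller", "2505.60"),
    ("Gloucester", "716.72"),
    ("Bromley", "344.00"), ("Bromley", "2713.50"),
    ("Finchley", "2556.95"), ("Finchley", "442.00"),
    ("Finchley", "1903.80"), ("Finchley", "3868.60"),
    ("Gillingham", "2122.92"),
    ("Callahan", "1765.60"), ("Callahan", "1622.40"), ("Callahan", "334.80"),
    ("Coghill", "855.01"),
    ("Rayleigh", "1830.78"),
    ("Farnham", "319.20"), ("Farnham", "802.00"),
]

# Sum the amounts per salesperson; Decimal() starts each total at 0
totals = defaultdict(Decimal)
for name, amount in orders:
    totals[name] += Decimal(amount)

# Pick the salesperson with the largest total
top = max(totals, key=totals.get)
print(top, totals[top])  # prints: Finchley 8771.35
```

The recomputed per-salesperson totals match the ones listed in the answer, with Finchley highest at 8,771.35.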

In [11]:
image_path = 'tabular_data_1.png'
extracted_text = extract_image(image_path)

# Llama3.1

In [9]:
import ollama

In [12]:
response = ollama.chat(model='llama3.1', messages=[
    {"role": "system", "content": "You are a helpful AI assistant. You will be given an extracted tabular data. You are tasked to answer the user questions"},
    {"role": "user", "content": f"Extracted text: {extracted_text}\n Question: {question}"}
])
print(response['message']['content'])

To find the salesperson with the highest order amount, I will add up the total order amounts for each salesperson.

Here are the calculations:

1. Fuller:
	* 01/03/2021: $1,440.00
	* 08/01/2021: $1,591.25 (x2) = $3,182.50
	Total: $4,622.50
2. Gloucester:
	* 02/01/2021: $716.72
	Total: $716.72
3. Bromley:
	* 02/01/2021: $344.00 (UK)
	* 10/01/2021: $2,713.50 (USA)
	Total: $2,857.50 (Note: I'll keep track of this as a separate entry since there are two salespeople with the same last name)
4. Finchley:
	* 03/01/2021 (x3): $2,556.95 + $442.00 + $1,903.80 = $5,902.75
5. Gillingham:
	* 03/01/2021: $2,122.92
6. Callahan:
	* 08/01/2021: $1,765.60
	* 15/01/2021 (x3): $1,622.40 + $334.80 = $1,957.20
	Total: $3,722.80
7. Farnham:
	* 14/01/2021: $319.20
	* 15/01/2021: $802.00
	Total: $1,121.20

Now, let's add up all the totals to find the highest order amount:

$4,622.50 (Fuller) + $716.72 (Gloucester) = $5,339.22
$5,339.22 + $2,857.50 (Bromley) = $8,196.72
$8,196.72 + $5,902.75 (Finchley) = $14,09
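
The Llama 3.1 answer can be checked against itself by re-adding the line items it lists and comparing the result with the totals it reports. The sketch below transcribes both by hand from the output above; `listed_items` and `reported` are illustrative names. It checks internal consistency only, so it would not catch items that are missing or duplicated relative to the source table.

```python
from decimal import Decimal

# Line items as listed in the Llama 3.1 answer (only entries with a stated total)
listed_items = {
    "Fuller": ["1440.00", "3182.50"],
    "Bromley": ["344.00", "2713.50"],
    "Finchley": ["2556.95", "442.00", "1903.80"],
    "Callahan": ["1765.60", "1622.40", "334.80"],
    "Farnham": ["319.20", "802.00"],
}

# Totals as reported in the same answer
reported = {
    "Fuller": "4622.50",
    "Bromley": "2857.50",
    "Finchley": "5902.75",
    "Callahan": "3722.80",
    "Farnham": "1121.20",
}

# Collect every salesperson whose reported total differs from the item sum
mismatches = {}
for name, items in listed_items.items():
    recomputed = sum(Decimal(x) for x in items)
    if recomputed != Decimal(reported[name]):
        mismatches[name] = (Decimal(reported[name]), recomputed)

for name, (said, actual) in mismatches.items():
    print(f"{name}: reported {said}, items sum to {actual}")
# prints:
# Bromley: reported 2857.50, items sum to 3057.50
# Finchley: reported 5902.75, items sum to 4902.75
```

Two of the five reported totals do not equal the sum of their own line items, so the running sum that follows in the answer inherits those errors.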

In [14]:
response = ollama.chat(model='mistral', messages=[
    {"role": "system", "content": "You are a helpful AI assistant. You will be given an extracted tabular data. You are tasked to answer the user questions"},
    {"role": "user", "content": f"Extracted text: {extracted_text}\n Question: {question}"}
])
print(response['message']['content'])

 To find the salesperson who made the highest order amount, I will iterate through each row in the data provided and keep track of the maximum total order amount found so far. Here are the steps:

1. First, let's identify the relevant columns in the table:
- Sales Location (Country): Column 0
- Date: Column 1
- Stock Code/Item Name: Column 2
- Quantity: Column 3
- Price: Column 4

2. Next, I will write a function to parse the date and extract the salesperson's name based on their location (in this case, it is determined by the country). For example, the USA salesperson can be identified as "Finchley" or "Callahan" or "Fuller".

3. After that, I will create another function to calculate the total order amount for each row and sum up the values for all rows corresponding to a specific salesperson:

```python
def parse_date(date):
    return datetime.strptime(date[:10], "%d/%m/%Y")

def get_salesperson(country):
    if country == "USA":
        return date[8] + date[9:12].lower()
    elif