## Downloading the required dependencies

In [None]:
!python -m pip install git+https://github.com/huggingface/transformers
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117

Collecting git+https://github.com/huggingface/transformers
  Cloning https://github.com/huggingface/transformers to /tmp/pip-req-build-a9h3ahb1
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers /tmp/pip-req-build-a9h3ahb1
  Resolved https://github.com/huggingface/transformers to commit f1b7634fc840a96198268eb9b3d61b92b05c7cfb
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Looking in indexes: https://download.pytorch.org/whl/cu117
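
Before loading the model, it can help to confirm that the installs above actually succeeded. The following is a minimal sketch using only the standard library; `missing_packages` and the `required` list are illustrative names, not part of the notebook, and the list uses import names rather than pip package names.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names the notebook relies on later (assumed list, adjust as needed).
required = ["torch", "transformers", "PIL", "openpyxl", "pytz"]
print("Missing packages:", missing_packages(required))
```

An empty list means every module can be located; it does not verify that the CUDA build of PyTorch matches the runtime's GPU driver.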


## Importing Model From Hugging Face

In [None]:
import openpyxl
import cv2
import os
import torch
from PIL import Image
import requests
from io import BytesIO
import json
from transformers import AutoProcessor, AutoModelForCausalLM, Qwen2VLForConditionalGeneration
from base64 import b64decode
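from IPython.display import display, Javascript, Image as IPImage  # IPImage previews captured photos
from google.colab import files  # needed for files.upload()
from google.colab.output import eval_js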
from datetime import datetime, timedelta
import re


# Load the Qwen2 model and processor
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)

processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



`Qwen2VLRotaryEmbedding` can now be fully parameterized by passing the model config through the `config` argument. All other arguments will be removed in v4.46



## Initialise Excel File and Necessary Functions to Process Data

In [None]:
import pytz

# Define the Excel file path
file_path = "product_analysis.xlsx"

# Column headers for each sheet
packaged_headers = ["Sl No", "Timestamp", "Brand", "Count", "Expiry Date", "Expired", "Expected Life Span (Days)"]
produce_headers = ["Sl No", "Timestamp", "Produce Name", "Freshness (1-10)", "Expected Life Span (Days)"]


def get_or_create_sheet(workbook, title, headers):
    """Return the named sheet, creating it with a header row if it is missing."""
    if title in workbook.sheetnames:
        return workbook[title]
    sheet = workbook.create_sheet(title)
    sheet.append(headers)
    return sheet


# Initialize workbook and sheets
if os.path.exists(file_path):
    workbook = openpyxl.load_workbook(file_path)
else:
    workbook = openpyxl.Workbook()
    # Drop the default empty sheet so only the two named sheets remain
    workbook.remove(workbook.active)

packaged_sheet = get_or_create_sheet(workbook, "Packaged Products", packaged_headers)
produce_sheet = get_or_create_sheet(workbook, "Fresh Produce", produce_headers)

# Initialize serial-number counters from the number of existing data rows.
# max_row counts the header row, so subtract 1; the first appended row gets Sl No 1.
packaged_sl_no = max(packaged_sheet.max_row - 1, 0)
produce_sl_no = max(produce_sheet.max_row - 1, 0)

# Function to get the current timestamp in IST
def get_ist_timestamp():
    # Define IST time zone
    IST = pytz.timezone('Asia/Kolkata')

    # Get current time in UTC and convert to IST
    utc_now = datetime.now(pytz.utc)
    ist_now = utc_now.astimezone(IST)

    # Return the formatted timestamp
    return ist_now.strftime("%Y-%m-%d %H:%M:%S")


def normalize_date(date_str):
    """
    Normalize various date formats to 'YYYY-MM-DD'.
    Returns "Invalid Date" if no known format matches.
    """
    patterns = [
        "%Y-%m-%d", "%Y.%m.%d", "%d-%m-%Y", "%d/%m/%Y", "%d/%m/%y",
        "%m/%Y", "%m/%y", "%b %Y", "%b-%Y"
    ]

    for pattern in patterns:
        try:
            # Month-only formats default to the 1st day of the month
            normalized_date = datetime.strptime(str(date_str).strip(), pattern)
            return normalized_date.strftime("%Y-%m-%d")
        except ValueError:
            continue
    return "Invalid Date"


def calculate_expiration(expiry_date, mfg_date=None, best_before=None):
    """
    Calculate expiration status and expected life span.
    """
    today = datetime.today()

    if expiry_date:
        normalized_expiry_date = normalize_date(expiry_date)
        if normalized_expiry_date != "Invalid Date":
            expiry_date_obj = datetime.strptime(normalized_expiry_date, "%Y-%m-%d")

            if expiry_date_obj < today:
                expired_status = 1
                expected_life_span = 0
            else:
                expired_status = 0
                expected_life_span = (expiry_date_obj - today).days
            return normalized_expiry_date, expired_status, expected_life_span

    if mfg_date and best_before:
        normalized_mfg_date = normalize_date(mfg_date)
        if normalized_mfg_date != "Invalid Date":
            expiry_date_obj = datetime.strptime(normalized_mfg_date, "%Y-%m-%d")
            expiry_date_obj += timedelta(days=int(best_before) * 30)
            normalized_expiry_date = expiry_date_obj.strftime("%Y-%m-%d")

            if expiry_date_obj < today:
                expired_status = 1
                expected_life_span = 0
            else:
                expired_status = 0
                expected_life_span = (expiry_date_obj - today).days
            return normalized_expiry_date, expired_status, expected_life_span

    return False, 0, "Invalid Date"


def append_to_excel(product_details_dict):
    global packaged_sl_no, produce_sl_no
    timestamp = get_ist_timestamp()
    # Print the product details dictionary to debug
    print(f"Received Product Details: {product_details_dict}")

    if product_details_dict.get("type") == "Packed Product":
        packaged_sl_no += 1
        brand = product_details_dict.get("Brand", "Unknown")
        product_count = product_details_dict.get("Product Count", 0)
        expiry_date = product_details_dict.get("Expiry Date", None)
        mfg_date = product_details_dict.get("Mfg Date", None)  # key must match the "Mfg Date" field requested in the prompt
        best_before = product_details_dict.get("Best Before Months", None)

        # Call the modified calculate_expiration function
        normalized_expiry_date, expired_status, expected_life_span = calculate_expiration(
            expiry_date, mfg_date, best_before
        )

        # Append to the "Packaged Products" sheet
        packaged_sheet.append([
            packaged_sl_no,
            timestamp,
            brand,
            product_count,
            normalized_expiry_date,  # Use normalized expiry date
            expired_status,  # Expired (1) or Not Expired (0)
            expected_life_span  # Expected life span in days
        ])

    elif product_details_dict.get("type") == "Fresh Product":
        produce_sl_no += 1
        produce_name = product_details_dict.get("Produce Name", "Unknown")
        freshness = product_details_dict.get("Freshness", "Unknown")
        expected_life_span = product_details_dict.get("Expected Life Span (Days)", "Unknown")

        # Append to the "Fresh Produce" sheet
        produce_sheet.append([
            produce_sl_no,
            timestamp,
            produce_name,
            freshness,
            expected_life_span  # Expected life span (days)
        ])
    else:
        print("Error: Invalid product type")

    # Save the workbook after appending the product details
    workbook.save(file_path)


def process_image(image):
    """
    Process the image and extract product details using Qwen model.
    """
    image = image.resize((512, 512))

    # Prepare the text prompt for Qwen model to get general product details
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image", "image_url": "Captured from webcam"},
                {
                    "type": "text",
                    "text": """Analyze this image fetch the details of below & return the Python dictionary format:
                    If it contains packaged products, include:
                    {
                        "type": "Packed Product",
                        "Brand": "string",
                        "Product Count": int,
                        "Expiry Date": "string", (maybe found with Exp date, use by)
                        "Mfg Date": "string", (maybe found with mfg date)
                        "Best Before Months": int  # If 'Best Before months & Use By' is mentioned else None.
                    }

                    If it contains fresh produce, include:
                    {
                        "type": "Fresh Product",
                        "Produce Name": "string",
                        "Freshness": int,  # Based on visual clues (e.g., color, ripeness, bruising, etc.). 1-10 scale with dull color fruit image score less and bright & shiny score more don't give 8 to all.
                        "Expected Life Span (Days)": int  # Estimate based on visual signs of freshness and decay. It should be varied with images not simply marked
                    }
                    """
                }
            ]
        }
    ]


    # Prepare the text prompt for processing
    text_prompt = processor.apply_chat_template(messages, add_generation_prompt=True)

    # Process the image and prompt for model input
    inputs = processor(
        text=[text_prompt],
        images=[image],
        padding=True,
        return_tensors="pt"
    )

    # Move inputs to the same device the model was loaded on
    inputs = inputs.to(model.device)

    # Generate output from the model
    output_ids = model.generate(**inputs, max_new_tokens=1024)

    # Strip the prompt tokens so only the newly generated tokens are decoded
    generated_ids = [
        out_ids[len(in_ids):]
        for in_ids, out_ids in zip(inputs.input_ids, output_ids)
    ]
    output_text = processor.batch_decode(
        generated_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True
    )[0]

    print("Model Output:", output_text)

    # Parse the output as a dictionary and append to Excel.
    # ast.literal_eval only accepts Python literals, so unlike eval() it
    # cannot execute arbitrary code returned by the model.
    import ast
    try:
        product_details_dict = ast.literal_eval(output_text.strip())
        append_to_excel(product_details_dict)
        print("Product details saved successfully.")
    except Exception as e:
        print(f"Error processing output: {e}")


def take_photo(filename='photo.jpg', quality=0.8):
    js = Javascript('''
      async function takePhoto(quality) {
        const div = document.createElement('div');
        const capture = document.createElement('button');
        capture.textContent = 'Capture';
        div.appendChild(capture);

        const video = document.createElement('video');
        video.style.display = 'block';
        const stream = await navigator.mediaDevices.getUserMedia({video: true});

        document.body.appendChild(div);
        div.appendChild(video);
        video.srcObject = stream;
        await video.play();

        // Resize the output to fit the video element.
        google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);

        // Wait for Capture to be clicked.
        await new Promise((resolve) => capture.onclick = resolve);

        const canvas = document.createElement('canvas');
        canvas.width = video.videoWidth;
        canvas.height = video.videoHeight;
        canvas.getContext('2d').drawImage(video, 0, 0);
        stream.getVideoTracks()[0].stop();
        div.remove();
        return canvas.toDataURL('image/jpeg', quality);
      }
      ''')
    display(js)
    data = eval_js('takePhoto({})'.format(quality))
    binary = b64decode(data.split(',')[1])
    with open(filename, 'wb') as f:
        f.write(binary)
    return filename

# Function to Capture and Process Image from Webcam
def capture_and_process_image():
    try:
        filename = take_photo()
        print('Saved to {}'.format(filename))

        # Show the image which was just taken.
        display(IPImage(filename))
        # Open the image file
        image = Image.open(filename)

        # Process the image
        process_image(image)
    except Exception as err:
        print(str(err))


# Function to Upload and Process Image from Local System
def upload_and_process_image():
    try:
        # Prompt the user to upload an image file
        uploaded = files.upload()

        if not uploaded:
            print("No file uploaded.")
            return

        # Get the uploaded file name
        for filename in uploaded.keys():
            print(f'Uploaded file: {filename}')
            # Open the image file
            image = Image.open(BytesIO(uploaded[filename]))

            # Process the image
            process_image(image)
    except Exception as err:
        print(f"Error processing uploaded image: {err}")
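
`process_image` turns the model's reply into a dictionary before it is written to Excel. The following is a minimal sketch of a more defensive parser, assuming the reply is a Python dict literal that may be wrapped in a Markdown code fence or surrounded by extra text. `parse_model_output` and `FENCE` are hypothetical names, not part of the notebook.

```python
import ast
import re

FENCE = "`" * 3  # Markdown code-fence marker


def parse_model_output(output_text):
    """Parse a model reply into a dict, or return None if no dict literal is found."""
    text = output_text.strip()
    # Assumption: the reply may be wrapped in a fenced block tagged python or json.
    fenced = re.search(FENCE + r"(?:python|json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    # Keep only the outermost {...} span, dropping any surrounding commentary.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        return None
    try:
        # literal_eval accepts only Python literals, so no code is executed.
        parsed = ast.literal_eval(text[start:end + 1])
    except (ValueError, SyntaxError, TypeError):
        return None
    return parsed if isinstance(parsed, dict) else None


reply = "{'type': 'Packed Product', 'Brand': 'Maybelline', 'Best Before Months': None}"
print(parse_model_output(reply))
```

Returning `None` instead of raising lets the caller decide whether to skip the image or retry the prompt. A reply that uses JSON's `null` rather than Python's `None` would be rejected by this sketch.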


# => Final Function to Run Each Time (For Camera)

In [None]:
capture_and_process_image()

# => Final Function to Run Each Time (For Uploading)

In [None]:
upload_and_process_image()

# Debug File

In [None]:
image = Image.open('1.png')
process_image(image)

Model Output: {
    "type": "Packed Product",
    "Brand": "Maybelline",
    "Product Count": 3,
    "Expiry Date": "2019.01.01",
    "Mfg Date": "2018.02.02",
    "Best Before Months": None
}
Received Product Details: {'type': 'Packed Product', 'Brand': 'Maybelline', 'Product Count': 3, 'Expiry Date': '2019.01.01', 'Mfg Date': '2018.02.02', 'Best Before Months': None}
Product details saved successfully.
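
The model returned its dates in a dotted format (`2019.01.01`) here. As a rough sketch, separate from the notebook's own helpers, the snippet below shows one way such strings could be parsed and checked for expiry using only the standard library. `FORMATS`, `parse_label_date`, and `expiry_status` are illustrative names, and the reference date is fixed so the result is reproducible.

```python
from datetime import date, datetime

# Assumption: formats chosen to cover date strings like those in the outputs;
# extend the list for other label styles.
FORMATS = ["%Y.%m.%d", "%Y-%m-%d", "%d/%m/%Y", "%d/%m/%y", "%m/%Y"]


def parse_label_date(text):
    """Return a date for the first matching format, or None if none match."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def expiry_status(expiry_text, today):
    """Return (iso_date, expired_flag, days_left), or None for an unparseable date."""
    expiry = parse_label_date(expiry_text)
    if expiry is None:
        return None
    days_left = (expiry - today).days
    return expiry.isoformat(), int(days_left < 0), max(days_left, 0)


# A fixed reference date keeps the example reproducible.
reference = date(2024, 10, 2)
print(expiry_status("2019.01.01", today=reference))
print(expiry_status("22/01/25", today=reference))
```

This sketch treats a product whose expiry date equals the reference date as not yet expired; whether that is the right cut-off is a design choice.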


In [None]:
from PIL import Image
import os

# Folder containing images
image_folder = '/content/'  # Replace this with your directory path

# Loop through all the files in the directory
for filename in os.listdir(image_folder):
    # Check if the file is an image (you can add more file extensions as needed)
    if filename.lower().endswith(('.jpg', '.jpeg', '.png')):
        # Create the full path to the image file
        image_path = os.path.join(image_folder, filename)

        # Open the image using PIL
        image = Image.open(image_path)

        # Process the image
        print(f"Processing {filename}...")
        process_image(image)


Processing 20240929_235441.jpg...
Model Output: {
    "type": "Packed Product",
    "Brand": "Haldiram's",
    "Product Count": 1,
    "Expiry Date": "Best Before 06/2023",
    "Mfg Date": "2023-06-01",
    "Best Before Months": None
}
Received Product Details: {'type': 'Packed Product', 'Brand': "Haldiram's", 'Product Count': 1, 'Expiry Date': 'Best Before 06/2023', 'Mfg Date': '2023-06-01', 'Best Before Months': None}
Product details saved successfully.
Processing 20241002_154925.jpg...
Model Output: {
    "type": "Packed Product",
    "Brand": "Haldiram's",
    "Product Count": 1,
    "Expiry Date": "22/01/25",
    "Mfg Date": "22/01/25",
    "Best Before Months": None
}
Received Product Details: {'type': 'Packed Product', 'Brand': "Haldiram's", 'Product Count': 1, 'Expiry Date': '22/01/25', 'Mfg Date': '22/01/25', 'Best Before Months': None}
Product details saved successfully.
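
Some replies embed the date inside a phrase, such as `Best Before 06/2023` in the first result above, which a strict date parser would reject. One possible approach, sketched here under the assumption that label dates follow a handful of numeric layouts, is to pull out the date-like fragment first. `DATE_FRAGMENT` and `extract_date_fragment` are hypothetical names.

```python
import re

# Assumption: label dates look like YYYY-MM-DD / YYYY.MM.DD, DD/MM/YY(YY),
# or MM/YYYY; other styles would need extra alternatives.
DATE_FRAGMENT = re.compile(
    r"\d{4}[.-]\d{1,2}[.-]\d{1,2}"   # 2023-06-01 or 2019.01.01
    r"|\d{1,2}/\d{1,2}/\d{2,4}"      # 22/01/25 or 22/01/2025
    r"|\d{1,2}/\d{4}"                # 06/2023
)


def extract_date_fragment(text):
    """Return the first date-like substring in text, or None if there is none."""
    match = DATE_FRAGMENT.search(text)
    return match.group(0) if match else None


print(extract_date_fragment("Best Before 06/2023"))
```

The extracted fragment can then be handed to whatever date normalization step follows.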
Processing 20241002_154835.jpg...
Model Output: {
    "type": "Packed Product",
    "Brand": "Haldiram's",

## Old Code (Phase I)

In [None]:
# Create a new Excel workbook
workbook = openpyxl.Workbook()
sheet = workbook.active
sheet.title = "Product Analysis"

# Add headers
headers = ["Product Name", "Category", "Quantity", "Count", "Expiry Date", "Freshness Index", "Shelf Life"]
sheet.append(headers)

# Regular expression patterns to extract data
packaged_product_pattern = r"Product Name: (.*)\n  - Product Category: (.*)\n  - Product Quantity: (.*)\n  - Product Count: (.*)\n  - Expiry Date: (.*)"
fruits_vegetables_pattern = r"Type of fruit/vegetable: (.*)\n  - Freshness Index: (.*)\n  - Estimated Shelf Life: (.*)"

from PIL import Image
import os

# Folder containing images
image_folder = '/content/'  # Replace this with your directory path

# Loop through all the files in the directory
for filename in os.listdir(image_folder):
    # Check if the file is an image (you can add more file extensions as needed)
    if filename.lower().endswith(('.jpg', '.jpeg', '.png')):
        # Create the full path to the image file
        image_path = os.path.join(image_folder, filename)


        # Open the image using PIL
        image = Image.open(image_path)
        image = image.resize((512, 512))

        # Prepare the text prompt for predicting product details and freshness
        messages = [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "image_url": "Captured from webcam"  # Description for internal use
                    },
                    {
                        "type": "text",
                        "text": """This image contains fruits, vegetables, or packaged products.
                        Please analyze the image and provide:
                        - For packaged products:
                            - Product Name
                            - Product Category
                            - Product Quantity
                            - Product Count
                            - Expiry Date (if available)
                        - For fruits and vegetables:
                            - Type of fruit/vegetable
                            - Freshness Index (based on visual cues)
                            - Estimated Shelf Life"""
                    }
                ]
            }
        ]

        # Prepare the text prompt for processing
        text_prompt = processor.apply_chat_template(messages, add_generation_prompt=True)

        # Process the image and prompt for model input
        inputs = processor(
            text=[text_prompt],
            images=[image],
            padding=True,
            return_tensors="pt"
        )

        # Move inputs to GPU if available
        inputs = inputs.to("cuda" if torch.cuda.is_available() else "cpu")

        # Generate output from the model
        output_ids = model.generate(**inputs, max_new_tokens=1024)

        # Decode the generated output
        generated_ids = [
            output_ids[len(input_ids):]
            for input_ids, output_ids in zip(inputs.input_ids, output_ids)
        ]
        output_text = processor.batch_decode(
            generated_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True
        )[0]

        # Print the output text for the current image
        print("Analysis Output for the captured frame:")
        print(output_text)

        # Extract packaged product information
        packaged_product_match = re.search(packaged_product_pattern, output_text)
        fruits_vegetables_match = re.search(fruits_vegetables_pattern, output_text)

        if packaged_product_match:
            product_name = packaged_product_match.group(1).strip()
            category = packaged_product_match.group(2).strip()
            quantity = packaged_product_match.group(3).strip()
            count = packaged_product_match.group(4).strip()
            expiry_date = packaged_product_match.group(5).strip()
        else:
            product_name = category = quantity = count = expiry_date = "N/A"

        if fruits_vegetables_match:
            # Overwrite product fields with fruit/vegetable data if applicable
            product_name = fruits_vegetables_match.group(1).strip()
            category = "Fruit/Vegetable"
            freshness_index = fruits_vegetables_match.group(2).strip()
            shelf_life = fruits_vegetables_match.group(3).strip()
        else:
            freshness_index = shelf_life = "N/A"

        # Insert row into Excel
        sheet.append([product_name, category, quantity, count, expiry_date, freshness_index, shelf_life])

# Save the workbook
workbook.save("product_analysis.xlsx")
print("Data saved to product_analysis.xlsx")


Analysis Output for the captured frame:
- **For packaged products:**
  - **Product Name:** Haldiram's Panga Achaari Masala
  - **Product Category:** Snacks
  - **Product Quantity:** 1 pack
  - **Product Count:** 1
  - **Expiry Date:** Not visible in the image

- **For fruits and vegetables:**
  - **Type of fruit/vegetable:** Not applicable as the image shows a packaged product, not a fresh fruit or vegetable.
  - **Freshness Index:** Not applicable as the image does not provide any visual cues about freshness.
  - **Estimated Shelf Life:** Not applicable as the image does not provide any information about the shelf life of the product.
Analysis Output for the captured frame:
- **For packaged products:**
  - **Product Name:** Haldiram's Aloo Bhujia
  - **Product Category:** Snacks
  - **Product Quantity:** 20g
  - **Product Count:** 1
  - **Expiry Date:** 22/01/25

- **For fruits and vegetables:**
  - **Type of fruit/vegetable:** Aloo (Potato)
  - **Freshness Index:** Based on visual cu

KeyboardInterrupt: 
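
In the output above, the model wraps each field name in Markdown bold markers (`**Product Name:**`), so a pattern that expects the literal text `Product Name: ` followed by the value would not match it. A minimal sketch of a more tolerant line-based parser follows; `FIELD_LINE` and `parse_fields` are illustrative names, and the sample text is copied from the output shown above.

```python
import re

# Matches bullets such as "  - **Product Name:** Haldiram's Aloo Bhujia",
# with or without the ** bold markers. Section headers that have no value
# after the colon are skipped.
FIELD_LINE = re.compile(
    r"^[ \t]*-[ \t]*\*{0,2}([^:*\n]+):\*{0,2}[ \t]*([^*\s].*?)[ \t]*$",
    re.MULTILINE,
)


def parse_fields(output_text):
    """Return a dict of field name -> value for each '- Name: value' bullet."""
    return {name.strip(): value for name, value in FIELD_LINE.findall(output_text)}


sample = """- **For packaged products:**
  - **Product Name:** Haldiram's Aloo Bhujia
  - **Product Category:** Snacks
  - **Product Quantity:** 20g
  - **Product Count:** 1
  - **Expiry Date:** 22/01/25
"""
fields = parse_fields(sample)
print(fields["Product Name"], "|", fields["Expiry Date"])
```

Collecting fields into a dictionary also avoids depending on the order in which the model lists them.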