# Sardinia 2025 Active Lab

This notebook shows how to use AI to analyze food waste images. It is designed to run in Google Colab; to run it locally, you need to install the software dependencies yourself.

The notebook requires a free [Groq](https://groq.com/) account in order to process the images using a Vision Language Model.

## Import libraries

In [15]:
import base64
import time
from io import BytesIO


try:
    import pymongo
except ImportError:
    !pip install pymongo > /dev/null 2>&1
    import pymongo

try:
    import PIL
except ImportError:
    !pip install Pillow > /dev/null 2>&1
    import PIL

try:
    import openai
    import outlines
    from outlines.inputs import Image
except ImportError:
    !pip install "outlines[openai]" > /dev/null 2>&1
    import openai
    import outlines
    from outlines.inputs import Image

try:
    from pydantic import BaseModel, Field
except ImportError:
    !pip install pydantic > /dev/null 2>&1
    from pydantic import BaseModel, Field

try:
    import pandas as pd
except ImportError:
    !pip install pandas > /dev/null 2>&1
    import pandas as pd


## Set variables and data models

In [16]:
mongo_connect_string = "mongodb://USER:PWD@IPADDR:27017/?tls=false"
groq_api_key = "your-api-key"
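
Hard-coded credentials are easy to leak when a notebook is shared. As a minimal sketch, the key could instead be read from an environment variable. The variable name `GROQ_API_KEY` and the `load_secret` helper are illustrative assumptions, not part of the original notebook:

```python
import os


def load_secret(env_var: str, placeholder: str) -> str:
    """Return the value of env_var, or the placeholder when it is unset or empty."""
    value = os.environ.get(env_var, "").strip()
    return value if value else placeholder


# Hypothetical variable name; falls back to a placeholder string when unset.
groq_api_key = load_secret("GROQ_API_KEY", "your-api-key")
```

The same helper could be used for the MongoDB connection string, so that neither secret appears in the saved notebook.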

In [17]:
text_prompt = """You are a food waste analyst. You received an image of food leftovers for your research.
Given the image, provide the following information:
- The type of the main food in the image
- The estimated weight of the food in the image in grams
In general, food leftover weights are between 10 grams and 750 grams.
Aim to be as accurate as possible in your weight estimation.
"""

In [18]:
class FoodAnalysis(BaseModel):
    food_type: str = Field(description="the type of the main food in the image")
    estimate_weight: int = Field(description="estimated weight in grams for the food in the image")
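
To illustrate what the data model enforces, the sketch below validates two hand-written JSON payloads against `FoodAnalysis`. The payloads are made up for illustration and are not real model output; the class is repeated only so the snippet runs on its own:

```python
from pydantic import BaseModel, Field, ValidationError


class FoodAnalysis(BaseModel):
    food_type: str = Field(description="the type of the main food in the image")
    estimate_weight: int = Field(description="estimated weight in grams for the food in the image")


# Hand-written sample payloads (assumptions, not real model responses).
good = '{"food_type": "rice", "estimate_weight": 80}'
bad = '{"food_type": "rice", "estimate_weight": "about eighty"}'

# A well-formed payload is parsed into typed attributes.
parsed = FoodAnalysis.model_validate_json(good)
print(parsed.food_type, parsed.estimate_weight)

# A weight that cannot be read as an integer raises ValidationError.
try:
    FoodAnalysis.model_validate_json(bad)
    rejected = False
except ValidationError:
    rejected = True
print("bad payload rejected:", rejected)
```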

## Get data from MongoDB

In [19]:
client = pymongo.MongoClient(mongo_connect_string)
coll = client['iot']['active_lab']

In [20]:
data = list(coll.find())  # load every document of the collection into memory
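
Each document stores its picture as a base64 string in the `image` field, and the previews shown later in this notebook start with `/9j/`, which is how the JPEG start-of-image bytes look in base64. A quick sanity check before sending anything to the model could look like the sketch below. The `looks_like_jpeg` helper is hypothetical, and it assumes all images are JPEG:

```python
import base64

JPEG_MAGIC = b"\xff\xd8\xff"  # first three bytes of a JPEG file


def looks_like_jpeg(b64_image: str) -> bool:
    """Decode a base64 string and check for the JPEG start-of-image marker."""
    try:
        raw = base64.b64decode(b64_image, validate=True)
    except (ValueError, TypeError):
        # Not valid base64 (binascii.Error is a ValueError) or not a string.
        return False
    return raw.startswith(JPEG_MAGIC)


# Synthetic sample: a JPEG-like header followed by padding bytes.
sample = base64.b64encode(b"\xff\xd8\xff\xe0" + b"\x00" * 8).decode()
print(looks_like_jpeg(sample))
print(looks_like_jpeg("not base64!!"))
```

Applied to the loaded documents, a filter such as `[d["_id"] for d in data if not looks_like_jpeg(d["image"])]` would list the records worth inspecting by hand.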

## Setup Vision Language Model on Groq

In [21]:
client = openai.OpenAI(
    base_url="https://api.groq.com/openai/v1",  # Groq server URL - OpenAI API
    api_key=groq_api_key,
)

In [22]:
m = "meta-llama/llama-4-scout-17b-16e-instruct"
print(f"Using model: {m}")
model = outlines.from_openai(client, m)

m_safename = m.replace("/", "_")

Using model: meta-llama/llama-4-scout-17b-16e-instruct


## Image processing

In [27]:
result_list = []

for i, img in enumerate(data):
    time.sleep(4)  # To avoid rate limiting
    if (i+1) % 10 == 0:
        print(f"Processing image {i+1}/{len(data)}")
    prompt = [
        text_prompt,
        Image(PIL.Image.open(BytesIO(base64.b64decode(img['image']))))
    ]
    response = model(prompt, FoodAnalysis, max_tokens=128)
    casted_result = FoodAnalysis.model_validate_json(response)
    result_list.append({
        **img,
        "estimated_type": casted_result.food_type,
        "estimated_weight": casted_result.estimate_weight,
        "model": m
    })
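
The fixed `time.sleep(4)` spaces out the requests, but a single failed call still aborts the whole loop. One possible mitigation is to retry with exponential backoff. The `call_with_retry` helper below is a hypothetical sketch; it catches the generic `Exception` because the notebook does not state which error Groq raises when the rate limit is hit:

```python
import time


def call_with_retry(fn, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

Inside the loop, the model call could then be wrapped as `call_with_retry(lambda: model(prompt, FoodAnalysis, max_tokens=128))`. The `sleep` parameter exists only so the waiting can be replaced in tests.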

Processing image 10/31
Processing image 20/31
Processing image 30/31


## Data analysis

In [29]:
data_df = pd.DataFrame(result_list)

In [30]:
data_df.head()

index,_id,clientid,datestring,weight,type,image,estimated_type,estimated_weight,model
0,68eef08bcbc9fd077633d411,active_lab,20251007,43,rice with chicken,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBw...,Chicken and Rice,250,meta-llama/llama-4-scout-17b-16e-instruct
1,68eef08bcbc9fd077633d412,active_lab,20251011,5,peanuts and raisins,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBw...,Trail mix,50,meta-llama/llama-4-scout-17b-16e-instruct
2,68eef08bcbc9fd077633d413,active_lab,20251012,480,three-delicacy rice noodles,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBw...,Tom Yum Soup,350,meta-llama/llama-4-scout-17b-16e-instruct
3,68eef08bcbc9fd077633d414,active_lab,20251007,30,rice,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBw...,rice,80,meta-llama/llama-4-scout-17b-16e-instruct
4,68eef08ccbc9fd077633d415,active_lab,20251003,150,Curd rice,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBw...,yogurt-based side dish,200,meta-llama/llama-4-scout-17b-16e-instruct


In [31]:
data_df = data_df[["clientid", "datestring", "weight", "type", "estimated_weight", "estimated_type", "model"]]

In [32]:
data_df.head()

index,clientid,datestring,weight,type,estimated_weight,estimated_type,model
0,active_lab,20251007,43,rice with chicken,250,Chicken and Rice,meta-llama/llama-4-scout-17b-16e-instruct
1,active_lab,20251011,5,peanuts and raisins,50,Trail mix,meta-llama/llama-4-scout-17b-16e-instruct
2,active_lab,20251012,480,three-delicacy rice noodles,350,Tom Yum Soup,meta-llama/llama-4-scout-17b-16e-instruct
3,active_lab,20251007,30,rice,80,rice,meta-llama/llama-4-scout-17b-16e-instruct
4,active_lab,20251003,150,Curd rice,200,yogurt-based side dish,meta-llama/llama-4-scout-17b-16e-instruct


In [33]:
data_df["error"] = data_df["estimated_weight"] - data_df["weight"]

In [34]:
data_df["pct_error"] = (data_df["error"] / data_df["weight"]) * 100

In [36]:
data_df["abs_error"] = data_df["error"].abs()
data_df["abs_pct_error"] = data_df["pct_error"].abs()

In [37]:
data_df.head()

index,clientid,datestring,weight,type,estimated_weight,estimated_type,model,error,pct_error,abs_error,abs_pct_error
0,active_lab,20251007,43,rice with chicken,250,Chicken and Rice,meta-llama/llama-4-scout-17b-16e-instruct,207,481.395349,207,481.395349
1,active_lab,20251011,5,peanuts and raisins,50,Trail mix,meta-llama/llama-4-scout-17b-16e-instruct,45,900.0,45,900.0
2,active_lab,20251012,480,three-delicacy rice noodles,350,Tom Yum Soup,meta-llama/llama-4-scout-17b-16e-instruct,-130,-27.083333,130,27.083333
3,active_lab,20251007,30,rice,80,rice,meta-llama/llama-4-scout-17b-16e-instruct,50,166.666667,50,166.666667
4,active_lab,20251003,150,Curd rice,200,yogurt-based side dish,meta-llama/llama-4-scout-17b-16e-instruct,50,33.333333,50,33.333333


In [44]:
print("Mean Algebraic Error: {:.2f} grams".format(data_df["error"].mean()))
print("Mean Absolute Error: {:.2f} grams".format(data_df["abs_error"].mean()))
print("Mean Algebraic Percentage Error: {:.2f} %".format(data_df["pct_error"].mean()))
print("Mean Absolute Percentage Error: {:.2f} %".format(data_df["abs_pct_error"].mean()))

Mean Algebraic Error: 61.32 grams
Mean Absolute Error: 124.55 grams
Mean Algebraic Percentage Error: 169.84 %
Mean Absolute Percentage Error: 183.68 %
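
To make the four metrics concrete, the sketch below recomputes them in plain Python on the five rows shown by `data_df.head()`. It covers only those five samples, so its numbers differ from the full 31-image results above:

```python
# (true weight, estimated weight) in grams, copied from the five rows of data_df.head()
rows = [(43, 250), (5, 50), (480, 350), (30, 80), (150, 200)]

# Signed error keeps the direction: positive means the model overestimated.
errors = [est - true for true, est in rows]
pct_errors = [100 * (est - true) / true for true, est in rows]

mean_signed = sum(errors) / len(errors)
mean_abs = sum(abs(e) for e in errors) / len(errors)
mean_signed_pct = sum(pct_errors) / len(pct_errors)
mean_abs_pct = sum(abs(p) for p in pct_errors) / len(pct_errors)

print(f"Signed: {mean_signed:.2f} g, absolute: {mean_abs:.2f} g")
print(f"Signed: {mean_signed_pct:.2f} %, absolute: {mean_abs_pct:.2f} %")
```

In the signed mean, over- and underestimates partly cancel, so it is smaller than the absolute mean whenever both occur. The percentage metrics are dominated by light samples: the 5 g peanuts-and-raisins row alone contributes a 900 % error.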


In [45]:
data_df.to_csv("image_analysis_results.csv", index=False)