# VLM Benchmark for Object Property Abstraction

This notebook implements a benchmark for evaluating Vision Language Models (VLMs) on object property abstraction and visual question answering (VQA) tasks. The benchmark includes three types of questions:

1. Direct Recognition
2. Property Inference
3. Counterfactual Reasoning

The images come in three types:
- REAL
- ANIMATED
- AI GENERATED
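
For orientation, the sketch below shows one benchmark entry in the shape that the evaluation code in this notebook reads: a top-level `benchmark` key holding an `images` list, where each image carries `image_id`, `image_type`, `path`, and a list of `questions` with `id`, `question`, `answer`, and `property_category`. The key names are taken from the code further down; every value (the path, the question IDs, the answers, and the category labels) is an illustrative placeholder and not real benchmark data. The helper `summarize` is hypothetical as well.

```python
import json

# Hypothetical entry: key names follow what the evaluation code reads;
# all values below are made-up placeholders, not real benchmark data.
example_benchmark = {
    "benchmark": {
        "images": [
            {
                "image_id": "image01",
                "image_type": "REAL",
                "path": "images/real/image01.jpg",  # assumed, relative to data_dir
                "questions": [
                    {
                        "id": "Q1",
                        "question": "How many objects made of wood are present?",
                        "answer": "3",  # placeholder
                        "property_category": "material",  # placeholder label
                    },
                    {
                        "id": "Q2",
                        "question": "Count the number of breakable items?",
                        "answer": "2",  # placeholder
                        "property_category": "physical",  # placeholder label
                    },
                    {
                        "id": "Q3",
                        "question": "If one of the metal objects were replaced by a "
                                    "wooden object, how many wooden objects would "
                                    "be there in the image?",
                        "answer": "4",  # placeholder
                        "property_category": "material",  # placeholder label
                    },
                ],
            }
        ]
    }
}


def summarize(benchmark: dict) -> dict:
    """Count the number of questions per image type."""
    counts = {}
    for image in benchmark["benchmark"]["images"]:
        image_type = image["image_type"]
        counts[image_type] = counts.get(image_type, 0) + len(image["questions"])
    return counts


print(json.dumps(summarize(example_benchmark)))
```

Running a check like this on the real `benchmark.json` before an evaluation is a cheap way to confirm that every image type is present and that no image is missing its questions.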

## Setup and Imports

First, let's import the necessary libraries and set up our environment.

In [1]:
# Install required packages
!pip install transformers torch Pillow tqdm









In [2]:
# Import required libraries
import torch
import json
from pathlib import Path
from PIL import Image
import gc
import re
from tqdm import tqdm
from typing import List, Dict, Any

# Check if CUDA is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

Using device: cuda


## Benchmark Tester Class

This class handles the evaluation of models against our benchmark.

In [3]:
class BenchmarkTester:
    def __init__(self, benchmark_path="/var/scratch/ave303/OP_bench/benchmark.json", data_dir="/var/scratch/ave303/OP_bench/"):
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        with open(benchmark_path, 'r') as f:
            self.benchmark = json.load(f)
        self.data_dir = data_dir
    
    def format_question(self, question, model_name):
        """Format a question as a counting prompt for the model."""
        # Both supported models currently share the same prompt; model_name is
        # kept so that model-specific prompts can be reintroduced later.
        # A longer prompt variant asking for "number [objects]" is disabled.
        return f"Question: {question['question']} Answer(total number):"

    def clean_answer(self, answer):
        """Extract the count and, if present, the listed objects from the model output.

        Expects output of the form "<number> [obj1, obj2, ...]". If that format
        is not matched, falls back to the first number found in the text, or
        "0" if the text contains no number at all.
        """
        # `re` is already imported at the top of the notebook.
        pattern = r'(\d+)\s*\[(.*?)\]'
        match = re.search(pattern, answer)

        if match:
            number = match.group(1)
            objects = [obj.strip() for obj in match.group(2).split(',')]
            return {
                "count": number,
                "reasoning": objects
            }
        else:
            # Fallback if the format isn't matched
            numbers = re.findall(r'\d+', answer)
            return {
                "count": numbers[0] if numbers else "0",
                "reasoning": []
            }

    def model_generation(self, model_name, model, inputs, processor):
        """Generate answer and decode."""
        outputs = None  # Initialize outputs to None
        
        if model_name=="blip2":
            outputs = model.generate(**inputs)
            answer = processor.batch_decode(outputs, skip_special_tokens=True)[0].strip()
            
        elif model_name=="fuyu-8b":
            outputs = model.generate(
                **inputs,
                max_new_tokens=30,  # Cap the number of generated tokens
                pad_token_id=processor.tokenizer.eos_token_id
            )
            answer = processor.batch_decode(outputs[:, -30:], skip_special_tokens=True)[0]
        else:
            print(f"Warning: Unknown model name '{model_name}' in model_generation.")
            answer = ""  # Return an empty string

        return answer, outputs
    
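    def evaluate_model(self, model_name, model, processor, save_path, start_idx=0, batch_size=5):
        """Evaluate a model on images [start_idx, start_idx + batch_size) and save the results as JSON.

        Note: batch_size is the number of images processed in this call, not an
        inference batch size; each question is run through the model one at a time.
        """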
        results = []
        print(f"\nEvaluating {model_name}...")
        print(f"Using device: {self.device}")
        
        # Force garbage collection before starting
        gc.collect()
        torch.cuda.empty_cache()

        try:
            images = self.benchmark['benchmark']['images'][start_idx:start_idx + batch_size]
            total_images = len(images)
            
            for idx, image_data in enumerate(tqdm(images, desc="Processing images")):
                try:
                    print(f"\nProcessing image {idx+1}/{total_images}: {image_data['image_id']}")
                    image_path = Path(self.data_dir)/image_data['path']
                    if not image_path.exists():
                        print(f"Warning: Image not found at {image_path}")
                        continue
                    
                    # Load and preprocess image
                    image = Image.open(image_path).convert("RGB")
                    image_results = []  # Store results for current image
                    
                    for question in image_data['questions']:
                        try:
                            prompt = self.format_question(question, model_name)
                            print(f"Question: {question['question']}")
                            
                            # Clear cache before processing each question
                            torch.cuda.empty_cache()
                            
                            # Process image and text
                            inputs = processor(images=image, text=prompt, return_tensors="pt").to(self.device)
                            
                            # Generate answer with better settings
                            with torch.no_grad():
                                answer, outputs = self.model_generation(model_name, model, inputs, processor)    #call for model.generate
                                
                            cleaned_answer = self.clean_answer(answer)
                            
                            image_results.append({
                                "image_id": image_data["image_id"],
                                "image_type": image_data["image_type"],
                                "question_id": question["id"],
                                "question": question["question"],
                                "ground_truth": question["answer"],
                                "model_answer": cleaned_answer["count"],
                                "model_reasoning": cleaned_answer["reasoning"],
                                "raw_answer": answer,  # Keep raw answer for debugging
                                "property_category": question["property_category"]
                            })
                            
                            # Clear memory
                            del outputs, inputs
                            torch.cuda.empty_cache()
                            
                        except Exception as e:
                            print(f"Error processing question: {str(e)}")
                            continue
                    
                    # Add results from this image
                    results.extend(image_results)
                    
                    # Save intermediate results only every 2 images or if it's the last image
                    if (idx + 1) % 2 == 0 or idx == total_images - 1:
                        with open(f"{save_path}_checkpoint.json", 'w') as f:
                            json.dump(results, f, indent=4)
                            
                except Exception as e:
                    print(f"Error processing image {image_data['image_id']}: {str(e)}")
                    continue
            
            # Save final results
            if results:
                with open(save_path, 'w') as f:
                    json.dump(results, f, indent=4)
            
        except Exception as e:
            print(f"An error occurred during evaluation: {str(e)}")
            if results:
                with open(f"{save_path}_error_state.json", 'w') as f:
                    json.dump(results, f, indent=4)
        
        return results
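
The answer-parsing step is easy to try in isolation. The sketch below is a standalone mirror of the `clean_answer` logic (the function name `parse_count_answer` and the sample strings are hypothetical, chosen only for illustration) so that raw model outputs can be tested without loading a model.

```python
import re


def parse_count_answer(answer: str) -> dict:
    """Standalone mirror of clean_answer, for experimenting with raw outputs."""
    # Preferred format: "<number> [obj1, obj2, ...]"
    match = re.search(r'(\d+)\s*\[(.*?)\]', answer)
    if match:
        objects = [obj.strip() for obj in match.group(2).split(',')]
        return {"count": match.group(1), "reasoning": objects}

    # Fallback: first number anywhere in the text, or "0" if there is none
    numbers = re.findall(r'\d+', answer)
    return {"count": numbers[0] if numbers else "0", "reasoning": []}


# Hypothetical raw outputs, not taken from an actual model run
samples = ["3 [chair, table, shelf]", "There are 2 zebras", "several"]
for sample in samples:
    print(repr(sample), "->", parse_count_answer(sample))
```

One consequence of the fallback is worth keeping in mind when reading results: an output with no digits is recorded as `"0"`, which cannot be told apart from a genuine answer of zero. The `raw_answer` field saved with each result is the place to check such cases.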

## Test Fuyu Model

Let's evaluate the Fuyu-8b model on our benchmark.

In [4]:
def test_fuyu():
    #from transformers import AutoModelForCausalLM, AutoTokenizer
    from transformers import FuyuProcessor, FuyuForCausalLM
    
    print("Loading Fuyu-8b model...")
    model = FuyuForCausalLM.from_pretrained(
        "/var/scratch/ave303/models/fuyu-8b",
        # load_in_8bit=True,
        torch_dtype=torch.float16,
        device_map="auto",
        low_cpu_mem_usage=True
    ).eval()
    processor = FuyuProcessor.from_pretrained("/var/scratch/ave303/models/fuyu-8b")

    # Note: fuyu-8b is very slow and gives only average performance

    # Optional: Enable memory efficient attention
    if hasattr(model.config, 'use_memory_efficient_attention'):
        model.config.use_memory_efficient_attention = True
        
    tester = BenchmarkTester()
    fuyu_results = tester.evaluate_model(
        "fuyu-8b",
        model, 
        processor, 
        "fuyu_8b_results.json", 
        batch_size=50
    )
    # tester.save_results("fuyu_results.json")

    # evaluate_model always returns a list, so check that it is non-empty
    if fuyu_results:
        print("Initial test successful!")
    
    # Clean up
    del model, processor
    torch.cuda.empty_cache()
    gc.collect()

## Test BLIP-2 Model

Now let's evaluate the BLIP-2 model.

In [5]:
def test_blip2():
    from transformers import Blip2Processor, Blip2ForConditionalGeneration
    
    print("Loading BLIP-2 model...")
    model = Blip2ForConditionalGeneration.from_pretrained(
        "/var/scratch/ave303/models/blip2opt6.7b",
        # load_in_8bit=True,
        torch_dtype=torch.float16,
        device_map="auto",
        # temperature=0.8,
        low_cpu_mem_usage=True
    ).eval()  # device_map="auto" already places the weights, so no extra .to('cuda') is needed
    processor = Blip2Processor.from_pretrained("/var/scratch/ave303/models/blip2opt6.7b")

    # Notes on the BLIP-2 variants tried, each followed by the prompt suffix used:
    ## opt-2.7b: average performance, better instruction following
        # Format - Answer(total number):
    ## opt-6.7b (8-bit): better performance in that it at least answers; not well instruction-tuned, but provides a number
        # Format - Answer(total number):
    ## flan-t5-xl: does fine but needs a lot of post-processing; does not follow instructions too clearly
        # Format - Answer(provide total number):
    ## flan-t5-xxl (8-bit): decent performance, seemingly better with instructions; slight post-processing needed
        # Format - Answer:
    
    # Optional: Enable memory efficient attention
    if hasattr(model.config, 'use_memory_efficient_attention'):
        model.config.use_memory_efficient_attention = True
    
    tester = BenchmarkTester()
    blip2_results = tester.evaluate_model(
        "blip2",
        model, 
        processor, 
        "blip2-opt6.7b_results.json", 
        batch_size=50
    )
    # tester.save_results("blip2_results.json")

    # evaluate_model always returns a list, so check that it is non-empty
    if blip2_results:
        print("Initial test successful!")
    
    # Clean up
    del model, processor
    torch.cuda.empty_cache()
    gc.collect()

## Run Evaluation

Now we can run our evaluation. The Fuyu-8b run is left commented out below; uncomment it to evaluate that model:

In [6]:
# test_fuyu()

And then the BLIP-2 model:

In [7]:
test_blip2()



Loading BLIP-2 model...


Loading checkpoint shards: 100%|██████████| 7/7 [00:31<00:00,  4.45s/it]


Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.



Evaluating blip2...
Using device: cuda


Processing images:   0%|          | 0/50 [00:00<?, ?it/s]


Processing image 1/50: image01
Question: How many objects made of wood are present?


Question: Count the number of breakable items?
Question: If one of the metal objects were replaced by a wooden object, how many wooden objects would be there in the image?


Processing images:   2%|▏         | 1/50 [00:01<01:10,  1.44s/it]


Processing image 2/50: image02
Question: How many mammals are present in the image?
Question: Count the number of items that can store other items?


Processing images:   4%|▍         | 2/50 [00:01<00:37,  1.26it/s]

Question: If one of the zebra were replaced by a tree, how many mammals would be present in the image?

Processing image 3/50: image03
Question: How many objects made of rubber are present?


Question: How many objects with the primary purpose of illumination can be seen?
Question: If the person riding one of the bicycles were replaced by a pedestrian, how many objects that have handles would be present?


Processing images:   6%|▌         | 3/50 [00:02<00:30,  1.54it/s]


Processing image 4/50: image04
Question: How many tools are visible in the image?
Question: How many cutting tools are present in this image?


Processing images:   8%|▊         | 4/50 [00:02<00:24,  1.92it/s]

Question: If the red handle were replaced by a wooden handle, how many colored artifacts would remain in the image?

Processing image 5/50: image05
Question: How many furniture items are present that have legs?


Question: Count the number of containers that cannot hold hot liquids?
Question: If the room were transformed into an open workspace instead of a meeting room, how many privacy features would need to be removed?


Processing images:  10%|█         | 5/50 [00:02<00:21,  2.13it/s]


Processing image 6/50: image06
Question: How many reptiles are visible in this enclosure?
Question: How many reptilian couples, at maximum, are present?


Processing images:  12%|█▏        | 6/50 [00:03<00:18,  2.34it/s]

Question: If all the small pebbles forming the mosaic floor were replaced with sand, how many natural elements would still be visible in the enclosure?

Processing image 7/50: image07
Question: How many birds are visible in this image?
Question: How many objects are present that can comfortably seat a human?


Processing images:  14%|█▍        | 7/50 [00:03<00:15,  2.71it/s]

Question: If the birds sitting together only on one railing were to fly away, how many birds would remain?

Processing image 8/50: image08
Question: How many reptiles are visible in this image?
Question: How many objects are present that act as support?


Processing images:  16%|█▌        | 8/50 [00:03<00:13,  3.05it/s]

Question: If one turtle slid off the log into the water, how many turtles would be in the water?

Processing image 9/50: image09
Question: How many different types of vegetables are present in the image?
Question: How many objects are used as containers?


Processing images:  18%|█▊        | 9/50 [00:04<00:12,  3.32it/s]

Question: If the bag of limes were removed and replaced with two additional avocados, how many fruits would be present in total on the table, considering avocados are fruits?

Processing image 10/50: image10
Question: How many objects are present that are flexible?
Question: Count the number of items that are battery powered?


Processing images:  20%|██        | 10/50 [00:04<00:11,  3.52it/s]

Question: If two phones with three camera lenses were replaced with phones having two camera lenses, how many phones with two camera lenses would be present?

Processing image 11/50: image11
Question: How many objects made of glass are present on the table?
Question: How many objects are present at the table that can be used for sitting?


Processing images:  22%|██▏       | 11/50 [00:04<00:11,  3.49it/s]

Question: If the tables in the center are removed, how many objects are visible that have legs?

Processing image 12/50: image12
Question: How many pieces of gym equipment are visible in the image?
Question: How many objects are present that provide shade?


Processing images:  24%|██▍       | 12/50 [00:04<00:10,  3.58it/s]

Question: If two of the stationary bikes were replaced by two treadmills, how many objects would be present that have pedals?

Processing image 13/50: image13
Question: How many furniture items are present in the room?
Question: How many individual storage compartments are present in the furniture items in the room?


Processing images:  26%|██▌       | 13/50 [00:05<00:09,  3.77it/s]

Question: If the two bedside lamps were removed, how many objects are present that need electricity?

Processing image 14/50: image14
Question: How many objects are present that are transparent?
Question: How many objects are positioned for student use to place other items?


Processing images:  28%|██▊       | 14/50 [00:05<00:09,  3.81it/s]

Question: If the signages were removed, how many objects would be present that hang from the ceiling?

Processing image 15/50: image15
Question: How many objects made of rubber are present?


Question: How many objects are visible that can be used to move up?
Question: If the car on the ground is driven out of the garage, how many objects are present that is used to indicate slowing down to a stop?


Processing images:  30%|███       | 15/50 [00:05<00:09,  3.51it/s]


Processing image 16/50: image16
Question: How many objects made of rubber are present?
Question: How many objects can be used as modes of transport if fixed?


Processing images:  32%|███▏      | 16/50 [00:06<00:10,  3.26it/s]

Question: If the car in the center is fixed and driven out of the garage, how many objects made of rubber would be visible in the image?

Processing image 17/50: image17
Question: How many yellow colored objects are present?
Question: How many objects are visible that are used to protect the head?


Processing images:  34%|███▍      | 17/50 [00:06<00:09,  3.50it/s]

Question: If one person leaves the cleaning group, how many mammals would remain?

Processing image 18/50: image18
Question: How many mammals are visible in the image?
Question: How many objects are present that provide shelter?


Processing images:  36%|███▌      | 18/50 [00:06<00:08,  3.65it/s]

Question: If the mammals are to all step inside the shelters, how many natural elements are visible in the image?

Processing image 19/50: image19
Question: How many gardening tools are present that are made of metal?
Question: How many objects are present in the garden that can hold other items?


Processing images:  38%|███▊      | 19/50 [00:06<00:08,  3.70it/s]

Question: If half the woven baskets are filled, how many containers would remain empty?

Processing image 20/50: image20
Question: How many objects in the background are present that have legs?
Question: How many objects in the foreground are visible that are foldable?


Processing images:  40%|████      | 20/50 [00:07<00:08,  3.60it/s]

Question: If the stack of books on the table in the foreground was moved to the shelf, how many objects in physical contact with the table would be present?

Processing image 21/50: image01
Question: How many mammals are present in total?
Question: How many objects are visible that can store items?


Processing images:  42%|████▏     | 21/50 [00:07<00:07,  3.78it/s]

Question: If the bear were to be replaced by a tree, how many different types of mammals would be there at the zoo?

Processing image 22/50: image02
Question: How many kitchen tools are visible in the image?
Question: Count the number of items that require electricity to operate?


Processing images:  44%|████▍     | 22/50 [00:07<00:07,  3.90it/s]

Question: If blinds were installed for the windows above the sink, how many transparent objects would remain?

Processing image 23/50: image03
Question: How many objects made of glass are present?
Question: How many tools are visible that can be used for cutting?


Processing images:  46%|████▌     | 23/50 [00:07<00:06,  3.89it/s]

Question: If the worker was not wearing ear protection, how many protective items would remain?

Processing image 24/50: image04
Question: How many objects made of rubber are present?
Question: Excluding the drawers, how many items in the workshop serve as containers for storage?


Processing images:  48%|████▊     | 24/50 [00:08<00:06,  3.75it/s]

Question: If an electric fan were placed on the workstation to provide ventilation, how many objects in the room would require electricity to operate?

Processing image 25/50: image05
Question: How many birds are visible in the image?
Question: How many objects are present that act as support?


Processing images:  50%|█████     | 25/50 [00:08<00:06,  3.90it/s]

Question: If the clouds were to completely cover the sky, blocking the sunlight, how many natural elements would still be visible?

Processing image 26/50: image06
Question: How many objects are present that have chimneys?
Question: How many objects are visible that are means of transportation?


Processing images:  52%|█████▏    | 26/50 [00:08<00:06,  3.60it/s]

Question: If the bus were replaced by a pedestrian, how many mammals would be present?

Processing image 27/50: image07
Question: How many objects made of glass are present?


Processing images:  54%|█████▍    | 27/50 [00:08<00:06,  3.76it/s]

Question: Count the number of items that can be used to carry liquid?
Question: If the waste to be disposed was color-coded to match the bins, how many objects are to be thrown in the bin on the right?

Processing image 28/50: image08
Question: How many objects are present that have legs?


Processing images:  56%|█████▌    | 28/50 [00:09<00:05,  3.69it/s]

Question: How many items are visible that are openable?
Question: If the bottle was removed from the table, how many objects are present on top of the table?

Processing image 29/50: image09
Question: How many objects made of wood are present?


Processing images:  58%|█████▊    | 29/50 [00:09<00:05,  3.71it/s]

Question: How many kitchen items are visible that can be used for cutting?
Question: If the two jars on the top shelf were removed, how many breakable items would be present in the image?

Processing image 30/50: image10
Question: How many objects made of plastic are visible?


Processing images:  60%|██████    | 30/50 [00:09<00:05,  3.81it/s]

Question: How many items are visible that can record audio?
Question: If the microphones were replaced with headsets for every character, how many objects in total would be present that are worn on the head?

Processing image 31/50: image11


Question: How many different food items are present on the kitchen countertop?
Question: How many objects are visible that need electricity to operate?


Processing images:  62%|██████▏   | 31/50 [00:10<00:05,  3.24it/s]

Question: If all the objects on the two shelves above the counter were placed inside the cabinet, how many items that are breakable would be present on the counter?

Processing image 32/50: image12
Question: How many different types of plants are present?


Question: How many objects are visible that behave as containers?
Question: If all the visible plants were potted individually and placed on the stand, how many pots would be present on the stand?


Processing images:  64%|██████▍   | 32/50 [00:10<00:05,  3.10it/s]


Processing image 33/50: image13
Question: How many mammals are visible in the image?


Question: How many objects are present that can be used for sitting?
Question: If the character standing upright took a seat for themself and the huddled group are seated in pairs, that is two characters per seat. How many objects would remain that can be used for sitting?


Processing images:  66%|██████▌   | 33/50 [00:11<00:07,  2.37it/s]


Processing image 34/50: image14
Question: How many cardboard objects are visible in the image?
Question: How many objects are visible that can be used for sitting?
Question: If the bottled objects and the white cups are packed away, how many objects are present that can be used to drink out of?


Processing images:  68%|██████▊   | 34/50 [00:11<00:05,  2.69it/s]


Processing image 35/50: image15
Question: How many objects that are present have wheels?
Question: How many items are visible that can be used to hold liquids?


Processing images:  70%|███████   | 35/50 [00:11<00:05,  2.82it/s]

Question: If the car drives away, how many objects made of rubber are visible?

Processing image 36/50: image16
Question: How many objects made of glass are present?


Question: How many tools designed for gathering or sweeping are visible?
Question: If there was a flood and the water washed up the beach, completely submerging it, how many natural elements would be present in the image?


Processing images:  72%|███████▏  | 36/50 [00:12<00:05,  2.61it/s]


Processing image 37/50: image17
Question: How many objects are visible that have legs?
Question: How many objects are visible that are attached to the wall or ceiling?
Question: If the blinds are pulled over the window, how many sources of illumination would remain?


Processing images:  74%|███████▍  | 37/50 [00:12<00:04,  2.81it/s]


Processing image 38/50: image18
Question: How many objects made of rubber are visible?
Question: How many objects are present that can hold liquids?
Question: If the tools hanging on the wall were to be placed on the shelf, how many objects would be present on the shelf?


Processing images:  76%|███████▌  | 38/50 [00:12<00:03,  3.09it/s]


Processing image 39/50: image19
Question: How many different types of gym equipment are present?
Question: How many pieces of exercise equipment primarily designed for cardiovascular workouts are visible?
Question: If the blinds were pulled over the windows, how many sources of illumination would remain?


Processing images:  78%|███████▊  | 39/50 [00:12<00:03,  3.29it/s]


Processing image 40/50: image20
Question: How many objects are present that have legs?
Question: How many objects are visible that act as protection or shade?
Question: If the laptop were placed on the shelf next to the TV, how many objects would be present on the shelf?


Processing images:  80%|████████  | 40/50 [00:13<00:02,  3.37it/s]


Processing image 41/50: image01
Question: How many objects made of rubber are visible?
Question: How many objects are visible that are means of transportation?
Question: If the car in the driveway were to leave, how many objects primarily made of metal would be present?


Processing images:  82%|████████▏ | 41/50 [00:13<00:02,  3.45it/s]


Processing image 42/50: image02
Question: How many objects made of concrete are present?
Question: How many objects are visible that can be used for lifting?
Question: If the orange paint spilled all over one of the plexiglass sheets, how many objects would remain that are transparent?


Processing images:  84%|████████▍ | 42/50 [00:13<00:02,  3.47it/s]


Processing image 43/50: image03
Question: How many mammals are present in the image?
Question: How many objects are visible that are used for both meat and wool production?
Question: If the two sheep were replaced by a cow grazing in the same area, how many objects would be present in between the two fences?


Processing images:  86%|████████▌ | 43/50 [00:14<00:02,  3.48it/s]


Processing image 44/50: image04
Question: How many objects are visible that are made of paper?
Question: How many objects are present that behave as storage spaces?
Question: If the glasses were placed inside the ceramic container, and we use this container as a dividing line between the left and right sides of the bookshelf, how many objects would be on the right side?


Processing images:  88%|████████▊ | 44/50 [00:14<00:01,  3.44it/s]


Processing image 45/50: image05
Question: How many objects are visible that are made of porcelain?
Question: How many decoration items are present in the image?
Question: If the drinks were split evenly between the two humans, how many drinks would each human consume?


Processing images:  90%|█████████ | 45/50 [00:14<00:01,  3.55it/s]


Processing image 46/50: image06
Question: How many mammals are present in the image?
Question: How many objects are visible that are designed to contain liquids?
Question: If the trash bags and bottles on the sand are only thrown into the black bin, how many mammals are actively holding some other object?


Processing images:  92%|█████████▏| 46/50 [00:14<00:01,  3.61it/s]


Processing image 47/50: image07
Question: How many mammals are present in the image?
Question: How many objects are present that provide shelter?
Question: If one of the mammals douses the fire, how many objects are present that can be switched off?


Processing images:  94%|█████████▍| 47/50 [00:15<00:00,  3.56it/s]


Processing image 48/50: image08
Question: How many different types of gym equipment are present?
Question: How many objects are visible that are positioned between the row of treadmills and the bench press station?
Question: If one of the treadmills is faulty and removed from the gym, how many objects are present that convey some kind of information?


Processing images:  96%|█████████▌| 48/50 [00:15<00:00,  3.49it/s]


Processing image 49/50: image09
Question: How many objects made of rubber are visible in the image?
Question: How many objects are visible that need electricity to operate?


Processing images:  98%|█████████▊| 49/50 [00:15<00:00,  3.45it/s]

Question: If one of the workers took a wrench off the table, how many objects would remain in physical contact with the table?

Processing image 50/50: image10
Question: How many objects are visible that are made of metal?
Question: How many objects present are breakable?


Processing images: 100%|██████████| 50/50 [00:16<00:00,  3.12it/s]

Question: If the bowls with the tomatoes and the chickpeas were emptied into the steaming pot, how many containers would still have something remaining in them?
Initial test successful!
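
The run above only writes the results file; it does not score it. A minimal scoring sketch follows, assuming that exact string match between `model_answer` and `ground_truth` is an acceptable metric (the notebook itself does not define one). The helper `accuracy_by` and the `demo_results` records are hypothetical; the record keys match those written by `evaluate_model`.

```python
from collections import defaultdict


def accuracy_by(results, key):
    """Exact-match accuracy of model_answer vs ground_truth, grouped by `key`."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in results:
        group = record[key]
        totals[group] += 1
        # Compare as stripped strings so "3", 3 and " 3" all count as equal
        if str(record["model_answer"]).strip() == str(record["ground_truth"]).strip():
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


# Made-up records for illustration only, not actual model results
demo_results = [
    {"image_type": "REAL", "ground_truth": "3", "model_answer": "3"},
    {"image_type": "REAL", "ground_truth": 2, "model_answer": "4"},
    {"image_type": "ANIMATED", "ground_truth": "1", "model_answer": "1"},
]
print(accuracy_by(demo_results, "image_type"))
```

For a real run, the list can be loaded with `json.load` from the saved file (here `blip2-opt6.7b_results.json`) and grouped by `"image_type"` or `"property_category"`.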



