# 🤖 Distinguishing Humans from Humanoids (Photo & Video)

This cookbook demonstrates how to use Gemini 1.5 Pro's **multimodal capabilities** to distinguish between biological humans and humanoid robots in both **static images** and **video feeds**.

**Key Capabilities Covered:**
1.  **Spatial Reasoning (Images)**: Bounding box detection of humans vs robots.
2.  **Temporal Reasoning (Video)**: Analyzing movement patterns (gait, fluidity) to distinguish biological motion from servo-driven motion.

In [None]:
# 1. Setup & API Key
!pip install -q -U google-generativeai pillow

import google.generativeai as genai
import os
import time
import PIL.Image
from IPython.display import display, Image, Markdown

# 🔑 ENTER YOUR API KEY HERE
# Get one at https://aistudio.google.com/
os.environ.setdefault("GEMINI_API_KEY", "YOUR_API_KEY_HERE")  # keeps a key already set in the environment
genai.configure(api_key=os.environ["GEMINI_API_KEY"])

## 📸 Part 1: Image Analysis
We use a specialized prompt to detect fine-grained visual cues (skin texture vs cabling).

In [None]:
def analyze_image(image_path):
    print(f"🔍 Analyzing Image: {image_path}...")
    model = genai.GenerativeModel('gemini-1.5-pro-latest')
    
    prompt = """
    Analyze this scene for safety classification. 
    
    1. DETECT all bipedal figures in the image.
    2. For each figure, CLASSIFY as either 'HUMAN' or 'HUMANOID_ROBOT'.
    3. PROVIDE EVIDENCE for your classification based on visual features:
       - Human Cues: Skin texture/subsurface scattering, flexible clothing folds, hair, imperfect posture.
       - Robot Cues: Exposed cabling, rigid joint segments, metallic/plastic sheen, LED indicators, perfect symmetry.
       
    Output format: JSON list of objects { "box_2d": [ymin, xmin, ymax, xmax], "type": "HUMAN" | "HUMANOID_ROBOT", "confidence": 0-1, "evidence": "..." }
    """
    
    if not os.path.exists(image_path):
        return "❌ Error: Image file not found. Please add it to the assets folder."

    img = PIL.Image.open(image_path)
    response = model.generate_content([prompt, img])
    return response.text

# RUN IMAGE ANALYSIS
# Ensure you have added 'human_vs_robot.jpg' to the '../assets/' folder
image_path = "../assets/human_vs_robot.jpg"
print(analyze_image(image_path))
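
`analyze_image` returns the model's reply as raw text, so the requested JSON still has to be parsed before the boxes can be drawn or filtered. The sketch below is one way to do that. It assumes the reply contains a single JSON list, possibly wrapped in a Markdown code fence, and that `box_2d` is ordered `[ymin, xmin, ymax, xmax]` on a 0-1000 grid (the convention Gemini documents for bounding boxes; treat it as an assumption and check it against your own output). The helper names `parse_detections` and `to_pixel_box` and the sample reply are hypothetical, made up for illustration.

```python
import json
import re


def parse_detections(response_text):
    """Extract the JSON list from a model reply, tolerating a Markdown code fence."""
    # Grab the outermost [...] span so surrounding prose or fences are ignored.
    match = re.search(r"\[.*\]", response_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("No JSON list found in model response")
    return json.loads(match.group(0))


def to_pixel_box(box_2d, width, height, scale=1000):
    """Convert [ymin, xmin, ymax, xmax] on a 0..scale grid to (left, top, right, bottom) pixels."""
    ymin, xmin, ymax, xmax = box_2d
    return (
        round(xmin / scale * width),
        round(ymin / scale * height),
        round(xmax / scale * width),
        round(ymax / scale * height),
    )


# Hypothetical reply, invented for illustration (not real model output).
sample_reply = (
    '```json\n'
    '[{"box_2d": [100, 250, 900, 500], "type": "HUMAN", '
    '"confidence": 0.9, "evidence": "hair"}]\n'
    '```'
)

detections = parse_detections(sample_reply)
for det in detections:
    # Assumed image size of 1920x1080; use img.size from PIL in practice.
    print(det["type"], to_pixel_box(det["box_2d"], width=1920, height=1080))
```

The `(left, top, right, bottom)` tuple matches the box order that Pillow's `ImageDraw.rectangle` accepts, so the result can be drawn directly onto the image opened with `PIL.Image.open`.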

## 🎥 Part 2: Video Analysis (Temporal Gait Analysis)
For video, we upload the file through the Gemini File API (`genai.upload_file`) and wait for it to finish processing before prompting. This lets the model assess **movement** quality (fluid vs rigid), which is often the best differentiator.

In [None]:
def analyze_video(video_path):
    print(f"🎬 Uploading Video: {video_path}...")
    
    if not os.path.exists(video_path):
        return "❌ Error: Video file not found. Please add it to the assets folder."

    # 1. Upload Video
    video_file = genai.upload_file(path=video_path)
    
    # 2. Wait for Processing
    while video_file.state.name == "PROCESSING":
        print(".", end="")
        time.sleep(1)
        video_file = genai.get_file(video_file.name)
        
    if video_file.state.name == "FAILED":
        return "❌ Video processing failed."

    print("\n✅ Video Ready. Asking Gemini...")
    
    # 3. Temporal Prompt
    model = genai.GenerativeModel('gemini-1.5-pro-latest')
    prompt = """
    Watch this video carefully. Distinguish between BIOLOGICAL HUMANS and HUMANOID ROBOTS based on movement dynamics.
    
    Task:
    1. Identify the timestamps where a figure enters the frame.
    2. Analyze the gait (walking cycle):
       - Human: Fluid weight transfer, heel-strike, hip rotation, arm swing variance.
       - Robot: ZMP (Zero Moment Point) walking, bent-knee 'groucho' walk, rigid torso, mechanical servo noise (if audio exists).
    
    RETURN a classification report for each figure tracked.
    """
    
    response = model.generate_content([video_file, prompt])
    return response.text

# RUN VIDEO ANALYSIS
# Ensure you have added 'human_vs_robot.mp4' to the '../assets/' folder
video_path = "../assets/human_vs_robot.mp4"
print(analyze_video(video_path))
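
The polling loop in `analyze_video` waits for as long as the file reports `PROCESSING`, so a stalled upload would block the notebook indefinitely. A minimal sketch of a bounded wait is shown below. The helper name `wait_for_processing`, its parameters, and the default intervals are hypothetical choices for illustration; the only assumption about the SDK is what the notebook already relies on, namely that the file object exposes `.name` and `.state.name` and that a function such as `genai.get_file` returns a refreshed object for a given name.

```python
import time


def wait_for_processing(file_obj, refresh, poll_seconds=5.0, timeout_seconds=300.0):
    """Poll until file_obj leaves the PROCESSING state or the timeout elapses.

    `refresh` is any callable that takes the file's name and returns a fresh
    file object (for example, genai.get_file as used in analyze_video).
    """
    deadline = time.monotonic() + timeout_seconds
    while file_obj.state.name == "PROCESSING":
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{file_obj.name} still processing after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)
        file_obj = refresh(file_obj.name)
    # The caller still needs to check for a FAILED state, as analyze_video does.
    return file_obj
```

Inside `analyze_video`, the loop could then be replaced with `video_file = wait_for_processing(video_file, genai.get_file)`, followed by the existing `FAILED` check.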