# 01 - MediaPipe Hands Detection

This notebook demonstrates **real-time hand landmark detection** using the
**MediaPipe Tasks Vision API**.

**Goals of this notebook:**
- Verify that the camera works
- Detect up to **2 hands**
- Visualize the **21 hand landmarks**
- Quit cleanly with the **q** key

> No gesture logic or control mapping is implemented here.


### Imports

Libraries required for:
- Webcam access and drawing (OpenCV)
- FPS timing (`time`)
- Hand landmark detection (MediaPipe)
- Model file path handling (`pathlib`)

In [None]:
import cv2
import mediapipe as mp
import time

from pathlib import Path

### MediaPipe HandLandmarker Model

Download the official MediaPipe hand landmark model **only if it is not already present**.


In [None]:
MODEL_PATH = Path("hand_landmarker.task")

if not MODEL_PATH.exists():
    !wget -O hand_landmarker.task https://storage.googleapis.com/mediapipe-models/hand_landmarker/hand_landmarker/float16/1/hand_landmarker.task
else:
    print("hand_landmarker.task already exists")
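
The `!wget` shell escape only works where `wget` is installed, which is often not the case on Windows. As a minimal sketch of a portable alternative, the download can be done with the standard library instead. The helper name `ensure_model` and its `fetch` parameter are hypothetical and not part of MediaPipe; the URL is the one used in the cell above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Same model URL as in the wget command
MODEL_URL = (
    "https://storage.googleapis.com/mediapipe-models/hand_landmarker/"
    "hand_landmarker/float16/1/hand_landmarker.task"
)


def ensure_model(path, url=MODEL_URL, fetch=urlretrieve):
    """Download `url` to `path` unless the file already exists.

    Returns True if a download was triggered, False otherwise.
    `fetch` is injectable (hypothetical parameter) so the logic can be
    exercised without network access.
    """
    path = Path(path)
    if path.exists():
        return False
    fetch(url, str(path))
    return True
```

Calling `ensure_model(MODEL_PATH)` would then take the place of the `if`/`else` block.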

### HandLandmarker Initialization (Tasks API)

Initialize the MediaPipe **HandLandmarker** using the Tasks API, configured to detect up to two hands with detection and presence confidence thresholds of 0.7.


In [None]:
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

base_options = python.BaseOptions(
    model_asset_path=str(MODEL_PATH)
)

options = vision.HandLandmarkerOptions(
    base_options=base_options,
    num_hands=2,
    min_hand_detection_confidence=0.7,
    min_hand_presence_confidence=0.7
)

detector = vision.HandLandmarker.create_from_options(options)

print("MediaPipe Tasks (Hands) successfully initialized!")

### Real-Time Hand Landmark Visualization (Points Only)

Capture the webcam stream, detect hand landmarks with MediaPipe Tasks,
and draw only the landmark points (no connections).  
The real-time FPS is displayed in the top-left corner.  
Press **q** to quit the camera feed.
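
The FPS value in this section is computed from a single frame interval, so the displayed number can jitter from frame to frame. One possible way to steady it is an exponential moving average, sketched below. The `FpsMeter` class and its `alpha` parameter are hypothetical helpers, not part of OpenCV or MediaPipe.

```python
import time


class FpsMeter:
    """Exponentially smoothed frames-per-second estimate (hypothetical helper)."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha      # weight given to the newest measurement
        self.prev_time = None   # timestamp of the previous frame
        self.fps = 0.0          # current smoothed estimate

    def tick(self, now=None):
        """Register one frame and return the smoothed FPS."""
        if now is None:
            now = time.perf_counter()
        if self.prev_time is not None:
            elapsed = now - self.prev_time
            if elapsed > 0:  # avoid division by zero on identical timestamps
                instant = 1.0 / elapsed
                if self.fps == 0.0:
                    # Seed the average with the first real measurement
                    self.fps = instant
                else:
                    # Move the estimate a fraction `alpha` toward the new value
                    self.fps += self.alpha * (instant - self.fps)
        self.prev_time = now
        return self.fps
```

Inside the capture loop, `fps = meter.tick()` (with `meter = FpsMeter()` created before the loop) could stand in for the manual `prev_time` bookkeeping. A smaller `alpha` gives a steadier but slower-reacting readout.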

In [None]:
# Camera initialization
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    raise RuntimeError("Cannot open camera")

prev_time = 0  # Timestamp of the previous frame, used to compute FPS

try:
    while cap.isOpened():
        # Capture a frame
        ret, frame = cap.read()
        if not ret:
            print("Failed to read frame")
            break

        # Mirror the frame for natural interaction
        frame = cv2.flip(frame, 1)

        # Convert BGR frame (OpenCV) to RGB (MediaPipe)
        rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb_frame)

        # Detect hand landmarks
        result = detector.detect(mp_image)

        if result.hand_landmarks:
            for hand_landmarks in result.hand_landmarks:
                # Draw all landmark points
                for lm in hand_landmarks:
                    x = int(lm.x * frame.shape[1])
                    y = int(lm.y * frame.shape[0])
                    cv2.circle(frame, (x, y), 4, (0, 0, 255), -1)

        # Compute and display FPS (guard against a zero time delta)
        curr_time = time.time()
        elapsed = curr_time - prev_time
        fps = 1 / elapsed if prev_time and elapsed > 0 else 0
        prev_time = curr_time
        cv2.putText(
            frame,
            f"FPS: {int(fps)}",
            (10, 30),
            cv2.FONT_HERSHEY_SIMPLEX,
            1,
            (255, 0, 0),
            2
        )

        # Show the frame
        cv2.imshow("MediaPipe Hands", frame)

        # Quit on 'q'
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

finally:
    cap.release()
    cv2.destroyAllWindows()
    print("Camera closed cleanly")
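
The loop above scales each normalized landmark (`lm.x`, `lm.y`) by the frame width and height to get pixel coordinates. As a small sketch, that conversion can be pulled out into a helper that also clamps the result to the image bounds. This assumes landmarks may occasionally fall slightly outside the `[0, 1]` range when a hand is partly out of frame. `cv2.circle` tolerates out-of-range centers, so clamping mainly matters if the coordinates are later used to index into the frame array. The name `to_pixel` is hypothetical.

```python
def to_pixel(x_norm, y_norm, width, height):
    """Map normalized landmark coordinates to integer pixel coordinates,
    clamped to the valid range [0, width - 1] x [0, height - 1]."""
    x = min(max(int(x_norm * width), 0), width - 1)
    y = min(max(int(y_norm * height), 0), height - 1)
    return x, y
```

In the drawing loop this would be used as `x, y = to_pixel(lm.x, lm.y, frame.shape[1], frame.shape[0])`.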
