<a href="https://colab.research.google.com/github/victoriamazilu/human-detection/blob/main/human_detection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Before you start

Let's make sure that we have access to GPU. We can use `nvidia-smi` command to do that. In case of any problems navigate to `Edit` -> `Notebook settings` -> `Hardware accelerator`, set it to `GPU`, and then click `Save`.

In [2]:
!nvidia-smi

Thu Jun  5 02:48:05 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   55C    P8              9W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [3]:
import os
HOME = os.getcwd()
print(HOME)

/content


## Install YOLOv8

YOLOv8 can be installed in two ways - from the source and via pip. This is because it is the first iteration of YOLO to have an official package.

In [4]:
# Pip install method (recommended)
!pip install numpy==1.24.3 --quiet

!pip install ultralytics==8.2.103 -q

from IPython import display
display.clear_output()

import ultralytics
ultralytics.checks()

Ultralytics YOLOv8.2.103 🚀 Python-3.11.12 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Setup complete ✅ (2 CPUs, 12.7 GB RAM, 41.7/112.6 GB disk)


In [5]:
from ultralytics import YOLO

from IPython.display import display, Image

In [6]:
model = YOLO(f'{HOME}/yolov8s-seg.pt')
#results = model.predict(source='https://media.roboflow.com/notebooks/examples/dog.jpeg', conf=0.25)

In [7]:
%%html
<video id="video" autoplay playsinline style="display:none;"></video>
<canvas id="canvas" style="display:none;"></canvas>
<script>
(async () => {
  const video  = document.getElementById('video');
  const canvas = document.getElementById('canvas');
  const ctx    = canvas.getContext('2d');

  try {
    const stream = await navigator.mediaDevices.getUserMedia({video:true});
    video.srcObject = stream;
    await video.play();
  } catch (e) {
    console.error('Camera access denied or not available', e);
    return;
  }

  canvas.width  = video.videoWidth;
  canvas.height = video.videoHeight;
  google.colab.output.setIframeHeight(canvas.height + 20);

  window.captureFrame = () => {
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    return canvas.toDataURL('image/jpeg', 0.8);
  };
</script>


# **Second Day Challenge**.


In [8]:
from ultralytics import YOLO
pose_model = YOLO("yolov8n-pose.pt")  # or yolov8m-pose.pt for better accuracy


## Ultralytics YOLOv8-Pose follows the 17-keypoint “COCO” convention
---

## 1. How the angle calculation works

1. **Gather the three 2-D points** that form the joint:  
   * **S** (start / proximal …)  
   * **E** (elbow – vertex)  
   * **W** (end / distal …)

2. **Form two limb-segment vectors** that share the vertex  

$$
\vec{v_1} = \mathbf{S} - \mathbf{E},\qquad
\vec{v_2} = \mathbf{W} - \mathbf{E}
$$


3. **Compute the cosine**  

$$
\cos\theta = \frac{\,\vec{v_1}\cdot\vec{v_2}\,}
                 {\lVert\vec{v_1}\rVert\,\lVert\vec{v_2}\rVert}
$$


4. **Convert to degrees**  


$$
\theta = \arccos(\cos\theta)\times\frac{180}{\pi}
$$

The tiny `1e-6` in the denominator of `angle_at_joint()` avoids division-by-zero.

---

## 2. YOLOv8 / COCO 17-keypoint index map

| Index | Landmark | Typical use-case | Notes |
|-----:|-----------|------------------|-------|
| 0 | Nose | Head orientation | Center of face |
| 1 | Left Eye | Gaze / blink | |
| 2 | Right Eye | | |
| 3 | Left Ear | Head yaw | |
| 4 | Right Ear | | |
| **5** | **Left Shoulder** | Upper-arm root | Start of **left elbow angle** |
| **6** | **Right Shoulder** | | Start of **right elbow angle** |
| **7** | **Left Elbow** | Elbow flexion | Vertex |
| **8** | **Right Elbow** | | Vertex |
| 9 | Left Wrist | Wrist pose / elbow | |
| 10 | Right Wrist | | |
| 11 | Left Hip | Trunk inclination | |
| 12 | Right Hip | | |
| 13 | Left Knee | Knee flexion | |
| 14 | Right Knee | | |
| 15 | Left Ankle | Gait analysis | |
| 16 | Right Ankle | | |

### Example – elbow flexion  
* **Left:** `[5 → 7 → 9]` (shoulder–elbow–wrist)  
* **Right:** `[6 → 8 → 10]`

---

## 3. Confidence filtering & best practices

* **Keypoint confidence**: use the 3ʳᵈ value in each keypoint row (0–1) to skip uncertain detections.  
* **Units**: coordinates are pixels; angles are unit-free, but segment *lengths* need scaling.  
* **2-D vs 3-D**: for true joint angles in space, capture depth or multi-view 3-D first.

---

## 4. Further documentation & resources

* **Ultralytics Pose guide** – quick-start, skeleton diagram.  
https://docs.ultralytics.com/tasks/pose/#models

https://github.com/Alimustoofaa/YoloV8-Pose-Keypoint-Classification


In [None]:
os.kill(os.getpid(), 9)

In [3]:
# ─── One‑Cell Live Pose + Elbow‑Angle Estimation (fixed keypoint extraction) ───
from IPython.display import Javascript, display
from google.colab.output import eval_js
from base64 import b64decode
import cv2, numpy as np, time
import ipywidgets as widgets
from ultralytics import YOLO

# 1) Inject JS for webcam + captureFrame()
display(Javascript("""
(async () => {
  const v = document.createElement('video');
  v.autoplay = v.playsInline = true;
  v.style.display = 'none';
  document.body.appendChild(v);

  const c = document.createElement('canvas');
  c.style.display = 'none';
  document.body.appendChild(c);
  const ctx = c.getContext('2d');

  const stream = await navigator.mediaDevices.getUserMedia({video:true});
  v.srcObject = stream;
  await v.play();

  c.width = v.videoWidth; c.height = v.videoHeight;
  google.colab.output.setIframeHeight(c.height + 20);

  window.captureFrame = () => {
    ctx.drawImage(v, 0, 0, c.width, c.height);
    return c.toDataURL('image/jpeg', 0.8);
  };
})();
"""))

# 2) Wait for the JS function to be ready
print("⏳ Initializing camera…")
for _ in range(25):
    try:
        if eval_js("typeof captureFrame") == 'function':
            print("✅ Camera ready!")
            break
    except Exception:
        pass
    time.sleep(0.2)
else:
    raise RuntimeError("Camera never initialized—did you allow access?")

# 3) Helpers to grab frames and compute angles
def get_frame():
    data = eval_js('captureFrame()')
    _, b64 = data.split(',', 1)
    arr = np.frombuffer(b64decode(b64), dtype=np.uint8)
    return cv2.imdecode(arr, cv2.IMREAD_COLOR)

def angle_at_joint(kps, si, ei, wi):
    S, E, W = kps[si, :2], kps[ei, :2], kps[wi, :2]
    v1, v2 = S - E, W - E
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1)*np.linalg.norm(v2) + 1e-6)
    return np.degrees(np.arccos(np.clip(cosang, -1, 1)))

# 4) Load YOLOv8‑Pose model
pose_model = YOLO("yolov8n-pose.pt")  # or your custom pose weights

# 5) Create a persistent widget for display
img_wid = widgets.Image(format='jpeg')
display(img_wid)

# 6) Live loop: predict, draw skeleton, compute & overlay elbow angles
print("▶️ Live pose + elbow angles—Interrupt (⏹) to stop.")
try:
    while True:
        frame = get_frame()
        res   = pose_model.predict(frame, stream=False)[0]
        vis   = res.plot()

        # **Fix**: extract the raw tensor from Coordinates, then to numpy
        kps_arr = res.keypoints.data.detach().cpu().numpy()  # shape (n_people, 17, 3)
        print(kps_arr)
        for kps in kps_arr:
            # optional: skip if elbow keypoints are low‑confidence
            if kps[7,2] < 0.3 or kps[8,2] < 0.3:
                continue

            # left = [5→7→9], right = [6→8→10]
            aL = angle_at_joint(kps, 5, 7, 9)
            aR = angle_at_joint(kps, 6, 8, 10)

            # overlay angles onto the visualization
            pt_el = tuple(kps[7,:2].astype(int))
            pt_er = tuple(kps[8,:2].astype(int))
            cv2.putText(vis, f"{int(aL)}°", (pt_el[0]-20, pt_el[1]),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0,255,0), 2)
            cv2.putText(vis, f"{int(aR)}°", (pt_er[0]+5, pt_er[1]),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0,255,0), 2)

        _, jpg = cv2.imencode('.jpg', vis)
        img_wid.value = jpg.tobytes()
        time.sleep(0.03)  # tune this for latency vs. CPU/GPU load

except KeyboardInterrupt:
    print("⏹ Segmentation stopped.")


<IPython.core.display.Javascript object>

⏳ Initializing camera…
✅ Camera ready!


Image(value=b'', format='jpeg')

[1;30;43mStreaming output truncated to the last 5000 lines.[0m
  [          0           0     0.05259]
  [          0           0     0.01619]
  [          0           0    0.022502]
  [          0           0   0.0061803]
  [          0           0     0.15999]
  [          0           0     0.10356]
  [          0           0   0.0086988]
  [          0           0   0.0050686]
  [          0           0    0.025655]
  [          0           0    0.015684]
  [          0           0    0.027117]
  [          0           0    0.018834]]]

0: 480x640 1 person, 11.4ms
Speed: 1.6ms preprocess, 11.4ms inference, 2.9ms postprocess per image at shape (1, 3, 480, 640)
[[[     206.17      59.322     0.60579]
  [          0           0     0.40478]
  [          0           0     0.27396]
  [          0           0     0.13536]
  [          0           0     0.02886]
  [     382.91      236.04     0.50726]
  [          0           0     0.27937]
  [          0           0      0.1551]
  [    

IndexError: index 7 is out of bounds for axis 0 with size 0