# YOLOv8 Pose + Face — Satisfaction Monitoring

This notebook contains the **full pipeline**, split into runnable cells with explanatory markdown describing design choices and operation. Run the cells in order. The notebook is intended for **local execution** (it opens OpenCV windows) and may download model weights the first time it is run.

**What this notebook contains:**
- setup (install + imports)
- model download / loading
- helper functions and scoring logic (posture & face)
- client-zone selection UI
- main processing loop as a callable function
- insight generation using OpenRouter/OpenAI client (optional)

**Notes / choices:**
- The original scoring logic is kept intact, with added comments and small robustness checks.
- The face detector and emotion models are downloaded if missing. For fast testing you can swap to lighter models.
- The processing loop is in a function; stop it by pressing Esc in the video window or by interrupting the kernel.
- Visual output uses OpenCV windows for interactive selection and real-time view; inline frames are not used to avoid flooding notebook output.

---


In [1]:
%pip install --quiet ultralytics opencv-python tqdm timm torch torchvision python-dotenv openai

Note: you may need to restart the kernel to use updated packages.





In [2]:

# Imports and environment
import time, collections, math, os, threading, textwrap
import numpy as np
import cv2
from ultralytics import YOLO
import urllib.request
from tqdm import tqdm
import torch, timm
from openai import OpenAI
import dotenv

dotenv.load_dotenv()




True

In [3]:

# CONFIG - change these paths to your local setup if desired
VIDEO_PATH = './testcases/siu.mp4'  # change to your video file
TINY_FACE_URL  = 'https://github.com/lindevs/yolov8-face/releases/download/v1.0.0/yolov8n-face-lindevs.pt'
TINY_FACE_PATH = 'yolov8n-face-lindevs.pt'
EMOTION_MODEL_URL = 'https://github.com/sb-ai-lab/EmotiEffLib/raw/main/models/affectnet_emotions/enet_b0_8_best_afew.pt'
EMOTION_MODEL_PATH = 'enet_b0_8_best_afew.pt'

def _download(url, dst):
    if os.path.exists(dst):
        print(f'{dst} already exists, skipping download')
        return
    print(f'Downloading {dst} ...')
    with tqdm(unit='B', unit_scale=True, desc=os.path.basename(dst)) as t:
        def _reporthook(b, bs, ts):
            if ts != -1:
                t.total = ts
            t.update(bs)
        urllib.request.urlretrieve(url, dst, reporthook=_reporthook)
    print('Done!')

# Download models if missing (comment out if you already have them)
_download(TINY_FACE_URL, TINY_FACE_PATH)
_download(EMOTION_MODEL_URL, EMOTION_MODEL_PATH)

# Open video capture (will error later if path invalid)
cap = cv2.VideoCapture(VIDEO_PATH)
if not cap.isOpened():
    print('Warning: Could not open video file. Make sure VIDEO_PATH is correct.')


yolov8n-face-lindevs.pt already exists, skipping download
enet_b0_8_best_afew.pt already exists, skipping download


## Model loading

Load the YOLO pose model, the tiny face detector and the emotion classifier. The emotion checkpoint may be either a state dictionary or a serialized model object; the cell handles both and, if the first attempt raises, retries with `weights_only=False`.
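The checkpoint shapes described above can be told apart with ordinary type checks. The following is a minimal sketch that uses plain dictionaries in place of real tensors; `extract_state_dict` is a hypothetical helper introduced for illustration and is not part of the notebook.

```python
def extract_state_dict(ckpt):
    """Return a cleaned weight mapping, or None if ckpt is not a dict.

    Hypothetical helper: handles a raw state dict and a dict that nests the
    weights under 'state_dict'. Anything else is assumed to be a full model
    object that the caller uses directly.
    """
    if not isinstance(ckpt, dict):
        return None  # caller should treat the object itself as the model
    state = ckpt.get('state_dict', ckpt)
    prefix = 'module.'  # prefix added when a DataParallel-wrapped model is saved
    return {(k[len(prefix):] if k.startswith(prefix) else k): v
            for k, v in state.items()}

# Plain values stand in for tensors here.
nested = {'state_dict': {'module.conv.weight': 1, 'classifier.bias': 2}}
flat = {'conv.weight': 1}

print(extract_state_dict(nested))
print(extract_state_dict(flat))
print(extract_state_dict(object()))
```

Stripping only a leading `module.` (rather than replacing every occurrence) avoids mangling keys that happen to contain that substring elsewhere.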

In [4]:

print('Loading YOLO pose model -- this may take a while...')
model = YOLO('yolov8x-pose.pt')  # replace if you want a smaller model for testing
print('Loading tiny face model...')
face_model = YOLO(TINY_FACE_PATH)

print('Loading emotion model architecture...')
emotion_model = timm.create_model('efficientnet_b0', num_classes=8, pretrained=False)

# Try robust loading of the checkpoint
try:
    # Recent PyTorch versions may default torch.load to weights_only=True, which
    # rejects fully pickled model objects; the except branch below handles that case.
    ckpt = torch.load(EMOTION_MODEL_PATH, map_location='cpu')
    if isinstance(ckpt, dict) and 'state_dict' in ckpt:
        # many checkpoints store weights under 'state_dict'
        state = ckpt['state_dict']
        # strip a leading 'module.' prefix (left by DataParallel) if present
        new_state = {}
        for k, v in state.items():
            nk = k[len('module.'):] if k.startswith('module.') else k
            new_state[nk] = v
        # strict=False tolerates head/key mismatches but silently skips them
        emotion_model.load_state_dict(new_state, strict=False)
    elif isinstance(ckpt, dict):
        emotion_model.load_state_dict(ckpt, strict=False)
    else:
        # checkpoint appears to be a full model object
        emotion_model = ckpt
except Exception as e:
    print('Warning: first load attempt for emotion model failed, retrying with weights_only=False:', e)
    try:
        # weights_only=False unpickles arbitrary objects; only use it for trusted files
        emotion_model = torch.load(EMOTION_MODEL_PATH, map_location='cpu', weights_only=False)
    except Exception as e2:
        print('Final fallback failed. Emotion model may not be usable:', e2)

emotion_model.eval()
print('Model loading complete.')


Loading YOLO pose model -- this may take a while...
Loading tiny face model...
Loading emotion model architecture...
Model loading complete.




## Helpers and scoring functions

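This cell contains the posture scoring function and its helpers. The core logic is preserved, with defensive checks and short clarifying comments added. Keypoints are expected in normalized image coordinates (x divided by frame width, y by frame height).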

In [5]:

HISTORY_LEN = 20
TORSO_ACTIVITY_THRESHOLD = 0.0025
ARM_ACTIVITY_THRESHOLD = 0.005
MIN_SHOULDER_WIDTH_FRAC = 0.10
SCORE_INTERVAL = 1.0  # seconds for scoring

COCO_CONNECTIONS = [
    (0,1),(0,2),(1,3),(2,4),(0,5),(0,6),(5,7),(7,9),(6,8),(8,10),
    (5,6),(5,11),(6,12),(11,13),(13,15),(12,14),(14,16),(11,12)
]

def _angle_between_deg(v1, v2):
    n1 = np.linalg.norm(v1); n2 = np.linalg.norm(v2)
    if n1 == 0 or n2 == 0: return 0.0
    c = np.clip(np.dot(v1, v2) / (n1 * n2), -1.0, 1.0)
    return math.degrees(math.acos(c))

def _clamp01(x): return max(0.0, min(1.0, x))
def _map_to_01(v, a, b):
    if b == a: return 0.0
    return _clamp01((v - a) / (b - a))

def compute_satisfaction_score(history, fw, fh, prev_score=None, alpha=0.6):
    # returns (label, score, diagnostics); diagnostics is currently always an empty dict
    if len(history) < 3: return None, 0, {}
    pts_now = history[-1]; pts_prev = history[-2]; pts_old = history[-3]
    if pts_now.shape[0] < 17: return None, 0, {}

    disp1 = np.linalg.norm(pts_now - pts_prev, axis=1)
    disp2 = np.linalg.norm(pts_prev - pts_old, axis=1)
    activity = float((disp1.mean() + disp2.mean()) / 2.0)

    L_SH, R_SH = 5,6
    L_HIP, R_HIP = 11,12
    NOSE = 0
    L_ELB, R_ELB = 7,8
    L_WRIST, R_WRIST = 9,10

    shoulders = (pts_now[L_SH] + pts_now[R_SH]) / 2.0
    hips      = (pts_now[L_HIP] + pts_now[R_HIP]) / 2.0
    nose      = pts_now[NOSE]

    torso_vec = hips - shoulders
    torso_len = np.linalg.norm(torso_vec) + 1e-6
    torso_dir = torso_vec / torso_len
    vertical  = np.array([0.0, 1.0])
    torso_angle = min(_angle_between_deg(torso_dir, vertical), 180 - _angle_between_deg(torso_dir, vertical))

    head_vec = nose - shoulders
    head_len = np.linalg.norm(head_vec) + 1e-6
    head_dir = head_vec / head_len
    head_torso_angle = min(_angle_between_deg(head_dir, torso_dir), 180 - _angle_between_deg(head_dir, torso_dir))

    sh_ys = abs(pts_now[L_SH][1] - pts_now[R_SH][1])
    shoulder_sym = sh_ys * fh / (torso_len * fh + 1e-6)

    shoulder_width = np.linalg.norm(pts_now[L_SH] - pts_now[R_SH]) + 1e-6
    wrist_dist = np.linalg.norm(pts_now[L_WRIST] - pts_now[R_WRIST])
    arm_openness = _clamp01((wrist_dist / shoulder_width) / 2.5)

    wrist_to_nose = min(np.linalg.norm(pts_now[L_WRIST] - nose),
                        np.linalg.norm(pts_now[R_WRIST] - nose))
    hands_face = _map_to_01(wrist_to_nose, 0.01, 0.20)

    crossed_arms_penalty = 0.0
    try:
        if all(pts_now[i].any() for i in [L_ELB, R_ELB, L_WRIST, R_WRIST, L_SH, R_SH, L_HIP, R_HIP]):
            if ((pts_now[L_WRIST][0] > pts_now[R_SH][0] and pts_now[R_WRIST][0] < pts_now[L_SH][0]) or
                    (pts_now[L_WRIST][0] > pts_now[R_ELB][0] and pts_now[R_WRIST][0] < pts_now[L_ELB][0])):
                chest_y_min = min(pts_now[L_SH][1], pts_now[R_SH][1])
                chest_y_max = max(pts_now[L_HIP][1], pts_now[R_HIP][1])
                if chest_y_min < pts_now[L_WRIST][1] < chest_y_max and chest_y_min < pts_now[R_WRIST][1] < chest_y_max:
                    left_arm_vec = pts_now[L_WRIST] - pts_now[L_ELB]
                    right_arm_vec = pts_now[R_WRIST] - pts_now[R_ELB]
                    cross_prod = left_arm_vec[0] * right_arm_vec[1] - left_arm_vec[1] * right_arm_vec[0]
                    if abs(cross_prod) > 0.01:
                        crossed_arms_penalty = 0.50
    except Exception:
        crossed_arms_penalty = 0.0

    activity_score = _map_to_01(activity, 0.0008, 0.018)
    upright_score = 1.0 if torso_angle <= 10 else 0.0 if torso_angle >= 40 else 1.0 - ((torso_angle-10)/(40-10))
    head_align_score = _clamp01(1.0 - (head_torso_angle/40.0))
    shoulder_sym_score = 1.0 - _clamp01(shoulder_sym*3.0)
    hands_open_score = hands_face
    arm_open_score = arm_openness

    combined = (0.40*upright_score + 0.20*head_align_score + 0.10*activity_score +
                0.10*hands_open_score + 0.10*arm_open_score + 0.10*shoulder_sym_score)
    combined -= crossed_arms_penalty

    harsh_penalty = 0.0
    if activity > 0.018:
        harsh_penalty = 0.30
    combined -= harsh_penalty

    if arm_openness < 0.3:
        combined -= 0.20

    if arm_openness > 0.7 and upright_score > 0.8 and activity < 0.005:
        combined += 0.20

    combined = _clamp01(combined)

    if prev_score is not None:
        combined = alpha*combined + (1-alpha)*(prev_score/100.0)

    score = int(combined*100)

    if harsh_penalty > 0:
        label = 'angry'
    elif crossed_arms_penalty > 0:
        label = 'bored'
    elif arm_openness < 0.3:
        label = 'uncomfortable'
    elif arm_openness > 0.7 and upright_score > 0.8:
        label = 'comfortable'
    else:
        label = 'satisfied' if score >= 70 else 'neutral' if score >= 45 else 'dissatisfied'

    return label, score, {}
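To make the weighting in `compute_satisfaction_score` easier to follow, here is a small worked sketch of just two steps: the piecewise-linear upright score and the weighted sum with exponential smoothing. The weights and thresholds are copied from the cell above; the penalties and bonuses (crossed arms, high activity, arm openness) are deliberately left out, and the component values are made up for illustration.

```python
# Weights copied from compute_satisfaction_score; they sum to 1.0.
WEIGHTS = {
    'upright': 0.40, 'head_align': 0.20, 'activity': 0.10,
    'hands_open': 0.10, 'arm_open': 0.10, 'shoulder_sym': 0.10,
}

def upright_from_angle(torso_angle_deg):
    """Piecewise-linear map: 1.0 at <= 10 degrees, 0.0 at >= 40 degrees."""
    if torso_angle_deg <= 10:
        return 1.0
    if torso_angle_deg >= 40:
        return 0.0
    return 1.0 - (torso_angle_deg - 10) / 30.0

def combine(components, prev_score=None, alpha=0.6):
    """Weighted sum clamped to [0, 1], then smoothed against the previous 0-100 score."""
    combined = sum(WEIGHTS[name] * value for name, value in components.items())
    combined = max(0.0, min(1.0, combined))
    if prev_score is not None:
        combined = alpha * combined + (1 - alpha) * (prev_score / 100.0)
    return int(combined * 100)

# Illustrative component values, not measured data.
components = {
    'upright': upright_from_angle(25),  # halfway between the 10 and 40 degree limits
    'head_align': 1.0, 'activity': 0.5,
    'hands_open': 1.0, 'arm_open': 0.5, 'shoulder_sym': 1.0,
}
print(combine(components))                 # no history
print(combine(components, prev_score=50))  # pulled toward the previous score
```

With `alpha=0.6`, 60% of each new score comes from the current frame and 40% from the previous score, which damps frame-to-frame jitter at the cost of a slower response.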


## Face detection helper

We throttle face detection to every 0.5s to save compute and re-use a cached crop when appropriate.
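The throttling idea can be sketched independently of OpenCV. The class below is a hypothetical helper, not the notebook's implementation: it recomputes a value at most once per interval and otherwise returns the cached result. The notebook's version additionally checks that the cached box still matches the person before reusing it.

```python
class ThrottledCache:
    """Recompute a value at most once per `interval` seconds (hypothetical helper)."""

    def __init__(self, compute, interval=0.5):
        self.compute = compute
        self.interval = interval
        self.last_time = None
        self.value = None
        self.calls = 0  # how many times the expensive function actually ran

    def get(self, now):
        # Recompute on the first call or once the interval has elapsed.
        if self.last_time is None or now - self.last_time >= self.interval:
            self.value = self.compute()
            self.last_time = now
            self.calls += 1
        return self.value

cache = ThrottledCache(compute=lambda: 'face-box', interval=0.5)
for t in (0.0, 0.1, 0.4, 0.6, 0.9, 1.2):  # simulated time.monotonic() readings
    cache.get(t)
print(cache.calls)
```

Passing the timestamp in, rather than reading the clock inside `get`, keeps the helper easy to test with simulated times.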

In [6]:

last_face_check = 0.0
cached_face_box = None
cached_face_crop = None

def box_overlap(box1, box2):
    ix1 = max(box1[0], box2[0])
    iy1 = max(box1[1], box2[1])
    ix2 = min(box1[2], box2[2])
    iy2 = min(box1[3], box2[3])
    ia  = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    a1  = (box1[2] - box1[0]) * (box1[3] - box1[1])
    return ia / a1 if a1 > 0 else 0.0

def get_face_for_person(person_bbox, now, frame):
    global last_face_check, cached_face_box, cached_face_crop
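    if now - last_face_check < 0.5:
        # box_overlap(a, b) is the fraction of a covered by b, so pass the face first:
        # the cached face is reused only if it still lies mostly inside this person's box
        if cached_face_box and box_overlap(cached_face_box, person_bbox) > 0.4:
            return cached_face_box, cached_face_crop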

    x1, y1, x2, y2 = person_bbox
    h, w = frame.shape[:2]
    x1 = max(0, min(w-1, x1)); x2 = max(0, min(w, x2))
    y1 = max(0, min(h-1, y1)); y2 = max(0, min(h, y2))
    crop = frame[y1:y2, x1:x2]
    if crop.size == 0:
        return None, None

    results = face_model(crop, verbose=False, conf=0.25)
    if results[0].boxes is None or len(results[0].boxes) == 0:
        cached_face_box = None; cached_face_crop = None
        return None, None

    best = results[0].boxes[0]
    fx1, fy1, fx2, fy2 = map(int, best.xyxy[0].tolist())
    fx1 += x1; fy1 += y1; fx2 += x1; fy2 += y1
    face_box = (fx1, fy1, fx2, fy2)

    crop_face_y1 = max(0, fy1 - y1)
    crop_face_y2 = min(crop.shape[0], fy2 - y1)
    crop_face_x1 = max(0, fx1 - x1)
    crop_face_x2 = min(crop.shape[1], fx2 - x1)
    face_crop = crop[crop_face_y1:crop_face_y2, crop_face_x1:crop_face_x2]
    if face_crop.size == 0:
        face_crop = None

    cached_face_box = face_box
    cached_face_crop = face_crop
    last_face_check = now
    return face_box, face_crop
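Note that `box_overlap(box1, box2)` divides the intersection area by the area of `box1` only, so it is not symmetric and is not the same as IoU. The sketch below repeats that formula under a different name and shows how much the argument order matters; the two boxes are made-up values.

```python
def overlap_fraction(box1, box2):
    """Intersection area divided by the area of box1 (same formula as box_overlap)."""
    ix1, iy1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    ix2, iy2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    return inter / area1 if area1 > 0 else 0.0

person = (0, 0, 100, 200)  # hypothetical person box, area 20000
face = (30, 10, 70, 60)    # hypothetical face box fully inside it, area 2000

print(overlap_fraction(face, person))  # fraction of the face inside the person box
print(overlap_fraction(person, face))  # fraction of the person box covered by the face
```

A face fully inside a person box gives 1.0 in one direction and a small fraction in the other, so any threshold has to be chosen with the argument order in mind.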


## Face expression scoring

The emotion model outputs 8 classes (assumed mapping). We compute a simplified 'face satisfaction score' from a weighted sum of relevant probabilities.

In [7]:

def compute_face_expression_score(face_crop, prev_score=None, alpha=0.6):
    if face_crop is None or face_crop.size == 0:
        return None, 0, {}
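    # OpenCV frames are BGR; the ImageNet mean/std used below assume RGB channel order
    rgb = cv2.cvtColor(face_crop, cv2.COLOR_BGR2RGB)
    resized = cv2.resize(rgb, (224, 224))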
    normalized = resized / 255.0
    tensor = torch.tensor(normalized, dtype=torch.float32).permute(2, 0, 1).unsqueeze(0)
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
    tensor = (tensor - mean) / std
    with torch.no_grad():
        output = emotion_model(tensor)
        probs = torch.softmax(output, dim=1)
    # class indices (assumed): 0:anger,1:contempt,2:disgust,3:fear,4:happy,5:neutral,6:sad,7:surprise
    happy_prob = probs[0,4].item()
    neutral_prob = probs[0,5].item()
    surprise_prob = probs[0,7].item()
    score = happy_prob * 100 + neutral_prob * 50 + surprise_prob * 30
    score = min(score, 100.0)
    if prev_score is not None:
        score = alpha * score + (1 - alpha) * prev_score
    label = 'satisfied' if score >= 70 else 'neutral' if score >= 45 else 'dissatisfied'
    return label, int(score), {}
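The face score is a weighted sum of three class probabilities, capped at 100 and then smoothed. The sketch below reproduces that arithmetic on hand-written probability vectors so the effect of the weights is visible without running the model. The class order is the one assumed in the cell's comment and has not been verified against the checkpoint.

```python
# Index order as assumed in the notebook's comment (not verified against the checkpoint).
CLASSES = ['anger', 'contempt', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']
SCORE_WEIGHTS = {'happy': 100.0, 'neutral': 50.0, 'surprise': 30.0}

def face_score(probs, prev_score=None, alpha=0.6):
    """Weighted sum of class probabilities on a 0-100 scale, with optional smoothing."""
    raw = sum(SCORE_WEIGHTS.get(name, 0.0) * p for name, p in zip(CLASSES, probs))
    raw = min(raw, 100.0)
    if prev_score is not None:
        raw = alpha * raw + (1 - alpha) * prev_score
    return raw

# Made-up probability vectors for illustration.
mostly_neutral = [0.0, 0.0, 0.0, 0.0, 0.25, 0.5, 0.25, 0.0]
fully_happy = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
fully_neutral = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]

print(face_score(mostly_neutral))
print(face_score(fully_happy))
print(face_score(fully_neutral))
```

One consequence of these weights is that a purely neutral face cannot score above 50, so the `satisfied` label (70 or more) requires a substantial `happy` probability.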


## Select client zone (interactive)

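Run the next cell and use the mouse to draw a rectangle around the area where clients stand. The selection is accepted as soon as you release the mouse button; pressing 'q' before drawing closes the window without a selection, which leaves `client_box` as `None`. If the video can't be opened, fix `VIDEO_PATH` first.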
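Because the rectangle can be dragged in any direction, the two corner points have to be reordered before they are used as a box. A minimal sketch of that step follows; `normalize_rect` and `is_usable` are hypothetical helpers, and the minimum size of 5 pixels is an arbitrary assumption for rejecting accidental clicks.

```python
def normalize_rect(start, end):
    """Return (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2 for any drag direction."""
    (sx, sy), (ex, ey) = start, end
    return (min(sx, ex), min(sy, ey), max(sx, ex), max(sy, ey))

def is_usable(rect, min_size=5):
    """Reject degenerate selections such as a click without a drag."""
    x1, y1, x2, y2 = rect
    return (x2 - x1) >= min_size and (y2 - y1) >= min_size

# Dragging from bottom-right to top-left yields the same box as the reverse drag.
box = normalize_rect((400, 300), (120, 80))
print(box, is_usable(box))
print(is_usable(normalize_rect((50, 50), (50, 50))))
```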

In [None]:

drawing = False
ix, iy = -1, -1
rect_drawn = False
client_box = None

def draw_rectangle(event, x, y, flags, param):
    global ix, iy, drawing, frame_copy, rect_drawn, client_box
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True; ix, iy = x, y
    elif event == cv2.EVENT_MOUSEMOVE:
        if drawing:
            img = frame_copy.copy()
            cv2.rectangle(img, (ix, iy), (x, y), (0, 255, 0), 2)
            cv2.imshow('Select Client Zone', img)
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False
        cv2.rectangle(frame_copy, (ix, iy), (x, y), (0, 255, 0), 2)
        cv2.imshow('Select Client Zone', frame_copy)
        client_box = (min(ix, x), min(iy, y), max(ix, x), max(iy, y))
        rect_drawn = True

cap.set(cv2.CAP_PROP_POS_MSEC, 1000)
ret, frame = cap.read()
if ret:
    frame_copy = frame.copy()
    cv2.namedWindow('Select Client Zone')
    cv2.setMouseCallback('Select Client Zone', draw_rectangle)
    print('Draw a rectangle around the client area. Releasing the mouse accepts it; press q to cancel.')
    while True:
        cv2.imshow('Select Client Zone', frame_copy)
        if cv2.waitKey(1) & 0xFF == ord('q') or rect_drawn:
            break
    cv2.destroyWindow('Select Client Zone')
    cap.set(cv2.CAP_PROP_POS_MSEC, 0)
else:
    raise RuntimeError('Could not read frame for client zone selection. Check VIDEO_PATH.')

def run_processing(cap, client_box, monitoring_mode=True):
    landmark_histories = {}
    prev_body_scores = {}
    prev_face_scores = {}
    client_logs = []
    current_client_tid = None
    client_start_time = None
    client_scores = []
    last_score_time = 0.0
    agg_history = []
    insight_text = ''
    insight_start = 0.0
    INSIGHT_DURATION = 10.0

    cv2.namedWindow('YOLO + Pose + Face (Satisfaction)', cv2.WINDOW_NORMAL)
    esc_pressed = False
    last_body_frame = None

    while True:
        ret, frame = cap.read()
        if not ret: break
        fh, fw = frame.shape[:2]
        results = model.track(frame, persist=True, classes=[0], verbose=False)
        annotated = frame.copy()
        now = time.monotonic()

        active_tracks_in_box = []
        for box in results[0].boxes:
            if box.id is None: continue
            tid = int(box.id.item())
            x1,y1,x2,y2 = map(int, box.xyxy[0].tolist())
            overlap = box_overlap((x1,y1,x2,y2), client_box)
            if overlap >= 0.2:
                active_tracks_in_box.append((tid, overlap, (x1,y1,x2,y2)))

        front_tid = None
        if active_tracks_in_box:
            high_overlap_candidates = [t for t in active_tracks_in_box if t[1] >= 0.90]
            current_overlap = next((ov for tid,ov,_ in active_tracks_in_box if tid == current_client_tid), 0.0)
            if current_overlap > 0:
                front_tid = current_client_tid
            elif high_overlap_candidates:
                high_overlap_candidates.sort(key=lambda x: x[1], reverse=True)
                front_tid = high_overlap_candidates[0][0]

        if front_tid is not None:
            if current_client_tid is None or current_client_tid != front_tid:
                if current_client_tid is not None:
                    total_time = now - client_start_time
                    if client_scores:
                        posture_scores = [s[0] for s in client_scores]
                        face_scores    = [s[1] for s in client_scores]
                        agg_scores     = [s[2] for s in client_scores]
                        mean_posture   = np.mean(posture_scores)
                        mean_face      = np.mean(face_scores)
                        mean_total     = np.mean(agg_scores)
                    else:
                        mean_posture = mean_face = mean_total = 0
                    print(f"[{time.strftime('%H:%M:%S')}] Client ID {current_client_tid} left.")
                    print(f"Time spent: {total_time:.1f}s")
                    print(f"Scores list: {client_scores}")
                    client_logs.append((current_client_tid, total_time, mean_posture, mean_face, mean_total))
                    threading.Thread(target=generate_insight_thread, args=(total_time, mean_posture, mean_face, mean_total, client_scores)).start()

                current_client_tid = front_tid
                client_start_time  = now
                client_scores      = []
                last_score_time    = now
                agg_history = []
                insight_text = ''
                print(f"[{time.strftime('%H:%M:%S')}] New client ID {front_tid} entered.")

            tid = current_client_tid
            if tid not in landmark_histories:
                landmark_histories[tid] = collections.deque(maxlen=HISTORY_LEN)
                prev_body_scores[tid] = None
                prev_face_scores[tid] = None

            person_bbox = next(b for t,_,b in active_tracks_in_box if t == tid)
            x1, y1, x2, y2 = person_bbox

            has_pose = False
            pts = None
            if results[0].keypoints is not None and len(results[0].keypoints) > 0:
                for i, kp in enumerate(results[0].keypoints):
                    if results[0].boxes[i].id is not None and int(results[0].boxes[i].id.item()) == tid:
                        keypoints = kp.data.cpu().numpy()
                        pts = keypoints[0, :, :2]
                        confidences = keypoints[0, :, 2]
                        low_conf_mask = confidences < 0.5
                        pts[low_conf_mask] = [0, 0]
                        pts[:, 0] /= fw; pts[:, 1] /= fh
                        has_pose = True
                        for idx1, idx2 in COCO_CONNECTIONS:
                            if np.all(pts[idx1] != 0) and np.all(pts[idx2] != 0):
                                pt1 = (int(pts[idx1][0] * fw), int(pts[idx1][1] * fh))
                                pt2 = (int(pts[idx2][0] * fw), int(pts[idx2][1] * fh))
                                cv2.line(annotated, pt1, pt2, (255, 0, 0), 2)
                        break

            face_box, face_crop = get_face_for_person(person_bbox, now, frame)
            if face_box:
                fx1, fy1, fx2, fy2 = face_box
                cv2.rectangle(annotated, (fx1, fy1), (fx2, fy2), (0, 255, 255), 2)

            cv2.rectangle(annotated, (x1,y1), (x2,y2), (0,255,0), 2)
            cv2.putText(annotated, f'ID:{tid}', (x1, y1+20), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255,255,255), 2)

            if has_pose and pts is not None and pts.shape[0] >= 17:
                landmark_histories[tid].append(pts)

            if now - last_score_time >= SCORE_INTERVAL:
                last_score_time = now
                body_sc = 0
                if tid in landmark_histories and len(landmark_histories[tid]) >= 3:
                    _, body_sc, _ = compute_satisfaction_score(landmark_histories[tid], fw, fh, prev_body_scores.get(tid))
                    prev_body_scores[tid] = body_sc
                face_sc = 0
                if face_crop is not None:
                    _, face_sc, _ = compute_face_expression_score(face_crop, prev_face_scores.get(tid))
                    prev_face_scores[tid] = face_sc
                scores = [s for s in (body_sc, face_sc) if s > 0]
                agg_sc = sum(scores)/len(scores) if scores else 0
                client_scores.append((body_sc, face_sc, agg_sc))
                agg_history.append(agg_sc)

        else:
            if current_client_tid is not None:
                total_time = now - client_start_time
                if client_scores:
                    posture_scores = [s[0] for s in client_scores]
                    face_scores    = [s[1] for s in client_scores]
                    agg_scores     = [s[2] for s in client_scores]
                    mean_posture   = np.mean(posture_scores)
                    mean_face      = np.mean(face_scores)
                    mean_total     = np.mean(agg_scores)
                else:
                    mean_posture = mean_face = mean_total = 0
                print(f"[{time.strftime('%H:%M:%S')}] Client ID {current_client_tid} left.")
                print(f"Time spent: {total_time:.1f}s")
                print(f"Scores list: {client_scores}")
                client_logs.append((current_client_tid, total_time, mean_posture, mean_face, mean_total))
                threading.Thread(target=generate_insight_thread, args=(total_time, mean_posture, mean_face, mean_total, client_scores)).start()
                current_client_tid = None

        if client_box:
            cv2.rectangle(annotated, (client_box[0], client_box[1]), (client_box[2], client_box[3]), (255,0,0), 2)

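        # display the most recent scores for the current client (0 when nobody is tracked)
        shown = client_scores[-1] if (current_client_tid is not None and client_scores) else (0, 0, 0)
        cv2.putText(annotated, f'face : {shown[1]:.0f}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        cv2.putText(annotated, f'posture : {shown[0]:.0f}', (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        cv2.putText(annotated, f'total : {shown[2]:.0f}', (10, 90), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)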

        cv2.imshow('YOLO + Pose + Face (Satisfaction)', annotated)
        key = cv2.waitKey(1) & 0xFF
        if key == 27:
            esc_pressed = True
            break

    if current_client_tid is not None:
        total_time = time.monotonic() - client_start_time
        if client_scores:
            posture_scores = [s[0] for s in client_scores]
            face_scores    = [s[1] for s in client_scores]
            agg_scores     = [s[2] for s in client_scores]
            mean_posture   = np.mean(posture_scores)
            mean_face      = np.mean(face_scores)
            mean_total     = np.mean(agg_scores)
        else:
            mean_posture = mean_face = mean_total = 0
        print(f"[{time.strftime('%H:%M:%S')}] Client ID {current_client_tid} left (video end).")
        client_logs.append((current_client_tid, total_time, mean_posture, mean_face, mean_total))

    cap.release()
    cv2.destroyAllWindows()
    print('All client logs:')
    for log in client_logs:
        print(log)
    return client_logs




## Main pipeline (callable)

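The processing loop is wrapped in `run_processing(...)`, which is defined at the end of the client-zone cell above. Call it to start; press Esc in the video window or interrupt the kernel to stop. When the loop finishes, the function releases `cap` and returns `client_logs`, so re-open the capture before calling it again.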
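The rule the loop uses to decide who the current client is can be isolated into a small function. The sketch below mirrors that logic as read from `run_processing`: the current client is kept as long as they remain in the zone, and a new client is only accepted when almost entirely inside it. `pick_front_track` is a hypothetical name, and the tracks are made-up values.

```python
def pick_front_track(active_tracks, current_tid, enter_threshold=0.90):
    """Choose which track ID to treat as the client in this frame.

    active_tracks: list of (track_id, overlap, bbox), already filtered to the
    tracks whose overlap with the client zone is at least 0.2.
    """
    # Sticky rule: keep the current client while they are still in the zone at all.
    if any(tid == current_tid for tid, _, _ in active_tracks):
        return current_tid
    # Otherwise only accept someone who is almost entirely inside the zone.
    candidates = [t for t in active_tracks if t[1] >= enter_threshold]
    if not candidates:
        return None
    return max(candidates, key=lambda t: t[1])[0]

bbox = (0, 0, 10, 10)  # placeholder box; only the overlap values matter here
tracks = [(1, 0.30, bbox), (2, 0.95, bbox)]

print(pick_front_track(tracks, current_tid=1))     # current client is kept
print(pick_front_track(tracks, current_tid=None))  # best high-overlap candidate
print(pick_front_track([(3, 0.50, bbox)], None))   # nobody qualifies
```

The asymmetric thresholds (0.90 to enter, 0.2 to stay) act as hysteresis, which keeps the client from flickering when someone stands near the edge of the zone.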

## Optional: Insight generation via OpenRouter/OpenAI

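This cell uses the `openai` client pointed at OpenRouter and needs `OPENROUTER_API_KEY` in your environment (for example in a `.env` file, which `dotenv.load_dotenv()` reads). Run it before calling `run_processing`, which starts `generate_insight_thread` whenever a client leaves. If the API call fails, the error is printed and an empty insight is returned.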

In [None]:

client = OpenAI(base_url='https://openrouter.ai/api/v1', api_key=os.getenv('OPENROUTER_API_KEY'))

def generate_insight(total_time, mean_posture, mean_face, mean_total, client_scores):
    prompt = f"Customer stayed for {total_time:.1f} seconds.\nMean posture score: {mean_posture:.1f}/100\nMean face score: {mean_face:.1f}/100\nTotal mean satisfaction: {mean_total:.1f}/100\nScores over time (posture, face, total): {client_scores}\n\nGive a short, professional insight (1-2 sentences) about possible customer mood or service issue. Focus on trends in posture (e.g. crossed arms = bored/impatient) and face expressions."
    try:
        response = client.chat.completions.create(
            model='meta-llama/llama-3-8b-instruct',
            messages=[{'role':'system','content':'You are a customer service analyst. Be concise, professional, and insightful.'},
                      {'role':'user','content':prompt}],
            max_tokens=100
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        print('Insight generation failed (no action taken):', e)
        return ''

def generate_insight_thread(total_time, mean_posture, mean_face, mean_total, client_scores):
    insight = generate_insight(total_time, mean_posture, mean_face, mean_total, client_scores)
    print('Insight:', insight)
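The worker above only prints its result. If the insight text is needed elsewhere (for example to overlay it on the video), one option is to hand it back through a thread-safe queue. This is a sketch under that assumption; `fake_generate_insight` is a stand-in so the example runs offline, and none of these names exist in the notebook.

```python
import queue
import threading

def fake_generate_insight(total_time, mean_total):
    """Stand-in for the real API call so the sketch runs without network access."""
    return f'Stayed {total_time:.0f}s with mean satisfaction {mean_total:.1f}/100.'

insights = queue.Queue()  # thread-safe hand-off from the worker to the main loop

def insight_worker(client_id, total_time, mean_total):
    try:
        text = fake_generate_insight(total_time, mean_total)
    except Exception as exc:  # mirror the notebook: a failure must not crash the loop
        text = ''
        print('Insight generation failed:', exc)
    insights.put((client_id, text))

# daemon=True so a stuck request cannot keep the interpreter alive on exit.
worker = threading.Thread(target=insight_worker, args=(7, 42.0, 63.5), daemon=True)
worker.start()
worker.join(timeout=5)

client_id, text = insights.get(timeout=5)
print(client_id, text)
```

In a real loop the main thread would poll with `insights.get_nowait()` once per frame instead of joining, so video processing never blocks on the network.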


## Run the pipeline

After running the previous cells (including the client-zone selection and the insight cell), run the cell below to start processing. Press Esc in the video window or interrupt the kernel to stop.


In [None]:
logs = run_processing(cap, client_box)

# You can save logs if desired:
# import json
# with open('client_logs.json','w') as f:
#     json.dump(logs, f, indent=2)
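The per-client summary is computed in three places inside `run_processing`. As a sketch of how it could be factored out and written to JSON with named fields, consider the hypothetical helper below; the field names are assumptions, and the values are converted to built-in types so they serialize regardless of where they came from.

```python
import json
from statistics import mean

def summarize_client(client_id, total_time, client_scores):
    """Collapse per-interval (posture, face, total) tuples into one log record."""
    if client_scores:
        posture, face, total = (mean(column) for column in zip(*client_scores))
    else:
        posture = face = total = 0.0
    return {
        'client_id': int(client_id),
        'time_spent_s': round(float(total_time), 1),
        'mean_posture': float(posture),
        'mean_face': float(face),
        'mean_total': float(total),
    }

# Made-up scores for illustration.
record = summarize_client(3, 12.34, [(60, 40, 50.0), (80, 60, 70.0)])
print(json.dumps(record, indent=2))
```

A dictionary per client is easier to load into a table later than the positional tuples currently stored in `client_logs`.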


### Notes & tips
- For faster local testing, swap `yolov8x-pose.pt` for `yolov8n-pose.pt` or a similarly small model.
- If you run on a GPU, make sure `torch.cuda.is_available()` returns `True`; the Ultralytics models then select the GPU automatically, while the emotion classifier is loaded with `map_location='cpu'` and stays on the CPU unless you move it and its input tensor to the GPU.
- The emotion checkpoint format varies; if loading fails, consider converting it to a proper `state_dict` that matches the `efficientnet_b0` head.
- If you want frames inline in the notebook instead of OpenCV windows, the loop can be adapted to display periodic frames using `matplotlib`.

---

