# Hand Gesture Recognition using Logistic Regression

This notebook demonstrates the complete pipeline for hand gesture recognition:
1. Data loading and preprocessing
2. Model training with Logistic Regression
3. Model evaluation
4. Real-time gesture recognition using webcam

The dataset contains hand landmarks detected by MediaPipe, normalized and ready for training.

In [18]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

## Data Loading and Preprocessing

Load the processed dataset containing hand landmarks and gesture labels.

In [19]:
df = pd.read_csv("../data/processed/data.csv")
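
Before training, it can help to confirm that the loaded table has the expected layout. The sketch below is an illustration, not part of the original pipeline: `check_landmark_frame` is a hypothetical helper, the column names `f0` to `f63` are made up, and the count of 64 features assumes 63 landmark coordinates plus one handedness column.

```python
import pandas as pd


def check_landmark_frame(df: pd.DataFrame, target: str = "gesture",
                         n_features: int = 64) -> None:
    """Raise ValueError if the frame does not look like the expected landmark table."""
    if target not in df.columns:
        raise ValueError(f"missing target column {target!r}")
    feature_cols = [c for c in df.columns if c != target]
    if len(feature_cols) != n_features:
        raise ValueError(
            f"expected {n_features} feature columns, found {len(feature_cols)}"
        )
    if df[feature_cols].isna().any().any():
        raise ValueError("feature columns contain missing values")


# Synthetic stand-in for the real CSV (hypothetical column names and labels).
demo = pd.DataFrame({f"f{i}": [0.0, 1.0] for i in range(64)})
demo["gesture"] = ["fist", "palm"]

check_landmark_frame(demo)               # passes silently
print(demo["gesture"].value_counts())    # class balance at a glance
```

On the real data, the same call would be `check_landmark_frame(df)`.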

## Feature Selection

Select the features for training. Every column except `gesture` is a feature: 63 landmark coordinates (21 landmarks × 3 axes) plus a handedness flag.
The `gesture` column is the target variable.

In [20]:
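# ("gesture") is just the string "gesture", so `not in` would do a substring
# test; compare against the column name directly instead.
feature_cols = [c for c in df.columns if c != "gesture"]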

X = df[feature_cols].values
y = df["gesture"].values

## Label Encoding

Convert gesture class names to numerical labels using LabelEncoder.

In [21]:
le = LabelEncoder()

y_encoded = le.fit_transform(y)
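
As a small illustration of what the encoder does, the sketch below fits a separate `LabelEncoder` on a few made-up gesture names (the names are assumptions, not taken from the dataset) and maps the integers back to strings. Classes are numbered in sorted order.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical gesture names, used only for this demonstration.
labels = np.array(["palm", "fist", "peace", "fist"])

demo_le = LabelEncoder()
encoded = demo_le.fit_transform(labels)

print(demo_le.classes_)  # ['fist' 'palm' 'peace']  (sorted alphabetically)
print(encoded)           # [1 0 2 0]

# inverse_transform recovers the original names from the integer codes.
decoded = demo_le.inverse_transform(encoded)
print(decoded)
```

The same `inverse_transform` call is what turns a predicted class index back into a gesture name at inference time.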

## Train/Test Split

Split the data into training and testing sets with stratification to maintain class distribution.

In [22]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y_encoded, test_size=0.2, random_state=42, stratify=y_encoded
)
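
To see what `stratify` buys, the sketch below splits a deliberately imbalanced synthetic label vector (80% / 20%, an assumption for illustration) and compares class shares in both partitions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 4))          # synthetic features
y_demo = np.array([0] * 80 + [1] * 20)      # imbalanced synthetic labels

_, _, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Fraction of each class in the two partitions.
train_share = np.bincount(y_tr) / len(y_tr)
test_share = np.bincount(y_te) / len(y_te)
print("train:", train_share)
print("test: ", test_share)
```

Both partitions keep the 80/20 ratio, which is the property the split above relies on for rare gestures.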

## Model Training

Train a multinomial Logistic Regression model using the SAGA solver.
This configuration was chosen based on cross-validation results, where it showed the highest accuracy and stability.

Parameters:

* `multi_class='multinomial'`: Fits one softmax model over all gesture classes rather than separate one-vs-rest classifiers. Recent scikit-learn releases deprecate this argument and apply the multinomial loss by default, so it may need to be dropped depending on the installed version
* `solver='saga'`: Efficient for larger datasets and supports the multinomial loss with L2 regularization
* `penalty='l2'`: Standard regularization to prevent overfitting
* `C=10`: Weak regularization (`C` is the inverse of the regularization strength), which lets the model fit complex patterns in the normalized hand landmark data
* `max_iter=1000`: Upper bound on solver iterations, sufficient for convergence while keeping training time reasonable


In [None]:
clf = LogisticRegression(
    multi_class='multinomial',
    solver='saga',
    penalty='l2',
    C=10,
    max_iter=1000,
    n_jobs=-1
)

clf.fit(X_train, y_train)
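
The cross-validation run behind this configuration is not shown in the notebook. A minimal sketch of how such a comparison could look is given below; the 5-fold stratified scheme, the grid of `C` values, and the synthetic data from `make_classification` are all assumptions for illustration, not the original experiment.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic 3-class problem standing in for the landmark features.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=6, n_classes=3, random_state=0
)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

cv_results = {}
for C in (0.1, 1, 10):  # hypothetical grid
    # multi_class is left at its default, which is multinomial for saga.
    model = LogisticRegression(solver="saga", C=C, max_iter=1000)
    scores = cross_val_score(model, X_demo, y_demo, cv=cv, scoring="accuracy")
    cv_results[C] = (scores.mean(), scores.std())
    print(f"C={C}: accuracy {scores.mean():.3f} ± {scores.std():.3f}")
```

The mean reflects accuracy and the standard deviation across folds gives a rough measure of stability; on the real data, `X_train` and `y_train` would replace the synthetic arrays.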

## Model Evaluation

Evaluate the trained model on the test set and print a detailed classification report.

In [None]:
y_pred = clf.predict(X_test)

print(classification_report(y_test, y_pred, target_names=le.classes_))
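
The classification report summarizes per-class scores but does not show which gestures get confused with each other. A confusion matrix fills that gap. The sketch below uses small made-up label arrays and hypothetical class names purely to show the mechanics.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

class_names = ["fist", "palm", "peace"]       # hypothetical gesture names
y_true_demo = np.array([0, 0, 1, 1, 2, 2])    # made-up ground truth
y_pred_demo = np.array([0, 1, 1, 1, 2, 0])    # made-up predictions

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1, 2])
print(cm)

# Recall per class: correct predictions divided by the row total.
per_class_recall = cm.diagonal() / cm.sum(axis=1)
for name, recall in zip(class_names, per_class_recall):
    print(f"{name}: recall {recall:.2f}")
```

With the notebook's variables, the equivalent call would be `confusion_matrix(y_test, y_pred)`, with `le.classes_` supplying the row and column names.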

## Model Saving

Save the trained model and label encoder to a pickle file for later use in inference.

In [25]:
import pickle

with open("gesture_lr.pkl", "wb") as f:
    pickle.dump({"model": clf, "label_encoder": le}, f)
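
A quick round-trip check can confirm that the saved bundle restores to an equivalent model. The sketch below trains a tiny throwaway classifier on random data (an assumption for illustration) and writes to a temporary directory so it does not touch `gesture_lr.pkl`. Note that pickle files should only be loaded from trusted sources, and that scikit-learn models are not guaranteed to load across different library versions.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

# Throwaway model on random data, standing in for the trained classifier.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3))
y_names = np.array(["fist", "palm"] * 20)     # hypothetical gesture names

demo_le = LabelEncoder()
demo_clf = LogisticRegression().fit(X_demo, demo_le.fit_transform(y_names))

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo_model.pkl"
    with open(path, "wb") as f:
        pickle.dump({"model": demo_clf, "label_encoder": demo_le}, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions exactly.
same_predictions = np.array_equal(
    restored["model"].predict(X_demo), demo_clf.predict(X_demo)
)
print("round trip OK:", same_predictions)
```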

## Real-time Gesture Recognition

Implement real-time gesture recognition using webcam feed and MediaPipe hand detection.

In [26]:
import cv2
import mediapipe as mp
import numpy as np
import pickle
import time
from collections import deque

## Load Trained Model

Load the saved model and label encoder for inference.

In [27]:
with open("gesture_lr.pkl", "rb") as f:
    obj = pickle.load(f)

clf = obj["model"]
le = obj["label_encoder"]

## Initialize MediaPipe Hands

Initialize MediaPipe Hands solution with appropriate parameters for real-time detection.

In [None]:
mp_hands = mp.solutions.hands

hands = mp_hands.Hands(
    static_image_mode=False,
    max_num_hands=1,
    min_detection_confidence=0.8,
    min_tracking_confidence=0.5,
)

## Initialize Webcam Capture

Open webcam for video capture.

In [None]:
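cap = cv2.VideoCapture(0)  # 0 selects the default camera
if not cap.isOpened():
    raise RuntimeError("Could not open webcam at index 0")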

## FPS Tracking Setup

Initialize variables for FPS calculation and smoothing.

In [30]:
fps_history = deque(maxlen=10)
last_time = time.time()
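
The smoothing idea can be shown in isolation: keep the last few instantaneous FPS readings in a bounded `deque` and report their mean. The sketch below wraps this in a hypothetical `FpsMeter` class (not part of the notebook) that takes timestamps as arguments, so it can be exercised without a camera. It also skips zero time deltas, which would otherwise cause a division by zero.

```python
from collections import deque


class FpsMeter:
    """Moving-average FPS over the last `window` frame intervals."""

    def __init__(self, window: int = 10):
        self.history = deque(maxlen=window)  # old readings drop off automatically
        self.last_time = None

    def tick(self, now: float) -> float:
        """Register a frame at time `now` (seconds) and return the smoothed FPS."""
        if self.last_time is not None:
            dt = now - self.last_time
            if dt > 0:  # ignore duplicate timestamps
                self.history.append(1.0 / dt)
        self.last_time = now
        if not self.history:
            return 0.0
        return sum(self.history) / len(self.history)


# Simulated frame timestamps: two frames 0.5 s apart, then one 0.25 s later.
meter = FpsMeter(window=3)
readings = [meter.tick(t) for t in (0.0, 0.5, 1.0, 1.25)]
print(readings)
```

In a live loop, `meter.tick(time.time())` would be called once per frame.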

## Main Recognition Loop

Process video frames in real-time:
1. Capture frame from webcam
2. Calculate FPS
3. Mirror the frame
4. Process with MediaPipe
5. Extract hand landmarks if detected
6. Normalize coordinates
7. Make prediction
8. Display results

In [31]:
while True:
    ret, frame = cap.read()
    if not ret:
        break

    now = time.time()
    fps = 1.0 / max(now - last_time, 1e-6)  # guard against a zero time delta
    last_time = now
    fps_history.append(fps)
    smooth_fps = sum(fps_history) / len(fps_history)

    frame_mirrored = cv2.flip(frame, 1)
    image_rgb = cv2.cvtColor(frame_mirrored, cv2.COLOR_BGR2RGB)
    result = hands.process(image_rgb)

    gesture_text = "No hand"
    conf_text = ""

    if result.multi_hand_landmarks and result.multi_handedness:
        hand_landmarks = result.multi_hand_landmarks[0]
        handedness_obj = result.multi_handedness[0].classification[0]
        handedness = handedness_obj.label
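        # Note: this is MediaPipe's confidence in the left/right label,
        # not the classifier's confidence in the predicted gesture.
        confidence = handedness_obj.score
        conf_text = f"{confidence:.2f}"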

        is_right = 1 if handedness == "Right" else 0

        wrist_x = hand_landmarks.landmark[0].x
        wrist_y = hand_landmarks.landmark[0].y
        wrist_z = hand_landmarks.landmark[0].z
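        # Distance of every landmark from the wrist (landmark 0).
        distances = [
            np.linalg.norm([
                lm.x - wrist_x,
                lm.y - wrist_y,
                lm.z - wrist_z
            ])
            for lm in hand_landmarks.landmark
        ]

        # Scale by the farthest landmark; fall back to 1.0 to avoid dividing by zero.
        scale = max(distances) or 1.0
        coords = []

        for lm in hand_landmarks.landmark:
            x = (lm.x - wrist_x) / scale
            y = (lm.y - wrist_y) / scale
            z = (lm.z - wrist_z) / scale
            coords.extend([x, y, z])

        # Assumes the training columns are ordered as 63 coordinates followed by handedness.
        input_vec = np.array(coords + [is_right]).reshape(1, -1)
        pred_class = clf.predict(input_vec)[0]
        gesture_text = le.inverse_transform([pred_class])[0]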

    cv2.putText(frame_mirrored, f"Gesture: {gesture_text}",
                (10, 40), cv2.FONT_HERSHEY_SIMPLEX, 1.2, (200, 200, 0), 3)

    cv2.putText(frame_mirrored, f"Conf: {conf_text}",
                (10, 80), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 200, 255), 3)

    cv2.putText(frame_mirrored, f"FPS: {smooth_fps:.1f}",
                (10, 120), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 200, 0), 3)


    cv2.imshow("Gesture Recognition", frame_mirrored)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
hands.close()
cv2.destroyAllWindows()
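
## Normalization as a Reusable Function

The normalization inside the loop (translate to the wrist, divide by the largest wrist distance, append handedness) can be factored into a function and tested without a camera. The sketch below is one possible vectorized form. `normalize_landmarks` is a hypothetical helper, and it assumes the landmarks arrive as a `(21, 3)` array of `x, y, z` values with the wrist at index 0.

```python
import numpy as np


def normalize_landmarks(points, is_right: int) -> np.ndarray:
    """Return a (1, 64) feature row: 63 wrist-relative coordinates + handedness."""
    pts = np.asarray(points, dtype=float)      # expected shape (21, 3)
    rel = pts - pts[0]                         # translate so the wrist is the origin
    scale = np.linalg.norm(rel, axis=1).max()  # distance to the farthest landmark
    if scale == 0:                             # degenerate input: all points coincide
        scale = 1.0
    # Row-major ravel yields x0, y0, z0, x1, y1, z1, ... like the loop above.
    features = np.append((rel / scale).ravel(), float(is_right))
    return features.reshape(1, -1)


# Random points standing in for real MediaPipe output.
rng = np.random.default_rng(0)
demo_points = rng.normal(size=(21, 3))
demo_vec = normalize_landmarks(demo_points, is_right=1)
print(demo_vec.shape)

# Shifting and uniformly scaling the hand leaves the features unchanged.
moved = normalize_landmarks(2.0 * demo_points + 5.0, is_right=1)
print(np.allclose(moved, demo_vec))
```

With MediaPipe output, the input array could be built as `[[lm.x, lm.y, lm.z] for lm in hand_landmarks.landmark]`. Sharing one such function between dataset preparation and inference is one way to keep the two normalizations from drifting apart.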