# Indian Sign Language Detection using Gestures

## 1. Problem Definition & Objective

### Problem Statement
Indian Sign Language (ISL) is a primary means of communication for the
deaf and hard-of-hearing community in India. However, the lack of
automated, real-time interpretation systems limits communication with
people who do not sign.

### Objective
Build a real-time ISL detection system that captures hand gestures
from a webcam and classifies them with a deep learning model.

### Real-World Application
Assistive communication, inclusive education, and human–computer interaction.


## 2. Data Understanding & Preparation

### Data Source
Hand gestures are captured in real time using a webcam.

### Feature Extraction
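- 21 landmarks per hand detected using MediaPipe Hands
- (x, y) pixel coordinates extracted per landmark
- Coordinates made relative to the wrist (landmark 0) and divided by the
  largest absolute value, so every feature lies in [-1, 1]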

### Preprocessing
- Landmark normalization (translation to the wrist origin, then scaling)
- Conversion to a flattened 42-value feature vector (21 landmarks × 2 coordinates)
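
The normalization and flattening steps above can be sketched on a toy input. This is a minimal illustration, assuming three made-up pixel coordinates instead of the 21 real landmarks; the function name `normalize_landmarks` is hypothetical and not part of the project code.

```python
def normalize_landmarks(points):
    """Translate points so the first one is the origin, flatten, and scale to [-1, 1]."""
    base_x, base_y = points[0]
    flat = []
    for x, y in points:
        flat.extend([x - base_x, y - base_y])
    max_abs = max(abs(v) for v in flat)
    if max_abs == 0:
        # All points coincide, so there is nothing to scale
        return [0.0] * len(flat)
    return [v / max_abs for v in flat]


# Three made-up landmarks (pixel coordinates); the first plays the role of the wrist.
toy_points = [[100, 200], [150, 100], [50, 250]]
features = normalize_landmarks(toy_points)
print(features)  # [0.0, 0.0, 0.5, -1.0, -0.5, 0.5]
```

With 21 landmarks the same steps yield 42 values, and the result does not change when the hand is shifted within the frame or appears uniformly larger or smaller.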


## 3. Model / System Design

### AI Technique Used
Deep learning: a neural-network classifier built with TensorFlow / Keras

### System Architecture
Webcam → MediaPipe Hand Detection → Landmark Processing →
Trained Neural Network → Gesture Prediction

### Justification
MediaPipe enables fast and accurate hand tracking suitable
for real-time applications.


## 4. Core Implementation

### Libraries Used
- OpenCV
- MediaPipe
- TensorFlow / Keras
- NumPy, pandas

### Implementation Overview
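The trained model (`model.h5`) is loaded from the current working
directory and used to predict, for each detected hand, one of 35 gesture
labels (digits 1 to 9 and letters A to Z) in real time from the
extracted landmarks.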
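
The step from the model's output to a displayed label can be sketched as follows. This is a minimal illustration, assuming the model returns one score per class in the order digits 1 to 9 followed by A to Z; the helper `decode_prediction` and the `min_confidence` cutoff of 0.6 are hypothetical and not part of the project code.

```python
import string

# Assumed label order: digits 1-9 followed by uppercase letters (35 labels).
LABELS = [str(d) for d in range(1, 10)] + list(string.ascii_uppercase)


def decode_prediction(scores, labels=LABELS, min_confidence=0.6):
    """Return (label, confidence); label is None when the top score is below the cutoff."""
    if len(scores) != len(labels):
        raise ValueError(f"expected {len(labels)} scores, got {len(scores)}")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    confidence = scores[best_index]
    label = labels[best_index] if confidence >= min_confidence else None
    return label, confidence


# Made-up score vector with most of the mass on index 9, which maps to "A".
example_scores = [0.0] * 35
example_scores[9] = 0.9
example_scores[0] = 0.1
print(decode_prediction(example_scores))  # ('A', 0.9)
```

Rejecting low-confidence frames is an optional design choice: it trades a few missed labels for fewer spurious ones when no clear gesture is shown.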


In [None]:
import cv2
import mediapipe as mp
import copy
import itertools
import os
from tensorflow import keras
import numpy as np
import pandas as pd
import string


BASE_DIR = os.getcwd()
MODEL_PATH = os.path.join(BASE_DIR, "model.h5")

if not os.path.exists(MODEL_PATH):
    raise FileNotFoundError(f"model.h5 not found in {BASE_DIR}")

model = keras.models.load_model(MODEL_PATH)
print("Model loaded successfully")


mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_hands = mp.solutions.hands

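# Class labels, indexed by the model's predicted class:
# digits 1-9 followed by uppercase letters A-Z (35 labels in total)
alphabet = ['1','2','3','4','5','6','7','8','9']
alphabet += list(string.ascii_uppercase)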


def calc_landmark_list(image, landmarks):
    """Convert MediaPipe's normalized landmarks to pixel coordinates."""
    image_width, image_height = image.shape[1], image.shape[0]
    landmark_point = []

    for landmark in landmarks.landmark:
        # Clamp to the last valid pixel if a landmark falls past the right or bottom edge
        landmark_x = min(int(landmark.x * image_width), image_width - 1)
        landmark_y = min(int(landmark.y * image_height), image_height - 1)
        landmark_point.append([landmark_x, landmark_y])

    return landmark_point


def pre_process_landmark(landmark_list):
    """Return a flat, wrist-relative feature vector scaled to [-1, 1]."""
    temp_landmark_list = copy.deepcopy(landmark_list)

    # Make every point relative to the wrist (landmark 0)
    base_x, base_y = temp_landmark_list[0][0], temp_landmark_list[0][1]
    for i in range(len(temp_landmark_list)):
        temp_landmark_list[i][0] -= base_x
        temp_landmark_list[i][1] -= base_y

    # Flatten [[x0, y0], [x1, y1], ...] into [x0, y0, x1, y1, ...]
    temp_landmark_list = list(itertools.chain.from_iterable(temp_landmark_list))

    # Scale by the largest absolute value; skip scaling if all points coincide
    max_value = max(map(abs, temp_landmark_list))
    if max_value == 0:
        return temp_landmark_list

    temp_landmark_list = [x / max_value for x in temp_landmark_list]
    return temp_landmark_list



cap = cv2.VideoCapture(0)
if not cap.isOpened():
    raise RuntimeError("Could not open the default webcam (index 0)")

with mp_hands.Hands(
    model_complexity=0,
    max_num_hands=2,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5
) as hands:

    while cap.isOpened():
        success, image = cap.read()
        if not success:
            print("Ignoring empty camera frame.")
            continue

        # Mirror the frame so the preview behaves like a selfie view
        image = cv2.flip(image, 1)

        # MediaPipe expects RGB input; marking the array read-only lets it be passed by reference
        image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image_rgb.flags.writeable = False
        results = hands.process(image_rgb)

        if results.multi_hand_landmarks:
            for hand_index, hand_landmarks in enumerate(results.multi_hand_landmarks):

                landmark_list = calc_landmark_list(image, hand_landmarks)
                pre_processed_landmark_list = pre_process_landmark(landmark_list)

                # Draw landmarks
                mp_drawing.draw_landmarks(
                    image,
                    hand_landmarks,
                    mp_hands.HAND_CONNECTIONS,
                    mp_drawing_styles.get_default_hand_landmarks_style(),
                    mp_drawing_styles.get_default_hand_connections_style()
                )

                # Wrap the features in a single-row frame: one sample of 42 values
                df = pd.DataFrame([pre_processed_landmark_list])

                # Score every class for this hand and keep the most likely one
                predictions = model.predict(df, verbose=0)
                predicted_class = np.argmax(predictions)
                confidence = np.max(predictions)

                label = alphabet[predicted_class]

                # Offset each hand's label vertically so two hands do not overwrite each other
                cv2.putText(
                    image,
                    f"{label} ({confidence:.2f})",
                    (30, 50 + 50 * hand_index),
                    cv2.FONT_HERSHEY_SIMPLEX,
                    1.5,
                    (0, 0, 255),
                    3
                )

        cv2.imshow("Indian Sign Language Detector", image)

        if cv2.waitKey(5) & 0xFF == 27:  # ESC key
            break

cap.release()
cv2.destroyAllWindows()


## 5. Evaluation & Analysis

### Evaluation Method
Qualitative evaluation using real-time webcam predictions.
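
Qualitative checks could be complemented by a simple accuracy count over recorded predictions. The sketch below is one hypothetical way to do this, assuming a list of (true label, predicted label) pairs collected by hand; the pairs shown are made-up placeholders, not results from this project.

```python
from collections import Counter


def summarize(pairs):
    """Return overall accuracy and a Counter of (true, predicted) confusions."""
    if not pairs:
        raise ValueError("no predictions to summarize")
    correct = sum(1 for true, pred in pairs if true == pred)
    confusions = Counter((true, pred) for true, pred in pairs if true != pred)
    return correct / len(pairs), confusions


# Placeholder pairs for illustration only.
pairs = [("A", "A"), ("B", "B"), ("M", "N"), ("M", "N"), ("5", "5")]
accuracy, confusions = summarize(pairs)
print(accuracy)                   # 0.6
print(confusions.most_common(1))  # [(('M', 'N'), 2)]
```

The confusion counts point to which gesture pairs are mixed up most often, which overall accuracy alone does not reveal.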

### Observations
- Gestures are detected correctly when the hand posture is clear
- Inference keeps up with the live webcam feed, with no noticeable lag
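
The latency observation could be backed by a measurement. The sketch below is a hypothetical timing helper, assuming the per-frame work is wrapped in a callable; `fake_inference` is a stand-in for the real landmark extraction and `model.predict` call, and no timing figures from this project are implied.

```python
import time


def mean_latency_ms(fn, repeats=50):
    """Call fn repeatedly and return the mean wall-clock time per call in milliseconds."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / repeats


def fake_inference():
    # Stand-in workload; replace with the real per-frame processing when measuring.
    sum(i * i for i in range(1000))


latency = mean_latency_ms(fake_inference)
print(f"{latency:.3f} ms per call")
```

As a rough guide, a mean below about 33 ms per frame would keep pace with a camera delivering 30 frames per second; the actual frame rate of the webcam is an assumption here.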

### Limitations
- Sensitive to lighting conditions
- Limited to static, single-frame gestures for 35 labels (digits 1 to 9 and letters A to Z)


## 6. Ethical Considerations & Responsible AI

### Bias
The system may not generalize equally well to all users because
it has seen only limited variation in how gestures are performed.

### Responsible Use
Designed strictly for assistive and educational purposes.


## 7. Conclusion & Future Scope

### Conclusion
A real-time Indian Sign Language detection system was successfully
implemented using computer vision and deep learning.

### Future Enhancements
- Word and sentence-level recognition
- Hindi text output
- Mobile application deployment
