# Real-Time Hand Detection and Finger Counting using OpenCV and MediaPipe

This notebook demonstrates a method for **real-time hand detection** and **finger counting** using **computer vision** and **machine learning** tools. We leverage **MediaPipe**, a framework developed by Google, which provides robust hand-tracking solutions, and **OpenCV** for image processing. 

This technique has applications in **human-computer interaction (HCI)**, **sign language recognition**, and even in **gesture-based control systems**.


In [1]:
# Import necessary libraries for image processing and hand tracking
import cv2  # OpenCV for real-time computer vision
import mediapipe as mp  # MediaPipe for hand tracking

## Libraries Overview

- **OpenCV (cv2):** A library designed for real-time image processing. We use it for handling camera input and displaying the processed image with detected hands.
  
- **MediaPipe (mp):** A cross-platform framework that provides fast hand-tracking models. Its `Hands` solution detects up to `max_num_hands` hands in each frame and returns 21 landmarks for each detected hand.
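
MediaPipe reports each landmark in coordinates normalized to the frame's width and height, so they have to be scaled to pixels before OpenCV can draw on them. A minimal sketch of that scaling, using a hypothetical helper `to_pixels` that is not part of either library:

```python
def to_pixels(norm_x, norm_y, width, height):
    """Scale normalized landmark coordinates to integer pixel positions."""
    return int(norm_x * width), int(norm_y * height)


# Hypothetical landmark at the horizontal centre, a quarter of the way down a 640x480 frame.
print(to_pixels(0.5, 0.25, 640, 480))
```

Truncating with `int` mirrors what the notebook's class does; rounding would be an equally reasonable choice.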


In [2]:
# Define the class for hand detection and finger counting
class handDetector():
    def __init__(self, mode=False, maxHands=2, detectionCon=0.5, trackCon=0.5):
        """
        Initialize the handDetector class with parameters for:
        - mode: Whether to treat the input as a static image or a video stream.
        - maxHands: Maximum number of hands to detect.
        - detectionCon: Minimum confidence value for hand detection.
        - trackCon: Minimum confidence value for hand tracking.
        """
        self.mode = mode
        self.maxHands = maxHands
        self.detectionCon = detectionCon
        self.trackCon = trackCon

        # Initialize the MediaPipe Hands module and its drawing utilities
        self.mpHands = mp.solutions.hands
        self.hands = self.mpHands.Hands(static_image_mode=self.mode,
                                        max_num_hands=self.maxHands,
                                        min_detection_confidence=self.detectionCon,
                                        min_tracking_confidence=self.trackCon)
        self.mpDraw = mp.solutions.drawing_utils

        # Landmark indices for fingertips (thumb, index, middle, ring, pinky)
        self.tipIds = [4, 8, 12, 16, 20]

    def findHands(self, img, draw=True):
        """
        Process the input image to detect hands. Optionally, draw landmarks.
        - img: Input image (frame from video stream).
        - draw: Boolean to determine if landmarks should be drawn on the image.
        """
        imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # Convert image to RGB
        self.results = self.hands.process(imgRGB)  # Process the image with MediaPipe
        
        if self.results.multi_hand_landmarks:  # Check if any hands are detected
            for handLms in self.results.multi_hand_landmarks:
                if draw:
                    self.mpDraw.draw_landmarks(img, handLms, self.mpHands.HAND_CONNECTIONS)
        return img

    def findPosition(self, img, handNo=0, draw=True):
        """
        Retrieve the positions of the landmarks of the detected hands.
        - img: Input image (frame).
        - handNo: Hand index to extract landmarks from.
        - draw: Boolean to draw circles on landmarks.
        """
        lmList = []
        hands = self.results.multi_hand_landmarks
        if hands and handNo < len(hands):  # Guard against an out-of-range hand index
            h, w = img.shape[:2]  # Image height and width in pixels
            for idx, lm in enumerate(hands[handNo].landmark):
                cx, cy = int(lm.x * w), int(lm.y * h)  # Convert normalized coordinates to pixel values
                lmList.append([idx, cx, cy])
                if draw:
                    cv2.circle(img, (cx, cy), 10, (255, 0, 255), cv2.FILLED)  # Draw circles on the landmarks
        return lmList

    def fingersUp(self, lmList, handType):
        """
        Determine which fingers are up by analyzing landmark positions.
        - lmList: List of landmarks for a hand.
        - handType: Either 'Left' or 'Right' for the handedness of the hand.
        """
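        if len(lmList) < 21:
            return []  # A full hand has 21 landmarks; bail out on an empty or partial list

        fingers = []

        # Thumb: compare the tip's x with the joint below it; the direction depends on handedness.
        # This x-based test assumes an upright hand with the palm facing the camera.
        thumbTipX = lmList[self.tipIds[0]][1]
        thumbJointX = lmList[self.tipIds[0] - 1][1]
        if handType == "Right":
            fingers.append(1 if thumbTipX < thumbJointX else 0)
        else:
            fingers.append(1 if thumbTipX > thumbJointX else 0)

        # Other fingers: raised when the tip is above (smaller y than) the joint two landmarks below
        for tipId in self.tipIds[1:]:
            fingers.append(1 if lmList[tipId][2] < lmList[tipId - 2][2] else 0)

        return fingers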

    def getHandType(self, handIndex):
        """
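        Retrieve the type of hand ('Left' or 'Right') from MediaPipe's handedness classification.
        MediaPipe assigns the label assuming a mirrored (selfie-view) frame, so an unflipped
        webcam frame yields swapped labels.
        - handIndex: Index of the detected hand in the results.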
        """
        if self.results.multi_handedness:
            handType = self.results.multi_handedness[handIndex].classification[0].label
            return handType
        return None


## Hand Detector Class Overview

The `handDetector` class encapsulates the functionality for:

- **Detecting hands:** Converts the input image to RGB, processes it with MediaPipe, and detects hand landmarks.
- **Finding positions:** Returns pixel coordinates of the hand landmarks.
- **Determining finger state:** Checks if each finger is up by analyzing landmark positions.
- **Classifying hand type:** Identifies if the detected hand is left or right based on MediaPipe's classification.

Keeping these steps in separate methods makes it straightforward to process several hands per frame in real time.
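
The finger-state rule described above can be tried without a camera. The sketch below is a standalone copy of the logic in `fingersUp`, run on a hypothetical, hand-built landmark list; the function name `fingers_up` and the sample coordinates are illustrative only.

```python
# Fingertip landmark indices used by the class: thumb, index, middle, ring, pinky.
TIP_IDS = [4, 8, 12, 16, 20]


def fingers_up(lm_list, hand_type):
    """Return a 0/1 flag per finger from [id, x, y] pixel landmarks (y grows downward)."""
    fingers = []
    # Thumb: compare the tip's x with the joint just below it (landmark 3).
    tip_x, joint_x = lm_list[TIP_IDS[0]][1], lm_list[TIP_IDS[0] - 1][1]
    if hand_type == "Right":
        fingers.append(1 if tip_x < joint_x else 0)
    else:
        fingers.append(1 if tip_x > joint_x else 0)
    # Other fingers: "up" when the tip sits above the joint two indices below it.
    for tip in TIP_IDS[1:]:
        fingers.append(1 if lm_list[tip][2] < lm_list[tip - 2][2] else 0)
    return fingers


# Hypothetical landmark list: every point starts at (100, 200).
lm_list = [[i, 100, 200] for i in range(21)]
lm_list[8][2] = 120   # index fingertip raised above its joint at y=200
lm_list[12][2] = 130  # middle fingertip raised as well
lm_list[4][1] = 80    # thumb tip to the left of landmark 3

print(fingers_up(lm_list, "Right"))
print(fingers_up(lm_list, "Left"))
```

The same landmarks give a different thumb flag for each handedness label, which is why the class needs `getHandType` before it can count.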


In [5]:
# Main function for real-time video capture and processing
def main():
    cap = cv2.VideoCapture(0)  # Start video capture from the webcam
    detector = handDetector()   # Create an instance of the hand detector

    try:
        while True:
            success, img = cap.read()  # Read the current frame from the webcam
            if not success:  # Stop if the camera did not return a frame
                break
            img = detector.findHands(img)  # Detect hands in the frame

            totalFingers = 0  # Initialize total finger count

            if detector.results.multi_hand_landmarks:  # If hands are detected
                for handIndex, _ in enumerate(detector.results.multi_hand_landmarks):
                    lmList = detector.findPosition(img, handIndex)  # Get landmark positions
                    handType = detector.getHandType(handIndex)  # Identify hand type (left or right)
                    if handType:
                        fingers = detector.fingersUp(lmList, handType)  # Determine which fingers are up
                        totalFingers += fingers.count(1)  # Count how many fingers are up for this hand

            # Display the total number of fingers raised
            cv2.putText(img, f'Total Fingers: {totalFingers}', (10, 100), cv2.FONT_HERSHEY_PLAIN, 3, (255, 0, 0), 3)

            cv2.imshow("Image", img)  # Show the processed frame with the finger count
            if cv2.waitKey(1) & 0xFF == ord('q'):  # Wait 1 ms; press 'q' to quit
                break
    finally:
        cap.release()  # Free the webcam
        cv2.destroyAllWindows()  # Close the display window

# Execute the main function
if __name__ == "__main__":
    main()




## Main Function for Real-Time Finger Counting

- **Video Capture:** Uses `cv2.VideoCapture(0)` to access the webcam and capture a live video stream.
- **Hand Detection:** For each frame, hands are detected and their landmarks are retrieved.
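- **Finger Counting:** A finger counts as raised when its tip lies above the joint two landmarks below it (a smaller y in image coordinates). The thumb is judged by comparing x-coordinates instead, with the direction set by the hand's handedness.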
- **Display:** The total number of fingers raised is shown on the frame using OpenCV's `putText` method.

The system handles up to two hands by default (`maxHands=2`) and recomputes the finger count on every frame as the hands move.
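
The per-frame total is the sum of the raised-finger flags over all detected hands. A minimal sketch of that aggregation, assuming each hand has already been reduced to a list of 0/1 flags; `total_fingers` is a hypothetical helper, not a function from the notebook:

```python
def total_fingers(per_hand_states):
    """Sum the raised fingers over every hand detected in one frame."""
    return sum(states.count(1) for states in per_hand_states)


# Hypothetical frame with two hands: three fingers up on one, two on the other.
frame_states = [[0, 1, 1, 1, 0], [1, 1, 0, 0, 0]]
print(total_fingers(frame_states))
print(total_fingers([]))  # no hands detected in this frame
```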


## Conclusion

This notebook demonstrates how to create a real-time finger-counting system using **OpenCV** and **MediaPipe**. MediaPipe's hand-tracking model supplies the landmarks, and a few coordinate comparisons turn them into a raised or lowered state for each finger in every frame of a live video stream.

### Possible Extensions:
- **Gesture Recognition:** Extend the system to detect specific hand gestures.
- **Sign Language Recognition:** Adapt the code to recognize individual sign language letters or words.
- **Touchless Control Systems:** Use this method to implement gesture-based interfaces for controlling software or hardware.
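
As a starting point for the gesture-recognition extension, the per-finger flags can be looked up in a table of known patterns. This is only a sketch under stated assumptions: the gesture names and the `GESTURES` table are hypothetical, and a practical system would likely need more robust features than five binary flags.

```python
# Hypothetical lookup from finger states (thumb, index, middle, ring, pinky) to a label.
GESTURES = {
    (0, 0, 0, 0, 0): "fist",
    (0, 1, 0, 0, 0): "point",
    (0, 1, 1, 0, 0): "peace",
    (1, 1, 1, 1, 1): "open palm",
}


def classify_gesture(fingers):
    """Map a list of five 0/1 finger flags to a gesture name, or 'unknown'."""
    return GESTURES.get(tuple(fingers), "unknown")


print(classify_gesture([0, 1, 1, 0, 0]))
print(classify_gesture([1, 0, 0, 0, 1]))
```

Patterns missing from the table fall through to `"unknown"`, so new gestures can be added one entry at a time.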
