# **Lesson Notes: MediaPipe for Computer Vision & AI**

---

## ✅ **Lesson Introduction**

MediaPipe is an open-source, cross-platform framework by Google for building **AI-powered real-time computer vision pipelines**. It provides **pre-trained solutions** for face detection, hand tracking, pose estimation, object detection, and more.

---

## ✅ **Real-World Hook**

* **Face Filters** in Instagram and Snapchat.
* **Hand Gesture Control** in AR/VR games.
* **Pose Tracking** in fitness and workout-coaching apps.
* **Face Mesh** in virtual makeup or face beautification apps.

---

## ✅ **Introduce MediaPipe as a Solution**

Traditional computer vision approaches require **manual implementation** of detection algorithms (e.g., Haar cascades, keypoint extraction).
MediaPipe simplifies this with:

* **Pre-trained ML solutions** (Face Detection, Hand Tracking, etc.).
* **Lightweight & Fast** (supports real-time inference on mobile, desktop, and web).
* **Cross-platform**: Python, C++, Android, iOS, Web.

---

## ✅ **Theory**

### **What is MediaPipe?**

* A framework for building **perception pipelines** for real-time computer vision and ML.
* Developed by **Google AI**.
* Provides **ready-to-use solutions** for:

  * Face Detection & Mesh
  * Hand Tracking
  * Pose Estimation
  * Objectron (3D Object Detection)
  * Holistic Tracking (Face + Hands + Pose)

---

### **Why use MediaPipe?**

* **Real-time performance**
* **Pre-trained, optimized models**
* **Runs on CPU, GPU, and Edge devices**
* **Easy Python integration**

---

### **When to use it?**

* Gesture-based interaction systems.
* AR/VR applications.
* Fitness/Healthcare apps for posture analysis.
* Sign language recognition.
* Virtual try-on (makeup, glasses, etc.).

---

### **How does MediaPipe work?**

* **Graph-based Framework:**

  * Nodes (calculators) → perform operations (detection, tracking).
  * Packets → carry data between nodes.
* Uses **ML models + Computer Vision techniques** for detection & tracking.
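The calculator/packet idea can be illustrated with a toy model in plain Python. This is only a conceptual sketch: `Packet`, `Calculator`, and `run_graph` are hypothetical names invented for illustration and are not MediaPipe's API (real graphs are defined in MediaPipe's own graph configuration and run by the framework).

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Packet:
    """A timestamped unit of data flowing between nodes (toy stand-in)."""
    timestamp: int
    payload: Any


class Calculator:
    """A node that transforms the payload of each incoming packet."""

    def __init__(self, name: str, fn: Callable[[Any], Any]):
        self.name = name
        self.fn = fn

    def process(self, packet: Packet) -> Packet:
        # Keep the timestamp so downstream nodes can stay in sync.
        return Packet(packet.timestamp, self.fn(packet.payload))


def run_graph(calculators: List[Calculator], packets: List[Packet]) -> List[Packet]:
    """Push every packet through the calculators in order."""
    outputs = []
    for packet in packets:
        for calc in calculators:
            packet = calc.process(packet)
        outputs.append(packet)
    return outputs


# Hypothetical two-node pipeline: "detect" then "track".
graph = [
    Calculator("detect", lambda frame: {"frame": frame, "boxes": 1}),
    Calculator("track", lambda det: {**det, "tracked": True}),
]
results = run_graph(graph, [Packet(0, "frame-0"), Packet(1, "frame-1")])
print(results[1].timestamp, results[1].payload["tracked"])
```

Each packet keeps its timestamp as it moves through the nodes, which is the property that lets a real pipeline keep detections aligned with the frames they came from.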



In [None]:
# conda create -n cvenv python=3.10
# pip install ipykernel

In [None]:
### ✅ **Syntax & Installation**
!pip install mediapipe opencv-python

In [2]:
## ✅ **Practical Example: Face Detection**
import cv2
import mediapipe as mp

# Initialize MediaPipe
mp_face_detection = mp.solutions.face_detection
mp_drawing = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)

with mp_face_detection.FaceDetection(min_detection_confidence=0.5) as face_detection:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        
        # Convert to RGB
        rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = face_detection.process(rgb_frame)
        
        if results.detections:
            for detection in results.detections:
                mp_drawing.draw_detection(frame, detection)
        
        cv2.imshow('Face Detection', frame)
        if cv2.waitKey(1) & 0xFF == 27:
            break
cap.release()
cv2.destroyAllWindows()
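
Each detection also carries a bounding box in normalized coordinates (fractions of the image width and height). The helper below is a minimal sketch of converting such a box to pixel values, for example to crop the face. `relative_box_to_pixels` is a hypothetical helper name, not part of MediaPipe, and the clamping to the image bounds is an assumption about what is wanted.

```python
def relative_box_to_pixels(xmin, ymin, width, height, image_width, image_height):
    """Convert a normalized box (values in 0..1) to (x, y, w, h) in pixels."""
    # Clamp to the image, since a detected box can extend past the frame edge.
    x0 = max(0, int(xmin * image_width))
    y0 = max(0, int(ymin * image_height))
    x1 = min(image_width, int((xmin + width) * image_width))
    y1 = min(image_height, int((ymin + height) * image_height))
    return x0, y0, x1 - x0, y1 - y0


# Inside the detection loop this might be called as (assumed usage):
#   box = detection.location_data.relative_bounding_box
#   h, w = frame.shape[:2]
#   x, y, bw, bh = relative_box_to_pixels(
#       box.xmin, box.ymin, box.width, box.height, w, h)
print(relative_box_to_pixels(0.25, 0.5, 0.5, 0.25, 640, 480))
```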

# 2. Hand Tracking Project

In [2]:
import cv2
import mediapipe as mp
mp_hands = mp.solutions.hands
hands = mp_hands.Hands()

cap = cv2.VideoCapture(0)
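while cap.isOpened():
    success, image = cap.read()
    if not success:
        break  # stop when no frame can be read (avoids passing None to cvtColor)

    # MediaPipe expects RGB input; OpenCV delivers BGR.
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    result = hands.process(image_rgb)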

    if result.multi_hand_landmarks:
        for hand_landmarks in result.multi_hand_landmarks:
            mp.solutions.drawing_utils.draw_landmarks(
                image, hand_landmarks, mp_hands.HAND_CONNECTIONS)

    cv2.imshow('Hand Tracking', image)
    if cv2.waitKey(1) & 0xFF == 27:
        break
cap.release()
cv2.destroyAllWindows()

# 3. Face Mesh --> facial landmark detection

In [3]:
import cv2
import mediapipe as mp

mp_face_mesh = mp.solutions.face_mesh
face_mesh = mp_face_mesh.FaceMesh()

cap = cv2.VideoCapture(0)
while cap.isOpened():
    success, image = cap.read()
    if not success:
        break

    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = face_mesh.process(image_rgb)

    if results.multi_face_landmarks:
        for landmarks in results.multi_face_landmarks:
            mp.solutions.drawing_utils.draw_landmarks(
                image, landmarks, mp_face_mesh.FACEMESH_TESSELATION)

    cv2.imshow('FaceMesh', image)
    if cv2.waitKey(1) & 0xFF == 27:
        break
cap.release()
cv2.destroyAllWindows()


# 4. Pose Detection Project

In [4]:
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose
pose = mp_pose.Pose()

cap = cv2.VideoCapture(0)
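while cap.isOpened():
    success, image = cap.read()
    if not success:
        break  # stop when no frame can be read (avoids passing None to cvtColor)

    # MediaPipe expects RGB input; OpenCV delivers BGR.
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    result = pose.process(image_rgb)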

    if result.pose_landmarks:
        mp.solutions.drawing_utils.draw_landmarks(
            image, result.pose_landmarks, mp_pose.POSE_CONNECTIONS)

    cv2.imshow('Pose Detection', image)
    if cv2.waitKey(1) & 0xFF == 27:
        break
cap.release()
cv2.destroyAllWindows()



# 5. Holistic Model

In [5]:
import cv2
import mediapipe as mp
mp_holistic = mp.solutions.holistic
holistic = mp_holistic.Holistic()

cap = cv2.VideoCapture(0)
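while cap.isOpened():
    success, image = cap.read()
    if not success:
        break  # stop when no frame can be read (avoids passing None to cvtColor)

    # MediaPipe expects RGB input; OpenCV delivers BGR.
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    result = holistic.process(image_rgb)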

    # Draw all components
    mp.solutions.drawing_utils.draw_landmarks(image, result.face_landmarks, mp_holistic.FACEMESH_TESSELATION)
    mp.solutions.drawing_utils.draw_landmarks(image, result.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
    mp.solutions.drawing_utils.draw_landmarks(image, result.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
    mp.solutions.drawing_utils.draw_landmarks(image, result.pose_landmarks, mp_holistic.POSE_CONNECTIONS)

    cv2.imshow('Holistic Model', image)
    if cv2.waitKey(1) & 0xFF == 27:
        break
cap.release()
cv2.destroyAllWindows()




---

## ✅ **Business Scenario**

A fitness company wants to **track user poses in real-time** during online workouts. Using **MediaPipe Pose**, they can:

* Detect **key body landmarks** (shoulders, knees, etc.).
* Calculate angles for **exercise form correction**.
* Provide **real-time feedback** without expensive hardware.
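
The angle calculation could look like the sketch below, which measures the angle at a middle joint (for example the elbow, between shoulder and wrist). `joint_angle` is a hypothetical helper, and the points are assumed to be `(x, y)` pixel coordinates. MediaPipe landmarks are normalized, so they would first need to be multiplied by the image width and height, otherwise angles are distorted on non-square frames.

```python
import math


def joint_angle(a, b, c):
    """Return the angle at point b (degrees, 0..180) formed by a-b-c."""
    angle = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0])
        - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    angle = abs(angle)
    # Fold reflex angles back so the result is the inner joint angle.
    return 360 - angle if angle > 180 else angle


# Hypothetical pixel positions for shoulder, elbow, and wrist.
shoulder, elbow, wrist = (1, 0), (0, 0), (0, 1)
print(joint_angle(shoulder, elbow, wrist))
```

With MediaPipe Pose, the three points would typically come from entries such as `mp_pose.PoseLandmark.LEFT_SHOULDER`, `LEFT_ELBOW`, and `LEFT_WRIST` in `result.pose_landmarks.landmark`. Any threshold used to judge "good form" is application-specific and not defined by MediaPipe.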

---

## ✅ **Practice Session**

### **Questions**

1. What is MediaPipe, and why is it used?
2. List 5 real-world applications of MediaPipe.
3. Which class is used to initialize Face Detection in MediaPipe?
4. Explain the difference between **Face Detection** and **Face Mesh**.
5. Write Python code to **detect hand landmarks** using MediaPipe.

---

## ✅ **Case Study**

**Project:** **Hand Gesture Volume Control**

* Use **MediaPipe Hands** to detect hand landmarks.
* Recognize **thumb & index finger distance**.
* Control system volume based on finger distance.
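
The distance-to-volume step can be sketched as follows. In MediaPipe Hands, landmark 4 is the thumb tip and landmark 8 is the index fingertip. The helper names and the `min_px`/`max_px` calibration values are assumptions chosen for illustration, and actually setting the system volume is platform-specific and not shown.

```python
import math

THUMB_TIP = 4         # MediaPipe Hands landmark index
INDEX_FINGER_TIP = 8  # MediaPipe Hands landmark index


def pinch_distance(landmarks, image_width, image_height):
    """Pixel distance between thumb tip and index fingertip.

    `landmarks` is a sequence of normalized (x, y) pairs in MediaPipe order.
    """
    tx, ty = landmarks[THUMB_TIP]
    ix, iy = landmarks[INDEX_FINGER_TIP]
    return math.hypot((tx - ix) * image_width, (ty - iy) * image_height)


def distance_to_volume(distance_px, min_px=20.0, max_px=200.0):
    """Map a pinch distance linearly to a volume percentage (0..100)."""
    # min_px and max_px are assumed calibration values; tune per camera.
    clamped = min(max(distance_px, min_px), max_px)
    return round(100 * (clamped - min_px) / (max_px - min_px))


# Hypothetical hand with only the two fingertips placed.
points = [(0.0, 0.0)] * 21
points[THUMB_TIP] = (0.25, 0.5)
points[INDEX_FINGER_TIP] = (0.75, 0.5)
print(distance_to_volume(pinch_distance(points, 640, 480)))
```

With real results, the pairs would be built from `hand_landmarks.landmark[i].x` and `.y` for each detected hand.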

---

## ✅ **Key MediaPipe Solutions**

| **Solution**   | **Description**                |
| -------------- | ------------------------------ |
| Face Detection | Detects faces in images/videos |
| Face Mesh      | Detects 468 facial landmarks   |
| Hands          | Detects 21 hand landmarks      |
| Pose           | Detects 33 body landmarks      |
| Holistic       | Combines face, hands, and pose |
| Objectron      | Detects 3D objects             |

---



In [3]:
pip install streamlit
