# Contents
- Basics
    - Image
        - Load
        - Display
        - Save
    - Video
        - Capture
        - Save

# Introduction
**Installation:**
```shell
pip install opencv-python
```
**Import:**

In [1]:
import cv2

# Basics
## Image
### Load
```python
cv2.imread(filename, flags)
```
**Parameters:**
- `filename`: The path of the image file.
- `flags`: Specifies how the image should be read. Some common values:
    - `cv2.IMREAD_COLOR` (default) – Loads a color image (ignores transparency).
    - `cv2.IMREAD_GRAYSCALE` – Loads the image in grayscale mode.
    - `cv2.IMREAD_UNCHANGED` – Loads the image as it is (including alpha channel, if present).
### Display
```python
cv2.imshow(window_name, image)
```
**Parameters:**
- `window_name`: A string representing the name of the display window.
- `image`: The image to display (a NumPy array, for example the one returned by `cv2.imread()`).
### Save
```python
cv2.imwrite(filename, image)
```
**Parameters:**
- `filename`: Name of the output image file.
- `image`: The image data to be saved.

In [2]:
image = cv2.imread("./images/redbox.jpg", cv2.IMREAD_COLOR)

if image is None:
    print("Error: Unable to load image")
else:
    print("Image Loaded Successfully!")
    cv2.imshow("Original Image", image)

    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cv2.imwrite("gray_example.jpg", gray_image)
    print("Grayscale image saved.")
    cv2.imshow("Grayscale Image", gray_image)

    cv2.waitKey(0)
    cv2.destroyAllWindows()

Image Loaded Successfully!
Grayscale image saved.


- `cv2.waitKey(0)` - waits for a key press and returns the code of the key that was pressed. If the parameter is `0`, it waits indefinitely, so the windows stay open until you press a key. If the parameter is a positive number, it waits at most that many milliseconds and returns `-1` if no key was pressed in that time.
- `cv2.destroyAllWindows()` - closes all open windows. [View More](https://www.geeksforgeeks.org/python-opencv-destroyallwindows-function/)
- `cv2.destroyWindow(window_name)` - closes a specific window.

## Video
### Capture
```python
cv2.VideoCapture(source)
```
**Parameters:**
- `source`: Specifies the video source.
    - `0` for the default webcam.
    - `1, 2, ...` for external cameras.
    - `"filename.mp4"` to load a video file.
### Save
```python
cv2.VideoWriter(filename, fourcc, fps, frame_size)
```
**Parameters:**
- `filename`: Name of the output file.
- `fourcc`: Codec used for compression.
- `fps`: Frames per second.
- `frame_size`: Width and height of the frame (`width`, `height`).
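
A FourCC is a four-character code that identifies a codec, packed into a single integer. The sketch below shows that packing in plain Python, assuming the usual lowest-byte-first layout; `fourcc_code` is a hypothetical helper written for illustration, not an OpenCV function, and in practice `cv2.VideoWriter_fourcc` does this for you.

```python
def fourcc_code(c1, c2, c3, c4):
    """Pack four characters into one integer, lowest byte first."""
    return ord(c1) | (ord(c2) << 8) | (ord(c3) << 16) | (ord(c4) << 24)


code = fourcc_code(*"XVID")
print(hex(code))

# Unpack the integer back into its four characters
name = "".join(chr((code >> (8 * i)) & 0xFF) for i in range(4))
print(name)
```

The unpacking step can be handy when a codec is reported as a number and you want to see its four-character name.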

In [6]:
cap = cv2.VideoCapture("./images/sample.mp4")

# Check the source before reading its properties
if not cap.isOpened():
    print("Error: Cannot open video source")
    exit()

frame_width = int(cap.get(3))
frame_height = int(cap.get(4))
# Keep the frame rate as a float so that rates such as 29.97 are preserved
fps = cap.get(cv2.CAP_PROP_FPS)

fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('webcam_output.avi', fourcc, fps, (frame_width, frame_height))

while True:
    ret, frame = cap.read()
    
    if not ret:
        print("Error: Can't receive frame. Exiting...")
        print("Video End")
        break

    out.write(frame)
    cv2.imshow('Live Video', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
out.release()
cv2.destroyAllWindows()

Error: Can't receive frame. Exiting...
Video End


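- `cv2.VideoWriter_fourcc(*'XVID')` specifies the codec (other options: 'MJPG', 'MP4V').
- `cv2.VideoWriter('output.avi', fourcc, 20.0, (width, height))` would create a writer object that saves at 20 FPS; the example above passes the frame rate and frame size of the source video instead.
- `cap.isOpened()` checks whether the video source (a file here, or a webcam) was opened successfully.
- `cap.get(3)` and `cap.get(4)` return the frame width and height (`cv2.CAP_PROP_FRAME_WIDTH` and `cv2.CAP_PROP_FRAME_HEIGHT`).
- Inside the while loop:
    - `cap.read()` returns a success flag `ret` and the next frame.
    - If `ret` is `False`, no frame could be read (for a file this usually means the video has ended) and the loop stops. Otherwise the frame is written and displayed using `cv2.imshow()`.
    - The loop continues until the user presses the 'q' key (`cv2.waitKey(1) & 0xFF == ord('q')`).
- `out.write(frame)` writes each frame to the file.
- `cap.release()` and `out.release()` release the video source and the output file.
- `cv2.destroyAllWindows()` closes all OpenCV windows.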
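
The `& 0xFF` in the key check keeps only the lowest byte of the value returned by `cv2.waitKey()`, which on some platforms can carry extra high bits. A minimal sketch of that check as a plain function follows; `key_matches` is a hypothetical helper name used only for illustration.

```python
def key_matches(key_code, char):
    """Return True if a cv2.waitKey() result corresponds to `char`."""
    if key_code == -1:          # no key was pressed before the timeout
        return False
    return (key_code & 0xFF) == ord(char)


print(key_matches(ord('q'), 'q'))     # plain key code
print(key_matches(0x100071, 'q'))     # same key with extra high bits set
print(key_matches(-1, 'q'))           # timeout, nothing pressed
```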

# Drawing
## Draw Line
- `dtype=np.uint8` - specifies the data type of the elements in the array.
- `np.zeros((3, 3))` - creates a 3x3 array filled with zeros
- `np.ones((3, 3))` - creates a 3x3 array filled with ones
- `np.full((2, 2), 5)` - creates a 2x2 array filled with the value 5
- `np.empty((2, 2))` - creates a 2x2 uninitialized array
- `np.eye(3)` or `np.identity(3)` - creates a 3x3 identity matrix
- `np.ones((512, 512, 3))` - creates a 512x512x3 array filled with ones, i.e. a 512x512 grid in which each element is an array of 3 values.
- `255*np.ones((512, 512, 3))` - creates a 3-dimensional array with 512 rows, 512 columns and 3 color channels, with every value set to 255. Each of the 512x512 pixels holds 3 values that represent its color (OpenCV reads them in BGR order), so this gives a white image. Pass `dtype=np.uint8` to `np.ones` so that the result is an 8-bit image rather than a float array.
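
As a quick check of the data types involved, the sketch below compares the default float result with an explicit 8-bit white image. Using `np.full` here is just one possible way to write it.

```python
import numpy as np

float_white = 255 * np.ones((512, 512, 3))            # float64 by default
white = np.full((512, 512, 3), 255, dtype=np.uint8)   # 8-bit image

print(float_white.dtype, white.dtype)
print(white.shape, white[0, 0])
```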
## Draw Circle
- `thickness=-1` or `thickness=cv2.FILLED` - fills the shape with color
## Draw Rectangle
- `top_left` - specify the position of top left corner of the rectangle
- `bottom_right` - specify the position of bottom right corner of the rectangle
## Draw Ellipse
- `center_coordinates` - Specifies the center of the ellipse.
- `axes_length` - Specifies half the lengths of the horizontal & vertical axes, as a tuple.
- `angle` - Specifies the rotation angle of the ellipse (in degrees) from horizontal line.
- `startAngle` - Specifies the angle (in degrees) at which the ellipse arc starts. It defines the beginning of the arc.
- `endAngle` - Specifies the angle (in degrees) at which the ellipse arc ends. It defines the termination point of the arc.
## Draw Polygon
- `isClosed` - A boolean flag indicating whether the last point should be connected to the first point to form a closed loop.
- `pts` - contains the coordinates of the vertices of the polygon. Each row represents a vertex, and the polygon is formed by connecting these vertices in the order they appear in the array.
## Draw Text
- `position` - The coordinates (x, y) of the bottom-left corner of the text on the image.
- `fontFamily` - The font style to be used for the text, e.g. `cv2.FONT_HERSHEY_SIMPLEX`.
- `fontScale` - The scale factor that multiplies the font's base size.
## Example

In [2]:
import numpy as np
# Create a black image
image = np.zeros((500, 500, 3), dtype=np.uint8)

# Draw shapes
cv2.line(image, (50, 50), (450, 50), (0, 255, 0), 5)
cv2.rectangle(image, (100, 100), (400, 300), (255, 0, 0), 3)
cv2.circle(image, (250, 250), 100, (0, 0, 255), -1)
cv2.ellipse(image, (250, 250), (150, 100), 0, 0, 180, (255, 255, 0), 2)

# Draw a polygon
points = np.array([[100, 300], [200, 400], [300, 400], [400, 300]], np.int32)
points = points.reshape((-1, 1, 2))
cv2.polylines(image, [points], isClosed=True, color=(0, 255, 255), thickness=3)

# Add text
cv2.putText(image, 'OpenCV Drawing', (50, 450), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)

# Show the image
cv2.imshow("Drawing Shapes", image)
cv2.waitKey(0)
cv2.destroyAllWindows()

- `np.zeros((500, 500, 3), dtype=np.uint8)`: Creates a black image of size 500x500 with 3 color channels (BGR).

# Operation
## Cropping
Cropping refers to extracting a specific region from an image.

In OpenCV, images are represented as NumPy arrays of pixels. We can crop an image using array slicing.
```python
cropped_image = image[y1:y2, x1:x2]
```
- `(x1, y1)`: Top-left corner of the region to crop.
- `(x2, y2)`: Bottom-right corner of the region to crop.
- `image[y1:y2, x1:x2]` extracts the required portion.
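
A small sketch of the same slicing wrapped in a function is shown below. The `crop` helper is hypothetical: it clamps the corners to the image bounds and returns a copy, because a plain slice is a view that shares memory with the original image.

```python
import numpy as np


def crop(image, x1, y1, x2, y2):
    """Return a copy of the region between (x1, y1) and (x2, y2)."""
    h, w = image.shape[:2]
    # Clamp the corners to the image bounds
    x1, x2 = max(0, x1), min(w, x2)
    y1, y2 = max(0, y1), min(h, y2)
    # Rows (y) are indexed first, columns (x) second
    return image[y1:y2, x1:x2].copy()


image = np.zeros((500, 500, 3), dtype=np.uint8)
cropped = crop(image, 100, 50, 400, 300)
print(cropped.shape)  # (250, 300, 3)
```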

In [6]:
cropped = image[50:300, 100:400]  
cv2.imshow("Cropped Image", cropped)
cv2.waitKey(0)
cv2.destroyAllWindows()

## Rotation
Rotation refers to turning an image around its center by a specific angle.

To rotate an image:

1. Compute the rotation matrix using `cv2.getRotationMatrix2D()`
2. Apply the transformation using `cv2.warpAffine()`

In [None]:
# Get image dimensions
(h, w) = image.shape[:2]

# Define the center of rotation
center = (w // 2, h // 2)

# Compute the rotation matrix (Rotate 45 degrees)
M = cv2.getRotationMatrix2D(center, 45, 1.0)

# Apply the affine transformation
rotated = cv2.warpAffine(image, M, (w, h))

# Show the rotated image
cv2.imshow("Rotated Image", rotated)
cv2.waitKey(0)
cv2.destroyAllWindows()

### Rotation Without Cropping

In [None]:
# Compute the new bounding dimensions
new_w = int(h * abs(np.sin(np.radians(45))) + w * abs(np.cos(np.radians(45))))
new_h = int(h * abs(np.cos(np.radians(45))) + w * abs(np.sin(np.radians(45))))

# Compute the new rotation matrix
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, 45, 1.0)

# Adjust translation to fit the entire image
M[0, 2] += (new_w - w) / 2
M[1, 2] += (new_h - h) / 2

# Apply the rotation
rotated = cv2.warpAffine(image, M, (new_w, new_h))

# Show the rotated image
cv2.imshow("Rotated Image Without Cropping", rotated)
cv2.waitKey(0)
cv2.destroyAllWindows()
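
The new width and height used above are the bounding box of the rotated rectangle. As a sketch, the same formula can be isolated in a plain function and checked for a few angles; `rotated_bounds` is a hypothetical helper, and it rounds to the nearest pixel where the cell above truncates with `int()`.

```python
import math


def rotated_bounds(w, h, angle_degrees):
    """Width and height of the box that encloses a w x h image after rotation."""
    theta = math.radians(angle_degrees)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    new_w = int(round(h * sin_t + w * cos_t))
    new_h = int(round(h * cos_t + w * sin_t))
    return new_w, new_h


print(rotated_bounds(400, 300, 0))    # (400, 300)
print(rotated_bounds(400, 300, 90))   # (300, 400)
print(rotated_bounds(400, 300, 45))   # (495, 495)
```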

# MediaPipe
MediaPipe is a powerful framework by Google that enables real-time detection of hands, face, and pose landmarks using deep learning.

**Installation:**
```shell
pip install mediapipe
```

In [5]:
import mediapipe as mp

### Hand Landmarks
MediaPipe Hands consists of two primary models:
- **Palm Detection Model**: It detects the location of the hand in the image.
- **Hand Landmark Model**: It predicts 21 keypoints (landmarks) for each detected hand.

**Palm Detection Model:**
- Identifies the general region where a hand is located.
- It does not detect individual fingers or landmarks.
- In a video stream it runs on the first frame and again only when tracking is lost, rather than on every frame.

**Hand Landmark Model:**
- Identifies 21 hand landmarks once the palm is detected.
- Works frame-by-frame, refining detection over time.

![Image](https://ai.google.dev/static/edge/mediapipe/images/solutions/hand-landmarks.png)

#### Steps
1. **Initialize MediaPipe Hands**

In [None]:
mp_hands = mp.solutions.hands
hands = mp_hands.Hands(static_image_mode=False, max_num_hands=2, min_detection_confidence=0.5)
mp_draw = mp.solutions.drawing_utils

- `mp.solutions.hands` - is used to access the Hands module from MediaPipe's solutions.
- `Hands()` initializes the hand tracking model.
- `static_image_mode=False`: Detect hands in a continuous video stream.
- `max_num_hands=2`: Detect up to 2 hands.
- `min_detection_confidence=0.5`: Minimum confidence for detection (range 0.0 to 1.0).
- `mp_draw` is used for drawing landmarks on the image.

2. **Process Frame:**

In [None]:
frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

- Converts the frame to RGB (because MediaPipe requires RGB input).

In [None]:
results = hands.process(frame_rgb)

The `results` object is of type `mediapipe.python.solution_base.SolutionOutputs` and has the following attributes:

a. `results.multi_hand_landmarks`
- A list of detected hands, where each hand contains 21 landmarks.
- Each landmark has `x, y, z` coordinates.

In [None]:
if results.multi_hand_landmarks:
    for hand_landmarks in results.multi_hand_landmarks:
        print(hand_landmarks)  # Prints all 21 landmarks

Each landmark is represented as `landmark.x`, `landmark.y`, and `landmark.z`:

In [None]:
for idx, landmark in enumerate(hand_landmarks.landmark):
    print(f"Landmark {idx}: x={landmark.x}, y={landmark.y}, z={landmark.z}")

The `x` and `y` coordinates are normalized (range: `0-1`), so multiply them by image width/height to get pixel values.

The `z` value represents depth relative to the wrist, which serves as the origin (it is not in actual units like meters); the smaller the value, the closer the landmark is to the camera.

In [None]:
h, w, c = frame.shape  # the frame size is the same for every landmark
for idx, landmark in enumerate(hand_landmarks.landmark):
    cx, cy = int(landmark.x * w), int(landmark.y * h)
    print(f"Landmark {idx}: ({cx}, {cy})")
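
The conversion can also be tried without a camera. The sketch below uses a `namedtuple` as a hypothetical stand-in for a MediaPipe landmark (only the `x`, `y`, `z` fields) and a helper, `to_pixel`, that additionally clamps the result to the frame, on the assumption that normalized values may fall slightly outside `0-1` when a hand is partly out of view.

```python
from collections import namedtuple

# Hypothetical stand-in for a MediaPipe landmark (only x, y, z fields)
Landmark = namedtuple("Landmark", ["x", "y", "z"])


def to_pixel(landmark, width, height):
    """Scale normalized x/y to pixel coordinates, clamped to the frame."""
    px = min(max(int(landmark.x * width), 0), width - 1)
    py = min(max(int(landmark.y * height), 0), height - 1)
    return px, py


wrist = Landmark(x=0.5, y=0.25, z=0.0)
print(to_pixel(wrist, 640, 480))                      # (320, 120)
print(to_pixel(Landmark(1.2, -0.1, 0.0), 640, 480))   # (639, 0)
```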

b. `results.multi_hand_world_landmarks`
- Similar to `multi_hand_landmarks`, but provides 3D coordinates in real-world space (in meters, with the origin at the approximate geometric center of the hand).

c. `results.multi_handedness`
- Contains information about which hand (left or right) was detected.

In [None]:
if results.multi_handedness:
    for idx, hand in enumerate(results.multi_handedness):
        print(f"Hand {idx}: {hand.classification[0].label}")  # "Left" or "Right"

3. **Draw Landmarks**

In [None]:
if results.multi_hand_landmarks:
    for hand_landmarks in results.multi_hand_landmarks:
        mp_draw.draw_landmarks(frame, hand_landmarks, mp_hands.HAND_CONNECTIONS)

If hands are detected, iterate over each hand and draw landmarks using `mp_draw.draw_landmarks()`.