# **Image & Video Operations with OpenCV**

## Overview
This notebook demonstrates essential OpenCV operations for:
- **Image I/O**: Reading, writing, and displaying images
- **Image Properties**: Shape, dimensions, and pixel values
- **Image Transformations**: Resizing and basic manipulations
- **Video Capture**: Real-time video processing and recording

Master these fundamentals to build computer vision applications.

In [1]:
# Import required libraries
import cv2
import numpy as np
import os

# Verify OpenCV installation
print(f"✅ OpenCV Version: {cv2.__version__}")
print(f"✅ NumPy  Version: {np.__version__}")

# Create output directories if they don't exist
os.makedirs('../assets/outputs', exist_ok=True)
print("✅ Setup completed!")

✅ OpenCV Version: 4.10.0
✅ NumPy  Version: 1.26.4
✅ Setup completed!


## **1. Image Reading Operations**

### `cv2.imread()` Function
Load images from file with different modes:

| Parameter | Value | Description |
|-----------|-------|-------------|
| `cv2.IMREAD_COLOR` | `1` (default) | Load as BGR color image |
| `cv2.IMREAD_GRAYSCALE` | `0` | Load as grayscale |
| `cv2.IMREAD_UNCHANGED` | `-1` | Load as-is, including the alpha channel if present |

In [5]:
# Read image in grayscale mode
image_path = "../assets/input-images/image1.png"
img_gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

if img_gray is not None:
    print(f"✅ Image loaded successfully")
    print(f"📊 Image shape: {img_gray.shape}")
    print(f"📊 Data  type : {img_gray.dtype}")
    print(f"📊 Min pixel value: {img_gray.min()}")
    print(f"📊 Max pixel value: {img_gray.max()}")
else:
    print("❌ Failed to load image. Check the file path.")

# Also load color version for comparison
img_color = cv2.imread(image_path, cv2.IMREAD_COLOR)
if img_color is not None:
    print(f"\n🎨 Color image shape: {img_color.shape} (H, W, C)")
else:
    print("❌ Failed to load color image")

✅ Image loaded successfully
📊 Image shape: (486, 739)
📊 Data  type : uint8
📊 Min pixel value: 13
📊 Max pixel value: 255

🎨 Color image shape: (486, 739, 3) (H, W, C)


## **2. Image Display & Interaction**

### `cv2.imshow()` Function
Display images with keyboard interaction:

| Key | ASCII | Action |
|-----|-------|--------|
| `ESC` | 27 | Close windows |
| `s` | 115 | Save image |
| Any other | - | Close windows |

In [None]:
img = cv2.imread(image_path)
if img is None:
    print("❌ Failed to load image. Check the file path.")
else:
    cv2.imshow("image", img)                                        # show the image in a window

    key = cv2.waitKey(0) & 0xFF                                     # 0 = wait indefinitely for a key press
    if key == 27:                                                   # 27 is the ASCII code for ESC
        cv2.destroyAllWindows()
    elif key == ord('s'):                                           # ord('s') == 115
        cv2.imwrite("../assets/outputs/saved_image.jpg", img)       # save the image, then close
        cv2.destroyAllWindows()
    else:
        print(f"🔑 Key '{chr(key)}' pressed - Closing windows")
        cv2.destroyAllWindows()

### `cv2.waitKey()` Options

```python
cv2.waitKey(0)      # Wait indefinitely for key press
cv2.waitKey(5000)   # Wait up to 5 seconds for a key press, then continue
cv2.waitKey(1)      # Wait 1ms (for video loops)
```
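
With a non-zero timeout, `cv2.waitKey()` returns `-1` when no key was pressed, and masking that value with `0xFF` turns it into `255`. A small helper can make the three cases from the key table explicit. This is a sketch: the function name `interpret_key` and the action labels are hypothetical, not part of OpenCV.

```python
def interpret_key(raw_key):
    """Map a cv2.waitKey() return value to an action label (hypothetical helper)."""
    if raw_key == -1:            # timeout expired without a key press
        return "timeout"
    key = raw_key & 0xFF         # keep the low byte; some platforms set higher bits
    if key == 27:                # ESC
        return "quit"
    if key == ord("s"):          # 115
        return "save"
    return "other"


# Example values, as cv2.waitKey() might return them
for raw in (-1, 27, ord("s"), ord("a")):
    print(raw, "->", interpret_key(raw))
```

In a real cell the argument would come from `cv2.waitKey(5000)` or `cv2.waitKey(1)`; checking for `-1` before masking avoids treating a timeout as a key press.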

## **3. Image Properties & Pixel Access**

### Understanding Image Dimensions
- **Height × Width × Channels** (for color images)
- **Height × Width** (for grayscale images)
- **BGR channel order** in OpenCV (not RGB)

In [10]:
image = cv2.imread("../assets/input-images/image1.png")

# Analyze image properties
if image is not None:
    # Get image dimensions
    h, w, channels = image.shape
    
    print("📏 Image Properties:")
    print(f"   • Shape        : {image.shape}")
    print(f"   • Height       : {h} pixels")
    print(f"   • Width        : {w} pixels") 
    print(f"   • Channels     : {channels} (BGR)")
    print(f"   • Total pixels : {h * w:,}")
    print(f"   • Data type    : {image.dtype}")
    print(f"   • Memory size  : {image.nbytes:,} bytes")
else:
    print("❌ No color image loaded for analysis")

📏 Image Properties:
   • Shape        : (486, 739, 3)
   • Height       : 486 pixels
   • Width        : 739 pixels
   • Channels     : 3 (BGR)
   • Total pixels : 359,154
   • Data type    : uint8
   • Memory size  : 1,077,462 bytes


In [13]:
# Access individual pixel values
if img_color is not None:
    # Choose a pixel coordinate (y, x)
    y, x = 100, 100
    
    # Access BGR values at specific coordinate
    if y < img_color.shape[0] and x < img_color.shape[1]:
        pixel = img_color[y, x]
        b, g, r = pixel
        
        print(f"🎯 Pixel at coordinate ({x}, {y}):")
        print(f"   • BGR values : {pixel}")
        print(f"   • Blue  (B)  : {b}")
        print(f"   • Green (G)  : {g}")
        print(f"   • Red   (R)  : {r}")
        
        # Access individual channels
        print(f"\n📊 Channel-wise access:")
        print(f"   • Blue  channel: {img_color[y, x, 0]}")
        print(f"   • Green channel: {img_color[y, x, 1]}")
        print(f"   • Red   channel: {img_color[y, x, 2]}")
    else:
        print(f"❌ Coordinates ({x}, {y}) are out of bounds")
else:
    print("❌ No color image available for pixel access")

🎯 Pixel at coordinate (100, 100):
   • BGR values : [252 242 242]
   • Blue  (B)  : 252
   • Green (G)  : 242
   • Red   (R)  : 242

📊 Channel-wise access:
   • Blue  channel: 252
   • Green channel: 242
   • Red   channel: 242


## **4. Image Transformations**

### `cv2.resize()` Function
Resize images to a fixed size or by a scale factor. The interpolation method determines the quality of the result:

| Parameter | Description |
|-----------|-------------|
| `src` | Source image |
| `dsize` | Desired size (width, height) |
| `interpolation` | Method (INTER_LINEAR, INTER_CUBIC, etc.) |

In [17]:
# Image resizing examples
if img_color is not None:
    original_height, original_width = img_color.shape[:2]
    print(f"📐 Original size: {original_width} × {original_height}")
    
    # Method 1: Fixed size
    resized_fixed = cv2.resize(img_color, (300, 300))
    print(f"✅ Fixed  resize: {resized_fixed.shape[1]} × {resized_fixed.shape[0]}")
    
    # Method 2: Scale by factor
    scale_factor   = 0.5
    new_width      = int(original_width * scale_factor)
    new_height     = int(original_height * scale_factor)
    resized_scaled = cv2.resize(img_color, (new_width, new_height))
    print(f"✅ Scaled resize (50%): {resized_scaled.shape[1]} × {resized_scaled.shape[0]}")
    
    # Method 3: Using fx and fy parameters (more flexible)
    resized_fx = cv2.resize(img_color, None, fx=0.75, fy=0.75, interpolation=cv2.INTER_CUBIC)
    print(f"✅ FX/FY  resize (75%): {resized_fx.shape[1]} × {resized_fx.shape[0]}")
    
    # Display resized image
    cv2.imshow("Original Image", img_color)
    cv2.imshow("Resized  Image (300x300)", resized_fixed)
    cv2.imshow("Scaled   Image (50%)", resized_scaled)
    cv2.imshow("FX/FY    Image (75%)", resized_fx)
    
    print("\n💡 Press any key to close all windows...")
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    
    # Save resized images
    cv2.imwrite("../assets/outputs/resized_fixed.png", resized_fixed)
    cv2.imwrite("../assets/outputs/resized_scaled1.png", resized_scaled)
    cv2.imwrite("../assets/outputs/resized_scaled2.png", resized_fx)
    
    print("💾 Resized images saved to outputs folder")
    
else:
    print("❌ No image available for resizing")

📐 Original size: 739 × 486
✅ Fixed  resize: 300 × 300
✅ Scaled resize (50%): 369 × 243
✅ FX/FY  resize (75%): 554 × 364

💡 Press any key to close all windows...
💾 Resized images saved to outputs folder


## **5. Video Capture & Recording**

### Key Components for Video Processing:

1. **`cv2.VideoCapture()`** - Initialize video source
2. **`cv2.VideoWriter()`** - Record video output  
3. **FourCC Codes** - Video compression formats
4. **Frame Processing Loop** - Real-time processing

### Video Setup Parameters

| Component | Options | Description |
|-----------|---------|-------------|
| **Video Source** | `0` (default camera)<br>`'video.mp4'` (file path) | Input source |
| **FourCC Codes** | `'XVID'` → `.avi`<br>`'mp4v'` → `.mp4`<br>`'MJPG'` → `.avi` | Compression format (codes are case-sensitive) |
| **Frame Rate** | `20.0`, `30.0` | Frames per second |
| **Frame Size** | `(640, 480)`, `(1280, 720)` | Width × Height |

### Video Processing Flow:
1. **Initialize** video capture and writer
2. **Read frames** in a loop while the source is open
3. **Process** each frame (e.g., convert to grayscale)
4. **Display** processed frame
5. **Write** frame to output file
6. **Clean up** resources

In [23]:
# Video capture and recording
print("🎥 Setting up video capture...")

# Initialize video capture (0 = default camera)
cap = cv2.VideoCapture(0)

# Check if camera opened successfully
if not cap.isOpened():
    print("❌ Error: Could not open camera")
    print("💡 Tip  : Make sure camera is connected and not used by other apps")
else:
    print("✅ Camera opened successfully")
    
    # Get camera properties
    width  = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fps    = cap.get(cv2.CAP_PROP_FPS)
    
    print(f"📊 Camera properties:")
    print(f"   • Resolution: {width} × {height}")
    print(f"   • FPS: {fps}")
    print("-"*40)
    
    # Setup video writer
    fourcc      = cv2.VideoWriter_fourcc(*'MJPG')
    output_path = "../assets/outputs/recorded_video.avi"
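    # Use the camera's reported FPS so playback speed matches; fall back to 20.0 if it reports 0
    out         = cv2.VideoWriter(output_path, fourcc, fps if fps > 0 else 20.0, (width, height))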
    
    if out.isOpened():
        print(f"✅ Video writer initialized: {output_path}")
    else:
        print("❌ Failed to initialize video writer")
        print("💡 Try changing codec or check permissions")
    
    print("\n🎬 Starting video capture...")
    print("💡 Instructions:")
    print("   • ESC key to stop recording")
    print("   • Video will be saved as grayscale")
    
    frame_count = 0
    
    try:
        while cap.isOpened():
            ret, frame = cap.read()
            
            if not ret:
                print("❌ Can't receive frame. Ending capture...")
                break
            
            frame_count += 1
            
            # Convert to grayscale for processing
            gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            
            # Convert back to BGR for video writer (3 channels required)
            gray_bgr   = cv2.cvtColor(gray_frame, cv2.COLOR_GRAY2BGR)
            
            # Write frame to output video
            out.write(gray_bgr)
            
            # Display frame
            cv2.imshow("Live Feed (Grayscale)", gray_frame)
            cv2.imshow("Original Feed", frame)
            
            # Show frame count every 30 frames
            if frame_count % 30 == 0:
                print(f"📹 Frames captured: {frame_count}")
            
            # Break on ESC key
            if cv2.waitKey(1) & 0xFF == 27:
                print(f"\n🛑 Recording stopped by user")
                break
                
    except KeyboardInterrupt:
        print(f"\n🛑 Recording interrupted by user")
    
    # Cleanup
    cap.release()
    out.release()
    cv2.destroyAllWindows()
    
    print(f"\n✅ Video capture completed!")
    print(f"📊 Total frames captured: {frame_count}")
    print(f"💾 Video saved to: {output_path}")
    print("🏁 All resources released successfully")

🎥 Setting up video capture...
✅ Camera opened successfully
📊 Camera properties:
   • Resolution: 640 × 480
   • FPS: 30.0
----------------------------------------
✅ Video writer initialized: ../assets/outputs/recorded_video.avi

🎬 Starting video capture...
💡 Instructions:
   • ESC key to stop recording
   • Video will be saved as grayscale
📹 Frames captured: 30

🛑 Recording stopped by user

✅ Video capture completed!
📊 Total frames captured: 49
💾 Video saved to: ../assets/outputs/recorded_video.avi
🏁 All resources released successfully


## **6. Summary & Best Practices**

### Key Functions Learned:
| Function | Purpose | Key Parameters |
|----------|---------|----------------|
| `cv2.imread()` | Load images | `path`, `flags` (0=gray, 1=color) |
| `cv2.imshow()` | Display images | `window_name`, `image` |
| `cv2.imwrite()` | Save images | `filename`, `image` |
| `cv2.resize()` | Resize images | `src`, `dsize` or `fx`/`fy` |
| `cv2.VideoCapture()` | Video input | `source` (0=camera, path=file) |
| `cv2.VideoWriter()` | Video output | `filename`, `fourcc`, `fps`, `size` |

### Best Practices:
1. **Always check** if image/video loaded successfully
2. **Use proper error handling** for file operations
3. **Release resources** after video operations
4. **Choose appropriate** interpolation for resizing
5. **Remember BGR format** in OpenCV vs RGB in other libraries

### Common Issues & Solutions:
- **Image not loading**: Check file path and format
- **Camera not opening**: Ensure no other app is using camera
- **Video writer fails**: Verify codec and file permissions
- **Memory issues**: Release captures and writers with `release()` and close windows with `cv2.destroyAllWindows()` when done

### Next Steps:
- Learn image filtering and enhancement
- Explore geometric transformations
- Practice with different video sources
- Try real-time image processing

---
**✅ Image & Video Operations Mastered!**