### Data Loading

- **Image Dataset:** [COCO 2017 Dataset on Kaggle](https://www.kaggle.com/datasets/awsaf49/coco-2017-dataset)  
  A large-scale dataset commonly used for image segmentation, object detection, and captioning tasks. Contains diverse images with pixel-level annotations, suitable for training deep learning segmentation models.


### Pillow (PIL)
`Pillow` is the actively maintained fork of PIL (the Python Imaging Library) and is still imported under the `PIL` namespace. It is a widely used Python library for opening, manipulating, and saving images in many file formats, with a simple interface for common image processing tasks such as loading, resizing, cropping, converting between color modes, applying filters, and saving in a different format.

Key features of Pillow include:

- Support for a variety of image formats (JPEG, PNG, GIF, BMP, etc.).
- Basic image processing operations like rotation, cropping, resizing, and flipping.
- Color space conversions (e.g., RGB to grayscale).
- Image enhancement and filtering (e.g., blur, sharpen).
- Drawing text, shapes, and graphics on images.
- Integration with NumPy arrays for advanced image manipulations.

`Pillow` is popular for its simplicity and efficiency in handling image data, making it a foundational tool for image preprocessing in many machine learning and computer vision projects.
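
A few of the operations listed above can be sketched as follows. This is a minimal illustration, not part of the original notebook: it builds a small synthetic image with `Image.new` so it runs without the dataset, and the sizes, colors, and variable names are arbitrary assumptions.

```python
from PIL import Image, ImageFilter, ImageOps

# Synthetic 64x48 RGB image (assumption: stands in for a real dataset image).
img = Image.new("RGB", (64, 48), color=(200, 30, 30))

resized = img.resize((32, 24))           # target size is (width, height)
cropped = img.crop((8, 8, 40, 32))       # box is (left, upper, right, lower)
rotated = img.rotate(90, expand=True)    # expand=True grows the canvas to fit
flipped = ImageOps.mirror(img)           # horizontal flip
gray = img.convert("L")                  # RGB -> 8-bit grayscale
blurred = img.filter(ImageFilter.GaussianBlur(radius=2))

print(resized.size, cropped.size, rotated.size, gray.mode)
```

Each call returns a new `Image` object and leaves `img` unchanged, which makes it straightforward to chain steps in a preprocessing pipeline.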

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# Load a single image with Pillow
from PIL import Image

# Path to one image in the dataset (adjust to your own dataset location)
image_path = "/content/drive/MyDrive/AI_Vision_Extract_Nov25/data/COCO2017_SAMPLE/val2017/000000000776.jpg"
image = Image.open(image_path)

# Display the image. show() opens an external viewer, so in Colab/Jupyter
# it may not render inline; display(image) is the usual notebook alternative.
image.show()

# Print image format, size as (width, height), and color mode
print(f"Format: {image.format}")
print(f"Size: {image.size}")
print(f"Mode: {image.mode}")

Format: JPEG
Size: (428, 640)
Mode: RGB
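
One detail worth keeping in mind when reading the output above: Pillow reports `size` as `(width, height)`, whereas a NumPy array built from the same image has shape `(height, width, channels)`. The sketch below illustrates the difference on a blank stand-in image with the same dimensions as the sample; the stand-in and the variable names are assumptions for illustration only.

```python
from PIL import Image
import numpy as np

# Blank stand-in with the same dimensions as the sample image (428 wide, 640 tall).
image = Image.new("RGB", (428, 640))

width, height = image.size       # Pillow order: (width, height)
pixels = np.asarray(image)       # NumPy order: (height, width, channels)

print(width, height)
print(pixels.shape, pixels.dtype)
```

Mixing up the two orders is a common source of shape errors when images and segmentation masks are later passed to array-based code.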


In [None]:
from PIL import Image
import os
import pandas as pd

# Directory where images are stored
image_dir = "/content/drive/MyDrive/AI_Vision_Extract_Nov25/data/COCO2017_SAMPLE/val2017/"

# List to store metadata
image_data = []

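# Iterate over the image files in the directory
for filename in os.listdir(image_dir):
  # Compare extensions case-insensitively so files like "IMG.JPG" are included
  if filename.lower().endswith((".jpg", ".jpeg", ".png")):
    filepath = os.path.join(image_dir, filename)
    # The context manager closes the file handle after the metadata is read
    with Image.open(filepath) as img:
      image_data.append({
          "filename": filename,
          "format": img.format,
          "size": img.size,  # (width, height)
          "mode": img.mode
      })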

# Create DataFrame
df = pd.DataFrame(image_data)

# Display the dataframe summary
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 500 entries, 0 to 499
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   filename  500 non-null    object
 1   format    500 non-null    object
 2   size      500 non-null    object
 3   mode      500 non-null    object
dtypes: object(4)
memory usage: 15.8+ KB
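
Because the `size` column holds `(width, height)` tuples, pandas stores it as `object` and cannot summarize it numerically. One possible follow-up, sketched here on a few hypothetical rows rather than the real 500-image DataFrame, is to split it into separate `width` and `height` columns. The filenames, values, and the `aspect_ratio` column are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical rows with the same columns as the metadata collected above.
df = pd.DataFrame([
    {"filename": "a.jpg", "format": "JPEG", "size": (428, 640), "mode": "RGB"},
    {"filename": "b.jpg", "format": "JPEG", "size": (640, 480), "mode": "RGB"},
    {"filename": "c.jpg", "format": "JPEG", "size": (640, 427), "mode": "L"},
])

# Pillow reports size as (width, height); split it into numeric columns.
df["width"] = df["size"].apply(lambda s: s[0])
df["height"] = df["size"].apply(lambda s: s[1])
df["aspect_ratio"] = df["width"] / df["height"]

# Numeric summary of image dimensions and a count of color modes.
print(df[["width", "height", "aspect_ratio"]].describe())
print(df["mode"].value_counts())
```

A summary like this can show at a glance whether the images vary enough in size or color mode to need resizing or mode conversion before training.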
