CodeKnight314/Image-clustering

Image Clustering with KMeans and ResNet18

Diagram of clustering for MNIST 10

An image clustering tool that extracts features with a pre-trained ResNet18 model and groups images with K-means clustering. It includes cross-platform 3D visualization and supports multiple image formats. Visualizations can be exported as HTML for interactive display (as seen above).

Features

  • Deep Learning Feature Extraction: Uses a pre-trained ResNet18 model to extract image features
  • Automatic Device Detection: Supports CUDA, MPS (Apple Silicon), and CPU
  • 3D Cluster Visualization: Interactive 3D scatter plots using Plotly (works on all platforms)
  • Multiple Image Formats: Supports JPG, PNG, BMP, TIFF, WebP, GIF, and more
  • Flexible Data Loading: Recursive directory search, batch processing, and preprocessing options
  • Progress Tracking: Real-time progress bars for long operations
  • Comprehensive Error Handling: Robust error handling with detailed logging
  • Evaluation Metrics: Clustering quality metrics and statistics
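The clustering step named above can be illustrated with a minimal NumPy sketch of K-means (Lloyd's algorithm). This is not the repository's code: the `kmeans` function is hypothetical, and the random 512-dimensional vectors only stand in for pooled ResNet18 features, on the assumption that the tool clusters vectors of that kind.

```python
import numpy as np

def kmeans(features, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's algorithm. Returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from distinct, randomly chosen samples as centroids.
    start = rng.choice(len(features), size=n_clusters, replace=False)
    centroids = features[start].astype(float)
    labels = np.full(len(features), -1)
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every centroid.
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for k in range(n_clusters):
            members = features[labels == k]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[k] = members.mean(axis=0)
    return labels, centroids

# Stand-in data: two well-separated groups of 512-dimensional vectors.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=0.0, scale=0.1, size=(20, 512))
blob_b = rng.normal(loc=5.0, scale=0.1, size=(20, 512))
features = np.vstack([blob_a, blob_b])
labels, centroids = kmeans(features, n_clusters=2)
```

Each group ends up with a single label; which group receives label 0 depends on the random initialisation.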

Installation

  1. Clone the repository:

    git clone https://github.com/CodeKnight314/image-clustering.git
    cd image-clustering
  2. Create and activate a virtual environment (recommended):

    python -m venv cluster-env
    source cluster-env/bin/activate  # On Windows: cluster-env\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt

Quick Start

Basic clustering with visualization:

python main.py --image_dir ./my_images --n_clusters 5 --visualize --output_dir ./clustered_images

This will:

  • Load all images from ./my_images
  • Cluster them into 5 groups using ResNet18 features
  • Generate an interactive 3D visualization
  • Organize images into ./clustered_images/cluster_0/, ./clustered_images/cluster_1/, etc.
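The last step, copying images into per-cluster folders, might look roughly like the sketch below. The `organize_by_cluster` helper is hypothetical and written only from the folder layout described above; the placeholder files stand in for real images.

```python
import shutil
import tempfile
from pathlib import Path

def organize_by_cluster(paths, labels, output_dir):
    """Copy each file into output_dir/cluster_<label>/ and return the new paths."""
    output_dir = Path(output_dir)
    copied = []
    for path, label in zip(paths, labels):
        target_dir = output_dir / f"cluster_{label}"
        target_dir.mkdir(parents=True, exist_ok=True)
        # copy2 keeps file timestamps; the source file is left in place.
        copied.append(Path(shutil.copy2(path, target_dir / Path(path).name)))
    return copied

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "my_images"
    src.mkdir()
    files = []
    for name in ("a.jpg", "b.jpg", "c.jpg"):
        f = src / name
        f.write_bytes(b"placeholder")  # not a real image, just a stand-in
        files.append(f)
    out = Path(tmp) / "clustered_images"
    organize_by_cluster(files, [0, 1, 0], out)
    layout = sorted(p.relative_to(tmp).as_posix() for p in out.rglob("*.jpg"))
```

This sketch does not handle two source files that share a name, which can happen when images are gathered from several subdirectories.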

Usage

Basic Usage

python main.py --image_dir <path_to_images> --n_clusters <number_of_clusters>

Advanced Usage

python main.py \
    --image_dir ./images \
    --n_clusters 8 \
    --output_dir ./results \
    --visualize \
    --viz_method pca \
    --batch_size 64 \
    --image_size 256 \
    --recursive \
    --verbose

Command Line Arguments

Required Arguments

  • --image_dir: Path to directory containing images
  • --n_clusters: Number of clusters (minimum 2)

Optional Arguments

  • --output_dir: Directory to save organized images (required if not using --in_place)
  • --in_place: Organize images in the source directory instead of copying
  • --batch_size: Batch size for processing (default: 32)
  • --image_size: Target image size for processing (default: 224)
  • --recursive: Search subdirectories recursively for images
  • --max_images: Limit number of images processed (useful for testing)
  • --no_normalize: Skip ImageNet normalization (use raw pixel values)
  • --augment: Apply data augmentation during processing
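To make the `--no_normalize` option concrete, here is a minimal sketch of ImageNet normalization in NumPy. The mean and standard deviation are the values commonly used with ImageNet-pretrained torchvision models; that this tool uses exactly these constants is an assumption, and the `normalize` helper is hypothetical.

```python
import numpy as np

# Per-channel RGB statistics commonly used with ImageNet-pretrained models.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize(image_uint8):
    """Scale an HxWx3 uint8 image to [0, 1], then standardise each channel."""
    scaled = image_uint8.astype(np.float32) / 255.0
    return (scaled - IMAGENET_MEAN) / IMAGENET_STD

# A solid white 224x224 image (224 is the documented default --image_size).
white = np.full((224, 224, 3), 255, dtype=np.uint8)
normalized = normalize(white)
```

Skipping this step would leave pixel values in their raw range, which a network trained on normalized inputs may handle less well.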

Visualization Arguments

  • --visualize: Generate 3D cluster visualization
  • --viz_method: Dimensionality reduction method (pca or tsne, default: pca)
  • --viz_output: Filename for saving visualization (without extension)
  • --viz_format: Visualization format (html, png, svg, pdf, default: html)
  • --show_2d: Also generate 2D visualization alongside 3D

Other Options

  • --verbose, -v: Enable detailed logging
  • --help: Show help message
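For reference, the flags listed above could be declared with `argparse` roughly as follows. This is a reconstruction from this README, not the repository's `main.py`; argument types, and any default not stated above, are assumptions.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(
        description="Cluster images with ResNet18 features and K-means")
    # Required arguments
    parser.add_argument("--image_dir", required=True)
    parser.add_argument("--n_clusters", type=int, required=True)
    # Optional arguments
    parser.add_argument("--output_dir")
    parser.add_argument("--in_place", action="store_true")
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--image_size", type=int, default=224)
    parser.add_argument("--recursive", action="store_true")
    parser.add_argument("--max_images", type=int)
    parser.add_argument("--no_normalize", action="store_true")
    parser.add_argument("--augment", action="store_true")
    # Visualization arguments
    parser.add_argument("--visualize", action="store_true")
    parser.add_argument("--viz_method", choices=["pca", "tsne"], default="pca")
    parser.add_argument("--viz_output")
    parser.add_argument("--viz_format",
                        choices=["html", "png", "svg", "pdf"], default="html")
    parser.add_argument("--show_2d", action="store_true")
    # Other options (--help is added by argparse automatically)
    parser.add_argument("--verbose", "-v", action="store_true")
    return parser

def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.n_clusters < 2:
        parser.error("--n_clusters must be at least 2")
    return args

args = parse_args(["--image_dir", "./images", "--n_clusters", "8",
                   "--output_dir", "./results", "--visualize"])
```

Parsing the example arguments leaves the unspecified options at their documented defaults, for instance `args.batch_size` is 32 and `args.viz_method` is `pca`.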

Visualization

The tool provides interactive 3D visualizations that work across all platforms:

3D Visualization Features

  • Interactive Scatter Plot: Rotate, zoom, and explore clusters in 3D space
  • Hover Information: See image filenames and cluster assignments
  • Cross-Platform: Works on Windows, macOS, and Linux
  • Multiple Formats: Save as HTML (interactive) or static images (PNG, SVG, PDF)

Visualization Methods

  • PCA: Principal Component Analysis (faster, preserves global structure)
  • t-SNE: t-Distributed Stochastic Neighbor Embedding (slower, better local structure)
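As an illustration of the PCA option, the sketch below projects feature vectors onto their first three principal components using NumPy's SVD. It is a generic PCA, not the tool's implementation; the `pca_project` helper and the random 512-dimensional input are assumptions. t-SNE is not shown because it needs a dedicated library implementation.

```python
import numpy as np

def pca_project(features, n_components=3):
    """Project rows of `features` onto the top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
features = rng.normal(size=(100, 512))  # stand-in for extracted features
coords = pca_project(features)          # one (x, y, z) point per image
```

The resulting three columns can be passed to any 3D scatter plot, with cluster labels used for colour.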

Example Visualizations

Generate 3D visualization:

python main.py --image_dir ./images --n_clusters 5 --visualize

Save visualization to file:

python main.py --image_dir ./images --n_clusters 5 --visualize --viz_output my_clusters

Generate both 2D and 3D:

python main.py --image_dir ./images --n_clusters 5 --visualize --show_2d --viz_method tsne

Supported Formats

The tool supports a wide range of image formats:

  • Common: JPG, JPEG, PNG
  • Extended: BMP, TIFF, TIF, WebP, GIF, ICO
  • Scientific: PPM, PGM, PBM

Images are automatically converted to RGB format and resized for consistent processing.
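Finding images by extension, with optional recursion as in `--recursive`, can be sketched with the standard library alone. The extension set mirrors the list above; the `find_images` helper is hypothetical, and the conversion to RGB and resizing are not shown.

```python
import tempfile
from pathlib import Path

IMAGE_EXTENSIONS = {
    ".jpg", ".jpeg", ".png",                            # common
    ".bmp", ".tiff", ".tif", ".webp", ".gif", ".ico",   # extended
    ".ppm", ".pgm", ".pbm",                             # scientific
}

def find_images(image_dir, recursive=False):
    """Return sorted paths of files whose extension is in IMAGE_EXTENSIONS."""
    root = Path(image_dir)
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(p for p in candidates
                  if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS)

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.JPG").write_bytes(b"")        # upper-case extension still matches
    (root / "notes.txt").write_bytes(b"")    # ignored: not an image extension
    (root / "sub").mkdir()
    (root / "sub" / "b.png").write_bytes(b"")
    flat = [p.name for p in find_images(root)]
    nested = [p.name for p in find_images(root, recursive=True)]
```

Matching on the extension alone does not confirm that a file is a valid image, so a real loader would still need to handle files that fail to open.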

Performance Tips

  • GPU Acceleration: The tool automatically detects and uses CUDA/MPS if available
  • Batch Size: Larger batch sizes generally improve GPU utilization, as long as a batch fits in memory
  • Image Size: Smaller images process faster but may reduce clustering quality
  • t-SNE: Often gives clearer visual separation than PCA, but is slower
  • Memory: If you run out of memory, lower --batch_size
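The effect of `--batch_size` can be pictured with a small batching helper: only one batch of images needs to be in memory at a time. This is a generic sketch with a hypothetical `batched` function, not the tool's data loader.

```python
from itertools import islice

def batched(items, batch_size):
    """Yield successive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, batch_size)):
        yield batch

# 70 stand-in image paths, split with the documented default batch size of 32.
paths = [f"img_{i:03d}.png" for i in range(70)]
sizes = [len(batch) for batch in batched(paths, 32)]
```

With 70 items and a batch size of 32, the helper yields two full batches and a final batch of 6.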

About

Image clustering with K-means and ResNet18 feature extraction
