# Experiment: Attractors in Image Space

If we generate, say, 100 images with an image generator like https://perchance.org/ai-text-to-image-generator, using the same prompt each time, and we try the '[datafication of a kiss](https://www.cyberneticforests.com/news/how-to-read-an-ai-image)' approach[^1], what can we see? The idea is to treat the generated images as an infographic of the underlying dataset. What visual tropes or elements seem to be in play in the different clusters? What does that imply about the underlying data?

1. Go to Perchance and generate multiple images from the same prompt: use the same prompt 11 times, taking a screenshot of the grid of results each time. Rename your screenshots run1.png, run2.png, run3.png, run4.png, run5.png etc. Then run the code in **PART ONE**, which makes an `input` folder. Drag and drop the screenshots into that `input` folder in the file tray, and then slice each screenshot into its constituent sub-images.
2. **PART TWO** measures the similarity of the images by creating an embedding (a vectorized representation) of each image and seeing how close each pair of vectors is.
3. **PART THREE** visualizes the results. Not as pretty as PixPlot, but it all works.

## Part One

In [None]:
## run this
#!rm -r input #if you're processing different images w/ different aspect ratios, do it in batches. Do one batch, then uncomment this line to get a fresh input folder. Then change column/rows as appropriate, below
!mkdir -p input

In [None]:
# if you have a folder of zipped images ready to go
# drag and drop the zipped folder into the file tray (hit the folder icon to expand it if necessary) at left
# then adjust this code to use your file name, and run it:
!unzip my_images.zip -d input

## create some necessary functions for manipulating grids of images

The functions below will cut your 'contact' sheet of multiple images up into single images. We want to feed one image at a time to the embedding model in Part Two, so we need to split the grid into separate images. First run the cell that `def`ines the functions; we then run them on your images in a subsequent block.
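
To make the slicing arithmetic concrete, here is a minimal sketch of how the crop boxes for a grid can be worked out from the sheet's size alone. The helper name `grid_boxes` is made up for illustration; it mirrors the left/upper/right/lower logic used in `slice_image` rather than replacing it, and it assumes every cell in the grid is the same size.

```python
def grid_boxes(width, height, num_columns, num_rows):
    """Return (left, upper, right, lower) crop boxes, row by row."""
    cell_w = width // num_columns   # whole pixels per cell, remainder dropped
    cell_h = height // num_rows
    boxes = []
    for row in range(num_rows):
        for col in range(num_columns):
            left = col * cell_w
            upper = row * cell_h
            boxes.append((left, upper, left + cell_w, upper + cell_h))
    return boxes

# A hypothetical 600x400 screenshot holding a 3 x 2 grid
boxes = grid_boxes(600, 400, num_columns=3, num_rows=2)
print(len(boxes))           # one box per sub-image
print(boxes[0], boxes[-1])  # top-left cell and bottom-right cell
```

If the screenshot's width or height does not divide evenly, the integer division silently drops the leftover pixels at the right and bottom edges, which is the situation `strict_mode` warns about.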

In [None]:
import os
from PIL import Image
import math
import logging

def setup_logging():
    """Configure logging to track image processing details."""
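    # force=True replaces any handlers the notebook environment already
    # installed; without it, basicConfig can be silently ignored
    logging.basicConfig(level=logging.INFO,
                       format='%(asctime)s - %(levelname)s - %(message)s',
                       force=True)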
    return logging.getLogger(__name__)

def validate_dimensions(img_width, img_height, num_columns, num_rows):
    """
    Validate that the image can be evenly divided into the specified grid.

    Returns:
        tuple: (is_valid, single_width, single_height, warning_message)
    """
    single_width = img_width / num_columns
    single_height = img_height / num_rows

    # Check if dimensions result in whole numbers
    is_width_whole = single_width.is_integer()
    is_height_whole = single_height.is_integer()

    warning_msg = ""
    if not (is_width_whole and is_height_whole):
        warning_msg = (
            f"Warning: Image dimensions ({img_width}x{img_height}) "
            f"cannot be evenly divided into {num_columns}x{num_rows} grid. "
            f"Subimages will be {single_width:.2f}x{single_height:.2f} pixels."
        )

    return (is_width_whole and is_height_whole,
            int(single_width),
            int(single_height),
            warning_msg)

def slice_image(image_path, output_dir, num_columns=3, num_rows=2, strict_mode=True):
    """
    Slice an image into a grid of smaller images with validation and logging.

    Args:
        image_path (str): Path to the input image
        output_dir (str): Directory to save the output images
        num_columns (int): Number of columns in the grid
        num_rows (int): Number of rows in the grid
        strict_mode (bool): If True, raises error on uneven divisions

    Returns:
        list: List of paths to the generated images
    """
    logger = setup_logging()

    # Load the image
    img = Image.open(image_path).convert("RGB")
    img_width, img_height = img.size

    logger.info(f"Processing image: {image_path}")
    logger.info(f"Original dimensions: {img_width}x{img_height}")

    # Validate dimensions
    is_valid, single_width, single_height, warning_msg = validate_dimensions(
        img_width, img_height, num_columns, num_rows
    )

    if warning_msg:
        logger.warning(warning_msg)
        if strict_mode:
            raise ValueError("Image dimensions must be exactly divisible in strict mode")

    # Ensure we're working with integer dimensions
    single_width = math.floor(single_width)
    single_height = math.floor(single_height)

    logger.info(f"Subimage dimensions: {single_width}x{single_height}")

    output_paths = []
    for row in range(num_rows):
        for col in range(num_columns):
            left = col * single_width
            upper = row * single_height
            right = left + single_width
            lower = upper + single_height

            # Create a new blank image instead of cropping
            cropped_img = Image.new('RGB', (single_width, single_height))
            # Copy the exact region we want
            region = img.crop((left, upper, right, lower))
            cropped_img.paste(region, (0, 0))

            # Strip any existing metadata
            data = list(cropped_img.getdata())
            clean_img = Image.new('RGB', cropped_img.size)
            clean_img.putdata(data)

            base_name = os.path.splitext(os.path.basename(image_path))[0]
            original_ext = os.path.splitext(image_path)[1].lower()
            output_path = os.path.join(
                output_dir,
                f'{base_name}_cropped_{row * num_columns + col + 1}{original_ext}'
            )

            # Let Pillow infer the file format from the extension;
            # passing 'jpg' as an explicit format name raises an error
            clean_img.save(output_path)

            logger.info(f"Saved subimage: {output_path}")
            output_paths.append(output_path)

    return output_paths

def process_images(input_dir="input", output_dir="all_images",
                  num_columns=3, num_rows=2, strict_mode=True):
    """
    Process all supported image files in the input directory with validation.

    Args:
        input_dir (str): Input directory containing images
        output_dir (str): Directory to save the output images
        num_columns (int): Number of columns in the grid
        num_rows (int): Number of rows in the grid
        strict_mode (bool): If True, raises error on uneven divisions
    """
    logger = setup_logging()

    SUPPORTED_FORMATS = ('.jpg', '.jpeg', '.png', '.JPG', '.JPEG', '.PNG')
    os.makedirs(output_dir, exist_ok=True)

    processed_files = 0
    errors = 0

    for filename in os.listdir(input_dir):
        if filename.endswith(SUPPORTED_FORMATS):
            image_path = os.path.join(input_dir, filename)
            try:
                slice_image(image_path, output_dir, num_columns, num_rows, strict_mode)
                processed_files += 1
                logger.info(f"Successfully processed: {filename}")
            except Exception as e:
                errors += 1
                logger.error(f"Error processing {filename}: {str(e)}")

    logger.info(f"Processing complete. {processed_files} images processed, {errors} errors.")
    logger.info(f"Subimages saved to '{output_dir}' directory.")



## slice the images
If your image generator shows its previews in a grid, you can take a screenshot of that grid and then run `process_images` below to cut it into individual images. Just change the number of columns and rows appropriately. E.g., craiyon gives you 3 x 3 preview images; https://perchance.org/ai-text-to-image-generator with 'casual photography', 6 photos, portrait, will return 3 x 2.

For instance, here's a 'contact' sheet I made with Perchance with the prompt, `an archaeologist at work`; you'd set the code in the next block to have 6 columns and 3 rows.

![](https://github.com/shawngraham/homecooked-history/blob/main/genai-images-as-infographics/perchance-archaeologists.png?raw=true)

In [None]:
process_images(
    input_dir="input",
    output_dir="all_images",
    num_columns=6,  #make sure you set this correctly!
    num_rows=3,  #make sure you set this correctly!
    strict_mode=False
)

# For strict validation (will raise error if dimensions don't divide evenly)
#process_images(strict_mode=True)

# For more lenient processing (will warn but continue)
#process_images(strict_mode=False)

If the block runs, check your file browser for 'all_images'. You should see a number of image files in there (you can double-click them to see if everything worked correctly).
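
A quick way to confirm the slicing did what you expected is to count the files. This is only a sketch: `count_sliced_images` and `expected_count` are hypothetical helper names, and the numbers plugged in (11 screenshots, 6 columns, 3 rows) are just the ones used as examples in this notebook, so swap in your own.

```python
import os

def count_sliced_images(folder, extensions=(".png", ".jpg", ".jpeg")):
    """Count files in `folder` whose names end in one of `extensions`."""
    return sum(
        1 for name in os.listdir(folder)
        if name.lower().endswith(extensions)
    )

def expected_count(num_screenshots, num_columns, num_rows):
    """Each screenshot should yield one sub-image per grid cell."""
    return num_screenshots * num_columns * num_rows

# Only report if the output folder actually exists
if os.path.isdir("all_images"):
    found = count_sliced_images("all_images")
    wanted = expected_count(num_screenshots=11, num_columns=6, num_rows=3)
    print(f"Found {found} sub-images; expected {wanted}")
```

If the two numbers disagree, the usual suspects are a screenshot that failed to process (check the log messages) or leftover files from an earlier batch.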

## Part Two

In [None]:
import os
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.models as models
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
import seaborn as sns
from pathlib import Path
import random
from tqdm import tqdm
import warnings
warnings.filterwarnings('ignore')

# Set random seeds for reproducibility
# i.e., every time the computer needs a 'random' number, it starts from the
# seed we set here, so all subsequent 'random' calculations come out the same
# each time. You thought computers were actually random? Nope. They always start from a seed,
# but most of the time we don't set it, so the results look random enough to us. Just fyi.
torch.manual_seed(24)
np.random.seed(24)
random.seed(24)

print("All libraries imported successfully!")

In [None]:
# Define the data directory
DATA_DIR = "all_images"

def load_image_paths(data_dir):
    """Load all image paths from the data directory and its subdirectories."""
    image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff', '.gif'}
    image_paths = []

    for root, dirs, files in os.walk(data_dir):
        for file in files:
            if any(file.lower().endswith(ext) for ext in image_extensions):
                image_paths.append(os.path.join(root, file))

    return image_paths

# Load all image paths
image_paths = load_image_paths(DATA_DIR)
print(f"Found {len(image_paths)} images in the dataset")

# Display some example images
def display_sample_images(image_paths, n_samples=6):
    """Display a sample of images from the dataset."""
    if len(image_paths) == 0:
        print("No images found in the dataset!")
        return

    sample_paths = random.sample(image_paths, min(n_samples, len(image_paths)))

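    # Size the grid to the number of samples so any n_samples value works
    n_cols = 3
    n_rows = int(np.ceil(len(sample_paths) / n_cols))
    fig, axes = plt.subplots(n_rows, n_cols, figsize=(12, 4 * n_rows), squeeze=False)
    axes = axes.flatten()
    for ax in axes:
        ax.axis('off')  # hide every panel's frame, including unused ones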

    for i, path in enumerate(sample_paths):
        try:
            img = Image.open(path)
            axes[i].imshow(img)
            axes[i].set_title(os.path.basename(path))
            axes[i].axis('off')
        except Exception as e:
            print(f"Error loading image {path}: {e}")

    plt.tight_layout()
    plt.show()

# Display sample images
display_sample_images(image_paths)

## Feature Extractor
The image model we're going to use has already learned to attach all sorts of labels to images. We don't want the labels; we just want its sense of shape and form. So we remove the final (classification) layer, and what comes out instead is a vector describing each image.

In [None]:
# We'll use ResNet-50, a popular convolutional neural network architecture.
# We'll remove the final classification layer to get feature embeddings.

class ImageFeatureExtractor:
    def __init__(self, model_name='resnet50'):
        """Initialize the feature extractor with a pre-trained model."""
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        print(f"Using device: {self.device}")

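        # Load ResNet-50 with its pre-trained ImageNet weights
        # (the `weights` argument replaces the deprecated `pretrained=True`)
        self.model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)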

        # Remove the final classification layer
        # This gives us 2048-dimensional feature vectors
        self.model = nn.Sequential(*list(self.model.children())[:-1])

        # Set to evaluation mode
        self.model.eval()
        self.model.to(self.device)

        # Define image preprocessing transforms
        self.transform = transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                               std=[0.229, 0.224, 0.225])
        ])

    def extract_features(self, image_path):
        """Extract features from a single image."""
        try:
            # Load and preprocess the image
            image = Image.open(image_path).convert('RGB')
            input_tensor = self.transform(image).unsqueeze(0).to(self.device)

            # Extract features
            with torch.no_grad():
                features = self.model(input_tensor)

            # Flatten the features
            features = features.view(features.size(0), -1)

            return features.cpu().numpy().flatten()

        except Exception as e:
            print(f"Error processing image {image_path}: {e}")
            return None

# Initialize the feature extractor
extractor = ImageFeatureExtractor()
print("Feature extractor initialized successfully!")

## Drop our pictures through this model

Notice that any category information implied by the folder an image sits in is not used when turning the images into vectors. Later on we'll visualize where in the embedding space the different images fall, and we'll colour the dots by their folder. If you sort your images into category subfolders, that gives you a sense of how good those categories might actually be; with everything in a single `all_images` folder, every dot gets the same colour.

In [None]:
# This step processes all images and extracts their feature embeddings.
# Note: This might take a while depending on the number of images.

def extract_all_features(image_paths, extractor):
    """Extract features from all images in the dataset."""
    features = []
    valid_paths = []

    print("Extracting features from all images...")

    for path in tqdm(image_paths, desc="Processing images"):
        feature = extractor.extract_features(path)
        if feature is not None:
            features.append(feature)
            valid_paths.append(path)

    features = np.array(features)
    print(f"Successfully extracted features from {len(features)} images")
    print(f"Feature shape: {features.shape}")

    return features, valid_paths

# Extract features (this might take a few minutes)
if len(image_paths) > 0:
    features, valid_image_paths = extract_all_features(image_paths, extractor)
else:
    print("No images found to process!")
    features, valid_image_paths = np.array([]), []

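## Calculate Similarity!

Because vectors have a direction, we can work out how similar two images are with a bit of geometry: the smaller the angle between their vectors, the more alike the images. In this case we'll use cosine similarity, which is the cosine of that angle.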
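
As a rough illustration of the geometry, here is cosine similarity written out by hand for tiny two-number vectors. The function name `cosine` and the toy vectors are made up for this sketch; the notebook itself uses scikit-learn's `cosine_similarity` on the 2048-number embeddings.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(round(cosine([1, 2], [2, 4]), 3))   # same direction, different length
print(round(cosine([1, 0], [0, 1]), 3))   # at right angles
print(round(cosine([1, 0], [-1, 0]), 3))  # pointing opposite ways
```

Note that the first pair scores as a perfect match even though one vector is twice as long as the other: cosine similarity only cares about direction, not magnitude.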

In [None]:
# Now we'll compute similarity between images using cosine similarity.
# Cosine similarity measures the cosine of the angle between two vectors.
def find_similar_images(query_idx, features, valid_paths, top_k=5):
    """Find the most similar images to a query image."""
    if len(features) == 0:
        return [], []

    # Compute cosine similarity between query and all images
    query_features = features[query_idx:query_idx+1]
    similarities = cosine_similarity(query_features, features).flatten()

    # Get top-k most similar images (excluding the query itself)
    similar_indices = np.argsort(similarities)[::-1]

    # Remove the query image itself from results
    similar_indices = similar_indices[similar_indices != query_idx][:top_k]

    return similar_indices, similarities[similar_indices]

def display_similar_images(query_idx, similar_indices, similarities, valid_paths):
    """Display the query image and its most similar images."""
    if len(valid_paths) == 0:
        print("No valid images to display!")
        return

    n_images = len(similar_indices) + 1
    fig, axes = plt.subplots(1, n_images, figsize=(15, 3))

    if n_images == 1:
        axes = [axes]

    # Display query image
    query_img = Image.open(valid_paths[query_idx])
    axes[0].imshow(query_img)
    axes[0].set_title(f"Query Image\n{os.path.basename(valid_paths[query_idx])}")
    axes[0].axis('off')

    # Display similar images
    for i, (idx, sim) in enumerate(zip(similar_indices, similarities)):
        similar_img = Image.open(valid_paths[idx])
        axes[i+1].imshow(similar_img)
        axes[i+1].set_title(f"Similarity: {sim:.3f}\n{os.path.basename(valid_paths[idx])}")
        axes[i+1].axis('off')

    plt.tight_layout()
    plt.show()



In [None]:
# Find similar images to a random query
if len(features) > 0:
    query_idx = random.randint(0, len(features) - 1)
    similar_indices, similarities = find_similar_images(query_idx, features, valid_image_paths)

    print(f"Finding images similar to: {os.path.basename(valid_image_paths[query_idx])}")
    display_similar_images(query_idx, similar_indices, similarities, valid_image_paths)
else:
    print("No features available for similarity computation!")

In [None]:
def another_similarity_search(features, valid_paths, n_queries=3):
    """Demonstrate similarity search with multiple random queries."""
    if len(features) == 0:
        print("No features available for similarity search!")
        return

    print(f"Demonstrating similarity search with {n_queries} random queries...")

    for i in range(n_queries):
        print(f"\n--- Query {i+1} ---")
        query_idx = random.randint(0, len(features) - 1)
        similar_indices, similarities = find_similar_images(query_idx, features, valid_paths, top_k=4)
        display_similar_images(query_idx, similar_indices, similarities, valid_paths)

# Run another similarity search
if len(features) > 0:
    another_similarity_search(features, valid_image_paths)

## Part Three
Visualize Similarity!

In [None]:
# High-dimensional embeddings are hard to visualize directly.
# We'll use dimensionality reduction techniques to project them into 2D space.

def visualize_embeddings_2d(features, valid_paths, method='tsne', n_samples=None,
                           show_labels=False):
    """
    Visualize embeddings in 2D space using PCA or t-SNE.

    Parameters:
    -----------
    features : numpy.ndarray
        Feature embeddings for all images
    valid_paths : list
        List of valid image paths
    method : str
        Dimensionality reduction method ('pca' or 'tsne')
    n_samples : int, optional
        Number of samples to visualize (None for all)
    show_labels : bool
        Whether to show filename labels on the plot
    """
    if len(features) == 0:
        print("No features available for visualization!")
        return

    # Limit number of samples for faster computation
    if n_samples and len(features) > n_samples:
        indices = random.sample(range(len(features)), n_samples)
        features_subset = features[indices]
        paths_subset = [valid_paths[i] for i in indices]
    else:
        features_subset = features
        paths_subset = valid_paths

    print(f"Visualizing {len(features_subset)} images using {method.upper()}...")

    if method == 'pca':
        # Principal Component Analysis
        reducer = PCA(n_components=2, random_state=42)
        embeddings_2d = reducer.fit_transform(features_subset)
        title = f"Image Embeddings Visualization (PCA)\nExplained Variance: {reducer.explained_variance_ratio_.sum():.2%}"

    elif method == 'tsne':
        # t-Distributed Stochastic Neighbor Embedding
        reducer = TSNE(n_components=2, random_state=42, perplexity=min(30, len(features_subset)-1))
        embeddings_2d = reducer.fit_transform(features_subset)
        title = "Image Embeddings Visualization (t-SNE)"
    else:
        raise ValueError(f"Unknown method '{method}': use 'pca' or 'tsne'")

    # Create the visualization
    fig, ax = plt.subplots(figsize=(12, 8))

    # Color points by their directory (if available)
    colors = []
    labels = []
    for path in paths_subset:
        # Extract directory name as label
        parent_dir = os.path.basename(os.path.dirname(path))
        if parent_dir not in labels:
            labels.append(parent_dir)
        colors.append(labels.index(parent_dir))

    # Create scatter plot
    scatter = ax.scatter(embeddings_2d[:, 0], embeddings_2d[:, 1],
                       c=colors, cmap='tab10', alpha=0.7, s=50)

    # Add filename labels if requested
    if show_labels:
        for i, (x, y, path) in enumerate(zip(embeddings_2d[:, 0], embeddings_2d[:, 1], paths_subset)):
            filename = os.path.basename(path)
            # Truncate long filenames
            if len(filename) > 15:
                filename = filename[:12] + '...'
            ax.annotate(filename, (x, y), xytext=(5, 5), textcoords='offset points',
                       fontsize=8, alpha=0.7, ha='left')

    # Add colorbar if we have multiple categories
    # where each color corresponds to the category
    if len(labels) > 1:
        plt.colorbar(scatter, label='Directory')

    # Add legend if we have multiple categories
    # (scatter.to_rgba gives the colour the scatter plot mapped to each index)
    if len(labels) > 1 and len(labels) <= 10:
        handles = [plt.Line2D([0], [0], marker='o', color='w',
                            markerfacecolor=scatter.to_rgba(i),
                            markersize=8, label=label)
                  for i, label in enumerate(labels)]
        ax.legend(handles=handles, title='Directory', bbox_to_anchor=(1.05, 1), loc='upper left')

    ax.set_title(title)
    ax.set_xlabel('Dimension 1')
    ax.set_ylabel('Dimension 2')
    ax.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.show()

    return embeddings_2d



In [None]:
# Visualize embeddings using both PCA and t-SNE
if len(features) > 0:
    print("Creating 2D visualizations of the image embeddings...")

    # Standard PCA visualization
    pca_embeddings = visualize_embeddings_2d(features, valid_image_paths, method='pca', n_samples=200)

    # t-SNE visualization with filename labels
    print("\nCreating t-SNE visualization with filename labels...")
    tsne_embeddings = visualize_embeddings_2d(features, valid_image_paths, method='tsne',
                                             n_samples=50, show_labels=True)
else:
    print("No features available for visualization!")

In [None]:
# Let's analyze the distribution of similarities and feature statistics.

def analyze_embeddings(features, valid_paths):
    """Analyze the properties of extracted embeddings."""
    if len(features) == 0:
        print("No features available for analysis!")
        return

    print("=== Embedding Analysis ===")
    print(f"Number of images: {len(features)}")
    print(f"Feature dimensionality: {features.shape[1]}")
    print(f"Feature range: [{features.min():.3f}, {features.max():.3f}]")
    print(f"Mean feature magnitude: {np.linalg.norm(features, axis=1).mean():.3f}")

    # Compute pairwise similarities
    print("\nComputing pairwise similarities...")
    similarities = cosine_similarity(features)

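    # Keep only the upper triangle: each pair once, no self-similarities.
    # (Zeroing the diagonal instead would drag the average down.)
    if len(features) < 2:
        print("Need at least two images to compare!")
        return
    similarities = similarities[np.triu_indices_from(similarities, k=1)]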

    # Analyze similarity distribution
    fig, axes = plt.subplots(1, 2, figsize=(15, 5))

    # Histogram of similarities
    axes[0].hist(similarities.flatten(), bins=50, alpha=0.7, edgecolor='black')
    axes[0].set_title('Distribution of Pairwise Similarities')
    axes[0].set_xlabel('Cosine Similarity')
    axes[0].set_ylabel('Frequency')
    axes[0].grid(True, alpha=0.3)

    # Feature magnitude distribution
    feature_magnitudes = np.linalg.norm(features, axis=1)
    axes[1].hist(feature_magnitudes, bins=30, alpha=0.7, edgecolor='black')
    axes[1].set_title('Distribution of Feature Magnitudes')
    axes[1].set_xlabel('L2 Norm')
    axes[1].set_ylabel('Frequency')
    axes[1].grid(True, alpha=0.3)

    plt.tight_layout()
    plt.show()

    print(f"Average pairwise similarity: {similarities.mean():.3f}")
    print(f"Std of pairwise similarities: {similarities.std():.3f}")
    print(f"Most similar pair similarity: {similarities.max():.3f}")

# Run analysis
if len(features) > 0:
    analyze_embeddings(features, valid_image_paths)

Cosine similarity ranges from -1 to +1, where 1 = pointing the same way (identical), 0 = orthogonal (unrelated), and -1 = opposite. Because these ResNet features are never negative, the scores here will fall between 0 and 1.
+ If the average pairwise similarity is high, then the images share similar kinds of composition, colours, visual organization etc.
+ If the standard deviation (std) is low, then this suggests consistent similarity: limited diversity, tight clustering, a homogeneous dataset.
+ If the most similar pair scores 1.000, then that suggests there's a duplicate image in the dataset somewhere!
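
If you do spot a top score of 1.000, a sketch like the following could help track down which images are involved. The function name `near_duplicates`, the 0.999 threshold, and the toy similarity values are all assumptions made up for illustration; in the notebook you would pass it the matrix from `cosine_similarity(features)` and the list `valid_image_paths`.

```python
def near_duplicates(sim_matrix, names, threshold=0.999):
    """List (name_a, name_b, score) for pairs at or above `threshold`."""
    pairs = []
    for i in range(len(names)):
        # Start at i + 1 so each pair is checked once and never against itself
        for j in range(i + 1, len(names)):
            if sim_matrix[i][j] >= threshold:
                pairs.append((names[i], names[j], sim_matrix[i][j]))
    return pairs

# Invented example: the first and third images are identical
toy_names = ["run1_cropped_1.png", "run1_cropped_2.png", "run2_cropped_1.png"]
toy_sims = [
    [1.0, 0.82, 1.0],
    [0.82, 1.0, 0.79],
    [1.0, 0.79, 1.0],
]
print(near_duplicates(toy_sims, toy_names))
```

A duplicate can mean the generator really did return the same image twice, or simply that the same screenshot was uploaded under two names, so it is worth opening both files before drawing conclusions.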

So... what does this all imply about our dataset? Make a note!