<a href="https://colab.research.google.com/github/zentralwerkstatt/fau/blob/main/fau.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Image Analysis

**This notebook requires a GPU - make sure to change the runtime type in the "runtime" menu!**

## Copyright notice

This version (c) 2024 Fabian Offert, [MIT License](LICENSE).

Access utility functions

In [None]:
%%capture
!rm -rf toolbox
!git clone https://github.com/zentralwerkstatt/toolbox
!pip3 install git+https://github.com/openai/CLIP.git
!pip3 install umap-learn filetype

import warnings
warnings.filterwarnings("ignore",category=DeprecationWarning)
warnings.filterwarnings("ignore",category=FutureWarning)

Imports

In [None]:
import numpy as np
import PIL.Image
from tqdm.notebook import tqdm
from sklearn.cluster import KMeans
from toolbox import toolbox
from IPython.display import display

Getting some test data "locally" (Web gallery of art)


In [None]:
%%capture
!rm -rf wga
!gdown 10eyHTKDDqN7iwu0WDCaq2sZtOCQ3w9Fs
!unzip wga.zip -d .
!rm wga.zip



---



## 1. Individual images

### Opening and displaying images

In [None]:
# Let's get a test image
img = toolbox.img_from_url("https://c.files.bbci.co.uk/8D30/production/_106344163_florida_python_.big_cypressjpg.jpg")

In [None]:
img

In [None]:
# Within a loop, use:
toolbox.show_img(img)

### Resizing images

This image is too big, but we can easily resize it.

In [None]:
img_small = img.resize((img.width//2, img.height//3)) # Floor division
img_small

If we are not sure how big our original image is, we can use the `thumbnail` function to shrink it to fit within a maximum size while keeping the aspect ratio. Caution: this function changes the image in-place, i.e. it does not return a new image but modifies the one it is called on!

In [None]:
img.thumbnail((200, 200)) # Target thumbnail size
img
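
Because `thumbnail` works in-place, one way to keep the full-size image around is to call it on a copy. The following is a minimal sketch; it uses a synthetic gray image as a stand-in for a real photo, and the names `original` and `small` are only for illustration.

```python
import PIL.Image

# Synthetic stand-in for a loaded photo: 800x600 pixels, solid gray
original = PIL.Image.new("RGB", (800, 600), (128, 128, 128))

small = original.copy()      # Work on a copy so the original stays untouched
small.thumbnail((200, 200))  # In-place: shrinks `small` to fit within 200x200

print(original.size)  # Unchanged
print(small.size)     # Fits within 200x200, aspect ratio preserved
```

Note that the result is not necessarily 200x200: only the longer side reaches the target size.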

To save an image, simply call `save` on the variable with a filename/path.

In [None]:
img.save("small.jpg")



---



## 2. What is an image anyway?

In Python (for deep learning), images are arrays, i.e. multi-dimensional matrices. Color images have three channels: red, green, and blue.

In [None]:
# For convenience
def show_np(x):
    img = PIL.Image.fromarray(x)
    toolbox.show_img(img)

In [None]:
a = np.ones((20,20,3), dtype=np.uint8) * 255 # Multiply with scalar
print(a.shape) # Show the "shape" of a matrix

In [None]:
print(a)

In [None]:
show_np(a)
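
One detail that is easy to trip over: PIL reports image sizes as (width, height), while the NumPy array of the same image has the shape (height, width, channels). A small sketch with a synthetic red image (not part of the dataset) makes this visible:

```python
import numpy as np
import PIL.Image

# Synthetic image: 300 pixels wide, 100 pixels high, solid red
wide = PIL.Image.new("RGB", (300, 100), (255, 0, 0))
arr = np.array(wide)

print(wide.size)  # PIL: (width, height)
print(arr.shape)  # NumPy: (height, width, channels)
print(arr.dtype)  # 8-bit unsigned integers, i.e. values from 0 to 255
print(arr[0, 0])  # Top-left pixel as [R, G, B]
```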

### Manipulating pixels with slicing

In [None]:
b = np.zeros((20,20,3), dtype=np.uint8)
b[:,:,0] = 255
show_np(b)

In [None]:
img_np = np.array(img)

In [None]:
img_np[0:100,0:100,2] = 0 # Remove the blue channel (index 2) in the upper left corner
show_np(img_np)
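
The last index of the slice selects the channel: 0 is red, 1 is green, 2 is blue. As a quick sanity check, the sketch below removes one channel at a time from a tiny all-white test array (a hypothetical example, independent of the image above).

```python
import numpy as np

white = np.full((2, 2, 3), 255, dtype=np.uint8)  # Tiny all-white RGB image

no_red = white.copy()
no_red[:, :, 0] = 0    # Channel 0 is red: white without red is cyan

no_green = white.copy()
no_green[:, :, 1] = 0  # Channel 1 is green: white without green is magenta

no_blue = white.copy()
no_blue[:, :, 2] = 0   # Channel 2 is blue: white without blue is yellow

print(no_red[0, 0], no_green[0, 0], no_blue[0, 0])
```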

### Exercises H

1. Create a grey 300x300 pixel image and display it.
2. Re-color 1/3 of the image red, 1/3 green, 1/3 blue and display it.



---



## 3. Colors

Let's get the average color of all images in a dataset. First, load the WGA dataset.

In [None]:
paths = toolbox.get_all_files("wga", ext="jpg")
print(paths[:5])

In [None]:
len(paths)

In [None]:
def avg_color(np_img):
    avg_color_per_row = np.average(np_img, axis=0)
    avg_color = np.average(avg_color_per_row, axis=0)
    return avg_color
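
The two-step average can also be written as a single call that averages over both spatial axes at once. This is a sketch of an equivalent formulation; `avg_color_short` and the red/blue test image are made up for illustration.

```python
import numpy as np

def avg_color_short(np_img):
    # Average over height (axis 0) and width (axis 1), keeping the channels
    return np_img.mean(axis=(0, 1))

# Hypothetical test image: left half red, right half blue
demo = np.zeros((10, 10, 3), dtype=np.uint8)
demo[:, :5] = (255, 0, 0)
demo[:, 5:] = (0, 0, 255)

mean_color = avg_color_short(demo)
print(mean_color)
```

The result for this test image is a purple that does not occur anywhere in the image, which hints at the limits of averaging.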

In [None]:
for path in paths[:3]: # Only try this on a subset
    img = toolbox.load_img(path)
    img.thumbnail((200,200)) # In-place!
    np_img = np.array(img)

    color_img = toolbox.color_img(50, avg_color(np_img))

    display(img)
    display(color_img)

Now this seems correct but not very useful. It turns out what we actually want are the *dominant* colors, not the average color. And we can get these by applying a machine learning technique called [k-means clustering](https://en.wikipedia.org/wiki/K-means_clustering). To visualize the colors we will use a function provided by the class toolbox called `make_palette` that takes an array of colors and creates a plot.

In [None]:
for path in paths[:3]: # Only try this on a subset
    img = toolbox.load_img(path)
    img.thumbnail((200,200)) # In-place!
    img_np = np.array(img)
    # Result: a matrix of at most 200x200x3 (thumbnail keeps the aspect ratio)

    km = KMeans(n_clusters=15) # Set up algorithm to find 15 clusters
    km.fit(img_np.reshape(-1, 3)) # Flatten image but keep color planes: -1 means NumPy will figure out that dimension itself!
    centers = km.cluster_centers_ # Get the center points of the clusters
    palette = toolbox.make_palette(centers) # Make a palette image

    toolbox.show_img(img)
    toolbox.show_img(palette)
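
The cluster centers alone do not say how much of the image each color covers. One possible extension, sketched here on a synthetic two-color image rather than a painting, is to count the pixels assigned to each cluster and sort the colors by that count. The variable names and the `n_init`/`random_state` settings are choices made for this example only.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for a painting: 3/4 red pixels, 1/4 blue pixels
demo = np.zeros((20, 20, 3), dtype=np.uint8)
demo[:15] = (255, 0, 0)
demo[15:] = (0, 0, 255)

pixels = demo.reshape(-1, 3).astype(float)  # One row per pixel, one column per channel
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels)

counts = np.bincount(km.labels_, minlength=2)  # Number of pixels per cluster
order = np.argsort(counts)[::-1]               # Largest cluster first
ranked_colors = km.cluster_centers_[order]     # Dominant color comes first
shares = counts[order] / counts.sum()          # Fraction of the image per color

print(ranked_colors.round())
print(shares)
```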



---



## 4. Clustering images...

### ... by "brightness"

Extract the features by "flattening" the image - we are simply concatenating all color values into one huge list.

In [None]:
features = np.zeros((len(paths), 32*32*3))
for i, path in enumerate(tqdm(paths)):
    img = toolbox.load_img(path)
    features[i] = toolbox.flatten_img(img, 32)

In [None]:
print(features.shape)
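
To make the idea of "flattening" concrete, here is a rough sketch of what such a step might look like. It is an assumption for illustration, not the actual implementation of `toolbox.flatten_img`: the image is forced to a fixed square size and its pixel values are strung together into one long vector.

```python
import numpy as np
import PIL.Image

def flatten_sketch(img, size):
    # Hypothetical stand-in, not the toolbox implementation
    small = img.convert("RGB").resize((size, size))  # Force 3 channels and a fixed size
    return np.array(small).astype(np.float64).reshape(-1)  # size*size*3 values

# Synthetic test image with a single color
demo = PIL.Image.new("RGB", (640, 480), (10, 20, 30))
vector = flatten_sketch(demo, 32)

print(vector.shape)  # 32*32*3 values
print(vector[:3])    # The first pixel's R, G, B values
```

Resizing to a fixed size is what makes every image yield a vector of the same length, regardless of its original dimensions.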

This gives us 3072-dimensional features. To plot them, we have to reduce them down to a low-dimensional space somehow. We will use the [UMAP algorithm](https://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction#Uniform_manifold_approximation_and_projection) for this.

In [None]:
reduced_features = toolbox.reduce_features(features)
print(reduced_features.shape)

Now let's see how that looks.

In [None]:
plot = toolbox.plot_imgs_features(paths, 50, reduced_features)

In [None]:
plot

In [None]:
plot.save("plot_raw.jpg")

### ... using CLIP

Usually, however, color will not tell us much about an image dataset. Instead, we can leverage state-of-the-art, fully-trained neural networks, like [CLIP](https://openai.com/blog/clip/), that know something about the *content* of images.

In [None]:
features = np.zeros((len(paths), 512))
for i, path in enumerate(tqdm(paths)):
    img = toolbox.load_img(path)
    features[i] = toolbox.CLIP_img(img)

In [None]:
print(features.shape)

CLIP gives us 512-dimensional embeddings. To plot them, we again have to reduce them down to a low-dimensional space, and again we will use the UMAP algorithm.

In [None]:
reduced_features = toolbox.reduce_features(features)
print(reduced_features.shape)

In [None]:
plot = toolbox.plot_imgs_features(paths, 200, reduced_features)

In [None]:
plot

In [None]:
plot.save("plot_clip.jpg")



---



## 5. Advanced clustering

There are many interesting clusters that we can see in the plot - can we automate this process, too?

In [None]:
n_clusters = 20

In [None]:
km = KMeans(n_clusters=n_clusters)
km.fit(features)

In [None]:
km.labels_.shape

We are trying to find 20 clusters in 2200 images. Here are the labels the algorithm assigned to the first 100 images:

In [None]:
km.labels_[:100]
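
A raw list of labels is hard to read. A quick way to get an overview is to count how many images ended up in each cluster. The sketch below uses a short, made-up label array in place of `km.labels_`.

```python
import numpy as np

# Hypothetical labels, standing in for km.labels_
labels = np.array([0, 2, 2, 1, 2, 0, 2])

# Unique cluster ids and the number of images assigned to each
cluster_ids, sizes = np.unique(labels, return_counts=True)
for c, n in zip(cluster_ids, sizes):
    print(f"cluster {c}: {n} images")
```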

Let's try to visualize this.

In [None]:
clusters = {}
for c in range(n_clusters):
    clusters[c] = []
    for i, path in enumerate(paths):
        if km.labels_[i] == c:
            clusters[c].append(path)
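
The nested loop above walks through all paths once per cluster. The same grouping can be sketched in a single pass with a `defaultdict`; the paths and labels below are made-up stand-ins for `paths` and `km.labels_`.

```python
from collections import defaultdict

# Hypothetical stand-ins for `paths` and `km.labels_`
demo_paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
demo_labels = [1, 0, 1, 2]

grouped = defaultdict(list)  # Missing keys start out as empty lists
for path, label in zip(demo_paths, demo_labels):
    grouped[int(label)].append(path)  # int() turns NumPy integers into plain dict keys

print(dict(grouped))
```

One difference to keep in mind: a cluster without any images only gets a key once it is looked up, at which point the `defaultdict` returns an empty list.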

In [None]:
for c in range(n_clusters):
    toolbox.show_img(toolbox.plot_imgs_grid(clusters[c], 50))

Now we will visualize where the clusters are in the overview plot.

In [None]:
borders = []
p = toolbox.random_palette(n_clusters)
for i, path in enumerate(paths):
    borders.append(p[km.labels_[i]])
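
The loop above simply looks up one color per image, using the cluster label as an index into the palette. The sketch below shows the same lookup as a list comprehension; the palette format (a list of RGB tuples) is an assumption, since the return type of `toolbox.random_palette` is not shown here.

```python
# Hypothetical palette (one RGB tuple per cluster) and labels (one per image)
demo_palette = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
demo_labels = [2, 0, 0, 1]

# One border color per image, chosen by the image's cluster label
demo_borders = [demo_palette[label] for label in demo_labels]
print(demo_borders)
```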

In [None]:
plot = toolbox.plot_imgs_features(paths, 50, reduced_features, borders)

In [None]:
plot

In [None]:
plot.save("plot_clip_borders_clusters.jpg")

Finally, we can use the colored borders to add metadata back in!

In [None]:
borders = []
p = toolbox.random_palette(2)
for i, path in enumerate(paths):
    if "Screen" in path:
        borders.append(p[0])
    else:
        borders.append(p[1])

In [None]:
plot = toolbox.plot_imgs_features(paths, 50, reduced_features, borders)

In [None]:
plot

In [None]:
plot.save("plot_clip_borders_classes.jpg")