# **Image Compression using K-Means Clustering**

Image compression reduces the amount of storage or bandwidth an image needs when it is stored or transmitted. High-quality images take up a lot of memory, whereas low-quality images take up less. There are many ways to compress an image, one of which is K-Means clustering. Here we discuss image compression and demonstrate how it can be done with K-Means clustering.

In this session we will cover:

> * What is K-Means Clustering?
> * Image Compression using K-Means clustering
> * Creating Interactive controls to compress image
> * Visualize the compressed image

For theory, please go through [this](https://analyticsindiamag.com/beginners-guide-to-image-compression-using-k-means-clustering/) article.
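
As a rough illustration of the idea behind K-Means (not the scikit-learn implementation used later), the sketch below alternates between assigning each point to its nearest center and moving each center to the mean of its points. The names `kmeans_sketch`, `n_iter` and `seed` are illustrative assumptions, and the toy data stands in for pixel colors.

```python
import numpy as np

def kmeans_sketch(points, k, n_iter=10, seed=0):
    """Plain Lloyd-style iteration: assign points, then update centers."""
    rng = np.random.default_rng(seed)
    # Start from k data points chosen at random (fancy indexing returns a copy)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: index of the nearest center for every point
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its assigned points
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers, labels

# Toy "pixels": five black and five white RGB values in [0, 1]
toy = np.array([[0.0, 0.0, 0.0]] * 5 + [[1.0, 1.0, 1.0]] * 5)
centers, labels = kmeans_sketch(toy, k=2)
print(centers)
print(labels)
```

With image data, each row of `points` would be one pixel's RGB value and each center would become one color of the reduced palette.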

## **Implementation of Image Compression using K-Means Clustering**

K-Means clustering is implemented in the scikit-learn library for Python (imported as `sklearn`). Before using it, install it with `pip install scikit-learn`.

In [None]:
!python -m pip install pip --upgrade --user -q --no-warn-script-location
!python -m pip install numpy pandas seaborn matplotlib scipy statsmodels scikit-learn scikit-image nltk gensim --user -q --no-warn-script-location

import IPython
IPython.Application.instance().kernel.do_shutdown(True)


### **a. Importing required libraries**

Here we require libraries for Visualization, Compression and creating interactive widgets.

In [None]:
import os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as image
%matplotlib inline
plt.style.use("ggplot")
from skimage import io
from sklearn.cluster import KMeans
from ipywidgets import interact, interactive, fixed, interact_manual, IntSlider
import ipywidgets as widgets

### **b. Loading Images Dataset and analyzing the properties of images**

To access images, you can store several of them in a folder and load them one at a time. We set a standard figure size of (20, 12) for all images for consistency, and we hide the axis ticks.

In [1]:
# !wget -nd https://i.pinimg.com/originals/72/38/38/7238383546f5efe19f6e7f440fb9fcb5.jpg

--2021-10-28 14:32:09--  https://i.pinimg.com/originals/72/38/38/7238383546f5efe19f6e7f440fb9fcb5.jpg
Resolving i.pinimg.com (i.pinimg.com)... 104.101.16.77, 2600:140f:dc00:1a2::1931, 2600:140f:dc00:18c::1931, ...
Connecting to i.pinimg.com (i.pinimg.com)|104.101.16.77|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 59441 (58K) [image/jpeg]
Saving to: ‘7238383546f5efe19f6e7f440fb9fcb5.jpg’


2021-10-28 14:32:10 (2.23 MB/s) - ‘7238383546f5efe19f6e7f440fb9fcb5.jpg’ saved [59441/59441]



In [None]:
plt.rcParams['figure.figsize'] = (20, 12)
image = io.imread('7238383546f5efe19f6e7f440fb9fcb5.jpg')
labels = plt.axes(xticks=[], yticks=[])
labels.imshow(image);

In [None]:
image.shape

In [None]:
image.size

The image shape gives the number of rows, columns and channels. Here there are 3 channels because this is a color picture; a grayscale image loaded the same way typically has a 2-D shape (rows, columns) with no channel axis. Note that `image.size` is the total number of array elements, i.e. rows × columns × channels, so the number of pixels is `image.size` divided by 3.
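
A small sketch of how `shape` and `size` relate, using hypothetical zero-filled arrays in place of real images (the 4 × 6 dimensions are arbitrary):

```python
import numpy as np

# Hypothetical stand-ins for loaded images: 4 rows, 6 columns
rgb = np.zeros((4, 6, 3), dtype=np.uint8)   # color image: rows, columns, channels
gray = np.zeros((4, 6), dtype=np.uint8)     # grayscale image: rows, columns only

print(rgb.shape, rgb.size)     # size counts every channel value of every pixel
print(gray.shape, gray.size)

# The pixel count is rows * columns, independent of the number of channels
n_pixels = rgb.shape[0] * rgb.shape[1]
print(n_pixels)
```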

We will reshape this image into 2 dimensions: the number of pixels (rows × columns) and the number of channels. We also divide the pixel values by 255, the maximum intensity of each RGB channel in an 8-bit image, so that every value lies between 0 and 1.

In [None]:
image_data = (image / 255.0).reshape(-1, 3)  # -1 infers the pixel count (749 * 500 here)
image_data.shape
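
To make the reshaping step concrete, here is a minimal sketch on a hypothetical 2 × 3 RGB array (a stand-in for the loaded photo). It shows that scaling and flattening lose no information, since both can be undone.

```python
import numpy as np

# Hypothetical 2x3 RGB image with 8-bit values
img = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)

# Scale to [0, 1] and flatten to one row per pixel; -1 lets NumPy infer rows * columns
pixels = (img / 255.0).reshape(-1, 3)
print(pixels.shape)

# The step is reversible: reshape back to the image shape and undo the scaling
restored = np.round(pixels.reshape(img.shape) * 255).astype(np.uint8)
print(np.array_equal(restored, img))
```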

### **c. Analyzing the Color Space**

A color space is a specific organization of colors. Two images that use the same color model (here RGB) can occupy entirely different regions of that color space. Here we analyze the color space of our image by plotting a random sample of its pixels.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

class plot_utils:
    def __init__(self, img_data, title, num_pixels=10000, colors=None):
        self.img_data = img_data
        self.title = title
        self.num_pixels = num_pixels
        self.colors = colors

    def colorSpace(self):
        if self.colors is None:
            self.colors = self.img_data

        rand = np.random.RandomState(42)
        index = rand.permutation(self.img_data.shape[0])[:self.num_pixels]
        colors = self.colors[index]
        R, G, B = self.img_data[index].T
        fig, ax = plt.subplots(1, 2, figsize=(12,8))
        ax[0].scatter(R, G, color=colors, marker='.')
        ax[0].set(xlabel='Red', ylabel='Green', xlim=(0, 1), ylim=(0, 1))
        ax[1].scatter(R, B, color=colors, marker='.')
        ax[1].set(xlabel='Red', ylabel='Blue', xlim=(0, 1), ylim=(0, 1))
        fig.suptitle(self.title, size=20)

In [None]:
color = plot_utils(image_data, title='Original possible colors')
color.colorSpace()

As our image uses the RGB model, we can view its color space along the Red, Green and Blue axes. We will reduce the possible colors from about 16 million (256³ = 16,777,216) to only 160 and visualize the color space again.

Reducing the color space to 160 colors means asking K-Means for 160 clusters: each cluster center becomes one color of the new palette.
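
The arithmetic behind this reduction can be sketched as follows. This is only a rough estimate that assumes the 749 × 500 example image is stored as raw, uncompressed values (a palette plus one index per pixel); real JPEG or PNG file sizes will differ.

```python
import math

n_pixels = 749 * 500      # pixel count of the example image
k = 160                   # number of palette colors

full_palette = 256 ** 3   # number of possible 24-bit RGB colors
bits_per_index = math.ceil(math.log2(k))   # bits needed to index one of k colors

raw_bytes = n_pixels * 3                                # 3 bytes per pixel
indexed_bytes = n_pixels * bits_per_index / 8 + k * 3   # indices plus the palette
print(full_palette, bits_per_index)
print(raw_bytes, indexed_bytes, round(raw_bytes / indexed_bytes, 2))
```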

In [None]:
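from sklearn.cluster import MiniBatchKMeans
# Fit 160 clusters, one per color of the reduced palette
kmeans = MiniBatchKMeans(n_clusters=160).fit(image_data)
# Replace every pixel with the center of the cluster it belongs to
k_colors = kmeans.cluster_centers_[kmeans.predict(image_data)]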
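
The expression `kmeans.cluster_centers_[kmeans.predict(image_data)]` relies on NumPy integer-array indexing. A minimal sketch with a hypothetical 3-color palette and made-up labels shows what it does:

```python
import numpy as np

# Hypothetical palette of 3 cluster centers (RGB in [0, 1])
centers = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
# Hypothetical cluster label for each of 5 pixels, in the form predict() returns
labels = np.array([2, 0, 0, 1, 2])

# Indexing with an integer array picks one palette row per pixel
quantized = centers[labels]
print(quantized.shape)
print(quantized)
```

The result has one row per pixel, but every row is one of the palette colors.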


In [None]:
reduced = plot_utils(image_data, colors=k_colors, title="Reduced color space: only 160 colors")
reduced.colorSpace()

Here we can clearly see that the color space has changed compared with the original: it now contains only 160 colors. Next we will create an interactive workspace where we can select different images and choose the number of clusters, from 1 to 256.

### **d. Image Compression using Interactive widgets**

Now we will create an image compression module and make it a little interactive. We will create widgets with the ipywidgets library for Python, which provides controls such as range sliders and dropdown menus.

We will create a slider that lets us choose the value of K, i.e. the number of clusters, and a dropdown menu from which we can select different images from our image dataset.

We will wrap this up in a single user-defined function and display the compressed image next to the original for comparison.

We will use `@interact` from ipywidgets, which automatically creates the user interface from the function's arguments.
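
As a minimal sketch of how ipywidgets maps arguments to controls, the example below uses `interactive`, which builds the same widgets as `@interact` but returns them so they can be inspected. The callback `describe` and the file names are hypothetical placeholders.

```python
import ipywidgets as widgets
from ipywidgets import interactive, IntSlider

def describe(image, k):
    # Hypothetical callback; the real one would load and compress the image
    return f"{image} with {k} colors"

# A list is turned into a Dropdown; an explicit IntSlider is used as given
ui = interactive(describe,
                 image=["first.jpg", "second.jpg"],   # hypothetical file names
                 k=IntSlider(min=1, max=256, step=1, value=160,
                             continuous_update=False))

dropdown, slider = ui.children[0], ui.children[1]
print(type(dropdown).__name__, type(slider).__name__, slider.value)
```

In a notebook, displaying `ui` shows the controls. Setting `continuous_update=False` means the callback runs when the slider is released rather than on every intermediate value, which matters when each call refits a clustering model.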

In [None]:
# Location of the image dataset; a list makes @interact render a dropdown menu
img_dir = './'
img_lib = ['7238383546f5efe19f6e7f440fb9fcb5.jpg']

# Defining the compression function
@interact
def compression(image=img_lib, k=IntSlider(min=1, max=256,
                                           step=1, value=160, continuous_update=False,
                                           layout=dict(width='100%'))):
    # Loading the image and reshaping it as done above
    input_img = io.imread(os.path.join(img_dir, image))
    img_data = (input_img / 255.0).reshape(-1, 3)

    # Using the K value to create clusters
    kmeans = MiniBatchKMeans(n_clusters=k).fit(img_data)
    k_colors = kmeans.cluster_centers_[kmeans.predict(img_data)]

    # Reshaping the flat pixel list back to the original image shape
    k_img = np.reshape(k_colors, input_img.shape)

    #Plotting the compressed and original image
    fig, (ax1, ax2) = plt.subplots(1, 2)
    fig.suptitle('K-means Image Compressor', fontsize=20)

    ax1.set_title('Compressed Image')
    ax1.set_xticks([])
    ax1.set_yticks([])
    ax1.imshow(k_img)

    ax2.set_title('Original (16 Million Colors)')
    ax2.set_xticks([])
    ax2.set_yticks([])
    ax2.imshow(input_img)

    plt.subplots_adjust(top=0.85)
    plt.show()
    # import matplotlib
    # matplotlib.image.imsave('Compressed.jpg', k_img)
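
Note that the quantized array still holds three values per pixel, so it is no smaller in memory than the original. One way the reduced palette can translate into less storage is to keep the palette once and store only a palette index per pixel. The sketch below illustrates this on a hypothetical 4 × 4 image with at most 3 colors; it is an assumption about how one might store the result, not part of the notebook's pipeline.

```python
import numpy as np

# Hypothetical quantized image: 4x4 pixels drawn from only 3 distinct colors
palette_true = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]], dtype=np.uint8)
rng = np.random.default_rng(0)
quantized = palette_true[rng.integers(0, 3, size=(4, 4))]   # shape (4, 4, 3)

# Split into a small palette plus one palette index per pixel
flat = quantized.reshape(-1, 3)
palette, index = np.unique(flat, axis=0, return_inverse=True)
index = index.reshape(-1).astype(np.uint8)   # uint8 suffices for up to 256 colors

# Rebuilding from palette + indices reproduces the quantized image exactly
rebuilt = palette[index].reshape(quantized.shape)
print(np.array_equal(rebuilt, quantized))
print(quantized.nbytes, palette.nbytes + index.nbytes)
```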


## **Related Articles:**

> * [Image Compression using K-Means Clustering](https://analyticsindiamag.com/beginners-guide-to-image-compression-using-k-means-clustering/)

> * [PCA for images](https://analyticsindiamag.com/how-does-pca-dimension-reduction-work-for-images/)

> * [Image Generation with Tensorflow Keras](https://analyticsindiamag.com/getting-started-image-generation-tensorflow-keras/)

> * [Getting Started with Computer Vision using Tensorflow Keras](https://analyticsindiamag.com/computer-vision-using-tensorflow-keras/)

> * [Feature Extraction of Images with Skimage](https://analyticsindiamag.com/image-feature-extraction-using-scikit-image-a-hands-on-guide/)

> * [Bitwise Operations On Images Using OpenCV](https://analyticsindiamag.com/how-to-implement-bitwise-operations-on-images-using-opencv/)
