**Motivation**<br>
Storing images is a big challenge for companies like Instagram, where millions of images are uploaded every hour.<br>
A color image in RGB space, with each channel represented by 8 bits (256 levels), can contain almost 16.8 million distinct colors (256 × 256 × 256 = 16,777,216).<br>
Do we need so many colors? For example, RGB = (0,0,0) is practically indistinguishable from (1,1,1) to the human eye (both appear black).<br>
**Theory**<br>
[K-Means Details](https://medium.com/@ribhu198iit/k-means-clustering-algorithm-16ad985be467)<br>
The important factor here is the choice of k, the number of centroids. If k=1 (the minimum allowed), the whole picture reduces to a single color. If k=256, the whole picture reduces to only 256 colors, each represented by a centroid.<br>
For example, pink and maroon might both be reduced to a single color, red.<br>
*Author - Gaurav Kabra*
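
The arithmetic behind the color count can be checked with a few lines of Python. This is a minimal sketch, assuming standard 8-bit RGB. The `bits_per_pixel` helper is hypothetical and not part of the notebook: it assumes the reduced image would be stored as a palette of k colors plus one palette index per pixel, which the code below does not actually do.

```python
BITS_PER_CHANNEL = 8   # assumption: standard 8-bit RGB
CHANNELS = 3           # R, G, B

levels = 2 ** BITS_PER_CHANNEL        # 256 intensity levels per channel
total_colors = levels ** CHANNELS     # number of representable RGB colors


def bits_per_pixel(k):
    """Bits needed to index one of k palette colors (hypothetical indexed storage)."""
    return max(1, (k - 1).bit_length())


print("representable colors:", total_colors)
for k in (1, 8, 256):
    print(f"k={k}: {bits_per_pixel(k)} index bits per pixel "
          f"vs {BITS_PER_CHANNEL * CHANNELS} bits for raw RGB")
```

Under that storage assumption, a smaller k means fewer bits per pixel, at the cost of a coarser palette.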


In [0]:
import os
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

from skimage import io
from sklearn.cluster import KMeans, MiniBatchKMeans

from ipywidgets import interact, IntSlider

In [27]:
from google.colab import drive
drive.mount('/content/drive')
os.chdir('/content/drive/My Drive/Coursera')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [28]:
os.listdir()

['StBasilCathedralMoscow.jpg',
 'NY.jpg',
 'Peacock.jpg',
 'GoldenGateBridgeCalifornia.jpg']

In [29]:
# the @interact decorator creates an interactive interface (dropdown + slider)
@interact
def km_img_compress(image=os.listdir(), k=IntSlider(value=8, min=1, max=299, step=1)):
  input_img = io.imread(image)
  # 3 is the number of channels (RGB)
  # -1 lets NumPy infer the remaining dimension, e.g. a (200,100,3) image becomes (200*100, 3)
  # /255 normalizes to the range [0,1], assuming the max pixel value is 255 (8-bit representation)
  img_data = (input_img/255).reshape(-1,3)

  # MiniBatchKMeans is typically faster than KMeans on large inputs
  kmeans = MiniBatchKMeans(k).fit(img_data)
  # replace each pixel in img_data with the color of its nearest centroid
  k_colors = kmeans.cluster_centers_[kmeans.predict(img_data)]
  k_img = np.reshape(k_colors, (input_img.shape))

  # plot details
  fig, (ax1, ax2) = plt.subplots(1,2)
  fig.suptitle('Image Compression using KMeans', fontsize=18)

  ax1.set_title('Compressed')
  ax1.set_xticks([])
  ax1.set_yticks([])
  ax1.imshow(k_img)

  ax2.set_title('Original ' + str(input_img.shape))
  ax2.set_xticks([])
  ax2.set_yticks([])
  ax2.imshow(input_img)

  plt.subplots_adjust(top=0.85)
  plt.show()


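
The same quantization step can also be tried without widgets or Drive files. The sketch below is not the notebook's own cell: it assumes a synthetic random image as a stand-in for a real photo, and `quantize` is a hypothetical helper name. It fits `MiniBatchKMeans` on the normalized pixels, maps each pixel to its centroid color, and counts distinct colors before and after.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans


def quantize(img, k, seed=0):
    """Return an (H, W, 3) float image in which every pixel is one of at most k centroid colors."""
    pixels = (img / 255).reshape(-1, 3)             # flatten to (H*W, 3), scaled to [0, 1]
    km = MiniBatchKMeans(n_clusters=k, n_init=3, random_state=seed).fit(pixels)
    centroid_colors = km.cluster_centers_[km.predict(pixels)]
    return centroid_colors.reshape(img.shape)


# Synthetic stand-in for a photo (assumption: any 8-bit RGB array works the same way)
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(40, 60, 3), dtype=np.uint8)

k = 8
out = quantize(img, k)

n_before = len(np.unique(img.reshape(-1, 3), axis=0))
n_after = len(np.unique(out.reshape(-1, 3), axis=0))
print(f"distinct colors: {n_before} -> {n_after} (k={k})")
```

The output keeps the original shape, and the number of distinct colors cannot exceed k because every pixel is replaced by one of the k centroids.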