### Basic Image Processing and Classification

### Image Processing

>Image processing is a method of performing operations on an image in order to get an enhanced image or to extract useful information from it.

In [None]:
from osgeo import gdal
import os 
import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path
import pandas as pd
import matplotlib as mpl

In [None]:
os.chdir(r'D:\haridwar\multi')

In [None]:
img_file = 'multi_band.tif'

In [None]:
ds=gdal.Open(img_file)

In [None]:
image=ds.ReadAsArray()

In [None]:
image.shape

In [None]:
image_res=np.stack((image[2],image[1],image[0]), axis=-1)

In [None]:
image_res.shape

In [None]:
image_res

### Rescaling Image

In [None]:
image_rescaled=image_res*2

In [None]:
plt.figure(figsize=(15,5))
plt.subplot(1,2,1)
plt.imshow(image_res)

plt.subplot(1,2,2)
plt.imshow(image_rescaled)

### Geometric Transformation of Images

>OpenCV provides scaled rotation with an adjustable center of rotation, so you can rotate about any location you prefer. The modified transformation matrix is given in the following reference:

            -> see lecture 9 of geoprocessing using Python
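For reference, the matrix for a scaled rotation about an arbitrary center can be written out by hand. The sketch below is NumPy-only and follows the formula given in the OpenCV documentation for `cv2.getRotationMatrix2D`. The helper name `rotation_matrix_2d` and the demo values are hypothetical, and the angle is assumed to be in degrees, as in OpenCV.

```python
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale):
    """Build the 2x3 affine matrix for a scaled rotation about `center`.

    Hypothetical helper following the documented OpenCV formula:
        alpha = scale * cos(angle), beta = scale * sin(angle)
        M = [[ alpha, beta, (1 - alpha) * cx - beta * cy],
             [-beta, alpha, beta * cx + (1 - alpha) * cy]]
    """
    cx, cy = center
    theta = np.deg2rad(angle_deg)
    alpha = scale * np.cos(theta)
    beta = scale * np.sin(theta)
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])

# Made-up center (x=50, y=40), 90 degree rotation, no scaling
M_demo = rotation_matrix_2d((50.0, 40.0), 90, 1)

# The rotation center is a fixed point of the transform
center_mapped = M_demo @ np.array([50.0, 40.0, 1.0])
```

The last column of the matrix is the translation that keeps the chosen center in place, which is why `center_mapped` comes back as the center itself.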

In [2]:
import cv2
import numpy as np

In [None]:
img=image[0]

In [None]:
img[0]

In [None]:
rows,cols=img.shape

In [None]:
M=cv2.getRotationMatrix2D((cols/2,rows/2),90,1)

In [None]:
M

In [None]:
dst=cv2.warpAffine(img,M,(cols, rows))

In [None]:
plt.figure(figsize=(15,5))
plt.subplot(1,2,1)
plt.imshow(img,cmap='gray')

plt.subplot(1,2,2)
plt.imshow(dst,cmap='gray')

### Smoothing Images

### Convolution Filter

>OpenCV provides a function, cv2.filter2D(), to convolve a kernel with an image. As an example, we will try an averaging filter on an image. A 5x5 averaging filter kernel can be defined as follows:

        

In [None]:
kernel=np.ones((5,5),np.float32)/25

In [None]:
kernel

In [None]:
dst=cv2.filter2D(img,-1,kernel)

In [None]:
plt.figure(figsize=(15,5))
plt.subplot(1,2,1)
plt.imshow(img,cmap='gray')
plt.subplot(1,2,2)
plt.imshow(dst,cmap='gray')
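To see what the averaging kernel computes at each pixel, here is a minimal NumPy-only sketch. The helper `box_filter` is hypothetical and only handles positions where the kernel fits entirely inside the image, whereas `cv2.filter2D` also extrapolates the border so the output keeps the input size.

```python
import numpy as np

def box_filter(image, size):
    """Average each size x size neighbourhood (valid region only, no padding)."""
    kernel = np.ones((size, size), np.float32) / (size * size)
    rows, cols = image.shape
    out = np.zeros((rows - size + 1, cols - size + 1), np.float32)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            window = image[r:r + size, c:c + size]
            # weighted sum of the window, i.e. the mean for this kernel
            out[r, c] = np.sum(window * kernel)
    return out

demo = np.zeros((7, 7), np.float32)
demo[3, 3] = 25.0                # a single bright pixel
smoothed = box_filter(demo, 5)   # 3x3 output; every window contains the bright pixel
```

Every 5x5 window in this toy image contains the bright pixel once, so its value of 25 is spread out to an average of 1 at each output position.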

### Image Blurring (Image Smoothing)

>Image blurring is achieved by convolving the image with a low-pass filter kernel. It is useful for removing noise. It removes high-frequency content (e.g., noise, edges) from the image, so edges are blurred when this filter is applied.

In [None]:
blur=cv2.blur(img,(5,5))

In [None]:
gauss=cv2.GaussianBlur(img,(5,5),0)

In [None]:
median=cv2.medianBlur(img,5)

In [None]:
plt.figure(figsize=(15,15))
plt.subplot(2,2,1)
plt.imshow(img,cmap='gray')

plt.subplot(2,2,2)
plt.imshow(blur, cmap='gray')

plt.subplot(2,2,3)
plt.imshow(gauss,cmap='gray')

plt.subplot(2,2,4)
plt.imshow(median,cmap='gray')
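The difference between an averaging filter and a median filter is easiest to see on a one-dimensional signal with a single noisy spike. This is a small NumPy sketch with made-up values. The helper `sliding` is hypothetical and ignores the borders.

```python
import numpy as np

# Flat signal with one noisy spike (made-up values)
signal = np.array([10, 10, 10, 200, 10, 10, 10], dtype=np.float64)

def sliding(values, size, reducer):
    """Apply `reducer` to every full window of length `size` (no padding)."""
    return np.array([reducer(values[i:i + size])
                     for i in range(len(values) - size + 1)])

mean_filtered = sliding(signal, 3, np.mean)      # spike is smeared over its neighbours
median_filtered = sliding(signal, 3, np.median)  # spike is rejected as an outlier
```

The mean spreads the spike into the neighbouring samples, while the median discards it, which is why median filtering is often preferred for isolated noisy pixels.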

### Morphological Transformations

>Morphological transformations are simple operations based on the image shape. They are normally performed on binary images. They need two inputs: the original image and a structuring element, or kernel, which decides the nature of the operation. The two basic morphological operators are erosion and dilation.

### Erosion

>The basic idea of erosion is just like soil erosion: it erodes away the boundaries of the foreground object (always try to keep the foreground in white). So what does it do? The kernel slides through the image (as in 2D convolution). A pixel in the original image (either 1 or 0) will be considered 1 only if all the pixels under the kernel are 1; otherwise it is eroded (made zero).

In [None]:
kernel= np.ones((5,5), np.uint8)
erosion=cv2.erode(img,kernel, iterations = 1)

In [None]:
plt.figure(figsize=(15,5))
plt.subplot(1,2,1)
plt.imshow(img,cmap='gray')

plt.subplot(1,2,2)
plt.imshow(erosion, cmap='gray')
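The erosion rule can be sketched in plain NumPy as a sliding minimum. The helper `erode_binary` is hypothetical. It assumes a square kernel of ones and treats pixels outside the image as background, a simplification that may differ from OpenCV's border handling.

```python
import numpy as np

def erode_binary(mask, size):
    """Erode a 0/1 mask with a size x size square of ones (hypothetical helper).

    A pixel stays 1 only if every pixel under the kernel is 1, which is the
    same as taking the minimum over the window.
    """
    pad = size // 2
    # pixels outside the image count as background (0) in this sketch
    padded = np.pad(mask, pad, mode='constant', constant_values=0)
    out = np.zeros_like(mask)
    for r in range(mask.shape[0]):
        for c in range(mask.shape[1]):
            out[r, c] = padded[r:r + size, c:c + size].min()
    return out

square = np.zeros((7, 7), np.uint8)
square[1:6, 1:6] = 1                # 5x5 white square on a black background
eroded = erode_binary(square, 3)    # shrinks to the inner 3x3 square
```

On a grayscale band such as the one used in this notebook, the same sliding minimum darkens bright regions rather than producing a strict 0/1 result.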

### Dilation

>It is just the opposite of erosion. Here, a pixel element is '1' if at least one pixel under the kernel is '1'. So it increases the white region in the image, or the size of the foreground object increases.

In [None]:
dilation = cv2.dilate(img, kernel, iterations = 1)

In [None]:
plt.figure(figsize=(15,5))
plt.subplot(1,2,1)
plt.imshow(img,cmap='gray')

plt.subplot(1,2,2)
plt.imshow(dilation, cmap='gray')
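Dilation can be sketched the same way as a sliding maximum. The helper `dilate_binary` is hypothetical and assumes a square kernel of ones, with pixels outside the image treated as background.

```python
import numpy as np

def dilate_binary(mask, size):
    """Dilate a 0/1 mask with a size x size square of ones (hypothetical helper).

    A pixel becomes 1 if at least one pixel under the kernel is 1, which is
    the same as taking the maximum over the window.
    """
    pad = size // 2
    # pixels outside the image count as background (0) in this sketch
    padded = np.pad(mask, pad, mode='constant', constant_values=0)
    out = np.zeros_like(mask)
    for r in range(mask.shape[0]):
        for c in range(mask.shape[1]):
            out[r, c] = padded[r:r + size, c:c + size].max()
    return out

dot = np.zeros((7, 7), np.uint8)
dot[3, 3] = 1                     # a single white pixel
dilated = dilate_binary(dot, 3)   # grows into a 3x3 white block
```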

### Canny Edge Detection

> Canny Edge Detection is a popular edge detection algorithm. It was developed by John F. Canny in 1986.

In [None]:
img=(img).astype(np.uint8)

In [None]:
edges=cv2.Canny(img,100,200)

In [None]:
plt.figure(figsize=(15,5))
plt.subplot(1,2,1)
plt.imshow(img,cmap='gray')

plt.subplot(1,2,2)
plt.imshow(edges, cmap='gray')

In [None]:
plt.close()

In [None]:
cv2.destroyAllWindows()

In [None]:
del ds

### Image Classification 

>Image classification assigns pixels in the image to categories or classes of interest.

>It may be considered as a mapping from numbers to symbols. Examples: built-up areas, water bodies, green vegetation, bare soil.

In order to classify a set of data into different classes or categories, the relationship between the data and the classes into which they are classified must be well understood. Computer classification of remotely sensed images involves the computer program learning the relationship between the data and the information classes.

## Types of Learning

### Supervised Learning

>A learning process designed to form a mapping from one set of variables (data) to another set of variables (information classes).

### Unsupervised Learning

>Exploration of the data space to discover the underlying data distribution.

Features are attributes of the data elements on the basis of which the elements are assigned to various classes. In satellite remote sensing, the features are measurements made by sensors in different wavelengths of the electromagnetic spectrum (visible/infrared).

### Supervised Classification 

> The classifier has the advantage of an analyst's experience or domain knowledge, which can guide it in learning the relationship between the data and the classes. The number of classes and the prototype pixels for each class can be identified using this prior knowledge.

### Unsupervised Classification 

> When access to domain knowledge or the experience of an analyst is unavailable or unreliable, the data can still be analyzed by numerical exploration, whereby the data are grouped into subsets or clusters based on statistical similarity.

### K-means Clustering Algorithm 

            randomly choose k examples as initial centroids
            while true:
                create k clusters by assigning each
                    example to closest centroid
                compute k new centroids by averaging
                    examples in each cluster
                if centroids don't change:
                    break
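The pseudocode above can be turned into a short NumPy sketch. The function name `kmeans_sketch`, the iteration cap, and the toy points are assumptions made for illustration. This is not how scikit-learn's `KMeans` is implemented internally.

```python
import numpy as np

def kmeans_sketch(examples, k, seed=0, max_iter=100):
    """Plain k-means following the pseudocode (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # randomly choose k distinct examples as initial centroids
    centroids = examples[rng.choice(len(examples), size=k, replace=False)]
    for _ in range(max_iter):
        # assign each example to its closest centroid (Euclidean distance)
        distances = np.linalg.norm(
            examples[:, None, :] - centroids[None, :, :], axis=2)
        assignments = distances.argmin(axis=1)
        # compute k new centroids by averaging the examples in each cluster;
        # an empty cluster keeps its previous centroid
        new_centroids = np.array([
            examples[assignments == j].mean(axis=0)
            if np.any(assignments == j) else centroids[j]
            for j in range(k)
        ])
        # stop when the centroids no longer change
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, assignments

# Two well-separated groups of made-up 2-D points
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centers, cluster_ids = kmeans_sketch(points, k=2)
```

The cluster numbers themselves are arbitrary. Only the grouping is meaningful, which is also true of the `labels_` returned by scikit-learn.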

In [None]:
img_file = 'multi_band.tif'

In [None]:
ds=gdal.Open(img_file)

In [None]:
data=ds.ReadAsArray()

### K-means Clustering Single Band

In [None]:
from sklearn.cluster import KMeans

In [None]:
# index 2 is taken as the red band, assuming the band order used for the RGB composite earlier
red_data=data[2]

In [None]:
red_data

In [None]:
kmeans=KMeans(n_clusters=5, random_state=0).fit(red_data.reshape(-1,1))

In [None]:
kmeans.labels_

In [None]:
claa_data=kmeans.labels_.reshape(red_data.shape)

In [None]:
fig, ax=plt.subplots(figsize=(10,10))
cax=ax.imshow(claa_data)
fig.colorbar(cax)

### K-means Clustering Multi Band

In [None]:
featured_data=np.stack(tuple(data),axis=-1)

In [None]:
# flatten to (n_pixels, n_bands) without hard-coding the image size
featured_data=featured_data.reshape(-1, featured_data.shape[-1])

In [None]:
featured_data

In [None]:
kmeans = KMeans(n_clusters=4, random_state=0).fit(featured_data)

In [None]:
claa_data=kmeans.labels_.reshape(red_data.shape)

In [None]:
fig, ax= plt.subplots(figsize=(10,10))
cax=ax.imshow(claa_data)
fig.colorbar(cax)

In [None]:
plt.close()

### Supervised Classification 

>Using Distance Matrix for Classification

>Simplest approach is probably nearest neighbor

### Nearest Neighbors Algorithm

#### When predicting the label of a new example

>Find the nearest example in the training data

>Predict the label associated with that example
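The two steps above can be sketched in NumPy as follows. The helper name `nearest_neighbor_predict` and the sample values are made up. For a full image, the pairwise distance array in this sketch would be very large, so a library implementation is the practical choice.

```python
import numpy as np

def nearest_neighbor_predict(train_features, train_labels, new_features):
    """Label each new example with the label of its closest training example."""
    # pairwise Euclidean distances, shape (n_new, n_train)
    distances = np.linalg.norm(
        new_features[:, None, :] - train_features[None, :, :], axis=2)
    nearest_index = distances.argmin(axis=1)  # closest training example per row
    return train_labels[nearest_index]        # look up the label of that example

# Made-up two-band training samples: class 0 is dark, class 1 is bright
train_x = np.array([[10.0, 12.0], [11.0, 9.0], [200.0, 210.0], [190.0, 205.0]])
train_y = np.array([0, 0, 1, 1])
pixels = np.array([[15.0, 14.0], [180.0, 199.0]])
predicted = nearest_neighbor_predict(train_x, train_y, pixels)
```

Note the final lookup step: the nearest-neighbor search yields an index into the training data, and the class comes from the label stored at that index.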

In [None]:
df=pd.read_csv('training_data.csv',index_col=0)

In [None]:
df=df.drop(columns=['Lon','Lat','file'])

In [None]:
df

In [None]:
data=df[['Band2', 'Band3', 'Band4', 'Band5']].to_numpy()

In [None]:
data.shape

In [None]:
df.columns

In [None]:
labels=df['class '].to_numpy()

In [None]:
labels

In [None]:
from sklearn.neighbors import NearestNeighbors

In [None]:
neigh=NearestNeighbors(n_neighbors=1)

In [None]:
neigh.fit(data)

In [None]:
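# kneighbors returns training-set indices; look up the class label of each nearest sample
claa_data=labels[neigh.kneighbors(featured_data, return_distance=False)[:,0]]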

In [None]:
claa_data=claa_data.reshape(1151,1151)

In [None]:
fig, ax = plt.subplots(figsize=(10,10))
cax=ax.imshow(claa_data)
fig.colorbar(cax)

In [None]:
plt.close()

### Decision Tree Classifier

>>Decision trees (DT) are knowledge based.

>>DT are hierarchical rule-based approaches.

>>DT predict class membership by recursively partitioning a dataset into homogeneous subsets. Different variables and splits are then used to divide those subsets further.
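To make "hierarchical rule-based" concrete, the sketch below hard-codes a two-level tree. The band names, thresholds, and sample values are invented for illustration. A fitted `DecisionTreeClassifier` would learn its own splits from the training data.

```python
def classify_pixel(red, nir):
    """Apply a hand-written two-level decision tree to one pixel.

    The thresholds are made up; they only show the shape of the rules.
    """
    if nir < 50:        # first split: very low near-infrared value
        return 'water'
    if red < 60:        # second split, applied only to the remaining pixels
        return 'vegetation'
    return 'builtup'

# (red, nir) pairs, made-up values
samples = [(30, 20), (40, 160), (120, 130)]
predicted_classes = [classify_pixel(red, nir) for red, nir in samples]
```

Each `if` corresponds to one node of the tree, and each `return` to a leaf holding a homogeneous subset.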

In [None]:
from sklearn import tree

In [None]:
clf=tree.DecisionTreeClassifier()

In [None]:
clf=clf.fit(data,labels)

In [None]:
claa_data=clf.predict(featured_data)

In [None]:
claa_data=claa_data.reshape(red_data.shape)

In [None]:
fig, ax=plt.subplots(figsize=(15,15))
cmap=mpl.colors.ListedColormap(['blue', 'red', 'green'])
cax=ax.imshow(claa_data, cmap=cmap)
cbar=fig.colorbar(cax, ticks=[0,1,2],orientation='horizontal')
cbar.ax.set_xticklabels(['water', 'Builtup', 'Vegetation'])

### Random Forest Classifier

> Random forest is an ensemble method, which combines the predictions of several base estimators. A random forest is a classifier that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.
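The averaging step can be illustrated without fitting anything. In the sketch below, the per-tree class probabilities are made-up numbers standing in for the outputs of three fitted trees. The forest prediction is the class with the highest mean probability, which is how scikit-learn documents `RandomForestClassifier.predict`.

```python
import numpy as np

# Made-up class probabilities: 3 trees x 2 pixels x 3 classes
tree_probabilities = np.array([
    [[0.9, 0.1, 0.0], [0.2, 0.5, 0.3]],
    [[0.6, 0.4, 0.0], [0.1, 0.3, 0.6]],
    [[0.8, 0.0, 0.2], [0.3, 0.6, 0.1]],
])

mean_probabilities = tree_probabilities.mean(axis=0)    # average over the trees
forest_prediction = mean_probabilities.argmax(axis=1)   # most probable class per pixel
```

For the second pixel one tree votes for class 2, but the averaged probabilities still favour class 1, which shows how the ensemble smooths out the errors of individual trees.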

In [None]:
from sklearn.ensemble import RandomForestClassifier

In [None]:
clf=RandomForestClassifier(max_depth=20, random_state=0)

In [None]:
clf=clf.fit(data, labels)

In [None]:
rf_data=clf.predict(featured_data)

In [None]:
rf_data=rf_data.reshape(red_data.shape)

In [None]:
fig, ax= plt.subplots(figsize=(15,15))
cmap=mpl.colors.ListedColormap(['blue', 'red', 'green'])
cax=ax.imshow(rf_data, cmap=cmap)
cbar=fig.colorbar(cax, ticks=[0,1,2],orientation='horizontal')
cbar.ax