# Image recognition Wetterskip Fryslan #

The task is to distinguish water from land on aerial and satellite photos. This would let the water board automatically detect changes in the water level.

### Approach ###

A satellite photo is cut into tiles, which serve as the training data for a classifier.

A black (water) / white (no water) version of the same photo is used to derive the labels (water, bank, land).
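
The labelling idea can be sketched on toy data. This is a minimal illustration, assuming an 8-bit single-channel mask in which white is 255; the helper name `tile_label` is hypothetical and does not appear in the notebook.

```python
import numpy as np

def tile_label(mask_tile, white=255):
    """Classify one tile of a black/white mask as 'water', 'bank' or 'land'."""
    fraction_white = mask_tile.mean() / white  # 0.0 = all black, 1.0 = all white
    if fraction_white == 0.0:
        return "water"
    if fraction_white == 1.0:
        return "land"
    return "bank"  # mixed tile: partly water, partly land

# Toy tiles (assumed data, not taken from the real image)
water = np.zeros((10, 10), dtype=np.uint8)
land = np.full((10, 10), 255, dtype=np.uint8)
bank = np.hstack([water[:, :5], land[:, :5]])

print(tile_label(water), tile_label(land), tile_label(bank))
```

Only tiles that are entirely black or entirely white get a clean class; everything in between counts as bank.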

### Functions, imports and global variables ###

In [1]:
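def sliding_window(image, stepSize, windowSize):
    # slide a window of windowSize = (width, height) across the image
    for y in range(0, image.shape[0], stepSize):
        for x in range(0, image.shape[1], stepSize):
            # yield the top-left corner and the current window
            yield (x, y, image[y:y + windowSize[1], x:x + windowSize[0]])

def tegelen(image, size):
    # cut the image into size x size tiles, row by row;
    # rows and columns are iterated separately so non-square images also work
    # (assumes both dimensions are multiples of size)
    return np.array([image[r:r + size, c:c + size]
                     for r in range(0, image.shape[0], size)
                     for c in range(0, image.shape[1], size)])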
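
The same tiling can be done without a Python loop by reshaping the array. The sketch below is an alternative, not the notebook's own method; `tile_by_reshape` is a hypothetical name, and the function crops any rows or columns that do not fill a whole tile.

```python
import numpy as np

def tile_by_reshape(image, size):
    """Cut a 2-D or 3-D image into non-overlapping size x size tiles."""
    rows, cols = image.shape[0] // size, image.shape[1] // size
    cropped = image[:rows * size, :cols * size]  # drop incomplete edge tiles
    # Split each spatial axis into (tile index, offset inside the tile) ...
    tiles = cropped.reshape(rows, size, cols, size, *image.shape[2:])
    # ... then bring both tile indices to the front and flatten them.
    return tiles.swapaxes(1, 2).reshape(rows * cols, size, size, *image.shape[2:])

demo = np.arange(6 * 8).reshape(6, 8)  # small non-square test image
tiles = tile_by_reshape(demo, 2)
print(tiles.shape)
print(tiles[1])
```

Tiles come out in row-major order, so `tiles[1]` is the second tile of the first tile row, the same ordering the list comprehension in `tegelen` produces.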


In [61]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from scipy import ndimage as ndi
from itertools import chain
from time import time
#import logging

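# skimage.external was removed in newer scikit-image releases; there, use `import tifffile` instead
from skimage.external import tifffile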
from skimage.color import rgb2gray
from skimage.io import imshow
from skimage.feature import local_binary_pattern

import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.svm import LinearSVC

## Creating the training data ##

### Reading the image ###
Read the image and split it into tiles to obtain more training data.

In [3]:
im = tifffile.imread('water.tif', key=0)

In [5]:
#split the image into square tiles with side z
z = 10
tegels = tegelen(im,z)

In [6]:
#Convert to grayscale
#List of 10,000 entries of 10 x 10 values
tegel_grys = [rgb2gray(tegel) for tegel in tegels]

In [None]:
#Show one tile
plt.imshow(tegel_grys[9002], cmap=cm.gray)

### Reading the black/white image ###
Black/white image. WHITE = land. BLACK = water.

In [7]:
im_lab = tifffile.imread('water_labels.tif', key=0)

In [None]:
plt.imshow(im_lab, cmap=cm.gray)

In [8]:
#split the image into square tiles with side z
#10,000 entries of 10 x 10
tegels_lab = tegelen(im_lab,z)

In [9]:
#Reduce each tile to a single value: 1.0 = land, 0.0 = water, anything in between is bank.
#255 * z * z = 25500 for z = 10 (8-bit, single-channel mask)
labels = [np.sum(tegel) / (255 * z * z) for tegel in tegels_lab]

##### Adding the labels to the image #####
Convert the image to a DataFrame and add the labels.

In [10]:
tegels_flat = [list(chain.from_iterable(tegel)) for tegel in tegel_grys]
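
Flattening each tile into one row of 100 values can also be done with a single NumPy reshape. The sketch below compares both routes on made-up data; `tiles_demo` is a hypothetical stand-in for `tegel_grys`.

```python
from itertools import chain
import numpy as np

# Hypothetical stand-in for tegel_grys: 4 grayscale tiles of 10 x 10 values.
tiles_demo = np.random.default_rng(0).random((4, 10, 10))

# Route used in the notebook: chain the rows of every tile into one list.
flat_chain = [list(chain.from_iterable(tile)) for tile in tiles_demo]

# Vectorised route: one row of 100 values per tile.
flat_reshape = tiles_demo.reshape(len(tiles_demo), -1)

print(flat_reshape.shape)
print(np.array_equal(np.array(flat_chain), flat_reshape))
```

Both routes give the same row-major pixel order, so either result can be passed to `pd.DataFrame`.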

In [11]:
tf_df = pd.DataFrame(tegels_flat)

In [13]:
tf_df['label'] = labels

In [15]:
#Keep only land and water.
tf_df.drop(tf_df[(tf_df.label < 1.0) & (tf_df.label > 0.0)].index, inplace=True)

In [39]:
y = (tf_df['label'])
X = tf_df.iloc[:,0:100]

In [43]:
y = y.astype(int)

In [45]:
#Split the data into train and test sets (test_size=0.75 holds out 75% of the tiles for testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.75, random_state=0)
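
Because water and land tiles are not equally frequent, a stratified split may be worth considering. The sketch below uses synthetic data and a more conventional 25% test share. Both are assumptions for illustration, not the notebook's settings. `stratify` is an existing parameter of `train_test_split`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data: 100 "water" rows (0) and 300 "land" rows (1).
rng = np.random.default_rng(0)
X_demo = rng.random((400, 100))
y_demo = np.array([0] * 100 + [1] * 300)

# stratify=y_demo keeps the class proportions the same in both parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=0)

print("share of land in train:", y_tr.mean())
print("share of land in test: ", y_te.mean())
```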

### Training a classifier ###

In [50]:
clf = LinearSVC()

In [52]:
clf.fit(X_train, y_train)

LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
     intercept_scaling=1, loss='squared_hinge', max_iter=1000,
     multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
     verbose=0)
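
A variation that might be worth trying is to standardise the pixel features and weight the classes by their inverse frequency. The sketch below runs on synthetic data that only imitates darker water tiles and brighter land tiles, so its score says nothing about the real image. `make_pipeline`, `StandardScaler` and the `class_weight="balanced"` option are existing scikit-learn features; the data and variable names are assumptions.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# Synthetic stand-in for flattened 10 x 10 grayscale tiles (not the real data):
# "water" tiles are darker on average than "land" tiles.
water = rng.normal(0.2, 0.05, size=(100, 100))
land = rng.normal(0.6, 0.05, size=(300, 100))
X_demo = np.vstack([water, land])
y_demo = np.array([0] * 100 + [1] * 300)

# Scale every feature, then fit a linear SVM that weights the rarer class more heavily.
model = make_pipeline(StandardScaler(),
                      LinearSVC(class_weight="balanced", random_state=0))
model.fit(X_demo, y_demo)
print(model.score(X_demo, y_demo))  # training accuracy on the synthetic data
```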

In [62]:
#How well does the model perform?

print("Predicting Land (1) or Water (0)")
t0 = time()
y_pred = clf.predict(X_test)
print("done in %0.3fs" % (time() - t0))

print(classification_report(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
print(accuracy_score(y_test, y_pred))

Predicting Land (1) or Water (0)
done in 0.007s
             precision    recall  f1-score   support

          0       1.00      0.00      0.00      1801
          1       0.75      1.00      0.86      5342

avg / total       0.81      0.75      0.64      7143

[[   4 1797]
 [   0 5342]]
0.74842503149937
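
The numbers above can be put in context by recomputing a few quantities from the confusion matrix. The sketch copies the matrix from the output; the variable names are illustrative.

```python
import numpy as np

# Confusion matrix copied from the output above:
# rows = true class (0 = water, 1 = land), columns = predicted class.
cm_counts = np.array([[4, 1797],
                      [0, 5342]])

total = cm_counts.sum()
accuracy = np.trace(cm_counts) / total               # correct predictions / all tiles
water_recall = cm_counts[0, 0] / cm_counts[0].sum()  # water tiles recognised as water
always_land_accuracy = cm_counts[1].sum() / total    # score of always predicting land

print(f"accuracy             {accuracy:.4f}")
print(f"water recall         {water_recall:.4f}")
print(f"always-land accuracy {always_land_accuracy:.4f}")
```

The accuracy is almost identical to what a model that always answers "land" would reach, and nearly all water tiles are missed. In this run the classifier has therefore learned little beyond the majority class.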
