# Satellite Image Classification Project
## Project Overview
This script build and validate a classification model using satellite image data.
## Introduction
There are three different classification models applied:
1. Random Forest
2. Extra Tree
3. Bagging

Performance of each  model will be assesed visually by confusion matrix. Further preformance report is printed followed by Kohen Kappa Score values.

In [None]:
%matplotlib inline

## Python modules in use
To build and validate model performance some libraries were necessary. Version of libriaries working with this scirpt are included in *requirements.txt* [file](https://github.com/pciuh/satellite-image-classification/blob/main/requirements.txt)

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.interpolate as sci
import seaborn as sns

from matplotlib import cm
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier, BaggingClassifier
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report, confusion_matrix, cohen_kappa_score

In [13]:
########### Random Seed for Split Data
SEED = 30082024

iDir = 'input/'

fnam = 'labels.csv'
df = pd.read_csv(iDir + fnam,sep=',')
lbl = df.label.drop_duplicates().to_numpy()

In [9]:
_,num = np.unique(df.label,return_counts=True)

variables=['x','y','band4','band3','band2']
variables=['x','y','band1','band2','band3','band4','band5','band6']

X = df[variables].to_numpy()
Y = []
for i,l in enumerate(lbl):
    for ni in range(num[i]):
        Y=np.append(Y,i)

TIT = {'RF' : 'Random Forest', 'ET' : 'Extra Tree', 'BA' : 'Bagging'}

mvec = [ExtraTreesClassifier(),
        RandomForestClassifier(),
        BaggingClassifier()]

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=.1, random_state = SEED)

print('Total Size:',X.shape,Y.shape)
print('Train Size:',X_train.shape, Y_train.shape)
print(' Test Size:',X_test.shape, Y_test.shape)

Total Size: (319, 8) (319,)
Train Size: (287, 8) (287,)
 Test Size: (32, 8) (32,)


In [10]:
pvec = []
for v in mvec:
    v.fit(X_train, Y_train)
    pvec.append(v.predict(X_test))

In [11]:
ofnam = 'class_report-%.8d.txt'%SEED
of = open(ofnam,'w')
of.write('            %10s%10s%10s\n'%('Samples','Category','Outcome'))
of.write('Total Size: %10.0f%10.0f%10.0f\n'%(X.shape[0],X.shape[1],Y.shape[0]))
of.write('Train Size: %10.0f%10.0f%10.0f\n'%(X_train.shape[0],X_train.shape[1],Y_train.shape[0]))
of.write(' Test Size: %10.0f%10.0f%10.0f\n'%(X_test.shape[0],X_test.shape[1],Y_test.shape[0]))

p_crf,p_cet,p_cba = pvec
mNam = ['RF','ET','BA']

of.write('\n%36s\n'%'Model Performace')
print('\nModel Performance')
cfm = []
for i,v in enumerate(pvec):
    print('\n%18s:'%TIT[mNam[i]])
    print(classification_report(Y_test, v, target_names=lbl))
    print('Kappa Score:',cohen_kappa_score(Y_test, v))

    of.write('\n%14s:\n'%TIT[mNam[i]])
    of.write(classification_report(Y_test, v, target_names=lbl))
    of.write('\nKappa Score:%12.3f\n'%(cohen_kappa_score(Y_test,v))) 

    cfm.append(confusion_matrix(Y_test, v))
of.close()


Model Performance

     Random Forest:
              precision    recall  f1-score   support

       road1       1.00      0.50      0.67         2
       road2       0.75      1.00      0.86         3
       grass       0.67      0.50      0.57         4
       water       1.00      1.00      1.00         1
       roof1       1.00      0.50      0.67         2
       roof2       0.67      1.00      0.80         2
       roof3       1.00      1.00      1.00         2
        tree       0.71      0.83      0.77         6
      shadow       0.67      0.57      0.62         7
        soil       0.25      0.33      0.29         3

    accuracy                           0.69        32
   macro avg       0.77      0.72      0.72        32
weighted avg       0.72      0.69      0.68        32

Kappa Score: 0.6400449943757031

        Extra Tree:
              precision    recall  f1-score   support

       road1       1.00      1.00      1.00         2
       road2       1.00      1.00      

## Data visualization