# Get the overall confusion matrix

This notebook takes the aggregated predictions generated in the previous notebook and arranges them in a confusion matrix. We make the criterion for commission/omission quite strict: a parcel counts as an outlier only if all 3 predicted class labels agree with each other and are NOT EQUAL to the labeled class. Relaxing this rule will result in more confusion.
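
As a minimal sketch of the strict rule described above, the check for a single parcel could be written as follows. The function name `is_outlier`, the `min_votes` parameter, and the example tuples are hypothetical and only serve as illustration; the notebook itself applies the same condition inline as `cnt > 2 and n != m`.

```python
def is_outlier(labeled_class, majority_class, majority_count, min_votes=3):
    """Return True only when all `min_votes` predictions agree on a class
    that differs from the labeled class."""
    return majority_count >= min_votes and majority_class != labeled_class


# Hypothetical parcels: (labeled class, majority predicted class, votes for majority)
examples = [
    (0, 0, 3),  # unanimous and equal to the label: conformist
    (0, 1, 3),  # unanimous and different from the label: outlier
    (0, 1, 2),  # only 2 of 3 predictions agree: not counted as an outlier
]

flags = [is_outlier(*e) for e in examples]
print(flags)
```

Lowering `min_votes` to 2 would correspond to relaxing the rule, which flags more parcels as confused.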

We show results both in terms of parcel counts and in terms of the total area of the parcels. Not surprisingly, the overall accuracy by area is significantly higher than by count, as most confusion occurs with smaller parcels.
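
The difference between the two accuracies can be illustrated with a small, made-up example. The arrays below are hypothetical (two classes, five parcels), and the vectorized `np.add.at` accumulation is an alternative to the row-by-row loop used later in this notebook, not the notebook's own code.

```python
import numpy as np

# Hypothetical parcels: labeled class, majority class, majority count, area (ha)
labeled = np.array([0, 0, 1, 1, 1])
majority = np.array([0, 1, 1, 0, 0])
votes = np.array([3, 3, 3, 2, 3])
area = np.array([50.0, 0.5, 30.0, 10.0, 0.5])

# Strict rule: confused only if all 3 predictions agree and differ from the label
confused = (votes > 2) & (labeled != majority)
# Confused parcels go to the majority-class column, all others to the diagonal
col = np.where(confused, majority, labeled)

ccnt = np.zeros((2, 2))
carea = np.zeros((2, 2))
np.add.at(ccnt, (labeled, col), 1)       # accumulate parcel counts
np.add.at(carea, (labeled, col), area)   # accumulate parcel areas

acc_count = 100.0 * ccnt.trace() / ccnt.sum()
acc_area = 100.0 * carea.trace() / carea.sum()
print(f"Overall accuracy (count): {acc_count:.1f}")
print(f"Overall accuracy (area):  {acc_area:.1f}")
```

Because the two confused parcels in this toy set are small (0.5 ha each), they weigh heavily in the count-based accuracy but hardly affect the area-based one.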



In [None]:
import psycopg2
import pandas as pd
import numpy as np
from collections import Counter

In [None]:
# Connect to the database
conn = psycopg2.connect(
    host="localhost",
    database="postgres",
    user="postgres",
    password="")
cur = conn.cursor()

In [None]:
# Set the table names
parcels_table = "aoi2020"
bs_tensorflow = "bs_tensorflow"

# Set the folder to store the data
data_folder = ''

In [1]:

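# Join the aggregated predictions with the parcel geometries.
# The area is divided by 10000 to convert from square meters to hectares
# (this assumes the geometries are stored in a metric projection).
classLabelSql = f"""
    SELECT pid::int, class::int, majclass::int, majcount::int,
        st_area(wkb_geometry)/10000.0 as area
    FROM {bs_tensorflow} tf, {parcels_table}
    WHERE tf.pid = {parcels_table}.ogc_fid
    """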

df = pd.read_sql_query(classLabelSql, conn)

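# Confusion matrices by area and by parcel count.
# Rows are the labeled class, columns the assigned class.
carea = np.zeros((11, 11))
ccnt = np.zeros((11, 11))

for i in df.index:
    row = df.loc[i]
    n = int(row['class'])       # labeled class
    m = int(row['majclass'])    # majority predicted class
    area = float(row['area'])
    cnt = int(row['majcount'])  # number of predictions that agree on majclass
    if (cnt > 2) and (n != m):
        # All 3 predictions agree and differ from the label: confusion
        carea[n,m] += area
        ccnt[n,m] += 1
    else:
        # Otherwise the parcel is counted on the diagonal
        carea[n,n] += area
        ccnt[n,n] += 1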

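np.set_printoptions(suppress=True, precision=0)
# Use fully qualified option names; the short form 'precision' is ambiguous in recent pandas versions
pd.set_option('display.expand_frame_repr', False)
pd.set_option('display.precision', 1)

print("Overall Accuracy (area): ", 100.0*carea.trace()/carea.sum())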

# These are the ordered labels for class 0 to 10
crops = ['GRA', 'MAI', 'POT', 'WWH', 'SBT', 'WBA', 'WOR', 'SCE', 'WCE', 'VEG', 'FAL']

print(pd.DataFrame(carea, index = crops, columns = crops))

print("\n\nOverall Accuracy (count): ", 100.0*ccnt.trace()/ccnt.sum())

print(pd.DataFrame(ccnt, index = crops, columns = crops))



Overall Accuracy (area):  94.26771705313865
          GRA      MAI      POT       WWH    SBT      WBA      WOR       SCE      WCE      VEG    FAL
GRA  193017.4    783.2     37.3     386.0    0.9     74.2     15.3    1359.9   1306.8     85.1  199.0
MAI     790.4  95602.4     78.6      16.2    2.7     17.9      6.0    1074.9     78.5     15.1    6.6
POT     161.3    463.6  13388.4       1.3   35.1     25.9      0.0     367.6     36.5    202.7    0.4
WWH     492.6     55.0     22.1  119928.1    0.0    141.9     38.5    2561.3    312.7      0.0    0.4
SBT       8.8    322.9    550.1       0.0  836.3      0.0      0.0      22.0      1.9      0.0    0.0
WBA     308.3      1.8     11.3     812.5    0.0  28612.2      9.1     299.9    260.0      0.0    0.0
WOR      51.4    100.1      4.2     178.3    0.0    141.8  43265.4     438.0    100.2    135.8    6.0
SCE    3008.9   1870.2    228.0     878.1    0.0     91.9     67.0  238449.1   2196.9    150.9   59.9
WCE    1362.0    327.3      2.6     74
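
Per-class omission and commission rates can be read from a matrix laid out like the one above (rows are labeled classes, columns are assigned classes). The sketch below uses a made-up 3x3 matrix, not the notebook's results, and the variable names `cm`, `omission`, and `commission` are hypothetical.

```python
import numpy as np

# Made-up confusion matrix: rows = labeled class, columns = assigned class
cm = np.array([
    [90.0,  5.0,  5.0],
    [10.0, 80.0, 10.0],
    [ 0.0,  5.0, 95.0],
])

diag = np.diag(cm)

# Omission: share of a labeled class (row) that was assigned to another class
omission = 100.0 * (1.0 - diag / cm.sum(axis=1))

# Commission: share of an assigned class (column) that carries another label
commission = 100.0 * (1.0 - diag / cm.sum(axis=0))

print("Omission (%):  ", np.round(omission, 1))
print("Commission (%):", np.round(commission, 1))
```

The same two lines could be applied to `carea` or `ccnt` once the full matrices are available.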

You can now move back to the general notebook session to select and plot the outlier signatures versus their neighboring conformist parcels.


In [None]:
# Close database connection
cur.close()
conn.close()