# Assessing analysts' accuracy at labelling reference data

Collect Earth Online (CEO) is being used as a tool for collecting cropland reference data. The sample data contains 'known' labels seeded among the other samples. This script compares the known test labels (GFSAD's validation data) against the labels collected by the analysts.

Inputs will be:

1. `ceo-data....csv` : The results from collecting training data in the CEO tool

Output will be:

1. A `confusion matrix` (error matrix) containing the Overall, Producer's, and User's accuracy, along with the F1 score.

***

In [1]:
import pandas as pd
import numpy as np
import seaborn as sn
import geopandas as gpd
import matplotlib.pyplot as plt
from sklearn.metrics import f1_score

## Analysis Parameters

In [2]:
folder = 'data/training_validation/collect_earth/eastern/'
csv = 'data/training_validation/collect_earth/eastern/ceo-cropland-reference-data-acquisition---eastern-region-sample-data-2020-10-06.csv'

### Load the dataset

In [3]:
# load the CEO results csv
df = pd.read_csv(csv)

### Clean up dataframe


In [4]:
# use this line for the test sample:
# df = df[['LON', 'LAT', 'SMPL_CLASS','IS THE SAMPLE AREA ENTIRELY: CROP, NON-CROP, MIXED, OR UNSURE?']]

# use this line for the entire dataset:
df = df[['LON', 'LAT', 'SMPL_SAMPLEID', 'SMPL_GFSAD_SAMP','SMPL_CLASS','IS THE SAMPLE AREA ENTIRELY: CROP, NON-CROP, MIXED, OR UNSURE?']]

#rename columns
df = df.rename(columns={'IS THE SAMPLE AREA ENTIRELY: CROP, NON-CROP, MIXED, OR UNSURE?':'Prediction',
                        'SMPL_CLASS':'Actual'})

#remove nan rows
df = df.dropna()
df.head()

|   | LON | LAT | SMPL_SAMPLEID | SMPL_GFSAD_SAMP | Actual | Prediction |
|---|---|---|---|---|---|---|
| 0 | 36.707049 | 12.103541 | 0 | 0 | 2 | mixed |
| 1 | 36.681716 | 4.016547 | 1 | 0 | 1 | non-crop |
| 2 | 44.957895 | 8.458357 | 2 | 0 | 1 | non-crop |
| 3 | 40.820614 | 9.178447 | 3 | 0 | 2 | mixed |
| 4 | 33.940957 | -8.582325 | 4 | 0 | 1 | non-crop |


***
If this is the `test sample` (the first 50-100 samples used for training analysts), skip the following cell.

If this is the reference data sample (2100 points), run the cell below to extract the GFSAD validation samples before running the rest of the code.


In [5]:
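# keep only the seeded GFSAD validation samples; copy so later column assignments act on an independent dataframe
df = df[df['SMPL_GFSAD_SAMP']==True].copy()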

***

### Reclassify prediction & actual columns

* 1 = crop
* 0 = non-crop

In [7]:
df['Prediction'] = np.where(df['Prediction']=='non-crop', 0, df['Prediction'])
df['Prediction'] = np.where(df['Prediction']=='crop', 1, df['Prediction'])

df['Actual'] = np.where(df['Actual']==1, 0, df['Actual'])
df['Actual'] = np.where(df['Actual']==2, 1, df['Actual'])
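
The same recoding can also be expressed with lookup dictionaries, which keeps the class coding in one place. The following is a minimal sketch, assuming the coding used in the cell above (GFSAD class 1 = non-crop, 2 = crop); the `demo` dataframe and the two `*_map` names are hypothetical and only illustrate the idea.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned CEO dataframe
demo = pd.DataFrame({
    'Actual': [1, 2, 2, 1],
    'Prediction': ['non-crop', 'crop', 'mixed', 'unsure'],
})

# Assumed coding, taken from the cell above
actual_map = {1: 0, 2: 1}                    # GFSAD: 1 -> non-crop (0), 2 -> crop (1)
prediction_map = {'non-crop': 0, 'crop': 1}  # analyst labels

demo['Actual'] = demo['Actual'].map(actual_map)

# Labels without a mapping ('mixed', 'unsure') are passed through unchanged
demo['Prediction'] = demo['Prediction'].map(lambda v: prediction_map.get(v, v))

print(demo)
```

Unlike chained `np.where` calls, the dictionary version does not depend on the order in which the classes are recoded.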

### Generate a confusion matrix with all classes

In [8]:
confusion_matrix = pd.crosstab(df['Actual'],
                               df['Prediction'],
                               rownames=['Actual'],
                               colnames=['Prediction'],
                               margins=True)

confusion_matrix

| Actual \ Prediction | 0 | 1 | mixed | unsure | All |
|---|---|---|---|---|---|
| 0 | 44 | 1 | 3 | 2 | 50 |
| 1 | 1 | 40 | 6 | 3 | 50 |
| All | 45 | 41 | 9 | 5 | 100 |


### Reclassify into a binary assessment

In [9]:
# boolean masks; .sum() returns 0 rather than failing when a label is absent
is_mixed = df['Prediction'] == 'mixed'
is_unsure = df['Prediction'] == 'unsure'

print("Total number of samples: " + str(len(df)))
print("Number of 'mixed' samples: " + str(is_mixed.sum()))
print("Number of 'unsure' samples: " + str(is_unsure.sum()))

print("Dropping 'mixed' and 'unsure' samples")

# keep only rows labelled crop (1) or non-crop (0)
df = df[~(is_mixed | is_unsure)]

Total number of samples: 100
Number of 'mixed' samples: 9
Number of 'unsure' samples: 5
Dropping 'mixed' and 'unsure' samples


---

### Recreate confusion matrix

In [10]:
confusion_matrix = pd.crosstab(df['Actual'],
                               df['Prediction'],
                               rownames=['Actual'],
                               colnames=['Prediction'],
                               margins=True)

confusion_matrix

| Actual \ Prediction | 0 | 1 | All |
|---|---|---|---|
| 0 | 44 | 1 | 45 |
| 1 | 1 | 40 | 41 |
| All | 45 | 41 | 86 |


### Calculate User's and Producer's Accuracy

`User's Accuracy`

In [11]:
confusion_matrix["User's"] = [confusion_matrix.loc[0, 0] / confusion_matrix.loc[0, 'All'] * 100,
                              confusion_matrix.loc[1, 1] / confusion_matrix.loc[1, 'All'] * 100,
                              np.nan]

`Producer's Accuracy`

In [12]:
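# column-wise ratios: correctly labelled samples / column totals
producers_accuracy = pd.Series([confusion_matrix.loc[0, 0] / confusion_matrix.loc['All', 0] * 100,
                                confusion_matrix.loc[1, 1] / confusion_matrix.loc['All', 1] * 100]
                         ).rename("Producer's")

# DataFrame.append was removed in pandas 2.0; add the new row by label instead
confusion_matrix.loc["Producer's"] = producers_accuracy.reindex(confusion_matrix.columns)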

`Overall Accuracy`

In [13]:
confusion_matrix.loc["Producer's", "User's"] = (confusion_matrix.loc[0, 0] + 
                                                confusion_matrix.loc[1, 1]) / confusion_matrix.loc['All', 'All'] * 100
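
As a sanity check on the three cells above, the same ratios can be computed directly from the four counts in the binary confusion matrix. This is a small sketch with NumPy only; the array `cm` is typed in by hand from the matrix shown earlier, and the variable names are illustrative.

```python
import numpy as np

# Counts from the binary confusion matrix (rows = Actual, columns = Prediction)
cm = np.array([[44, 1],
               [1, 40]])

diag = np.diag(cm)

# diagonal / row totals (the column labelled "User's" in this notebook)
row_acc = diag / cm.sum(axis=1) * 100

# diagonal / column totals (the row labelled "Producer's" in this notebook)
col_acc = diag / cm.sum(axis=0) * 100

# all correctly labelled samples / all samples
overall = diag.sum() / cm.sum() * 100

print(row_acc.round(2), col_acc.round(2), round(overall, 2))
```

For these counts the row-wise and column-wise ratios coincide (44/45 and 40/41) because the matrix is symmetric; in general they differ.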

`F1 Score`

The F1 score is the harmonic mean of the precision and recall, where an F1 score reaches its best value at 1 (perfect precision and recall), and is calculated as:

$$
\begin{aligned}
\text{Fscore} = 2 \times \frac{\text{UA} \times \text{PA}}{\text{UA} + \text{PA}}.
\end{aligned}
$$

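where UA = User's Accuracy and PA = Producer's Accuracy. Because UA and PA are stored as percentages in the table, the code below divides the result by 100 so the F-score is reported on a 0-1 scale.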
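
To make the formula concrete, here is a short worked sketch using the counts from the binary confusion matrix (44 of 45 non-crop and 40 of 41 crop samples labelled correctly). The helper `f_score` is a hypothetical name introduced for illustration; it takes the accuracies as percentages and returns a value between 0 and 1.

```python
def f_score(ua, pa):
    """Harmonic mean of two accuracies given in percent, returned on a 0-1 scale."""
    if ua + pa == 0:
        return 0.0  # avoid division by zero when both accuracies are zero
    return 2 * (ua * pa) / (ua + pa) / 100

# UA and PA are equal for each class in this example (44/45 and 40/41)
non_crop = f_score(44 / 45 * 100, 44 / 45 * 100)
crop = f_score(40 / 41 * 100, 40 / 41 * 100)

print(round(non_crop, 2), round(crop, 2))
```

When UA and PA are equal, the harmonic mean equals that shared value, which is why both classes round to 0.98 here.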

In [14]:
# class 0 (non-crop) is computed from the table, class 1 (crop) with scikit-learn
ua_0 = confusion_matrix.loc[0, "User's"]
pa_0 = confusion_matrix.loc["Producer's", 0]

fscore = pd.Series([(2 * (ua_0 * pa_0) / (ua_0 + pa_0)) / 100,
                    f1_score(df['Actual'].astype(np.int8), df['Prediction'].astype(np.int8), average='binary')]
                         ).rename("F-score")

# DataFrame.append was removed in pandas 2.0; add the new row by label instead
confusion_matrix.loc["F-score"] = fscore.reindex(confusion_matrix.columns)
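
Since the F1 definition above is stated in terms of precision and recall, the crop-class value can be cross-checked with scikit-learn's own metric functions. The sketch below rebuilds label vectors from the binary confusion matrix counts rather than reading the CEO csv, so `actual` and `prediction` are illustrative stand-ins for `df['Actual']` and `df['Prediction']`.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Rebuild label vectors from the binary confusion matrix counts
# (rows = Actual, columns = Prediction): [[44, 1], [1, 40]]
actual = np.array([0] * 45 + [1] * 41)
prediction = np.array([0] * 44 + [1] * 1 + [0] * 1 + [1] * 40)

# Scores for the crop class (pos_label=1 is scikit-learn's default)
precision = precision_score(actual, prediction)
recall = recall_score(actual, prediction)
f1 = f1_score(actual, prediction, average='binary')

# Fraction of all samples labelled correctly
overall = accuracy_score(actual, prediction)

print(round(precision, 4), round(recall, 4), round(f1, 4), round(overall, 4))
```

With one false positive and one false negative, precision and recall for the crop class are both 40/41, so the F1 score is also 40/41, and the overall accuracy is 84/86.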

### Tidy Confusion Matrix

* Limit decimal places
* Add readable class names
* Remove nonsensical values

In [15]:
# round numbers
confusion_matrix = confusion_matrix.round(decimals=2)

In [16]:
# rename integer class codes to class names
confusion_matrix = confusion_matrix.rename(columns={0:'Non-crop', 1:'Crop', 'All':'Total'},
                                            index={0:'Non-crop', 1:'Crop', 'All':'Total'})

In [17]:
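# remove the nonsensical values in the table
# cast to object first so that placeholder strings can sit alongside numbers
confusion_matrix = confusion_matrix.astype(object)
confusion_matrix.loc['Total', "User's"] = '--'
confusion_matrix.loc["Producer's", 'Total'] = '--'
confusion_matrix.loc["F-score", 'Total'] = '--'
confusion_matrix.loc["F-score", "User's"] = '--'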

In [18]:
confusion_matrix

| Actual \ Prediction | Non-crop | Crop | Total | User's |
|---|---|---|---|---|
| Non-crop | 44.0 | 1.0 | 45 | 97.78 |
| Crop | 1.0 | 40.0 | 41 | 97.56 |
| Total | 45.0 | 41.0 | 86 | -- |
| Producer's | 97.78 | 97.56 | -- | 97.67 |
| F-score | 0.98 | 0.98 | -- | -- |


### Export csv

In [None]:
confusion_matrix.to_csv(folder+ 'test_sample_results.csv')