<table class="ee-notebook-buttons" align="center">
    <td><a target="_blank"  href="https://colab.research.google.com/github/yotarazona/scikit-eo/blob/main/examples/02.%20Estimated%20area%20and%20uncertainty%20in%20Machine%20Learning.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" /> Run in Google Colab</a></td>
</table>

# **_<div class="alert alert-success"><font color='darkred'> Tutorials: 02 Estimating area and uncertainty in Machine Learning</font></div>_**

## 1.0 Libraries

Install ```scikit-eo``` (the ```scikeo``` package), ```rasterio```, and ```geopandas``` with the following line:

In [None]:
!pip install scikeo rasterio geopandas

Libraries to be used:

In [3]:
import rasterio
import numpy as np
import pandas as pd
from scikeo.process import confintervalML

Mount Google Drive to access the data (this step applies only when running in Google Colab):

In [5]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## 2.0 Reading classified raster and confusion matrix

The classified image and its confusion matrix were both obtained with a Random Forest classifier. The dataset can be downloaded [here](https://drive.google.com/drive/folders/193RhNpACu9THcOZu8OzMh-btnFCOgHrU?usp=sharing):

In [6]:
# classified raster
path_raster = "/content/drive/MyDrive/Packages/scikit-eo_data/02_Uncertainty/LC08_232066_20190727_Label.tif"
with rasterio.open(path_raster) as src:
    img = src.read(1)  # first band as a 2D array

# confusion matrix
path_matrix = "/content/drive/MyDrive/Packages/scikit-eo_data/02_Uncertainty/confusion_matrix.csv"
conf_error = pd.read_csv(path_matrix, index_col= 0, sep = ';')

Confusion matrix

In [7]:
conf_error

|  | 1.0 | 2.0 | 3.0 | 4.0 | Total | Users_Accuracy | Commission |
|---|---|---|---|---|---|---|---|
| 1.0 | 15.0 | 0.0 | 0.0 | 0.0 | 15.0 | 100.0 | 0.0 |
| 2.0 | 0.0 | 15.0 | 0.0 | 1.0 | 16.0 | 93.75 | 6.25 |
| 3.0 | 0.0 | 0.0 | 14.0 | 0.0 | 14.0 | 100.0 | 0.0 |
| 4.0 | 0.0 | 1.0 | 0.0 | 18.0 | 19.0 | 94.736842 | 5.263158 |
| Total | 15.0 | 16.0 | 14.0 | 19.0 |  |  |  |
| Producer_Accuracy | 100.0 | 93.75 | 100.0 | 94.736842 |  |  |  |
| Omission | 0.0 | 6.25 | 0.0 | 5.263158 |  |  |  |


Keep only the class counts (the first four rows and columns), dropping the totals and accuracy statistics:

In [8]:
values = conf_error.iloc[0:4, 0:4].to_numpy()
values

array([[15.,  0.,  0.,  0.],
       [ 0., 15.,  0.,  1.],
       [ 0.,  0., 14.,  0.],
       [ 0.,  1.,  0., 18.]])
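
As a quick sanity check, the accuracy columns of the table can be recomputed from these counts alone. The sketch below assumes that rows are mapped classes and columns are reference classes, which is what the position of the `Users_Accuracy` column suggests; the variable names are illustrative and not part of ```scikit-eo```.

```python
import numpy as np

# Sample counts copied from the confusion matrix above.
# Assumption: rows = mapped class, columns = reference class.
values = np.array([
    [15.0,  0.0,  0.0,  0.0],
    [ 0.0, 15.0,  0.0,  1.0],
    [ 0.0,  0.0, 14.0,  0.0],
    [ 0.0,  1.0,  0.0, 18.0],
])

correct = np.diag(values)  # correctly classified samples per class

users_accuracy = 100 * correct / values.sum(axis=1)      # divide by row totals
producers_accuracy = 100 * correct / values.sum(axis=0)  # divide by column totals
overall_accuracy = 100 * correct.sum() / values.sum()    # sample-based, not area-weighted

print(np.round(users_accuracy, 2))
print(np.round(producers_accuracy, 2))
print(round(overall_accuracy, 3))
```

The per-class values should agree with the `Users_Accuracy` and `Producer_Accuracy` entries of the table. The overall accuracy computed here is based on sample counts only, so it need not equal an area-weighted figure.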

## 3.0 Results

Estimate the area of each class and its uncertainty with the ```confintervalML()``` function. Its parameters need some care:
- *matrix*: the confusion matrix containing only the class counts (no totals or accuracy columns)
- *image_pred*: the classified image as a 2D array (rows, cols)
- *pixel_size*: the pixel size in meters (30 for this Landsat 8 image)
- *nodata*: the nodata value of the image. In this image it is -9999, but in other images it could be 0 or NaN, so check it before running the function.
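
To see why *pixel_size* and *nodata* matter, the following minimal sketch counts pixels per class on a small made-up raster and converts the counts to hectares. The array `toy` and the helper `valid_mask` are hypothetical and only illustrate the idea; they are not how ```confintervalML()``` is implemented.

```python
import numpy as np

# Hypothetical 3x4 classified raster; -9999 marks nodata, as in this tutorial.
toy = np.array([
    [1, 1, 2, -9999],
    [3, 3, 3, -9999],
    [4, 4, 2, 3],
])
nodata = -9999
pixel_size = 30  # meters, so one pixel covers 30 * 30 = 900 m^2


def valid_mask(arr, nodata):
    """Hypothetical helper: True where a pixel holds a real class label."""
    if isinstance(nodata, float) and np.isnan(nodata):
        return ~np.isnan(arr)  # NaN never equals itself, so test it explicitly
    return arr != nodata


valid = toy[valid_mask(toy, nodata)]
classes, counts = np.unique(valid, return_counts=True)
area_ha = counts * pixel_size**2 / 10_000  # 1 ha = 10,000 m^2

for c, n, a in zip(classes, counts, area_ha):
    print(f"class {c}: {n} pixels, {a:.2f} ha")
```

If the wrong nodata value were passed, the background pixels would be counted as an extra class and every area proportion would shift accordingly.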

In [9]:
confintervalML(matrix = values, image_pred = img, pixel_size = 30, nodata = -9999)

***** Confusion Matrix by Estimated Proportions of area an uncertainty*****

Overall accuracy with 95%
0.9758 ± 0.0418

Confusion matrix
            1       2       3       4  Total[Wi]  Area[pixels]
1      0.0414  0.0000  0.0000  0.0000     0.0414       67246.0
2      0.0000  0.0462  0.0000  0.0031     0.0493       79983.0
3      0.0000  0.0000  0.5081  0.0000     0.5081      825123.0
4      0.0000  0.0211  0.0000  0.3801     0.4012      651448.0
Total  0.0414  0.0673  0.5081  0.3832     1.0000     1623800.0

User´s accuracy with 95%
1.0: 1.0000 ± 0.0000
2.0: 0.9375 ± 0.1225
3.0: 1.0000 ± 0.0000
4.0: 0.9474 ± 0.1032

Estimating area (Ha) and uncertainty with 95%
1.0: 6052.1400 ± 0.0000
2.0: 9834.3719 ± 6112.1256
3.0: 74261.0700 ± 0.0000
4.0: 55994.4181 ± 6112.1256
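
The figures above are consistent with the usual stratified (area-weighted) estimator, in which each row of the confusion matrix is scaled by the mapped proportion of its class. The sketch below is an independent recomputation under that assumption, using the sample counts and the `Area[pixels]` column printed above and a z-value of 1.96 for the 95% interval. It is not taken from the ```scikit-eo``` source, and all variable names are illustrative.

```python
import numpy as np

# Sample counts (rows = mapped class, columns = reference class).
values = np.array([
    [15.0,  0.0,  0.0,  0.0],
    [ 0.0, 15.0,  0.0,  1.0],
    [ 0.0,  0.0, 14.0,  0.0],
    [ 0.0,  1.0,  0.0, 18.0],
])
# Mapped pixels per class, copied from the Area[pixels] column above.
pixels = np.array([67246.0, 79983.0, 825123.0, 651448.0])
pixel_size = 30  # meters
z = 1.96         # assumed z-value for a 95% confidence interval

n_i = values.sum(axis=1)          # samples per mapped class
W = pixels / pixels.sum()         # mapped area proportion of each class, W_i
row_prop = values / n_i[:, None]  # n_ij / n_i
p = W[:, None] * row_prop         # estimated area proportions, p_ij

overall = np.trace(p)             # area-weighted overall accuracy
p_ref = p.sum(axis=0)             # estimated proportion of each reference class

# Standard error of each column total:
# sqrt( sum_i W_i^2 * (n_ij/n_i) * (1 - n_ij/n_i) / (n_i - 1) )
var_terms = W[:, None] ** 2 * row_prop * (1 - row_prop) / (n_i[:, None] - 1)
se = np.sqrt(var_terms.sum(axis=0))

total_ha = pixels.sum() * pixel_size**2 / 10_000  # mapped area in hectares
area_ha = p_ref * total_ha
ci_ha = z * se * total_ha

print(round(overall, 4))
print(np.round(area_ha, 4))
print(np.round(ci_ha, 4))
```

Under these assumptions the estimated areas and interval half-widths should agree with the printed results to within rounding. Classes 1 and 3 have zero-width intervals because no sample was confused with them, which reflects the small sample rather than true certainty.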
