# Plastic Map

The Plastic Map was created by the Remote Sensing of Coastal and Urban Environments (RESCUE) research group at the Federal University of Rio Grande do Sul (UFRGS), with contributions from researchers Bianca Matos de Barros, Cristiano Lima Hackmann, and Douglas Galimberti Barbosa. If you'd like to get in touch, please send an email to bianca.matos@ufrgs.br.

## Imports

We begin by importing the Python libraries the algorithm needs. Make sure they are installed in your Python environment; otherwise the imports will fail.

In [18]:
from modules import dart_files, rsdata_classification, rsdata_charts
#from modules import dart_files, tiff_files, rsdata_classification, rsdata_charts
from scipy.stats import ks_2samp
#from sklearn.ensemble import RandomForestClassifier
#from sklearn.metrics import accuracy_score, balanced_accuracy_score, confusion_matrix, f1_score, fbeta_score, jaccard_score, log_loss, precision_score, recall_score, roc_auc_score
#from sklearn.model_selection import train_test_split, GridSearchCV, StratifiedKFold
#from sklearn.neural_network import MLPClassifier
#import datetime
import os
import pandas as pd
import shutil

## Parameters

Next, we define the feature names (columns) that serve as input for the classifiers.

The 'feature_names' variable contains the names of the MSI/Sentinel-2 sensor bands with a spatial resolution of 10 or 20 meters. The 'radiometric_indexes' variable contains the indices presented in [1] and [2].

Note: The NDMI index was also evaluated in [1], but its formula was identical to that of the NDWI index. Therefore, the NDMI index was not considered here.

In [19]:
feature_names = ['Blue', 'Green', 'Red', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'NIR1', 'NIR2', 'NIRa', 'SWIR1', 'SWIR2']
radiometric_indexes = ['NDWI', 'WRI', 'NDVI', 'AWEI', 'MNDWI', 'SR', 'PI', 'RNDVI', 'FDI']
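
To illustrate how such indices are derived from band reflectances, the sketch below computes NDVI and NDWI from their commonly cited normalized-difference definitions. The exact formulas used in [1] and [2] may differ. The choice of 'NIR1' as the near-infrared band and the helper name `normalized_difference` are assumptions made only for this example.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    denominator = a + b
    if denominator == 0:
        return None
    return (a - b) / denominator

# Reflectances of a water pixel, matching the first row of the dataset
# displayed later in this notebook
pixel = {'Green': 0.0386, 'Red': 0.0245, 'NIR1': 0.0198}

# Commonly cited definitions (assumed here, not taken from [1] or [2])
ndvi = normalized_difference(pixel['NIR1'], pixel['Red'])    # (NIR - Red) / (NIR + Red)
ndwi = normalized_difference(pixel['Green'], pixel['NIR1'])  # (Green - NIR) / (Green + NIR)

print(round(ndvi, 4), round(ndwi, 4))
```

For this water pixel the NDVI is negative and the NDWI is positive, which is the usual pattern for open water under these definitions.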

In this repository, files are organized in the following directory structure:
- **charts**: Contains charts generated during data analysis and classification.
- **modules**: Includes Python modules used in the application.
- **files/csv_files**: Stores reflectance values from compiled datasets, with each CSV file containing a dataset.
- **files/dart_files**: Holds DART (Discrete Anisotropic Radiative Transfer) simulations converted to ASC format.
- **files/raw_dart_files**: Contains original-format DART files.
- **files/tiff_files**: Stores MSI/Sentinel-2 images after atmospheric correction, cropping of regions of interest, and conversion to TIFF format.
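
Before running the remaining cells, it can help to confirm that this layout exists. The following is a minimal sketch, assuming the notebook is run from the repository root. The helper name `missing_directories` is hypothetical and not part of the project's modules.

```python
from pathlib import Path

# Directories listed above, relative to the repository root
EXPECTED_DIRS = [
    'charts',
    'modules',
    'files/csv_files',
    'files/dart_files',
    'files/raw_dart_files',
    'files/tiff_files',
]

def missing_directories(root='.'):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]

for name in missing_directories():
    print(name, 'is missing')
```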


## Simulated dataset (DART)

DART files are organized within a directory structure that includes folders identifying the simulations and sensor bands, and files with standard names in MPR or MP# format. To compile reflectance values from the simulated data, it's necessary to convert the files to ASC format. This conversion can be performed using GIS software (such as QGIS). Additionally, it's essential to rename the files to indicate the simulation and the specific band they contain.

In this project, there are two sets of simulated data, one produced in 2021 and another in 2023. The 2021 dataset has already been processed and will be imported at a later stage. The 2023 dataset will undergo an extraction process to fulfill the specified requirements.

### Extracting DART raw files

The 'file_names' variable contains the default filenames that require renaming and conversion to ASC format. The 'info' variable comprises lists, each detailing the source directory, destination directory, 'file_names' variable, submersion depth, color, status (Dry, Wet, or Submerged), and the name of the polymer used for each simulation configuration.

The 'raw_dart_files' directory referenced by the 'info' variable is organized according to the simulation settings: source_folder/polymer/coverage_percentage/band.
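
As a rough illustration of this layout, the sketch below builds the full path of one raw file. The sub-folders below the band level ('BroadBand', 'BRF', 'ITERX', 'IMAGES_DART') are inferred from the paths printed by the extraction step in this notebook, not from the dart_files module itself, and the function name `raw_band_path` is hypothetical.

```python
import posixpath

def raw_band_path(source_folder, coverage, band, file_name):
    """Build the path of one raw DART file (layout inferred from printed output)."""
    return posixpath.join(
        source_folder, str(coverage), band,
        'BroadBand', band, band, 'BRF', 'ITERX', 'IMAGES_DART', file_name,
    )

example = raw_band_path('files/raw_dart_files/Limpa_S0/LDPE/', 100, 'Blue',
                        'ima01_VZ=000_0_VA=000_0.mpr')
print(example)
```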

In [21]:
file_names = ['ima01_VZ=000_0_VA=000_0.mp#', 'ima01_VZ=000_0_VA=000_0.mpr']

info = [
            ['files/raw_dart_files/Limpa_S0/LDPE/', 
             'files/dart_files/Sentinel2_artificial/Limpa/LDPE/',
             file_names,
             'S0', 'Transparent', 'Dry', 'LDPE'],
            ['files/raw_dart_files/Limpa_S0/Orange_PP/', 
             'files/dart_files/Sentinel2_artificial/Limpa/PP/',
            file_names,
            'S0', 'Orange', 'Wet', 'PP'],
            ['files/raw_dart_files/Limpa_S0/PET/', 
             'files/dart_files/Sentinel2_artificial/Limpa/PET/',
             file_names,
             'S0', 'Transparent', 'Dry', 'PET'],
            ['files/raw_dart_files/Limpa_S0/White_PP/', 
             'files/dart_files/Sentinel2_artificial/Limpa/PP/',
             file_names,
             'S0', 'White', 'Wet', 'PP'],
            ['files/raw_dart_files/Limpa_S2/Orange_PP/', 
             'files/dart_files/Sentinel2_artificial/Limpa/PP/',
            file_names,
            'S2', 'Orange', 'Submerged', 'PP'],
            ['files/raw_dart_files/Limpa_S2/White_PP/', 
             'files/dart_files/Sentinel2_artificial/Limpa/PP/',
            file_names,
            'S2', 'White', 'Submerged', 'PP'],
            ['files/raw_dart_files/Limpa_S5/Orange_PP/', 
             'files/dart_files/Sentinel2_artificial/Limpa/PP/',
            file_names,
            'S5', 'Orange', 'Submerged', 'PP'],
            ['files/raw_dart_files/Limpa_S5/White_PP/', 
             'files/dart_files/Sentinel2_artificial/Limpa/PP/',
            file_names,
            'S5', 'White', 'Submerged', 'PP'],
            ['files/raw_dart_files/LimpaEspuma_S0/LDPE/', 
             'files/dart_files/Sentinel2_artificial/LimpaEspuma/LDPE/',
             file_names,
             'S0', 'Transparent', 'Dry', 'LDPE'],
            ['files/raw_dart_files/LimpaEspuma_S0/Orange_PP/', 
             'files/dart_files/Sentinel2_artificial/LimpaEspuma/PP/',
            file_names,
            'S0', 'Orange', 'Wet', 'PP'],
            ['files/raw_dart_files/LimpaEspuma_S0/PET/', 
             'files/dart_files/Sentinel2_artificial/LimpaEspuma/PET/',
             file_names,
             'S0', 'Transparent', 'Dry', 'PET'],
            ['files/raw_dart_files/LimpaEspuma_S0/White_PP/', 
             'files/dart_files/Sentinel2_artificial/LimpaEspuma/PP/',
             file_names,
             'S0', 'White', 'Wet', 'PP'],
            ['files/raw_dart_files/LimpaEspuma_S0/Whitecaps/', 
             'files/dart_files/Sentinel2_artificial/Whitecaps/',
             file_names,
             'S0', '-', '-', 'Whitecaps'],
            ['files/raw_dart_files/LimpaEspuma_S2/Orange_PP/', 
             'files/dart_files/Sentinel2_artificial/LimpaEspuma/PP/',
            file_names,
            'S2', 'Orange', 'Submerged', 'PP'],
            ['files/raw_dart_files/LimpaEspuma_S2/White_PP/', 
             'files/dart_files/Sentinel2_artificial/LimpaEspuma/PP/',
            file_names,
            'S2', 'White', 'Submerged', 'PP'],
            ['files/raw_dart_files/LimpaEspuma_S5/Orange_PP/', 
             'files/dart_files/Sentinel2_artificial/LimpaEspuma/PP/',
            file_names,
            'S5', 'Orange', 'Submerged', 'PP'],
            ['files/raw_dart_files/LimpaEspuma_S5/White_PP/', 
             'files/dart_files/Sentinel2_artificial/LimpaEspuma/PP/',
            file_names,
            'S5', 'White', 'Submerged', 'PP']
        ]
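
Each entry of 'info' is a positional list, which is compact but easy to misread. As an optional alternative, the sketch below wraps one entry in a named tuple so that its fields can be accessed by name. The type name `SimulationConfig` and its field names are illustrative assumptions; only their order follows the description of 'info' above.

```python
from collections import namedtuple

# Field names are illustrative; the order follows the description of 'info'
SimulationConfig = namedtuple(
    'SimulationConfig',
    ['source_dir', 'destination_dir', 'file_names',
     'submersion_depth', 'color', 'status', 'polymer'],
)

file_names = ['ima01_VZ=000_0_VA=000_0.mp#', 'ima01_VZ=000_0_VA=000_0.mpr']

entry = ['files/raw_dart_files/Limpa_S2/Orange_PP/',
         'files/dart_files/Sentinel2_artificial/Limpa/PP/',
         file_names,
         'S2', 'Orange', 'Submerged', 'PP']

config = SimulationConfig(*entry)
print(config.polymer, config.submersion_depth, config.status)
```

Because a named tuple is still a tuple, it can be indexed or unpacked positionally exactly like the original list.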

The 'info' variable serves as input for the extraction method in the dart_files module. This method, using source and destination paths along with simulation configuration information, creates necessary destination folders and returns a list of paths for exporting files with identifiable names.

Subsequently, the source files are copied to their respective destinations using the obtained export paths.

In [6]:
for entry in info:
    # entry = [source dir, destination dir, file names, depth, color, status, polymer]
    paths = dart_files.extraction(*entry)

    # Move up three directory levels before copying
    for _ in range(3):
        os.chdir('../')

    for path in paths:
        source, destination_dir, destination = path[0], path[1], path[1] + path[2]

        if not os.path.exists(source):
            print(source, " does not exist")
            continue

        # Create the destination directory when it is missing
        if not os.path.exists(destination_dir):
            try:
                os.makedirs(destination_dir)
                print(destination_dir, " directory created")
            except OSError:
                print("Error when trying to create ", destination_dir, " directory")
                continue

        try:
            shutil.copyfile(source, destination)
            print("File ", destination, " copied")
        except OSError:
            print("Error when trying to copy ", destination)

files/raw_dart_files/Limpa_S0/LDPE/100/Blue/BroadBand/Blue/Blue/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mp#  does not exist
files/raw_dart_files/Limpa_S0/LDPE/100/Blue/BroadBand/Blue/Blue/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mpr  does not exist
files/raw_dart_files/Limpa_S0/LDPE/100/Green/BroadBand/Green/Green/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mp#  does not exist
files/raw_dart_files/Limpa_S0/LDPE/100/Green/BroadBand/Green/Green/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mpr  does not exist
files/raw_dart_files/Limpa_S0/LDPE/100/NIR1/BroadBand/NIR1/NIR1/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mp#  does not exist
files/raw_dart_files/Limpa_S0/LDPE/100/NIR1/BroadBand/NIR1/NIR1/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mpr  does not exist
files/raw_dart_files/Limpa_S0/LDPE/100/NIR2/BroadBand/NIR2/NIR2/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mp#  does not exist
files/raw_dart_files/Limpa_S0/LDPE/100/NIR2/BroadBand/NIR2/NIR2/BRF/ITERX/IMAGES_DAR

files/raw_dart_files/Limpa_S0/LDPE/80/NIR2/BroadBand/NIR2/NIR2/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mp#  does not exist
files/raw_dart_files/Limpa_S0/LDPE/80/NIR2/BroadBand/NIR2/NIR2/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mpr  does not exist
files/raw_dart_files/Limpa_S0/LDPE/80/NIR_Artificial/BroadBand/NIR_Artificial/NIR_Artificial/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mp#  does not exist
files/raw_dart_files/Limpa_S0/LDPE/80/NIR_Artificial/BroadBand/NIR_Artificial/NIR_Artificial/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mpr  does not exist
files/raw_dart_files/Limpa_S0/LDPE/80/Red/BroadBand/Red/Red/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mp#  does not exist
files/raw_dart_files/Limpa_S0/LDPE/80/Red/BroadBand/Red/Red/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mpr  does not exist
files/raw_dart_files/Limpa_S0/LDPE/80/RedEdge1/BroadBand/RedEdge1/RedEdge1/BRF/ITERX/IMAGES_DART/ima01_VZ=000_0_VA=000_0.mp#  does not exist
files/raw_dart_files/Limpa_S0/L

FileNotFoundError: [WinError 3] The system cannot find the path specified: 'files/raw_dart_files/Limpa_S0/Orange_PP/'
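
The copy step can also be written as a small helper that skips missing sources and creates the destination directory in one call. This is only a sketch: the name `copy_into` is hypothetical, and the (source, destination directory, file name) arguments mirror how the loop above indexes each path.

```python
import os
import shutil

def copy_into(source, destination_dir, file_name):
    """Copy source to destination_dir/file_name, creating the directory if needed.

    Returns the destination path, or None when the source file is missing.
    """
    if not os.path.isfile(source):
        return None
    os.makedirs(destination_dir, exist_ok=True)  # no error if it already exists
    destination = os.path.join(destination_dir, file_name)
    shutil.copyfile(source, destination)
    return destination
```

Returning None for a missing source lets the caller decide whether to report it or stop, instead of printing inside the helper.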

Next, convert the data from MPR/MP# to ASC format, for example with QGIS. Store the converted files within the existing folder structure so that the application can locate and compile them in the following sections.
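
To check a converted file quickly, the sketch below parses a small grid in ASC form. It assumes the exported files follow the Esri ASCII grid layout (a few keyword header lines followed by rows of values), which should be verified against your own export. The function name `read_asc` is hypothetical and unrelated to the dart_files module.

```python
# Header keywords of the Esri ASCII grid format (dx/dy appear in some exports)
HEADER_KEYS = {'ncols', 'nrows', 'xllcorner', 'yllcorner', 'xllcenter',
               'yllcenter', 'cellsize', 'dx', 'dy', 'nodata_value'}

def read_asc(lines):
    """Parse an Esri ASCII grid given as an iterable of text lines.

    Returns (header, rows): header maps lowercase keywords to floats and
    rows is a list of lists of floats.
    """
    header, rows = {}, []
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        if parts[0].lower() in HEADER_KEYS:
            header[parts[0].lower()] = float(parts[1])
        else:
            rows.append([float(value) for value in parts])
    return header, rows

sample = """ncols 3
nrows 2
xllcorner 0.0
yllcorner 0.0
cellsize 10.0
NODATA_value -9999
0.0324 0.0324 0.0324
0.0386 0.0386 0.0386
"""
header, rows = read_asc(sample.splitlines())
print(header['ncols'], header['nrows'], rows[0])
```

For a file on disk, the same function can be called as `read_asc(open_file)` inside a `with open(path) as open_file:` block, since a text file iterates line by line.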

### Getting directory and file structure

In [3]:
n_folders = int(input("How many folders does your database have? "))
paths = dict()

for n in range(1, n_folders + 1):
    path, polymers = dart_files.open_folders(str(input("Simulated files path "+str(n)+": ")))
    paths[path] = polymers
    print(paths)
    os.chdir('../')
    os.chdir('../')
    os.chdir('../')
    os.chdir('../')
    
# Paths (for example): files/dart_files/Sentinel2_Artificial/Limpa/ and files/dart_files/Sentinel2_Artificial/LimpaEspuma/

How many folders does your database have? 2
Simulated files path 1: files/dart_files/Sentinel2_Artificial/Limpa/
{'files/dart_files/Sentinel2_Artificial/Limpa/': {'LDPE': {'S0': {'Transparent': {'Wet': ['100', '40', '60', '80']}}}, 'PET': {'S0': {'Transparent': {'Wet': ['100', '40', '60', '80']}}}, 'PP': {'S0': {'Orange': {'Wet': ['100', '40', '60', '80']}, 'White': {'Wet': ['100', '40', '60', '80']}}, 'S2': {'Orange': {'Submerged': ['100', '40', '60', '80']}, 'White': {'Submerged': ['100', '40', '60', '80']}}, 'S5': {'Orange': {'Submerged': ['100', '40', '60', '80']}, 'White': {'Submerged': ['100', '40', '60', '80']}}}}}
Simulated files path 2: files/dart_files/Sentinel2_Artificial/LimpaEspuma/
{'files/dart_files/Sentinel2_Artificial/Limpa/': {'LDPE': {'S0': {'Transparent': {'Wet': ['100', '40', '60', '80']}}}, 'PET': {'S0': {'Transparent': {'Wet': ['100', '40', '60', '80']}}}, 'PP': {'S0': {'Orange': {'Wet': ['100', '40', '60', '80']}, 'White': {'Wet': ['100', '40', '60', '80']}}, 'S
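
The printed dictionary nests polymer, submergence depth, color, status, and coverage percentages under each base path. As a sketch of how that structure can be flattened into one directory per simulation, consider the generator below. The name `leaf_directories` is hypothetical, and the resulting path format is inferred from the directories printed in the resampling step.

```python
def leaf_directories(paths):
    """Yield one directory per polymer/depth/color/status/coverage combination."""
    for base, polymers in paths.items():
        for polymer, depths in polymers.items():
            for depth, colors in depths.items():
                for color, statuses in colors.items():
                    for status, coverages in statuses.items():
                        for coverage in coverages:
                            yield f'{base}{polymer}/{depth}/{color}/{status}/{coverage}/'

# A small excerpt of the structure printed above
example = {
    'files/dart_files/Sentinel2_Artificial/Limpa/': {
        'LDPE': {'S0': {'Transparent': {'Wet': ['100', '40', '60', '80']}}},
    }
}
directories = list(leaf_directories(example))
print(directories[0])
```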

In [None]:
# TODO: add the processing of the previous dataset, referencing it explicitly with explanatory text, references, and GitHub links

### Resampling

#### Resampling 20 meter bands by nearest neighbor

In [4]:
nn_data_10 = dart_files.get_images(paths, "nearest", "up")

files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/ ['LDPE_Blue_100.asc', 'LDPE_Green_100.asc', 'LDPE_NIR1_100.asc', 'LDPE_NIR2_100.asc', 'LDPE_NIRa_100.asc', 'LDPE_RedEdge1_100.asc', 'LDPE_RedEdge2_100.asc', 'LDPE_RedEdge3_100.asc', 'LDPE_Red_100.asc', 'LDPE_SWIR1_100.asc', 'LDPE_SWIR2_100.asc'] LDPE S0 Transparent Wet 100 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] nearest up
The  files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/40/ ['LDPE_Blue_40.asc', 'LDPE_Green_40.asc', 'LDPE_NIR1_40.asc', 'LDPE_NIR2_40.asc', 'LDPE_NIRa_40.asc', 'LDPE_RedEdge1_40.asc', 'LDPE_RedEdge2_40.asc', 'LDPE_RedEdge3_40.asc', 'LDPE_Red_40.asc', 'LDPE_SWIR1_40.asc', 'LDPE_SWIR2_40.asc'] LDPE S0 Transparent Wet 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', '

files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/40/ ['Wet_White_PP_Blue_40.asc', 'Wet_White_PP_Green_40.asc', 'Wet_White_PP_NIR1_40.asc', 'Wet_White_PP_NIR2_40.asc', 'Wet_White_PP_NIRa_40.asc', 'Wet_White_PP_RedEdge1_40.asc', 'Wet_White_PP_RedEdge2_40.asc', 'Wet_White_PP_RedEdge3_40.asc', 'Wet_White_PP_Red_40.asc', 'Wet_White_PP_SWIR1_40.asc', 'Wet_White_PP_SWIR2_40.asc'] PP S0 White Wet 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] nearest up
The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/40/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/60/ ['Wet_White_PP_Blue_60.asc', 'Wet_White_PP_Green_60.asc', 'Wet_White_PP_NIR1_60.asc', 'Wet_White_PP_NIR2_60.asc', 'Wet_White_PP_NIRa_60.asc', 'Wet_White_PP_RedEdge1_60.asc', 'Wet_White_PP_RedEdge2_60.asc', 'Wet_White_PP_RedEdge3_60.asc', 'Wet_White_PP_Red_60.asc'

files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/40/ ['S5_Orange_PP_Blue_40.asc', 'S5_Orange_PP_Green_40.asc', 'S5_Orange_PP_NIR1_40.asc', 'S5_Orange_PP_NIR2_40.asc', 'S5_Orange_PP_NIRa_40.asc', 'S5_Orange_PP_RedEdge1_40.asc', 'S5_Orange_PP_RedEdge2_40.asc', 'S5_Orange_PP_RedEdge3_40.asc', 'S5_Orange_PP_Red_40.asc', 'S5_Orange_PP_SWIR1_40.asc', 'S5_Orange_PP_SWIR2_40.asc'] PP S5 Orange Submerged 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] nearest up
The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/40/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/60/ ['S5_Orange_PP_Blue_60.asc', 'S5_Orange_PP_Green_60.asc', 'S5_Orange_PP_NIR1_60.asc', 'S5_Orange_PP_NIR2_60.asc', 'S5_Orange_PP_NIRa_60.asc', 'S5_Orange_PP_RedEdge1_60.asc', 'S5_Orange_PP_RedEdge2_60.asc', 'S5_Orange_PP_RedEdge3_60.asc

The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PET/S0/Transparent/Wet/80/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/40/ ['Wet_Orange_PP_Blue_40.asc', 'Wet_Orange_PP_Green_40.asc', 'Wet_Orange_PP_NIR1_40.asc', 'Wet_Orange_PP_NIR2_40.asc', 'Wet_Orange_PP_NIRa_40.asc', 'Wet_Orange_PP_RedEdge1_40.asc', 'Wet_Orange_PP_RedEdge2_40.asc', 'Wet_Orange_PP_RedEdge3_40.asc', 'Wet_Orange_PP_Red_40.asc', 'Wet_Orange_PP_SWIR1_40.asc', 'Wet_Orange_PP_SWIR2_40.asc'] PP S0 Orange Wet 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] nearest up
The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/40/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/60/ ['Wet_Orange_PP_Blue_60.asc', 'Wet_Orange_PP_Green_60.asc', 'Wet

The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S2/White/Submerged/80/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/40/ ['S5_Orange_PP_Blue_40.asc', 'S5_Orange_PP_Green_40.asc', 'S5_Orange_PP_NIR1_40.asc', 'S5_Orange_PP_NIR2_40.asc', 'S5_Orange_PP_NIRa_40.asc', 'S5_Orange_PP_RedEdge1_40.asc', 'S5_Orange_PP_RedEdge2_40.asc', 'S5_Orange_PP_RedEdge3_40.asc', 'S5_Orange_PP_Red_40.asc', 'S5_Orange_PP_SWIR1_40.asc', 'S5_Orange_PP_SWIR2_40.asc'] PP S5 Orange Submerged 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] nearest up
The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/40/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/60/ ['S5_Orange_PP_Blue_60.asc', 'S5_Orange_PP_Green_60.

In [5]:
nn_data_20 = dart_files.get_images(paths, "nearest", "down")

files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/ ['LDPE_Blue_100.asc', 'LDPE_Green_100.asc', 'LDPE_NIR1_100.asc', 'LDPE_NIR2_100.asc', 'LDPE_NIRa_100.asc', 'LDPE_RedEdge1_100.asc', 'LDPE_RedEdge2_100.asc', 'LDPE_RedEdge3_100.asc', 'LDPE_Red_100.asc', 'LDPE_SWIR1_100.asc', 'LDPE_SWIR2_100.asc'] LDPE S0 Transparent Wet 100 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] nearest down
The  files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/40/ ['LDPE_Blue_40.asc', 'LDPE_Green_40.asc', 'LDPE_NIR1_40.asc', 'LDPE_NIR2_40.asc', 'LDPE_NIRa_40.asc', 'LDPE_RedEdge1_40.asc', 'LDPE_RedEdge2_40.asc', 'LDPE_RedEdge3_40.asc', 'LDPE_Red_40.asc', 'LDPE_SWIR1_40.asc', 'LDPE_SWIR2_40.asc'] LDPE S0 Transparent Wet 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', '

files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/40/ ['Wet_White_PP_Blue_40.asc', 'Wet_White_PP_Green_40.asc', 'Wet_White_PP_NIR1_40.asc', 'Wet_White_PP_NIR2_40.asc', 'Wet_White_PP_NIRa_40.asc', 'Wet_White_PP_RedEdge1_40.asc', 'Wet_White_PP_RedEdge2_40.asc', 'Wet_White_PP_RedEdge3_40.asc', 'Wet_White_PP_Red_40.asc', 'Wet_White_PP_SWIR1_40.asc', 'Wet_White_PP_SWIR2_40.asc'] PP S0 White Wet 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] nearest down
The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/40/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/60/ ['Wet_White_PP_Blue_60.asc', 'Wet_White_PP_Green_60.asc', 'Wet_White_PP_NIR1_60.asc', 'Wet_White_PP_NIR2_60.asc', 'Wet_White_PP_NIRa_60.asc', 'Wet_White_PP_RedEdge1_60.asc', 'Wet_White_PP_RedEdge2_60.asc', 'Wet_White_PP_RedEdge3_60.asc', 'Wet_White_PP_Red_60.asc'

files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/40/ ['S5_Orange_PP_Blue_40.asc', 'S5_Orange_PP_Green_40.asc', 'S5_Orange_PP_NIR1_40.asc', 'S5_Orange_PP_NIR2_40.asc', 'S5_Orange_PP_NIRa_40.asc', 'S5_Orange_PP_RedEdge1_40.asc', 'S5_Orange_PP_RedEdge2_40.asc', 'S5_Orange_PP_RedEdge3_40.asc', 'S5_Orange_PP_Red_40.asc', 'S5_Orange_PP_SWIR1_40.asc', 'S5_Orange_PP_SWIR2_40.asc'] PP S5 Orange Submerged 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] nearest down
The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/40/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/60/ ['S5_Orange_PP_Blue_60.asc', 'S5_Orange_PP_Green_60.asc', 'S5_Orange_PP_NIR1_60.asc', 'S5_Orange_PP_NIR2_60.asc', 'S5_Orange_PP_NIRa_60.asc', 'S5_Orange_PP_RedEdge1_60.asc', 'S5_Orange_PP_RedEdge2_60.asc', 'S5_Orange_PP_RedEdge3_60.asc

files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/40/ ['Wet_Orange_PP_Blue_40.asc', 'Wet_Orange_PP_Green_40.asc', 'Wet_Orange_PP_NIR1_40.asc', 'Wet_Orange_PP_NIR2_40.asc', 'Wet_Orange_PP_NIRa_40.asc', 'Wet_Orange_PP_RedEdge1_40.asc', 'Wet_Orange_PP_RedEdge2_40.asc', 'Wet_Orange_PP_RedEdge3_40.asc', 'Wet_Orange_PP_Red_40.asc', 'Wet_Orange_PP_SWIR1_40.asc', 'Wet_Orange_PP_SWIR2_40.asc'] PP S0 Orange Wet 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] nearest down
The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/40/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/60/ ['Wet_Orange_PP_Blue_60.asc', 'Wet_Orange_PP_Green_60.asc', 'Wet_Orange_PP_NIR1_60.asc', 'Wet_Orange_PP_NIR2_60.asc', 'Wet_Orange_PP_NIRa_60.asc', 'Wet_Orange_PP_RedEdge1_60.asc', 'Wet_Orange_PP_RedEdge2_60.asc', 'Wet_Orange_PP_Re

files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/40/ ['S5_Orange_PP_Blue_40.asc', 'S5_Orange_PP_Green_40.asc', 'S5_Orange_PP_NIR1_40.asc', 'S5_Orange_PP_NIR2_40.asc', 'S5_Orange_PP_NIRa_40.asc', 'S5_Orange_PP_RedEdge1_40.asc', 'S5_Orange_PP_RedEdge2_40.asc', 'S5_Orange_PP_RedEdge3_40.asc', 'S5_Orange_PP_Red_40.asc', 'S5_Orange_PP_SWIR1_40.asc', 'S5_Orange_PP_SWIR2_40.asc'] PP S5 Orange Submerged 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] nearest down
The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/40/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/60/ ['S5_Orange_PP_Blue_60.asc', 'S5_Orange_PP_Green_60.asc', 'S5_Orange_PP_NIR1_60.asc', 'S5_Orange_PP_NIR2_60.asc', 'S5_Orange_PP_NIRa_60.asc', 'S5_Orange_PP_RedEdge1_60.asc', 'S5_Orange_PP_RedEdge2_60.asc', 'S5_Orange_
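
According to the log messages, the "up" option brings every band to a 60 x 120 grid and the "down" option to a 30 x 60 grid, a factor of 2 in each direction. The sketch below shows what nearest-neighbor resampling by such a factor looks like on a tiny grid. It is not the implementation inside `dart_files.get_images`, whose internals are not shown here, and the function names are hypothetical.

```python
def upsample_nearest(grid, factor=2):
    """Repeat every cell factor times along both axes."""
    result = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        result.extend(list(wide) for _ in range(factor))
    return result

def downsample_nearest(grid, factor=2):
    """Keep one cell (the top-left one) out of every factor x factor block."""
    return [row[::factor] for row in grid[::factor]]

coarse = [[1, 2],
          [3, 4]]
fine = upsample_nearest(coarse)
# fine is [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
print(fine)
print(downsample_nearest(fine) == coarse)
```

Nearest-neighbor resampling never creates new reflectance values, which is why upsampling followed by downsampling returns the original grid. Which cell of each block is kept when downsampling is an implementation choice.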

In [6]:
for image in nn_data_10:
    image.setAreaLabel(0, (image.getXSize() - 1), 0, (image.getYSize() - 1), "Water")  # -1 because indices (row and column numbers) start at zero, whereas the size counts from 1
    image.setGridLabel(29, 57, 4, 27, 87, 4, "Plastic")
    for pixel in image.getPixels():
        pixel.setLabel(image.getLabelsMap())
        pixel.setCoverPercent("Plastic", image.getPlasticCoverPercent())

In [7]:
for image in nn_data_20:
    image.setAreaLabel(0, (image.getXSize() - 1), 0, (image.getYSize() - 1), "Water")  # -1 because indices (row and column numbers) start at zero, whereas the size counts from 1
    image.setGridLabel(14, 28, 2, 13, 43, 2, "Plastic")
    for pixel in image.getPixels():
        pixel.setLabel(image.getLabelsMap())
        pixel.setCoverPercent("Plastic", image.getPlasticCoverPercent())
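
The labeling above first marks the whole image as "Water" and then marks a regular grid of pixels as "Plastic". The sketch below reproduces that idea on a plain list of lists. It assumes that the six numeric arguments of `setGridLabel` are two (start, stop, step) triples with an inclusive stop, the first one indexing image lines. That reading is a guess based on the call and the image sizes, not on the module's documentation, and `label_grid` is a hypothetical name.

```python
def label_grid(n_rows, n_cols, row_range, col_range,
               background='Water', target='Plastic'):
    """Return an n_rows x n_cols grid of labels.

    row_range and col_range are (start, stop, step) triples with an
    inclusive stop (an assumed reading of the setGridLabel arguments).
    """
    labels = [[background] * n_cols for _ in range(n_rows)]
    row_start, row_stop, row_step = row_range
    col_start, col_stop, col_step = col_range
    for row in range(row_start, row_stop + 1, row_step):
        for col in range(col_start, col_stop + 1, col_step):
            labels[row][col] = target
    return labels

# Same numbers as the 10 m call above, on a 60 x 120 image
labels = label_grid(60, 120, (29, 57, 4), (27, 87, 4))
plastic_count = sum(row.count('Plastic') for row in labels)
print(plastic_count)
```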

In [8]:
dart_nn_10 = dart_files.build_dataset(nn_data_10)

In [9]:
dart_nn_20 = dart_files.build_dataset(nn_data_20)

In [10]:
dart_nn_10['Submergence'] = ['0' if value == 'S0' else '2 cm' if value == 'S2' else '5 cm' for value in dart_nn_10['Submergence']]

dart_nn_10

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,RedEdge3,Red,SWIR1,SWIR2,Polymer,Cover_percent,Submergence,Color,Status,Label
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
403195,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,115,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
403196,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,116,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
403197,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,117,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
403198,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,118,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water


In [11]:
dart_nn_20['Submergence'] = ['0' if value == 'S0' else '2 cm' if value == 'S2' else '5 cm' for value in dart_nn_20['Submergence']]

dart_nn_20

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,RedEdge3,Red,SWIR1,SWIR2,Polymer,Cover_percent,Submergence,Color,Status,Label
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
100795,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,55,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
100796,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,56,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
100797,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,57,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
100798,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,58,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
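
Note that the conditional expression used for 'Submergence' sends every value other than 'S0' and 'S2' to '5 cm', so running the cell a second time on already converted values would turn all of them into '5 cm'. A dictionary lookup makes the mapping explicit and fails loudly on unexpected codes. The following is a minimal sketch with hypothetical names (`DEPTH_LABELS`, `depth_label`).

```python
DEPTH_LABELS = {'S0': '0', 'S2': '2 cm', 'S5': '5 cm'}

def depth_label(code):
    """Translate a simulation depth code into a readable label."""
    try:
        return DEPTH_LABELS[code]
    except KeyError:
        raise ValueError(f'Unknown submergence code: {code!r}') from None

codes = ['S0', 'S2', 'S5', 'S0']
labels = [depth_label(code) for code in codes]
print(labels)
```

With pandas, `Series.map(DEPTH_LABELS)` performs the same lookup on a whole column and leaves unknown codes as NaN instead of raising.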


In [12]:
dart_nn_10.to_csv(str(input("Path/filename for dart_nn_10 dataset: ")))
#For example: files/csv_files/dataset_dart_nn_10.csv

dart_nn_20.to_csv(str(input("Path/filename for dart_nn_20 dataset: ")))
#For example: files/csv_files/dataset_dart_nn_20.csv
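
By default, `to_csv` also writes the DataFrame index as an unnamed first column. When such a file is read back without further options, that column shows up as 'Unnamed: 0'. The sketch below demonstrates the round trip on a tiny in-memory example. The data values are illustrative only.

```python
import io
import pandas as pd

df = pd.DataFrame({'Blue': [0.0324, 0.0324], 'Label': ['Water', 'Water']})

buffer = io.StringIO()
df.to_csv(buffer)  # the index is written as an unnamed first column
csv_text = buffer.getvalue()

# Without index_col, the saved index comes back as a regular column
plain = pd.read_csv(io.StringIO(csv_text))

# With index_col=0, the first column is used as the index again
restored = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(plain.columns))
print(list(restored.columns))
```

Passing `index=False` to `to_csv` is another option when the row index carries no information.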

#### Resampling 20 meter bands by bilinear interpolation

In [13]:
bilinear_data_10 = dart_files.get_images(paths, "bilinear", "up")

files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/ ['LDPE_Blue_100.asc', 'LDPE_Green_100.asc', 'LDPE_NIR1_100.asc', 'LDPE_NIR2_100.asc', 'LDPE_NIRa_100.asc', 'LDPE_RedEdge1_100.asc', 'LDPE_RedEdge2_100.asc', 'LDPE_RedEdge3_100.asc', 'LDPE_Red_100.asc', 'LDPE_SWIR1_100.asc', 'LDPE_SWIR2_100.asc'] LDPE S0 Transparent Wet 100 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] bilinear up
The  files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/40/ ['LDPE_Blue_40.asc', 'LDPE_Green_40.asc', 'LDPE_NIR1_40.asc', 'LDPE_NIR2_40.asc', 'LDPE_NIRa_40.asc', 'LDPE_RedEdge1_40.asc', 'LDPE_RedEdge2_40.asc', 'LDPE_RedEdge3_40.asc', 'LDPE_Red_40.asc', 'LDPE_SWIR1_40.asc', 'LDPE_SWIR2_40.asc'] LDPE S0 Transparent Wet 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 

The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/40/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/60/ ['Wet_White_PP_Blue_60.asc', 'Wet_White_PP_Green_60.asc', 'Wet_White_PP_NIR1_60.asc', 'Wet_White_PP_NIR2_60.asc', 'Wet_White_PP_NIRa_60.asc', 'Wet_White_PP_RedEdge1_60.asc', 'Wet_White_PP_RedEdge2_60.asc', 'Wet_White_PP_RedEdge3_60.asc', 'Wet_White_PP_Red_60.asc', 'Wet_White_PP_SWIR1_60.asc', 'Wet_White_PP_SWIR2_60.asc'] PP S0 White Wet 60 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] bilinear up
The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/60/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/80/ ['Wet_White_PP_Blue_80.asc', 'Wet_White_PP_Green_80.asc', 'Wet_White_PP_NIR1_80.asc', 'Wet_White_PP_NIR2_80.a

The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/40/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/60/ ['S5_Orange_PP_Blue_60.asc', 'S5_Orange_PP_Green_60.asc', 'S5_Orange_PP_NIR1_60.asc', 'S5_Orange_PP_NIR2_60.asc', 'S5_Orange_PP_NIRa_60.asc', 'S5_Orange_PP_RedEdge1_60.asc', 'S5_Orange_PP_RedEdge2_60.asc', 'S5_Orange_PP_RedEdge3_60.asc', 'S5_Orange_PP_Red_60.asc', 'S5_Orange_PP_SWIR1_60.asc', 'S5_Orange_PP_SWIR2_60.asc'] PP S5 Orange Submerged 60 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] bilinear up
The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/60/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/80/ ['S5_Orange_PP_Blue_80.asc', 'S5_Orange_PP_Green_80.asc', 'S5_Orange_PP_NI

The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/40/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/60/ ['Wet_Orange_PP_Blue_60.asc', 'Wet_Orange_PP_Green_60.asc', 'Wet_Orange_PP_NIR1_60.asc', 'Wet_Orange_PP_NIR2_60.asc', 'Wet_Orange_PP_NIRa_60.asc', 'Wet_Orange_PP_RedEdge1_60.asc', 'Wet_Orange_PP_RedEdge2_60.asc', 'Wet_Orange_PP_RedEdge3_60.asc', 'Wet_Orange_PP_Red_60.asc', 'Wet_Orange_PP_SWIR1_60.asc', 'Wet_Orange_PP_SWIR2_60.asc'] PP S0 Orange Wet 60 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] bilinear up
The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/60/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/80/ ['Wet_Orange_PP_Blue_80.asc', 'Wet_Orange_PP_Green_80.asc', 'Wet_Oran

The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/40/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/60/ ['S5_Orange_PP_Blue_60.asc', 'S5_Orange_PP_Green_60.asc', 'S5_Orange_PP_NIR1_60.asc', 'S5_Orange_PP_NIR2_60.asc', 'S5_Orange_PP_NIRa_60.asc', 'S5_Orange_PP_RedEdge1_60.asc', 'S5_Orange_PP_RedEdge2_60.asc', 'S5_Orange_PP_RedEdge3_60.asc', 'S5_Orange_PP_Red_60.asc', 'S5_Orange_PP_SWIR1_60.asc', 'S5_Orange_PP_SWIR2_60.asc'] PP S5 Orange Submerged 60 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] bilinear up
The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/60/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/80/ ['S5_Orange_PP_Blue_80.asc', 'S5_Orange_PP_Green_8

In [14]:
bilinear_data_20 = dart_files.get_images(paths, "bilinear", "down")

files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/ ['LDPE_Blue_100.asc', 'LDPE_Green_100.asc', 'LDPE_NIR1_100.asc', 'LDPE_NIR2_100.asc', 'LDPE_NIRa_100.asc', 'LDPE_RedEdge1_100.asc', 'LDPE_RedEdge2_100.asc', 'LDPE_RedEdge3_100.asc', 'LDPE_Red_100.asc', 'LDPE_SWIR1_100.asc', 'LDPE_SWIR2_100.asc'] LDPE S0 Transparent Wet 100 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] bilinear down
The  files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/40/ ['LDPE_Blue_40.asc', 'LDPE_Green_40.asc', 'LDPE_NIR1_40.asc', 'LDPE_NIR2_40.asc', 'LDPE_NIRa_40.asc', 'LDPE_RedEdge1_40.asc', 'LDPE_RedEdge2_40.asc', 'LDPE_RedEdge3_40.asc', 'LDPE_Red_40.asc', 'LDPE_SWIR1_40.asc', 'LDPE_SWIR2_40.asc'] LDPE S0 Transparent Wet 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 

files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/60/ ['Wet_White_PP_Blue_60.asc', 'Wet_White_PP_Green_60.asc', 'Wet_White_PP_NIR1_60.asc', 'Wet_White_PP_NIR2_60.asc', 'Wet_White_PP_NIRa_60.asc', 'Wet_White_PP_RedEdge1_60.asc', 'Wet_White_PP_RedEdge2_60.asc', 'Wet_White_PP_RedEdge3_60.asc', 'Wet_White_PP_Red_60.asc', 'Wet_White_PP_SWIR1_60.asc', 'Wet_White_PP_SWIR2_60.asc'] PP S0 White Wet 60 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] bilinear down
The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/60/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/80/ ['Wet_White_PP_Blue_80.asc', 'Wet_White_PP_Green_80.asc', 'Wet_White_PP_NIR1_80.asc', 'Wet_White_PP_NIR2_80.asc', 'Wet_White_PP_NIRa_80.asc', 'Wet_White_PP_RedEdge1_80.asc', 'Wet_White_PP_RedEdge2_80.asc', 'Wet_White_PP_RedEdge3_80.asc', 'Wet_White_PP_Red_80.asc

files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/60/ ['S5_Orange_PP_Blue_60.asc', 'S5_Orange_PP_Green_60.asc', 'S5_Orange_PP_NIR1_60.asc', 'S5_Orange_PP_NIR2_60.asc', 'S5_Orange_PP_NIRa_60.asc', 'S5_Orange_PP_RedEdge1_60.asc', 'S5_Orange_PP_RedEdge2_60.asc', 'S5_Orange_PP_RedEdge3_60.asc', 'S5_Orange_PP_Red_60.asc', 'S5_Orange_PP_SWIR1_60.asc', 'S5_Orange_PP_SWIR2_60.asc'] PP S5 Orange Submerged 60 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] bilinear down
The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/60/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/80/ ['S5_Orange_PP_Blue_80.asc', 'S5_Orange_PP_Green_80.asc', 'S5_Orange_PP_NIR1_80.asc', 'S5_Orange_PP_NIR2_80.asc', 'S5_Orange_PP_NIRa_80.asc', 'S5_Orange_PP_RedEdge1_80.asc', 'S5_Orange_PP_RedEdge2_80.asc', 'S5_Orange_PP_RedEdge3_80.as

files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/60/ ['Wet_Orange_PP_Blue_60.asc', 'Wet_Orange_PP_Green_60.asc', 'Wet_Orange_PP_NIR1_60.asc', 'Wet_Orange_PP_NIR2_60.asc', 'Wet_Orange_PP_NIRa_60.asc', 'Wet_Orange_PP_RedEdge1_60.asc', 'Wet_Orange_PP_RedEdge2_60.asc', 'Wet_Orange_PP_RedEdge3_60.asc', 'Wet_Orange_PP_Red_60.asc', 'Wet_Orange_PP_SWIR1_60.asc', 'Wet_Orange_PP_SWIR2_60.asc'] PP S0 Orange Wet 60 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] bilinear down
The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/60/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/80/ ['Wet_Orange_PP_Blue_80.asc', 'Wet_Orange_PP_Green_80.asc', 'Wet_Orange_PP_NIR1_80.asc', 'Wet_Orange_PP_NIR2_80.asc', 'Wet_Orange_PP_NIRa_80.asc', 'Wet_Orange_PP_RedEdge1_80.asc', 'Wet_Orange_PP_RedEdge2_80.asc', 'Wet_Orange_PP_R

files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/60/ ['S5_Orange_PP_Blue_60.asc', 'S5_Orange_PP_Green_60.asc', 'S5_Orange_PP_NIR1_60.asc', 'S5_Orange_PP_NIR2_60.asc', 'S5_Orange_PP_NIRa_60.asc', 'S5_Orange_PP_RedEdge1_60.asc', 'S5_Orange_PP_RedEdge2_60.asc', 'S5_Orange_PP_RedEdge3_60.asc', 'S5_Orange_PP_Red_60.asc', 'S5_Orange_PP_SWIR1_60.asc', 'S5_Orange_PP_SWIR2_60.asc'] PP S5 Orange Submerged 60 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] bilinear down
The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/60/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/80/ ['S5_Orange_PP_Blue_80.asc', 'S5_Orange_PP_Green_80.asc', 'S5_Orange_PP_NIR1_80.asc', 'S5_Orange_PP_NIR2_80.asc', 'S5_Orange_PP_NIRa_80.asc', 'S5_Orange_PP_RedEdge1_80.asc', 'S5_Orange_PP_RedEdge2_80.asc', 'S5_Orange

In [15]:
for image in bilinear_data_10:
    image.setAreaLabel(0, (image.getXSize() - 1), 0, (image.getYSize() - 1), "Water")  # -1 because row/column indices start at zero, whereas the size counts from 1
    image.setGridLabel(29, 57, 4, 27, 87, 4, "Plastic")
    for pixel in image.getPixels():
        pixel.setLabel(image.getLabelsMap())
        pixel.setCoverPercent("Plastic", image.getPlasticCoverPercent())

In [16]:
for image in bilinear_data_20:
    image.setAreaLabel(0, (image.getXSize() - 1), 0, (image.getYSize() - 1), "Water")  # -1 because row/column indices start at zero, whereas the size counts from 1
    image.setGridLabel(14, 28, 2, 13, 43, 2, "Plastic")
    for pixel in image.getPixels():
        pixel.setLabel(image.getLabelsMap())
        pixel.setCoverPercent("Plastic", image.getPlasticCoverPercent())

In [17]:
dart_bilinear_10 = dart_files.build_dataset(bilinear_data_10)

In [18]:
dart_bilinear_20 = dart_files.build_dataset(bilinear_data_20)

In [19]:
dart_bilinear_10['Submergence'] = ['0' if value == 'S0' else '2 cm' if value == 'S2' else '5 cm' for value in dart_bilinear_10['Submergence']]

dart_bilinear_10

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,RedEdge3,Red,SWIR1,SWIR2,Polymer,Cover_percent,Submergence,Color,Status,Label
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
403195,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,115,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
403196,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,116,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
403197,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,117,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
403198,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,118,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
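
The nested conditional expression in the cell above can also be written as an explicit lookup table. The following is a minimal sketch, assuming the raw `Submergence` codes are exactly `S0`, `S2` and `S5`; the toy frame is illustrative and is not project data.

```python
import pandas as pd

# Hypothetical stand-in for the 'Submergence' column built by dart_files.build_dataset.
toy = pd.DataFrame({"Submergence": ["S0", "S2", "S5", "S0"]})

# Explicit code-to-depth mapping; unlike the nested conditional, an unexpected
# code becomes NaN instead of silently falling into the '5 cm' branch.
depth_by_code = {"S0": "0", "S2": "2 cm", "S5": "5 cm"}
toy["Submergence"] = toy["Submergence"].map(depth_by_code)

# Fail early if any code was not covered by the mapping.
assert toy["Submergence"].notna().all()
print(toy["Submergence"].tolist())
```

The stricter behaviour is a design choice: it surfaces unknown codes early, at the cost of requiring the mapping to be kept complete.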


In [20]:
dart_bilinear_20['Submergence'] = ['0' if value == 'S0' else '2 cm' if value == 'S2' else '5 cm' for value in dart_bilinear_20['Submergence']]

dart_bilinear_20

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,RedEdge3,Red,SWIR1,SWIR2,Polymer,Cover_percent,Submergence,Color,Status,Label
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
100795,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,55,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
100796,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,56,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
100797,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,57,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
100798,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,58,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water


In [21]:
dart_bilinear_10.to_csv(str(input("Path/filename for dart_bilinear_10 dataset: ")))
#For example: files/csv_files/dataset_dart_bilinear_10.csv

dart_bilinear_20.to_csv(str(input("Path/filename for dart_bilinear_20 dataset: ")))
#For example: files/csv_files/dataset_dart_bilinear_20.csv
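
Because `to_csv` writes the row index by default, reading these files back yields an extra `Unnamed: 0` column, which later cells drop. A hedged alternative, shown here on a toy frame and an in-memory buffer rather than the project files, is to skip the index when saving.

```python
import io
import pandas as pd

# Toy frame standing in for one of the DART datasets (illustrative values only).
toy = pd.DataFrame({"Blue": [0.0324, 0.0324], "Label": ["Water", "Water"]})

buffer = io.StringIO()
toy.to_csv(buffer, index=False)  # index=False omits the row-index column
buffer.seek(0)

reloaded = pd.read_csv(buffer)
print(list(reloaded.columns))
```

If this option were adopted, the later `drop('Unnamed: 0', ...)` calls would have to be removed, since that column would no longer exist.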

#### Resampling 20 meter bands by cubic interpolation

In [22]:
cubic_data_10 = dart_files.get_images(paths, "cubic", "up")

files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/ ['LDPE_Blue_100.asc', 'LDPE_Green_100.asc', 'LDPE_NIR1_100.asc', 'LDPE_NIR2_100.asc', 'LDPE_NIRa_100.asc', 'LDPE_RedEdge1_100.asc', 'LDPE_RedEdge2_100.asc', 'LDPE_RedEdge3_100.asc', 'LDPE_Red_100.asc', 'LDPE_SWIR1_100.asc', 'LDPE_SWIR2_100.asc'] LDPE S0 Transparent Wet 100 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] cubic up
The  files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/40/ ['LDPE_Blue_40.asc', 'LDPE_Green_40.asc', 'LDPE_NIR1_40.asc', 'LDPE_NIR2_40.asc', 'LDPE_NIRa_40.asc', 'LDPE_RedEdge1_40.asc', 'LDPE_RedEdge2_40.asc', 'LDPE_RedEdge3_40.asc', 'LDPE_Red_40.asc', 'LDPE_SWIR1_40.asc', 'LDPE_SWIR2_40.asc'] LDPE S0 Transparent Wet 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'Re

files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/40/ ['Wet_White_PP_Blue_40.asc', 'Wet_White_PP_Green_40.asc', 'Wet_White_PP_NIR1_40.asc', 'Wet_White_PP_NIR2_40.asc', 'Wet_White_PP_NIRa_40.asc', 'Wet_White_PP_RedEdge1_40.asc', 'Wet_White_PP_RedEdge2_40.asc', 'Wet_White_PP_RedEdge3_40.asc', 'Wet_White_PP_Red_40.asc', 'Wet_White_PP_SWIR1_40.asc', 'Wet_White_PP_SWIR2_40.asc'] PP S0 White Wet 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] cubic up
The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/40/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/60/ ['Wet_White_PP_Blue_60.asc', 'Wet_White_PP_Green_60.asc', 'Wet_White_PP_NIR1_60.asc', 'Wet_White_PP_NIR2_60.asc', 'Wet_White_PP_NIRa_60.asc', 'Wet_White_PP_RedEdge1_60.asc', 'Wet_White_PP_RedEdge2_60.asc', 'Wet_White_PP_RedEdge3_60.asc', 'Wet_White_PP_Red_60.asc', 

files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/40/ ['S5_Orange_PP_Blue_40.asc', 'S5_Orange_PP_Green_40.asc', 'S5_Orange_PP_NIR1_40.asc', 'S5_Orange_PP_NIR2_40.asc', 'S5_Orange_PP_NIRa_40.asc', 'S5_Orange_PP_RedEdge1_40.asc', 'S5_Orange_PP_RedEdge2_40.asc', 'S5_Orange_PP_RedEdge3_40.asc', 'S5_Orange_PP_Red_40.asc', 'S5_Orange_PP_SWIR1_40.asc', 'S5_Orange_PP_SWIR2_40.asc'] PP S5 Orange Submerged 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] cubic up
The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/40/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/60/ ['S5_Orange_PP_Blue_60.asc', 'S5_Orange_PP_Green_60.asc', 'S5_Orange_PP_NIR1_60.asc', 'S5_Orange_PP_NIR2_60.asc', 'S5_Orange_PP_NIRa_60.asc', 'S5_Orange_PP_RedEdge1_60.asc', 'S5_Orange_PP_RedEdge2_60.asc', 'S5_Orange_PP_RedEdge3_60.asc',

files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/60/ ['Wet_Orange_PP_Blue_60.asc', 'Wet_Orange_PP_Green_60.asc', 'Wet_Orange_PP_NIR1_60.asc', 'Wet_Orange_PP_NIR2_60.asc', 'Wet_Orange_PP_NIRa_60.asc', 'Wet_Orange_PP_RedEdge1_60.asc', 'Wet_Orange_PP_RedEdge2_60.asc', 'Wet_Orange_PP_RedEdge3_60.asc', 'Wet_Orange_PP_Red_60.asc', 'Wet_Orange_PP_SWIR1_60.asc', 'Wet_Orange_PP_SWIR2_60.asc'] PP S0 Orange Wet 60 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] cubic up
The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/60/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/80/ ['Wet_Orange_PP_Blue_80.asc', 'Wet_Orange_PP_Green_80.asc', 'Wet_Orange_PP_NIR1_80.asc', 'Wet_Orange_PP_NIR2_80.asc', 'Wet_Orange_PP_NIRa_80.asc', 'Wet_Orange_PP_RedEdge1_80.asc', 'Wet_Orange_PP_RedEdge2_80.asc', 'Wet_Orange_PP_RedE

files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/60/ ['S5_Orange_PP_Blue_60.asc', 'S5_Orange_PP_Green_60.asc', 'S5_Orange_PP_NIR1_60.asc', 'S5_Orange_PP_NIR2_60.asc', 'S5_Orange_PP_NIRa_60.asc', 'S5_Orange_PP_RedEdge1_60.asc', 'S5_Orange_PP_RedEdge2_60.asc', 'S5_Orange_PP_RedEdge3_60.asc', 'S5_Orange_PP_Red_60.asc', 'S5_Orange_PP_SWIR1_60.asc', 'S5_Orange_PP_SWIR2_60.asc'] PP S5 Orange Submerged 60 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] cubic up
The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/60/  image bands will be resampled to the higher available spatial resolution (60, 120)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/80/ ['S5_Orange_PP_Blue_80.asc', 'S5_Orange_PP_Green_80.asc', 'S5_Orange_PP_NIR1_80.asc', 'S5_Orange_PP_NIR2_80.asc', 'S5_Orange_PP_NIRa_80.asc', 'S5_Orange_PP_RedEdge1_80.asc', 'S5_Orange_PP_RedEdge2_80.asc', 'S5_Orange_PP

In [23]:
cubic_data_20 = dart_files.get_images(paths, "cubic", "down")

files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/ ['LDPE_Blue_100.asc', 'LDPE_Green_100.asc', 'LDPE_NIR1_100.asc', 'LDPE_NIR2_100.asc', 'LDPE_NIRa_100.asc', 'LDPE_RedEdge1_100.asc', 'LDPE_RedEdge2_100.asc', 'LDPE_RedEdge3_100.asc', 'LDPE_Red_100.asc', 'LDPE_SWIR1_100.asc', 'LDPE_SWIR2_100.asc'] LDPE S0 Transparent Wet 100 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] cubic down
The  files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/40/ ['LDPE_Blue_40.asc', 'LDPE_Green_40.asc', 'LDPE_NIR1_40.asc', 'LDPE_NIR2_40.asc', 'LDPE_NIRa_40.asc', 'LDPE_RedEdge1_40.asc', 'LDPE_RedEdge2_40.asc', 'LDPE_RedEdge3_40.asc', 'LDPE_Red_40.asc', 'LDPE_SWIR1_40.asc', 'LDPE_SWIR2_40.asc'] LDPE S0 Transparent Wet 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'Re

files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/40/ ['Wet_White_PP_Blue_40.asc', 'Wet_White_PP_Green_40.asc', 'Wet_White_PP_NIR1_40.asc', 'Wet_White_PP_NIR2_40.asc', 'Wet_White_PP_NIRa_40.asc', 'Wet_White_PP_RedEdge1_40.asc', 'Wet_White_PP_RedEdge2_40.asc', 'Wet_White_PP_RedEdge3_40.asc', 'Wet_White_PP_Red_40.asc', 'Wet_White_PP_SWIR1_40.asc', 'Wet_White_PP_SWIR2_40.asc'] PP S0 White Wet 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] cubic down
The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/40/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/White/Wet/60/ ['Wet_White_PP_Blue_60.asc', 'Wet_White_PP_Green_60.asc', 'Wet_White_PP_NIR1_60.asc', 'Wet_White_PP_NIR2_60.asc', 'Wet_White_PP_NIRa_60.asc', 'Wet_White_PP_RedEdge1_60.asc', 'Wet_White_PP_RedEdge2_60.asc', 'Wet_White_PP_RedEdge3_60.asc', 'Wet_White_PP_Red_60.asc', 

files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/40/ ['S5_Orange_PP_Blue_40.asc', 'S5_Orange_PP_Green_40.asc', 'S5_Orange_PP_NIR1_40.asc', 'S5_Orange_PP_NIR2_40.asc', 'S5_Orange_PP_NIRa_40.asc', 'S5_Orange_PP_RedEdge1_40.asc', 'S5_Orange_PP_RedEdge2_40.asc', 'S5_Orange_PP_RedEdge3_40.asc', 'S5_Orange_PP_Red_40.asc', 'S5_Orange_PP_SWIR1_40.asc', 'S5_Orange_PP_SWIR2_40.asc'] PP S5 Orange Submerged 40 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] cubic down
The  files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/40/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/Limpa/PP/S5/Orange/Submerged/60/ ['S5_Orange_PP_Blue_60.asc', 'S5_Orange_PP_Green_60.asc', 'S5_Orange_PP_NIR1_60.asc', 'S5_Orange_PP_NIR2_60.asc', 'S5_Orange_PP_NIRa_60.asc', 'S5_Orange_PP_RedEdge1_60.asc', 'S5_Orange_PP_RedEdge2_60.asc', 'S5_Orange_PP_RedEdge3_60.asc',

files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/60/ ['Wet_Orange_PP_Blue_60.asc', 'Wet_Orange_PP_Green_60.asc', 'Wet_Orange_PP_NIR1_60.asc', 'Wet_Orange_PP_NIR2_60.asc', 'Wet_Orange_PP_NIRa_60.asc', 'Wet_Orange_PP_RedEdge1_60.asc', 'Wet_Orange_PP_RedEdge2_60.asc', 'Wet_Orange_PP_RedEdge3_60.asc', 'Wet_Orange_PP_Red_60.asc', 'Wet_Orange_PP_SWIR1_60.asc', 'Wet_Orange_PP_SWIR2_60.asc'] PP S0 Orange Wet 60 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] cubic down
The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/60/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S0/Orange/Wet/80/ ['Wet_Orange_PP_Blue_80.asc', 'Wet_Orange_PP_Green_80.asc', 'Wet_Orange_PP_NIR1_80.asc', 'Wet_Orange_PP_NIR2_80.asc', 'Wet_Orange_PP_NIRa_80.asc', 'Wet_Orange_PP_RedEdge1_80.asc', 'Wet_Orange_PP_RedEdge2_80.asc', 'Wet_Orange_PP_RedE

files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/60/ ['S5_Orange_PP_Blue_60.asc', 'S5_Orange_PP_Green_60.asc', 'S5_Orange_PP_NIR1_60.asc', 'S5_Orange_PP_NIR2_60.asc', 'S5_Orange_PP_NIRa_60.asc', 'S5_Orange_PP_RedEdge1_60.asc', 'S5_Orange_PP_RedEdge2_60.asc', 'S5_Orange_PP_RedEdge3_60.asc', 'S5_Orange_PP_Red_60.asc', 'S5_Orange_PP_SWIR1_60.asc', 'S5_Orange_PP_SWIR2_60.asc'] PP S5 Orange Submerged 60 ['Blue', 'Green', 'NIR1', 'NIR2', 'NIRa', 'RedEdge1', 'RedEdge2', 'RedEdge3', 'Red', 'SWIR1', 'SWIR2'] cubic down
The  files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/60/  image bands will be resampled to the lower available spatial resolution (30, 60)
files/dart_files/Sentinel2_Artificial/LimpaEspuma/PP/S5/Orange/Submerged/80/ ['S5_Orange_PP_Blue_80.asc', 'S5_Orange_PP_Green_80.asc', 'S5_Orange_PP_NIR1_80.asc', 'S5_Orange_PP_NIR2_80.asc', 'S5_Orange_PP_NIRa_80.asc', 'S5_Orange_PP_RedEdge1_80.asc', 'S5_Orange_PP_RedEdge2_80.asc', 'S5_Orange_PP

In [24]:
for image in cubic_data_10:
    image.setAreaLabel(0, (image.getXSize() - 1), 0, (image.getYSize() - 1), "Water")  # -1 because row/column indices start at zero, whereas the size counts from 1
    image.setGridLabel(29, 57, 4, 27, 87, 4, "Plastic")
    for pixel in image.getPixels():
        pixel.setLabel(image.getLabelsMap())
        pixel.setCoverPercent("Plastic", image.getPlasticCoverPercent())

In [25]:
for image in cubic_data_20:
    image.setAreaLabel(0, (image.getXSize() - 1), 0, (image.getYSize() - 1), "Water")  # -1 because row/column indices start at zero, whereas the size counts from 1
    image.setGridLabel(14, 28, 2, 13, 43, 2, "Plastic")
    for pixel in image.getPixels():
        pixel.setLabel(image.getLabelsMap())
        pixel.setCoverPercent("Plastic", image.getPlasticCoverPercent())
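
The labelling cells for the 10 m and 20 m images differ only in the image list and the `setGridLabel` arguments, so the loop can be wrapped in a helper. The sketch below calls the same methods as the cells above. Both `label_images` and the `_StubImage` class are hypothetical; the stub exists only so the example runs without the project's `dart_files` module.

```python
def label_images(images, grid_args):
    """Label every image as water, then mark the plastic grid given by grid_args."""
    for image in images:
        # The last valid index is size - 1 because indices start at zero.
        image.setAreaLabel(0, image.getXSize() - 1, 0, image.getYSize() - 1, "Water")
        image.setGridLabel(*grid_args, "Plastic")
        for pixel in image.getPixels():
            pixel.setLabel(image.getLabelsMap())
            pixel.setCoverPercent("Plastic", image.getPlasticCoverPercent())


class _StubImage:
    """Hypothetical stand-in that only records calls; not the real image class."""

    def __init__(self, x_size, y_size):
        self.x_size, self.y_size, self.calls = x_size, y_size, []

    def getXSize(self):
        return self.x_size

    def getYSize(self):
        return self.y_size

    def setAreaLabel(self, *args):
        self.calls.append(("area", args))

    def setGridLabel(self, *args):
        self.calls.append(("grid", args))

    def getPixels(self):
        return []  # the stub holds no pixels

    def getLabelsMap(self):
        return {}

    def getPlasticCoverPercent(self):
        return 0


stub = _StubImage(60, 120)
label_images([stub], (29, 57, 4, 27, 87, 4))
print(stub.calls)
```

With the real objects, this would presumably be called as `label_images(cubic_data_10, (29, 57, 4, 27, 87, 4))` and `label_images(cubic_data_20, (14, 28, 2, 13, 43, 2))`.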

In [26]:
dart_cubic_10 = dart_files.build_dataset(cubic_data_10)

In [27]:
dart_cubic_20 = dart_files.build_dataset(cubic_data_20)

In [28]:
dart_cubic_10['Submergence'] = ['0' if value == 'S0' else '2 cm' if value == 'S2' else '5 cm' for value in dart_cubic_10['Submergence']]

dart_cubic_10

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,RedEdge3,Red,SWIR1,SWIR2,Polymer,Cover_percent,Submergence,Color,Status,Label
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
403195,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,115,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
403196,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,116,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
403197,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,117,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
403198,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,118,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water


In [29]:
dart_cubic_20['Submergence'] = ['0' if value == 'S0' else '2 cm' if value == 'S2' else '5 cm' for value in dart_cubic_20['Submergence']]

dart_cubic_20

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,RedEdge3,Red,SWIR1,SWIR2,Polymer,Cover_percent,Submergence,Color,Status,Label
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,LDPE,0,0,Transparent,Wet,Water
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
100795,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,55,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
100796,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,56,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
100797,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,57,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water
100798,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,58,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,0.0199,0.0245,0.0186,0.0163,PP,0,5 cm,White,Submerged,Water


In [30]:
dart_cubic_10.to_csv(str(input("Path/filename for dart_cubic_10 dataset: ")))
#For example: files/csv_files/dataset_dart_cubic_10.csv

dart_cubic_20.to_csv(str(input("Path/filename for dart_cubic_20 dataset: ")))
#For example: files/csv_files/dataset_dart_cubic_20.csv

### Building data frames

#### Building data frames

##### Default dataset (Bilinear interpolation) 

Adding Portuguese label translations - 10 m resolution dataset

In [3]:
dataset_bilinear_10 = pd.read_csv(str(input("Path/filename: "))) #For example: files/csv_files/dataset_dart_bilinear_10.csv
dataset_bilinear_10.drop('Unnamed: 0', axis=1, inplace=True)

dart_label = []
for i in range(len(dataset_bilinear_10)):
    if dataset_bilinear_10.at[i, 'Label'] == 'Water':
        dart_label.append('Água')
    elif dataset_bilinear_10.at[i, 'Label'] == 'Sand':
        dart_label.append('Areia')
    elif dataset_bilinear_10.at[i, 'Label'] == 'Plastic':
        dart_label.append('Plástico')

dataset_bilinear_10['Classe'] = dart_label

Path/filename: files/csv_files/dataset_dart_bilinear_10.csv



Columns (17) have mixed types. Specify dtype option on import or set low_memory=False.
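
The warning above refers to column 17 of the CSV file, which by position appears to be `Submergence`: it holds both `0` and strings such as `5 cm`, so pandas may infer different types for different chunks of the file. One way to avoid this is to declare the column type when reading. The following is a minimal sketch that uses an in-memory CSV as a stand-in for the real file.

```python
import io
import pandas as pd

# Toy CSV mimicking the mixed 'Submergence' values ('0' alongside '5 cm').
csv_text = "Submergence,Label\n0,Water\n5 cm,Water\n2 cm,Plastic\n"

# dtype pins the column to strings, so every value is parsed the same way.
toy = pd.read_csv(io.StringIO(csv_text), dtype={"Submergence": str})
print(toy["Submergence"].tolist())
```

Passing `low_memory=False`, as the warning suggests, is the other documented option; pinning the dtype has the advantage of making the intended type explicit.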



Adding radiometric indices

In [4]:
dataset_bilinear_10['NDWI'] = (dataset_bilinear_10['Green'] - dataset_bilinear_10['NIR1']) / (dataset_bilinear_10['Green'] + dataset_bilinear_10['NIR1'])
dataset_bilinear_10['WRI'] = (dataset_bilinear_10['Green'] + dataset_bilinear_10['Red']) / (dataset_bilinear_10['NIR1'] + dataset_bilinear_10['SWIR2'])
dataset_bilinear_10['NDVI'] = (dataset_bilinear_10['NIR1'] - dataset_bilinear_10['Red']) / (dataset_bilinear_10['NIR1'] + dataset_bilinear_10['Red'])
dataset_bilinear_10['AWEI'] = 4 * (dataset_bilinear_10['Green'] - dataset_bilinear_10['SWIR2']) - (0.25 * dataset_bilinear_10['NIR1'] + 2.75 * dataset_bilinear_10['SWIR1'])
dataset_bilinear_10['MNDWI'] = (dataset_bilinear_10['Green'] - dataset_bilinear_10['SWIR2']) / (dataset_bilinear_10['Green'] + dataset_bilinear_10['SWIR2'])
dataset_bilinear_10['SR'] = dataset_bilinear_10['NIR1'] / dataset_bilinear_10['Red']
dataset_bilinear_10['PI'] = dataset_bilinear_10['NIR1'] / (dataset_bilinear_10['NIR1'] + dataset_bilinear_10['Red'])
dataset_bilinear_10['RNDVI'] = (dataset_bilinear_10['Red'] - dataset_bilinear_10['NIR1']) / (dataset_bilinear_10['Red'] + dataset_bilinear_10['NIR1'])
dataset_bilinear_10['FDI'] = dataset_bilinear_10['NIR1'] - (dataset_bilinear_10['RedEdge2'] + (dataset_bilinear_10['SWIR1'] - dataset_bilinear_10['RedEdge2']) * ((dataset_bilinear_10['NIR1'] - dataset_bilinear_10['Red']) / (dataset_bilinear_10['SWIR1'] - dataset_bilinear_10['Red'])) * 10)

dataset_bilinear_10

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,...,Classe,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
403195,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,115,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403196,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,116,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403197,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,117,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403198,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,118,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
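
Since the same nine formulas are applied to each dataset, they can be collected in one function. The sketch below mirrors the expressions of the cell above term by term, including the reflectance-based ratio in the FDI term. The function name `add_radiometric_indices` and the one-row example are illustrative assumptions.

```python
import pandas as pd


def add_radiometric_indices(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the nine index columns used in this notebook."""
    out = df.copy()
    green, red, nir = out["Green"], out["Red"], out["NIR1"]
    re2, swir1, swir2 = out["RedEdge2"], out["SWIR1"], out["SWIR2"]
    out["NDWI"] = (green - nir) / (green + nir)
    out["WRI"] = (green + red) / (nir + swir2)
    out["NDVI"] = (nir - red) / (nir + red)
    out["AWEI"] = 4 * (green - swir2) - (0.25 * nir + 2.75 * swir1)
    out["MNDWI"] = (green - swir2) / (green + swir2)
    out["SR"] = nir / red
    out["PI"] = nir / (nir + red)
    out["RNDVI"] = (red - nir) / (red + nir)
    out["FDI"] = nir - (re2 + (swir1 - re2) * ((nir - red) / (swir1 - red)) * 10)
    return out


# One water pixel with the band values shown in the table above.
pixel = pd.DataFrame([{"Green": 0.0386, "Red": 0.0245, "NIR1": 0.0198,
                       "RedEdge2": 0.0208, "SWIR1": 0.0186, "SWIR2": 0.0163}])
result = add_radiometric_indices(pixel)
print(result[["NDWI", "NDVI", "FDI"]].round(6))
```

Calling the function once per dataset, for example on `dataset_bilinear_10` and `dataset_bilinear_20`, would keep the formulas in a single place.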


Data cleaning

In [5]:
# Expressions checked for zero values; all but the AWEI term are denominators of an index.
dividers_dart = {
    "Green + NIR1": dataset_bilinear_10['Green'] + dataset_bilinear_10['NIR1'],
    "SWIR2 + NIR1": dataset_bilinear_10['NIR1'] + dataset_bilinear_10['SWIR2'],
    "Red + NIR1": dataset_bilinear_10['NIR1'] + dataset_bilinear_10['Red'],
    "0.25 * NIR1 + 2.75 * SWIR1": 0.25 * dataset_bilinear_10['NIR1'] + 2.75 * dataset_bilinear_10['SWIR1'],
    "SWIR2 + Green": dataset_bilinear_10['Green'] + dataset_bilinear_10['SWIR2'],
    "Red": dataset_bilinear_10['Red'],
    "SWIR1 - Red": dataset_bilinear_10['SWIR1'] - dataset_bilinear_10['Red'],
}

# Count how many samples evaluate to zero in each expression.
zeros_dart = {key: int((values == 0).sum()) for key, values in dividers_dart.items()}
zeros_dart

{'Green + NIR1': 0,
 'SWIR2 + NIR1': 0,
 'Red + NIR1': 0,
 '0.25 * NIR1 + 2.75 * SWIR1': 0,
 'SWIR2 + Green': 0,
 'Red': 0,
 'SWIR1 - Red': 0}

No divisor is zero in any sample, so no samples need to be discarded.
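
The check above matters because pandas does not raise an error when a Series is divided by zero. The following minimal sketch, using made-up numbers rather than the DART reflectances, shows what ends up in the result instead:

```python
import numpy as np
import pandas as pd

# Hypothetical numerators and denominators (not taken from the datasets).
numerator = pd.Series([0.10, 0.00, -0.10, 0.10])
denominator = pd.Series([0.00, 0.00, 0.00, 0.20])

# No exception is raised: x/0 becomes +inf or -inf, and 0/0 becomes NaN.
ratio = numerator / denominator
print(ratio.tolist())

# Number of entries that are not finite numbers.
n_bad = int((~np.isfinite(ratio)).sum())
print(n_bad)
```

Any index built from such a ratio silently carries these values forward, which is why the divisors are inspected before the indices are used.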

Adding label translations - 20 m resolution dataset

In [6]:
dataset_bilinear_20 = pd.read_csv(str(input("Path/filename: "))) #For example: files/csv_files/dataset_dart_bilinear_20.csv
dataset_bilinear_20.drop('Unnamed: 0', axis=1, inplace=True)

# Translate the English class labels to Portuguese.
label_translation = {'Water': 'Água', 'Sand': 'Areia', 'Plastic': 'Plástico'}
dataset_bilinear_20['Classe'] = dataset_bilinear_20['Label'].map(label_translation)

Path/filename: files/csv_files/dataset_dart_bilinear_20.csv
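
If the `Unnamed: 0` column is simply the row index that was saved along with the CSV (an assumption; the files themselves are not shown here), it can be consumed at load time instead of being dropped afterwards. A minimal sketch with an in-memory stand-in for the file:

```python
import io
import pandas as pd

# Stand-in for a CSV whose first, unnamed column holds the saved row index.
csv_text = ",Label,Green\n0,Water,0.0386\n1,Sand,0.2100\n"

# index_col=0 turns the first column into the index, so no 'Unnamed: 0' column appears.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(df.columns.tolist())
```

With a real file, the `io.StringIO(...)` argument would be replaced by the path typed at the prompt.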


Adding radiometric indices

In [7]:
dataset_bilinear_20['NDWI'] = (dataset_bilinear_20['Green'] - dataset_bilinear_20['NIR1']) / (dataset_bilinear_20['Green'] + dataset_bilinear_20['NIR1'])
dataset_bilinear_20['WRI'] = (dataset_bilinear_20['Green'] + dataset_bilinear_20['Red']) / (dataset_bilinear_20['NIR1'] + dataset_bilinear_20['SWIR2'])
dataset_bilinear_20['NDVI'] = (dataset_bilinear_20['NIR1'] - dataset_bilinear_20['Red']) / (dataset_bilinear_20['NIR1'] + dataset_bilinear_20['Red'])
dataset_bilinear_20['AWEI'] = 4 * (dataset_bilinear_20['Green'] - dataset_bilinear_20['SWIR2']) - (0.25 * dataset_bilinear_20['NIR1'] + 2.75 * dataset_bilinear_20['SWIR1'])
dataset_bilinear_20['MNDWI'] = (dataset_bilinear_20['Green'] - dataset_bilinear_20['SWIR2']) / (dataset_bilinear_20['Green'] + dataset_bilinear_20['SWIR2'])
dataset_bilinear_20['SR'] = dataset_bilinear_20['NIR1'] / dataset_bilinear_20['Red']
dataset_bilinear_20['PI'] = dataset_bilinear_20['NIR1'] / (dataset_bilinear_20['NIR1'] + dataset_bilinear_20['Red'])
dataset_bilinear_20['RNDVI'] = (dataset_bilinear_20['Red'] - dataset_bilinear_20['NIR1']) / (dataset_bilinear_20['Red'] + dataset_bilinear_20['NIR1'])
dataset_bilinear_20['FDI'] = dataset_bilinear_20['NIR1'] - (dataset_bilinear_20['RedEdge2'] + (dataset_bilinear_20['SWIR1'] - dataset_bilinear_20['RedEdge2']) * ((dataset_bilinear_20['NIR1'] - dataset_bilinear_20['Red']) / (dataset_bilinear_20['SWIR1'] - dataset_bilinear_20['Red'])) * 10)

dataset_bilinear_20

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,...,Classe,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
100795,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,55,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
100796,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,56,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
100797,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,57,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
100798,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,58,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
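
The same nine formulas are applied to every dataset in this notebook. As a sketch only, they could be gathered in one function so that each dataset goes through identical code. The function name `add_radiometric_indices` and the toy values are hypothetical; the formulas are copied from the cell above.

```python
import pandas as pd


def add_radiometric_indices(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the nine index columns computed in this notebook."""
    out = df.copy()
    green, red, nir = out["Green"], out["Red"], out["NIR1"]
    re2, swir1, swir2 = out["RedEdge2"], out["SWIR1"], out["SWIR2"]

    out["NDWI"] = (green - nir) / (green + nir)
    out["WRI"] = (green + red) / (nir + swir2)
    out["NDVI"] = (nir - red) / (nir + red)
    out["AWEI"] = 4 * (green - swir2) - (0.25 * nir + 2.75 * swir1)
    out["MNDWI"] = (green - swir2) / (green + swir2)
    out["SR"] = nir / red
    out["PI"] = nir / (nir + red)
    out["RNDVI"] = (red - nir) / (red + nir)
    # FDI exactly as written in the notebook cell.
    out["FDI"] = nir - (re2 + (swir1 - re2) * ((nir - red) / (swir1 - red)) * 10)
    return out


# Toy single-pixel input (made-up reflectances).
toy = pd.DataFrame({
    "Green": [0.4], "Red": [0.2], "NIR1": [0.3],
    "RedEdge2": [0.25], "SWIR1": [0.1], "SWIR2": [0.05],
})
result = add_radiometric_indices(toy)
print(result[["NDVI", "SR", "PI", "FDI"]])
```

Returning a copy leaves the input untouched, so the raw band values remain available for comparison.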


Data cleaning

In [8]:
# Candidate divisors checked for zeros before trusting the ratio-based indices.
dividers_dart = {
    "Green + NIR1": dataset_bilinear_20['Green'] + dataset_bilinear_20['NIR1'],
    "SWIR2 + NIR1": dataset_bilinear_20['NIR1'] + dataset_bilinear_20['SWIR2'],
    "Red + NIR1": dataset_bilinear_20['NIR1'] + dataset_bilinear_20['Red'],
    "0.25 * NIR1 + 2.75 * SWIR1": 0.25 * dataset_bilinear_20['NIR1'] + 2.75 * dataset_bilinear_20['SWIR1'],
    "SWIR2 + Green": dataset_bilinear_20['Green'] + dataset_bilinear_20['SWIR2'],
    "Red": dataset_bilinear_20['Red'],
    "SWIR1 - Red": dataset_bilinear_20['SWIR1'] - dataset_bilinear_20['Red'],
}

# Count the zero-valued entries of each divisor.
zeros_dart = {key: int((values == 0).sum()) for key, values in dividers_dart.items()}
zeros_dart

{'Green + NIR1': 0,
 'SWIR2 + NIR1': 0,
 'Red + NIR1': 0,
 '0.25 * NIR1 + 2.75 * SWIR1': 0,
 'SWIR2 + Green': 0,
 'Red': 0,
 'SWIR1 - Red': 0}

No divisor is zero in any sample, so no samples need to be discarded.

##### Other datasets (Nearest neighbor and Cubic interpolation) 

Adding label translations and radiometric indices, and cleaning data - 10 m resolution datasets

In [9]:
dataset_dart_cubic_10 = pd.read_csv(str(input("Path/filename: "))) #For example: files/csv_files/dataset_dart_cubic_10.csv
dataset_dart_cubic_10.drop('Unnamed: 0', axis=1, inplace=True)

# Translate the English class labels to Portuguese.
label_translation = {'Water': 'Água', 'Sand': 'Areia', 'Plastic': 'Plástico'}
dataset_dart_cubic_10['Classe'] = dataset_dart_cubic_10['Label'].map(label_translation)

Path/filename: files/csv_files/dataset_dart_cubic_10.csv



Columns (17) have mixed types. Specify dtype option on import or set low_memory=False.
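
This message is the pandas `DtypeWarning`: while reading the file in chunks, pandas inferred more than one type for the column at position 17. The sketch below shows the two remedies the message suggests, on a small in-memory CSV. Which column sits at position 17 is not visible here, so the use of `Label` in the second option is purely illustrative.

```python
import io
import pandas as pd

# Tiny stand-in for one of the dataset CSV files.
csv_text = "Label,Green\nWater,0.0386\nSand,0.2100\n"

# Option 1: read the whole file before inferring the column types.
df_a = pd.read_csv(io.StringIO(csv_text), low_memory=False)

# Option 2: state the type of the ambiguous column explicitly
# ("Label" is a guess used for illustration only).
df_b = pd.read_csv(io.StringIO(csv_text), dtype={"Label": str})

print(df_a.dtypes)
print(df_b.dtypes)
```

Either option removes the warning; the explicit `dtype` additionally documents what the column is expected to contain.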



In [10]:
dataset_dart_cubic_10['NDWI'] = (dataset_dart_cubic_10['Green'] - dataset_dart_cubic_10['NIR1']) / (dataset_dart_cubic_10['Green'] + dataset_dart_cubic_10['NIR1'])
dataset_dart_cubic_10['WRI'] = (dataset_dart_cubic_10['Green'] + dataset_dart_cubic_10['Red']) / (dataset_dart_cubic_10['NIR1'] + dataset_dart_cubic_10['SWIR2'])
dataset_dart_cubic_10['NDVI'] = (dataset_dart_cubic_10['NIR1'] - dataset_dart_cubic_10['Red']) / (dataset_dart_cubic_10['NIR1'] + dataset_dart_cubic_10['Red'])
dataset_dart_cubic_10['AWEI'] = 4 * (dataset_dart_cubic_10['Green'] - dataset_dart_cubic_10['SWIR2']) - (0.25 * dataset_dart_cubic_10['NIR1'] + 2.75 * dataset_dart_cubic_10['SWIR1'])
dataset_dart_cubic_10['MNDWI'] = (dataset_dart_cubic_10['Green'] - dataset_dart_cubic_10['SWIR2']) / (dataset_dart_cubic_10['Green'] + dataset_dart_cubic_10['SWIR2'])
dataset_dart_cubic_10['SR'] = dataset_dart_cubic_10['NIR1'] / dataset_dart_cubic_10['Red']
dataset_dart_cubic_10['PI'] = dataset_dart_cubic_10['NIR1'] / (dataset_dart_cubic_10['NIR1'] + dataset_dart_cubic_10['Red'])
dataset_dart_cubic_10['RNDVI'] = (dataset_dart_cubic_10['Red'] - dataset_dart_cubic_10['NIR1']) / (dataset_dart_cubic_10['Red'] + dataset_dart_cubic_10['NIR1'])
dataset_dart_cubic_10['FDI'] = dataset_dart_cubic_10['NIR1'] - (dataset_dart_cubic_10['RedEdge2'] + (dataset_dart_cubic_10['SWIR1'] - dataset_dart_cubic_10['RedEdge2']) * ((dataset_dart_cubic_10['NIR1'] - dataset_dart_cubic_10['Red']) / (dataset_dart_cubic_10['SWIR1'] - dataset_dart_cubic_10['Red'])) * 10)

dataset_dart_cubic_10

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,...,Classe,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
403195,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,115,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403196,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,116,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403197,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,117,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403198,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,118,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525


In [11]:
# Candidate divisors checked for zeros before trusting the ratio-based indices.
dividers_dart_cubic = {
    "Green + NIR1": dataset_dart_cubic_10['Green'] + dataset_dart_cubic_10['NIR1'],
    "SWIR2 + NIR1": dataset_dart_cubic_10['NIR1'] + dataset_dart_cubic_10['SWIR2'],
    "Red + NIR1": dataset_dart_cubic_10['NIR1'] + dataset_dart_cubic_10['Red'],
    "0.25 * NIR1 + 2.75 * SWIR1": 0.25 * dataset_dart_cubic_10['NIR1'] + 2.75 * dataset_dart_cubic_10['SWIR1'],
    "SWIR2 + Green": dataset_dart_cubic_10['Green'] + dataset_dart_cubic_10['SWIR2'],
    "Red": dataset_dart_cubic_10['Red'],
    "SWIR1 - Red": dataset_dart_cubic_10['SWIR1'] - dataset_dart_cubic_10['Red'],
}

# Count the zero-valued entries of each divisor.
zeros_dart_cubic = {key: int((values == 0).sum()) for key, values in dividers_dart_cubic.items()}
zeros_dart_cubic

{'Green + NIR1': 0,
 'SWIR2 + NIR1': 0,
 'Red + NIR1': 0,
 '0.25 * NIR1 + 2.75 * SWIR1': 0,
 'SWIR2 + Green': 0,
 'Red': 0,
 'SWIR1 - Red': 8}

In [12]:
indexes = dataset_dart_cubic_10.query('FDI < -1000').index  # drop samples whose FDI is -inf because of a division by zero
dataset_dart_cubic_10.drop(indexes,  axis=0, inplace=True)
dataset_dart_cubic_10

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,...,Classe,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
403195,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,115,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403196,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,116,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403197,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,117,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403198,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,118,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
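
The threshold `FDI < -1000` removes rows where FDI is `-inf`. A division by zero can in principle also yield `+inf` or `NaN`, which that threshold would keep; whether such rows occur in these datasets is not shown here. The sketch below compares the threshold with a finiteness test on made-up values.

```python
import numpy as np
import pandas as pd

# Made-up FDI values covering every non-finite case.
toy = pd.DataFrame({"FDI": [0.016525, -np.inf, np.inf, np.nan, -0.2]})

# The notebook's filter: only very negative values (including -inf) are removed.
kept_by_threshold = toy.drop(toy.query("FDI < -1000").index)

# Alternative: keep only the rows whose FDI is a finite number.
kept_finite = toy[np.isfinite(toy["FDI"])]

print(len(kept_by_threshold), len(kept_finite))
```

If the data only ever contain `-inf`, both filters keep the same rows.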


In [13]:
dataset_dart_nn_10 = pd.read_csv(str(input("Path/filename: ")))  # For example: files/csv_files/dataset_dart_nn_10.csv
dataset_dart_nn_10.drop('Unnamed: 0', axis=1, inplace=True)

# Translate the English class labels to Portuguese.
label_translation = {'Water': 'Água', 'Sand': 'Areia', 'Plastic': 'Plástico'}
dataset_dart_nn_10['Classe'] = dataset_dart_nn_10['Label'].map(label_translation)

Path/filename: files/csv_files/dataset_dart_nn_10.csv



Columns (17) have mixed types. Specify dtype option on import or set low_memory=False.



In [14]:
dataset_dart_nn_10['NDWI'] = (dataset_dart_nn_10['Green'] - dataset_dart_nn_10['NIR1']) / (dataset_dart_nn_10['Green'] + dataset_dart_nn_10['NIR1'])
dataset_dart_nn_10['WRI'] = (dataset_dart_nn_10['Green'] + dataset_dart_nn_10['Red']) / (dataset_dart_nn_10['NIR1'] + dataset_dart_nn_10['SWIR2'])
dataset_dart_nn_10['NDVI'] = (dataset_dart_nn_10['NIR1'] - dataset_dart_nn_10['Red']) / (dataset_dart_nn_10['NIR1'] + dataset_dart_nn_10['Red'])
dataset_dart_nn_10['AWEI'] = 4 * (dataset_dart_nn_10['Green'] - dataset_dart_nn_10['SWIR2']) - (0.25 * dataset_dart_nn_10['NIR1'] + 2.75 * dataset_dart_nn_10['SWIR1'])
dataset_dart_nn_10['MNDWI'] = (dataset_dart_nn_10['Green'] - dataset_dart_nn_10['SWIR2']) / (dataset_dart_nn_10['Green'] + dataset_dart_nn_10['SWIR2'])
dataset_dart_nn_10['SR'] = dataset_dart_nn_10['NIR1'] / dataset_dart_nn_10['Red']
dataset_dart_nn_10['PI'] = dataset_dart_nn_10['NIR1'] / (dataset_dart_nn_10['NIR1'] + dataset_dart_nn_10['Red'])
dataset_dart_nn_10['RNDVI'] = (dataset_dart_nn_10['Red'] - dataset_dart_nn_10['NIR1']) / (dataset_dart_nn_10['Red'] + dataset_dart_nn_10['NIR1'])
dataset_dart_nn_10['FDI'] = dataset_dart_nn_10['NIR1'] - (dataset_dart_nn_10['RedEdge2'] + (dataset_dart_nn_10['SWIR1'] - dataset_dart_nn_10['RedEdge2']) * ((dataset_dart_nn_10['NIR1'] - dataset_dart_nn_10['Red']) / (dataset_dart_nn_10['SWIR1'] - dataset_dart_nn_10['Red'])) * 10)

dataset_dart_nn_10

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,...,Classe,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
403195,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,115,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403196,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,116,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403197,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,117,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403198,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,118,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525


In [15]:
# Candidate divisors checked for zeros before trusting the ratio-based indices.
dividers_dart_nn = {
    "Green + NIR1": dataset_dart_nn_10['Green'] + dataset_dart_nn_10['NIR1'],
    "SWIR2 + NIR1": dataset_dart_nn_10['NIR1'] + dataset_dart_nn_10['SWIR2'],
    "Red + NIR1": dataset_dart_nn_10['NIR1'] + dataset_dart_nn_10['Red'],
    "0.25 * NIR1 + 2.75 * SWIR1": 0.25 * dataset_dart_nn_10['NIR1'] + 2.75 * dataset_dart_nn_10['SWIR1'],
    "SWIR2 + Green": dataset_dart_nn_10['Green'] + dataset_dart_nn_10['SWIR2'],
    "Red": dataset_dart_nn_10['Red'],
    "SWIR1 - Red": dataset_dart_nn_10['SWIR1'] - dataset_dart_nn_10['Red'],
}

# Count the zero-valued entries of each divisor.
zeros_dart_nn = {key: int((values == 0).sum()) for key, values in dividers_dart_nn.items()}
zeros_dart_nn

{'Green + NIR1': 0,
 'SWIR2 + NIR1': 0,
 'Red + NIR1': 0,
 '0.25 * NIR1 + 2.75 * SWIR1': 0,
 'SWIR2 + Green': 0,
 'Red': 0,
 'SWIR1 - Red': 128}

In [16]:
indexes = dataset_dart_nn_10.query('FDI < -1000').index  # drop samples whose FDI is -inf because of a division by zero
dataset_dart_nn_10.drop(indexes,  axis=0, inplace=True)
dataset_dart_nn_10

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,...,Classe,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
403195,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,115,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403196,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,116,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403197,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,117,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403198,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,118,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525


Adding label translations and radiometric indices, and cleaning data - 20 m resolution datasets

In [17]:
dataset_dart_cubic_20 = pd.read_csv(str(input("Path/filename: "))) #For example: files/csv_files/dataset_dart_cubic_20.csv
dataset_dart_cubic_20.drop('Unnamed: 0', axis=1, inplace=True)

# Translate the English class labels to Portuguese.
label_translation = {'Water': 'Água', 'Sand': 'Areia', 'Plastic': 'Plástico'}
dataset_dart_cubic_20['Classe'] = dataset_dart_cubic_20['Label'].map(label_translation)

Path/filename: files/csv_files/dataset_dart_cubic_20.csv


In [18]:
dataset_dart_cubic_20['NDWI'] = (dataset_dart_cubic_20['Green'] - dataset_dart_cubic_20['NIR1']) / (dataset_dart_cubic_20['Green'] + dataset_dart_cubic_20['NIR1'])
dataset_dart_cubic_20['WRI'] = (dataset_dart_cubic_20['Green'] + dataset_dart_cubic_20['Red']) / (dataset_dart_cubic_20['NIR1'] + dataset_dart_cubic_20['SWIR2'])
dataset_dart_cubic_20['NDVI'] = (dataset_dart_cubic_20['NIR1'] - dataset_dart_cubic_20['Red']) / (dataset_dart_cubic_20['NIR1'] + dataset_dart_cubic_20['Red'])
dataset_dart_cubic_20['AWEI'] = 4 * (dataset_dart_cubic_20['Green'] - dataset_dart_cubic_20['SWIR2']) - (0.25 * dataset_dart_cubic_20['NIR1'] + 2.75 * dataset_dart_cubic_20['SWIR1'])
dataset_dart_cubic_20['MNDWI'] = (dataset_dart_cubic_20['Green'] - dataset_dart_cubic_20['SWIR2']) / (dataset_dart_cubic_20['Green'] + dataset_dart_cubic_20['SWIR2'])
dataset_dart_cubic_20['SR'] = dataset_dart_cubic_20['NIR1'] / dataset_dart_cubic_20['Red']
dataset_dart_cubic_20['PI'] = dataset_dart_cubic_20['NIR1'] / (dataset_dart_cubic_20['NIR1'] + dataset_dart_cubic_20['Red'])
dataset_dart_cubic_20['RNDVI'] = (dataset_dart_cubic_20['Red'] - dataset_dart_cubic_20['NIR1']) / (dataset_dart_cubic_20['Red'] + dataset_dart_cubic_20['NIR1'])
dataset_dart_cubic_20['FDI'] = dataset_dart_cubic_20['NIR1'] - (dataset_dart_cubic_20['RedEdge2'] + (dataset_dart_cubic_20['SWIR1'] - dataset_dart_cubic_20['RedEdge2']) * ((dataset_dart_cubic_20['NIR1'] - dataset_dart_cubic_20['Red']) / (dataset_dart_cubic_20['SWIR1'] - dataset_dart_cubic_20['Red'])) * 10)

dataset_dart_cubic_20

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,...,Classe,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
100795,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,55,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
100796,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,56,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
100797,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,57,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
100798,files/dart_files/Sentinel2_Artificial/LimpaEsp...,29,58,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525


In [19]:
# Candidate divisors checked for zeros before trusting the ratio-based indices.
dividers_dart_cubic = {
    "Green + NIR1": dataset_dart_cubic_20['Green'] + dataset_dart_cubic_20['NIR1'],
    "SWIR2 + NIR1": dataset_dart_cubic_20['NIR1'] + dataset_dart_cubic_20['SWIR2'],
    "Red + NIR1": dataset_dart_cubic_20['NIR1'] + dataset_dart_cubic_20['Red'],
    "0.25 * NIR1 + 2.75 * SWIR1": 0.25 * dataset_dart_cubic_20['NIR1'] + 2.75 * dataset_dart_cubic_20['SWIR1'],
    "SWIR2 + Green": dataset_dart_cubic_20['Green'] + dataset_dart_cubic_20['SWIR2'],
    "Red": dataset_dart_cubic_20['Red'],
    "SWIR1 - Red": dataset_dart_cubic_20['SWIR1'] - dataset_dart_cubic_20['Red'],
}

# Count the zero-valued entries of each divisor.
zeros_dart_cubic = {key: int((values == 0).sum()) for key, values in dividers_dart_cubic.items()}
zeros_dart_cubic

{'Green + NIR1': 0,
 'SWIR2 + NIR1': 0,
 'Red + NIR1': 0,
 '0.25 * NIR1 + 2.75 * SWIR1': 0,
 'SWIR2 + Green': 0,
 'Red': 0,
 'SWIR1 - Red': 0}
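
Because each dataset goes through the same steps (label translation, index computation, cleaning), one way to keep the steps in sync is to apply a single function to a dictionary of DataFrames. The sketch below is illustrative only: `prepare`, the dictionary keys, and the toy values are hypothetical, and a single index (NDVI) stands in for the full set.

```python
import numpy as np
import pandas as pd

LABELS_PT = {"Water": "Água", "Sand": "Areia", "Plastic": "Plástico"}


def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Translate labels, add one example index, and drop non-finite rows."""
    out = df.copy()
    out["Classe"] = out["Label"].map(LABELS_PT)
    out["NDVI"] = (out["NIR1"] - out["Red"]) / (out["NIR1"] + out["Red"])
    return out[np.isfinite(out["NDVI"])]


# Toy stand-ins for two of the datasets (made-up values).
raw = {
    "cubic_20": pd.DataFrame({"Label": ["Water", "Plastic"],
                              "Red": [0.02, 0.0], "NIR1": [0.03, 0.0]}),
    "nn_20": pd.DataFrame({"Label": ["Sand"], "Red": [0.2], "NIR1": [0.3]}),
}

# Every dataset passes through exactly the same code path.
prepared = {name: prepare(df) for name, df in raw.items()}
print({name: len(df) for name, df in prepared.items()})
```

Keying the results by name means no cell has to spell out a dataset variable by hand, which reduces the room for copy-and-paste slips.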

In [20]:
dataset_dart_nn_20 = pd.read_csv(str(input("Path/filename: ")))  # For example: files/csv_files/dataset_dart_nn_20.csv
dataset_dart_nn_20.drop('Unnamed: 0', axis=1, inplace=True)

# Translate the English class labels to Portuguese.
label_translation = {'Water': 'Água', 'Sand': 'Areia', 'Plastic': 'Plástico'}
dataset_dart_nn_20['Classe'] = dataset_dart_nn_20['Label'].map(label_translation)

Path/filename: files/csv_files/dataset_dart_nn_20.csv


In [21]:
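# As written, this cell recomputes the indices for dataset_dart_nn_10; dataset_dart_nn_20, loaded in the previous cell, is not processed here.
dataset_dart_nn_10['NDWI'] = (dataset_dart_nn_10['Green'] - dataset_dart_nn_10['NIR1']) / (dataset_dart_nn_10['Green'] + dataset_dart_nn_10['NIR1'])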
dataset_dart_nn_10['WRI'] = (dataset_dart_nn_10['Green'] + dataset_dart_nn_10['Red']) / (dataset_dart_nn_10['NIR1'] + dataset_dart_nn_10['SWIR2'])
dataset_dart_nn_10['NDVI'] = (dataset_dart_nn_10['NIR1'] - dataset_dart_nn_10['Red']) / (dataset_dart_nn_10['NIR1'] + dataset_dart_nn_10['Red'])
dataset_dart_nn_10['AWEI'] = 4 * (dataset_dart_nn_10['Green'] - dataset_dart_nn_10['SWIR2']) - (0.25 * dataset_dart_nn_10['NIR1'] + 2.75 * dataset_dart_nn_10['SWIR1'])
dataset_dart_nn_10['MNDWI'] = (dataset_dart_nn_10['Green'] - dataset_dart_nn_10['SWIR2']) / (dataset_dart_nn_10['Green'] + dataset_dart_nn_10['SWIR2'])
dataset_dart_nn_10['SR'] = dataset_dart_nn_10['NIR1'] / dataset_dart_nn_10['Red']
dataset_dart_nn_10['PI'] = dataset_dart_nn_10['NIR1'] / (dataset_dart_nn_10['NIR1'] + dataset_dart_nn_10['Red'])
dataset_dart_nn_10['RNDVI'] = (dataset_dart_nn_10['Red'] - dataset_dart_nn_10['NIR1']) / (dataset_dart_nn_10['Red'] + dataset_dart_nn_10['NIR1'])
dataset_dart_nn_10['FDI'] = dataset_dart_nn_10['NIR1'] - (dataset_dart_nn_10['RedEdge2'] + (dataset_dart_nn_10['SWIR1'] - dataset_dart_nn_10['RedEdge2']) * ((dataset_dart_nn_10['NIR1'] - dataset_dart_nn_10['Red']) / (dataset_dart_nn_10['SWIR1'] - dataset_dart_nn_10['Red'])) * 10)

dataset_dart_nn_10

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,...,Classe,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
403195,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,115,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403196,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,116,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403197,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,117,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403198,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,118,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525


In [22]:
# Candidate divisors checked for zeros before trusting the ratio-based indices.
dividers_dart_nn = {
    "Green + NIR1": dataset_dart_nn_10['Green'] + dataset_dart_nn_10['NIR1'],
    "SWIR2 + NIR1": dataset_dart_nn_10['NIR1'] + dataset_dart_nn_10['SWIR2'],
    "Red + NIR1": dataset_dart_nn_10['NIR1'] + dataset_dart_nn_10['Red'],
    "0.25 * NIR1 + 2.75 * SWIR1": 0.25 * dataset_dart_nn_10['NIR1'] + 2.75 * dataset_dart_nn_10['SWIR1'],
    "SWIR2 + Green": dataset_dart_nn_10['Green'] + dataset_dart_nn_10['SWIR2'],
    "Red": dataset_dart_nn_10['Red'],
    "SWIR1 - Red": dataset_dart_nn_10['SWIR1'] - dataset_dart_nn_10['Red'],
}

# Count the zero-valued entries of each divisor.
zeros_dart_nn = {key: int((values == 0).sum()) for key, values in dividers_dart_nn.items()}
zeros_dart_nn

{'Green + NIR1': 0,
 'SWIR2 + NIR1': 0,
 'Red + NIR1': 0,
 '0.25 * NIR1 + 2.75 * SWIR1': 0,
 'SWIR2 + Green': 0,
 'Red': 0,
 'SWIR1 - Red': 0}

In [23]:
indexes = dataset_dart_nn_10.query('FDI < -1000').index  # drop samples whose FDI is -inf because of a division by zero
dataset_dart_nn_10.drop(indexes,  axis=0, inplace=True)
dataset_dart_nn_10

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,...,Classe,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
0,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,0,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
1,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,1,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
2,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,2,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
3,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,3,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
4,files/dart_files/Sentinel2_Artificial/Limpa/LD...,0,4,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
403195,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,115,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403196,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,116,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403197,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,117,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525
403198,files/dart_files/Sentinel2_Artificial/LimpaEsp...,59,118,0.0324,0.0386,0.0198,0.0198,0.0195,0.0236,0.0208,...,Água,0.321918,1.747922,-0.106095,0.0331,0.406193,0.808163,0.446953,0.106095,0.016525


#### Building subdataframes

In [24]:
dataset_dart = dataset_bilinear_10.copy()
dart_subdatasets = dict()
dart_subdatasets['plastic'] = dataset_bilinear_10.loc[dataset_bilinear_10['Label'] == "Plastic"].copy()
dart_subdatasets['water'] = dataset_bilinear_10.loc[dataset_bilinear_10['Label'] == "Water"].copy()
dart_subdatasets['sand'] = dataset_bilinear_10.loc[dataset_bilinear_10['Label'] == "Sand"].copy()
dart_subdatasets['plastic_and_water'] = dataset_bilinear_10.query("Label == 'Water' or Label == 'Plastic'").copy()

In [113]:
dart_subdatasets['plastic'].groupby('Path').count()

Unnamed: 0_level_0,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,RedEdge3,...,Classe,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
Path,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/100/,128,128,128,128,128,128,128,128,128,128,...,128,128,128,128,128,128,128,128,128,128
files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/40/,128,128,128,128,128,128,128,128,128,128,...,128,128,128,128,128,128,128,128,128,128
files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/60/,128,128,128,128,128,128,128,128,128,128,...,128,128,128,128,128,128,128,128,128,128
files/dart_files/Sentinel2_Artificial/Limpa/LDPE/S0/Transparent/Wet/80/,128,128,128,128,128,128,128,128,128,128,...,128,128,128,128,128,128,128,128,128,128
files/dart_files/Sentinel2_Artificial/Limpa/PET/S0/Transparent/Wet/100/,128,128,128,128,128,128,128,128,128,128,...,128,128,128,128,128,128,128,128,128,128
files/dart_files/Sentinel2_Artificial/Limpa/PET/S0/Transparent/Wet/40/,128,128,128,128,128,128,128,128,128,128,...,128,128,128,128,128,128,128,128,128,128
files/dart_files/Sentinel2_Artificial/Limpa/PET/S0/Transparent/Wet/60/,128,128,128,128,128,128,128,128,128,128,...,128,128,128,128,128,128,128,128,128,128
files/dart_files/Sentinel2_Artificial/Limpa/PET/S0/Transparent/Wet/80/,128,128,128,128,128,128,128,128,128,128,...,128,128,128,128,128,128,128,128,128,128
files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/Orange/Wet/100/,128,128,128,128,128,128,128,128,128,128,...,128,128,128,128,128,128,128,128,128,128
files/dart_files/Sentinel2_Artificial/Limpa/PP/S0/Orange/Wet/40/,128,128,128,128,128,128,128,128,128,128,...,128,128,128,128,128,128,128,128,128,128


In [128]:
dart_plastic_in_water, dart_plastic_in_foam = [], []

substring = "Espuma"  # "Espuma" is Portuguese for "foam"

# Split the plastic samples according to whether the simulation path contains foam
for i in dart_subdatasets['plastic'].index:
    if substring in dataset_bilinear_10.at[i - 1, 'Path']:
        dart_plastic_in_foam.append(dart_subdatasets['plastic'].loc[i])
    else:
        dart_plastic_in_water.append(dart_subdatasets['plastic'].loc[i])

dart_subdatasets['plastic_in_water'] = pd.DataFrame(dart_plastic_in_water, columns=dart_subdatasets['plastic'].columns)
dart_subdatasets['plastic_in_foam'] = pd.DataFrame(dart_plastic_in_foam, columns=dart_subdatasets['plastic'].columns)
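
The foam/water split relies on a substring test against `Path`. The short sketch below, which uses made-up paths purely for illustration, shows how `str.find` behaves: it returns -1 when the substring is absent and 0 when the match sits at the very start of the string, so a `> 0` comparison would miss a leading match. A membership test with `in` avoids that edge case. The helper name `contains_foam` is hypothetical.

```python
def contains_foam(path, marker="Espuma"):
    """Return True when `marker` occurs anywhere in `path`."""
    return marker in path

# Hypothetical paths, for illustration only
foam_path = "Espuma/scene_a/"
water_path = "Limpa/scene_a/"

# str.find returns the position of the first match, or -1 when there is none
find_results = (foam_path.find("Espuma"), water_path.find("Espuma"))

# A "> 0" test misses the match at position 0; the "in" test does not
missed_by_find = not (foam_path.find("Espuma") > 0)
```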


In [129]:
dart_subdatasets['plastic_in_water'].groupby('Cover_percent').count()

Unnamed: 0_level_0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,...,Classe,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
Cover_percent,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
40,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024,...,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024
60,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024,...,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024
80,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024,...,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024
100,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024,...,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024


In [131]:
dart_subdatasets['plastic_in_foam'].groupby('Cover_percent').count()

Unnamed: 0_level_0,Path,Line,Column,Blue,Green,NIR1,NIR2,NIRa,RedEdge1,RedEdge2,...,Classe,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
Cover_percent,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
40,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024,...,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024
60,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024,...,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024
80,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024,...,1024,1024,1024,1024,1024,1024,1024,1024,1024,1024


In [26]:
dart_subdatasets['plastic_20'] = dart_subdatasets['plastic'].query("Cover_percent == 20")
dart_subdatasets['plastic_40'] = dart_subdatasets['plastic'].query("Cover_percent == 40")
dart_subdatasets['plastic_60'] = dart_subdatasets['plastic'].query("Cover_percent == 60")
dart_subdatasets['plastic_80'] = dart_subdatasets['plastic'].query("Cover_percent == 80")
dart_subdatasets['plastic_100'] = dart_subdatasets['plastic'].query("Cover_percent == 100")

dart_subdatasets['plastic_ldpe'] = dart_subdatasets['plastic'].query("Polymer == 'LDPE'")
dart_subdatasets['plastic_micronapo'] = dart_subdatasets['plastic'].query("Polymer == 'MicroNapo'")
dart_subdatasets['plastic_nylon'] = dart_subdatasets['plastic'].query("Polymer == 'Nylon'")
dart_subdatasets['plastic_pet'] = dart_subdatasets['plastic'].query("Polymer == 'PET'")
dart_subdatasets['plastic_pp'] = dart_subdatasets['plastic'].query("Polymer == 'PP'")
dart_subdatasets['plastic_pvc'] = dart_subdatasets['plastic'].query("Polymer == 'PVC'")

In [27]:
dart_nn_subdatasets = dict()
dart_nn_subdatasets['plastic'] = dataset_dart_nn_10.loc[dataset_dart_nn_10['Label'] == "Plastic"].copy()
dart_nn_subdatasets['water'] = dataset_dart_nn_10.loc[dataset_dart_nn_10['Label'] == "Water"].copy()
dart_nn_subdatasets['sand'] = dataset_dart_nn_10.loc[dataset_dart_nn_10['Label'] == "Sand"].copy()
dart_nn_subdatasets['plastic_water'] = dataset_dart_nn_10.query("Label == 'Water' or Label == 'Plastic'").copy()

In [28]:
dart_cubic_subdatasets = dict()
dart_cubic_subdatasets['plastic'] = dataset_dart_cubic_10.loc[dataset_dart_cubic_10['Label'] == "Plastic"].copy()
dart_cubic_subdatasets['water'] = dataset_dart_cubic_10.loc[dataset_dart_cubic_10['Label'] == "Water"].copy()
dart_cubic_subdatasets['sand'] = dataset_dart_cubic_10.loc[dataset_dart_cubic_10['Label'] == "Sand"].copy()
dart_cubic_subdatasets['plastic_water'] = dataset_dart_cubic_10.query("Label == 'Water' or Label == 'Plastic'").copy()

#### Testing resample methods

## Observed dataset (Copernicus / USGS)

### Loading data from sand area 

In [45]:
from modules import tiff_files  # needed from here on; this import is commented out in the Imports cell
path = input("Observed sand area TIFF files path: ")  # For example: files/tiff_files/coast
path, sources, dates = tiff_files.open_folders(path)

GEE exported TIFF files path: files/tiff_files/coast


In [46]:
os.chdir('../')
os.chdir('../')
os.chdir('../')

In [47]:
tiff_data = tiff_files.get_images(path, sources, dates)

In [48]:
for image in tiff_data:
    image.setAreaLabel(0, (int(image.getXSize()) - 1), 0, (image.getYSize() - 1), "Coast")  # -1 because indices (row and column numbers) start at zero, whereas len (which reports the size) starts at 1
    for pixel in image.getPixels():
        pixel.setLabel(image.getLabelsMap())
        pixel.setCoverPercent(100)
        pixel.setPolymer("None")

In [49]:
tiff_coastal_dataset = tiff_files.build_dataset(tiff_data)

### Loading data from sea area

In [50]:
path = input("Observed sea area TIFF files path: ")  # For example: files/tiff_files/sea
path, sources, dates = tiff_files.open_folders(path) 

GEE exported TIFF files path: files/tiff_files/sea


In [51]:
os.chdir('../')
os.chdir('../')
os.chdir('../')

In [52]:
tiff_data = tiff_files.get_images(path, sources, dates) 

#### Labeling the entire area as Water

In [53]:
for image in tiff_data:
    image.setAreaLabel(0, (int(image.getXSize()) - 1), 0, (image.getYSize() - 1), "Water")  # -1 because indices (row and column numbers) start at zero, whereas len (which reports the size) starts at 1
    for pixel in image.getPixels():
        pixel.setLabel(image.getLabelsMap())
        pixel.setCoverPercent(100)
        pixel.setPolymer("None")

#### Labeling the artificial targets as plastic and wood

In [54]:
for image in tiff_data:
    if image.getDate() == "2019_04_18":
        #A1 100% Water
        image.setPixelLabel(6, 3, "Plastic")  #A2
        image.setPixelLabel(6, 4, "Plastic")  #A3 
        image.setPixelLabel(7, 2, "Plastic")  #A4
        image.setPixelLabel(7, 3, "Plastic")  #A5 
        image.setPixelLabel(7, 4, "Plastic")  #A6
        for pixel in image.getPixels():
            pixel.setLabel(image.getLabelsMap())
            if pixel.getLine() == 6 and pixel.getColumn() == 3: #A2
                pixel.setCoverPercent(30) #Bags + bottles
                pixel.setPolymer("Bags and Bottles")
            elif pixel.getLine() == 6 and pixel.getColumn() == 4: #A3
                pixel.setCoverPercent(18)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 7 and pixel.getColumn() == 2: #A4
                pixel.setCoverPercent(38)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 7 and pixel.getColumn() == 3: #A5
                pixel.setCoverPercent(33) #Bags + bottles
                pixel.setPolymer("Bags and Bottles")
            elif pixel.getLine() == 7 and pixel.getColumn() == 4: #A6
                pixel.setCoverPercent(1)
                pixel.setPolymer("Bottles")
    elif image.getDate() == "2019_05_03":
        image.setPixelLabel(1, 15, "Plastic")  #A1
        image.setPixelLabel(1, 16, "Plastic")  #A2
        image.setPixelLabel(2, 15, "Plastic")  #A3 
        image.setPixelLabel(2, 16, "Plastic")  #A4
        image.setPixelLabel(3, 10, "Plastic")  #B1
        image.setPixelLabel(3, 11, "Plastic")  #B2 
        image.setPixelLabel(4, 11, "Plastic")  #B3
        image.setPixelLabel(5, 7, "Plastic")  #C1
        image.setPixelLabel(5, 8, "Plastic")  #C2
        image.setPixelLabel(6, 7, "Plastic")  #C3 
        image.setPixelLabel(6, 8, "Plastic")  #C4
        image.setPixelLabel(5, 12, "Plastic")  #D1
        image.setPixelLabel(5, 13, "Plastic")  #D2
        image.setPixelLabel(6, 12, "Plastic")  #D3 
        image.setPixelLabel(6, 13, "Plastic")  #D4
        image.setPixelLabel(9, 2, "Plastic")  #E1
        image.setPixelLabel(9, 3, "Plastic")  #E2
        #E3 and E4 100% Water
        image.setPixelLabel(11, 7, "Plastic")  #F1
        image.setPixelLabel(11, 8, "Plastic")  #F2
        image.setPixelLabel(11, 9, "Plastic")  #F3 
        #F4 and F5 100% Water
        for pixel in image.getPixels():
            pixel.setLabel(image.getLabelsMap())
            if pixel.getLine() == 1 and pixel.getColumn() == 15: #A1
                pixel.setCoverPercent(15)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 1 and pixel.getColumn() == 16: #A2
                pixel.setCoverPercent(43)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 2 and pixel.getColumn() == 15: #A3
                pixel.setCoverPercent(1)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 2 and pixel.getColumn() == 16: #A4
                pixel.setCoverPercent(2)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 3 and pixel.getColumn() == 10: #B1
                pixel.setCoverPercent(1)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 3 and pixel.getColumn() == 11: #B2
                pixel.setCoverPercent(38)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 4 and pixel.getColumn() == 11: #B3
                pixel.setCoverPercent(8)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 5 and pixel.getColumn() == 7: #C1
                pixel.setCoverPercent(9)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 5 and pixel.getColumn() == 8: #C2
                pixel.setCoverPercent(5)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 6 and pixel.getColumn() == 7: #C3
                pixel.setCoverPercent(18)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 6 and pixel.getColumn() == 8: #C4
                pixel.setCoverPercent(14)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 5 and pixel.getColumn() == 12: #D1
                pixel.setCoverPercent(3)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 5 and pixel.getColumn() == 13: #D2
                pixel.setCoverPercent(1)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 6 and pixel.getColumn() == 12: #D3
                pixel.setCoverPercent(2)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 6 and pixel.getColumn() == 13: #D4
                pixel.setCoverPercent(9)
                pixel.setPolymer("Bags") #Reeds ignored
                #elif pixel.getLine() == 6 and pixel.getColumn() == 14: #Reeds ignored
            elif pixel.getLine() == 9 and pixel.getColumn() == 2: #E1
                pixel.setCoverPercent(13)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 9 and pixel.getColumn() == 3: #E2
                pixel.setCoverPercent(27)
                pixel.setPolymer("Bottles")
            #E3 and E4 100% Water
            elif pixel.getLine() == 11 and pixel.getColumn() == 7: #F1
                pixel.setCoverPercent(10)
                pixel.setPolymer("Bottles") 
            elif pixel.getLine() == 11 and pixel.getColumn() == 8: #F2
                pixel.setCoverPercent(21)
                pixel.setPolymer("Bottles") 
            elif pixel.getLine() == 11 and pixel.getColumn() == 9: #F3
                pixel.setCoverPercent(2)
                pixel.setPolymer("Bottles") 
            #F4 and F5 100% Water
    elif image.getDate() == "2019_05_18":
        image.setPixelLabel(16, 2, "Plastic")  #A1  
        #A2 100% Water
        image.setPixelLabel(17, 2, "Plastic")  #A3
        image.setPixelLabel(17, 3, "Plastic")  #A4
        #B1 100% Water
        image.setPixelLabel(12, 5, "Plastic")  #B2
        image.setPixelLabel(13, 4, "Plastic")  #B3  
        image.setPixelLabel(13, 5, "Plastic")  #B4
        image.setPixelLabel(5, 5, "Plastic")  #C1
        image.setPixelLabel(5, 6, "Plastic")  #C2
        image.setPixelLabel(6, 5, "Plastic")  #C3
        image.setPixelLabel(6, 6, "Plastic")  #C4  
        for pixel in image.getPixels():
            pixel.setLabel(image.getLabelsMap())
            if pixel.getLine() == 16 and pixel.getColumn() == 2: #A1
                pixel.setCoverPercent(17)
                pixel.setPolymer("Bags")
            #A2 100% Water  
            elif pixel.getLine() == 17 and pixel.getColumn() == 2: #A3
                pixel.setCoverPercent(27)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 17 and pixel.getColumn() == 3: #A4
                pixel.setCoverPercent(3)
                pixel.setPolymer("Bags")
            #B1 100% Water
            elif pixel.getLine() == 12 and pixel.getColumn() == 5: #B2
                pixel.setCoverPercent(2)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 13 and pixel.getColumn() == 4: #B3
                pixel.setCoverPercent(1)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 13 and pixel.getColumn() == 5: #B4
                pixel.setCoverPercent(10)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 5 and pixel.getColumn() == 5: #C1
                pixel.setCoverPercent(5)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 5 and pixel.getColumn() == 6: #C2
                pixel.setCoverPercent(6)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 6 and pixel.getColumn() == 5: #C3
                pixel.setCoverPercent(10)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 6 and pixel.getColumn() == 6: #C4
                pixel.setCoverPercent(40)
                pixel.setPolymer("Bottles")
    elif image.getDate() == "2019_05_28":
        image.setPixelLabel(0, 8, "Plastic")  #A1  
        image.setPixelLabel(1, 8, "Plastic")  #A2  
        image.setPixelLabel(1, 9, "Plastic")  #A3
        image.setPixelLabel(4, 6, "Plastic")  #B1  
        image.setPixelLabel(5, 6, "Plastic")  #B2
        image.setPixelLabel(7, 3, "Plastic")  #C1  
        #C2 100% Water 
        image.setPixelLabel(8, 3, "Plastic")  #C3
        image.setPixelLabel(8, 4, "Plastic")  #C4  
        #C5 100% Water 
        for pixel in image.getPixels():
            pixel.setLabel(image.getLabelsMap())
            if pixel.getLine() == 0 and pixel.getColumn() == 8: #A1
                pixel.setCoverPercent(7)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 1 and pixel.getColumn() == 8: #A2
                pixel.setCoverPercent(10)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 1 and pixel.getColumn() == 9: #A3
                pixel.setCoverPercent(13)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 4 and pixel.getColumn() == 6: #B1
                pixel.setCoverPercent(5)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 5 and pixel.getColumn() == 6: #B2
                pixel.setCoverPercent(8)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 7 and pixel.getColumn() == 3: #C1
                pixel.setCoverPercent(2)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 8 and pixel.getColumn() == 3: #C3
                pixel.setCoverPercent(35)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 8 and pixel.getColumn() == 4: #C4
                pixel.setCoverPercent(18)
                pixel.setPolymer("Bottles")
    elif image.getDate() == "2019_06_07":
        image.setPixelLabel(1, 4, "Plastic")  #A1  
        image.setPixelLabel(1, 5, "Plastic")  #A2  
        #image.setPixelLabel(2, 5, "Plastic")  #A3
        #image.setPixelLabel(5, 5, "Plastic")  #B1  
        #image.setPixelLabel(5, 6, "Plastic")  #B2
        #image.setPixelLabel(6, 5, "Plastic")  #B3  
        #image.setPixelLabel(6, 6, "Plastic")  #B4
        image.setPixelLabel(9, 2, "Plastic")  #C1  
        image.setPixelLabel(9, 3, "Plastic")  #C2 
        image.setPixelLabel(9, 4, "Plastic")  #C3
        image.setPixelLabel(10, 2, "Plastic")  #C4  
        image.setPixelLabel(10, 3, "Plastic")  #C5 
        for pixel in image.getPixels():
            pixel.setLabel(image.getLabelsMap())
            if pixel.getLine() == 1 and pixel.getColumn() == 4: #A1
                pixel.setCoverPercent(4)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 1 and pixel.getColumn() == 5: #A2
                pixel.setCoverPercent(9)
                pixel.setPolymer("Bottles")
            #A3 100% Water 
            #B1 Reeds ignored
            #B2 Reeds ignored
            #B3 Reeds ignored
            #B4 Reeds ignored
            elif pixel.getLine() == 9 and pixel.getColumn() == 2: #C1
                pixel.setCoverPercent(3)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 9 and pixel.getColumn() == 3: #C2
                pixel.setCoverPercent(55)
                pixel.setPolymer("Bags and Bottles")
            elif pixel.getLine() == 9 and pixel.getColumn() == 4: #C3
                pixel.setCoverPercent(1)
                pixel.setPolymer("Bottles")
            elif pixel.getLine() == 10 and pixel.getColumn() == 2: #C4
                pixel.setCoverPercent(4)
                pixel.setPolymer("Bags")
            elif pixel.getLine() == 10 and pixel.getColumn() == 3: #C5
                pixel.setCoverPercent(15)
                pixel.setPolymer("Bags and Bottles")      
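
The per-date `if`/`elif` chains above all follow the same pattern: look up a pixel by (line, column) and assign its cover percentage and target type. As a minimal sketch, assuming the pixel objects expose only the accessors used in this notebook, the same information can be held in a lookup table. The `Pixel` class below is a hypothetical stand-in for the objects returned by `image.getPixels()`, and `TARGETS` and `annotate` are illustrative names, not part of the project's modules. Only the 2019_04_18 entries are reproduced.

```python
from dataclasses import dataclass

@dataclass
class Pixel:
    """Hypothetical stand-in for the project's pixel objects."""
    line: int
    column: int
    cover_percent: int = 100
    polymer: str = "None"

    def getLine(self):
        return self.line

    def getColumn(self):
        return self.column

    def setCoverPercent(self, value):
        self.cover_percent = value

    def setPolymer(self, value):
        self.polymer = value

# date -> {(line, column): (cover percent, target type)}
TARGETS = {
    "2019_04_18": {
        (6, 3): (30, "Bags and Bottles"),  # A2
        (6, 4): (18, "Bottles"),           # A3
        (7, 2): (38, "Bags"),              # A4
        (7, 3): (33, "Bags and Bottles"),  # A5
        (7, 4): (1, "Bottles"),            # A6
    },
}

def annotate(pixels, date, targets=TARGETS):
    """Set cover percent and target type on every pixel listed for `date`."""
    per_date = targets.get(date, {})
    for pixel in pixels:
        entry = per_date.get((pixel.getLine(), pixel.getColumn()))
        if entry is not None:
            cover, polymer = entry
            pixel.setCoverPercent(cover)
            pixel.setPolymer(polymer)

pixels = [Pixel(6, 3), Pixel(7, 4), Pixel(0, 0)]
annotate(pixels, "2019_04_18")
```

Pixels that are not listed for the date keep the values assigned when the whole area was labeled as water, which mirrors the behavior of the chains above. Adding a new acquisition date then only requires a new dictionary entry.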

In [55]:
for image in tiff_data:
    if image.getDate() == "2021_06_21":
        image.setPixelLabel(3, 5, "Plastic") 
        image.setPixelLabel(3, 6, "Plastic")
        image.setPixelLabel(4, 4, "Plastic")
        image.setPixelLabel(4, 5, "Plastic") 
        image.setPixelLabel(4, 6, "Plastic")
        image.setPixelLabel(4, 7, "Plastic")
        image.setPixelLabel(5, 4, "Plastic")
        image.setPixelLabel(5, 5, "Plastic") 
        image.setPixelLabel(5, 6, "Plastic")
        image.setPixelLabel(8, 3, "Wood") 
        image.setPixelLabel(8, 4, "Wood")
        image.setPixelLabel(8, 5, "Wood")
        image.setPixelLabel(9, 3, "Wood") 
        image.setPixelLabel(9, 4, "Wood")
        image.setPixelLabel(9, 5, "Wood")
        image.setPixelLabel(9, 6, "Wood") 
        image.setPixelLabel(10, 2, "Wood") 
        image.setPixelLabel(10, 3, "Wood") 
        image.setPixelLabel(10, 4, "Wood")
        image.setPixelLabel(10, 5, "Wood")
        for pixel in image.getPixels():
            pixel.setLabel(image.getLabelsMap())
            if pixel.getLine() == 3: 
                if pixel.getColumn() == 5 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 4: 
                if pixel.getColumn() == 4 or pixel.getColumn() == 7: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
                elif pixel.getColumn() == 5 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-100)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 5: 
                if pixel.getColumn() == 4: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
                elif pixel.getColumn() == 5 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-100)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 8: 
                if pixel.getColumn() == 3 or pixel.getColumn() == 4 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-1)
            elif pixel.getLine() == 9: 
                if pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-1)
                if pixel.getColumn() == 3 or pixel.getColumn() == 4 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-100)
            elif pixel.getLine() == 10: 
                if pixel.getColumn() == 2 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-1)
                elif pixel.getColumn() == 3 or pixel.getColumn() == 4: 
                    pixel.setCoverPercent(-100)
    elif image.getDate() == "2021_07_01":
        image.setPixelLabel(3, 3, "Plastic") 
        image.setPixelLabel(3, 4, "Plastic")
        image.setPixelLabel(3, 5, "Plastic")
        image.setPixelLabel(4, 3, "Plastic") 
        image.setPixelLabel(4, 4, "Plastic")
        image.setPixelLabel(4, 5, "Plastic")
        image.setPixelLabel(5, 3, "Plastic") 
        image.setPixelLabel(5, 4, "Plastic")
        image.setPixelLabel(5, 5, "Plastic")
        image.setPixelLabel(6, 3, "Plastic")
        image.setPixelLabel(6, 4, "Plastic") 
        image.setPixelLabel(6, 5, "Plastic")
        image.setPixelLabel(8, 2, "Wood")
        image.setPixelLabel(8, 3, "Wood")
        image.setPixelLabel(9, 1, "Wood") 
        image.setPixelLabel(9, 2, "Wood")
        image.setPixelLabel(9, 3, "Wood")
        image.setPixelLabel(9, 4, "Wood")
        image.setPixelLabel(10, 1, "Wood") 
        image.setPixelLabel(10, 2, "Wood")
        image.setPixelLabel(10, 3, "Wood")
        image.setPixelLabel(10, 4, "Wood")
        image.setPixelLabel(11, 2, "Wood") 
        image.setPixelLabel(11, 3, "Wood") 
        image.setPixelLabel(11, 4, "Wood") 
        for pixel in image.getPixels():
            pixel.setLabel(image.getLabelsMap())
            if pixel.getLine() == 3: 
                if pixel.getColumn() == 3 or pixel.getColumn() == 4 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 4: 
                if pixel.getColumn() == 3: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
                elif pixel.getColumn() == 4 or pixel.getColumn() == 5:
                    pixel.setCoverPercent(-100)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 5: 
                if pixel.getColumn() == 3 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
                elif pixel.getColumn() == 4: 
                    pixel.setCoverPercent(-100)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 6: 
                if pixel.getColumn() == 3 or pixel.getColumn() == 4 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 8: 
                if pixel.getColumn() == 2 or pixel.getColumn() == 3: 
                    pixel.setCoverPercent(-1)
            elif pixel.getLine() == 9: 
                if pixel.getColumn() == 1: 
                    pixel.setCoverPercent(-1)
                elif pixel.getColumn() == 2 or pixel.getColumn() == 3 or pixel.getColumn() == 4: 
                    pixel.setCoverPercent(-100)
            elif pixel.getLine() == 10: 
                if pixel.getColumn() == 1: 
                    pixel.setCoverPercent(-1)
                elif pixel.getColumn() == 2 or pixel.getColumn() == 3 or pixel.getColumn() == 4: 
                    pixel.setCoverPercent(-100)
            elif pixel.getLine() == 11: 
                if pixel.getColumn() == 2 or pixel.getColumn() == 3 or pixel.getColumn() == 4: 
                    pixel.setCoverPercent(-1)
    elif image.getDate() == "2021_07_06":
        image.setPixelLabel(2, 5, "Plastic")
        image.setPixelLabel(2, 6, "Plastic")
        image.setPixelLabel(3, 4, "Plastic") 
        image.setPixelLabel(3, 5, "Plastic")
        image.setPixelLabel(3, 6, "Plastic")
        image.setPixelLabel(4, 4, "Plastic") 
        image.setPixelLabel(4, 5, "Plastic")
        image.setPixelLabel(4, 6, "Plastic")
        image.setPixelLabel(5, 4, "Plastic") 
        image.setPixelLabel(5, 5, "Plastic")
        image.setPixelLabel(5, 6, "Plastic") 
        image.setPixelLabel(7, 3, "Wood")
        image.setPixelLabel(7, 4, "Wood")
        image.setPixelLabel(8, 3, "Wood")
        image.setPixelLabel(8, 4, "Wood")
        image.setPixelLabel(8, 5, "Wood")
        image.setPixelLabel(8, 6, "Wood") 
        image.setPixelLabel(9, 3, "Wood")
        image.setPixelLabel(9, 4, "Wood")
        image.setPixelLabel(9, 5, "Wood")
        image.setPixelLabel(10, 3, "Wood")
        image.setPixelLabel(10, 4, "Wood")
        image.setPixelLabel(10, 5, "Wood")
        for pixel in image.getPixels():
            pixel.setLabel(image.getLabelsMap())
            if pixel.getLine() == 2: 
                if pixel.getColumn() == 5 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 3: 
                if pixel.getColumn() == 4 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
                elif pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-100)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 4: 
                if pixel.getColumn() == 4 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
                elif pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-100)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 5: 
                if pixel.getColumn() == 4 or pixel.getColumn() == 5 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 7: 
                if pixel.getColumn() == 3 or pixel.getColumn() == 4 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-1)
            elif pixel.getLine() == 8: 
                if pixel.getColumn() == 3 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-1)
                if pixel.getColumn() == 4 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-100)
            elif pixel.getLine() == 9: 
                if pixel.getColumn() == 3 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-1)
                elif pixel.getColumn() == 4: 
                    pixel.setCoverPercent(-100)
            elif pixel.getLine() == 10: 
                if pixel.getColumn() == 3 or pixel.getColumn() == 4 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-1)
    elif image.getDate() == "2021_07_21":
        image.setPixelLabel(2, 6, "Plastic")
        image.setPixelLabel(2, 7, "Plastic")
        image.setPixelLabel(2, 8, "Plastic")
        image.setPixelLabel(3, 6, "Plastic")
        image.setPixelLabel(3, 7, "Plastic")
        image.setPixelLabel(3, 8, "Plastic")
        image.setPixelLabel(4, 6, "Plastic")
        image.setPixelLabel(4, 7, "Plastic")
        image.setPixelLabel(4, 8, "Plastic")
        image.setPixelLabel(5, 6, "Plastic") 
        image.setPixelLabel(5, 7, "Plastic") 
        image.setPixelLabel(7, 4, "Wood")
        image.setPixelLabel(7, 5, "Wood")
        image.setPixelLabel(7, 6, "Wood")
        image.setPixelLabel(8, 4, "Wood")
        image.setPixelLabel(8, 5, "Wood")
        image.setPixelLabel(8, 6, "Wood")
        image.setPixelLabel(8, 7, "Wood")
        image.setPixelLabel(9, 4, "Wood")
        image.setPixelLabel(9, 5, "Wood")
        image.setPixelLabel(9, 6, "Wood")
        image.setPixelLabel(9, 7, "Wood") 
        image.setPixelLabel(10, 5, "Wood")
        image.setPixelLabel(10, 6, "Wood")
        for pixel in image.getPixels():
            pixel.setLabel(image.getLabelsMap())
            if pixel.getLine() == 2: 
                if pixel.getColumn() == 6 or pixel.getColumn() == 7 or pixel.getColumn() == 8: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 3: 
                if pixel.getColumn() == 8: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
                elif pixel.getColumn() == 6 or pixel.getColumn() == 7: 
                    pixel.setCoverPercent(-100)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 4: 
                if pixel.getColumn() == 6 or pixel.getColumn() == 8:  
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
                elif pixel.getColumn() == 7:  
                    pixel.setCoverPercent(-100)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 5: 
                if pixel.getColumn() == 6 or pixel.getColumn() == 7:  
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 7: 
                if pixel.getColumn() == 4 or pixel.getColumn() == 5 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-1)
            elif pixel.getLine() == 8: 
                if pixel.getColumn() == 4 or pixel.getColumn() == 7: 
                    pixel.setCoverPercent(-1)
                if pixel.getColumn() == 5 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-100)
            elif pixel.getLine() == 9: 
                if pixel.getColumn() == 4 or pixel.getColumn() == 7: 
                    pixel.setCoverPercent(-1)
                if pixel.getColumn() == 5 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-100)
            elif pixel.getLine() == 10: 
                if pixel.getColumn() == 5 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-1)
    elif image.getDate() == "2021_08_25": 
        image.setPixelLabel(1, 4, "Plastic") 
        image.setPixelLabel(1, 5, "Plastic") 
        image.setPixelLabel(2, 4, "Plastic")
        image.setPixelLabel(2, 5, "Plastic")
        image.setPixelLabel(2, 6, "Plastic")
        image.setPixelLabel(3, 4, "Plastic")
        image.setPixelLabel(3, 5, "Plastic")
        image.setPixelLabel(3, 6, "Plastic")
        image.setPixelLabel(4, 4, "Plastic")
        image.setPixelLabel(4, 5, "Plastic")
        image.setPixelLabel(4, 6, "Plastic")
        image.setPixelLabel(6, 3, "Wood") 
        image.setPixelLabel(7, 2, "Wood") 
        image.setPixelLabel(7, 3, "Wood") 
        image.setPixelLabel(7, 4, "Wood")
        image.setPixelLabel(7, 5, "Wood")
        image.setPixelLabel(8, 2, "Wood") 
        image.setPixelLabel(8, 3, "Wood")
        image.setPixelLabel(8, 4, "Wood")
        image.setPixelLabel(8, 5, "Wood")
        image.setPixelLabel(9, 2, "Wood") 
        image.setPixelLabel(9, 3, "Wood")
        image.setPixelLabel(9, 4, "Wood")
        image.setPixelLabel(9, 5, "Wood")
        for pixel in image.getPixels():
            pixel.setLabel(image.getLabelsMap())
            if pixel.getLine() == 1: 
                if pixel.getColumn() == 4 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 2: 
                if pixel.getColumn() == 4 or pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
                elif pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-100)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 3: 
                if pixel.getColumn() == 6: 
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
                elif pixel.getColumn() == 4 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-100)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 4: 
                if pixel.getColumn() == 4 or pixel.getColumn() == 5 or pixel.getColumn() == 6:  
                    pixel.setCoverPercent(-1)
                    pixel.setPolymer("HDPE mesh")
            elif pixel.getLine() == 6: 
                if pixel.getColumn() == 3: 
                    pixel.setCoverPercent(-1)
            elif pixel.getLine() == 7: 
                if pixel.getColumn() == 2 or pixel.getColumn() == 4 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-1)
                elif pixel.getColumn() == 3: 
                    pixel.setCoverPercent(-100)
            elif pixel.getLine() == 8: 
                if pixel.getColumn() == 2 or pixel.getColumn() == 5: 
                    pixel.setCoverPercent(-1)
                if pixel.getColumn() == 3 or pixel.getColumn() == 4: 
                    pixel.setCoverPercent(-100)
            elif pixel.getLine() == 9: 
                if pixel.getColumn() == 2 or pixel.getColumn() == 3 or pixel.getColumn() == 4 or pixel.getColumn() == 5:
                    pixel.setCoverPercent(-1)
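
The per-date branches above repeat one pattern: a (line, column) position receives a label, a cover-percent code and, for plastic targets, a polymer. As a minimal sketch, the same information can be kept in a lookup table. The `Pixel` class, its default values, `ANNOTATIONS_2021_08_25` and `annotate` are hypothetical stand-ins and are not part of the `tiff_files` module; only a few of the 2021_08_25 entries are reproduced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pixel:
    """Hypothetical stand-in for the pixel objects used above (defaults are assumptions)."""
    line: int
    column: int
    label: str = "Water"
    cover_percent: int = 100
    polymer: Optional[str] = None


# (line, column) -> (label, cover-percent code, polymer); subset of the 2021_08_25 branch
ANNOTATIONS_2021_08_25 = {
    (1, 4): ("Plastic", -1, "HDPE mesh"),
    (2, 5): ("Plastic", -100, "HDPE mesh"),
    (7, 3): ("Wood", -100, None),
    (9, 2): ("Wood", -1, None),
}


def annotate(pixels, annotations):
    """Apply label, cover code and polymer to every pixel listed in the table."""
    for pixel in pixels:
        entry = annotations.get((pixel.line, pixel.column))
        if entry is None:
            continue  # pixel not listed: keep its defaults
        pixel.label, pixel.cover_percent, pixel.polymer = entry
    return pixels


pixels = annotate([Pixel(1, 4), Pixel(7, 3), Pixel(0, 0)], ANNOTATIONS_2021_08_25)
```

With this layout, adding a new acquisition date would mean adding one dictionary instead of another `elif` chain.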

In [56]:
tiff_marine_dataset = tiff_files.build_dataset(tiff_data)

In [59]:
tiff_dataset = pd.concat([tiff_marine_dataset, tiff_coastal_dataset], ignore_index=True)
tiff_dataset

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,Red,RedEdge1,RedEdge2,RedEdge3,SWIR1,SWIR2,Label,Cover_percent,Polymer
0,2019_04_18,0,0,0.0408,0.0439,0.0173,0.0168,0.0223,0.0183,0.0181,0.0193,0.0088,0.0059,Water,100,
1,2019_04_18,0,1,0.0388,0.0398,0.0169,0.0168,0.0184,0.0183,0.0181,0.0193,0.0088,0.0059,Water,100,
2,2019_04_18,0,2,0.0428,0.0403,0.0166,0.0149,0.0159,0.0130,0.0167,0.0170,0.0050,0.0040,Water,100,
3,2019_04_18,0,3,0.0344,0.0305,0.0157,0.0149,0.0134,0.0130,0.0167,0.0170,0.0050,0.0040,Water,100,
4,2019_04_18,0,4,0.0294,0.0257,0.0135,0.0110,0.0116,0.0119,0.0145,0.0154,0.0035,0.0029,Water,100,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3192,2021_08_25,9,5,0.0466,0.0733,0.1892,0.2083,0.0937,0.1093,0.1559,0.1754,0.2495,0.1615,Coast,100,
3193,2021_08_25,9,6,0.0513,0.0754,0.1967,0.2286,0.1010,0.1218,0.1754,0.1965,0.2600,0.1704,Coast,100,
3194,2021_08_25,9,7,0.0509,0.0773,0.2047,0.2286,0.1024,0.1218,0.1754,0.1965,0.2600,0.1704,Coast,100,
3195,2021_08_25,9,8,0.0486,0.0713,0.2026,0.2381,0.0926,0.1229,0.1834,0.2078,0.2647,0.1822,Coast,100,


In [60]:
tiff_dataset.to_csv(str(input("Observed files path: "))) #For example: files/csv_files/dataset_usgs.csv

Observed files path: files/csv_files/dataset_usgs.csv


### Building subdataframes

In [29]:
dataset_usgs = pd.read_csv(str(input("Observed files path: "))) #For example: files/csv_files/dataset_usgs.csv
dataset_usgs.drop('Unnamed: 0', axis=1, inplace=True)
dataset_usgs

Observed files path: files/csv_files/dataset_usgs.csv


Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,Red,RedEdge1,RedEdge2,RedEdge3,SWIR1,SWIR2,Label,Cover_percent,Polymer
0,2019_04_18,0,0,0.0408,0.0439,0.0173,0.0168,0.0223,0.0183,0.0181,0.0193,0.0088,0.0059,Water,100,
1,2019_04_18,0,1,0.0388,0.0398,0.0169,0.0168,0.0184,0.0183,0.0181,0.0193,0.0088,0.0059,Water,100,
2,2019_04_18,0,2,0.0428,0.0403,0.0166,0.0149,0.0159,0.0130,0.0167,0.0170,0.0050,0.0040,Water,100,
3,2019_04_18,0,3,0.0344,0.0305,0.0157,0.0149,0.0134,0.0130,0.0167,0.0170,0.0050,0.0040,Water,100,
4,2019_04_18,0,4,0.0294,0.0257,0.0135,0.0110,0.0116,0.0119,0.0145,0.0154,0.0035,0.0029,Water,100,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3192,2021_08_25,9,5,0.0466,0.0733,0.1892,0.2083,0.0937,0.1093,0.1559,0.1754,0.2495,0.1615,Coast,100,
3193,2021_08_25,9,6,0.0513,0.0754,0.1967,0.2286,0.1010,0.1218,0.1754,0.1965,0.2600,0.1704,Coast,100,
3194,2021_08_25,9,7,0.0509,0.0773,0.2047,0.2286,0.1024,0.1218,0.1754,0.1965,0.2600,0.1704,Coast,100,
3195,2021_08_25,9,8,0.0486,0.0713,0.2026,0.2381,0.0926,0.1229,0.1834,0.2078,0.2647,0.1822,Coast,100,
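
The `Unnamed: 0` column dropped above appears because `to_csv` writes the DataFrame index by default. The sketch below, on a small made-up frame, shows the default round trip next to one that passes `index=False`; it is an illustration of the pandas behaviour, not a change to the files used in this notebook.

```python
import io

import pandas as pd

df = pd.DataFrame({"Path": ["2019_04_18", "2019_04_18"], "Blue": [0.0408, 0.0388]})

# Default: the index is written as an unnamed first column
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)
with_index = pd.read_csv(buffer)

# index=False: no extra column to drop after reading
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
without_index = pd.read_csv(buffer)
```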


#### Adding label translations, a year column, and radiometric indices

In [30]:
usgs_label = []
for i in range(len(dataset_usgs)):
    if dataset_usgs.at[i, 'Label'] == 'Water':
        usgs_label.append('Água')
    elif dataset_usgs.at[i, 'Label'] == 'Coast':
        usgs_label.append('Costa')
    elif dataset_usgs.at[i, 'Label'] == 'Plastic':
        usgs_label.append('Plástico')
    elif dataset_usgs.at[i, 'Label'] == 'Wood':
        usgs_label.append('Madeira')

dataset_usgs['Classe'] = usgs_label
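
A translation column like this one can also be built with `Series.map` and a dictionary. The following is a minimal sketch on a made-up frame; `label_translation` is a name introduced here for illustration.

```python
import pandas as pd

labels = pd.DataFrame({"Label": ["Water", "Coast", "Plastic", "Wood"]})

# English label -> Portuguese class name
label_translation = {
    "Water": "Água",
    "Coast": "Costa",
    "Plastic": "Plástico",
    "Wood": "Madeira",
}

# Labels missing from the dictionary become NaN instead of being skipped
labels["Classe"] = labels["Label"].map(label_translation)
```

One difference from the loop is worth keeping in mind: an unexpected label yields a NaN in the new column, whereas the loop would produce a list shorter than the DataFrame and fail on assignment.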

In [31]:
usgs_polymer = []

for i in range(len(dataset_usgs)):
    if dataset_usgs.at[i, 'Polymer'] == 'None':
        usgs_polymer.append('Nenhum')
    elif dataset_usgs.at[i, 'Polymer'] == 'Bags':
        usgs_polymer.append('Sacolas')
    elif dataset_usgs.at[i, 'Polymer'] == 'Bottles':
        usgs_polymer.append('Garrafas')
    elif dataset_usgs.at[i, 'Polymer'] == 'HDPE mesh':
        usgs_polymer.append('Malha de HDPE')
    elif dataset_usgs.at[i, 'Polymer'] == 'Bags and Bottles':
        usgs_polymer.append('Sacolas e garrafas')

dataset_usgs['Polímero'] = usgs_polymer
dataset_usgs

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,Red,RedEdge1,RedEdge2,RedEdge3,SWIR1,SWIR2,Label,Cover_percent,Polymer,Classe,Polímero
0,2019_04_18,0,0,0.0408,0.0439,0.0173,0.0168,0.0223,0.0183,0.0181,0.0193,0.0088,0.0059,Water,100,,Água,Nenhum
1,2019_04_18,0,1,0.0388,0.0398,0.0169,0.0168,0.0184,0.0183,0.0181,0.0193,0.0088,0.0059,Water,100,,Água,Nenhum
2,2019_04_18,0,2,0.0428,0.0403,0.0166,0.0149,0.0159,0.0130,0.0167,0.0170,0.0050,0.0040,Water,100,,Água,Nenhum
3,2019_04_18,0,3,0.0344,0.0305,0.0157,0.0149,0.0134,0.0130,0.0167,0.0170,0.0050,0.0040,Water,100,,Água,Nenhum
4,2019_04_18,0,4,0.0294,0.0257,0.0135,0.0110,0.0116,0.0119,0.0145,0.0154,0.0035,0.0029,Water,100,,Água,Nenhum
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3192,2021_08_25,9,5,0.0466,0.0733,0.1892,0.2083,0.0937,0.1093,0.1559,0.1754,0.2495,0.1615,Coast,100,,Costa,Nenhum
3193,2021_08_25,9,6,0.0513,0.0754,0.1967,0.2286,0.1010,0.1218,0.1754,0.1965,0.2600,0.1704,Coast,100,,Costa,Nenhum
3194,2021_08_25,9,7,0.0509,0.0773,0.2047,0.2286,0.1024,0.1218,0.1754,0.1965,0.2600,0.1704,Coast,100,,Costa,Nenhum
3195,2021_08_25,9,8,0.0486,0.0713,0.2026,0.2381,0.0926,0.1229,0.1834,0.2078,0.2647,0.1822,Coast,100,,Costa,Nenhum


In [32]:
usgs_year = []
for i in range(len(dataset_usgs)):
    if str(dataset_usgs.at[i, 'Path']).split('_')[0] == '2019':
        usgs_year.append('2019')
    elif str(dataset_usgs.at[i, 'Path']).split('_')[0] == '2021':
        usgs_year.append('2021')

dataset_usgs['Year'] = usgs_year
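
Because `Path` starts with the acquisition year, the year can also be extracted with the vectorized string accessor. A minimal sketch on a made-up frame:

```python
import pandas as pd

paths = pd.DataFrame({"Path": ["2019_04_18", "2021_07_21", "2021_08_25"]})

# Take the text before the first underscore as the year (kept as a string)
paths["Year"] = paths["Path"].astype(str).str.split("_").str[0]
```

Unlike the loop, this version accepts any year, so it would not flag a date outside 2019 and 2021.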

In [33]:
dataset_usgs['NDWI'] = (dataset_usgs['Green'] - dataset_usgs['NIR1']) / (dataset_usgs['Green'] + dataset_usgs['NIR1'])
dataset_usgs['WRI'] = (dataset_usgs['Green'] + dataset_usgs['Red']) / (dataset_usgs['NIR1'] + dataset_usgs['SWIR2'])
dataset_usgs['NDVI'] = (dataset_usgs['NIR1'] - dataset_usgs['Red']) / (dataset_usgs['NIR1'] + dataset_usgs['Red'])
dataset_usgs['AWEI'] = 4 * (dataset_usgs['Green'] - dataset_usgs['SWIR2']) - (0.25 * dataset_usgs['NIR1'] + 2.75 * dataset_usgs['SWIR1'])
dataset_usgs['MNDWI'] = (dataset_usgs['Green'] - dataset_usgs['SWIR2']) / (dataset_usgs['Green'] + dataset_usgs['SWIR2'])
dataset_usgs['SR'] = dataset_usgs['NIR1'] / dataset_usgs['Red']
dataset_usgs['PI'] = dataset_usgs['NIR1'] / (dataset_usgs['NIR1'] + dataset_usgs['Red'])
dataset_usgs['RNDVI'] = (dataset_usgs['Red'] - dataset_usgs['NIR1']) / (dataset_usgs['Red'] + dataset_usgs['NIR1'])
dataset_usgs['FDI'] = dataset_usgs['NIR1'] - (dataset_usgs['RedEdge2'] + (dataset_usgs['SWIR1'] - dataset_usgs['RedEdge2']) * ((dataset_usgs['NIR1'] - dataset_usgs['Red']) / (dataset_usgs['SWIR1'] - dataset_usgs['Red'])) * 10)

dataset_usgs

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,Red,RedEdge1,RedEdge2,...,Year,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
0,2019_04_18,0,0,0.0408,0.0439,0.0173,0.0168,0.0223,0.0183,0.0181,...,2019,0.434641,2.853448,-0.126263,0.123475,0.763052,0.775785,0.436869,0.126263,0.033644
1,2019_04_18,0,1,0.0388,0.0398,0.0169,0.0168,0.0184,0.0183,0.0181,...,2019,0.403880,2.552632,-0.042493,0.107175,0.741794,0.918478,0.478754,0.042493,0.013331
2,2019_04_18,0,2,0.0428,0.0403,0.0166,0.0149,0.0159,0.0130,0.0167,...,2019,0.416520,2.728155,0.021538,0.127300,0.819413,1.044025,0.510769,-0.021538,-0.007614
3,2019_04_18,0,3,0.0344,0.0305,0.0157,0.0149,0.0134,0.0130,0.0167,...,2019,0.320346,2.228426,0.079038,0.088325,0.768116,1.171642,0.539519,-0.079038,-0.033036
4,2019_04_18,0,4,0.0294,0.0257,0.0135,0.0110,0.0116,0.0119,0.0145,...,2019,0.311224,2.274390,0.075697,0.078200,0.797203,1.163793,0.537849,-0.075697,-0.026802
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3192,2021_08_25,9,5,0.0466,0.0733,0.1892,0.2083,0.0937,0.1093,0.1559,...,2021,-0.441524,0.476190,0.337575,-1.086225,-0.375639,2.019210,0.668788,-0.337575,-0.540436
3193,2021_08_25,9,6,0.0513,0.0754,0.1967,0.2286,0.1010,0.1218,0.1754,...,2021,-0.445792,0.480523,0.321465,-1.144175,-0.386493,1.947525,0.660732,-0.321465,-0.487896
3194,2021_08_25,9,7,0.0509,0.0773,0.2047,0.2286,0.1024,0.1218,0.1754,...,2021,-0.451773,0.479072,0.333116,-1.138575,-0.375858,1.999023,0.666558,-0.333116,-0.519848
3195,2021_08_25,9,8,0.0486,0.0713,0.2026,0.2381,0.0926,0.1229,0.1834,...,2021,-0.479372,0.425936,0.372629,-1.222175,-0.437475,2.187905,0.686314,-0.372629,-0.500440
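
Several of the indices above are ratios, so a zero denominator produces `inf` or `NaN`. As a hedged sketch, a small helper can mark those cases explicitly as `NaN`; `normalized_difference` is a hypothetical name and is not part of the modules used in this notebook.

```python
import pandas as pd


def normalized_difference(a: pd.Series, b: pd.Series) -> pd.Series:
    """Return (a - b) / (a + b), with NaN wherever the denominator is zero."""
    total = a + b
    denominator = total.where(total != 0)  # zero -> NaN
    return (a - b) / denominator


sample = pd.DataFrame({"Green": [0.0439, 0.0], "NIR1": [0.0173, 0.0]})
sample["NDWI"] = normalized_difference(sample["Green"], sample["NIR1"])
```

The first row reproduces the NDWI of the first pixel in the table above; the second row, with both bands at zero, becomes `NaN` and can be removed with `dropna`.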


#### Cleaning data and creating subdataframes

In [34]:
# Expressions checked for zero values; most of them are index denominators
dividers_usgs = dict()
dividers_usgs.update({"Green + NIR1": dataset_usgs['Green'] + dataset_usgs['NIR1']})
dividers_usgs.update({"SWIR2 + NIR1": dataset_usgs['NIR1'] + dataset_usgs['SWIR2']})
dividers_usgs.update({"Red + NIR1": dataset_usgs['NIR1'] + dataset_usgs['Red']})
dividers_usgs.update({"0.25 * NIR1 + 2.75 * SWIR1": 0.25 * dataset_usgs['NIR1'] + 2.75 * dataset_usgs['SWIR1']})
dividers_usgs.update({"SWIR2 + Green": dataset_usgs['Green'] + dataset_usgs['SWIR2']})
dividers_usgs.update({"Red": dataset_usgs['Red']})
dividers_usgs.update({"SWIR1 - Red": dataset_usgs['SWIR1'] - dataset_usgs['Red']})

zeros_usgs = dict()

# Count how many pixels have a zero value for each expression
for key in dividers_usgs.keys():
    i = 0
    for value in dividers_usgs[key]:
        if value == 0:
            i += 1
    zeros_usgs.update({key:i})

zeros_usgs

{'Green + NIR1': 0,
 'SWIR2 + NIR1': 0,
 'Red + NIR1': 1,
 '0.25 * NIR1 + 2.75 * SWIR1': 0,
 'SWIR2 + Green': 0,
 'Red': 3,
 'SWIR1 - Red': 16}

In [35]:
query = 'NDVI > 1000 or NDVI < -1000 or SR > 1000 or SR < -1000 or PI > 1000 or PI < -1000 or RNDVI > 1000 or RNDVI < -1000 or FDI > 1000 or FDI < -1000'
indexes = dataset_usgs.query(query).index #dropping samples with -inf values caused by division by zero in the FDI
dataset_usgs.drop(indexes,  axis=0, inplace=True) #20 pixels excluded: 1 plastic pixel and all the others water
dataset_usgs

Unnamed: 0,Path,Line,Column,Blue,Green,NIR1,NIR2,Red,RedEdge1,RedEdge2,...,Year,NDWI,WRI,NDVI,AWEI,MNDWI,SR,PI,RNDVI,FDI
0,2019_04_18,0,0,0.0408,0.0439,0.0173,0.0168,0.0223,0.0183,0.0181,...,2019,0.434641,2.853448,-0.126263,0.123475,0.763052,0.775785,0.436869,0.126263,0.033644
1,2019_04_18,0,1,0.0388,0.0398,0.0169,0.0168,0.0184,0.0183,0.0181,...,2019,0.403880,2.552632,-0.042493,0.107175,0.741794,0.918478,0.478754,0.042493,0.013331
2,2019_04_18,0,2,0.0428,0.0403,0.0166,0.0149,0.0159,0.0130,0.0167,...,2019,0.416520,2.728155,0.021538,0.127300,0.819413,1.044025,0.510769,-0.021538,-0.007614
3,2019_04_18,0,3,0.0344,0.0305,0.0157,0.0149,0.0134,0.0130,0.0167,...,2019,0.320346,2.228426,0.079038,0.088325,0.768116,1.171642,0.539519,-0.079038,-0.033036
4,2019_04_18,0,4,0.0294,0.0257,0.0135,0.0110,0.0116,0.0119,0.0145,...,2019,0.311224,2.274390,0.075697,0.078200,0.797203,1.163793,0.537849,-0.075697,-0.026802
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3192,2021_08_25,9,5,0.0466,0.0733,0.1892,0.2083,0.0937,0.1093,0.1559,...,2021,-0.441524,0.476190,0.337575,-1.086225,-0.375639,2.019210,0.668788,-0.337575,-0.540436
3193,2021_08_25,9,6,0.0513,0.0754,0.1967,0.2286,0.1010,0.1218,0.1754,...,2021,-0.445792,0.480523,0.321465,-1.144175,-0.386493,1.947525,0.660732,-0.321465,-0.487896
3194,2021_08_25,9,7,0.0509,0.0773,0.2047,0.2286,0.1024,0.1218,0.1754,...,2021,-0.451773,0.479072,0.333116,-1.138575,-0.375858,1.999023,0.666558,-0.333116,-0.519848
3195,2021_08_25,9,8,0.0486,0.0713,0.2026,0.2381,0.0926,0.1229,0.1834,...,2021,-0.479372,0.425936,0.372629,-1.222175,-0.437475,2.187905,0.686314,-0.372629,-0.500440
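
The threshold query above removes rows whose indices are infinite. An alternative, sketched here on made-up values, is to test for finiteness directly with `numpy.isfinite`; `index_columns` and `clean` are names introduced for illustration.

```python
import numpy as np
import pandas as pd

index_columns = ["SR", "FDI"]
sample = pd.DataFrame({
    "SR": [0.78, np.inf, 1.04],
    "FDI": [0.03, 0.01, -np.inf],
})

# Keep only rows where every index is finite (drops inf, -inf and NaN)
finite_rows = np.isfinite(sample[index_columns]).all(axis=1)
clean = sample[finite_rows].copy()
```

This is not equivalent to the query: it also drops `NaN` rows, which comparisons such as `NDVI > 1000` never match, and it keeps large but finite values.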


In [36]:
usgs_subdatasets = dict()

usgs_subdatasets['plp2021'] = dataset_usgs.query('Year == "2021"').copy()
usgs_subdatasets['plp2019'] = dataset_usgs.loc[dataset_usgs['Year'] == "2019"].copy()

usgs_subdatasets['water'] = dataset_usgs.loc[dataset_usgs['Label'] == "Water"].copy()
usgs_subdatasets['coast'] = dataset_usgs.loc[dataset_usgs['Label'] == "Coast"].copy()
usgs_subdatasets['plastic'] = dataset_usgs.loc[dataset_usgs['Label'] == "Plastic"].copy()
usgs_subdatasets['wood'] = dataset_usgs.loc[dataset_usgs['Label'] == "Wood"].copy()
usgs_subdatasets['plastic_and_water'] = dataset_usgs.query("Label == 'Water' or Label == 'Plastic'").copy()

usgs_subdatasets['plp2021_plastic_water'] = usgs_subdatasets['plp2021'].query('Label == "Plastic" or Label == "Water"').copy()
usgs_subdatasets['plp2019_plastic_water'] = usgs_subdatasets['plp2019'].query('Label == "Plastic" or Label == "Water"').copy()

usgs_subdatasets['wood_-100'] = usgs_subdatasets['wood'].query('Cover_percent < -99').copy()
usgs_subdatasets['wood_-100']['Cover_percent'] = "Unknown" # negative codes mark an unknown cover percent

usgs_subdatasets['wood_-000'] = usgs_subdatasets['wood'].query('Cover_percent < 0 and Cover_percent > -99').copy()
usgs_subdatasets['wood_-000']['Cover_percent'] = "Unknown"

usgs_subdatasets['wood_unknownpercent'] = usgs_subdatasets['wood'].query('Cover_percent < 100').copy()
usgs_subdatasets['wood_unknownpercent']['Cover_percent'] = "Unknown"

usgs_subdatasets['plastic_20'] = usgs_subdatasets['plastic'].query('Cover_percent >= 0 and Cover_percent <= 20').copy()#Up to 20% plastic cover
usgs_subdatasets['plastic_40'] = usgs_subdatasets['plastic'].query("Cover_percent > 20 and Cover_percent <= 40").copy()#21 to 40% plastic cover
usgs_subdatasets['plastic_60'] = usgs_subdatasets['plastic'].query("Cover_percent > 40 and Cover_percent <= 60").copy()#41 to 60% plastic cover
usgs_subdatasets['plastic_80'] = usgs_subdatasets['plastic'].query("Cover_percent > 60 and Cover_percent <= 80").copy()#61 to 80% plastic cover
usgs_subdatasets['plastic_100'] = usgs_subdatasets['plastic'].query("Cover_percent > 80").copy()#81 to 100% plastic cover
usgs_subdatasets['plastic_min_20'] = usgs_subdatasets['plastic'].query("Cover_percent >= 20").copy()#At least 20% plastic cover
usgs_subdatasets['plastic_min_50'] = usgs_subdatasets['plastic'].query("Cover_percent >= 50").copy()#At least 50% plastic cover

usgs_subdatasets['plastic_-100'] = usgs_subdatasets['plastic'].query('Cover_percent < -99').copy()
usgs_subdatasets['plastic_-100']['Cover_percent'] = "Unknown" # negative codes mark an unknown cover percent

usgs_subdatasets['plastic_-000'] = usgs_subdatasets['plastic'].query('Cover_percent < 0 and Cover_percent > -99').copy()
usgs_subdatasets['plastic_-000']['Cover_percent'] = "Unknown"

usgs_subdatasets['plastic_unknownpercent'] = usgs_subdatasets['plastic'].query('Cover_percent < 0').copy()
usgs_subdatasets['plastic_unknownpercent']['Cover_percent'] = "Unknown"

usgs_subdatasets['plastic_bags'] = usgs_subdatasets['plastic'].query('Polymer == "Bags"').copy()
usgs_subdatasets['plastic_bottles'] = usgs_subdatasets['plastic'].query('Polymer == "Bottles"').copy()
usgs_subdatasets['plastic_mesh'] = usgs_subdatasets['plastic'].query('Polymer == "HDPE mesh"').copy()
usgs_subdatasets['plastic_mix'] = usgs_subdatasets['plastic'].query('Polymer == "Bags and Bottles"').copy()
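
The five cover-percent ranges above (`plastic_20` to `plastic_100`) can also be expressed as bins. The sketch below uses `pd.cut` on made-up values; `bins`, `names` and the `Cover_class` column are introduced here for illustration and do not exist in the notebook.

```python
import pandas as pd

plastic = pd.DataFrame({"Cover_percent": [0, 20, 21, 40, 67, 100, -1, -100]})

bins = [0, 20, 40, 60, 80, 100]
names = ["plastic_20", "plastic_40", "plastic_60", "plastic_80", "plastic_100"]

# Right-closed intervals; include_lowest=True keeps 0 in the first bin.
# Negative codes (-1, -100) fall outside every bin and become NaN.
plastic["Cover_class"] = pd.cut(
    plastic["Cover_percent"], bins=bins, labels=names, include_lowest=True
)
```

Two caveats apply: the original `plastic_100` query has no upper bound while the last bin stops at 100, and the overlapping subsets `plastic_min_20` and `plastic_min_50` still need their own queries.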

#### Testing resampling methods (English and Portuguese)

In [40]:
datasets_names = ["Means per resampling method", "Standard deviation per resampling method"]
traces = [
            [
                [dataset_dart_nn_10[feature].mean() for feature in feature_names], #Nearest neighbor 10 m
                [dataset_bilinear_10[feature].mean() for feature in feature_names],#Bilinear Interpolation 10 m
                [dataset_dart_cubic_10[feature].mean() for feature in feature_names], #Cubic Interpolation 10 m
                [dataset_dart_nn_20[feature].mean() for feature in feature_names], #Nearest neighbor 20 m
                [dataset_bilinear_20[feature].mean() for feature in feature_names],#Bilinear Interpolation 20 m
                [dataset_dart_cubic_20[feature].mean() for feature in feature_names], #Cubic Interpolation 20 m
                [dataset_usgs[feature].mean() for feature in feature_names] #Acolite
             ],
             [
                [dataset_dart_nn_10[feature].std() for feature in feature_names], #Nearest neighbor 10 m
                [dataset_bilinear_10[feature].std() for feature in feature_names],#Bilinear Interpolation 10 m
                [dataset_dart_cubic_10[feature].std() for feature in feature_names], #Cubic Interpolation 10 m
                [dataset_dart_nn_20[feature].std() for feature in feature_names], #Nearest neighbor 20 m
                [dataset_bilinear_20[feature].std() for feature in feature_names],#Bilinear Interpolation 20 m
                [dataset_dart_cubic_20[feature].std() for feature in feature_names], #Cubic Interpolation 20 m
                [dataset_usgs[feature].std() for feature in feature_names] #Acolite 
             ]
          ]

labels = [[feature_names, feature_names, feature_names, feature_names, feature_names, feature_names, feature_names], [feature_names, feature_names, feature_names, feature_names, feature_names, feature_names, feature_names]]
legends = [['DART nearest neighbor 10m mean', 'DART bilinear interpolation 10m mean', 'DART cubic interpolation 10m mean', 'DART nearest neighbor 20m mean', 'DART bilinear interpolation 20m mean', 'DART cubic interpolation 20m mean', 'USGS acolite mean'],
           ['DART nearest neighbor 10m std', 'DART bilinear interpolation 10m std', 'DART cubic interpolation 10m std', 'DART nearest neighbor 20m std', 'DART bilinear interpolation 20m std', 'DART cubic interpolation 20m std', 'USGS acolite std']]
modes = [['markers+lines', 'markers+lines', 'markers+lines', 'dash', 'dash', 'dash', 'markers+lines'],
         ['markers+lines', 'markers+lines', 'markers+lines', 'dash', 'dash', 'dash', 'markers+lines']]
colors = [['#c9b207', '#008000', '#5425ff','#c9b207', '#008000', '#5425ff', '#FF0000'], ['#c9b207', '#008000', '#5425ff', '#c9b207', '#008000', '#5425ff','#FF0000']]
chart_title = "Statistics per resampling method"
x_title = "Band"
y_title = "Reflectance"
height = 600
width = 1800
guidance = "horizontal"

export_name = (str(input("Path/filename: "))) 
#For example: charts/english/exploratory_analysis/descriptive_statistics/mean_std_resampling

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Path/filename: charts/english/exploratory_analysis/descriptive_statistics/mean_std_resampling
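
Each trace in the cell above is a list with one statistic per band. As a minimal sketch on a made-up frame, both statistics can be computed in a single `agg` call; `bands` is a shortened stand-in for `feature_names`.

```python
import pandas as pd

bands = ["Blue", "Green"]  # shortened stand-in for feature_names
sample = pd.DataFrame({"Blue": [0.04, 0.03, 0.05], "Green": [0.04, 0.02, 0.06]})

# One row per statistic, one column per band
stats = sample[bands].agg(["mean", "std"])

mean_trace = stats.loc["mean"].tolist()
std_trace = stats.loc["std"].tolist()
```

The resulting lists have the same shape as the inner lists passed to `rsdata_charts.line_chart`, with values ordered as in `bands`.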


In [41]:
datasets_names = ["Médias por método de reamostragem", "Desvio padrão por método de reamostragem"]
legends = [['Média DART - vizinho mais próximo 10 m', 'Média DART - interpolação bilinear 10 m', 'Média DART - interpolação cúbica 10 m', 'Média DART - vizinho mais próximo 20 m', 'Média DART - interpolação bilinear 20 m', 'Média DART - interpolação cúbica 20 m', 'Média USGS'],
           ['Std DART - vizinho mais próximo 10 m', 'Std DART - interpolação bilinear 10 m', 'Std DART - interpolação cúbica 10 m', 'Std DART - vizinho mais próximo 20 m', 'Std DART - interpolação bilinear 20 m', 'Std DART - interpolação cúbica 20 m', 'Std USGS']]
chart_title = "Estatísticas por método de reamostragem"
x_title = "Banda"
y_title = "Reflectância"

export_name = (str(input("Path/filename: "))) 
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/media_dpadrao_reamostragem

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Path/filename: charts/portugues/analise_exploratoria/estatisticas_descritivas/media_dpadrao_reamostragem


In [42]:
datasets_names = ['DART nearest neighbor quartiles', 'DART bilinear interpolation quartiles', 'DART cubic interpolation quartiles', 'USGS quartiles']

traces = [
            [
                [dataset_dart_nn_10[feature].describe()['min'] for feature in feature_names],
                [dataset_dart_nn_10[feature].describe()['25%'] for feature in feature_names],
                [dataset_dart_nn_10[feature].describe()['50%'] for feature in feature_names],
                [dataset_dart_nn_10[feature].describe()['75%'] for feature in feature_names],
                [dataset_dart_nn_10[feature].describe()['max'] for feature in feature_names]
             ],
             [
                [dataset_bilinear_10[feature].describe()['min'] for feature in feature_names],
                [dataset_bilinear_10[feature].describe()['25%'] for feature in feature_names],
                [dataset_bilinear_10[feature].describe()['50%'] for feature in feature_names],
                [dataset_bilinear_10[feature].describe()['75%'] for feature in feature_names],
                [dataset_bilinear_10[feature].describe()['max'] for feature in feature_names]
             ],
             [
                [dataset_dart_cubic_10[feature].describe()['min'] for feature in feature_names],
                [dataset_dart_cubic_10[feature].describe()['25%'] for feature in feature_names],
                [dataset_dart_cubic_10[feature].describe()['50%'] for feature in feature_names],
                [dataset_dart_cubic_10[feature].describe()['75%'] for feature in feature_names],
                [dataset_dart_cubic_10[feature].describe()['max'] for feature in feature_names]
             ],
            [
                [dataset_usgs[feature].describe()['min'] for feature in feature_names],
                [dataset_usgs[feature].describe()['25%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['50%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['75%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['max'] for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names],
          [feature_names, feature_names, feature_names, feature_names, feature_names]]

legends = [
            ['DART nearest neighbor min', 'DART nearest neighbor 25%', 'DART nearest neighbor 50%', 'DART nearest neighbor 75%', 'DART nearest neighbor max'],
            ['DART bilinear interpolation min', 'DART bilinear interpolation 25%', 'DART bilinear interpolation 50%', 'DART bilinear interpolation 75%', 'DART bilinear interpolation max'],
            ['DART cubic interpolation min', 'DART cubic interpolation 25%', 'DART cubic interpolation 50%', 'DART cubic interpolation 75%', 'DART cubic interpolation max'],
            ['USGS min', 'USGS 25%', 'USGS 50%', 'USGS 75%', 'USGS max']
          ]

modes = [['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines']
        ]

colors = [['#f9e54e', '#f9e54e', '#d9c108', '#a89506', '#a89506'], 
          ['#00b300', '#00b300', '#009100', '#005e00', '#005e00'],
          ['#7c58ff', '#7c58ff', '#3903ff', '#2900be', '#2900be'],
          ['#f44', '#f44', '#d00', '#b00', '#b00']]

chart_title = "Statistics per resampling method"
x_title = "Band"
y_title = "Reflectance"
height = 600
width = 1800
guidance = "horizontal"

export_name = (str(input("Path/filename: "))) 
#For example: charts/english/exploratory_analysis/descriptive_statistics/quartiles_resampling_method_10m

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Path/filename: charts/english/exploratory_analysis/descriptive_statistics/quartiles_resampling_method_10m
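
The quartile traces above call `describe()` once per band and per statistic. A possible shortcut, sketched on a made-up frame, is to call `describe()` once for all bands and slice the rows; `quartile_rows` and `quartile_traces` are names introduced here.

```python
import pandas as pd

bands = ["Blue", "Green"]  # shortened stand-in for feature_names
sample = pd.DataFrame({
    "Blue": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Green": [10.0, 20.0, 30.0, 40.0, 50.0],
})

summary = sample[bands].describe()  # computed once for all bands
quartile_rows = ["min", "25%", "50%", "75%", "max"]

# One list per statistic, with one value per band
quartile_traces = [summary.loc[row, bands].tolist() for row in quartile_rows]
```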


In [43]:
datasets_names = ['Quartis DART - vizinho mais próximo', 'Quartis DART - interpolação bilinear', 'Quartis DART - interpolação cúbica', 'Quartis USGS']
legends = [
            ['DART vizinho mais próximo mín', 'DART vizinho mais próximo 25%', 'DART vizinho mais próximo 50%', 'DART vizinho mais próximo 75%', 'DART vizinho mais próximo máx'],
            ['DART interpolação bilinear mín', 'DART interpolação bilinear 25%', 'DART interpolação bilinear 50%', 'DART interpolação bilinear 75%', 'DART interpolação bilinear máx'],
            ['DART interpolação cúbica mín', 'DART interpolação cúbica 25%', 'DART interpolação cúbica 50%', 'DART interpolação cúbica 75%', 'DART interpolação cúbica máx'],
            ['USGS mín', 'USGS 25%', 'USGS 50%', 'USGS 75%', 'USGS máx']
          ]

chart_title = "Estatísticas por método de reamostragem"
x_title = "Banda"
y_title = "Reflectância"
height = 600
width = 1900
guidance = "horizontal"

export_name = (str(input("Path/filename: "))) 
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_metodo_reamostragem_10m

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Path/filename: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_metodo_reamostragem_10m


In [44]:
datasets_names = ['DART nearest neighbor quartiles', 'DART bilinear interpolation quartiles', 'DART cubic interpolation quartiles', 'USGS quartiles']

traces = [
            [
                [dataset_dart_nn_20[feature].describe()['min'] for feature in feature_names],
                [dataset_dart_nn_20[feature].describe()['25%'] for feature in feature_names],
                [dataset_dart_nn_20[feature].describe()['50%'] for feature in feature_names],
                [dataset_dart_nn_20[feature].describe()['75%'] for feature in feature_names],
                [dataset_dart_nn_20[feature].describe()['max'] for feature in feature_names]
             ],
             [
                [dataset_bilinear_20[feature].describe()['min'] for feature in feature_names],
                [dataset_bilinear_20[feature].describe()['25%'] for feature in feature_names],
                [dataset_bilinear_20[feature].describe()['50%'] for feature in feature_names],
                [dataset_bilinear_20[feature].describe()['75%'] for feature in feature_names],
                [dataset_bilinear_20[feature].describe()['max'] for feature in feature_names]
             ],
             [
                [dataset_dart_cubic_20[feature].describe()['min'] for feature in feature_names],
                [dataset_dart_cubic_20[feature].describe()['25%'] for feature in feature_names],
                [dataset_dart_cubic_20[feature].describe()['50%'] for feature in feature_names],
                [dataset_dart_cubic_20[feature].describe()['75%'] for feature in feature_names],
                [dataset_dart_cubic_20[feature].describe()['max'] for feature in feature_names]
             ],
            [
                [dataset_usgs[feature].describe()['min'] for feature in feature_names],
                [dataset_usgs[feature].describe()['25%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['50%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['75%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['max'] for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names],
          [feature_names, feature_names, feature_names, feature_names, feature_names]]

legends = [
            ['DART nearest neighbor min', 'DART nearest neighbor 25%', 'DART nearest neighbor 50%', 'DART nearest neighbor 75%', 'DART nearest neighbor max'],
            ['DART bilinear interpolation min', 'DART bilinear interpolation 25%', 'DART bilinear interpolation 50%', 'DART bilinear interpolation 75%', 'DART bilinear interpolation max'],
            ['DART cubic interpolation min', 'DART cubic interpolation 25%', 'DART cubic interpolation 50%', 'DART cubic interpolation 75%', 'DART cubic interpolation max'],
            ['USGS min', 'USGS 25%', 'USGS 50%', 'USGS 75%', 'USGS max']
          ]

modes = [['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines']
        ]

colors = [['#f9e54e', '#f9e54e', '#d9c108', '#a89506', '#a89506'], 
          ['#00b300', '#00b300', '#009100', '#005e00', '#005e00'],
          ['#7c58ff', '#7c58ff', '#3903ff', '#2900be', '#2900be'],
          ['#f44', '#f44', '#d00', '#b00', '#b00']]

chart_title = "Statistics per resampling method"
x_title = "Band"
y_title = "Reflectance"
height = 600
width = 1800
guidance = "horizontal"

export_name = (str(input("Path/filename: "))) 
#For example: charts/english/exploratory_analysis/descriptive_statistics/quartiles_resampling_method_20m

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Path/filename: charts/english/exploratory_analysis/descriptive_statistics/quartiles_resampling_method_20m


In [45]:
datasets_names = ['Quartis DART - vizinho mais próximo', 'Quartis DART - interpolação bilinear', 'Quartis DART - interpolação cúbica', 'Quartis USGS']
legends = [
            ['DART vizinho mais próximo mín', 'DART vizinho mais próximo 25%', 'DART vizinho mais próximo 50%', 'DART vizinho mais próximo 75%', 'DART vizinho mais próximo máx'],
            ['DART interpolação bilinear mín', 'DART interpolação bilinear 25%', 'DART interpolação bilinear 50%', 'DART interpolação bilinear 75%', 'DART interpolação bilinear máx'],
            ['DART interpolação cúbica mín', 'DART interpolação cúbica 25%', 'DART interpolação cúbica 50%', 'DART interpolação cúbica 75%', 'DART interpolação cúbica máx'],
            ['USGS mín', 'USGS 25%', 'USGS 50%', 'USGS 75%', 'USGS máx']
          ]

chart_title = "Estatísticas por método de reamostragem"
x_title = "Banda"
y_title = "Reflectância"
height = 600
width = 1900
guidance = "horizontal"

export_name = (str(input("Path/filename: "))) 
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_metodo_reamostragem_20m

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Path/filename: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_metodo_reamostragem_20m


In [46]:
datasets_names = ['DART quartiles', 'USGS quartiles']

traces = [
            [
                [dataset_dart_nn_10[feature].describe()['min'] for feature in feature_names],
                [dataset_dart_nn_10[feature].describe()['25%'] for feature in feature_names],
                [dataset_dart_nn_10[feature].describe()['50%'] for feature in feature_names],
                [dataset_dart_nn_10[feature].describe()['75%'] for feature in feature_names],
                [dataset_dart_nn_10[feature].describe()['max'] for feature in feature_names],
                [dataset_bilinear_10[feature].describe()['min'] for feature in feature_names],
                [dataset_bilinear_10[feature].describe()['25%'] for feature in feature_names],
                [dataset_bilinear_10[feature].describe()['50%'] for feature in feature_names],
                [dataset_bilinear_10[feature].describe()['75%'] for feature in feature_names],
                [dataset_bilinear_10[feature].describe()['max'] for feature in feature_names],
                [dataset_dart_cubic_10[feature].describe()['min'] for feature in feature_names],
                [dataset_dart_cubic_10[feature].describe()['25%'] for feature in feature_names],
                [dataset_dart_cubic_10[feature].describe()['50%'] for feature in feature_names],
                [dataset_dart_cubic_10[feature].describe()['75%'] for feature in feature_names],
                [dataset_dart_cubic_10[feature].describe()['max'] for feature in feature_names]
             ],
             [
                [dataset_usgs[feature].describe()['min'] for feature in feature_names],
                [dataset_usgs[feature].describe()['25%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['50%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['75%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['max'] for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names, feature_names, feature_names, 
           feature_names, feature_names, feature_names, feature_names, feature_names, 
           feature_names, feature_names, feature_names, feature_names, feature_names],
          [feature_names, feature_names, feature_names, feature_names, feature_names]]

legends = [
            ['DART nearest neighbor min', 'DART nearest neighbor 25%', 'DART nearest neighbor 50%', 'DART nearest neighbor 75%', 'DART nearest neighbor max',
             'DART bilinear interpolation min', 'DART bilinear interpolation 25%', 'DART bilinear interpolation 50%', 'DART bilinear interpolation 75%', 'DART bilinear interpolation max',
             'DART cubic interpolation min', 'DART cubic interpolation 25%', 'DART cubic interpolation 50%', 'DART cubic interpolation 75%', 'DART cubic interpolation max'],
            ['USGS min', 'USGS 25%', 'USGS 50%', 'USGS 75%', 'USGS max']
          ]

modes = [['dot', 'dash', 'dash', 'dash', 'markers+lines',
          'dot', 'dash', 'dash', 'dash', 'markers+lines',
          'dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines']
        ]

colors = [['#f9e54e', '#f9e54e', '#d9c108', '#a89506', '#a89506', 
           '#00b300', '#00b300', '#009100', '#005e00', '#005e00',
           '#7c58ff', '#7c58ff', '#3903ff', '#2900be', '#2900be'],
          ['#f44', '#f44', '#d00', '#b00', '#b00']]

chart_title = "Statistics per resampling method"
x_title = "Band"
y_title = "Reflectance"
height = 600
width = 1200
guidance = "horizontal"

export_name = (str(input("Path/filename: "))) 
#For example: charts/english/exploratory_analysis/descriptive_statistics/quartiles_resampling_method_grouped_10m

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Path/filename: charts/english/exploratory_analysis/descriptive_statistics/quartiles_resampling_method_grouped_10m


In [47]:
datasets_names = ['Quartis DART', 'Quartis USGS']

legends = [
            ['DART vizinho mais próximo mín', 'DART vizinho mais próximo 25%', 'DART vizinho mais próximo 50%', 'DART vizinho mais próximo 75%', 'DART vizinho mais próximo máx',
             'DART interpolação bilinear mín', 'DART interpolação bilinear 25%', 'DART interpolação bilinear 50%', 'DART interpolação bilinear 75%', 'DART interpolação bilinear máx',
             'DART interpolação cúbica mín', 'DART interpolação cúbica 25%', 'DART interpolação cúbica 50%', 'DART interpolação cúbica 75%', 'DART interpolação cúbica máx'],
            ['USGS mín', 'USGS 25%', 'USGS 50%', 'USGS 75%', 'USGS máx']
          ]

chart_title = "Estatísticas por método de reamostragem"
x_title = "Banda"
y_title = "Reflectância"

export_name = (str(input("Path/filename: "))) 
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_metodo_reamostragem_agrupado_10m

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Path/filename: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_metodo_reamostragem_agrupado_10m


In [48]:
datasets_names = ['DART quartiles', 'USGS quartiles']

traces = [
            [
                [dataset_dart_nn_20[feature].describe()['min'] for feature in feature_names],
                [dataset_dart_nn_20[feature].describe()['25%'] for feature in feature_names],
                [dataset_dart_nn_20[feature].describe()['50%'] for feature in feature_names],
                [dataset_dart_nn_20[feature].describe()['75%'] for feature in feature_names],
                [dataset_dart_nn_20[feature].describe()['max'] for feature in feature_names],
                [dataset_bilinear_20[feature].describe()['min'] for feature in feature_names],
                [dataset_bilinear_20[feature].describe()['25%'] for feature in feature_names],
                [dataset_bilinear_20[feature].describe()['50%'] for feature in feature_names],
                [dataset_bilinear_20[feature].describe()['75%'] for feature in feature_names],
                [dataset_bilinear_20[feature].describe()['max'] for feature in feature_names],
                [dataset_dart_cubic_20[feature].describe()['min'] for feature in feature_names],
                [dataset_dart_cubic_20[feature].describe()['25%'] for feature in feature_names],
                [dataset_dart_cubic_20[feature].describe()['50%'] for feature in feature_names],
                [dataset_dart_cubic_20[feature].describe()['75%'] for feature in feature_names],
                [dataset_dart_cubic_20[feature].describe()['max'] for feature in feature_names]
             ],
             [
                [dataset_usgs[feature].describe()['min'] for feature in feature_names],
                [dataset_usgs[feature].describe()['25%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['50%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['75%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['max'] for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names, feature_names, feature_names, 
           feature_names, feature_names, feature_names, feature_names, feature_names, 
           feature_names, feature_names, feature_names, feature_names, feature_names],
          [feature_names, feature_names, feature_names, feature_names, feature_names]]

legends = [
            ['DART nearest neighbor min', 'DART nearest neighbor 25%', 'DART nearest neighbor 50%', 'DART nearest neighbor 75%', 'DART nearest neighbor max',
             'DART bilinear interpolation min', 'DART bilinear interpolation 25%', 'DART bilinear interpolation 50%', 'DART bilinear interpolation 75%', 'DART bilinear interpolation max',
             'DART cubic interpolation min', 'DART cubic interpolation 25%', 'DART cubic interpolation 50%', 'DART cubic interpolation 75%', 'DART cubic interpolation max'],
            ['USGS min', 'USGS 25%', 'USGS 50%', 'USGS 75%', 'USGS max']
          ]

modes = [['dot', 'dash', 'dash', 'dash', 'markers+lines',
          'dot', 'dash', 'dash', 'dash', 'markers+lines',
          'dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines']
        ]

colors = [['#f9e54e', '#f9e54e', '#d9c108', '#a89506', '#a89506', 
           '#00b300', '#00b300', '#009100', '#005e00', '#005e00',
           '#7c58ff', '#7c58ff', '#3903ff', '#2900be', '#2900be'],
          ['#f44', '#f44', '#d00', '#b00', '#b00']]

chart_title = "Statistics per resampling method"
x_title = "Band"
y_title = "Reflectance"
height = 600
width = 1200
guidance = "horizontal"

export_name = (str(input("Path/filename: "))) 
#For example: charts/english/exploratory_analysis/descriptive_statistics/quartiles_resampling_method_grouped_20m

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Path/filename: charts/english/exploratory_analysis/descriptive_statistics/quartiles_resampling_method_grouped_20m


In [49]:
datasets_names = ['Quartis DART', 'Quartis USGS']

legends = [
            ['DART vizinho mais próximo mín', 'DART vizinho mais próximo 25%', 'DART vizinho mais próximo 50%', 'DART vizinho mais próximo 75%', 'DART vizinho mais próximo máx',
             'DART interpolação bilinear mín', 'DART interpolação bilinear 25%', 'DART interpolação bilinear 50%', 'DART interpolação bilinear 75%', 'DART interpolação bilinear máx',
             'DART interpolação cúbica mín', 'DART interpolação cúbica 25%', 'DART interpolação cúbica 50%', 'DART interpolação cúbica 75%', 'DART interpolação cúbica máx'],
            ['USGS mín', 'USGS 25%', 'USGS 50%', 'USGS 75%', 'USGS máx']
          ]

chart_title = "Estatísticas por método de reamostragem"
x_title = "Banda"
y_title = "Reflectância"

export_name = (str(input("Path/filename: "))) 
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_metodo_reamostragem_agrupado_20m

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Path/filename: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_metodo_reamostragem_agrupado_20m
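
The quartile cells above call `describe()` five times per dataset to build each group of traces. As a minimal sketch, assuming each dataset is a pandas DataFrame with one column per band, the same five series (min, 25%, 50%, 75%, max) can be obtained with a single `quantile` call. The helper name `quartile_traces` and the toy data are hypothetical and are not part of the `modules` package.

```python
import pandas as pd


def quartile_traces(dataset, features):
    """Return five lists (min, 25%, 50%, 75%, max), each with one value per feature."""
    # quantile() with a list of probabilities returns a DataFrame with one row
    # per probability and one column per feature.
    stats = dataset[features].quantile([0.0, 0.25, 0.5, 0.75, 1.0])
    return stats.to_numpy().tolist()


# Toy data standing in for a reflectance dataset (values are made up).
toy = pd.DataFrame({
    "Blue": [0.1, 0.2, 0.3, 0.4, 0.5],
    "Green": [0.2, 0.4, 0.6, 0.8, 1.0],
})
traces_toy = quartile_traces(toy, ["Blue", "Green"])
```

With such a helper, one group of traces could be written as `quartile_traces(dataset_usgs, feature_names)`, which keeps the nested `traces` list short when several resampling methods are compared.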


## Exploratory analysis

### Kolmogorov-Smirnov

In [55]:
print("DART x USGS")
for feature in feature_names + radiometric_indexes:
    print(feature, ks_2samp(dataset_dart[feature], dataset_usgs[feature]))
print("*******************")
print("                   ")

print("DART (only Plastic and water) x USGS (only Plastic and water)")
for feature in feature_names + radiometric_indexes:
    print(feature, ks_2samp(dart_subdatasets['plastic_and_water'][feature], usgs_subdatasets['plastic_and_water'][feature]))
print("*******************")
print("                   ")

print("DART (only Plastic and water) x USGS 2019 (only Plastic and water)")
for feature in feature_names + radiometric_indexes:
    print(feature, ks_2samp(dart_subdatasets['plastic_and_water'][feature], usgs_subdatasets['plp2019_plastic_water'][feature]))
print("*******************")
print("                   ")

print("DART (only Plastic and water) x USGS 2021 (only Plastic and water)")
for feature in feature_names + radiometric_indexes:
    print(feature, ks_2samp(dart_subdatasets['plastic_and_water'][feature], usgs_subdatasets['plp2021_plastic_water'][feature]))
print("*******************")
print("                   ")


print("USGS 2019 x USGS 2021")
for feature in feature_names + radiometric_indexes:
    print(feature, ks_2samp(usgs_subdatasets['plp2019'][feature], usgs_subdatasets['plp2021'][feature]))
print("*******************")
print("                   ")

DART x USGS
Blue KstestResult(statistic=0.7466163046899591, pvalue=0.0)
Green KstestResult(statistic=0.5848284545168398, pvalue=0.0)
Red KstestResult(statistic=0.5731822474032106, pvalue=0.0)
RedEdge1 KstestResult(statistic=0.5904941768964431, pvalue=0.0)
RedEdge2 KstestResult(statistic=0.7409505823103557, pvalue=0.0)
RedEdge3 KstestResult(statistic=0.7982373308152345, pvalue=0.0)
NIR1 KstestResult(statistic=0.7135662574756059, pvalue=0.0)
NIR2 KstestResult(statistic=0.7727415801070192, pvalue=0.0)
SWIR1 KstestResult(statistic=0.511488825936418, pvalue=0.0)
SWIR2 KstestResult(statistic=0.5105445388731508, pvalue=0.0)
NDWI KstestResult(statistic=0.7886433742524394, pvalue=0.0)
WRI KstestResult(statistic=0.7036575385583884, pvalue=0.0)
NDVI KstestResult(statistic=0.9612356670713612, pvalue=0.0)
AWEI KstestResult(statistic=0.5705130626377085, pvalue=0.0)
MNDWI KstestResult(statistic=0.5434435001573812, pvalue=0.0)
SR KstestResult(statistic=0.9561994694006026, pvalue=0.0)
PI KstestResult(s

In [58]:
print("DART Plastic x USGS Plastic")
for feature in feature_names + radiometric_indexes:
    print(feature, ks_2samp(dart_subdatasets['plastic'][feature], usgs_subdatasets['plastic'][feature]))
print("*******************")
print("                   ")

print("DART Water x USGS Water")
for feature in feature_names + radiometric_indexes:
    print(feature, ks_2samp(dart_subdatasets['water'][feature], usgs_subdatasets['water'][feature]))
print("*******************")
print("                   ")
print("                   ")

DART Plastic x USGS Plastic
Blue KstestResult(statistic=0.941747572815534, pvalue=2.307100562629868e-126)
Green KstestResult(statistic=0.9320388349514563, pvalue=1.5549710755286305e-119)
Red KstestResult(statistic=0.883495145631068, pvalue=1.299498511793246e-95)
RedEdge1 KstestResult(statistic=0.6132108183079057, pvalue=2.112898308941087e-37)
RedEdge2 KstestResult(statistic=0.8462205270457698, pvalue=4.4881758250161316e-83)
RedEdge3 KstestResult(statistic=0.8932038834951457, pvalue=1.719940405808602e-99)
NIR1 KstestResult(statistic=0.941747572815534, pvalue=2.307100562629868e-126)
NIR2 KstestResult(statistic=0.8932038834951457, pvalue=1.719940405808602e-99)
SWIR1 KstestResult(statistic=0.44660194174757284, pvalue=4.528510754565099e-19)
SWIR2 KstestResult(statistic=0.5436893203883495, pvalue=8.774089864741568e-29)
NDWI KstestResult(statistic=0.23751733703190014, pvalue=1.5940176650852678e-05)
WRI KstestResult(statistic=0.6338418862690708, pvalue=2.851042921832638e-40)
NDVI KstestResult(
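
The printed `KstestResult` lines above are hard to compare across features. The sketch below collects the statistic and p-value of `ks_2samp` into a table and flags the features for which the null hypothesis (both samples drawn from the same distribution) is rejected. The helper `ks_table`, the 0.05 threshold, and the synthetic samples are assumptions made for illustration. A printed `pvalue=0.0` most likely means the value is below floating-point precision rather than exactly zero.

```python
import numpy as np
import pandas as pd
from scipy.stats import ks_2samp


def ks_table(sample_a, sample_b, features, alpha=0.05):
    """Run a two-sample KS test per feature and return the results as a DataFrame."""
    rows = []
    for feature in features:
        result = ks_2samp(sample_a[feature], sample_b[feature])
        rows.append({
            "feature": feature,
            "statistic": result.statistic,
            "pvalue": result.pvalue,
            # True when the same-distribution hypothesis is rejected at level alpha.
            "reject_h0": bool(result.pvalue < alpha),
        })
    return pd.DataFrame(rows).set_index("feature")


# Synthetic stand-ins for two datasets (not real reflectance values).
rng = np.random.default_rng(0)
a = pd.DataFrame({"Blue": rng.normal(0.0, 1.0, 500),
                  "Green": rng.normal(0.0, 1.0, 500)})
# "Blue" is shifted by three standard deviations; "Green" is an exact copy.
b = pd.DataFrame({"Blue": rng.normal(3.0, 1.0, 500),
                  "Green": a["Green"].to_numpy()})
table = ks_table(a, b, ["Blue", "Green"])
```

Sorting the table by `statistic` shows which features differ most between the two sources. With large samples the test tends to reject even small differences, so the statistic is usually more informative than the p-value alone.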

### Descriptive statistics

In [None]:
# TODO: save the charts in default folders (use the os library to create the folders)
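
One possible way to address the note above is to create the target folder before a chart is exported. The sketch below uses only the standard library. The helper name `prepare_chart_path` and its `base_dir` parameter are hypothetical, and how `rsdata_charts` consumes the resulting path is not shown here.

```python
import os
import tempfile


def prepare_chart_path(export_name, base_dir="."):
    """Create the parent folder of export_name if needed and return the full path."""
    full_path = os.path.join(base_dir, export_name)
    parent = os.path.dirname(full_path)
    if parent:
        # exist_ok=True makes the call safe when the folder already exists.
        os.makedirs(parent, exist_ok=True)
    return full_path


# Demonstration inside a temporary directory, so no project folders are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = prepare_chart_path(
        "charts/english/exploratory_analysis/descriptive_statistics/classes",
        base_dir=tmp,
    )
    folder_created = os.path.isdir(os.path.dirname(path))
```

The returned path could then be passed as the export name in place of the value typed at the `input` prompt, which would also let the cells run without manual input.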

In [59]:
rsdata_charts.pie_chart(
    [
    pd.concat([dart_subdatasets['plastic'], dart_subdatasets['sand'], dart_subdatasets['water']], ignore_index=True),
    pd.concat([usgs_subdatasets['plp2019'].query('Label=="Plastic"'), usgs_subdatasets['plp2019'].query('Label=="Coast"'), usgs_subdatasets['plp2019'].query('Label=="Water"')], ignore_index=True),
    pd.concat([usgs_subdatasets['plp2021'].query('Label=="Plastic"'), usgs_subdatasets['plp2021'].query('Label=="Coast"'), usgs_subdatasets['plp2021'].query('Label=="Water"'), usgs_subdatasets['plp2021'].query('Label=="Wood"')], ignore_index=True)
    ], 
    ['Label', 'Label', 'Label'], 
    ["DART (simulation)","USGS 2019", "USGS 2021"], 
    "DART x USGS classes", 630, 900, 
    ['#FF69B4', '#FFD700', '#1E90FF'], str(input("Chart path: ")))
    #For example: charts/english/exploratory_analysis/descriptive_statistics/classes

Chart path: charts/english/exploratory_analysis/descriptive_statistics/classes


In [60]:
rsdata_charts.pie_chart(
    [
    pd.concat([dart_subdatasets['plastic'], dart_subdatasets['sand'], dart_subdatasets['water']], ignore_index=True),
    pd.concat([usgs_subdatasets['plp2019'].query('Label=="Plastic"'), usgs_subdatasets['plp2019'].query('Label=="Coast"'), usgs_subdatasets['plp2019'].query('Label=="Water"')], ignore_index=True),
    pd.concat([usgs_subdatasets['plp2021'].query('Label=="Plastic"'), usgs_subdatasets['plp2021'].query('Label=="Coast"'), usgs_subdatasets['plp2021'].query('Label=="Water"'), usgs_subdatasets['plp2021'].query('Label=="Wood"')], ignore_index=True)
    ], 
    ['Classe', 'Classe', 'Classe'], 
    ["DART (simulação)", "USGS 2019", "USGS 2021"], 
    "Classes DART x USGS", 630, 900, 
    ['#FF69B4', '#FFD700', '#1E90FF'], str(input("Caminho do gráfico: ")))
    #For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/classes

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/classes


In [61]:
rsdata_charts.pie_chart([dart_subdatasets['plastic']], 
                        ['Polymer'], ["Polymers in DART data (simulated)"], " ", 600, 450, 
                        ['#ADF224','#1AB1B1', '#2f1b70', '#D81F88', '#FF8825', '#F1C800'], 
                        str(input("Chart path: ")))
                        #For example: charts/english/exploratory_analysis/descriptive_statistics/polymers_dart

Chart path: charts/english/exploratory_analysis/descriptive_statistics/polymers_dart


In [71]:
rsdata_charts.pie_chart([dart_subdatasets['plastic']], 
                        ['Polymer'], ["Polímeros nos dados DART (simulados)"], " ", 600, 450, 
                        ['#ADF224','#1AB1B1', '#2f1b70', '#D81F88', '#FF8825', '#F1C800'], 
                        str(input("Caminho do gráfico: ")))
                        #For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/polimeros_dart


Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/polimeros_dart


In [63]:
rsdata_charts.pie_chart([pd.concat([usgs_subdatasets['plp2019'].query('Polymer!="None"'), usgs_subdatasets['plp2021'].query('Polymer!="None"')], ignore_index=True)], 
                        ['Polymer'], ["Polymers in USGS data (real)"], " ", 600, 450, 
                        ['#49C658','#8945AB', '#FF675F', '#FCFE5E'], 
                        str(input("Chart path: ")))
                        #For example: charts/english/exploratory_analysis/descriptive_statistics/polymers_usgs

Chart path: charts/english/exploratory_analysis/descriptive_statistics/polymers_usgs


In [64]:
rsdata_charts.pie_chart([pd.concat([usgs_subdatasets['plp2019'].query('Polymer!="None"'), usgs_subdatasets['plp2021'].query('Polymer!="None"')], ignore_index=True)], 
                        ['Polímero'], ["Polímeros nos dados USGS (reais)"], " ", 600, 450, 
                        ['#49C658', '#8945AB', '#FF675F', '#FCFE5E'], 
                        str(input("Caminho do gráfico: ")))
                        #For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/polimeros_usgs

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/polimeros_usgs


In [65]:
rsdata_charts.pie_chart([dataset_usgs], ['Year'], [" "], "USGS data sources", 400, 400, ['#c20', '#ff8d77'], str(input("Chart path: ")))
#For example: charts/english/exploratory_analysis/descriptive_statistics/usgs_sources

Chart path: charts/english/exploratory_analysis/descriptive_statistics/usgs_sources


In [66]:
rsdata_charts.pie_chart([dataset_usgs], ['Year'], [" "], "Fontes dos dados USGS", 400, 400, ['#c20', '#ff8d77'], str(input("Caminho do gráfico: ")))
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/fontes_usgs

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/fontes_usgs


In [67]:
rsdata_charts.pie_chart([usgs_subdatasets['plp2019'], usgs_subdatasets['plp2021']], ['Path', 'Path'], 
                        ["Days in 2019", "Days in 2021"], "USGS acquisition dates", 500, 1000, ['#991900', '#c20', '#f53', '#ff9c88', '#ffc6bb'], 
                        str(input("Chart path: ")))
                        #For example: charts/english/exploratory_analysis/descriptive_statistics/usgs_dates

Chart path: charts/english/exploratory_analysis/descriptive_statistics/usgs_dates


In [68]:
rsdata_charts.pie_chart([usgs_subdatasets['plp2019'], usgs_subdatasets['plp2021']], ['Path', 'Path'], 
                        ["Dias em 2019", "Dias em 2021"], "Datas de aquisição USGS", 500, 1000, ['#991900', '#c20', '#f53', '#ff9c88', '#ffc6bb'], 
                        str(input("Caminho do gráfico: ")))
                        #For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/datas_usgs

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/datas_usgs


In [69]:
rsdata_charts.pie_chart([usgs_subdatasets['plastic'], usgs_subdatasets['water'], usgs_subdatasets['coast'], usgs_subdatasets['wood']], ['Year', 'Year', 'Year', 'Year'], 
                        ["Plastic", "Water", "Coast", "Wood"], "USGS data sources - per class", 400, 1200, ['#c20', '#ff8d77'], 
                        str(input("Chart path: ")))
                        #For example: charts/english/exploratory_analysis/descriptive_statistics/usgs_sources_per_class

Chart path: charts/english/exploratory_analysis/descriptive_statistics/usgs_sources_per_class


In [70]:
rsdata_charts.pie_chart([usgs_subdatasets['plastic'], usgs_subdatasets['water'], usgs_subdatasets['coast'], usgs_subdatasets['wood']], ['Year', 'Year', 'Year', 'Year'], 
                        ["Plástico", "Água", "Costa", "Madeira"], "Fontes dos dados USGS - por classe", 400, 1200, ['#c20', '#ff8d77'], 
                        str(input("Caminho do gráfico: ")))
                        #For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/fontes_usgs_por_classe


Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/fontes_usgs_por_classe


In [72]:
datasets_names = ["Mean","Std"]
traces = [
            [
                [dataset_dart[feature].mean() for feature in feature_names],
                [dataset_usgs[feature].mean() for feature in feature_names]
             ],
             [
                [dataset_dart[feature].std() for feature in feature_names],
                [dataset_usgs[feature].std() for feature in feature_names]
             ]
          ]
labels = [[feature_names, feature_names], [feature_names, feature_names]]
legends = [['DART mean', 'USGS mean'],
           ['DART std', 'USGS std']]
modes = [['markers+lines', 'markers+lines'],
         ['dash', 'dash']]
colors = [['#008000', '#FF0000'], ['#008000', '#FF0000']]
chart_title = "DART x USGS statistics"
x_title = "Band"
y_title = "Reflectance"
height = 450
width = 950
guidance = "horizontal"

export_name = str(input("Chart path: "))
#For example: charts/english/exploratory_analysis/descriptive_statistics/mean_std

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Chart path: charts/english/exploratory_analysis/descriptive_statistics/mean_std


In [73]:
datasets_names = ["Média","Desvio padrão"]
legends = [['Média DART', 'Média USGS'],
           ['Desvio padrão DART', 'Desvio padrão USGS']]
chart_title = "Estatísticas DART x USGS"
x_title = "Banda"
y_title = "Reflectância"
width = 1050

export_name = str(input("Caminho do gráfico: "))
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/media_dpadrao

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/media_dpadrao


In [74]:
datasets_names = ["Mean (only plastic and water)","Std (only plastic and water)"]
traces = [
            [
                [dart_subdatasets['plastic_and_water'][feature].mean() for feature in feature_names], 
                [usgs_subdatasets['plastic_and_water'][feature].mean() for feature in feature_names]
             ],
             [
                [dart_subdatasets['plastic_and_water'][feature].std() for feature in feature_names], 
                [usgs_subdatasets['plastic_and_water'][feature].std() for feature in feature_names]
             ]
          ]
labels = [[feature_names, feature_names], [feature_names, feature_names]]
legends = [['DART mean (only plastic and water)', 'USGS mean (only plastic and water)'],
           ['DART std (only plastic and water)', 'USGS std (only plastic and water)']]
modes = [['markers+lines', 'markers+lines'],
         ['dash', 'dash']]
colors = [['#008000', '#FF0000'], ['#008000', '#FF0000']]
chart_title = "DART x USGS statistics (only plastic and water)"
x_title = "Band"
y_title = "Reflectance"
height = 450
width = 950
guidance = "horizontal"

export_name = str(input("Chart path: "))
#For example: charts/english/exploratory_analysis/descriptive_statistics/mean_std_plastic_water

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Chart path: charts/english/exploratory_analysis/descriptive_statistics/mean_std_plastic_water


In [75]:
datasets_names = ["Média (apenas plástico e água)","Desvio padrão (apenas plástico e água)"]
legends = [['Média DART', 'Média USGS'],
           ['Desvio padrão DART', 'Desvio padrão USGS']]
chart_title = "Estatísticas DART x USGS (apenas plástico e água)"
x_title = "Banda"
y_title = "Reflectância"
width = 1050

export_name = str(input("Caminho do gráfico: "))
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/media_dpadrao_plastico_agua

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/media_dpadrao_plastico_agua


In [76]:
datasets_names = ["Mean (only plastic and water)","Std (only plastic and water)"]
traces = [
            [
                [dart_subdatasets['plastic_and_water'][feature].mean() for feature in feature_names], 
                [usgs_subdatasets['plp2019_plastic_water'][feature].mean() for feature in feature_names],
                [usgs_subdatasets['plp2021_plastic_water'][feature].mean() for feature in feature_names]
             ],
             [
                [dart_subdatasets['plastic_and_water'][feature].std() for feature in feature_names], 
                [usgs_subdatasets['plp2019_plastic_water'][feature].std() for feature in feature_names],
                [usgs_subdatasets['plp2021_plastic_water'][feature].std() for feature in feature_names],
             ]
          ]
labels = [[feature_names, feature_names, feature_names], [feature_names, feature_names, feature_names]]
legends = [['DART mean (only plastic and water)', 'USGS 2019 mean (only plastic and water)', 'USGS 2021 mean (only plastic and water)'],
           ['DART std (only plastic and water)', 'USGS 2019 std (only plastic and water)', 'USGS 2021 std (only plastic and water)']]
modes = [['markers+lines', 'markers+lines', 'markers+lines'],
         ['dash', 'dash', 'dash']]
colors = [['#008000', '#FF0000', '#0000DD'], ['#008000', '#FF0000', '#0000DD']]
chart_title = "DART x USGS statistics (only plastic and water)"
x_title = "Band"
y_title = "Reflectance"
height = 450
width = 950
guidance = "horizontal"

export_name = str(input("Chart path: "))
#For example: charts/english/exploratory_analysis/descriptive_statistics/mean_std_plastic_water_per_year

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Chart path: charts/english/exploratory_analysis/descriptive_statistics/mean_std_plastic_water_per_year


In [77]:
datasets_names = ["Média (apenas plástico e água)","Desvio padrão (apenas plástico e água)"]
legends = [['Média DART', 'Média USGS 2019', 'Média USGS 2021'],
           ['Desvio padrão DART', 'Desvio padrão USGS 2019', 'Desvio padrão USGS 2021']]
chart_title = "Estatísticas DART x USGS (apenas plástico e água)"
x_title = "Banda"
y_title = "Reflectância"
width = 1050

export_name = str(input("Caminho do gráfico: "))
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/media_dpadrao_plastico_agua_por_ano

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/media_dpadrao_plastico_agua_por ano


In [78]:
datasets_names = ["DART (simulated data)", "USGS (real data)"]

traces = [
             [
                [dataset_dart[feature].describe()['min'] for feature in feature_names],
                [dataset_dart[feature].describe()['25%'] for feature in feature_names],
                [dataset_dart[feature].describe()['50%'] for feature in feature_names],
                [dataset_dart[feature].describe()['75%'] for feature in feature_names],
                [dataset_dart[feature].describe()['max'] for feature in feature_names]
             ],
             [
                [dataset_usgs[feature].describe()['min'] for feature in feature_names],
                [dataset_usgs[feature].describe()['25%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['50%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['75%'] for feature in feature_names],
                [dataset_usgs[feature].describe()['max'] for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names, feature_names, feature_names],
          [feature_names, feature_names, feature_names, feature_names, feature_names]]

legends = [['DART min', 'DART 25%', 'DART 50%', 'DART 75%', 'DART max'],
           ['USGS min', 'USGS 25%', 'USGS 50%', 'USGS 75%', 'USGS max']]

modes = [['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines']
        ]

colors = [['#008000', '#008000', '#008000', '#008000', '#008000'],
          ['#FF0000', '#FF0000', '#FF0000', '#FF0000', '#FF0000']]

chart_title = "DART x USGS quartiles"
x_title = "Band"
y_title = "Reflectance"
height = 600
width = 1400
guidance = "horizontal"


export_name = str(input("Chart path: "))
#For example: charts/english/exploratory_analysis/descriptive_statistics/quartiles

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Chart path: charts/english/exploratory_analysis/descriptive_statistics/quartiles
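
The quartile traces above call `describe()` once per band and per statistic. A minimal sketch of a more compact equivalent, assuming the dataset is a pandas DataFrame with one numeric column per band. The helper `quartile_traces` and the toy values are hypothetical and are not part of the `modules` package.

```python
import pandas as pd

def quartile_traces(dataset, columns):
    """Return [min, 25%, 50%, 75%, max] as five lists, one value per column."""
    # quantile() with a list of probabilities returns one row per probability;
    # 0.0 and 1.0 correspond to the minimum and maximum.
    quantiles = dataset[columns].quantile([0.0, 0.25, 0.5, 0.75, 1.0])
    return quantiles.to_numpy().tolist()

# Toy data standing in for dataset_dart (values are illustrative only)
toy = pd.DataFrame({
    "Blue": [0.02, 0.04, 0.06, 0.08],
    "Green": [0.10, 0.30, 0.20, 0.40],
})
toy_traces = quartile_traces(toy, ["Blue", "Green"])
```

The result has the same layout as one inner list of `traces`, so it could be passed to `line_chart` in place of the five list comprehensions.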


In [79]:
datasets_names = ["DART (dados simulados)", "USGS (dados reais)"]

chart_title = "Quartis DART x USGS"
x_title = "Banda"
y_title = "Reflectância"
guidance = "horizontal"

export_name = str(input("Caminho do gráfico: "))
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis


In [80]:
datasets_names = ["DART Plastic (simulated)", "DART Water (simulated)", "DART Sand (simulated)"]

traces = [
            [
                [dart_subdatasets['plastic'][feature].describe()['min'] for feature in feature_names],
                [dart_subdatasets['plastic'][feature].describe()['25%'] for feature in feature_names],
                [dart_subdatasets['plastic'][feature].describe()['50%'] for feature in feature_names],
                [dart_subdatasets['plastic'][feature].describe()['75%'] for feature in feature_names],
                [dart_subdatasets['plastic'][feature].describe()['max'] for feature in feature_names]
             ],
             [
                [dart_subdatasets['water'][feature].describe()['min'] for feature in feature_names],
                [dart_subdatasets['water'][feature].describe()['25%'] for feature in feature_names],
                [dart_subdatasets['water'][feature].describe()['50%'] for feature in feature_names],
                [dart_subdatasets['water'][feature].describe()['75%'] for feature in feature_names],
                [dart_subdatasets['water'][feature].describe()['max'] for feature in feature_names]
             ],
             [
                [dart_subdatasets['sand'][feature].describe()['min'] for feature in feature_names],
                [dart_subdatasets['sand'][feature].describe()['25%'] for feature in feature_names],
                [dart_subdatasets['sand'][feature].describe()['50%'] for feature in feature_names],
                [dart_subdatasets['sand'][feature].describe()['75%'] for feature in feature_names],
                [dart_subdatasets['sand'][feature].describe()['max'] for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names]]

legends = [
            ['DART Plastic min', 'DART Plastic 25%', 'DART Plastic 50%', 'DART Plastic 75%', 'DART Plastic max'],
            ['DART Water min', 'DART Water 25%', 'DART Water 50%', 'DART Water 75%', 'DART Water max'],
            ['DART Sand min', 'DART Sand 25%', 'DART Sand 50%', 'DART Sand 75%', 'DART Sand max']
          ]

modes = [['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines']
        ]

colors = [['#ffadd6', '#ffadd6', '#FF69B4', '#ff148a', '#ff148a'], 
          ['#73baff', '#73baff', '#1E90FF', '#005db7', '#005db7'],
          ['#ffe766', '#ffe766', '#FFD700', '#ddba00', '#ddba00']]

chart_title = "Class statistics - DART quartiles"
x_title = "Band"
y_title = "Reflectance"
height = 500
width = 1200
guidance = "horizontal"

export_name = str(input("Chart path: "))
#For example: charts/english/exploratory_analysis/descriptive_statistics/class_quartiles_dart

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Chart path: charts/english/exploratory_analysis/descriptive_statistics/class_quartiles_dart


In [81]:
datasets_names = ["DART Plástico (simulado)", "DART Água (simulado)", "DART Areia (simulado)"]

legends = [
            ['DART Plástico mín', 'DART Plástico 25%', 'DART Plástico 50%', 'DART Plástico 75%', 'DART Plástico máx'],
            ['DART Água mín', 'DART Água 25%', 'DART Água 50%', 'DART Água 75%', 'DART Água máx'],
            ['DART Areia mín', 'DART Areia 25%', 'DART Areia 50%', 'DART Areia 75%', 'DART Areia máx']
          ]

chart_title = "Estatísticas por classe - Quartis DART"
x_title = "Banda"
y_title = "Reflectância"

export_name = str(input("Caminho do gráfico: "))
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_por_classe_dart

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_por_classe_dart


In [82]:
datasets_names = ["USGS Plastic (real)", "USGS Water (real)", "USGS Coast (real)", "USGS Wood (real)"]

traces = [
             [
                [usgs_subdatasets['plastic'][feature].describe()['min'] for feature in feature_names],
                [usgs_subdatasets['plastic'][feature].describe()['25%'] for feature in feature_names],
                [usgs_subdatasets['plastic'][feature].describe()['50%'] for feature in feature_names],
                [usgs_subdatasets['plastic'][feature].describe()['75%'] for feature in feature_names],
                [usgs_subdatasets['plastic'][feature].describe()['max'] for feature in feature_names]
             ],
             [
                [usgs_subdatasets['water'][feature].describe()['min'] for feature in feature_names],
                [usgs_subdatasets['water'][feature].describe()['25%'] for feature in feature_names],
                [usgs_subdatasets['water'][feature].describe()['50%'] for feature in feature_names],
                [usgs_subdatasets['water'][feature].describe()['75%'] for feature in feature_names],
                [usgs_subdatasets['water'][feature].describe()['max'] for feature in feature_names]
             ],
             [
                [usgs_subdatasets['coast'][feature].describe()['min'] for feature in feature_names],
                [usgs_subdatasets['coast'][feature].describe()['25%'] for feature in feature_names],
                [usgs_subdatasets['coast'][feature].describe()['50%'] for feature in feature_names],
                [usgs_subdatasets['coast'][feature].describe()['75%'] for feature in feature_names],
                [usgs_subdatasets['coast'][feature].describe()['max'] for feature in feature_names]
             ],
             [
                [usgs_subdatasets['wood'][feature].describe()['min'] for feature in feature_names],
                [usgs_subdatasets['wood'][feature].describe()['25%'] for feature in feature_names],
                [usgs_subdatasets['wood'][feature].describe()['50%'] for feature in feature_names],
                [usgs_subdatasets['wood'][feature].describe()['75%'] for feature in feature_names],
                [usgs_subdatasets['wood'][feature].describe()['max'] for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names]]

legends = [
            ['USGS Plastic min', 'USGS Plastic 25%', 'USGS Plastic 50%', 'USGS Plastic 75%', 'USGS Plastic max'],
            ['USGS Water min', 'USGS Water 25%', 'USGS Water 50%', 'USGS Water 75%', 'USGS Water max'],
            ['USGS Coast min', 'USGS Coast 25%', 'USGS Coast 50%', 'USGS Coast 75%', 'USGS Coast max'],
            ['USGS Wood min', 'USGS Wood 25%', 'USGS Wood 50%', 'USGS Wood 75%', 'USGS Wood max']]


modes = [['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines']
        ]

colors = [['#ffadd6', '#ffadd6', '#FF69B4', '#ff148a', '#ff148a'], 
          ['#73baff', '#73baff', '#1E90FF', '#005db7', '#005db7'],
          ['#ffe766', '#ffe766', '#FFD700', '#ddba00', '#ddba00'],
          ['#a38fd3', '#a38fd3', '#7152bb', '#533990', '#533990']
         ]

chart_title = "Class statistics - USGS quartiles"
x_title = "Band"
y_title = "Reflectance"
height = 600
width = 1200
guidance = "horizontal"

export_name = str(input("Chart path: "))
#For example: charts/english/exploratory_analysis/descriptive_statistics/class_quartiles_usgs

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Chart path: charts/english/exploratory_analysis/descriptive_statistics/class_quartiles_usgs


In [83]:
datasets_names = ["USGS Plástico (real)", "USGS Água (real)", "USGS Costa (real)", "USGS Madeira (real)"]

legends = [
            ['USGS Plástico mín', 'USGS Plástico 25%', 'USGS Plástico 50%', 'USGS Plástico 75%', 'USGS Plástico máx'],
            ['USGS Água mín', 'USGS Água 25%', 'USGS Água 50%', 'USGS Água 75%', 'USGS Água máx'],
            ['USGS Costa mín', 'USGS Costa 25%', 'USGS Costa 50%', 'USGS Costa 75%', 'USGS Costa máx'],
            ['USGS Madeira mín', 'USGS Madeira 25%', 'USGS Madeira 50%', 'USGS Madeira 75%', 'USGS Madeira máx']]

chart_title = "Estatísticas por classe - Quartis USGS"
x_title = "Banda"
y_title = "Reflectância"

export_name = str(input("Caminho do gráfico: "))
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_por_classe_usgs

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_por_classe_usgs


In [84]:
datasets_names = ["USGS 2019 Plastic (real)", "USGS 2019 Water (real)", "USGS 2019 Coast (real)"]

traces = [
             [
                [usgs_subdatasets['plastic'].query('Year == "2019"')[feature].describe()['min'] for feature in feature_names],
                [usgs_subdatasets['plastic'].query('Year == "2019"')[feature].describe()['25%'] for feature in feature_names],
                [usgs_subdatasets['plastic'].query('Year == "2019"')[feature].describe()['50%'] for feature in feature_names],
                [usgs_subdatasets['plastic'].query('Year == "2019"')[feature].describe()['75%'] for feature in feature_names],
                [usgs_subdatasets['plastic'].query('Year == "2019"')[feature].describe()['max'] for feature in feature_names]
             ],
             [
                [usgs_subdatasets['water'].query('Year == "2019"')[feature].describe()['min'] for feature in feature_names],
                [usgs_subdatasets['water'].query('Year == "2019"')[feature].describe()['25%'] for feature in feature_names],
                [usgs_subdatasets['water'].query('Year == "2019"')[feature].describe()['50%'] for feature in feature_names],
                [usgs_subdatasets['water'].query('Year == "2019"')[feature].describe()['75%'] for feature in feature_names],
                [usgs_subdatasets['water'].query('Year == "2019"')[feature].describe()['max'] for feature in feature_names]
             ],
             [
                [usgs_subdatasets['coast'].query('Year == "2019"')[feature].describe()['min'] for feature in feature_names],
                [usgs_subdatasets['coast'].query('Year == "2019"')[feature].describe()['25%'] for feature in feature_names],
                [usgs_subdatasets['coast'].query('Year == "2019"')[feature].describe()['50%'] for feature in feature_names],
                [usgs_subdatasets['coast'].query('Year == "2019"')[feature].describe()['75%'] for feature in feature_names],
                [usgs_subdatasets['coast'].query('Year == "2019"')[feature].describe()['max'] for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names]]

legends = [
            ['USGS Plastic min', 'USGS Plastic 25%', 'USGS Plastic 50%', 'USGS Plastic 75%', 'USGS Plastic max'],
            ['USGS Water min', 'USGS Water 25%', 'USGS Water 50%', 'USGS Water 75%', 'USGS Water max'],
            ['USGS Coast min', 'USGS Coast 25%', 'USGS Coast 50%', 'USGS Coast 75%', 'USGS Coast max']
          ]


modes = [['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines']]

colors = [['#ffadd6', '#ffadd6', '#FF69B4', '#ff148a', '#ff148a'], 
          ['#73baff', '#73baff', '#1E90FF', '#005db7', '#005db7'],
          ['#ffe766', '#ffe766', '#FFD700', '#ddba00', '#ddba00']]

chart_title = "Class statistics - USGS 2019 quartiles"
x_title = "Band"
y_title = "Reflectance"
height = 600
width = 1200
guidance = "horizontal"

export_name = str(input("Chart path: "))
#For example: charts/english/exploratory_analysis/descriptive_statistics/class_quartiles_usgs_2019

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Chart path: charts/english/exploratory_analysis/descriptive_statistics/class_quartiles_usgs_2019
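
Each trace above re-runs `query('Year == "2019"')` on the same subdataset. A sketch of filtering once per class and reusing the result, assuming (as the query strings suggest) that `Year` is stored as a string column. The helper `filter_year` and the toy tables are hypothetical.

```python
import pandas as pd

def filter_year(subdatasets, classes, year):
    """Return {class_name: rows of that class whose Year equals `year`}."""
    # Year is compared as a string, matching query('Year == "2019"')
    return {
        name: subdatasets[name][subdatasets[name]["Year"] == year]
        for name in classes
    }

# Toy tables standing in for usgs_subdatasets (values are illustrative only)
toy_subdatasets = {
    "plastic": pd.DataFrame({"Year": ["2019", "2021", "2019"], "Blue": [0.05, 0.07, 0.09]}),
    "water": pd.DataFrame({"Year": ["2021", "2021"], "Blue": [0.02, 0.03]}),
}
toy_2019 = filter_year(toy_subdatasets, ["plastic", "water"], "2019")

# A class with no rows for the requested year yields an empty frame,
# so its mean() and describe() statistics would be NaN.
empty_classes = [name for name, frame in toy_2019.items() if frame.empty]
```

Listing the empty classes before plotting makes it easy to see which traces would carry no data for a given year.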


In [85]:
datasets_names = ["USGS 2019 Plástico (real)", "USGS 2019 Água (real)", "USGS 2019 Costa (real)"]

legends = [
            ['USGS Plástico mín', 'USGS Plástico 25%', 'USGS Plástico 50%', 'USGS Plástico 75%', 'USGS Plástico máx'],
            ['USGS Água mín', 'USGS Água 25%', 'USGS Água 50%', 'USGS Água 75%', 'USGS Água máx'],
            ['USGS Costa mín', 'USGS Costa 25%', 'USGS Costa 50%', 'USGS Costa 75%', 'USGS Costa máx']]

chart_title = "Estatísticas por classe - Quartis USGS 2019"
x_title = "Banda"
y_title = "Reflectância"

export_name = str(input("Caminho do gráfico: "))
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_por_classe_usgs_2019

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_por_classe_usgs_2019


In [86]:
datasets_names = ["USGS 2021 Plastic (real)", "USGS 2021 Water (real)", "USGS 2021 Coast (real)", "USGS 2021 Wood (real)"]

traces = [
             [
                [usgs_subdatasets['plastic'].query('Year == "2021"')[feature].describe()['min'] for feature in feature_names],
                [usgs_subdatasets['plastic'].query('Year == "2021"')[feature].describe()['25%'] for feature in feature_names],
                [usgs_subdatasets['plastic'].query('Year == "2021"')[feature].describe()['50%'] for feature in feature_names],
                [usgs_subdatasets['plastic'].query('Year == "2021"')[feature].describe()['75%'] for feature in feature_names],
                [usgs_subdatasets['plastic'].query('Year == "2021"')[feature].describe()['max'] for feature in feature_names]
             ],
             [
                [usgs_subdatasets['water'].query('Year == "2021"')[feature].describe()['min'] for feature in feature_names],
                [usgs_subdatasets['water'].query('Year == "2021"')[feature].describe()['25%'] for feature in feature_names],
                [usgs_subdatasets['water'].query('Year == "2021"')[feature].describe()['50%'] for feature in feature_names],
                [usgs_subdatasets['water'].query('Year == "2021"')[feature].describe()['75%'] for feature in feature_names],
                [usgs_subdatasets['water'].query('Year == "2021"')[feature].describe()['max'] for feature in feature_names]
             ],
             [
                [usgs_subdatasets['coast'].query('Year == "2021"')[feature].describe()['min'] for feature in feature_names],
                [usgs_subdatasets['coast'].query('Year == "2021"')[feature].describe()['25%'] for feature in feature_names],
                [usgs_subdatasets['coast'].query('Year == "2021"')[feature].describe()['50%'] for feature in feature_names],
                [usgs_subdatasets['coast'].query('Year == "2021"')[feature].describe()['75%'] for feature in feature_names],
                [usgs_subdatasets['coast'].query('Year == "2021"')[feature].describe()['max'] for feature in feature_names]
             ],
             [
                [usgs_subdatasets['wood'].query('Year == "2021"')[feature].describe()['min'] for feature in feature_names],
                [usgs_subdatasets['wood'].query('Year == "2021"')[feature].describe()['25%'] for feature in feature_names],
                [usgs_subdatasets['wood'].query('Year == "2021"')[feature].describe()['50%'] for feature in feature_names],
                [usgs_subdatasets['wood'].query('Year == "2021"')[feature].describe()['75%'] for feature in feature_names],
                [usgs_subdatasets['wood'].query('Year == "2021"')[feature].describe()['max'] for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names]]

legends = [
            ['USGS Plastic min', 'USGS Plastic 25%', 'USGS Plastic 50%', 'USGS Plastic 75%', 'USGS Plastic max'],
            ['USGS Water min', 'USGS Water 25%', 'USGS Water 50%', 'USGS Water 75%', 'USGS Water max'],
            ['USGS Coast min', 'USGS Coast 25%', 'USGS Coast 50%', 'USGS Coast 75%', 'USGS Coast max'],
            ['USGS Wood min', 'USGS Wood 25%', 'USGS Wood 50%', 'USGS Wood 75%', 'USGS Wood max']]


modes = [['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines'],
         ['dot', 'dash', 'dash', 'dash', 'markers+lines']
        ]

colors = [['#ffadd6', '#ffadd6', '#FF69B4', '#ff148a', '#ff148a'], 
          ['#73baff', '#73baff', '#1E90FF', '#005db7', '#005db7'],
          ['#ffe766', '#ffe766', '#FFD700', '#ddba00', '#ddba00'],
          ['#a38fd3', '#a38fd3', '#7152bb', '#533990', '#533990']
         ]

chart_title = "Class statistics - USGS 2021 quartiles"
x_title = "Band"
y_title = "Reflectance"
height = 600
width = 1200
guidance = "horizontal"

export_name = str(input("Chart path: "))
#For example: charts/english/exploratory_analysis/descriptive_statistics/class_quartiles_usgs_2021

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Chart path: charts/english/exploratory_analysis/descriptive_statistics/class_quartiles_usgs_2021


In [87]:
datasets_names = ["USGS 2021 Plástico (real)", "USGS 2021 Água (real)", "USGS 2021 Costa (real)", "USGS 2021 Madeira (real)"]

legends = [
            ['USGS Plástico mín', 'USGS Plástico 25%', 'USGS Plástico 50%', 'USGS Plástico 75%', 'USGS Plástico máx'],
            ['USGS Água mín', 'USGS Água 25%', 'USGS Água 50%', 'USGS Água 75%', 'USGS Água máx'],
            ['USGS Costa mín', 'USGS Costa 25%', 'USGS Costa 50%', 'USGS Costa 75%', 'USGS Costa máx'],
            ['USGS Madeira mín', 'USGS Madeira 25%', 'USGS Madeira 50%', 'USGS Madeira 75%', 'USGS Madeira máx']]

chart_title = "Estatísticas por classe - Quartis USGS 2021"
x_title = "Banda"
y_title = "Reflectância"

export_name = str(input("Caminho do gráfico: "))
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_por_classe_usgs_2021

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/quartis_por_classe_usgs_2021


In [88]:
datasets_names = ["DART", "USGS", "USGS (minimum plastic: 20%)", "USGS (minimum plastic: 50%)"]

traces = [
             [
                 [dart_subdatasets['plastic'][feature].mean() for feature in feature_names],
                 [dart_subdatasets['water'][feature].mean() for feature in feature_names],
                 [dart_subdatasets['sand'][feature].mean() for feature in feature_names]
             ],
             [
                 [usgs_subdatasets['plastic'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['water'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['coast'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['wood'][feature].mean() for feature in feature_names]
             ],
             [
                 [usgs_subdatasets['plastic_min_20'][feature].mean() for feature in feature_names], 
                 [usgs_subdatasets['water'][feature].mean() for feature in feature_names]
             ],
             [
                 [usgs_subdatasets['plastic_min_50'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['water'][feature].mean() for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names], 
          [feature_names, feature_names]]

legends = [['Plastic (DART mean)', 'Water (DART mean)', 'Sand (DART mean)'],
           ['Plastic (USGS mean)', 'Water (USGS mean)', 'Coast (USGS mean)', 'Wood (USGS mean)'],
           ['Plastic - min 20% (USGS mean)', 'Water (USGS mean)'],
           ['Plastic - min 50% (USGS mean)', 'Water (USGS mean)']]

modes = [['lines', 'lines', 'lines'],
         ['dot', 'dot', 'dot', 'dot'],
         ['dash', 'dash'],
         ['markers+lines', 'markers+lines']]

colors = [['#FF69B4', '#1E90FF', '#FFD700'], 
          ['#FF69B4', '#1E90FF', '#FFD700', '#7152bb'],
          ['#FF69B4', '#1E90FF'],
          ['#FF69B4', '#1E90FF']]

chart_title = "Class statistics - Mean spectral signatures"
x_title = "Band"
y_title = "Reflectance"
height = 600
width = 1300
guidance = "horizontal"

export_name = str(input("Chart path: "))
#For example: charts/english/exploratory_analysis/descriptive_statistics/mean_spectral_signatures

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Chart path: charts/english/exploratory_analysis/descriptive_statistics/mean_spectral_signatures
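
`line_chart` receives several parallel nested lists (`traces`, `labels`, `legends`, `modes`, `colors`), with a different number of traces per group in this cell. A small consistency check, sketched under the assumption that each argument needs one inner list per entry of `datasets_names` and one item per trace. `check_chart_args` is a hypothetical helper, not a function of `rsdata_charts`.

```python
def check_chart_args(datasets_names, traces, **per_trace_args):
    """Return a list of human-readable shape mismatches (empty if consistent)."""
    problems = []
    if len(traces) != len(datasets_names):
        problems.append(f"traces has {len(traces)} groups, expected {len(datasets_names)}")
    for name, groups in per_trace_args.items():
        # First compare the number of groups, then the number of items per group
        if len(groups) != len(traces):
            problems.append(f"{name} has {len(groups)} groups, expected {len(traces)}")
            continue
        for i, (group, trace_group) in enumerate(zip(groups, traces)):
            if len(group) != len(trace_group):
                problems.append(f"{name}[{i}] has {len(group)} entries, expected {len(trace_group)}")
    return problems

# Toy arguments: two groups of two traces each, but three rows of modes
names = ["A", "B"]
toy_traces = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
toy_modes = [["dot", "dash"], ["dot", "dash"], ["dot", "dash"]]
toy_colors = [["#008000", "#008000"], ["#FF0000", "#FF0000"]]
issues = check_chart_args(names, toy_traces, modes=toy_modes, colors=toy_colors)
```

Calling the check right before `line_chart` would surface a leftover row or a missing legend as a message rather than as a silently mis-styled chart.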


In [89]:
datasets_names = ["DART", "USGS", "USGS (cobertura plástica mínima: 20%)", "USGS (cobertura plástica mínima: 50%)"]

legends = [['Plástico (média DART)', 'Água (média DART)', 'Areia (média DART)'],
           ['Plástico (média USGS)', 'Água (média USGS)', 'Costa (média USGS)', 'Madeira (média USGS)'],
           ['Plástico - min 20% (média USGS)', 'Água (média USGS)'],
           ['Plástico - min 50% (média USGS)', 'Água (média USGS)']]

chart_title = "Estatísticas por classe - Assinaturas espectrais médias"
x_title = "Banda"
y_title = "Reflectância"
width = 1550

export_name = str(input("Caminho do gráfico: "))
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/assinaturas_espectrais_medias

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/assinaturas_espectrais_medias


In [90]:
datasets_names = ["DART", "USGS 2019", "USGS 2021", "USGS (minimum plastic: 50%)"]

traces = [
             [
                 [dart_subdatasets['plastic'][feature].mean() for feature in feature_names],
                 [dart_subdatasets['water'][feature].mean() for feature in feature_names],
                 [dart_subdatasets['sand'][feature].mean() for feature in feature_names]
             ],
             [
                 [usgs_subdatasets['plastic'].query('Year == "2019"')[feature].mean() for feature in feature_names],
                 [usgs_subdatasets['water'].query('Year == "2019"')[feature].mean() for feature in feature_names],
                 [usgs_subdatasets['coast'].query('Year == "2019"')[feature].mean() for feature in feature_names],
                 [usgs_subdatasets['wood'].query('Year == "2019"')[feature].mean() for feature in feature_names]
             ],
             [
                 [usgs_subdatasets['plastic'].query('Year == "2021"')[feature].mean() for feature in feature_names],
                 [usgs_subdatasets['water'].query('Year == "2021"')[feature].mean() for feature in feature_names],
                 [usgs_subdatasets['coast'].query('Year == "2021"')[feature].mean() for feature in feature_names],
                 [usgs_subdatasets['wood'].query('Year == "2021"')[feature].mean() for feature in feature_names]
             ],
             [
                 [usgs_subdatasets['plastic_min_50'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['water'][feature].mean() for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names]]

legends = [['Plastic (DART mean)', 'Water (DART mean)', 'Sand (DART mean)'],
           ['Plastic (USGS 2019 mean)', 'Water (USGS 2019 mean)', 'Coast (USGS 2019 mean)', 'Wood (USGS 2019 mean)'],
           ['Plastic (USGS 2021 mean)', 'Water (USGS 2021 mean)', 'Coast (USGS 2021 mean)', 'Wood (USGS 2021 mean)'],
           ['Plastic - min 50% (USGS mean)', 'Water (USGS mean)']]

modes = [['lines', 'lines', 'lines'],
         ['dot', 'dot', 'dot', 'dot'],
         ['dash', 'dash', 'dash', 'dash'],
         ['markers+lines', 'markers+lines']]

colors = [['#FF69B4', '#1E90FF', '#FFD700'], 
          ['#FF69B4', '#1E90FF', '#FFD700', '#7152bb'],
          ['#FF69B4', '#1E90FF', '#FFD700', '#7152bb'],
          ['#FF69B4', '#1E90FF']]

chart_title = "Class statistics - Mean spectral signatures"
x_title = "Band"
y_title = "Reflectance"
height = 600
width = 1300
guidance = "horizontal"

export_name = str(input("Chart path: "))
#For example: charts/english/exploratory_analysis/descriptive_statistics/mean_spectral_signatures_per_year

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Chart path: charts/english/exploratory_analysis/descriptive_statistics/mean_spectral_signatures_per_year


In [91]:
datasets_names = ["DART", "USGS 2019", "USGS 2021", "USGS (cobertura plástica mínima: 50%)"]

legends = [['Plástico (média DART)', 'Água (média DART)', 'Areia (média DART)'],
           ['Plástico (média USGS 2019)', 'Água (média USGS 2019)', 'Costa (média USGS 2019)', 'Madeira (média USGS 2019)'],
           ['Plástico (média USGS 2021)', 'Água (média USGS 2021)', 'Costa (média USGS 2021)', 'Madeira (média USGS 2021)'],
           ['Plástico - min 50% (média USGS)', 'Água (média USGS)']]

chart_title = "Estatísticas por classe - Assinaturas espectrais médias"
x_title = "Banda"
y_title = "Reflectância"
width = 1550

export_name = str(input("Caminho do gráfico: "))
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/assinaturas_espectrais_medias_por_ano

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

Caminho do gráfico: charts/portugues/analise_exploratoria/estatisticas_descritivas/assinaturas_espectrais_medias_por_ano


In [92]:
datasets_names = ["DART features correlation", "USGS features correlation"]

datasets = [round(dataset_dart[feature_names + radiometric_indexes].corr(),3), round(dataset_usgs[feature_names + radiometric_indexes].corr(),3)]

x_labels = [feature_names + radiometric_indexes, feature_names + radiometric_indexes]

y_labels = [feature_names + radiometric_indexes, feature_names + radiometric_indexes]

colorscale = "RdBu"

chart_title = "Features correlation"

height = 1600

width = 1000

guidance = "vertical"

export_name = str(input("Chart path: "))
#For example: charts/english/exploratory_analysis/correlation/features_correlation

rsdata_charts.heatmap_chart(datasets_names, datasets, x_labels, y_labels, colorscale, chart_title, height, width,  guidance=guidance, export_name=export_name)

Chart path: charts/english/exploratory_analysis/correlation/features_correlation
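
pandas' `DataFrame.corr()` defaults to the Pearson coefficient, so each heatmap cell above is a pairwise Pearson correlation rounded to three decimals. As a minimal, dependency-free sketch of that computation, the example below uses two hypothetical helpers (`pearson` and `correlation_matrix`) and made-up reflectance values. It illustrates the statistic only and is not the code inside `rsdata_charts`.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # A constant column has zero variance and would divide by zero here.
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns, decimals=3):
    """Nested dict of rounded pairwise correlations, keyed by column name."""
    names = list(columns)
    return {
        a: {b: round(pearson(columns[a], columns[b]), decimals) for b in names}
        for a in names
    }


# Made-up values, for illustration only (not taken from the DART or USGS data).
toy_columns = {
    "Red": [0.10, 0.12, 0.15, 0.20],
    "NIR1": [0.30, 0.33, 0.41, 0.52],
    "NDWI": [0.50, 0.40, 0.20, 0.10],
}

matrix = correlation_matrix(toy_columns)
for name, row in matrix.items():
    print(name, row)
```

The matrix is symmetric with ones on the diagonal, which is why the heatmaps mirror across the main diagonal.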


In [93]:
datasets_names = ["Dados DART", "Dados USGS"]

chart_title = "Correlação"

guidance = "vertical"

export_name = str(input("Caminho do gráfico: "))
#For example: charts/portugues/analise_exploratoria/correlacao/correlacao

rsdata_charts.heatmap_chart(datasets_names, datasets, x_labels, y_labels, colorscale, chart_title, height, width,  guidance=guidance, export_name=export_name)

Caminho do gráfico: charts/portugues/analise_exploratoria/correlacao/correlacao


In [95]:
rsdata_charts.heatmap_chart(["DART plastic features correlation"], 
                                round(dart_subdatasets['plastic'][feature_names + radiometric_indexes].corr(),3), 
                                feature_names + radiometric_indexes, 
                                feature_names + radiometric_indexes, 
                                "RdBu", 
                                " ", 
                                900, 
                                900, 
                                "vertical", 
                                export_name = str(input("Dart plastic correlation chart path: "))
                                #For example: charts/english/exploratory_analysis/correlation/dart_plastic
                                )

rsdata_charts.heatmap_chart(["DART water features correlation"], 
                                round(dart_subdatasets['water'][feature_names + radiometric_indexes].corr(),3),
                                feature_names + radiometric_indexes, 
                                feature_names + radiometric_indexes, 
                                "RdBu", 
                                " ", 
                                900, 
                                900, 
                                "vertical",
                                export_name = str(input("Dart water correlation chart path: "))
                                #For example: charts/english/exploratory_analysis/correlation/dart_water
                                )

#rsdata_charts.heatmap_chart(["DART sand features correlation"], 
#                                round(dart_subdatasets['sand'][feature_names + radiometric_indexes].corr(),3), 
#                                feature_names + radiometric_indexes, 
#                                feature_names + radiometric_indexes,
#                                "RdBu", 
#                                " ", 
#                                900, 
#                                900, 
#                                "vertical", 
#                                export_name = str(input("Dart sand correlation chart path: "))
#                                #For example: charts/english/exploratory_analysis/correlation/dart_sand
#                                )

Dart plastic correlation chart path: charts/english/exploratory_analysis/correlation/dart_plastic
Dart water correlation chart path: charts/english/exploratory_analysis/correlation/dart_water


In [96]:
rsdata_charts.heatmap_chart(["DART - Correlação no plástico"], 
                                round(dart_subdatasets['plastic'][feature_names + radiometric_indexes].corr(),3), 
                                feature_names + radiometric_indexes, 
                                feature_names + radiometric_indexes, 
                                "RdBu", 
                                " ", 
                                900, 
                                900, 
                                "vertical", 
                                export_name = str(input("Local para o gráfico de correlação no plástico simulado: "))
                                #For example: charts/portugues/analise_exploratoria/correlacao/dart_plastico
                                )

rsdata_charts.heatmap_chart(["DART - Correlação na água"], 
                                round(dart_subdatasets['water'][feature_names + radiometric_indexes].corr(),3),
                                feature_names + radiometric_indexes, 
                                feature_names + radiometric_indexes, 
                                "RdBu", 
                                " ", 
                                900, 
                                900, 
                                "vertical",
                                export_name = str(input("Local para o gráfico de correlação da água simulada: "))
                                #For example: charts/portugues/analise_exploratoria/correlacao/dart_agua
                                )

#rsdata_charts.heatmap_chart(["DART - Correlação na areia"], 
#                                round(dart_subdatasets['sand'][feature_names + radiometric_indexes].corr(),3), 
#                                feature_names + radiometric_indexes, 
#                                feature_names + radiometric_indexes, 
#                                "RdBu", 
#                                " ", 
#                                900, 
#                                900, 
#                                "vertical", 
#                                export_name = str(input("Local para o gráfico de correlação da areia simulada: "))
#                                #For example: charts/portugues/analise_exploratoria/correlacao/dart_areia
#                                )

Local para o gráfico de correlação no plástico simulado: charts/portugues/analise_exploratoria/correlacao/dart_plastico
Local para o gráfico de correlação da água simulada: charts/portugues/analise_exploratoria/correlacao/dart_agua
Local para o gráfico de correlação da areia simulada: charts/portugues/analise_exploratoria/correlacao/dart_areia


In [97]:
datasets_names = [["USGS plastic features correlation"], ["USGS water features correlation"], 
                  ["USGS coast features correlation"], ["USGS wood features correlation"]]

datasets = [round(usgs_subdatasets['plastic'][feature_names + radiometric_indexes].corr(),3),
            round(usgs_subdatasets['water'][feature_names + radiometric_indexes].corr(),3),
            round(usgs_subdatasets['coast'][feature_names + radiometric_indexes].corr(),3),
            round(usgs_subdatasets['wood'][feature_names + radiometric_indexes].corr(),3)]

export_names = [str(input("USGS plastic correlation chart path: ")), 
               str(input("USGS water correlation chart path: ")), 
               str(input("USGS coast correlation chart path: ")), 
               str(input("USGS wood correlation chart path: "))] 
                                
            #For example: charts/english/exploratory_analysis/correlation/usgs_plastic,
            #charts/english/exploratory_analysis/correlation/usgs_water,
            #charts/english/exploratory_analysis/correlation/usgs_coast,
            #charts/english/exploratory_analysis/correlation/usgs_wood

for i in range(len(datasets_names)):
    rsdata_charts.heatmap_chart(datasets_names[i], 
                                datasets[i], 
                                feature_names + radiometric_indexes, 
                                feature_names + radiometric_indexes, 
                                "RdBu", 
                                " ", 
                                900, 
                                900, 
                                "vertical", 
                                export_names[i])

USGS plastic correlation chart path: charts/english/exploratory_analysis/correlation/usgs_plastic
USGS water correlation chart path: charts/english/exploratory_analysis/correlation/usgs_water
USGS coast correlation chart path: charts/english/exploratory_analysis/correlation/usgs_coast
USGS wood correlation chart path: charts/english/exploratory_analysis/correlation/usgs_wood


In [98]:
datasets_names = [["USGS - Correlação no plástico"], ["USGS - Correlação na água"], 
                  ["USGS - Correlação na costa"], ["USGS - Correlação na madeira"]]

export_names = [str(input("Local do gráfico de correlação de plástico observado: ")), 
               str(input("Local do gráfico de correlação de água observada: ")), 
               str(input("Local do gráfico de correlação de costa observada: ")), 
               str(input("Local do gráfico de correlação de madeira observada: "))] 
                                
            #For example: charts/portugues/analise_exploratoria/correlacao/usgs_plastico,
            #charts/portugues/analise_exploratoria/correlacao/usgs_agua,
            #charts/portugues/analise_exploratoria/correlacao/usgs_costa,
            #charts/portugues/analise_exploratoria/correlacao/usgs_madeira

for i in range(len(datasets_names)):
    rsdata_charts.heatmap_chart(datasets_names[i], 
                                datasets[i], 
                                feature_names + radiometric_indexes, 
                                feature_names + radiometric_indexes, 
                                "RdBu", 
                                " ", 
                                900, 
                                900, 
                                "vertical", 
                                export_names[i])

Local do gráfico de correlação de plástico observado: charts/portugues/analise_exploratoria/correlacao/usgs_plastico
Local do gráfico de correlação de água observada: charts/portugues/analise_exploratoria/correlacao/usgs_agua
Local do gráfico de correlação de costa observada: charts/portugues/analise_exploratoria/correlacao/usgs_costa
Local do gráfico de correlação de madeira observada: charts/portugues/analise_exploratoria/correlacao/usgs_madeira


In [99]:
datasets_names = ["DART", "USGS"]

traces = [
            [dart_subdatasets['plastic'], dart_subdatasets['water'], dart_subdatasets['sand']],
            [usgs_subdatasets['plastic'], usgs_subdatasets['water'], usgs_subdatasets['coast'], usgs_subdatasets['wood']]
         ]
labels = [
            ["Plastic", "Water", "Sand"],
            ["Plastic", "Water", "Coast", "Wood"]
         ]

labels_group = [
            "DART classes", 
            "USGS classes"
        ]

n_bins = 8

colors = [
            ['#ffadd6', '#73baff', '#ffe766'], 
            ['#ffadd6', '#73baff', '#ffe766', '#a38fd3'], 
         ]

y_title = "Reflectance"
height = 450
width = 700

path = str(input("Charts path: "))
#For example: charts/english/exploratory_analysis/histograms/

for feature in feature_names + radiometric_indexes:
    bands = [feature, feature]
    
    chart_title = "Percent frequency distribution of "+feature
    x_title = feature    
    export_name = path+feature
    

    rsdata_charts.stackedbars_chart(datasets_names, traces, bands, n_bins, labels, labels_group, colors, chart_title, x_title, y_title, height, width, export_name)  

Charts path: charts/english/exploratory_analysis/histograms/
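
`stackedbars_chart` is called with `n_bins = 8`, but its binning logic is not shown in this notebook. Assuming equal-width bins over a range shared by all classes, the percentage of pixels per bin could be computed roughly as follows. `percent_histogram` and the sample values are hypothetical.

```python
def percent_histogram(values, n_bins, low, high):
    """Percentage of values in each of n_bins equal-width bins over [low, high].

    Assumes every value lies inside [low, high]; the top edge goes in the last bin.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for value in values:
        # Clamp so that value == high lands in the last bin instead of overflowing.
        index = min(int((value - low) / width), n_bins - 1)
        counts[index] += 1
    return [100 * count / len(values) for count in counts]


# Made-up reflectance values for a single class, for illustration only.
toy_plastic = [0.05, 0.15, 0.15, 0.75]
shares = percent_histogram(toy_plastic, n_bins=8, low=0.0, high=0.8)
print(shares)  # [25.0, 50.0, 0.0, 0.0, 0.0, 0.0, 0.0, 25.0]
```

Using the same `low` and `high` for every class keeps the bars comparable across classes within one feature.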


In [100]:
labels = [
            ["Plástico", "Água", "Areia"],
            ["Plástico", "Água", "Costa", "Madeira"]
         ]

labels_group = [
            "Classes DART", 
            "Classes USGS"
        ]

y_title = "Reflectância"

path = str(input("Local dos gráficos: "))
#For example: charts/portugues/analise_exploratoria/histogramas/

for feature in feature_names + radiometric_indexes:
    bands = [feature, feature]
    
    chart_title = "Distribuição percentual de frequências no atributo "+feature
    x_title = feature    
    export_name = path+feature

    rsdata_charts.stackedbars_chart(datasets_names, traces, bands, n_bins, labels, labels_group, colors, chart_title, x_title, y_title, height, width, export_name)  

Local dos gráficos: charts/portugues/analise_exploratoria/histogramas/


In [124]:
datasets_names = ["DART classes", "USGS classes"]

labels = [
            ["Sand", "Water", "Plastic"],    
            ["Coast", "Water", "Plastic", "Wood"]
         ]

colors = [
            ['#ffe766', '#73baff', '#ffadd6'], 
            ['#ffe766', '#73baff', '#ffadd6', '#a38fd3'], 
         ]

y_title = "Values"
height = 600
width = 950
guidance="horizontal"

path = str(input("Dart boxplots charts path: ")) 
#For example: charts/english/exploratory_analysis/boxplots/classes_
                                

for feature in ["NDVI", "FDI"]:
    traces = [
                [
                    dart_subdatasets['sand'][feature],
                    dart_subdatasets['water'][feature],
                    dart_subdatasets['plastic'][feature]
                ],
                [
                    usgs_subdatasets['coast'][feature],
                    usgs_subdatasets['water'][feature],
                    usgs_subdatasets['plastic'][feature],
                    usgs_subdatasets['wood'][feature]
                ]
        ]

    x_title = feature
    
    chart_title = feature+" index scattering"

    export_name = path+feature

    rsdata_charts.boxplot_chart(datasets_names, traces, labels, colors, chart_title, x_title, y_title, height, width, guidance, export_name=export_name)

Dart boxplots charts path: charts/english/exploratory_analysis/boxplots/classes_


In [125]:
datasets_names = ["Classes DART", "Classes USGS"]

labels = [
            ["Areia", "Água", "Plástico"],    
            ["Costa", "Água", "Plástico", "Madeira"]
         ]

y_title = "Valores"
height = 600
width = 950
guidance="horizontal"

path = str(input("Caminhos para gráficos boxplot dos dados simulados (DART): ")) 
#For example: charts/portugues/analise_exploratoria/boxplots/classes_

for feature in ["NDVI", "FDI"]:
    traces = [
                [
                    dart_subdatasets['sand'][feature],
                    dart_subdatasets['water'][feature],
                    dart_subdatasets['plastic'][feature]
                ],
                [
                    usgs_subdatasets['coast'][feature],
                    usgs_subdatasets['water'][feature],
                    usgs_subdatasets['plastic'][feature],
                    usgs_subdatasets['wood'][feature]
                ]
        ]

    x_title = feature
    
    chart_title = "Espalhamento no índice "+feature

    export_name = path+feature

    rsdata_charts.boxplot_chart(datasets_names, traces, labels, colors, chart_title, x_title, y_title, height, width, guidance, export_name=export_name)

Caminhos para gráficos boxplot dos dados simulados (DART): charts/portugues/analise_exploratoria/boxplots/classes_


In [132]:
datasets_names = ["DART Plastic in Foam", "DART Plastic in Water", "USGS Plastic in Water"]

labels = [
            ["20% plastic", "40% plastic", "60% plastic", "80% plastic", "100% plastic"],    
            ["Water", "20% plastic", "40% plastic", "60% plastic", "80% plastic", "100% plastic"],
            ["Water", "20% plastic", "40% plastic", "60% plastic", "80% plastic", "100% plastic", "Unknown percent"]
         ]

colors = [
            ['#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6'],#'#ffe766',             
            ['#73baff', '#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6'],
            ['#73baff', '#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6']            
         ]


y_title = "Reflectance"
height = 600
width = 950
guidance="horizontal"

path = str(input("DART plastic boxplots charts path: ")) 
#For example: charts/english/exploratory_analysis/boxplots/

for feature in feature_names:
    traces = [
                [
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 20")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 40")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 60")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 80")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 100")[feature]
                ],
                [
                    dart_subdatasets['water'][feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 20")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 40")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 60")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 80")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 100")[feature]
                ],
                [
                    usgs_subdatasets['water'][feature],
                    usgs_subdatasets['plastic_20'][feature],
                    usgs_subdatasets['plastic_40'][feature],
                    usgs_subdatasets['plastic_60'][feature],
                    usgs_subdatasets['plastic_80'][feature],
                    usgs_subdatasets['plastic_100'][feature],
                    usgs_subdatasets['plastic_unknownpercent'][feature]
                ]
        ]

    x_title = feature
    
    chart_title = feature+" band scattering"

    export_name = path+feature

    rsdata_charts.boxplot_chart(datasets_names, traces, labels, colors, chart_title, x_title, y_title, height, width, guidance, export_name=export_name)

DART plastic boxplots charts path: charts/english/exploratory_analysis/boxplots/


In [133]:
y_title = "Values"

for feature in radiometric_indexes:
    traces = [
                [
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 20")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 40")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 60")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 80")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 100")[feature]
                ],
                [
                    dart_subdatasets['water'][feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 20")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 40")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 60")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 80")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 100")[feature]
                ],
                [
                    usgs_subdatasets['water'][feature],
                    usgs_subdatasets['plastic_20'][feature],
                    usgs_subdatasets['plastic_40'][feature],
                    usgs_subdatasets['plastic_60'][feature],
                    usgs_subdatasets['plastic_80'][feature],
                    usgs_subdatasets['plastic_100'][feature],
                    usgs_subdatasets['plastic_unknownpercent'][feature]
                ]
        ]

    x_title = feature
    
    chart_title = feature+" index scattering"

    export_name = path+feature

    rsdata_charts.boxplot_chart(datasets_names, traces, labels, colors, chart_title, x_title, y_title, height, width, guidance, export_name=export_name)

In [135]:
datasets_names = ["DART - Plástico na espuma", "DART - Plástico na água", "USGS - Plástico na água"]

labels = [
            ["20% plástico", "40% plástico", "60% plástico", "80% plástico", "100% plástico"],    
            ["Água", "20% plástico", "40% plástico", "60% plástico", "80% plástico", "100% plástico"],
            ["Água", "20% plástico", "40% plástico", "60% plástico", "80% plástico", "100% plástico", "Percentual desconhecido"]
         ]

colors = [
            ['#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6'],#'#ffe766',             
            ['#73baff', '#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6'],
            ['#73baff', '#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6', '#ffadd6']            
         ]


y_title = "Reflectância"
height = 600
width = 950
guidance="horizontal"

path = str(input("Caminhos para gráficos boxplot do plástico simulado (DART):")) 
#For example: charts/portugues/analise_exploratoria/boxplots/

for feature in feature_names:
    traces = [
                [
                    #dart_subdatasets['sand'][feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 20")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 40")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 60")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 80")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 100")[feature]
                ],
                [
                    dart_subdatasets['water'][feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 20")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 40")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 60")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 80")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 100")[feature]
                ],
                [
                    usgs_subdatasets['water'][feature],
                    usgs_subdatasets['plastic_20'][feature],
                    usgs_subdatasets['plastic_40'][feature],
                    usgs_subdatasets['plastic_60'][feature],
                    usgs_subdatasets['plastic_80'][feature],
                    usgs_subdatasets['plastic_100'][feature],
                    usgs_subdatasets['plastic_unknownpercent'][feature]
                ]
        ]

    x_title = feature
    
    chart_title = "Espalhamento na banda "+feature

    export_name = path+feature

    rsdata_charts.boxplot_chart(datasets_names, traces, labels, colors, chart_title, x_title, y_title, height, width, guidance, export_name=export_name)

Caminhos para gráficos boxplot do plástico simulado (DART):charts/portugues/analise_exploratoria/boxplots/


In [136]:
y_title = "Valores"

for feature in radiometric_indexes:
    traces = [
                [
                    #dart_subdatasets['sand'][feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 20")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 40")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 60")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 80")[feature],
                    dart_subdatasets['plastic_in_foam'].query("Cover_percent == 100")[feature]
                ],
                [
                    dart_subdatasets['water'][feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 20")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 40")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 60")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 80")[feature],
                    dart_subdatasets['plastic_in_water'].query("Cover_percent == 100")[feature]
                ],
                [
                    usgs_subdatasets['water'][feature],
                    usgs_subdatasets['plastic_20'][feature],
                    usgs_subdatasets['plastic_40'][feature],
                    usgs_subdatasets['plastic_60'][feature],
                    usgs_subdatasets['plastic_80'][feature],
                    usgs_subdatasets['plastic_100'][feature],
                    usgs_subdatasets['plastic_unknownpercent'][feature]
                ]
        ]

    x_title = feature
    
    chart_title = "Espalhamento no índice "+feature

    export_name = path+feature

    rsdata_charts.boxplot_chart(datasets_names, traces, labels, colors, chart_title, x_title, y_title, height, width, guidance, export_name=export_name)

In [None]:
datasets_names = ["DART", "USGS"]

traces = [
            [dart_subdatasets['plastic'], dart_subdatasets['water'], dart_subdatasets['sand']],
            [usgs_subdatasets['plastic'], usgs_subdatasets['water'], usgs_subdatasets['coast'], usgs_subdatasets['wood']]
         ]
    
labels = [
            ["Plastic", "Water", "Sand"],
            ["Plastic", "Water", "Coast", "Wood"]
         ]

labels_group = [
                "DART classes", 
                "USGS classes"
                ]


colors = [
            ['#ffadd6', '#73baff', '#ffe766'], 
            ['#ffadd6', '#73baff', '#ffe766', '#a38fd3'], 
         ]


chart_title = "Classes scattering"
height = 400
width = 900

x_title = "Pixel Ids"
path = str(input("DART scatter charts path: ")) 
#For example: charts/english/exploratory_analysis/scatter/

for feature in feature_names + radiometric_indexes:
    y = feature
    y_title = feature
    
    export_name = path+feature+"/"+y

    rsdata_charts.scatter_chart_und(datasets_names, traces, y, labels, labels_group, colors, chart_title, x_title, y_title, height, width, legend_orientation="v", guidance="horizontal", export_name=export_name) 

DART scatter charts path: charts/english/exploratory_analysis/scatter/
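
The scatter cells build `export_name` as `path+feature+"/"+...`, so each feature gets its own subdirectory. Whether `rsdata_charts` creates missing directories is not shown here. If it does not, a small pre-step like the sketch below could create them first. `ensure_export_dirs` is a hypothetical helper, and a temporary directory stands in for the real charts path.

```python
import os
import tempfile


def ensure_export_dirs(path, feature_list):
    """Create one subdirectory per feature under path; return the directory paths."""
    created = []
    for feature in feature_list:
        directory = os.path.join(path, feature)
        # exist_ok=True makes the call safe to repeat when a cell is re-run.
        os.makedirs(directory, exist_ok=True)
        created.append(directory)
    return created


# A temporary directory stands in for e.g. charts/english/exploratory_analysis/scatter/
with tempfile.TemporaryDirectory() as tmp:
    dirs = ensure_export_dirs(tmp, ["Blue", "NDVI"])
    all_exist = all(os.path.isdir(d) for d in dirs)

print(all_exist)  # True
```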


In [None]:
labels = [
            ["Plástico", "Água", "Areia"],
            ["Plástico", "Água", "Costa", "Madeira"]
         ]

labels_group = [
                "Classes DART", 
                "Classes USGS"
                ]

chart_title = "Dispersão das classes"
x_title = "Ids dos pixels"
height = 400
width = 900

path = str(input("Caminhos para gráficos de dispersão dos dados simulados (DART): ")) 
#For example: charts/portugues/analise_exploratoria/dispersao/

for feature in feature_names + radiometric_indexes:
    y = feature
    y_title = feature
    
    export_name = path+feature+"/"+y

    rsdata_charts.scatter_chart_und(datasets_names, traces, y, labels, labels_group, colors, chart_title, x_title, y_title, height, width, legend_orientation="v", guidance="horizontal", export_name=export_name) 

In [None]:
datasets_names = ["DART", "USGS"]

traces = [
            [dart_subdatasets['plastic'], dart_subdatasets['water']],
            [usgs_subdatasets['plastic'], usgs_subdatasets['water'], usgs_subdatasets['coast'], usgs_subdatasets['wood']]
         ]
    
labels = [
            ["Plastic", "Water"],
            ["Plastic", "Water", "Coast", "Wood"]
         ]

labels_group = [
                "DART classes", 
                "USGS classes"
                ]


colors = [
            ['#ffadd6', '#73baff'], #'#ffe766' 
            ['#ffadd6', '#73baff', '#ffe766', '#a38fd3'], 
         ]


chart_title = "Classes scattering"
height = 400
width = 900

path = str(input("DART scatter charts path: ")) 
#For example: charts/english/exploratory_analysis/scatter/

for feature_a in feature_names + radiometric_indexes:
    x = feature_a
    x_title = feature_a
    for feature_b in feature_names + radiometric_indexes:
        y = feature_b
        y_title = feature_b
        export_name = path+feature_a+"/"+x+"_x_"+y

        rsdata_charts.scatter_chart(datasets_names, traces, x, y, labels, labels_group, colors, chart_title, x_title, y_title, height, width, legend_orientation="v", guidance="horizontal", export_name=export_name) 

In [None]:
labels = [
            ["Plástico", "Água"],
            ["Plástico", "Água", "Costa", "Madeira"]
         ]

labels_group = [
                "Classes DART", 
                "Classes USGS"
                ]

chart_title = "Dispersão das classes"
height = 400
width = 900

path = str(input("Caminhos para gráficos de espalhamento dos dados simulados (DART):")) 
#For example: charts/portugues/analise_exploratoria/dispersao/

for feature_a in feature_names + radiometric_indexes:
    x = feature_a
    x_title = feature_a
    for feature_b in feature_names + radiometric_indexes:
        y = feature_b
        y_title = feature_b
        export_name = path+feature_a+"/"+x+"_x_"+y

        rsdata_charts.scatter_chart(datasets_names, traces, x, y, labels, labels_group, colors, chart_title, x_title, y_title, height, width, legend_orientation="v", guidance="horizontal", export_name=export_name) 

In [None]:
datasets_names = ["DART", "DART (100% cover percent)"]

traces = [
             [
                 [dart_subdatasets['plastic_ldpe'][feature].mean() for feature in feature_names],
                 [dart_subdatasets['plastic_micronapo'][feature].mean() for feature in feature_names],
                 [dart_subdatasets['plastic_nylon'][feature].mean() for feature in feature_names],
                 [dart_subdatasets['plastic_pet'][feature].mean() for feature in feature_names],
                 [dart_subdatasets['plastic_pp'][feature].mean() for feature in feature_names],
                 [dart_subdatasets['plastic_pvc'][feature].mean() for feature in feature_names],
                 [dart_subdatasets['water'][feature].mean() for feature in feature_names]
                 #[dart_subdatasets['sand'][feature].mean() for feature in feature_names]
             ],
             [
                 [dart_subdatasets['plastic_ldpe'].query('Cover_percent == 100')[feature].mean() for feature in feature_names],
                 [dart_subdatasets['plastic_micronapo'].query('Cover_percent == 100')[feature].mean() for feature in feature_names],
                 [dart_subdatasets['plastic_nylon'].query('Cover_percent == 100')[feature].mean() for feature in feature_names],
                 [dart_subdatasets['plastic_pet'].query('Cover_percent == 100')[feature].mean() for feature in feature_names],
                 [dart_subdatasets['plastic_pp'].query('Cover_percent == 100')[feature].mean() for feature in feature_names],
                 [dart_subdatasets['plastic_pvc'].query('Cover_percent == 100')[feature].mean() for feature in feature_names],
                 [dart_subdatasets['water'][feature].mean() for feature in feature_names]
                 #[dart_subdatasets['sand'][feature].mean() for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names, feature_names, feature_names, feature_names, feature_names, feature_names], 
          [feature_names, feature_names, feature_names, feature_names, feature_names, feature_names, feature_names, feature_names]]

legends = [['LDPE (DART)', 'MicroNapo (DART)', 'Nylon (DART)', 'PET (DART)', 'PP (DART)', 'PVC (DART)', 'Water (DART)', 'Sand (DART)'],
           ['LDPE (DART 100%)', 'MicroNapo (DART 100%)', 'Nylon (DART 100%)', 'PET (DART 100%)', 'PP (DART 100%)', 'PVC (DART 100%)', 'Water (DART)', 'Sand (DART)']] # Note: bags and bottles had few mixed pixels at 100% cover

modes = [['dot', 'dot', 'dot', 'dot', 'dot', 'dot', 'dot', 'dot'],
         ['lines', 'lines', 'lines', 'lines', 'lines', 'lines', 'lines', 'lines']]

colors = [['#ADF224','#1AB1B1', '#6342d1', '#D81F88', '#FF8825', '#F1C800', '#1E90FF', '#000'], #Plastic, Water, Sand
          ['#ADF224','#1AB1B1', '#6342d1', '#D81F88', '#FF8825', '#F1C800', '#1E90FF', '#000']]

chart_title = "DART elements statistics - Mean spectral signatures"
x_title = "Band"
y_title = "Reflectance"
height = 500
width = 800
guidance = "horizontal"

export_name = str(input("Path for chart of mean signatures of DART elements: ")) 
#For example: charts/english/exploratory_analysis/descriptive_statistics/elements_mean_signatures_dart

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

In [None]:
datasets_names = ["DART", "DART (100% de cobertura)"]

legends = [['LDPE (DART)', 'MicroNapo (DART)', 'Nylon (DART)', 'PET (DART)', 'PP (DART)', 'PVC (DART)', 'Água (DART)', 'Areia (DART)'],
           ['LDPE (DART 100%)', 'MicroNapo (DART 100%)', 'Nylon (DART 100%)', 'PET (DART 100%)', 'PP (DART 100%)', 'PVC (DART 100%)', 'Água (DART)', 'Areia (DART)']]

chart_title = "Estatísticas por elemento no DART - Assinaturas espectrais médias"
x_title = "Banda"
y_title = "Reflectância"
width = 850

export_name = str(input("Caminhos para gráficos de assinatura média por elemento dos dados simulados (DART): ")) 
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/elementos_assinaturas_medias_dart

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

In [None]:
datasets_names = ["USGS", "USGS (100% cover percent)"]

traces = [
             [
                 [usgs_subdatasets['plastic_bags'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['plastic_bottles'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['plastic_mix'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['plastic_mesh'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['water'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['coast'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['wood'][feature].mean() for feature in feature_names]
             ],
             [
                 [usgs_subdatasets['plastic_mesh'].query('Cover_percent == -100')[feature].mean() for feature in feature_names],
                 [usgs_subdatasets['water'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['coast'][feature].mean() for feature in feature_names],
                 [usgs_subdatasets['wood_-100'][feature].mean() for feature in feature_names]
             ]
          ]

labels = [[feature_names, feature_names, feature_names, feature_names, feature_names, feature_names, feature_names],
          [feature_names, feature_names, feature_names, feature_names]]

legends = [['Bags (USGS)', 'Bottles (USGS)', 'Bags and Bottles (USGS)', 'HDPE mesh (USGS)', 'Water (USGS)', 'Coast (USGS)', 'Wood (USGS)'],
           ['HDPE mesh (USGS 100%)', 'Water (USGS)', 'Coast (USGS)', 'Wood (USGS 100%)']] # Bags and bottles had few mixed pixels at 100% cover, so only HDPE mesh and wood get a 100% trace

modes = [['dot', 'dot', 'dot', 'dot', 'dot', 'dot', 'dot'],
         ['lines', 'lines', 'lines', 'lines']]

colors = [['#49C658','#8945AB', '#FF675F', '#FCFE5E', '#1E90FF', '#FFD700', '#82431d'],
          ['#FCFE5E', '#1E90FF', '#FFD700', '#82431d']]

chart_title = "USGS elements statistics - Mean spectral signatures"
x_title = "Band"
y_title = "Reflectance"
height = 500
width = 800
guidance = "horizontal"

export_name = str(input("Path for chart of mean signatures of USGS elements: ")) 
#For example: charts/english/exploratory_analysis/descriptive_statistics/elements_mean_signatures_usgs

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

In [None]:
datasets_names = ["USGS", "USGS (100% de cobertura)"]

legends = [['Sacolas (USGS)', 'Garrafas (USGS)', 'Sacolas e garrafas (USGS)', 'Malha de HDPE (USGS)', 'Água (USGS)', 'Costa (USGS)', 'Madeira (USGS)'],
           ['Malha de HDPE (USGS 100%)', 'Água (USGS)', 'Costa (USGS)', 'Madeira (USGS 100%)']]

chart_title = "Estatísticas por elemento no USGS - Assinaturas espectrais médias"
x_title = "Banda"
y_title = "Reflectância"
width = 850

export_name = str(input("Caminhos para gráficos de assinatura média por elemento dos dados observados (USGS): ")) 
#For example: charts/portugues/analise_exploratoria/estatisticas_descritivas/elementos_assinaturas_medias_usgs

rsdata_charts.line_chart(datasets_names, traces, labels, legends, modes, colors, chart_title, x_title, y_title, height, width, legend_orientation = "v", guidance=guidance, export_name=export_name)

## Feature selection

### DART data

#### All classes

In [113]:
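# scikit-learn names used by the feature selection cells (commented out in the Imports cell)
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# create dataset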
X = dataset_dart[feature_names + radiometric_indexes]
y = dataset_dart['Label']

# holdout
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.25)

# TODO: consider the correlation and heatmap of the datasets in the analysis

#RandomForest
random_forest = RandomForestClassifier(max_depth=3, random_state=123)
random_forest.fit(X_train, y_train)
y_true = y_test
y_pred = random_forest.predict(X_test)
print(f"Accuracy: {round(accuracy_score(y_true, y_pred), 4)}")

dart_importances = pd.Series(data=random_forest.feature_importances_, index=feature_names + radiometric_indexes)

Accuracy: 1.0


#### Only plastic and water

In [114]:
# create dataset
X = dart_subdatasets['plastic_and_water'][feature_names + radiometric_indexes]
y = dart_subdatasets['plastic_and_water']['Label']

# holdout
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.25)

# TODO: consider the correlation and heatmap of the datasets in the analysis

#RandomForest
random_forest = RandomForestClassifier(max_depth=3, random_state=123)
random_forest.fit(X_train, y_train)
y_true = y_test
y_pred = random_forest.predict(X_test)
print(f"Accuracy: {round(accuracy_score(y_true, y_pred), 4)}")

dart_ap_importances = pd.Series(data=random_forest.feature_importances_, index=feature_names + radiometric_indexes)

Accuracy: 1.0


In [116]:
datasets_names = ["DART (all classes)", "DART (only water and plastic)"]
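# Keep the traces in the same order as datasets_names: all classes first, then plastic and water only
traces = [dart_importances, dart_ap_importances]
labels = [dart_importances.index, dart_ap_importances.index]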
color = 'rgba(50, 171, 96, 0.6)'
line_color = 'rgba(50, 171, 96, 1.0)'
chart_title = 'Feature importances for DART data'
x_title = "Importance"
y_title = "Feature"
height = 600
width = 900
orientation='h'
guidance="horizontal"

export_name = str(input("DART feature selection charts path: ")) 
#For example: charts/english/pre_processing/feature_selection/dart

rsdata_charts.bar_chart(datasets_names, traces, labels, color, line_color, chart_title, x_title, y_title, height, width, orientation, guidance=guidance, export_name=export_name)



datasets_names = ["DART (todas as classes)", "DART (apenas água e plástico)"]
chart_title = 'Importância de cada feature nos dados DART'
x_title = "Importância"
y_title = "Feature"

export_name = str(input("Caminho para os gráficos de seleção de atributos do DART: ")) 
#For example: charts/portugues/pre_processamento/selecao_atributos/dart

rsdata_charts.bar_chart(datasets_names, traces, labels, color, line_color, chart_title, x_title, y_title, height, width, orientation, guidance=guidance, export_name=export_name)

DART feature selection charts path: charts/english/pre_processing/feature_selection/dart
Caminho para os gráficos de seleção de atributos do DART: charts/portugues/pre_processamento/selecao_atributos/dart
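
Besides the bar charts, the two importance Series can be compared in a single table. The snippet below is a minimal sketch of that idea, not part of the original workflow: the importance values are made up for illustration, and `all_classes` and `plastic_water` are hypothetical stand-ins for `dart_importances` and `dart_ap_importances`.

```python
import pandas as pd

# Made-up importance values, for illustration only. In the notebook these
# would be dart_importances and dart_ap_importances.
all_classes = pd.Series({"NIR1": 0.30, "SR": 0.25, "WRI": 0.20, "FDI": 0.15, "Blue": 0.10})
plastic_water = pd.Series({"NIR1": 0.10, "SR": 0.35, "WRI": 0.05, "FDI": 0.40, "Blue": 0.10})

# One column per experiment, one row per feature
comparison = pd.concat({"all_classes": all_classes, "plastic_and_water": plastic_water}, axis=1)

# Rank the features by their mean importance across both experiments
comparison["mean"] = comparison[["all_classes", "plastic_and_water"]].mean(axis=1)
comparison = comparison.sort_values("mean", ascending=False)

# Names of the three highest-ranked features
top_features = comparison.index[:3].tolist()
print(comparison)
print(top_features)
```

Ranking by the mean is only one possible criterion; the minimum of the two columns, for example, would favour features that matter in both experiments.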


### USGS data

#### All classes

In [117]:
# create dataset
X = dataset_usgs[feature_names + radiometric_indexes]
y = dataset_usgs['Label']

# holdout
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.25)

# TODO: consider the correlation and heatmap of the datasets in the analysis

#RandomForest
random_forest = RandomForestClassifier(max_depth=3, random_state=123)
random_forest.fit(X_train, y_train)
y_true = y_test
y_pred = random_forest.predict(X_test)
print(f"Accuracy: {round(accuracy_score(y_true, y_pred), 4)}")

usgs_importances = pd.Series(data=random_forest.feature_importances_, index=feature_names + radiometric_indexes)

Accuracy: 0.9698


#### Only plastic and water

In [118]:
# create dataset
X = usgs_subdatasets['plastic_and_water'][feature_names + radiometric_indexes]
y = usgs_subdatasets['plastic_and_water']['Label']

# holdout
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.25)

# TODO: consider the correlation and heatmap of the datasets in the analysis

#RandomForest
random_forest = RandomForestClassifier(max_depth=3, random_state=123)
random_forest.fit(X_train, y_train)
y_true = y_test
y_pred = random_forest.predict(X_test)
print(f"Accuracy: {round(accuracy_score(y_true, y_pred), 4)}")

usgs_ap_importances = pd.Series(data=random_forest.feature_importances_, index=feature_names + radiometric_indexes)

Accuracy: 0.957


In [119]:
datasets_names = ["USGS (all classes)", "USGS (only water and plastic)"]
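# Keep the traces in the same order as datasets_names: all classes first, then plastic and water only
traces = [usgs_importances, usgs_ap_importances]
labels = [usgs_importances.index, usgs_ap_importances.index]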
color = 'rgba(50, 171, 96, 0.6)'
line_color = 'rgba(50, 171, 96, 1.0)'
chart_title = 'Feature importances for USGS data'
x_title = "Importance"
y_title = "Feature"
height = 600
width = 900
orientation='h'
guidance="horizontal"

export_name = str(input("USGS feature selection charts path: ")) 
#For example: charts/english/pre_processing/feature_selection/usgs

rsdata_charts.bar_chart(datasets_names, traces, labels, color, line_color, chart_title, x_title, y_title, height, width, orientation, guidance=guidance, export_name=export_name)



datasets_names = ["USGS (todas as classes)", "USGS (apenas água e plástico)"]
chart_title = 'Importância de cada feature nos dados USGS'
x_title = "Importância"
y_title = "Feature"

export_name = str(input("Caminho para os gráficos de seleção de atributos do USGS: ")) 
#For example: charts/portugues/pre_processamento/selecao_atributos/usgs

rsdata_charts.bar_chart(datasets_names, traces, labels, color, line_color, chart_title, x_title, y_title, height, width, orientation, guidance=guidance, export_name=export_name)

USGS feature selection charts path: charts/english/pre_processing/feature_selection/usgs
Caminho para os gráficos de seleção de atributos do USGS: charts/portugues/pre_processamento/selecao_atributos/usgs


### Testing DART x USGS

In [120]:
#RandomForest
random_forest = RandomForestClassifier(max_depth=3, random_state=123)
random_forest.fit(dart_subdatasets['plastic_and_water'][feature_names + radiometric_indexes], dart_subdatasets['plastic_and_water']['Label'])
y_true = usgs_subdatasets['plastic_and_water']['Label']
y_pred = random_forest.predict(usgs_subdatasets['plastic_and_water'][feature_names + radiometric_indexes])
print(f"Accuracy: {round(accuracy_score(y_true, y_pred), 4)}")

Accuracy: 0.3783
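
A single accuracy value does not show which class the DART-trained model gets wrong on the USGS pixels. A confusion table and per-class recall make that visible. The snippet below is a minimal sketch using pandas only; the label values are hypothetical, and in the notebook `y_true` and `y_pred` would come from the cell above.

```python
import pandas as pd

# Hypothetical labels, for illustration only. In the notebook, y_true would be the
# USGS labels and y_pred the predictions of the model trained on DART data.
y_true = pd.Series(["Plastic", "Plastic", "Plastic", "Water", "Water", "Water"], name="True")
y_pred = pd.Series(["Water", "Water", "Plastic", "Water", "Water", "Plastic"], name="Predicted")

# Use the same class list for rows and columns, so that a class that is
# never predicted still gets a (zero-filled) column.
classes = sorted(set(y_true) | set(y_pred))
confusion = pd.crosstab(y_true, y_pred).reindex(index=classes, columns=classes, fill_value=0)

# Recall per class: correctly predicted pixels divided by all pixels of that class
recall_per_class = pd.Series(
    {label: confusion.loc[label, label] / confusion.loc[label].sum() for label in classes}
)

print(confusion)
print(recall_per_class.round(4))
```

With the real predictions, a low recall for one class would indicate where the simulated and observed spectra diverge most.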


## Unsupervised classification

In [28]:
feature_sets = [
                    feature_names + radiometric_indexes,
                    feature_names,
                    radiometric_indexes,
                    ['NIR1', 'SR', 'WRI', 'FDI']
                ]

In [29]:
ids = ["A", "B", "C", "D"]
n_clusters = [3, 4, 5]   
paths = ['LDPE/20/', 'LDPE/40/', 'LDPE/60/', 'LDPE/80/', 'LDPE/100/',
         'MicroNapo/20/', 'MicroNapo/40/', 'MicroNapo/60/', 'MicroNapo/80/', 'MicroNapo/100/',
         'Nylon/20/', 'Nylon/40/', 'Nylon/60/', 'Nylon/80/', 'Nylon/100/',
         'PET/20/', 'PET/40/', 'PET/60/', 'PET/80/', 'PET/100/',
         'PP/20/', 'PP/40/', 'PP/60/', 'PP/80/', 'PP/100/',
         'PVC/20/', 'PVC/40/', 'PVC/60/', 'PVC/80/', 'PVC/100/']
dates = ['2019_04_18', '2019_05_03', '2019_05_18', '2019_05_28', '2019_06_07', '2021_06_21', '2021_07_01', '2021_07_06', '2021_07_21', '2021_08_25']

dart_path = str(input("DART unsupervised classification charts path: "))
#For example: charts/english/unsupervised_classification/kmeans/feature_set_

dart_caminho = str(input("Caminho para os gráficos de classificação não supervisionada do DART: ")) 
#For example: charts/portugues/classificacao_nao_supervisionada/kmeans/conjunto_atributos_

DART unsupervised classification charts path: charts/english/unsupervised_classification/kmeans/feature_set_
Caminho para os gráficos de classificação não supervisionada do DART: charts/portugues/classificacao_nao_supervisionada/kmeans/conjunto_atributos_


In [30]:
for i in range(len(feature_sets)):

    for n in n_clusters:
        names_clusters = [] 
        classified_data = rsdata_classification.k_means(dataset_dart, feature_sets[i], n, 123)
        
        for k in range(n): 
            names_clusters.append("Cluster "+str(k))
        
        labels_group = names_clusters
        y = []
        
        for k in range(n):
            rsdata_charts.pie_chart(
                [classified_data.query('Cluster == '+str(k))], 
                ['Label'], 
                ["Cluster "+str(k)], 
                "", 450, 300, 
                ['#1E90FF', '#FF69B4', '#FFD700'], dart_path+ids[i]+'/dart/k'+str(n)+'/cl'+str(k))
            
            rsdata_charts.pie_chart(
                [classified_data.query('Cluster == '+str(k))], 
                ['Classe'], 
                ["Cluster "+str(k)], 
                "", 450, 300, 
                ['#1E90FF', '#FF69B4', '#FFD700'], dart_caminho+ids[i]+'/dart/k'+str(n)+'/cl'+str(k))
                
            # Pixels of cluster k, counted per polymer (rows) and cover percent (columns)
            cluster_data = classified_data.query('Cluster == '+str(k))
            y.append(
                [
                    [len(cluster_data.query("Cover_percent == "+str(cover)+" and Polymer == '"+polymer+"'")) for cover in [20, 40, 60, 80, 100]]
                    for polymer in ['LDPE', 'PVC', 'PP', 'PET', 'MicroNapo', 'Nylon']
                ]
            )
            rsdata_charts.stacked_bar_chart(names_clusters, 
                                        #x,
                                        [
                                            ['Plastic 20%', 'Plastic 40%', 'Plastic 60%', 'Plastic 80%', 'Plastic 100%'],
                                            ['Plastic 20%', 'Plastic 40%', 'Plastic 60%', 'Plastic 80%', 'Plastic 100%'],
                                            ['Plastic 20%', 'Plastic 40%', 'Plastic 60%', 'Plastic 80%', 'Plastic 100%'],
                                            ['Plastic 20%', 'Plastic 40%', 'Plastic 60%', 'Plastic 80%', 'Plastic 100%'],
                                            ['Plastic 20%', 'Plastic 40%', 'Plastic 60%', 'Plastic 80%', 'Plastic 100%'],
                                            ['Plastic 20%', 'Plastic 40%', 'Plastic 60%', 'Plastic 80%', 'Plastic 100%']
                                        ],
                                        y, 
                                        ['LDPE', 'PVC', 'PP', 'PET', 'MicroNapo', 'Nylon'], #names, 
                                        ['#ADF224','#1AB1B1', '#2f1b70', '#D81F88', '#FF8825', '#F1C800'], #colors, 
                                        'Polymers by cluster', #chart_title, 
                                        'Cover percents', #x_title, 
                                        'Number of pixels', #y_title, 
                                        650, #height, 
                                        1000, #width, 
                                        labels_group, 
                                        'h', #orientation, 
                                        'horizontal', #guidance, 
                                        dart_path+ids[i]+"/dart/k"+str(n)+"/polymers"#export_name
                                       )


            rsdata_charts.stacked_bar_chart(names_clusters, 
                                        #x,
                                        [
                                            ['Plástico 20%', 'Plástico 40%', 'Plástico 60%', 'Plástico 80%', 'Plástico 100%'],
                                            ['Plástico 20%', 'Plástico 40%', 'Plástico 60%', 'Plástico 80%', 'Plástico 100%'],
                                            ['Plástico 20%', 'Plástico 40%', 'Plástico 60%', 'Plástico 80%', 'Plástico 100%'],
                                            ['Plástico 20%', 'Plástico 40%', 'Plástico 60%', 'Plástico 80%', 'Plástico 100%'],
                                            ['Plástico 20%', 'Plástico 40%', 'Plástico 60%', 'Plástico 80%', 'Plástico 100%'],
                                            ['Plástico 20%', 'Plástico 40%', 'Plástico 60%', 'Plástico 80%', 'Plástico 100%']
                                        ],
                                        y, 
                                        ['LDPE', 'PVC', 'PP', 'PET', 'MicroNapo', 'Nylon'], #names, 
                                        ['#ADF224','#1AB1B1', '#2f1b70', '#D81F88', '#FF8825', '#F1C800'], #colors, 
                                        'Polímeros por cluster', #chart_title, 
                                        'Percentuais de cobertura', #x_title, 
                                        'Número de pixels', #y_title, 
                                        650, #height, 
                                        1000, #width, 
                                        labels_group, 
                                        'h', #orientation, 
                                        'horizontal', #guidance, 
                                        dart_caminho+ids[i]+"/dart/k"+str(n)+"/polimeros"#export_name
                                       )
            
        for p in paths: 
            rsdata_classification.map_kmeans(p, dataset_dart, classified_data, dart_path+ids[i]+'/dart/k'+str(n)+'/', dart_caminho+ids[i]+'/dart/k'+str(n)+'/', 3000, 5000)
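
The per-cluster pixel counts passed to the stacked bar charts can also be built with `pd.crosstab` instead of one `query` call per cell. The snippet below is a minimal sketch under assumptions: the small DataFrame is a hypothetical stand-in for the output of `rsdata_classification.k_means`, and only the column names `Cluster`, `Polymer` and `Cover_percent` are taken from the notebook.

```python
import pandas as pd

# Hypothetical stand-in for the classified DataFrame; values are illustrative only
classified_data = pd.DataFrame({
    "Cluster":       [0, 0, 0, 1, 1, 1, 1],
    "Polymer":       ["LDPE", "LDPE", "PVC", "LDPE", "PVC", "PVC", "PVC"],
    "Cover_percent": [20, 20, 100, 100, 20, 20, 40],
})

polymers = ["LDPE", "PVC"]   # row order of each count table
covers = [20, 40, 100]       # column order of each count table

counts_by_cluster = []
for k in sorted(classified_data["Cluster"].unique()):
    cluster_data = classified_data[classified_data["Cluster"] == k]
    # Count pixels for every (polymer, cover percent) pair in this cluster
    table = pd.crosstab(cluster_data["Polymer"], cluster_data["Cover_percent"])
    # Fix the row and column order, filling absent combinations with zero
    table = table.reindex(index=polymers, columns=covers, fill_value=0)
    counts_by_cluster.append(table.values.tolist())

print(counts_by_cluster)
```

The resulting nested list has the same shape as `y` in the cell above: one entry per cluster, one row per polymer and one column per cover percent.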

In [31]:
usgs_path = str(input("USGS unsupervised classification charts path: "))
#For example: charts/english/unsupervised_classification/kmeans/feature_set_

usgs_caminho = str(input("Caminho para os gráficos de classificação não supervisionada do USGS: ")) 
#For example: charts/portugues/classificacao_nao_supervisionada/kmeans/conjunto_atributos_

USGS unsupervised classification charts path: charts/english/unsupervised_classification/kmeans/feature_set_
Caminho para os gráficos de classificação não supervisionada do USGS: charts/portugues/classificacao_nao_supervisionada/kmeans/conjunto_atributos_


In [32]:
for i in range(len(feature_sets)):
    for n in n_clusters:
        names_clusters = [] 
        classified_data = rsdata_classification.k_means(dataset_usgs, feature_sets[i], n, 123)
        
        for k in range(n): 
            names_clusters.append("Cluster "+str(k))
        
        labels_group = names_clusters
        x = []
        y = []
        z = []
        
        for k in range(n):
            rsdata_charts.pie_chart(
                [classified_data.query('Cluster == '+str(k))], 
                ['Label'], 
                ["Cluster "+str(k)], 
                "", 450, 300, 
                ['#1E90FF', '#FF69B4', '#FFD700'], usgs_path+ids[i]+'/usgs/k'+str(n)+'/cl'+str(k))
            
            
            rsdata_charts.pie_chart(
                [classified_data.query('Cluster == '+str(k))], 
                ['Classe'], 
                ["Cluster "+str(k)], 
                "", 450, 300, 
                ['#1E90FF', '#FF69B4', '#FFD700'], usgs_caminho+ids[i]+'/usgs/k'+str(n)+'/cl'+str(k))
        
            # Pixels of cluster k, reused by all the counts below
            cluster_data = classified_data.query('Cluster == '+str(k))

            # Cover percent bins, in the order of the chart labels below. Following those labels,
            # -100 encodes 100% cover and -1 an unknown percent. The first bin starts above 0
            # so that these negative codes are not counted a second time in it.
            cover_bins = [
                "Cover_percent > 0 and Cover_percent <= 20",
                "Cover_percent > 20 and Cover_percent <= 40",
                "Cover_percent > 40 and Cover_percent <= 60",
                "Cover_percent > 60 and Cover_percent <= 80",
                "Cover_percent > 80 and Cover_percent <= 99",
                "Cover_percent == -100",
                "Cover_percent == -1"
            ]

            # Number of pixels per acquisition date
            x.append(
                [
                    [len(cluster_data.query("Path == '"+date+"'")) for date in dates]
                ]
                )

            # Number of pixels per plastic type (rows) and cover percent bin (columns)
            y.append(
                [
                    [len(cluster_data.query(cover_bin+" and Polymer == '"+polymer+"'")) for cover_bin in cover_bins]
                    for polymer in ['Bags', 'Bags and Bottles', 'Bottles', 'HDPE mesh']
                ]
                )

            # Number of wood pixels per cover percent bin. As in the original counts, the five
            # range bins filter on the Polymer column and the two coded bins on the Label column.
            z.append(
                [
                    [len(cluster_data.query(cover_bin+" and Polymer == 'Wood'")) for cover_bin in cover_bins[:5]]
                    + [len(cluster_data.query(cover_bin+" and Label == 'Wood'")) for cover_bin in cover_bins[5:]]
                ]
                )
            
            rsdata_charts.stacked_bar_chart(names_clusters, 
                                        #x,
                                        [
                                            ['Plastic 1% - 20%', 'Plastic 20% - 40%', 'Plastic 40% - 60%', 'Plastic 60% - 80%', 'Plastic 80% - 99%', 'Plastic 100%', 'Unknown percent'],
                                            ['Plastic 1% - 20%', 'Plastic 20% - 40%', 'Plastic 40% - 60%', 'Plastic 60% - 80%', 'Plastic 80% - 99%', 'Plastic 100%', 'Unknown percent'],
                                            ['Plastic 1% - 20%', 'Plastic 20% - 40%', 'Plastic 40% - 60%', 'Plastic 60% - 80%', 'Plastic 80% - 99%', 'Plastic 100%', 'Unknown percent'],
                                            ['Plastic 1% - 20%', 'Plastic 20% - 40%', 'Plastic 40% - 60%', 'Plastic 60% - 80%', 'Plastic 80% - 99%', 'Plastic 100%', 'Unknown percent']
                                        ],
                                        y, 
                                        ['Bags', 'Bags and Bottles', 'Bottles', 'HDPE mesh'], #names, 
                                        ['#49C658','#8945AB', '#FF675F', '#FCFE5E'], #colors, 
                                        'Polymers by cluster', #chart_title, 
                                        'Cover percents', #x_title, 
                                        'Number of pixels', #y_title, 
                                        650, #height, 
                                        1300, #width, 
                                        labels_group, 
                                        'h', #orientation, 
                                        'horizontal', #guidance, 
                                        usgs_path+ids[i]+"/usgs/k"+str(n)+"/polymers"#export_name
                                           )


            rsdata_charts.stacked_bar_chart(names_clusters, 
                                        #x,
                                        [
                                            ['Plástico 1% a 20%', 'Plástico 20% a 40%', 'Plástico 40% a 60%', 'Plástico 60% a 80%', 'Plástico 80% a 99%', 'Plástico 100%', 'Percentual desconhecido'],
                                            ['Plástico 1% a 20%', 'Plástico 20% a 40%', 'Plástico 40% a 60%', 'Plástico 60% a 80%', 'Plástico 80% a 99%', 'Plástico 100%', 'Percentual desconhecido'],
                                            ['Plástico 1% a 20%', 'Plástico 20% a 40%', 'Plástico 40% a 60%', 'Plástico 60% a 80%', 'Plástico 80% a 99%', 'Plástico 100%', 'Percentual desconhecido'],
                                            ['Plástico 1% a 20%', 'Plástico 20% a 40%', 'Plástico 40% a 60%', 'Plástico 60% a 80%', 'Plástico 80% a 99%', 'Plástico 100%', 'Percentual desconhecido']
                                        ],
                                        y, 
                                        ['Sacolas', 'Sacolas e garrafas', 'Garrafas', 'Malha de HDPE'], #names, 
                                        ['#49C658','#8945AB', '#FF675F', '#FCFE5E'], #colors, 
                                        'Polímeros por cluster', #chart_title, 
                                        'Percentuais de cobertura', #x_title, 
                                        'Número de pixels', #y_title, 
                                        650, #height, 
                                        1300, #width, 
                                        labels_group, 
                                        'h', #orientation, 
                                        'horizontal', #guidance, 
                                        usgs_caminho+ids[i]+"/usgs/k"+str(n)+"/polimeros"#export_name
                                           )
        
        rsdata_charts.stacked_bar_chart(names_clusters, 
                                        #x,
                                        [
                                            ['Wood 1% - 20%', 'Wood 20% - 40%', 'Wood 40% - 60%', 'Wood 60% - 80%', 'Wood 80% - 99%', 'Wood 100%', 'Unknown percent']
                                        ],
                                        z, 
                                        ['Wood'], #names, 
                                        ['#82431d'], #colors, 
                                        'Wood by cluster', #chart_title, 
                                        'Cover percents', #x_title, 
                                        'Number of pixels', #y_title, 
                                        650, #height, 
                                        1300, #width, 
                                        labels_group, 
                                        'h', #orientation, 
                                        'horizontal', #guidance, 
                                        usgs_path+ids[i]+"/usgs/k"+str(n)+"/wood"#export_name
                                           )


        rsdata_charts.stacked_bar_chart(names_clusters, 
                                        #x,
                                        [
                                            ['Madeira 1% a 20%', 'Madeira 20% a 40%', 'Madeira 40% a 60%', 'Madeira 60% a 80%', 'Madeira 80% a 99%', 'Madeira 100%', 'Percentual desconhecido']
                                        ],
                                        z, 
                                        ['Madeira'], #names, 
                                        ['#82431d'], #colors, 
                                        'Madeira por cluster', #chart_title, 
                                        'Percentuais de cobertura', #x_title, 
                                        'Número de pixels', #y_title, 
                                        650, #height, 
                                        1300, #width, 
                                        labels_group, 
                                        'h', #orientation, 
                                        'horizontal', #guidance, 
                                        usgs_caminho+ids[i]+"/usgs/k"+str(n)+"/madeira"#export_name
                                           )
        
        rsdata_charts.stacked_bar_chart(names_clusters, 
                                        #x,
                                        [
                                            ['18/04/2019', '03/05/2019', '18/05/2019', '28/05/2019', '07/06/2019', '21/06/2021', '01/07/2021', '06/07/2021', '21/07/2021', '25/08/2021']
                                        ],
                                        x, 
                                        ['Dates'], #names, 
                                        ['#49C658','#8945AB', '#FF675F', '#FCFE5E'], #colors, 
                                        'Dates by cluster', #chart_title, 
                                        'Dates', #x_title, 
                                        'Number of pixels', #y_title, 
                                        650, #height, 
                                        1300, #width, 
                                        labels_group, 
                                        'h', #orientation, 
                                        'horizontal', #guidance, 
                                        usgs_path+ids[i]+"/usgs/k"+str(n)+"/dates"#export_name
                                           )


        rsdata_charts.stacked_bar_chart(names_clusters, 
                                        #x,
                                        [
                                            ['18/04/2019', '03/05/2019', '18/05/2019', '28/05/2019', '07/06/2019', '21/06/2021', '01/07/2021', '06/07/2021', '21/07/2021', '25/08/2021']
                                        ],
                                        x, 
                                        ['Datas'], #names, 
                                        ['#49C658','#8945AB', '#FF675F', '#FCFE5E'], #colors,  
                                        'Datas por cluster', #chart_title, 
                                        'Datas', #x_title, 
                                        'Número de pixels', #y_title, 
                                        650, #height, 
                                        1300, #width, 
                                        labels_group, 
                                        'h', #orientation, 
                                        'horizontal', #guidance, 
                                        usgs_caminho+ids[i]+"/usgs/k"+str(n)+"/datas"#export_name
                                           )
        
        for d in dates: 
            rsdata_classification.map_kmeans(d, dataset_usgs, classified_data, usgs_path+ids[i]+'/usgs/k'+str(n)+'/', usgs_caminho+ids[i]+'/usgs/k'+str(n)+'/', 500, 700)
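Each chart above is exported twice, once under the English path and once under the Portuguese path, with the export name built by string concatenation. A minimal sketch of how the two names could be generated together is shown here. The helper `export_names`, the `CHART_NAMES` mapping, and the example prefixes are hypothetical and not part of the `rsdata_charts` module; only the `/usgs/k<n>/` layout and the chart names are taken from the calls above.

```python
# English chart name -> Portuguese chart name, as used in the export names above.
CHART_NAMES = {'polymers': 'polimeros', 'wood': 'madeira', 'dates': 'datas'}


def export_names(path_en, path_pt, dataset_id, n_clusters, chart):
    """Return the (English, Portuguese) export names for one chart.

    The prefix and the dataset id are concatenated directly, mirroring
    usgs_path+ids[i]+"/usgs/k"+str(n)+"/polymers" in the notebook.
    """
    suffix = f"{dataset_id}/usgs/k{n_clusters}/"
    return path_en + suffix + chart, path_pt + suffix + CHART_NAMES[chart]


# Hypothetical prefixes, for illustration only.
name_en, name_pt = export_names('charts/en/', 'charts/pt/', 'A', 3, 'wood')
print(name_en)
print(name_pt)
```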

## Supervised classification

In [53]:
#dart_path = str(input("DART supervised classification charts path: "))
#For example: charts/english/supervised_classification/ann/feature_set_

#dart_caminho = str(input("Path for the Portuguese-language DART supervised classification charts: ")) 
#For example: charts/portugues/classificacao_supervisionada/rna/conjunto_atributos_

DART supervised classification charts path: charts/english/supervised_classification/ann/feature_set_
Path for the Portuguese-language DART supervised classification charts: charts/portugues/classificacao_supervisionada/rna/conjunto_atributos_


In [54]:

usgs_path = str(input("USGS supervised classification charts path: "))
#For example: charts/english/supervised_classification/ann/feature_set_

usgs_caminho = str(input("Path for the Portuguese-language USGS supervised classification charts: ")) 
#For example: charts/portugues/classificacao_supervisionada/rna/conjunto_atributos_

USGS supervised classification charts path: charts/english/supervised_classification/ann/feature_set_
Path for the Portuguese-language USGS supervised classification charts: charts/portugues/classificacao_supervisionada/rna/conjunto_atributos_


### GridSearchCV

In [39]:
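# These imports are commented out in the Imports section but are required by this cell
import datetime

from sklearn.metrics import (accuracy_score, balanced_accuracy_score, confusion_matrix,
                             f1_score, fbeta_score, jaccard_score, precision_score, recall_score)
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neural_network import MLPClassifier

for feature_set in feature_sets: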
    print("Starting best params search for MLPClassifier for feature set ", feature_set)
    # create dataset
    X = dart_subdatasets['plastic_and_water'][feature_set]
    y = dart_subdatasets['plastic_and_water']['Label']
    
    # configure the cross-validation procedure
    cv_outer = StratifiedKFold(n_splits=4, shuffle=True, random_state=123)
    
    # scoring metrics used to select the best model
    metrics = ['accuracy', # subset accuracy: the labels predicted for a sample must exactly match the labels in y_true
               'balanced_accuracy', # for imbalanced datasets; defined as the mean of the recall obtained on each class
               'f1_micro', # computed globally by counting the total true positives, false negatives and false positives
               'f1_weighted', # computed per label, then averaged weighted by support (the number of true instances of each label); this adjusts 'macro' for label imbalance and can give an F-score that is not between precision and recall
               'precision_micro', # tp / (tp + fp), global
               'precision_weighted', # tp / (tp + fp), per-label average weighted by support; accounts for label imbalance
               'recall_micro', # tp / (tp + fn), global; the macro variants were dropped because they are unweighted means over the labels
               'recall_weighted', # tp / (tp + fn), per-label average weighted by support; accounts for label imbalance
               'roc_auc_ovr', # area under the ROC curve (ROC AUC) from prediction scores, one-vs-rest: the AUC of each class against the rest [3] [4]; treats the multiclass case like the multilabel case; sensitive to class imbalance
               'roc_auc_ovr_weighted' # same as 'roc_auc_ovr', averaged weighted by support (the number of true instances of each label)
              ]
    
    # define search space
    space = dict()
    space['hidden_layer_sizes'] = [(20,), (30,30), (50,50,50)]  # (20,) is a 1-tuple; (20) is just the int 20
    space['solver'] = ['lbfgs', 'sgd', 'adam']
    space['alpha'] = [0.00001, 0.0001, 0.001]
    space['max_iter'] = [100, 250, 500]
    space['activation'] = ['identity', 'logistic', 'tanh', 'relu']
               
    best_models = []
    results = []
    
    for train_ix, test_ix in cv_outer.split(X, y):
        # select rows
        train_X, test_X = X.iloc[train_ix], X.iloc[test_ix]  # Later: train on 100% DART and test on 100% USGS, without cross-validation in that case
        train_y, test_y = y.iloc[train_ix], y.iloc[test_ix]
        
        # configure the cross-validation procedure
        cv_inner = StratifiedKFold(n_splits=4, shuffle=True, random_state=123)
        
        bests = dict()       
        rslt = []
               
        for metric in metrics:     
            model = MLPClassifier(random_state=123)
            # define search
            search = GridSearchCV(model, space, scoring=metric, cv=cv_inner, refit=True)
            # execute search
            result = search.fit(train_X, train_y)
            # get the best performing model fit on the whole training set
            best_model = result.best_estimator_
            # evaluate model on the hold out dataset
            y_pred = best_model.predict(test_X)
            #tn, fp, fn, tp = confusion_matrix(test_y, y_pred).ravel()
            # evaluate the model
            assessment = {
                          'feature_set': feature_set,
                          'metric': metric,
                          'y_true': test_y,
                          'y_pred': y_pred,
                          'confusion_matrix': confusion_matrix(test_y, y_pred, labels=['Sand', 'Water', 'Plastic']),
                          'accuracy': accuracy_score(test_y, y_pred),
                          'balanced_accuracy': balanced_accuracy_score(test_y, y_pred),
                          'f1_micro': f1_score(test_y, y_pred, average='micro'),
                          'f1_macro': f1_score(test_y, y_pred, average='macro'),
                          'f1_weighted': f1_score(test_y, y_pred, average='weighted'),
                          'fbeta_micro': fbeta_score(test_y, y_pred, average='micro', beta=0.5),
                          'fbeta_macro': fbeta_score(test_y, y_pred, average='macro', beta=0.5),
                          'fbeta_weighted': fbeta_score(test_y, y_pred, average='weighted', beta=0.5),
                          'jaccard_micro': jaccard_score(test_y, y_pred, average='micro'),
                          'jaccard_macro': jaccard_score(test_y, y_pred, average='macro'), 
                          'jaccard_weighted': jaccard_score(test_y, y_pred, average='weighted'), 
                          'precision_micro': precision_score(test_y, y_pred, average='micro'),
                          'precision_macro': precision_score(test_y, y_pred, average='macro'),
                          'precision_weighted': precision_score(test_y, y_pred, average='weighted'),  
                          'recall_micro': recall_score(test_y, y_pred, average='micro'),
                          'recall_macro': recall_score(test_y, y_pred, average='macro'),
                          'recall_weighted': recall_score(test_y, y_pred, average='weighted')
                        }
               
            print(datetime.datetime.now().strftime("%d-%b-%Y %A %I:%M"), ' - best model: ' , best_model ,' - assessment: ', assessment)
            bests.update({best_model:assessment})
            rslt.append(result)
        print("Ending loop CV")
        
        # store the result
        best_models.append(bests)
        results.append(rslt)
        # report progress
               
    print("Ending best params search for MLPClassifier for feature set ", feature_set)

0           Water
1           Water
2           Water
3           Water
4           Water
           ...   
428351    Plastic
428355    Plastic
428359    Plastic
428363    Plastic
428367    Plastic
Name: Label, Length: 219328, dtype: object
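The search above runs one `GridSearchCV` per scoring metric inside each outer fold, so the number of model fits grows quickly. The following sketch counts the fits for one feature set using only the values visible in the cell (search space, 4 inner folds, 4 outer folds, 10 metrics). It is an estimate of the workload under those assumptions, not a measurement; the single hidden layer is written as the 1-tuple `(20,)`.

```python
from itertools import product

# Search space copied from the cell above.
space = {
    'hidden_layer_sizes': [(20,), (30, 30), (50, 50, 50)],
    'solver': ['lbfgs', 'sgd', 'adam'],
    'alpha': [0.00001, 0.0001, 0.001],
    'max_iter': [100, 250, 500],
    'activation': ['identity', 'logistic', 'tanh', 'relu'],
}

n_candidates = len(list(product(*space.values())))
inner_folds = 4
outer_folds = 4
n_metrics = 10

# Each search fits every candidate on every inner fold,
# then refits the best candidate once (refit=True).
fits_per_search = n_candidates * inner_folds + 1
fits_per_feature_set = fits_per_search * n_metrics * outer_folds

print(n_candidates)          # 324
print(fits_per_search)       # 1297
print(fits_per_feature_set)  # 51880
```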

### Results for feature set A (bands)

#### Classification

In [101]:
assessment, errors, hits, confusion_matrices, acc, b_acc, f1_macro, f1_weighted, fbeta_macro, fbeta_weighted, pr_macro, rc_macro, pr_weighted, rc_weighted = rsdata_classification.multilayer_perceptron(dart_subdatasets['plastic_and_water'][feature_sets[0]], 
                                                                                            dart_subdatasets['plastic_and_water']['Label'],
                                                                                            usgs_subdatasets['plastic_and_water'][feature_sets[0]],
                                                                                            usgs_subdatasets['plastic_and_water']['Label'],
                                                                                            100)

#### Stats

In [102]:
acc_ = []
b_acc_ = []
f1_macro_ = []
f1_weighted_ = []
fbeta_macro_ = []
fbeta_weighted_ = []
pr_macro_ = []
rc_macro_ = []
pr_weighted_ = []
rc_weighted_ = []

for key in acc.keys():
    acc_.append(acc[key])
    b_acc_.append(b_acc[key])
    f1_macro_.append(f1_macro[key])
    f1_weighted_.append(f1_weighted[key])
    fbeta_macro_.append(fbeta_macro[key])
    fbeta_weighted_.append(fbeta_weighted[key])
    pr_macro_.append(pr_macro[key])
    rc_macro_.append(rc_macro[key])
    pr_weighted_.append(pr_weighted[key])
    rc_weighted_.append(rc_weighted[key])
    
acc_ = pd.DataFrame(acc_, columns = ['Acc'])
b_acc_ = pd.DataFrame(b_acc_, columns = ['B_acc'])
f1_macro_ = pd.DataFrame(f1_macro_, columns = ['F1M'])
f1_weighted_ = pd.DataFrame(f1_weighted_, columns = ['F1W'])
fbeta_macro_ = pd.DataFrame(fbeta_macro_, columns = ['FBM'])
fbeta_weighted_ = pd.DataFrame(fbeta_weighted_, columns = ['FBW'])
pr_macro_ = pd.DataFrame(pr_macro_, columns = ['PrM'])
rc_macro_ = pd.DataFrame(rc_macro_, columns = ['RcM'])
pr_weighted_ = pd.DataFrame(pr_weighted_, columns = ['PrW'])
rc_weighted_ = pd.DataFrame(rc_weighted_, columns = ['RcW'])
    
mt_a = pd.DataFrame([acc_['Acc'].mean(), b_acc_['B_acc'].mean(), f1_macro_['F1M'].mean(), f1_weighted_['F1W'].mean(), fbeta_macro_['FBM'].mean(), fbeta_weighted_['FBW'].mean(), pr_macro_['PrM'].mean(), pr_weighted_['PrW'].mean(), rc_macro_['RcM'].mean(), rc_weighted_['RcW'].mean()], index=['Overall accuracy', 'Balanced accuracy', 'F1 macro', 'F1 weighted', 'Fbeta macro', 'Fbeta weighted', 'Precision macro', 'Precision weighted', 'Recall macro', 'Recall weighted'], columns=['Feature set A'])
mt_a

| | Feature set A |
|---|---|
| Overall accuracy | 0.358861 |
| Balanced accuracy | 0.593941 |
| F1 macro | 0.306784 |
| F1 weighted | 0.476047 |
| Fbeta macro | 0.390098 |
| Fbeta weighted | 0.670609 |
| Precision macro | 0.520642 |
| Precision weighted | 0.931565 |
| Recall macro | 0.593941 |
| Recall weighted | 0.358861 |
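In the results above, the weighted recall equals the overall accuracy and the macro recall equals the balanced accuracy. This follows from the definitions of these averages, as the sketch below illustrates on a small confusion matrix. The counts are hypothetical and chosen only to mimic an imbalanced water/plastic problem; they are not taken from the datasets.

```python
import math

# Hypothetical 2-class confusion matrix (rows = true class, columns = predicted class).
labels = ['Water', 'Plastic']
cm = [
    [30, 70],  # true Water: 30 predicted Water, 70 predicted Plastic
    [1, 9],    # true Plastic: 1 predicted Water, 9 predicted Plastic
]

support = [sum(row) for row in cm]  # number of true samples per class
total = sum(support)
recall = [cm[i][i] / support[i] for i in range(len(cm))]

accuracy = sum(cm[i][i] for i in range(len(cm))) / total
recall_macro = sum(recall) / len(recall)  # unweighted mean = balanced accuracy
recall_weighted = sum(r * s for r, s in zip(recall, support)) / total

# Weighting each recall by its support recovers the overall accuracy.
print(math.isclose(accuracy, recall_weighted))
print(recall_macro)
```

With a dominant class that is classified poorly, the overall accuracy stays low even when the minority class is recovered well, which is why the balanced accuracy is reported alongside it.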


In [103]:
stats_by_polymer, stats_by_label_year, stats_by_plastic_cover_percent, stats_by_date = rsdata_classification.stats_classification(assessment, usgs_subdatasets['plastic_and_water'])
stats_by_polymer

| | Polymer | Total | Mean hits | Min hits | Max hits | Mean cover percent |
|---|---|---|---|---|---|---|
| 0 | Bags | 20 | 14.51 (72.6 %) | 8 (40.0 %) | 17 (85.0 %) | 12.15 % |
| 1 | Bottles | 25 | 15.62 (62.5 %) | 11 (44.0 %) | 18 (72.0 %) | 11.32 % |
| 2 | Bags and Bottles | 4 | 4.0 (100.0 %) | 4 (100.0 %) | 4 (100.0 %) | 33.25 % |
| 3 | HDPE mesh | 54 | 53.97 (99.9 %) | 53 (98.1 %) | 54 (100.0 %) | 100% (15 samples) and < 100% (39 samples) |
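The "Mean hits", "Min hits" and "Max hits" columns report correctly classified samples as a count followed by its share of the group total. A minimal sketch of that formatting is given below; `summarize_hits` is a hypothetical helper and the hit counts are made up, since the internals of `rsdata_classification.stats_classification` are not shown here.

```python
def summarize_hits(hits_per_run, total):
    """Format mean, min and max hits as 'count (percent %)' strings."""
    def fmt(count):
        return f"{round(count, 2)} ({count / total * 100:.1f} %)"

    mean_hits = sum(hits_per_run) / len(hits_per_run)
    return fmt(mean_hits), fmt(min(hits_per_run)), fmt(max(hits_per_run))


# Hypothetical hit counts from four runs over a group of 20 samples.
summary = summarize_hits([8, 16, 17, 17], total=20)
print(summary)  # ('14.5 (72.5 %)', '8 (40.0 %)', '17 (85.0 %)')
```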


In [104]:
stats_by_label_year

| | Year | Label | Total | Mean hits | Min hits | Max hits | Mean cover percent |
|---|---|---|---|---|---|---|---|
| 0 | 2019 | Water | 1346 | 460.74 (34.2 %) | 341 (25.3 %) | 626 (46.5 %) | 100.0 % |
| 1 | 2019 | Plastic | 49 | 34.13 (69.7 %) | 23 (46.9 %) | 39 (79.6 %) | 13.45 % |
| 2 | 2021 | Water | 597 | 185.39 (31.1 %) | 128 (21.4 %) | 268 (44.9 %) | 99.83 % |
| 3 | 2021 | Plastic | 54 | 53.97 (99.9 %) | 53 (98.1 %) | 54 (100.0 %) | 100% (15 samples) and < 100% (39 samples) |


In [105]:
stats_by_plastic_cover_percent

| | Cover_percent | Total | Mean hits | Min hits | Max hits | Mean cover percent |
|---|---|---|---|---|---|---|
| 0 | <=25% | 39 | 24.63 (63.2 %) | 15 (38.5 %) | 29 (74.4 %) | 7.51 % |
| 1 | 26% - 50% | 9 | 8.5 (94.4 %) | 7 (77.8 %) | 9 (100.0 %) | 34.56 % |
| 2 | 51% - 99% | 1 | 1.0 (100.0 %) | 1 (100.0 %) | 1 (100.0 %) | 55.0 % |
| 3 | 100% | 15 | 15.0 (100.0 %) | 15 (100.0 %) | 15 (100.0 %) | 100% |
| 4 | Unknown | 39 | 38.97 (99.9 %) | 38 (97.4 %) | 39 (100.0 %) | Unknown |
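The cover-percent groups in this table can be reproduced with a simple binning rule. The function below is a sketch under the assumption that unknown cover is represented by `None` and that values between 25 and 26 fall into the second bin; `cover_bin` is a hypothetical name, and the actual grouping inside `rsdata_classification.stats_classification` may differ.

```python
def cover_bin(cover_percent):
    """Map a plastic cover percentage to the bin labels used in the table.

    None stands for samples whose cover percentage is not known.
    """
    if cover_percent is None:
        return 'Unknown'
    if cover_percent <= 25:
        return '<=25%'
    if cover_percent <= 50:
        return '26% - 50%'
    if cover_percent < 100:
        return '51% - 99%'
    return '100%'


# The mean cover values reported in the table fall into their own bins.
print([cover_bin(v) for v in (7.51, 34.56, 55.0, 100, None)])
```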


In [106]:
stats_by_date

| | Date | Total water | Total plastic | Mean hits | Min hits | Max hits | Plastic mean cover percent |
|---|---|---|---|---|---|---|---|
| 0 | 18/04/2019 | 139 | 5 | 35.17 (24.4 %) | 19 (13.2 %) | 81 (56.2 %) | 24.0 % |
| 1 | 03/05/2019 | 373 | 19 | 309.23 (78.9 %) | 209 (53.3 %) | 368 (93.9 %) | 12.05 % |
| 2 | 18/05/2019 | 566 | 10 | 11.22 (1.9 %) | 10 (1.7 %) | 12 (2.1 %) | 12.1 % |
| 3 | 28/05/2019 | 136 | 8 | 27.54 (19.1 %) | 10 (6.9 %) | 126 (87.5 %) | 12.25 % |
| 4 | 07/06/2019 | 132 | 7 | 111.71 (80.4 %) | 106 (76.3 %) | 117 (84.2 %) | 13.0 % |
| 5 | 21/06/2021 | 121 | 9 | 98.34 (75.6 %) | 90 (69.2 %) | 105 (80.8 %) | Unknown |
| 6 | 01/07/2021 | 119 | 12 | 12.47 (9.5 %) | 12 (9.2 %) | 41 (31.3 %) | Unknown |
| 7 | 06/07/2021 | 120 | 11 | 59.11 (45.1 %) | 35 (26.7 %) | 85 (64.9 %) | Unknown |
| 8 | 21/07/2021 | 118 | 11 | 49.92 (38.7 %) | 31 (24.0 %) | 72 (55.8 %) | Unknown |
| 9 | 25/08/2021 | 119 | 11 | 19.52 (15.0 %) | 14 (10.8 %) | 30 (23.1 %) | Unknown |


#### Maps

In [114]:
dates = list(set(usgs_subdatasets['plastic_and_water'].loc[assessment[0].index]['Path']))

max_score = f1_weighted_.max()['F1W'] # Highest F1 weighted score across the runs

best_score = f1_weighted_.loc[f1_weighted_['F1W'] >= max_score].index[0] # Index of the first run that reaches the highest F1 weighted score

classified_data_a = assessment[best_score]
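The cell above picks the run with the highest weighted F1 by filtering on the maximum and taking the first matching index. The sketch below reproduces that selection on made-up scores and compares it with the pandas method `Series.idxmax`, which also returns the first index of the maximum. The DataFrame `f1_weighted_demo` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical F1 weighted scores, one row per run.
f1_weighted_demo = pd.DataFrame([0.41, 0.52, 0.47, 0.52], columns=['F1W'])

# Selection as written in the notebook: first row whose score reaches the maximum.
max_score = f1_weighted_demo.max()['F1W']
best_run = f1_weighted_demo.loc[f1_weighted_demo['F1W'] >= max_score].index[0]

# Equivalent shortcut: idxmax returns the first index label of the maximum.
best_run_idxmax = f1_weighted_demo['F1W'].idxmax()

print(best_run, best_run_idxmax)  # 1 1
```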

In [115]:
for d in dates:
    rsdata_charts.map_nn(d, usgs_subdatasets['plastic_and_water'], classified_data_a, usgs_path+'A/', usgs_caminho+'A/', 500, 1000)

### Results for feature set B (bands)

In [116]:
assessment, errors, hits, confusion_matrices, acc, b_acc, f1_macro, f1_weighted, fbeta_macro, fbeta_weighted, pr_macro, rc_macro, pr_weighted, rc_weighted = rsdata_classification.multilayer_perceptron(dart_subdatasets['plastic_and_water'][feature_sets[1]], 
                                                                                            dart_subdatasets['plastic_and_water']['Label'],
                                                                                            usgs_subdatasets['plastic_and_water'][feature_sets[1]],
                                                                                            usgs_subdatasets['plastic_and_water']['Label'],
                                                                                            100)


In [117]:
acc_ = []
b_acc_ = []
f1_macro_ = []
f1_weighted_ = []
fbeta_macro_ = []
fbeta_weighted_ = []
pr_macro_ = []
rc_macro_ = []
pr_weighted_ = []
rc_weighted_ = []

for key in acc.keys():
    acc_.append(acc[key])
    b_acc_.append(b_acc[key])
    f1_macro_.append(f1_macro[key])
    f1_weighted_.append(f1_weighted[key])
    fbeta_macro_.append(fbeta_macro[key])
    fbeta_weighted_.append(fbeta_weighted[key])
    pr_macro_.append(pr_macro[key])
    rc_macro_.append(rc_macro[key])
    pr_weighted_.append(pr_weighted[key])
    rc_weighted_.append(rc_weighted[key])
    
acc_ = pd.DataFrame(acc_, columns = ['Acc'])
b_acc_ = pd.DataFrame(b_acc_, columns = ['B_acc'])
f1_macro_ = pd.DataFrame(f1_macro_, columns = ['F1M'])
f1_weighted_ = pd.DataFrame(f1_weighted_, columns = ['F1W'])
fbeta_macro_ = pd.DataFrame(fbeta_macro_, columns = ['FBM'])
fbeta_weighted_ = pd.DataFrame(fbeta_weighted_, columns = ['FBW'])
pr_macro_ = pd.DataFrame(pr_macro_, columns = ['PrM'])
rc_macro_ = pd.DataFrame(rc_macro_, columns = ['RcM'])
pr_weighted_ = pd.DataFrame(pr_weighted_, columns = ['PrW'])
rc_weighted_ = pd.DataFrame(rc_weighted_, columns = ['RcW'])
    
mt_b = pd.DataFrame([acc_['Acc'].mean(), b_acc_['B_acc'].mean(), f1_macro_['F1M'].mean(), f1_weighted_['F1W'].mean(), fbeta_macro_['FBM'].mean(), fbeta_weighted_['FBW'].mean(), pr_macro_['PrM'].mean(), pr_weighted_['PrW'].mean(), rc_macro_['RcM'].mean(), rc_weighted_['RcW'].mean()], index=['Overall accuracy', 'Balanced accuracy', 'F1 macro', 'F1 weighted', 'Fbeta macro', 'Fbeta weighted', 'Precision macro', 'Precision weighted', 'Recall macro', 'Recall weighted'], columns=['Feature set B'])
mt_b
#mt_b['Feature set B']


| | Feature set B |
|---|---|
| Overall accuracy | 0.797146 |
| Balanced accuracy | 0.666378 |
| F1 macro | 0.546098 |
| F1 weighted | 0.849084 |
| Fbeta macro | 0.543849 |
| Fbeta weighted | 0.893649 |
| Precision macro | 0.550905 |
| Precision weighted | 0.927527 |
| Recall macro | 0.666378 |
| Recall weighted | 0.797146 |
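The aggregation cell is repeated almost verbatim for every feature set. One way to reduce the repetition is a small helper that averages each metric over its runs and returns the one-column summary table. The sketch below assumes that each metric returned by `rsdata_classification.multilayer_perceptron` is a dict keyed by run, as the `acc.keys()` loop suggests; `summarize_runs` and the demo values are hypothetical.

```python
import pandas as pd


def summarize_runs(scores_by_metric, column_name):
    """Average each metric over its runs and return a one-column DataFrame.

    scores_by_metric maps a row label to a dict of {run: score}.
    """
    means = {label: pd.Series(list(runs.values())).mean()
             for label, runs in scores_by_metric.items()}
    return pd.DataFrame(list(means.values()),
                        index=list(means.keys()),
                        columns=[column_name])


# Hypothetical scores from two runs.
demo = summarize_runs(
    {'Overall accuracy': {0: 0.75, 1: 0.25},
     'Balanced accuracy': {0: 0.5, 1: 1.0}},
    'Feature set X',
)
print(demo)
```

In the notebook this would be called with the ten metric dicts and a column name such as `'Feature set B'`. The per-run DataFrames (for example `f1_weighted_`) are still needed by the Maps cells, so the helper complements the existing code rather than replacing it.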


In [118]:
stats_by_polymer, stats_by_label_year, stats_by_plastic_cover_percent, stats_by_date = rsdata_classification.stats_classification(assessment, usgs_subdatasets['plastic_and_water'])
stats_by_polymer

| | Polymer | Total | Mean hits | Min hits | Max hits | Mean cover percent |
|---|---|---|---|---|---|---|
| 0 | Bags | 20 | 4.98 (24.9 %) | 4 (20.0 %) | 7 (35.0 %) | 12.15 % |
| 1 | Bottles | 25 | 8.79 (35.2 %) | 6 (24.0 %) | 13 (52.0 %) | 11.32 % |
| 2 | Bags and Bottles | 4 | 1.55 (38.8 %) | 1 (25.0 %) | 2 (50.0 %) | 33.25 % |
| 3 | HDPE mesh | 54 | 38.34 (71.0 %) | 32 (59.3 %) | 43 (79.6 %) | 100% (15 samples) and < 100% (39 samples) |


In [119]:
stats_by_label_year

| | Year | Label | Total | Mean hits | Min hits | Max hits | Mean cover percent |
|---|---|---|---|---|---|---|---|
| 0 | 2019 | Water | 1346 | 1064.8 (79.1 %) | 682 (50.7 %) | 1184 (88.0 %) | 100.0 % |
| 1 | 2019 | Plastic | 49 | 15.32 (31.3 %) | 11 (22.4 %) | 22 (44.9 %) | 13.45 % |
| 2 | 2021 | Water | 597 | 512.5 (85.8 %) | 466 (78.1 %) | 570 (95.5 %) | 99.83 % |
| 3 | 2021 | Plastic | 54 | 38.34 (71.0 %) | 32 (59.3 %) | 43 (79.6 %) | 100% (15 samples) and < 100% (39 samples) |


In [120]:
stats_by_plastic_cover_percent

| | Cover_percent | Total | Mean hits | Min hits | Max hits | Mean cover percent |
|---|---|---|---|---|---|---|
| 0 | <=25% | 39 | 11.75 (30.1 %) | 8 (20.5 %) | 17 (43.6 %) | 7.51 % |
| 1 | 26% - 50% | 9 | 3.57 (39.7 %) | 3 (33.3 %) | 5 (55.6 %) | 34.56 % |
| 2 | 51% - 99% | 1 | 0.0 (0.0 %) | 0 (0.0 %) | 0 (0.0 %) | 55.0 % |
| 3 | 100% | 15 | 14.63 (97.5 %) | 12 (80.0 %) | 15 (100.0 %) | 100% |
| 4 | Unknown | 39 | 23.71 (60.8 %) | 20 (51.3 %) | 28 (71.8 %) | Unknown |


In [121]:
stats_by_date

| | Date | Total water | Total plastic | Mean hits | Min hits | Max hits | Plastic mean cover percent |
|---|---|---|---|---|---|---|---|
| 0 | 18/04/2019 | 139 | 5 | 141.51 (98.3 %) | 140 (97.2 %) | 142 (98.6 %) | 24.0 % |
| 1 | 03/05/2019 | 373 | 19 | 345.79 (88.2 %) | 327 (83.4 %) | 356 (90.8 %) | 12.05 % |
| 2 | 18/05/2019 | 566 | 10 | 387.87 (67.3 %) | 96 (16.7 %) | 454 (78.8 %) | 12.1 % |
| 3 | 28/05/2019 | 136 | 8 | 73.95 (51.4 %) | 8 (5.6 %) | 115 (79.9 %) | 12.25 % |
| 4 | 07/06/2019 | 132 | 7 | 131.0 (94.2 %) | 131 (94.2 %) | 131 (94.2 %) | 13.0 % |
| 5 | 21/06/2021 | 121 | 9 | 122.46 (94.2 %) | 120 (92.3 %) | 124 (95.4 %) | Unknown |
| 6 | 01/07/2021 | 119 | 12 | 51.37 (39.2 %) | 14 (10.7 %) | 105 (80.2 %) | Unknown |
| 7 | 06/07/2021 | 120 | 11 | 128.73 (98.3 %) | 125 (95.4 %) | 130 (99.2 %) | Unknown |
| 8 | 21/07/2021 | 118 | 11 | 126.65 (98.2 %) | 125 (96.9 %) | 128 (99.2 %) | Unknown |
| 9 | 25/08/2021 | 119 | 11 | 121.63 (93.6 %) | 119 (91.5 %) | 122 (93.8 %) | Unknown |


#### Maps

In [122]:
dates = list(set(usgs_subdatasets['plastic_and_water'].loc[assessment[0].index]['Path']))

max_score = f1_weighted_.max()['F1W'] # Highest F1 weighted score across the runs

best_score = f1_weighted_.loc[f1_weighted_['F1W'] >= max_score].index[0] # Index of the first run that reaches the highest F1 weighted score

classified_data_b = assessment[best_score]

In [123]:
for d in dates:
    rsdata_charts.map_nn(d, usgs_subdatasets['plastic_and_water'], classified_data_b, usgs_path+'B/', usgs_caminho+'B/', 500, 1000)  # export to the feature set B folders so the set A maps are not overwritten

### Results for feature set C (bands)

In [124]:
assessment, errors, hits, confusion_matrices, acc, b_acc, f1_macro, f1_weighted, fbeta_macro, fbeta_weighted, pr_macro, rc_macro, pr_weighted, rc_weighted = rsdata_classification.multilayer_perceptron(dart_subdatasets['plastic_and_water'][feature_sets[2]], 
                                                                                            dart_subdatasets['plastic_and_water']['Label'],
                                                                                            usgs_subdatasets['plastic_and_water'][feature_sets[2]],
                                                                                            usgs_subdatasets['plastic_and_water']['Label'],
                                                                                            100)


In [125]:
acc_ = []
b_acc_ = []
f1_macro_ = []
f1_weighted_ = []
fbeta_macro_ = []
fbeta_weighted_ = []
pr_macro_ = []
rc_macro_ = []
pr_weighted_ = []
rc_weighted_ = []

for key in acc.keys():
    acc_.append(acc[key])
    b_acc_.append(b_acc[key])
    f1_macro_.append(f1_macro[key])
    f1_weighted_.append(f1_weighted[key])
    fbeta_macro_.append(fbeta_macro[key])
    fbeta_weighted_.append(fbeta_weighted[key])
    pr_macro_.append(pr_macro[key])
    rc_macro_.append(rc_macro[key])
    pr_weighted_.append(pr_weighted[key])
    rc_weighted_.append(rc_weighted[key])
    
acc_ = pd.DataFrame(acc_, columns = ['Acc'])
b_acc_ = pd.DataFrame(b_acc_, columns = ['B_acc'])
f1_macro_ = pd.DataFrame(f1_macro_, columns = ['F1M'])
f1_weighted_ = pd.DataFrame(f1_weighted_, columns = ['F1W'])
fbeta_macro_ = pd.DataFrame(fbeta_macro_, columns = ['FBM'])
fbeta_weighted_ = pd.DataFrame(fbeta_weighted_, columns = ['FBW'])
pr_macro_ = pd.DataFrame(pr_macro_, columns = ['PrM'])
rc_macro_ = pd.DataFrame(rc_macro_, columns = ['RcM'])
pr_weighted_ = pd.DataFrame(pr_weighted_, columns = ['PrW'])
rc_weighted_ = pd.DataFrame(rc_weighted_, columns = ['RcW'])
    
mt_c = pd.DataFrame([acc_['Acc'].mean(), b_acc_['B_acc'].mean(), f1_macro_['F1M'].mean(), f1_weighted_['F1W'].mean(), fbeta_macro_['FBM'].mean(), fbeta_weighted_['FBW'].mean(), pr_macro_['PrM'].mean(), pr_weighted_['PrW'].mean(), rc_macro_['RcM'].mean(), rc_weighted_['RcW'].mean()], index=['Overall accuracy', 'Balanced accuracy', 'F1 macro', 'F1 weighted', 'Fbeta macro', 'Fbeta weighted', 'Precision macro', 'Precision weighted', 'Recall macro', 'Recall weighted'], columns=['Feature set C'])
mt_c
#mt_c['Feature set C']


| | Feature set C |
|---|---|
| Overall accuracy | 0.170293 |
| Balanced accuracy | 0.52978 |
| F1 macro | 0.165115 |
| F1 weighted | 0.222445 |
| Fbeta macro | 0.243738 |
| Fbeta weighted | 0.403529 |
| Precision macro | 0.513028 |
| Precision weighted | 0.926191 |
| Recall macro | 0.52978 |
| Recall weighted | 0.170293 |


In [126]:
stats_by_polymer, stats_by_label_year, stats_by_plastic_cover_percent, stats_by_date = rsdata_classification.stats_classification(assessment, usgs_subdatasets['plastic_and_water'])
stats_by_polymer


| | Polymer | Total | Mean hits | Min hits | Max hits | Mean cover percent |
|---|---|---|---|---|---|---|
| 0 | Bags | 20 | 19.0 (95.0 %) | 19 (95.0 %) | 19 (95.0 %) | 12.15 % |
| 1 | Bottles | 25 | 21.68 (86.7 %) | 21 (84.0 %) | 22 (88.0 %) | 11.32 % |
| 2 | Bags and Bottles | 4 | 4.0 (100.0 %) | 4 (100.0 %) | 4 (100.0 %) | 33.25 % |
| 3 | HDPE mesh | 54 | 51.06 (94.6 %) | 47 (87.0 %) | 53 (98.1 %) | 100% (15 samples) and < 100% (39 samples) |


In [127]:
stats_by_label_year

| | Year | Label | Total | Mean hits | Min hits | Max hits | Mean cover percent |
|---|---|---|---|---|---|---|---|
| 0 | 2019 | Water | 1346 | 174.87 (13.0 %) | 139 (10.3 %) | 218 (16.2 %) | 100.0 % |
| 1 | 2019 | Plastic | 49 | 44.68 (91.2 %) | 44 (89.8 %) | 45 (91.8 %) | 13.45 % |
| 2 | 2021 | Water | 597 | 77.81 (13.0 %) | 44 (7.4 %) | 122 (20.4 %) | 99.83 % |
| 3 | 2021 | Plastic | 54 | 51.06 (94.6 %) | 47 (87.0 %) | 53 (98.1 %) | 100% (15 samples) and < 100% (39 samples) |


In [128]:
stats_by_plastic_cover_percent

| | Cover_percent | Total | Mean hits | Min hits | Max hits | Mean cover percent |
|---|---|---|---|---|---|---|
| 0 | <=25% | 39 | 34.68 (88.9 %) | 34 (87.2 %) | 35 (89.7 %) | 7.51 % |
| 1 | 26% - 50% | 9 | 9.0 (100.0 %) | 9 (100.0 %) | 9 (100.0 %) | 34.56 % |
| 2 | 51% - 99% | 1 | 1.0 (100.0 %) | 1 (100.0 %) | 1 (100.0 %) | 55.0 % |
| 3 | 100% | 15 | 13.79 (91.9 %) | 13 (86.7 %) | 14 (93.3 %) | 100% |
| 4 | Unknown | 39 | 37.27 (95.6 %) | 34 (87.2 %) | 39 (100.0 %) | Unknown |


In [129]:
stats_by_date

| | Date | Total water | Total plastic | Mean hits | Min hits | Max hits | Plastic mean cover percent |
|---|---|---|---|---|---|---|---|
| 0 | 18/04/2019 | 139 | 5 | 46.57 (32.3 %) | 24 (16.7 %) | 76 (52.8 %) | 24.0 % |
| 1 | 03/05/2019 | 373 | 19 | 56.12 (14.3 %) | 48 (12.2 %) | 66 (16.8 %) | 12.05 % |
| 2 | 18/05/2019 | 566 | 10 | 10.3 (1.8 %) | 10 (1.7 %) | 11 (1.9 %) | 12.1 % |
| 3 | 28/05/2019 | 136 | 8 | 8.0 (5.6 %) | 8 (5.6 %) | 8 (5.6 %) | 12.25 % |
| 4 | 07/06/2019 | 132 | 7 | 98.56 (70.9 %) | 92 (66.2 %) | 102 (73.4 %) | 13.0 % |
| 5 | 21/06/2021 | 121 | 9 | 57.44 (44.2 %) | 40 (30.8 %) | 74 (56.9 %) | Unknown |
| 6 | 01/07/2021 | 119 | 12 | 12.0 (9.2 %) | 12 (9.2 %) | 12 (9.2 %) | Unknown |
| 7 | 06/07/2021 | 120 | 11 | 21.22 (16.2 %) | 13 (9.9 %) | 35 (26.7 %) | Unknown |
| 8 | 21/07/2021 | 118 | 11 | 23.91 (18.5 %) | 20 (15.5 %) | 33 (25.6 %) | Unknown |
| 9 | 25/08/2021 | 119 | 11 | 14.3 (11.0 %) | 11 (8.5 %) | 17 (13.1 %) | Unknown |


#### Maps

In [130]:
dates = list(set(usgs_subdatasets['plastic_and_water'].loc[assessment[0].index]['Path']))

max_score = f1_weighted_.max()['F1W'] # Highest F1 weighted score across the runs

best_score = f1_weighted_.loc[f1_weighted_['F1W'] >= max_score].index[0] # Index of the first run that reaches the highest F1 weighted score

classified_data_c = assessment[best_score]

In [131]:
for d in dates:
    rsdata_charts.map_nn(d, usgs_subdatasets['plastic_and_water'], classified_data_c, usgs_path+'C/', usgs_caminho+'C/', 500, 1000)  # export to the feature set C folders so the set A maps are not overwritten

### Results for feature set D (bands)

In [133]:
assessment, errors, hits, confusion_matrices, acc, b_acc, f1_macro, f1_weighted, fbeta_macro, fbeta_weighted, pr_macro, rc_macro, pr_weighted, rc_weighted = rsdata_classification.multilayer_perceptron(dart_subdatasets['plastic_and_water'][feature_sets[3]], 
                                                                                            dart_subdatasets['plastic_and_water']['Label'],
                                                                                            usgs_subdatasets['plastic_and_water'][feature_sets[3]],
                                                                                            usgs_subdatasets['plastic_and_water']['Label'],
                                                                                            100)

In [134]:
acc_ = []
b_acc_ = []
f1_macro_ = []
f1_weighted_ = []
fbeta_macro_ = []
fbeta_weighted_ = []
pr_macro_ = []
rc_macro_ = []
pr_weighted_ = []
rc_weighted_ = []

for key in acc.keys():
    acc_.append(acc[key])
    b_acc_.append(b_acc[key])
    f1_macro_.append(f1_macro[key])
    f1_weighted_.append(f1_weighted[key])
    fbeta_macro_.append(fbeta_macro[key])
    fbeta_weighted_.append(fbeta_weighted[key])
    pr_macro_.append(pr_macro[key])
    rc_macro_.append(rc_macro[key])
    pr_weighted_.append(pr_weighted[key])
    rc_weighted_.append(rc_weighted[key])
    
acc_ = pd.DataFrame(acc_, columns = ['Acc'])
b_acc_ = pd.DataFrame(b_acc_, columns = ['B_acc'])
f1_macro_ = pd.DataFrame(f1_macro_, columns = ['F1M'])
f1_weighted_ = pd.DataFrame(f1_weighted_, columns = ['F1W'])
fbeta_macro_ = pd.DataFrame(fbeta_macro_, columns = ['FBM'])
fbeta_weighted_ = pd.DataFrame(fbeta_weighted_, columns = ['FBW'])
pr_macro_ = pd.DataFrame(pr_macro_, columns = ['PrM'])
rc_macro_ = pd.DataFrame(rc_macro_, columns = ['RcM'])
pr_weighted_ = pd.DataFrame(pr_weighted_, columns = ['PrW'])
rc_weighted_ = pd.DataFrame(rc_weighted_, columns = ['RcW'])
    
mt_d = pd.DataFrame([acc_['Acc'].mean(), b_acc_['B_acc'].mean(), f1_macro_['F1M'].mean(), f1_weighted_['F1W'].mean(), fbeta_macro_['FBM'].mean(), fbeta_weighted_['FBW'].mean(), pr_macro_['PrM'].mean(), pr_weighted_['PrW'].mean(), rc_macro_['RcM'].mean(), rc_weighted_['RcW'].mean()], index=['Overall accuracy', 'Balanced accuracy', 'F1 macro', 'F1 weighted', 'Fbeta macro', 'Fbeta weighted', 'Precision macro', 'Precision weighted', 'Recall macro', 'Recall weighted'], columns=['Feature set D'])
mt_d
#mt_d['Feature set D']

Unnamed: 0,Feature set D
Overall accuracy,0.148118
Balanced accuracy,0.5079
F1 macro,0.140715
F1 weighted,0.180133
Fbeta macro,0.201248
Fbeta weighted,0.32548
Precision macro,0.504919
Precision weighted,0.912949
Recall macro,0.5079
Recall weighted,0.148118
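
In the table above, weighted recall equals overall accuracy and macro recall equals balanced accuracy. Both identities follow from the metric definitions. The sketch below illustrates them in plain Python on made-up labels (18 water samples and 2 plastic samples, not data from this notebook); the helper names `per_class_recall` and `summarize` are hypothetical and not part of the `modules` package.

```python
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label present in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}

def summarize(y_true, y_pred):
    """Compute overall accuracy plus macro- and support-weighted recall."""
    support = Counter(y_true)
    n = len(y_true)
    recall = per_class_recall(y_true, y_pred)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Macro average: every class counts equally (this is the balanced accuracy).
    recall_macro = sum(recall.values()) / len(recall)
    # Weighted average: every class counts in proportion to its number of samples.
    recall_weighted = sum(recall[c] * support[c] / n for c in support)
    return {'accuracy': accuracy,
            'recall_macro': recall_macro,
            'recall_weighted': recall_weighted}

# Toy labels (assumption): most water samples are wrongly labeled as plastic.
y_true = ['Water'] * 18 + ['Plastic'] * 2
y_pred = ['Plastic'] * 16 + ['Water'] * 2 + ['Plastic'] * 2

metrics = summarize(y_true, y_pred)
print(metrics)
```

With a dominant class that is mostly misclassified, accuracy and weighted recall stay low, while macro recall is pulled up by the well-detected minority class. This is the same pattern the table shows.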


In [135]:
stats_by_polymer, stats_by_label_year, stats_by_plastic_cover_percent, stats_by_date = rsdata_classification.stats_classification(assessment, usgs_subdatasets['plastic_and_water'])
stats_by_polymer


Unnamed: 0,Polymer,Total,Mean hits,Min hits,Max hits,Mean cover percent
0,Bags,20,17.54 (87.7 %),5 (25.0 %),19 (95.0 %),12.15 %
1,Bottles,25,20.96 (83.8 %),8 (32.0 %),24 (96.0 %),11.32 %
2,Bags and Bottles,4,3.99 (99.8 %),3 (75.0 %),4 (100.0 %),33.25 %
3,HDPE mesh,54,51.03 (94.5 %),42 (77.8 %),54 (100.0 %),100% (15 samples) and < 100% (39 samples)
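
The internals of `rsdata_classification.stats_classification` are not shown here. As a rough illustration of how columns such as "Mean hits", "Min hits", and "Max hits" could be derived, the sketch below assumes that each repeated classification run yields one boolean per sample (True when the sample was classified correctly). The function `hits_summary` and the toy data are hypothetical.

```python
def hits_summary(runs):
    """Summarize correct classifications over repeated runs.

    runs: list of lists of booleans, one inner list per run and one
    boolean per sample (assumed layout, not the notebook's actual one).
    """
    total = len(runs[0])
    hits = [sum(run) for run in runs]  # number of correct samples in each run
    mean_hits = sum(hits) / len(hits)

    def fmt(value):
        # Show the count followed by its share of the group total.
        return f"{round(value, 2)} ({round(100 * value / total, 1)} %)"

    return {'Total': total,
            'Mean hits': fmt(mean_hits),
            'Min hits': fmt(min(hits)),
            'Max hits': fmt(max(hits))}

# Three made-up runs over a group of four samples.
runs = [
    [True, True, False, True],
    [True, False, False, True],
    [True, True, True, True],
]
summary = hits_summary(runs)
print(summary)
```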


In [136]:
stats_by_label_year

Unnamed: 0,Year,Label,Total,Mean hits,Min hits,Max hits,Mean cover percent
0,2019,Water,1346,173.27 (12.9 %),96 (7.1 %),672 (49.9 %),100.0 %
1,2019,Plastic,49,42.49 (86.7 %),16 (32.7 %),47 (95.9 %),13.45 %
2,2021,Water,597,36.26 (6.1 %),14 (2.3 %),209 (35.0 %),99.83 %
3,2021,Plastic,54,51.03 (94.5 %),42 (77.8 %),54 (100.0 %),100% (15 samples) and < 100% (39 samples)


In [137]:
stats_by_plastic_cover_percent

Unnamed: 0,Cover_percent,Total,Mean hits,Min hits,Max hits,Mean cover percent
0,<=25%,39,32.86 (84.3 %),12 (30.8 %),37 (94.9 %),7.51 %
1,26% - 50%,9,8.64 (96.0 %),4 (44.4 %),9 (100.0 %),34.56 %
2,51% - 99%,1,0.99 (99.0 %),0 (0.0 %),1 (100.0 %),55.0 %
3,100%,15,14.11 (94.1 %),12 (80.0 %),15 (100.0 %),100%
4,Unknown,39,36.92 (94.7 %),30 (76.9 %),39 (100.0 %),Unknown


In [138]:
stats_by_date

Unnamed: 0,Date,Total water,Total plastic,Mean hits,Min hits,Max hits,Plastic mean cover percent
0,18/04/2019,139,5,10.29 (7.1 %),5 (3.5 %),34 (23.6 %),24.0 %
1,03/05/2019,373,19,101.43 (25.9 %),44 (11.2 %),355 (90.6 %),12.05 %
2,18/05/2019,566,10,14.87 (2.6 %),10 (1.7 %),145 (25.2 %),12.1 %
3,28/05/2019,136,8,18.94 (13.2 %),8 (5.6 %),136 (94.4 %),12.25 %
4,07/06/2019,132,7,70.23 (50.5 %),46 (33.1 %),84 (60.4 %),13.0 %
5,21/06/2021,121,9,31.91 (24.5 %),21 (16.2 %),66 (50.8 %),Unknown
6,01/07/2021,119,12,18.2 (13.9 %),12 (9.2 %),119 (90.8 %),Unknown
7,06/07/2021,120,11,14.02 (10.7 %),11 (8.4 %),40 (30.5 %),Unknown
8,21/07/2021,118,11,14.0 (10.9 %),12 (9.3 %),17 (13.2 %),Unknown
9,25/08/2021,119,11,9.16 (7.0 %),5 (3.8 %),11 (8.5 %),Unknown


#### Maps

In [139]:
dates = list(set(usgs_subdatasets['plastic_and_water'].loc[assessment[0].index]['Path']))

max_score = f1_weighted_.max()['F1W'] #Highest weighted F1 score

best_score = f1_weighted_.loc[f1_weighted_['F1W'] >= max_score].index[0] #Index of the first entry that reaches the highest weighted F1 score

classified_data_d = assessment[best_score]
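
The selection in the cell above keeps the first index whose weighted F1 score reaches the maximum. A minimal sketch on made-up scores (not results from this notebook) shows that pandas' `Series.idxmax` returns the same index, including when scores tie:

```python
import pandas as pd

# Made-up weighted F1 scores for four hypothetical runs.
scores = pd.DataFrame([0.12, 0.31, 0.18, 0.31], columns=['F1W'])

# Expression used in the cell above: first index whose score reaches the maximum.
max_score = scores.max()['F1W']
best_by_filter = scores.loc[scores['F1W'] >= max_score].index[0]

# Equivalent one-liner: idxmax returns the index label of the first maximum.
best_by_idxmax = scores['F1W'].idxmax()

print(best_by_filter, best_by_idxmax)
```

Both expressions pick the earlier of the two tied runs, so either form could be used to index `assessment`.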

In [140]:
for d in dates:
    rsdata_charts.map_nn(d, usgs_subdatasets['plastic_and_water'], classified_data_d, usgs_path+'A/', usgs_caminho+'A/', 500, 1000)

In [None]:
#TODO: build a table listing each pixel and the class assigned to it by each classification configuration / use it to compose the map
#"reevaluate the model accuracy from an image perspective"

# References

[1] Themistocleous, K., Papoutsa, C., Michaelides, S., & Hadjimitsis, D. (2020). Investigating detection of floating plastic litter from space using Sentinel-2 imagery. Remote Sensing, 12(16), 2648. <https://www.mdpi.com/2072-4292/12/16/2648>

[2] Biermann, L., Clewley, D., Martinez-Vicente, V., & Topouzelis, K. (2020). Finding plastic patches in coastal waters using optical satellite data. Scientific Reports, 10(1), 5364. <https://www.nature.com/articles/s41598-020-62298-z>