# Post-process


At this stage we want to keep only the highest-quality predictions, so we can apply:

   * A probability threshold, to keep only the results with high values.
   * A minimum area filter, to keep the predictions that cover large areas and remove false positives (FP) caused by small fluctuations.

The results can be processed in different ways. For example, we can first polygonize the prediction result while applying a probability threshold, and then work with the resulting vector file to filter out the predictions with small areas.

In this example we work as follows:

1. First, polygonize each result while applying a threshold to the classes.
2. Merge all the results into a single vector file (each result could also be processed independently).
3. Before applying the minimum area filter, it can be useful to apply buffer -> dissolve -> multipart to single parts. This way, predictions that are small on their own but lie close to others are combined into a single larger area, so the minimum area filter does not remove them.
4. Apply the minimum area filter.
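
To illustrate why step 3 matters, here is a minimal sketch on a toy grid. It is a raster analogy of the vector workflow, not the method used in this notebook: `dilate` stands in for the buffer, `components` for dissolve plus multipart to single parts, and the cell count for the area. All function names and the toy data are hypothetical.

```python
from collections import deque

def dilate(cells, radius):
    """Grow every cell by `radius` cells in all directions (a square 'buffer')."""
    return {
        (r + dr, c + dc)
        for r, c in cells
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
    }

def components(cells):
    """Split a set of cells into 4-connected groups ('single parts')."""
    remaining = set(cells)
    groups = []
    while remaining:
        start = remaining.pop()
        group, queue = {start}, deque([start])
        while queue:
            r, c = queue.popleft()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    group.add(nb)
                    queue.append(nb)
        groups.append(group)
    return groups

def filter_min_area(cells, min_area, radius=0):
    """Buffer, dissolve into connected parts, then keep parts larger than min_area."""
    return [g for g in components(dilate(cells, radius)) if len(g) > min_area]

# Two 2x2 predictions separated by a one-cell gap, plus one isolated 2x2 prediction
blob_a = {(r, c) for r in (0, 1) for c in (0, 1)}
blob_b = {(r, c) for r in (0, 1) for c in (3, 4)}
blob_c = {(r, c) for r in (10, 11) for c in (10, 11)}
cells = blob_a | blob_b | blob_c

kept = filter_min_area(cells, min_area=20, radius=1)
print(len(kept), [len(g) for g in kept])
```

With the buffer, the two neighbouring predictions merge into one part that passes the filter, while the isolated one is still removed. Note that the area is measured on the buffered parts, so the threshold has to be chosen with the buffer in mind.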

**Apply a threshold to the probability and polygonize**

In [None]:
import os 

from os import walk
from tqdm import tqdm

PATH = os.path.join('./data_result_cba/','onedate_FW')

results_folder = os.path.join(PATH,'160_160/')

filter_results_folder         = os.path.join(PATH,'filtered/') 
vector_results_folder         = os.path.join(PATH,'vector_geojson/')

!mkdir -p $filter_results_folder
!mkdir -p $vector_results_folder

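# List the files directly inside results_folder ([] if the folder is missing)
input_files = next(walk(results_folder), (None, None, []))[2]
print(input_files[:1])
DEBUG = True

for raster in tqdm(input_files):

    # Only process GeoTIFF rasters
    if raster.endswith('.tif'):

        src = results_folder + raster
        dst = filter_results_folder + raster

        # Threshold band 1 (A) and band 2 (B) into a single class raster:
        # 100 where A > 140 and B < 240, 200 where A > 0 and B > 240, 0 elsewhere
        !gdal_calc.py -A $src --A_band=1 -B $src --B_band=2 \
        --calc="(A>140)*(B<240)*100+(A>0)*(B>240)*200" \
        --outfile=$dst

        if DEBUG: print("convert into a vector")
        # Convert the class raster into polygons (the class value is stored in the DN field)
        vec_file_name = raster[:-4] + '.geojson'
        vec_dst = vector_results_folder + vec_file_name

        !gdal_polygonize.py $dst $vec_dst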
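
The `--calc` expression above can be hard to read at a glance. As a minimal sketch, the same arithmetic for a single pixel can be written in plain Python. The function name `classify_pixel` is hypothetical; the thresholds are copied from the expression, and 100 and 200 are the values that the later cells select with `DN='100'` (fire) and `DN='200'` (water).

```python
def classify_pixel(a, b):
    """Mirror (A>140)*(B<240)*100 + (A>0)*(B>240)*200 for one pixel.

    a is the band 1 value, b is the band 2 value.
    """
    fire = int(a > 140) * int(b < 240) * 100
    water = int(a > 0) * int(b > 240) * 200
    return fire + water

# A few sample (band 1, band 2) pairs
for a, b in [(200, 100), (50, 250), (100, 100), (200, 240)]:
    print((a, b), classify_pixel(a, b))
```

The two conditions cannot both hold, because one requires `B < 240` and the other `B > 240`, so each pixel ends up as 0, 100, or 200. A pixel with `B` exactly equal to 240 is always 0.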


**Merge all the results**

In [None]:
from glob import glob
import subprocess

files = glob('./data_result_cba/onedate_FW/vector_geojson/*.geojson')

for i, f in enumerate(tqdm(files)):
    if i == 0:
        # The first file creates the GeoPackage and its layer "out"
        cmd = f'ogrmerge.py -f GPKG -o ./data_result_cba/onedate_FW/merged.gpkg -nln out {f}'
    else:
        # Every other file is appended to the existing layer
        cmd = f'ogrmerge.py -append -o ./data_result_cba/onedate_FW/merged.gpkg -nln out {f}'
    subprocess.run(cmd, shell=True)
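
As an alternative to the loop, `ogrmerge.py` accepts several input files at once, and its `-single` option writes them all into one layer. The sketch below only builds the command; the helper name `build_merge_command` and the placeholder file names are hypothetical, and whether a single call suits a very large number of files is left as an assumption to check.

```python
import shlex

def build_merge_command(files, output, layer="out"):
    """Return an ogrmerge.py argument list that merges `files` into one layer."""
    return [
        "ogrmerge.py",
        "-single",        # write every input into a single output layer
        "-overwrite_ds",  # replace the output file if it already exists
        "-f", "GPKG",
        "-o", output,
        "-nln", layer,
        *files,
    ]

cmd = build_merge_command(
    ["a.geojson", "b.geojson"],  # placeholder names; use the glob() result instead
    "./data_result_cba/onedate_FW/merged.gpkg",
)
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids `shell=True`, so file names with spaces do not need quoting.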

**Reproject to UTM and create a file for each category**

In [None]:
# Reproject the merged layer to UTM (EPSG:32720, UTM zone 20S)
# so that buffer distances and areas are expressed in metres
name = './data_result_bar/onedate_FW/merged.gpkg'
name_utm = './data_result_bar/onedate_FW/merged_utm.gpkg'

!ogr2ogr -s_srs EPSG:4326 -t_srs EPSG:32720 -f 'GPKG' $name_utm $name
        
# Fire class (DN = 100): buffer each polygon by 50 m, then dissolve everything into one geometry
name_fuego = './data_result_bar/onedate_FW/merged_filter_buff_fire.gpkg'
!rm -f $name_fuego
!ogr2ogr \
    -t_srs EPSG:32720 \
    -f "GPKG" \
    -sql "select ST_union(ST_buffer(geom,50)) as geometry FROM out WHERE DN='100' " \
    -dialect SQLITE \
    $name_fuego \
    $name_utm

# Water class (DN = 200): buffer each polygon by 50 m, then dissolve everything into one geometry
name_agua = './data_result_bar/onedate_FW/merged_filter_buff_water.gpkg'
!rm -f $name_agua
!ogr2ogr \
    -t_srs EPSG:32720 \
    -f "GPKG" \
    -sql "select ST_union(ST_buffer(geom,50)) as geometry FROM out WHERE DN='200' " \
    -dialect SQLITE \
    $name_agua \
    $name_utm
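
The fire and water commands differ only in the class value and the output file, so they can be generated by one helper. The sketch below is a hypothetical `build_dissolve_command`, not part of the original notebook. It also adds the `ogr2ogr` option `-explodecollections`, on the assumption that the dissolved multipolygon should be split into single parts before the area filter, as described in step 3.

```python
def build_dissolve_command(src, dst, dn, buffer_m=50, layer="out"):
    """Return an ogr2ogr argument list that buffers and dissolves one class."""
    sql = (
        f"SELECT ST_Union(ST_Buffer(geom, {buffer_m})) AS geometry "
        f"FROM {layer} WHERE DN='{dn}'"
    )
    return [
        "ogr2ogr",
        "-f", "GPKG",
        "-dialect", "SQLITE",
        "-sql", sql,
        "-explodecollections",  # assumption: split the multipolygon into single parts
        dst,
        src,
    ]

name_utm = "./data_result_bar/onedate_FW/merged_utm.gpkg"
outputs = {
    100: "./data_result_bar/onedate_FW/merged_filter_buff_fire.gpkg",
    200: "./data_result_bar/onedate_FW/merged_filter_buff_water.gpkg",
}
commands = {dn: build_dissolve_command(name_utm, dst, dn) for dn, dst in outputs.items()}
for dn, cmd in commands.items():
    print(dn, cmd)
# To execute one of them: subprocess.run(commands[100], check=True)
```

Without single parts, `ST_Area` would measure the whole dissolved geometry at once, so the minimum area filter could not remove individual small patches.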

**Filter out results with small areas**

In [None]:
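# Minimum area, in the units of the input layer's CRS (m² if the layer is already in UTM, i.e. 10 ha)
min_area = 100000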

input_path = './data_result_cba/onedate_FW/cba2.gpkg' 
output_path = './data_result_cba/onedate_FW/cba2__minarea100000.gpkg'


!ogr2ogr \
    -t_srs EPSG:32720 \
    -f "GPKG" \
    -sql "SELECT * FROM out m WHERE ST_Area(geom) > $min_area" \
    -dialect SQLITE \
    -nln results \
    $output_path \
    $input_path