This notebook shows how to call the main Python functions for preprocessing, extracting training data, training a model, and predicting a tif.
In general, calling the data extraction, training, and prediction scripts from the command line is recommended over using this notebook.
The notebook takes about twice as long.

In [None]:
from rural_beauty.config import models_dir
import pathlib


from rural_beauty import preprocessing         # turns the raw data into cleaned data ready for extraction
from rural_beauty import get_data_for_training # creates the model data frames
from rural_beauty import training_model        # trains a tree model
from rural_beauty import predict_generic       # predicts a tif based on the model

This is the rural_beauty module


In [None]:
preprocessing.main()

# options:
# preprocessing.main( --skip_DE --skip_UK --skip_CLC --skip_DEM --skip_OSM --skip_Hemerobie --no-skip_Protected --skip_Neighborhood)

In [None]:
# parameters for data generation
country = 'DE'
target_variable = 'beauty'
sampling_method = 'all_pixels' # extracting all_pixels takes a long time: 60+ min on the IIASA VM101 server


# parameters for model training
model_class      = 'XGB'
class_balance    = 'asis'
number_classes   = 7
sugar            = f"{number_classes}_021224" # suffix appended to the model folder name



In [None]:
# python3 rural_beauty/rural_beauty/get_data_for_training.py DE beauty all_pixels
get_data_for_training.main(country=country, target_variable=target_variable, sampling_method=sampling_method)

All files exist
Extracting beauty's raster values


Extracting explanatory raster values: 100%|██████████| 68/68 [01:25<00:00,  1.26s/it]


Coordinate file written to /h/u145/hofer/MyDocuments/Granular/beauty/data/models/__extracted_points/DE/beauty/random_pixels/coords.csv
Outcome file written to /h/u145/hofer/MyDocuments/Granular/beauty/data/models/__extracted_points/DE/beauty/random_pixels/outcome.csv
Predictors file written to /h/u145/hofer/MyDocuments/Granular/beauty/data/models/__extracted_points/DE/beauty/random_pixels/predictors.csv
Feature path json written to /h/u145/hofer/MyDocuments/Granular/beauty/data/models/__extracted_points/DE/beauty/random_pixels/feature_paths.json
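
The four files listed above can be inspected before training as a quick sanity check. The following is a minimal sketch using only the standard library. The helper name `summarize_extraction` is hypothetical and not part of `rural_beauty`. It assumes that each CSV file starts with a header row and that `feature_paths.json` holds a JSON object or list, neither of which the log output confirms.

```python
import csv
import json
from pathlib import Path


def summarize_extraction(folder):
    """Return header and row count for each CSV, plus the number of JSON entries.

    Hypothetical helper: the file names come from the extraction log,
    the assumed file layout (header row, JSON container) does not.
    """
    folder = Path(folder)
    summary = {}
    for name in ("coords.csv", "outcome.csv", "predictors.csv"):
        with open(folder / name, newline="") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])        # assumes the first row is a header
            n_rows = sum(1 for _ in reader)  # remaining rows are counted as data
        summary[name] = {"columns": header, "rows": n_rows}
    with open(folder / "feature_paths.json") as handle:
        summary["feature_paths.json"] = {"entries": len(json.load(handle))}
    return summary
```

If the three CSV files are row-aligned, which is an assumption here, their reported row counts should match.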


In [None]:
# python3 rural_beauty/rural_beauty/training_model.py DE beauty XGB all_pixels asis 7 7_123456
training_model.main(country          = country,
                    target_variable  = target_variable,
                    model_class      = model_class,
                    sampling_method  = sampling_method,
                    class_balance    = class_balance,
                    sugar            = sugar,
                    number_classes   = number_classes)

Model Accuracy:      0.78
Model F1:            0.77
Model Kendall's Tau: 0.83
Confusion matrix saved to: /h/u145/hofer/MyDocuments/Granular/beauty/data/models/DE__beauty__random_pixels__XGB__asis__7_021224/confusion_matrix.png


<Figure size 800x600 with 0 Axes>
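
The extraction and training cells each note a command-line equivalent in a comment. As a rough sketch, those commands can be assembled from the same parameters used in this notebook. The function names below are hypothetical, the script paths and argument order are copied from those comments, and nothing here confirms how the scripts actually parse their arguments.

```python
def extraction_command(country, target_variable, sampling_method):
    """Argument list mirroring the get_data_for_training.py comment."""
    return [
        "python3",
        "rural_beauty/rural_beauty/get_data_for_training.py",
        country,
        target_variable,
        sampling_method,
    ]


def training_command(country, target_variable, model_class, sampling_method,
                     class_balance, number_classes, sugar):
    """Argument list mirroring the training_model.py comment (order assumed from it)."""
    return [
        "python3",
        "rural_beauty/rural_beauty/training_model.py",
        country,
        target_variable,
        model_class,
        sampling_method,
        class_balance,
        str(number_classes),  # command-line arguments must be strings
        sugar,
    ]
```

Either list could be passed to `subprocess.run(cmd, check=True)`, assuming the working directory is the one the script paths in the comments are relative to.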

In [6]:
# the prediction function takes a model folder (as created by the training function)
model_basename = "__".join([country, target_variable, sampling_method, model_class, class_balance, sugar])
model_folder   = models_dir / model_basename

predict_generic.main(model_folder)

Finished writing the prediction to /h/u145/hofer/MyDocuments/Granular/beauty/data/models/DE__beauty__random_pixels__XGB__asis__7_021224/prediction.tif
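
According to the log messages in this notebook, training writes `confusion_matrix.png` and prediction writes `prediction.tif` into the model folder. A small check like the one below can confirm that both exist after a run. This is only a sketch: `check_model_folder` is a hypothetical helper, and the model folder may well contain other files that it does not look for.

```python
from pathlib import Path

# file names taken from the log messages in this notebook
EXPECTED_FILES = ("confusion_matrix.png", "prediction.tif")


def check_model_folder(model_folder):
    """Map each expected file name to whether it exists in the model folder."""
    folder = Path(model_folder)
    return {name: (folder / name).is_file() for name in EXPECTED_FILES}
```

For example, `check_model_folder(model_folder)` returns a dictionary whose values are all `True` when both files are present.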
