# Changing Project Parameters

Before training a classifier, we often want to change the classifier type and other parameters.

On project creation, the classifier object is initialized with default values. Depending on our needs, we may want to change those parameters to more fitting values.

First, we need to point Python to the program folder. The path can be given as a relative path, as shown below, or as an absolute system path.
Then the module can be imported via the `import cloud_classifier` command.

In [15]:
import sys
sys.path.append('../cloud_classifier')
import cloud_classifier
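
If an absolute path is preferred, the relative location can be resolved first. The following is a minimal sketch, assuming the program folder sits at `../cloud_classifier` relative to the notebook; the variable name `program_dir` is only illustrative.

```python
import sys
from pathlib import Path

# Assumed location of the program folder; adjust it to your own checkout.
program_dir = Path("../cloud_classifier").resolve()

# Append the absolute path only once, so re-running the cell
# does not pile up duplicate entries in sys.path.
if str(program_dir) not in sys.path:
    sys.path.append(str(program_dir))
```

After this cell, `import cloud_classifier` works the same way as with the relative path, as long as the folder actually exists at that location.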

## Project creation
Let's first create a new project. This time we want to train a Decision Tree classifier, so we choose the project name accordingly.

In [16]:
cc = cloud_classifier.cloud_classifier()
cc.create_new_project(name="NewDecisionTreeClassifier", path="../classifiers")


Project folder created succefully!


The new classifier is automatically initialized with default parameters. Inside the project folder, a new folder called `settings` was created. It contains two files: `config.json`, which holds the type and parameters of the classifier we want to use, and `data_structure.json`, which describes the structure of our data files.

In [18]:
%%bash

cat ../classifiers/NewDecisionTreeClassifier/settings/config.json
cat ../classifiers/NewDecisionTreeClassifier/settings/data_structure.json

{    
    "classifier_type": "Forest",
    "max_depth": 35,
    "ccp_alpha": 0,
    "n_estimators": 100,
    "feature_preselection": false,
    "max_features": null,
    "min_samples_split" : 2,
    "merge_list" : [],
    "difference_vectors": true,
    "original_values": true,
    "samples": 100
}
{   
    "data_source_folder": "../data/full_dataset",
    "timestamp_length": 13,
    "sat_file_structure": "msevi-medi-TIMESTAMP.nc",
    "label_file_structure": "nwcsaf_msevi-medi-TIMESTAMP.nc",
    "input_source_folder": "../data/example_data",
    "georef_file" : "../data/auxilary_files/msevi-medi-georef.nc" ,
    "mask_file" : "../data/auxilary_files/lsm_mask_medi.nc",  
    "mask_key": "land_sea_mask",
    "mask_sea_coding": 0,
    "input_channels": [
        "bt062",
        "bt073",
        "bt087",
        "bt097",
        "bt108",
        "bt120",
        "bt134"
    ],
    "cloudtype_channel": "ct",
    "nwcsaf_in_version": "auto",
    "nwcsaf_out_version": "v2018",
    "hours": 

Those files can be examined and changed with a text editor. Alternatively, we can use the `set_project_parameters` command to change any parameter. To change the classifier to a Decision Tree, we set the parameter `classifier_type` to `Tree`.
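
Since the settings files are plain JSON, they can also be edited from a script instead of a text editor. The sketch below shows the idea on a temporary stand-in file holding a few of the default values from `config.json`. It works on the file directly and does not use the `cloud_classifier` API. Whether a running classifier object picks up such an edit without reloading the project is not covered here and should be treated as an open assumption.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for <project>/settings/config.json so the sketch runs anywhere.
settings_dir = Path(tempfile.mkdtemp()) / "settings"
settings_dir.mkdir(parents=True)
config_path = settings_dir / "config.json"
defaults = {"classifier_type": "Forest", "max_depth": 35, "n_estimators": 100}
config_path.write_text(json.dumps(defaults, indent=4))

# Load the settings, switch to a Decision Tree, and write the file back.
config = json.loads(config_path.read_text())
config["classifier_type"] = "Tree"
config_path.write_text(json.dumps(config, indent=4))

# Re-read the file to confirm that the change was stored.
updated = json.loads(config_path.read_text())
print(updated["classifier_type"])
```

For a real project, `config_path` would point to the `settings/config.json` file inside the project folder.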


## Changing parameters via the command line

One possibility is to use the command line to change parameters.

### Using program commands

Our de

In [4]:
file_1 = "../data/example_data/msevi-medi-20190317_1800.nc"
file_2 = "../data/example_data/msevi-medi-20190318_1100.nc"

cc.set_project_parameters(input_files = [file_1, file_2])

We now run the prediction pipeline (with the `run_prediction_pipeline()` method) which 
* applies the classifier to our input data and
* stores the predicted labels.

The option `create_filelist` is set to `False` so that the user-defined input file list is used.

In [5]:
cc.run_prediction_pipeline(create_filelist = False)

https://scikit-learn.org/stable/modules/model_persistence.html#security-maintainability-limitations


Classifier loaded!
Masked indices set!
Reference file found
Input vectors created!
Predicted Labels!
Labels saved as nwcsaf_msevi-medi-20190317_1800_predicted.nc
Input vectors created!
Predicted Labels!
Labels saved as nwcsaf_msevi-medi-20190318_1100_predicted.nc


### Using an Automatically Generated Input File List

Instead of defining it manually, the input file list can also be generated automatically.


The easiest way to do so is to put all input files into an input data folder (here it is set to `../data/example_data`) and tell the classifier where to look via the `input_source_folder` option.

In [6]:
%%bash

ls -l ../data/example_data

total 30120
-rw-rw-r-- 1 squidy squidy 14946418 Jun  4  2021 msevi-medi-20190317_1800.nc
-rw-rw-r-- 1 squidy squidy 15552552 Jun  4  2021 msevi-medi-20190318_1100.nc
-rw-rw-r-- 1 squidy squidy   155069 Jun  4  2021 nwcsaf_msevi-medi-20190317_1800.nc
-rw-rw-r-- 1 squidy squidy   178946 Jun  4  2021 nwcsaf_msevi-medi-20190318_1100.nc
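
The folder contains both satellite files and label files, so an automatic file list has to pick out only the former. The sketch below illustrates one way this could be done with the `sat_file_structure` and `timestamp_length` entries from `data_structure.json`. It is not the library's actual implementation, and it recreates the listing with empty placeholder files in a temporary folder so that it runs anywhere.

```python
import tempfile
from pathlib import Path

# Recreate the folder listing with empty placeholder files.
folder = Path(tempfile.mkdtemp())
for name in [
    "msevi-medi-20190317_1800.nc",
    "msevi-medi-20190318_1100.nc",
    "nwcsaf_msevi-medi-20190317_1800.nc",
    "nwcsaf_msevi-medi-20190318_1100.nc",
]:
    (folder / name).touch()

# Values taken from data_structure.json.
sat_file_structure = "msevi-medi-TIMESTAMP.nc"
timestamp_length = 13

# Turn the file structure into a glob pattern:
# every timestamp character becomes a single-character wildcard.
pattern = sat_file_structure.replace("TIMESTAMP", "?" * timestamp_length)

# Only names that match the whole pattern are kept,
# so the "nwcsaf_..." label files are left out.
input_files = sorted(str(p) for p in folder.glob(pattern))
print([Path(f).name for f in input_files])
```

The timestamp `20190317_1800` has exactly 13 characters, which matches the `timestamp_length` setting.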


In [7]:
cc.set_parameters(input_source_folder = "../data/example_data")

In the next step, we let the classifier predict labels for the input files in this folder.
This is again done with the `run_prediction_pipeline()` method.

This time we want the classifier to generate the list of input files automatically, so we set the option `create_filelist` to `True`.

In [8]:
cc.run_prediction_pipeline(create_filelist = True)

Input filelist created!
Classifier loaded!
Masked indices set!
Reference file found


https://scikit-learn.org/stable/modules/model_persistence.html#security-maintainability-limitations


Input vectors created!
Predicted Labels!
Labels saved as nwcsaf_msevi-medi-20190318_1100_predicted.nc
Input vectors created!
Predicted Labels!
Labels saved as nwcsaf_msevi-medi-20190317_1800_predicted.nc


## Accessing predicted labels

The predicted labels are stored in the folder of the classifier we are using. They are located in the subfolder `labels`.

In [9]:
%%bash

ls ../classifiers/TreeClassifier/labels

nwcsaf_msevi-medi-20190317_1800_predicted.nc
nwcsaf_msevi-medi-20190318_1100_predicted.nc
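
Judging from the log output and the listing, each label file is named after its input file, with an `nwcsaf_` prefix and a `_predicted` suffix. The helper below is a hypothetical convenience function built on that observation. It is not part of `cloud_classifier`, and the naming rule is only inferred from the file names shown in this tutorial.

```python
from pathlib import Path

def predicted_label_name(input_file: str) -> str:
    """Guess the label file name that belongs to an input file.

    The naming rule is inferred from the tutorial output, not from the library.
    """
    stem = Path(input_file).stem  # file name without folder and ".nc"
    return f"nwcsaf_{stem}_predicted.nc"

# Build the expected location inside the classifier's "labels" subfolder.
labels_dir = Path("../classifiers/TreeClassifier/labels")
label_file = labels_dir / predicted_label_name(
    "../data/example_data/msevi-medi-20190317_1800.nc"
)
print(label_file.name)
```

Such a helper makes it easy to look up the predicted labels that belong to a given satellite file.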
