# Run the Training step
This notebook provides step-by-step instructions on how to install the training module for tile-based classification and execute a training run to evaluate its performance.

> Note: Before proceeding, make sure to select the correct kernel. In the top-right corner of the notebook, choose the Jupyter kernel named `Bash`.

## Set up the environment

In [1]:
export WORKSPACE=/workspace/machine-learning-process
export RUNTIME=${WORKSPACE}/runs
mkdir -p ${RUNTIME}
cd ${RUNTIME}
printenv | grep RUNTIME
pwd

XDG_RUNTIME_DIR=/workspace/.local
RUNTIME=/workspace/machine-learning-process/runs
/workspace/machine-learning-process/runs
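
Note that `grep RUNTIME` matches any variable whose name or value contains that string, which is why `XDG_RUNTIME_DIR` also shows up in the output above. A minimal sketch of a stricter check follows; the fallback value for `WORKSPACE` is an assumption taken from the cell above.

```shell
# Assumption: fall back to the path used in this notebook when WORKSPACE is unset.
export WORKSPACE="${WORKSPACE:-/workspace/machine-learning-process}"
export RUNTIME="${WORKSPACE}/runs"

# printenv followed by a variable name prints only that variable's value.
printenv RUNTIME

# Anchoring the pattern restricts grep to the RUNTIME variable itself.
printenv | grep '^RUNTIME='
```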


## Create a hatch environment

The hatch environment provides a dedicated Python installation in which the dependencies of the `make-ml-model` step are installed. The commands below remove any existing environments and then create the `default` one.

In [16]:
cd ${WORKSPACE}/training/make-ml-model
hatch env prune
hatch env create default

[2K[32m.  [0m [1;35mCreating environment: default[0m0m
[2K[32m  .[0m [1;35mInstalling project in development mode[0mt mode[0m
[1A[2K[?25l[32m.  [0m [1;35mChecking dependencies[0m
[2K[32m   [0m [1;35mSyncing dependencies[0mencies[0m
[1A[2K


## Run the make-ml-model application 

First, display the command-line help:

In [17]:
hatch run default:tile-based-training --help

2025-05-08 11:23:01.789960: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2025-05-08 11:23:01.871866: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2025-05-08 11:23:01.971646: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1746703382.027739    6066 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1746703382.033355    6066 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1746703382.057401    6066 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linkin

The cell below prints `MLFLOW_TRACKING_URI`, which was defined as an environment variable when the code-server was deployed.

In [19]:
echo ${MLFLOW_TRACKING_URI} 

http://my-mlflow:5000


Now, run the `tile-based-training` command line tool with the parameters:

- stac_reference: https://raw.githubusercontent.com/eoap/machine-learning-process/main/training/app-package/EUROSAT-Training-Dataset/catalog.json
- BATCH_SIZE: 2 
- CLASSES: 10 
- DECAY: 0.1 
- EPOCHS: 50 
- EPSILON: 0.000001 
- LEARNING_RATE: 0.0001 
- LOSS: categorical_crossentropy 
- MEMENTUM: 0.95 
- OPTIMIZER: Adam 
- REGULARIZER: None 
- SAMPLES_PER_CLASS: 10
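
One way to keep such a parameter list and the actual command in sync is to hold the flags in a Bash array and expand it when calling the tool. The sketch below only prints the resulting command (a dry run); the array name `TRAINING_ARGS` is hypothetical, and the values mirror the command executed in this notebook.

```shell
# Hypothetical array holding the flags of the tile-based-training call.
TRAINING_ARGS=(
    --stac_reference "https://raw.githubusercontent.com/eoap/machine-learning-process/main/training/app-package/EUROSAT-Training-Dataset/catalog.json"
    --BATCH_SIZE 2
    --CLASSES 10
    --DECAY 0.1
    --EPOCHS 50
    --EPSILON 0.000001
    --LEARNING_RATE 0.0001
    --LOSS categorical_crossentropy
    --MEMENTUM 0.95
    --OPTIMIZER Adam
    --REGULARIZER None
    --SAMPLES_PER_CLASS 10
)

# Dry run: print the command instead of executing it.
echo hatch run default:tile-based-training "${TRAINING_ARGS[@]}"
```

Dropping the leading `echo` would execute the training with exactly these flags.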

Make sure your MLflow server is running before you launch the training.
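
A quick way to verify this from the notebook is to query the tracking server before starting the run. The sketch below assumes `curl` is installed and that the server exposes MLflow's `/health` endpoint; the function name `check_mlflow` is hypothetical.

```shell
# Hypothetical helper: succeed only if the MLflow tracking server responds.
check_mlflow() {
    if [ -z "${MLFLOW_TRACKING_URI:-}" ]; then
        echo "MLFLOW_TRACKING_URI is not set" >&2
        return 1
    fi
    # -f: fail on HTTP errors, -sS: stay quiet but still report errors,
    # --max-time: give up after 5 seconds instead of hanging.
    curl -fsS --max-time 5 "${MLFLOW_TRACKING_URI%/}/health" >/dev/null
}

if check_mlflow; then
    echo "MLflow is reachable at ${MLFLOW_TRACKING_URI}"
else
    echo "MLflow is not reachable; start it before training" >&2
fi
```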

In [25]:
hatch run default:tile-based-training \
    --stac_reference https://raw.githubusercontent.com/eoap/machine-learning-process/main/training/app-package/EUROSAT-Training-Dataset/catalog.json \
    --BATCH_SIZE 2 \
    --CLASSES 10 \
    --DECAY 0.1 \
    --EPOCHS 50 \
    --EPSILON 0.000001 \
    --LEARNING_RATE 0.0001 \
    --LOSS categorical_crossentropy \
    --MEMENTUM 0.95 \
    --OPTIMIZER Adam \
    --REGULARIZER None \
    --SAMPLES_PER_CLASS 10


2025-05-08 11:31:35.154409: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2025-05-08 11:31:35.158655: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2025-05-08 11:31:35.167985: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1746703895.184961    7697 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1746703895.190206    7697 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1746703895.203235    7697 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linkin


List the outputs:

In [12]:
tree mlruns

[0;33m.[0m
├── [0;33mconfig[0m
│   └── config.yaml
├── Dockerfile
├── [0;33moutput[0m
│   └── [0;33mlogs[0m
│       └── running_logs.log
├── params.yaml
├── pyproject.toml
└── [0;33msrc[0m
    ├── __init__.py
    └── [0;33mtile_based_training[0m
        ├── __about__.py
        ├── [0;33mcomponents[0m
        │   ├── data_ingestion.py
        │   ├── __init__.py
        │   ├── model_evaluation.py
        │   ├── prepare_base_model.py
        │   ├── [0;33m__pycache__[0m
        │   │   ├── data_ingestion.cpython-310.pyc
        │   │   ├── data_ingestion.cpython-311.pyc
        │   │   ├── data_ingestion.cpython-312.pyc
        │   │   ├── __init__.cpython-310.pyc
        │   │   ├── __init__.cpython-311.pyc
        │   │   ├── __init__.cpython-312.pyc
        │   │   ├── model_evaluation.cpython-310.pyc
        │   │   ├── model_evaluation.cpython-311.pyc
        │   │   ├── model_evaluation.cpython-312.pyc
        │   │   ├── prepare_base_model.cpython-310.pyc
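
Listings like the one above are dominated by `__pycache__` entries. A minimal sketch for a more compact view, assuming only `find` and `sort` are available (the function name `list_outputs` is hypothetical):

```shell
# Hypothetical helper: list files under a directory, skipping __pycache__.
list_outputs() {
    # -prune stops find from descending into __pycache__ directories;
    # every other regular file is printed, then sorted for stable output.
    find "${1:-.}" -name '__pycache__' -prune -o -type f -print | sort
}

# Only list mlruns when the directory actually exists.
if [ -d mlruns ]; then
    list_outputs mlruns
fi
```

If `tree` is preferred, its `-I` option excludes names matching a pattern, for example `tree -I '__pycache__' mlruns`.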
      

## Clean-up 

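In [None]:
# Remove the hatch environments and the local mlruns directory.
# (A preceding `exit` would end the shell session before rm could run.)
rm -fr "${RUNTIME}/envs" mlruns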