# 15.4 - Training VoteNet and Performing Predictions

# Training

Like many other neural network implementations, VoteNet provides a script to train the network. However, this script has to be modified so that it runs in the environment of the server being used and accepts the ISPRS data as input.

Open the "train.py" file within the votenet main folder.

Below the comment "# Create Dataset and Dataloader", you will find two sections which, depending on the call arguments of the script, create a dataset object of the respective dataset class for either the SUN RGB-D or the ScanNet data. Take the ScanNet section (starting with "elif FLAGS.dataset == 'scannet':") as a template, **copy and paste it right before the "else:" statement, and adapt it for the ISPRS data**. Replace the dataset flag with the string you want to use when calling the Python train script (e.g. "isprs"), provide the name of the folder where the dataset class and the dataset config class are located ("ISPRS"), and use the names of the dataset class (ISPRSDetectionVotesDataset) and the dataset configuration class (ISPRSDatasetConfig). Set the parameter use_color to False, because the ISPRS dataset provides no color information.

So that you can check your changes, the corresponding code is provided in the code cell below. It is better, however, to make the changes by hand instead of just copying and pasting.

In [None]:
elif FLAGS.dataset == 'isprs':
    sys.path.append(os.path.join(ROOT_DIR, 'ISPRS'))
    from ISPRS_detection_dataset import ISPRSDetectionVotesDataset, MAX_NUM_OBJ
    from model_util_ISPRS import ISPRSDatasetConfig
    DATASET_CONFIG = ISPRSDatasetConfig()
    TRAIN_DATASET = ISPRSDetectionVotesDataset('train', num_points=NUM_POINT,
        augment=False,
        use_color=False, use_height=(not FLAGS.no_height))
    TEST_DATASET = ISPRSDetectionVotesDataset('val', num_points=NUM_POINT,
        augment=False,
        use_color=False, use_height=(not FLAGS.no_height))

Normally, these would be the only changes needed to make VoteNet work with the ISPRS dataset classes. However, the VoteNet implementation also uses TensorFlow's TensorBoard visualizer (although VoteNet itself is implemented in PyTorch), and the visualizer is not usable on the server in this form. Therefore, all references to the visualizer need to be commented out.

**Find all lines of code that use "TfVisualizer" or the variables named "TRAIN_VISUALIZER" and "TEST_VISUALIZER", and comment them out (by placing the #-symbol at the beginning of the line).** There should be 7 lines of code that need to be commented out.
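
To double-check that no reference was missed, a small helper can list the lines that still mention the visualizer. This is a minimal sketch and not part of VoteNet: the function name `find_visualizer_lines` is hypothetical, and the sample text is a made-up fragment for illustration, not the real content of train.py.

```python
# Names whose usages need to be commented out in train.py.
VISUALIZER_NAMES = ("TfVisualizer", "TRAIN_VISUALIZER", "TEST_VISUALIZER")


def find_visualizer_lines(source_text):
    """Return (line_number, line) pairs that still reference the visualizer.

    Lines that are already commented out (first non-blank character is '#')
    are skipped, so an empty result means nothing is left to comment out.
    """
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        if any(name in stripped for name in VISUALIZER_NAMES):
            hits.append((number, line))
    return hits


# Made-up fragment for illustration only, not the actual train.py.
sample = "\n".join([
    "import torch",
    "from tf_visualizer import Visualizer as TfVisualizer",
    "# TRAIN_VISUALIZER = TfVisualizer(FLAGS, 'train')",
    "TEST_VISUALIZER = TfVisualizer(FLAGS, 'test')",
])
for number, line in find_visualizer_lines(sample):
    print(number, line)
```

To check the real file, read it with `open("train.py").read()` and pass the text to the function. After your edits, the returned list should be empty.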

Before you close the "train.py" file, take a look at the parameters that VoteNet accepts, read their descriptions, and check their default values. 

Alternatively, start the train.py script with the -h (help) argument to get the parameters.

In [1]:
!python votenet/train.py -h

usage: train.py [-h] [--model MODEL] [--dataset DATASET]
                [--checkpoint_path CHECKPOINT_PATH] [--log_dir LOG_DIR]
                [--dump_dir DUMP_DIR] [--num_point NUM_POINT]
                [--num_target NUM_TARGET] [--vote_factor VOTE_FACTOR]
                [--cluster_sampling CLUSTER_SAMPLING]
                [--ap_iou_thresh AP_IOU_THRESH] [--max_epoch MAX_EPOCH]
                [--batch_size BATCH_SIZE] [--learning_rate LEARNING_RATE]
                [--weight_decay WEIGHT_DECAY] [--bn_decay_step BN_DECAY_STEP]
                [--bn_decay_rate BN_DECAY_RATE]
                [--lr_decay_steps LR_DECAY_STEPS]
                [--lr_decay_rates LR_DECAY_RATES] [--no_height] [--use_color]
                [--use_sunrgbd_v2] [--overwrite] [--dump_results]

optional arguments:
  -h, --help            show this help message and exit
  --model MODEL         Model file name [default: votenet]
  --dataset DATASET     Dataset name. sunrgbd or scannet. [default: sunrgbd]
  --ch

As you can see, by default the network uses the SUN RGB-D dataset and 20,000 input points per patch, and trains for 180 epochs with a batch size of 8.

To train VoteNet with the ISPRS data, it is best to open a terminal, change the current directory to the "votenet" directory, and run the train.py script with the following arguments.

In [None]:
CUDA_VISIBLE_DEVICES=7 python train.py --dataset isprs  --log_dir log_isprs --num_point 100000 --max_epoch 2 --batch_size 1 --lr_decay_steps '160, 240, 320'

Note that the number of epochs is set to 2, which is not enough to train the network effectively. You could run it for more epochs, but it would take a few hours or even days to get good results. In our experience, 400 epochs with the provided dataset give quite good results. (However, we only tested this on the training data itself, which does not yet allow a conclusive statement for unseen data!)

**The following arguments are the more important ones for running the train script:**

- The preceding "CUDA_VISIBLE_DEVICES=" allows you to define which GPU(s) the training process sees. Please change the GPU id from 7 to a GPU that is not occupied. Use "nvidia-smi" on the terminal to check how the GPUs are currently occupied.

- The argument --log_dir specifies in which directory (here log_isprs) the checkpoint and the log is stored. The checkpoint contains all the weights of the network, and can be used to continue training the network, and for evaluation and for prediction. (It is basically a saved state of the network for future use.)

- The provided patches have 100,000 points each, which is specified with --num_point.

- The batch size is 1 because of memory limitations of the server itself, not of the GPU.

- The learning rate decay steps (--lr_decay_steps) are set to epochs 160, 240, and 320. The learning rate decay factor is 0.1 by default for all 3 steps. This means that at each of these 3 steps, the learning rate is reduced to 10% of its previous value (lr = lr * 0.1).
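
The effect of the decay steps can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not VoteNet's own code: the function name `learning_rate_at` is hypothetical, the starting learning rate of 0.001 is an assumed value (check the --learning_rate default in the help output), and it is assumed that a decay applies from the given epoch onward.

```python
def learning_rate_at(epoch, base_lr, decay_steps, decay_rates):
    """Apply every decay whose step epoch has already been reached."""
    lr = base_lr
    for step, rate in zip(decay_steps, decay_rates):
        if epoch >= step:
            lr *= rate
    return lr


# Decay steps as passed on the command line; base_lr is an assumed value.
base_lr = 0.001
decay_steps = [int(s) for s in "160, 240, 320".split(",")]
decay_rates = [0.1, 0.1, 0.1]

for epoch in (0, 159, 160, 240, 320):
    lr = learning_rate_at(epoch, base_lr, decay_steps, decay_rates)
    print(f"epoch {epoch:3d}: lr = {lr:.0e}")
```

With only 2 epochs, as in the command above, none of the decay steps is ever reached, so the learning rate stays at its starting value for the whole run.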

**Final notes on training:**
- Unfortunately, the server instance on the provided server will stop after a while when the connection with the client is lost (e.g. when you log out or close the web browser), and the running script stops with it. Because the training script regularly saves checkpoints, training can be restarted, but the client would still have to keep running for several days. In a non-virtual environment, you could just start the script, close the web browser window, and log in later to see how your training is going. Even if your terminal tab in JupyterLab is closed by accident, you can still bring it back to the foreground by clicking on the terminal session in the "Running Terminals and Kernels" section on the left side of JupyterLab (under the icon for the File Browser).
- An alternative is to start the script as a background process right away and just follow the training progress in the log files. But that does not work either in the virtual environment, which is automatically stopped when the connection is lost for some time.

# Evaluation

To evaluate the performance of the network and to make some predictions, VoteNet provides the "eval.py" script. Like the training script, the eval script needs to be adapted.

First, add the following code related to the ISPRS dataset after the ScanNet part and before the "else" part (which prints an error message in the next line). Since we currently do not have validation data, the "train" data is used here once more to demonstrate the network.

In [None]:
elif FLAGS.dataset == 'isprs':
    sys.path.append(os.path.join(ROOT_DIR, 'ISPRS'))
    from ISPRS_detection_dataset import ISPRSDetectionVotesDataset, MAX_NUM_OBJ
    from model_util_ISPRS import ISPRSDatasetConfig
    DATASET_CONFIG = ISPRSDatasetConfig()
#    TEST_DATASET = ISPRSDetectionVotesDataset('val', num_points=NUM_POINT,
    TEST_DATASET = ISPRSDetectionVotesDataset('train', num_points=NUM_POINT,
        augment=False,
        use_color=False, use_height=(not FLAGS.no_height))

Although all data patches will be evaluated, only 1 patch is written as output. Strictly speaking, it is a complete batch that is written to files and not a patch. But with a batch size of 1, the batch id equals the patch id, which makes it easier to target a specific patch for output.

You can change the batch_idx to a patch that might interest you. Find the following line and change the id of 0 to any other number, like 380.

In [None]:
        if batch_idx == 0:
            MODEL.dump_results(end_points, DUMP_DIR, DATASET_CONFIG)
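
The relation between batch id and patch id can be sketched as follows. This is an illustration only: the function name `patch_ids_in_batch` is hypothetical, and it assumes that the patches are loaded in order (no shuffling) and that every batch is full.

```python
def patch_ids_in_batch(batch_idx, batch_size):
    """Dataset indices of the patches in a batch, assuming no shuffling."""
    start = batch_idx * batch_size
    return list(range(start, start + batch_size))


# Batch size 1: the batch holds exactly one patch with the same id.
print(patch_ids_in_batch(380, 1))
# Larger batch size: the batch id no longer identifies a single patch.
print(patch_ids_in_batch(380, 8))
```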

Save the "eval.py" script file and run it within the terminal window from the "votenet" directory with the following command.

In [None]:
CUDA_VISIBLE_DEVICES=7 python eval.py --dataset isprs --checkpoint_path log_isprs/checkpoint.tar --dump_dir eval_isprs --cluster_sampling seed_fps --use_3d_nms --use_cls_nms --per_class_proposal --num_point 100000 --batch_size 1

**Caution: The command in the above code cell is very long and it must be entered as a single line. If you copy & paste it with a line break, then your command might not be complete and you will get an error message.** It is safer to paste it into a text editor first, make sure it is a one-liner, remove any line breaks, and then copy it into the terminal window.

**The following arguments are the more important ones for running the eval script:**

- The argument --checkpoint_path specifies the checkpoint file to load, for example "log_isprs/checkpoint.tar" for the checkpoint of your trained model.
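
If you prefer not to clean up the command by hand, the line breaks can also be removed with a few lines of Python. This is a minimal sketch: the helper name `to_single_line` is hypothetical, and the pasted text is a shortened version of the command with a line break added on purpose. Note that it collapses all runs of whitespace, so it is not suitable for commands that contain quoted arguments with several consecutive spaces.

```python
import shlex


def to_single_line(command_text):
    """Collapse line breaks and repeated whitespace into single spaces."""
    return " ".join(command_text.split())


# Shortened command with an accidental line break, for illustration only.
pasted = """CUDA_VISIBLE_DEVICES=7 python eval.py --dataset isprs
--checkpoint_path log_isprs/checkpoint.tar --dump_dir eval_isprs"""

one_liner = to_single_line(pasted)
print(one_liner)
# shlex.split shows the individual arguments the shell would see.
print(shlex.split(one_liner))
```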

- The argument --dump_dir specifies where the output files are written to.

Check the "eval.py" file for the other arguments or get the help message from the script itself with the following command.

In [3]:
!python votenet/eval.py -h

usage: eval.py [-h] [--model MODEL] [--dataset DATASET]
               [--checkpoint_path CHECKPOINT_PATH] [--dump_dir DUMP_DIR]
               [--num_point NUM_POINT] [--num_target NUM_TARGET]
               [--batch_size BATCH_SIZE] [--vote_factor VOTE_FACTOR]
               [--cluster_sampling CLUSTER_SAMPLING]
               [--ap_iou_thresholds AP_IOU_THRESHOLDS] [--no_height]
               [--use_color] [--use_sunrgbd_v2] [--use_3d_nms] [--use_cls_nms]
               [--use_old_type_nms] [--per_class_proposal] [--nms_iou NMS_IOU]
               [--conf_thresh CONF_THRESH] [--faster_eval] [--shuffle_dataset]

optional arguments:
  -h, --help            show this help message and exit
  --model MODEL         Model file name [default: votenet]
  --dataset DATASET     Dataset name. sunrgbd or scannet. [default: sunrgbd]
  --checkpoint_path CHECKPOINT_PATH
                        Model checkpoint path [default: None]
  --dump_dir DUMP_DIR   Dump dir to save sample outputs [default: N

In the output folder (eval_isprs), you will find quite a number of files. CloudCompare should be able to load and visualize all of them. The most interesting ones are the input point cloud (\_pc.ply) and the predicted bounding boxes with high confidence, both before (\_pred\_confident\_bbox.ply) and after (\_pred\_confident\_nms\_bbox.ply) non-maximum suppression. There are also files with the ground truth data (\_gt\_), which are useful for comparison with the predicted bounding boxes.
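
To get a quick overview of the dump folder before opening CloudCompare, the files can be grouped by their name endings. This is a minimal sketch: the helper name `group_dump_files` is hypothetical, and only the three endings mentioned above are considered; everything else in the folder is ignored.

```python
import os
from collections import defaultdict

# File name endings mentioned in the text; the part before them is the id.
SUFFIXES = (
    "_pred_confident_nms_bbox.ply",
    "_pred_confident_bbox.ply",
    "_pc.ply",
)


def group_dump_files(file_names):
    """Map each known ending to the sorted file names that end with it."""
    groups = defaultdict(list)
    for name in sorted(file_names):
        for suffix in SUFFIXES:
            if name.endswith(suffix):
                groups[suffix].append(name)
                break
    return dict(groups)


dump_dir = "eval_isprs"  # the --dump_dir used in the eval command
if os.path.isdir(dump_dir):
    for suffix, names in group_dump_files(os.listdir(dump_dir)).items():
        print(suffix, names)
```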

Depending on how many epochs you trained, the results look more or less promising.

A pre-trained model is provided in the coursematerial folder. It can be used for evaluation and for a visual comparison with the model that you trained. Do one more evaluation run with the pre-trained model and compare its evaluation metrics with those of the model you trained. Also check out the outputs of this model and compare them with the outputs of your own model.

In [None]:
CUDA_VISIBLE_DEVICES=7 python eval.py --dataset isprs --checkpoint_path /home/jovyan/coursematerial/GIS/ISPRS/VoteNet/pretrained_model/checkpoint.tar --dump_dir eval_isprs --cluster_sampling seed_fps --use_3d_nms --use_cls_nms --per_class_proposal --num_point 100000 --batch_size 1

When comparing a model trained for only a few dozen epochs (or even 100 or 200 epochs) with the provided pre-trained model that was trained for 400 epochs, you should see a huge decrease in the mean overall loss (e.g. from 600 down to 125) and an increase in the mean average precision to as high as 0.82 (at an IoU threshold of 0.25) and 0.53 (at an IoU threshold of 0.50).

**But one has to remember that the evaluation was performed on the training data! So, the metrics do not say anything about how the network performs on unseen data. The model might be completely overfitted and might not predict anything meaningful at all for unseen data.**
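
The IoU values of 0.25 and 0.50 quoted above refer to the intersection over union of two bounding boxes, i.e. the volume of their overlap divided by the volume of their union. The following minimal sketch computes it for the simplified case of axis-aligned 3D boxes. The function name `iou_axis_aligned` and the box layout are assumptions made for this illustration; VoteNet's own evaluation code may treat boxes differently (e.g. rotated boxes).

```python
def iou_axis_aligned(box_a, box_b):
    """IoU of two axis-aligned 3D boxes.

    Each box is given as (xmin, ymin, zmin, xmax, ymax, zmax).
    """
    intersection = 1.0
    for axis in range(3):
        low = max(box_a[axis], box_b[axis])
        high = min(box_a[axis + 3], box_b[axis + 3])
        if high <= low:
            return 0.0  # no overlap along this axis
        intersection *= high - low

    def volume(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    union = volume(box_a) + volume(box_b) - intersection
    return intersection / union


# Two cubes with an edge length of 2, shifted by 1 along the x axis.
ground_truth = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
prediction = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)
print(iou_axis_aligned(ground_truth, prediction))
```

In this example the overlap volume is 4 and the union volume is 12, so the IoU is one third: the overlap exceeds a threshold of 0.25 but not one of 0.50. This illustrates why the reported precision is lower at the stricter threshold.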

This concludes the training and evaluation part of this exercise.