# Setup
## Enable GPUs for the notebook
- Navigate to Edit→Notebook Settings
- Select **GPU** from the Hardware Accelerator drop-down and **High-RAM** for Runtime Shape
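
After changing the setting, a quick check in a cell can confirm that the runtime really has a GPU attached. The following is a minimal sketch that only looks for the `nvidia-smi` tool and asks it to list devices; the helper name `gpu_available` is hypothetical and not part of this repository.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if `nvidia-smi` is on the PATH and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        # `nvidia-smi -L` prints one line per GPU, each starting with "GPU <index>:".
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU detected" if gpu_available() else "No GPU detected; check Notebook Settings")
```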

## Clone the repository

In [1]:
!git clone https://github.com/theresearchai/rail_transport_rescheduling_rl.git
%cd rail_transport_rescheduling_rl

Cloning into 'rail_transport_rescheduling_rl'...
remote: Enumerating objects: 103, done.
remote: Counting objects: 100% (103/103), done.
remote: Compressing objects: 100% (64/64), done.
remote: Total 4262 (delta 66), reused 69 (delta 39), pack-reused 4159
Receiving objects: 100% (4262/4262), 233.27 MiB | 25.36 MiB/s, done.
Resolving deltas: 100% (1925/1925), done.
Checking out files: 100% (4044/4044), done.
/content/rail_transport_rescheduling_rl


## Log in to W&B
If you don't have an account, sign up at https://wandb.ai/site.

Follow the instructions and paste your API key into the prompt.

In [None]:
!pip install wandb --upgrade 
!wandb login

## Connect Google Drive
This step is optional, but it is the easiest way to automatically save all the temporary files created in this project.

Follow the instructions to authorize access. Your Drive will then be available under `/content/gdrive`.

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')
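
Because mounting Drive is optional, it can help to pick a save location that works either way. The sketch below is an assumption about how one might do this, not something the repository provides: the helper `checkpoint_root` and the local fallback folder `checkpoints` are hypothetical, while `/content/gdrive/MyDrive/checkpoints` is the location used in the rollout examples later in this notebook.

```python
from pathlib import Path


def checkpoint_root(drive_mount="/content/gdrive", fallback="checkpoints"):
    """Return a checkpoint folder on Drive if it is mounted, else a local folder."""
    drive_dir = Path(drive_mount) / "MyDrive"
    if drive_dir.is_dir():
        # Drive is mounted: keep checkpoints where they survive a runtime reset.
        return drive_dir / "checkpoints"
    # Drive is not mounted: fall back to a folder relative to the working directory.
    return Path(fallback)


print(checkpoint_root())
```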

## Install packages

In [None]:
!pip install -r gpu_envs.txt

# Train Models

Run `!python train.py -f config.yaml` to train a model from the configuration file `config.yaml`, which can set any of the [common parameters](https://docs.ray.io/en/master/rllib-training.html#common-parameters) used by RLlib.
All configuration files to run the experiments can be found in `/content/rail_transport_rescheduling_rl/baselines`.
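
To see which configuration files are available before picking one, the folder can be listed from a cell. This is a minimal sketch using only the standard library; the helper name `list_configs` is hypothetical, and it assumes the current directory is the repository root (as after the `%cd` in the clone step).

```python
from pathlib import Path


def list_configs(root="baselines"):
    """Return all YAML files under `root` as sorted path strings."""
    root_path = Path(root)
    if not root_path.is_dir():
        # Nothing to list if the folder does not exist.
        return []
    return sorted(str(path) for path in root_path.rglob("*.yaml"))


for config_path in list_configs():
    print(config_path)
```

Any printed path can be passed directly to `train.py` with `-f`.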

Colab provides only 1 GPU and 4 CPUs in **High-RAM** mode, so set the resources in the configuration file as follows.
```yaml
config:
  num_workers: 3
  num_gpus: 1
```
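
The values above follow from leaving one CPU for the trainer process and using the rest as rollout workers. A minimal sketch of that arithmetic, assuming this one-CPU reservation is the intent; the helper name `colab_resources` is hypothetical:

```python
import os


def colab_resources(num_cpus=None, num_gpus=1):
    """Reserve one CPU for the trainer process and use the rest as workers."""
    if num_cpus is None:
        # os.cpu_count() may return None, so fall back to a single CPU.
        num_cpus = os.cpu_count() or 1
    return {"num_workers": max(num_cpus - 1, 0), "num_gpus": num_gpus}


print(colab_resources(num_cpus=4))  # {'num_workers': 3, 'num_gpus': 1}
```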

Here are some examples of training using the [RLlib Baselines](https://flatland.aicrowd.com/research/baselines.html).

### APEX

In [None]:
!python train.py -f baselines/action_masking_and_skipping/apex_tree_obs_small_v0.yaml #APEX

In [None]:
!python train.py -f baselines/action_masking_and_skipping/apex_tree_obs_small_v0_skip.yaml #APEX + Frame Skipping

### PPO

In [None]:
!python train.py -f baselines/action_masking_and_skipping/ppo_tree_obs_small_v0.yaml #PPO

In [None]:
!python train.py -f /content/rail_transport_rescheduling_rl/baselines/ccppo_tree_obs/ccppo.yaml  #Centralized Critic PPO

In [None]:
!python train.py -f baselines/action_masking_and_skipping/ppo_tree_obs_small_v0_skip.yaml #PPO + Frame Skipping

In [None]:
!python train.py -f baselines/action_masking_and_skipping/ppo_tree_obs_small_v0_mask.yaml #PPO + Action Masking

### Imitation Learning

1. Download the expert demonstrations provided by Flatland and convert them to an RLlib-compatible format. More details can be found in the [RLlib offline data documentation](https://docs.ray.io/en/releases-0.8.5/rllib-offline.html).

  This step has already been completed for this repository, so the cell below is commented out.


In [None]:
# %%bash
# cd imitation_learning/convert_demonstration
# wget https://s3.eu-central-1.wasabisys.com/aicrowd-flatland-challenge/expert-demonstrations.tgz
# tar zxvf expert-demonstrations.tgz
# python saving_experiences.py

2. Set the folder of converted expert experiences as `input`, and as `input_files` for the model, in the config file.

  ```yaml
  config:
    input: /content/rail_transport_rescheduling_rl/imitation_learning/convert_demonstration/
  ```

3. Mixed imitation learning requires a sampler ratio that sets the proportion of training data drawn from each source. The following example mixes 25% IL (expert demonstrations) with 75% APEX (experiences sampled from the environment).

  ```yaml
  config:
    input:
      /content/rail_transport_rescheduling_rl/imitation_learning/convert_demonstration/: 0.25
      sampler: 0.75
  ```
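
The two ratios in a mixed `input` setting should add up to 1. A minimal sketch of building that mapping from a single expert ratio follows; the helper name `mixed_input` is hypothetical, and the result is only the `input` value, not a full config.

```python
def mixed_input(expert_dir, expert_ratio):
    """Map the expert data folder and the live sampler to complementary ratios."""
    if not 0.0 <= expert_ratio <= 1.0:
        raise ValueError("expert_ratio must be between 0 and 1")
    # Whatever share is not drawn from expert data comes from the sampler.
    return {expert_dir: expert_ratio, "sampler": 1.0 - expert_ratio}


expert_dir = "/content/rail_transport_rescheduling_rl/imitation_learning/convert_demonstration/"
print(mixed_input(expert_dir, 0.25))
```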


In [None]:
!python ./train.py -f baselines/imitation_learning_tree_obs/marwil_tree_obs_all_beta.yaml #MARWIL

# python ./train.py -f baselines/imitation_learning_tree_obs/apex_il_tree_obs_all.yaml #APE-X IMITATION LEARNING (IL)

In [None]:
!python ./train.py -f baselines/imitation_learning_tree_obs/apex_il_tree_obs_25.yaml

In [None]:
!python ./train.py -f baselines/imitation_learning_tree_obs/apex_il_tree_obs_75.yaml

# Rollout Models

`--checkpoint`: path to saved checkpoints

`--cfile`: path to rollout map configuration file

If you do not have a rollout configuration file, you can use `--env` and `--config` instead to set up the rollout environment.

In [None]:
!python rollout.py --checkpoint /content/gdrive/MyDrive/checkpoints/apex-tree-obs-small-v0/APEX_flatland_sparse_0_2021-02-12_12-05-12j73wgw9_/checkpoint_500/checkpoint-500 --cfile /content/gdrive/MyDrive/checkpoints/small.yaml --run APEX --episodes 100

In [None]:
!python rollout.py --checkpoint /content/gdrive/MyDrive/checkpoints/apex-tree-obs-small-v0/APEX_flatland_sparse_0_2021-02-12_12-05-12j73wgw9_/checkpoint_500/checkpoint-500 --run APEX --episodes 5 --env 'flatland_sparse' --config '{"env_config": {"test": "true", "generator": "sparse_rail_generator", "generator_config": "small_v0", "observation": "tree", "observation_config": {"max_depth": 2, "shortest_path_max_depth": 30}}, "model": {"fcnet_activation": "relu", "fcnet_hiddens": [256, 256], "vf_share_layers": "True"}}' 
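
Writing the `--config` JSON by hand on one line is error-prone. As a sketch, the same command can be assembled in Python with `json.dumps` and `shlex.quote` so that the quoting is handled automatically. The helper name `build_rollout_command` is hypothetical; the flags and values are copied from the example above.

```python
import json
import shlex

# Environment and model settings copied from the rollout example above.
rollout_config = {
    "env_config": {
        "test": "true",
        "generator": "sparse_rail_generator",
        "generator_config": "small_v0",
        "observation": "tree",
        "observation_config": {"max_depth": 2, "shortest_path_max_depth": 30},
    },
    "model": {
        "fcnet_activation": "relu",
        "fcnet_hiddens": [256, 256],
        "vf_share_layers": "True",
    },
}


def build_rollout_command(checkpoint, run, episodes, env, config):
    """Return a shell-safe rollout command string (hypothetical helper)."""
    args = [
        "python", "rollout.py",
        "--checkpoint", checkpoint,
        "--run", run,
        "--episodes", str(episodes),
        "--env", env,
        "--config", json.dumps(config),  # serialize the dict to a JSON string
    ]
    # Quote every argument so spaces and quotes in the JSON survive the shell.
    return " ".join(shlex.quote(arg) for arg in args)


command = build_rollout_command(
    checkpoint="/content/gdrive/MyDrive/checkpoints/apex-tree-obs-small-v0/APEX_flatland_sparse_0_2021-02-12_12-05-12j73wgw9_/checkpoint_500/checkpoint-500",
    run="APEX",
    episodes=5,
    env="flatland_sparse",
    config=rollout_config,
)
print(command)
```

In a notebook cell, the resulting string can then be executed with `!{command}`.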
