# Run workflow (pipeline)

## CSV to pickle
```bash
python -m src.preprocessing.csv2pkl \
    --input_dir data/raw/ \
    --output_dir data/processed_pickle/ \
    --cargo_tankers_only \
    --run_name csv2pkl
```

---

## Map_reduce
### V2
```bash
python -m src.preprocessing.map_reduce \
    --input_dir data/processed_pickle/ \
    --output_dir data/processed/map_reduced/ \
    --num_workers 0 \
    --run_name map_reduce
```

---

## Train test split
```bash
python -m src.preprocessing.train_test_split \
    --data_dir data/processed/map_reduced/ \
    --val_size 0.1 \
    --test_size 0.1 \
    --random_state 42
```
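With `--val_size 0.1` and `--test_size 0.1`, the remaining 80% of the data is left for training. The sketch below shows one way a seeded three-way split with those fractions could work. It is not the code in `src.preprocessing.train_test_split`: the function name `split_items` and the idea of splitting a flat list of item names are assumptions made for illustration.

```python
import random


def split_items(items, val_size=0.1, test_size=0.1, random_state=42):
    """Shuffle items with a fixed seed and cut them into train/val/test lists.

    Hypothetical helper, not part of the repository.
    """
    shuffled = sorted(items)  # sort first so the result does not depend on input order
    random.Random(random_state).shuffle(shuffled)
    n_val = round(len(shuffled) * val_size)
    n_test = round(len(shuffled) * test_size)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Made-up item names, used only to show the resulting sizes.
train, val, test = split_items([f"trip_{i:03d}" for i in range(100)])
print(len(train), len(val), len(test))  # 80 10 10
```

Fixing `random_state` makes the split reproducible, so that evaluation runs on the same test set every time.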


---

## Train Models
### V3
```bash
# TPtrans
python -m src.train.train_tptrans_transformer --config configs/traj_tptrans.yaml
# Traisformer
python -m src.train.train_traisformer --config configs/traj_traisformer.yaml
# Kalman: no training command is listed here (the Kalman evaluation below takes no --ckpt)
```

---


## Evaluate model

Run `src.eval.evaluate_trajectory`, then open the generated `map_model.html` file to view the interactive map.

#### Example Local
```bash
python -m src.eval.evaluate_trajectory \
  --split_dir data/processed/map_reduced/test \
  --ckpt data/checkpoints/tptrans_medium/traj_tptrans_delta.pt,data/checkpoints/traisformer_small/traj_traisformer.pt \
  --model tptrans,traisformer,kalman \
  --out_dir data/figures/results/all_models_80 \
  --pred_cut 80 --folium --same_pic --collect \
  --samples 1 --temperature 0 --top_k 20 --mmsi 212801000,215933000,218615000,230617000,244554000,248891000,250005981,255802840,305575000,305643000,352005235,636015943,636022355
```

#### Example HPC, all MMSIs
```bash
python -m src.eval.evaluate_trajectory \
  --split_dir /dtu/blackhole/10/178320/preprocessed_2/final/test \
  --ckpt data/checkpoints/tptrans_medium/traj_tptrans_delta.pt,data/checkpoints/traisformer_small/traj_traisformer.pt \
  --model tptrans,traisformer,kalman \
  --out_dir /dtu/blackhole/10/178320/results/all_models_80 \
  --pred_cut 80 --folium --same_pic --no_plots --collect \
  --samples 1 --temperature 0 --top_k 20
```

#### Example HPC, selected MMSIs
```bash
python -m src.eval.evaluate_trajectory \
  --split_dir /dtu/blackhole/10/178320/preprocessed_2/final/test \
  --ckpt data/checkpoints/tptrans_medium/traj_tptrans_delta.pt,data/checkpoints/traisformer_new/traj_traisformer.pt \
  --model tptrans,traisformer,kalman \
  --out_dir /zhome/2b/a/177038/AIS-MDA/results/all_models_80_subset \
  --pred_cut 80 --folium --same_pic --collect \
  --samples 1 --temperature 0 --top_k 20 --mmsi 205011000,205465000,205770000,207138000,209531000,209577000,209882000,209955000,210185000,210935000,211839040,211876190,212138000,212491000,215035000,215060000,215116000,215207000,215209000,215221000,215238000,215378000,215382000,215654000,215698000
```
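Long `--mmsi` lists like the one above are easy to mistype by hand. As a rough sketch, the command could instead be assembled in Python from plain lists. The helper name `build_eval_command` is hypothetical and not part of the repository; it only uses flags that appear in the examples in this section, and it builds the argument list without running anything.

```python
import shlex


def build_eval_command(split_dir, out_dir, models, ckpts=(), mmsis=(), pred_cut=80):
    """Assemble the argv for src.eval.evaluate_trajectory (hypothetical helper)."""
    cmd = [
        "python", "-m", "src.eval.evaluate_trajectory",
        "--split_dir", split_dir,
        "--model", ",".join(models),
        "--out_dir", out_dir,
        "--pred_cut", str(pred_cut),
    ]
    if ckpts:
        # Checkpoints are passed as one comma-separated value, as in the examples.
        cmd += ["--ckpt", ",".join(ckpts)]
    if mmsis:
        # MMSIs are joined the same way; omitting them evaluates every vessel.
        cmd += ["--mmsi", ",".join(str(m) for m in mmsis)]
    return cmd


cmd = build_eval_command(
    split_dir="data/processed/map_reduced/test",
    out_dir="data/figures/results/all_models_80",
    models=["tptrans", "traisformer", "kalman"],
    ckpts=[
        "data/checkpoints/tptrans_medium/traj_tptrans_delta.pt",
        "data/checkpoints/traisformer_small/traj_traisformer.pt",
    ],
    mmsis=[212801000, 215933000],
)
print(shlex.join(cmd))  # copy-pasteable shell command
```

The returned list can be passed to `subprocess.run(cmd, check=True)` if the evaluation should be launched from a script.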


#### All plots, HPC local (beware of the HPC storage limit!)
```bash
python -m src.eval.evaluate_trajectory \
  --split_dir /dtu/blackhole/10/178320/preprocessed_2/final/test \
  --ckpt data/checkpoints/tptrans_medium/traj_tptrans_delta.pt,data/checkpoints/traisformer_small/traj_traisformer.pt \
  --model tptrans,traisformer,kalman \
  --out_dir /zhome/2b/a/177038/AIS-MDA/results/all_models_80_all \
  --pred_cut 80 --folium --same_pic --collect \
  --samples 1 --temperature 0 --top_k 20
```
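Because this run writes plots for every vessel, it may be worth checking free space first. The snippet below is a minimal sketch using the standard library; the helper name `has_free_space` and the 5 GB threshold are arbitrary assumptions. Note that `shutil.disk_usage` reports free space on the filesystem, which may not reflect a per-user quota on the cluster.

```python
import shutil


def has_free_space(path, min_free_gb=5.0):
    """Return True if the filesystem holding `path` has at least min_free_gb free.

    Hypothetical helper; the threshold is an assumption, not a project setting.
    """
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= min_free_gb


# Check the current directory; replace "." with the intended --out_dir parent.
if not has_free_space(".", min_free_gb=5.0):
    print("Low disk space: consider --no_plots or a larger output volume.")
```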

#### TPtrans
```bash
python -m src.eval.evaluate_trajectory \
  --split_dir data/processed/map_reduced/test \
  --ckpt data/checkpoints/tptrans_medium/traj_tptrans_delta.pt \
  --model tptrans \
  --out_dir data/figures/results/eval_tptrans_medium_80 \
  --pred_cut 80 \
  --folium \
  --same_pic \
  --collect
```

#### Traisformer
```bash
python -m src.eval.evaluate_trajectory \
  --split_dir data/processed/map_reduced/test \
  --ckpt data/checkpoints/traisformer_new_small/traj_traisformer.pt \
  --model traisformer \
  --out_dir data/figures/results/eval_traisformer_80_small_temp0_topk20 \
  --pred_cut 80 --same_pic --folium \
  --samples 1 --temperature 0 --top_k 20 \
  --prevent_stuck --collect
```

#### Kalman filter
```bash
python -m src.eval.evaluate_trajectory \
  --split_dir data/processed/map_reduced/test  \
  --model kalman \
  --out_dir data/figures/results/eval_kalman_80 \
  --pred_cut 80 \
  --same_pic --folium --collect 
```

#### All Models 
```bash
python -m src.eval.evaluate_trajectory \
  --split_dir data/processed/map_reduced/test \
  --ckpt data/checkpoints/tptrans_medium/traj_tptrans_delta.pt,data/checkpoints/traisformer_small/traj_traisformer.pt \
  --model tptrans,traisformer,kalman \
  --out_dir data/figures/results/all_models_test_80 \
  --pred_cut 80 --folium --same_pic --collect \
  --samples 1 --temperature 0 --top_k 20 
```

#### src.eval.evaluate_trajectory Usage:
```bash
evaluate_trajectory.py [-h] --split_dir SPLIT_DIR --out_dir OUT_DIR --model MODEL
                              [--ckpt CKPT] [--lat_min LAT_MIN] [--lat_max LAT_MAX]
                              [--lon_min LON_MIN] [--lon_max LON_MAX] [--speed_max SPEED_MAX]
                              [--plot_lat_min PLOT_LAT_MIN] [--plot_lat_max PLOT_LAT_MAX]
                              [--plot_lon_min PLOT_LON_MIN] [--plot_lon_max PLOT_LON_MAX]
                              [--past_len PAST_LEN] [--pred_cut PRED_CUT] [--pred_len PRED_LEN]
                              [--cap_future CAP_FUTURE] [--horizon HORIZON]
                              [--pred_scale PRED_SCALE] [--samples SAMPLES]
                              [--temperature TEMPERATURE] [--top_k TOP_K] [--prevent_stuck]
                              [--no_plots] [--same_pic] [--folium] [--collect] [--mmsi MMSI]
                              [--cpu] [--dpi DPI]
```

#### HPC Usage:

On the HPC, remember to point `--split_dir` at the DTU blackhole path:
```bash
  --split_dir /dtu/blackhole/10/178320/preprocessed_2/final/test \
```

## Metrics

```python
import pandas as pd

# Full test set: metrics CSV and predictions pickle written by the evaluation run
full_csv = "/Users/alexanderschiotz/Desktop/DTU/Master/Deep Learning/projects/ais-mda/data/results/all_models_80_all/metrics_tptrans_traisformer_kalman.csv"
full_pkl = "/Users/alexanderschiotz/Desktop/DTU/Master/Deep Learning/projects/ais-mda/data/results/all_models_80_all/predictions_tptrans_traisformer_kalman.pkl"

metrics = pd.read_csv(full_csv)
predictions = pd.read_pickle(full_pkl)

print("\nFull Test Set Metrics:")
print(metrics.head(50))

print("\nFull Test Set Predictions:")
print(predictions)
```

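```python
# Drop the identifier columns so only metric columns remain.
# A new variable is used so the cell can be re-run without a KeyError.
metric_values = metrics.drop(columns=["mmsi", "trip"])

# Mean and median of each metric; numeric_only=True skips any non-numeric column
mean_metrics = metric_values.mean(numeric_only=True)
median_metrics = metric_values.median(numeric_only=True)

print("\nMean Metrics:")
print(mean_metrics)

print("\nMedian Metrics:")
print(median_metrics)
```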
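The mean and median above can also be collected into a single table with `DataFrame.agg`, which may be easier to read side by side. The sketch below runs on a small made-up frame; the column names `metric_a` and `metric_b` and all values are placeholders for illustration, not results from the project.

```python
import pandas as pd

# Toy data for illustration only; the real metric columns and values differ.
toy = pd.DataFrame({
    "mmsi": [212801000, 212801000, 215933000],
    "trip": [0, 1, 0],
    "metric_a": [1.0, 2.0, 6.0],
    "metric_b": [10.0, 20.0, 30.0],
})

# One row per statistic, one column per metric.
summary = toy.drop(columns=["mmsi", "trip"]).agg(["mean", "median"])
print(summary)
```

Applied to the real data, `toy` would be replaced by the `metrics` frame loaded earlier.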