# Run workflow

## CSV to pickle
```bash
python -m src.preprocessing.csv2pkl --input_dir data/raw/ --output_dir data/processed_pickle/
```
---

## Map-reduce
### V2
```bash
python -m src.preprocessing.map_reduce_V2 --input_dir data/processed_pickle/ --temp_dir data/TEMP_DIR --final_dir data/map_reduced/
```
---

## Train test split
```bash
python -m src.preprocessing.train_test_split --data_dir data/map_reduced/ --val_size 0.1 --test_size 0.1 --random_state 42
```
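
The flags above suggest a reproducible three-way split. The sketch below shows one way such a split can work, assuming the items being split are independent units (for example trip files) and that `--val_size` and `--test_size` are fractions of the whole. The function `split_items` and the file names are hypothetical; the actual logic in `src.preprocessing.train_test_split` may differ.

```python
import random


def split_items(items, val_size=0.1, test_size=0.1, random_state=42):
    """Shuffle items reproducibly and cut them into train/val/test lists."""
    shuffled = sorted(items)  # fixed starting order, independent of input order
    random.Random(random_state).shuffle(shuffled)
    n_val = round(len(shuffled) * val_size)
    n_test = round(len(shuffled) * test_size)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Hypothetical trip files, used only to illustrate the proportions
trips = [f"trip_{i:03d}.pkl" for i in range(100)]
train, val, test = split_items(trips)
print(len(train), len(val), len(test))  # 80 10 10
```

Fixing the seed means the same items land in the same split on every run, which keeps evaluation results comparable across model changes.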
---

## Train Model
### V3
```bash
python -m src.train.train_traj_V3 --config configs/traj_tptrans.yaml
```

---
## Evaluate model
### V6
### TPTrans
1) Evaluate all MMSIs in the split

Plots, CSVs, and metrics are written under `--out_dir/<MMSI>/...`.
```bash
python -m src.eval.eval_traj_V6 \
  --split_dir data/map_reduced/test \
  --ckpt data/checkpoints/traj_tptrans.pt \
  --model tptrans \
  --horizon 12 \
  --past_len 64 \
  --pred_cut 80 \
  --lat_min 54.0 --lat_max 58.0 --lon_min 6.0 --lon_max 16.0 \
  --out_dir data/figures/tptrans_all_cut80 \
  --auto_extent \
  --match_distance \
  --style satellite --dpi 220
```



2) Evaluate a subset of MMSIs
```bash
python -m src.eval.eval_traj_V6 \
  --split_dir data/map_reduced/test \
  --ckpt data/checkpoints/traj_tptrans.pt \
  --model tptrans \
  --horizon 12 \
  --past_len 64 \
  --pred_cut 80 \
  --mmsi 209536000,209892000 \
  --lat_min 54.0 --lat_max 58.0 --lon_min 6.0 --lon_max 16.0 \
  --out_dir data/figures/tptrans_subset_cut80 \
  --auto_extent \
  --match_distance \
  --style satellite --dpi 220
```




3) Evaluate a single specific trip
```bash
python -m src.eval.eval_traj_V6 \
  --split_dir data/map_reduced/test \
  --ckpt data/checkpoints/traj_tptrans.pt \
  --model tptrans \
  --horizon 12 \
  --past_len 64 \
  --pred_cut 80 \
  --mmsi 209536000 \
  --lat_min 54.0 --lat_max 58.0 --lon_min 6.0 --lon_max 16.0 \
  --out_dir data/figures/tptrans_one_cut80 \
  --auto_extent \
  --match_distance \
  --style satellite --dpi 220
```

---

### TrAISformer

1) Evaluate all MMSIs in the split

```bash
python -m src.eval.eval_traj_V6 \
  --split_dir data/map_reduced/test \
  --ckpt data/checkpoints/traj_traisformer.pt \
  --model traisformer \
  --horizon 24 \
  --past_len 128 \
  --pred_cut 80 \
  --mmsi all \
  --out_dir data/figures/traisformer_v6_all_80 \
  --auto_extent \
  --samples 16 --temperature 0.8 --top_k 30 \
  --match_distance \
  --style satellite --dpi 220
```

2) Evaluate a subset of MMSIs

```bash
python -m src.eval.eval_traj_V6 \
  --split_dir data/map_reduced/test \
  --ckpt data/checkpoints/traj_traisformer.pt \
  --model traisformer \
  --horizon 24 \
  --past_len 128 \
  --pred_cut 80 \
  --mmsi 205482000,209184000 \
  --out_dir data/figures/traisformer_v6_80_test01 \
  --auto_extent \
  --samples 8 --temperature 0.8 --top_k 30 \
  --match_distance \
  --style satellite --dpi 220
```

3) Evaluate a single specific trip

```bash
python -m src.eval.eval_traj_V6 \
  --split_dir data/map_reduced/test \
  --ckpt data/checkpoints/traj_traisformer.pt \
  --model traisformer \
  --horizon 24 \
  --past_len 128 \
  --pred_cut 80 \
  --mmsi 205482000,209184000 \
  --out_dir data/figures/traisformer_v6_test01 \
  --auto_extent \
  --samples 24 --temperature 1.05 --top_k 60 \
  --match_distance \
  --style satellite --dpi 220
```


4) Example: tuned sampling parameters for a single MMSI
```bash
python -m src.eval.eval_traj_V6 \
    --split_dir data/map_reduced/test \
    --out_dir data/figures/traisformer_v6_tuned_01 \
    --ckpt data/checkpoints/traj_traisformer.pt \
    --model traisformer \
    --pred_cut 80 \
    --samples 24 --temperature 1.4 --top_k 160 \
    --lambda_cont 0.015 --alpha_dir 2.4 --beta_turn 0.40 --step_scale 1.15 \
    --auto_extent --match_distance \
    --mmsi 209982000 \
    --style satellite --dpi 220
```
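
The TrAISformer commands vary `--temperature` and `--top_k` between runs. As a rough illustration of what these two knobs usually mean, here is a minimal sketch of temperature-scaled top-k sampling over a list of logits. It assumes the flags follow the common definition; the evaluation script's own sampling code is not shown in this file, and `sample_top_k` is a hypothetical name.

```python
import math
import random


def sample_top_k(logits, temperature=0.8, top_k=30, rng=random):
    """Draw one index from the top_k largest logits after temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    k = max(1, min(top_k, len(logits)))
    # Indices of the k largest logits; everything else gets zero probability
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


logits = [0.1, 2.0, -1.0, 0.5]  # hypothetical scores for four candidate bins
print(sample_top_k(logits, temperature=0.8, top_k=2, rng=random.Random(0)))
```

A lower temperature concentrates probability on the highest-scoring candidates, while a larger `top_k` lets more candidates take part in the draw, so raising both tends to produce more varied trajectories.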

TPTrans run over all MMSIs in the split:

```bash
python -m src.eval.eval_traj_V6 \
  --split_dir data/map_reduced/test \
  --ckpt data/checkpoints/traj_tptrans.pt \
  --model tptrans \
  --horizon 12 \
  --past_len 64 \
  --pred_cut 80 \
  --lat_min 54.0 --lat_max 58.0 --lon_min 6.0 --lon_max 16.0 \
  --out_dir data/figures/tptrans_v6_80_all_01 \
  --auto_extent \
  --match_distance \
  --style satellite --dpi 220
```





MMSIs to analyze:
209982000,211686000,215933000,229673000,232008636,244575000,244632000,245176000,246443000,246606000,255769000,257182000,257207000,258648000,258656000,304964000,305937000,311945000,316015060,319130500,352003296,518999464,538005462,636015943,636018075,636018728,668116152
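
The same 27-MMSI list is pasted into each of the evaluation commands that follow. To avoid copy-paste drift, the argument list could be built once in Python. This is only a sketch: `MMSIS` and `build_eval_cmd` are hypothetical names, and the flags and paths are copied from the TPTrans commands in this file.

```python
import shlex

MMSIS = [
    209982000, 211686000, 215933000, 229673000, 232008636, 244575000, 244632000,
    245176000, 246443000, 246606000, 255769000, 257182000, 257207000, 258648000,
    258656000, 304964000, 305937000, 311945000, 316015060, 319130500, 352003296,
    518999464, 538005462, 636015943, 636018075, 636018728, 668116152,
]


def build_eval_cmd(out_dir, ckpt="data/checkpoints/traj_tptrans.pt", mmsis=MMSIS):
    """Return the eval command as an argument list for the given output directory."""
    return [
        "python", "-m", "src.eval.eval_traj_V6",
        "--split_dir", "data/map_reduced/test",
        "--ckpt", ckpt,
        "--model", "tptrans",
        "--horizon", "12", "--past_len", "64", "--pred_cut", "80",
        "--lat_min", "54.0", "--lat_max", "58.0",
        "--lon_min", "6.0", "--lon_max", "16.0",
        "--out_dir", out_dir,
        "--auto_extent", "--match_distance",
        "--style", "satellite", "--dpi", "220",
        "--mmsi", ",".join(str(m) for m in mmsis),
    ]


cmd = build_eval_cmd("data/figures/tptrans_v6_80_select_tuned_nhead10")
print(shlex.join(cmd))  # inspect first; subprocess.run(cmd, check=True) would execute it
```

Only `out_dir` and, where relevant, `ckpt` change between experiments, so each run differs from the others in exactly the arguments that are meant to differ.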


### 01 d_model basic
```bash
python -m src.eval.eval_traj_V6 \
  --split_dir data/map_reduced/test \
  --ckpt data/checkpoints/traj_tptrans.pt \
  --model tptrans \
  --horizon 12 \
  --past_len 64 \
  --pred_cut 80 \
  --lat_min 54.0 --lat_max 58.0 --lon_min 6.0 --lon_max 16.0 \
  --out_dir data/figures/tptrans_v6_80_select_tuned_d_model512_01 \
  --auto_extent \
  --match_distance \
  --style satellite --dpi 220 \
  --mmsi 209982000,211686000,215933000,229673000,232008636,244575000,244632000,245176000,246443000,246606000,255769000,257182000,257207000,258648000,258656000,304964000,305937000,311945000,316015060,319130500,352003296,518999464,538005462,636015943,636018075,636018728,668116152
```


### 02 encoding + decoding layers x2
```bash
python -m src.eval.eval_traj_V6 \
  --split_dir data/map_reduced/test \
  --ckpt data/checkpoints/traj_tptrans.pt \
  --model tptrans \
  --horizon 12 \
  --past_len 64 \
  --pred_cut 80 \
  --lat_min 54.0 --lat_max 58.0 --lon_min 6.0 --lon_max 16.0 \
  --out_dir data/figures/tptrans_v6_80_select_tuned_encdecx2_01 \
  --auto_extent \
  --match_distance \
  --style satellite --dpi 220 \
  --mmsi 209982000,211686000,215933000,229673000,232008636,244575000,244632000,245176000,246443000,246606000,255769000,257182000,257207000,258648000,258656000,304964000,305937000,311945000,316015060,319130500,352003296,518999464,538005462,636015943,636018075,636018728,668116152
```




```bash
python -m src.train.train_traj_V3 --config configs/traj_tptrans_v2.yaml
```

#### d_model 512, watermask dim 512
```bash
python -m src.eval.eval_traj_V6 \
  --split_dir data/map_reduced/test \
  --ckpt data/checkpoints/lite/traj_tptrans_d_model512.pt \
  --model tptrans \
  --horizon 12 \
  --past_len 64 \
  --pred_cut 80 \
  --lat_min 54.0 --lat_max 58.0 --lon_min 6.0 --lon_max 16.0 \
  --out_dir data/figures/tptrans_v6_80_select_tuned_d_model512_01 \
  --auto_extent \
  --match_distance \
  --style satellite --dpi 220 \
  --mmsi 209982000,211686000,215933000,229673000,232008636,244575000,244632000,245176000,246443000,246606000,255769000,257182000,257207000,258648000,258656000,304964000,305937000,311945000,316015060,319130500,352003296,518999464,538005462,636015943,636018075,636018728,668116152
```





### 03 nhead10
```bash
python -m src.eval.eval_traj_V6 \
  --split_dir data/map_reduced/test \
  --ckpt data/checkpoints/traj_tptrans.pt \
  --model tptrans \
  --horizon 12 \
  --past_len 64 \
  --pred_cut 80 \
  --lat_min 54.0 --lat_max 58.0 --lon_min 6.0 --lon_max 16.0 \
  --out_dir data/figures/tptrans_v6_80_select_tuned_nhead10 \
  --auto_extent \
  --match_distance \
  --style satellite --dpi 220 \
  --mmsi 209982000,211686000,215933000,229673000,232008636,244575000,244632000,245176000,246443000,246606000,255769000,257182000,257207000,258648000,258656000,304964000,305937000,311945000,316015060,319130500,352003296,518999464,538005462,636015943,636018075,636018728,668116152
```



### 04 d_model512, encoding + decoding layers x2, kernel size 4 + dilation 2
```bash
python -m src.eval.eval_traj_V6 \
  --split_dir data/map_reduced/test \
  --ckpt data/checkpoints/traj_tptrans.pt \
  --model tptrans \
  --horizon 12 \
  --past_len 64 \
  --pred_cut 80 \
  --lat_min 54.0 --lat_max 58.0 --lon_min 6.0 --lon_max 16.0 \
  --out_dir data/figures/tptrans_v6_80_select_tuned_d_model512_encdecx2_kernel_dial_tweak_01 \
  --auto_extent \
  --match_distance \
  --style satellite --dpi 220 \
  --mmsi 209982000,211686000,215933000,229673000,232008636,244575000,244632000,245176000,246443000,246606000,255769000,257182000,257207000,258648000,258656000,304964000,305937000,311945000,316015060,319130500,352003296,518999464,538005462,636015943,636018075,636018728,668116152
```



```python
import pandas as pd

# Folder that holds one sub-folder per evaluation run (the --out_dir values)
FIG_DIR = "/Users/alexanderschiotz/Desktop/DTU/Master/Deep Learning/projects/ais-mda/data/figures"

# 01: baseline
df_01_mmsi = pd.read_csv(f"{FIG_DIR}/tptrans_v6_80_select_tuned_d_model512_01/summary_by_mmsi.csv")
df_01_sum = pd.read_csv(f"{FIG_DIR}/tptrans_v6_80_select_tuned_d_model512_01/summary_overall.csv")

# 02: encoder and decoder layers doubled
df_02_mmsi = pd.read_csv(f"{FIG_DIR}/tptrans_v6_80_select_tuned_d_model512_encdecx2_01/summary_by_mmsi.csv")
df_02_sum = pd.read_csv(f"{FIG_DIR}/tptrans_v6_80_select_tuned_d_model512_encdecx2_01/summary_overall.csv")
```





Output of `df_01_mmsi.describe()`:

| | mmsi | n_trips | ade_km_mean | ade_km_median |
|---|---|---|---|---|
| count | 27.0 | 27.0 | 27.0 | 27.0 |
| mean | 339160900.0 | 1.0 | 9.588564 | 9.588564 |
| std | 151582000.0 | 0.0 | 5.347759 | 5.347759 |
| min | 209982000.0 | 1.0 | 2.129754 | 2.129754 |
| 25% | 244904000.0 | 1.0 | 6.533158 | 6.533158 |
| 50% | 258648000.0 | 1.0 | 9.461284 | 9.461284 |
| 75% | 335566900.0 | 1.0 | 10.721421 | 10.721421 |
| max | 668116200.0 | 1.0 | 26.784638 | 26.784638 |


Output of `df_01_sum.head()`:

| | n_trips | ade_km_mean | ade_km_median | fde_km_mean |
|---|---|---|---|---|
| 0 | 27 | 9.588564 | 9.461284 | 19.9343 |
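
For readers unfamiliar with the column names: ADE is commonly the average displacement error over all predicted steps and FDE the error at the final step. The sketch below computes both in kilometres with the haversine great-circle distance. It assumes these standard definitions and a mean Earth radius of 6371 km; the functions `haversine_km` and `ade_fde_km` are hypothetical, and the evaluation script may compute its metrics differently.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def ade_fde_km(pred, true):
    """Return (ADE, FDE) in km for two equally long lists of (lat, lon) points."""
    if not pred or len(pred) != len(true):
        raise ValueError("pred and true must be non-empty and of equal length")
    errors = [haversine_km(p[0], p[1], t[0], t[1]) for p, t in zip(pred, true)]
    return sum(errors) / len(errors), errors[-1]


# Hypothetical three-step tracks, (lat, lon) in degrees
true_track = [(56.0, 10.0), (56.1, 10.0), (56.2, 10.0)]
pred_track = [(56.0, 10.0), (56.1, 10.1), (56.2, 10.2)]
ade, fde = ade_fde_km(pred_track, true_track)
print(f"ADE {ade:.2f} km, FDE {fde:.2f} km")
```

Because prediction error tends to grow with the horizon, FDE is usually larger than ADE, which matches the pattern in the summary table (`fde_km_mean` roughly twice `ade_km_mean`).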


Output of `df_02_mmsi.head()`:

| | mmsi | n_trips | ade_km_mean | ade_km_median |
|---|---|---|---|---|
| 0 | 209982000 | 1 | 9.769522 | 9.769522 |
| 1 | 211686000 | 1 | 9.836217 | 9.836217 |
| 2 | 215933000 | 1 | 9.123617 | 9.123617 |
| 3 | 229673000 | 1 | 10.628642 | 10.628642 |
| 4 | 232008636 | 1 | 3.579877 | 3.579877 |


Output of `df_02_sum.head()`:

| | n_trips | ade_km_mean | ade_km_median | fde_km_mean |
|---|---|---|---|---|
| 0 | 27 | 9.588564 | 9.461284 | 19.9343 |


```python
# Overall metrics for run 01 (baseline) and run 02 (encoder/decoder layers doubled)
ADE_mean_01 = df_01_sum.loc[0, 'ade_km_mean']
ADE_mean_02 = df_02_sum.loc[0, 'ade_km_mean']

ADE_median_01 = df_01_sum.loc[0, 'ade_km_median']
ADE_median_02 = df_02_sum.loc[0, 'ade_km_median']

FDE_mean_01 = df_01_sum.loc[0, 'fde_km_mean']
FDE_mean_02 = df_02_sum.loc[0, 'fde_km_mean']

# Differences are 01 minus 02: a positive value means run 02 has the lower error
ADE_mean_diff = ADE_mean_01 - ADE_mean_02
ADE_median_diff = ADE_median_01 - ADE_median_02
FDE_mean_diff = FDE_mean_01 - FDE_mean_02

print("01 ADE mean:", ADE_mean_01, "\n")
print("02 ADE mean:", ADE_mean_02, "\n")

print("01 ADE median:", ADE_median_01, "\n")
print("02 ADE median:", ADE_median_02, "\n")

print("01 FDE mean:", FDE_mean_01, "\n")
print("02 FDE mean:", FDE_mean_02, "\n")

print("ADE mean diff 1-2:", ADE_mean_diff, "\n")
print("ADE median diff 1-2:", ADE_median_diff, "\n")
print("FDE mean diff 1-2:", FDE_mean_diff, "\n")
```



01 ADE mean: 9.588563859420148 

02 ADE mean: 9.588563859420148 

01 ADE median: 9.461283519055216 

02 ADE median: 9.461283519055216 

01 FDE mean: 19.934300391683724 

02 FDE mean: 19.934300391683724 

ADE mean diff 1-2: 0.0 

ADE median diff 1-2: 0.0 

FDE mean diff 1-2: 0.0 
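
Every difference is exactly 0.0, and several of the evaluation commands above pass the same `--ckpt` path. One thing worth ruling out is that both runs loaded identical weights. A minimal sketch for comparing two checkpoint files by content hash follows; `sha256_of` and `same_file_content` are hypothetical helpers, and the temporary files only stand in for real checkpoints.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_content(path_a, path_b):
    """True if both files contain exactly the same bytes (by SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)


# Stand-in files; replace with the two checkpoint paths to compare
with tempfile.TemporaryDirectory() as tmp:
    ckpt_a = Path(tmp, "run_01.pt")
    ckpt_b = Path(tmp, "run_02.pt")
    ckpt_c = Path(tmp, "run_03.pt")
    ckpt_a.write_bytes(b"weights-v1")
    ckpt_b.write_bytes(b"weights-v1")
    ckpt_c.write_bytes(b"weights-v2")
    print(same_file_content(ckpt_a, ckpt_b), same_file_content(ckpt_a, ckpt_c))  # True False
```

If two supposedly different experiments hash to the same value, the metrics cannot differ, and the training step probably did not write a new checkpoint to the path being evaluated.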



`df_01_mmsi` (first 10 rows shown):

| | mmsi | n_trips | ade_km_mean | ade_km_median |
|---|---|---|---|---|
| 0 | 209982000 | 1 | 9.769522 | 9.769522 |
| 1 | 211686000 | 1 | 9.836217 | 9.836217 |
| 2 | 215933000 | 1 | 9.123617 | 9.123617 |
| 3 | 229673000 | 1 | 10.628642 | 10.628642 |
| 4 | 232008636 | 1 | 3.579877 | 3.579877 |
| 5 | 244575000 | 1 | 26.784638 | 26.784638 |
| 6 | 244632000 | 1 | 12.878705 | 12.878705 |
| 7 | 245176000 | 1 | 15.295498 | 15.295498 |
| 8 | 246443000 | 1 | 10.116337 | 10.116337 |
| 9 | 246606000 | 1 | 12.613284 | 12.613284 |


```python
mmsi = [209982000,211686000,215933000,229673000,232008636,244575000,244632000,245176000,
        246443000,246606000,255769000,257182000,257207000,258648000,258656000,304964000,
        305937000,311945000,316015060,319130500,352003296,518999464,538005462,636015943,
        636018075,636018728,668116152
        ]

# Per-vessel ADE differences between run 01 and run 02 (01 minus 02)
rows = []
for vessel_id in mmsi:
    row_01 = df_01_mmsi.loc[df_01_mmsi['mmsi'] == vessel_id]
    row_02 = df_02_mmsi.loc[df_02_mmsi['mmsi'] == vessel_id]

    # Take the scalar values so the subtraction does not depend on index alignment
    ADE_mean_mmsi_diff = row_01['ade_km_mean'].iloc[0] - row_02['ade_km_mean'].iloc[0]
    ADE_median_mmsi_diff = row_01['ade_km_median'].iloc[0] - row_02['ade_km_median'].iloc[0]

    print("mmsi: ", vessel_id, " ADE_mean_diff: ", ADE_mean_mmsi_diff, "\n")
    print("mmsi: ", vessel_id, " ADE_median_diff: ", ADE_median_mmsi_diff, "\n")

    rows.append({"mmsi": vessel_id,
                 "ADE_mean_diff": ADE_mean_mmsi_diff,
                 "ADE_median_diff": ADE_median_mmsi_diff})

# Collect the per-vessel differences in one table for later inspection
df_mmsi_diff = pd.DataFrame(rows)
```



mmsi:  209982000  ADE_mean_diff:  0.0 

mmsi:  209982000  ADE_median_diff:  0.0 

mmsi:  211686000  ADE_mean_diff:  0.0 

mmsi:  211686000  ADE_median_diff:  0.0 

mmsi:  215933000  ADE_mean_diff:  0.0 

mmsi:  215933000  ADE_median_diff:  0.0 

mmsi:  229673000  ADE_mean_diff:  0.0 

mmsi:  229673000  ADE_median_diff:  0.0 

mmsi:  232008636  ADE_mean_diff:  0.0 

mmsi:  232008636  ADE_median_diff:  0.0 

mmsi:  244575000  ADE_mean_diff:  0.0 

mmsi:  244575000  ADE_median_diff:  0.0 

mmsi:  244632000  ADE_mean_diff:  0.0 

mmsi:  244632000  ADE_median_diff:  0.0 

mmsi:  245176000  ADE_mean_diff:  0.0 

mmsi:  245176000  ADE_median_diff:  0.0 

mmsi:  246443000  ADE_mean_diff:  0.0 

mmsi:  246443000  ADE_median_diff:  0.0 

mmsi:  246606000  ADE_mean_diff:  0.0 

mmsi:  246606000  ADE_median_diff:  0.0 

mmsi:  255769000  ADE_mean_diff:  0.0 

mmsi:  255769000  ADE_median_diff:  0.0 

mmsi:  257182000  ADE_mean_diff:  0.0 

mmsi:  257182000  ADE_median_diff:  0.0 

mmsi:  257207000