# Notebook: run experiments
From the paper "VDW-GNNs: Vector diffusion wavelets for geometric graph neural networks".

First, set `ROOT` in the cell below to the parent directory in which you cloned this repo as a folder named `code/`.
- The lines below this assignment add that directory to `PYTHONPATH`, so that scripts launched in later cells can import the repo's own modules.
- If import problems persist, a potential fix is to replace the `!{sys.executable}` commands with IPython's `%run` syntax.

In [None]:
ROOT = "parent/dir/of/code"

import sys
import os
os.environ["PYTHONPATH"] = ROOT + os.pathsep + os.environ.get("PYTHONPATH", "")

If you want to alter experiment settings, note that this repo layers configuration values in the following order (highest priority first):
1. Command-line arguments (only if the flag is provided)
2. Model YAML (the file passed to `--config`)
3. Experiment YAML in the same directory (`experiment.yaml` or `<dir>.yaml`)
4. Defaults in `config/*_config.py` files

In the layered YAML case, the experiment YAML is loaded first and then the
model YAML overlays it. CLI flags override the merged YAML after loading.
If `--config` points directly to an experiment YAML, no model YAML is auto-loaded.
Any keys not specified in YAML or on the command line fall back to the defaults in the `config/*_config.py` files.
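
The precedence rules above can be sketched in a few lines of Python. This is a minimal illustration, not the repo's actual loader: the function names, the recursive merge of nested keys, and the `lr` and `batch_size` keys are all assumptions made for the example.

```python
from copy import deepcopy


def deep_merge(base, overlay):
    """Return a copy of `base` with `overlay` recursively merged on top."""
    merged = deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(defaults, experiment_yaml, model_yaml, cli_args):
    """Layer the four sources, lowest priority first."""
    config = deep_merge(defaults, experiment_yaml)
    config = deep_merge(config, model_yaml)
    # Only flags that were actually provided (not None) override the YAML.
    provided = {k: v for k, v in cli_args.items() if v is not None}
    return deep_merge(config, provided)


# Hypothetical values, for illustration only.
defaults = {"dataset": {"task": "graph_regression", "target_dim": 1}, "lr": 0.01}
experiment_yaml = {"dataset": {"task": "vector_node_regression", "target_dim": 3}}
model_yaml = {"lr": 0.001}
cli_args = {"lr": None, "batch_size": 32}

resolved = resolve_config(defaults, experiment_yaml, model_yaml, cli_args)
print(resolved)
```

In this sketch `lr` ends up with the model YAML value because the corresponding CLI flag was not provided, while `dataset` comes from the experiment YAML.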

# (1) Ellipsoids

## Generate data

This script also computes and saves the $\mathbf{P}$ and $\mathbf{Q}$ operators for VDW-GNN models.

In [None]:
!{sys.executable} {ROOT}/code/scripts/python/generate_ellipsoid_dataset.py \
  --save_dir {ROOT}/data/ellipsoids \
  --config {ROOT}/code/config/yaml_files/ellipsoids/experiment.yaml \
  --pq_h5_name pq_tensor_data_512.h5 \
  --random_seed 457892 \
  --num_samples 512 \
  --num_nodes_per_graph 128 \
  --knn_graph_k 5 \
  --abc_means 3.0 1.0 1.0 \
  --abc_stdevs 0.5 0.2 0.2 \
  --local_pca_kernel_fn gaussian \
  --num_oversample_points 1024 \
  --k_laplacian 10 \
  --laplacian_type sym_norm \
  --dirac_types max min \
  --random_harmonic_k 16 \
  --random_harmonic_coeff_bounds 1.0 2.0 \
  --modulation_scale 0.9

## Run k-fold cross validation

`MODEL_KEY` options: vdw, legs, gcn, gat, gin, egnn_diameter, egnn_normals, tfn_diameter, tfn_normals

> Note: `egnn` and `tfn` have separate keys (and config files) for the different targets.

The next cell runs k-fold cross-validation training for one model on one target variant of the ellipsoids dataset.

- For diameter (graph-level) targets ensure the following settings in `config/yaml_files/ellipsoids/experiment.yaml`:
    ```
    dataset:
      task: graph_regression
      target_key: y_graph_diameter
      target_dim: 1
    ```

- For modulated normal vector (node-level) targets:
    ```
    dataset:
      task: vector_node_regression
      target_key: y_random_harmonic_normals
      target_dim: 3
    ```
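
Because the `TASK` variable in the training cell only names the results subdirectory, it is easy to forget to update the YAML. The following sketch shows one way to check a loaded `dataset` block against the settings listed above. The helper names and the dictionary layout are hypothetical; only the setting values come from the two snippets.

```python
# Setting values copied from the two YAML snippets above;
# the dict layout and helper names are illustrative.
DATASET_SETTINGS = {
    "diameter": {
        "task": "graph_regression",
        "target_key": "y_graph_diameter",
        "target_dim": 1,
    },
    "normals": {
        "task": "vector_node_regression",
        "target_key": "y_random_harmonic_normals",
        "target_dim": 3,
    },
}


def dataset_settings_for(task_name):
    """Return the expected `dataset` settings for a target variant."""
    if task_name not in DATASET_SETTINGS:
        raise ValueError(
            f"unknown task {task_name!r}; expected one of {sorted(DATASET_SETTINGS)}"
        )
    return dict(DATASET_SETTINGS[task_name])


def find_mismatches(dataset_block, task_name):
    """Map each mismatched key to a (found, expected) pair."""
    expected = dataset_settings_for(task_name)
    return {
        key: (dataset_block.get(key), value)
        for key, value in expected.items()
        if dataset_block.get(key) != value
    }


# Example: a `dataset` block still configured for diameter targets.
loaded = {"task": "graph_regression", "target_key": "y_graph_diameter", "target_dim": 1}
print(find_mismatches(loaded, "normals"))
```

An empty result means the block matches the chosen target variant.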

In [None]:
MODEL_KEY = "vdw"
TASK = "normals" # normals | diameter

!{sys.executable} {ROOT}/code/scripts/python/main_training.py \
  --root_dir {ROOT} \
  --config {ROOT}/code/config/yaml_files/ellipsoids/{MODEL_KEY}.yaml \
  --dataset ellipsoids \
  --results_save_subdir {TASK} \
  --experiment_type kfold

## Aggregate results

In [None]:
TASK = "normals" # normals | diameter

!{sys.executable} {ROOT}/code/scripts/python/summarize_experiments.py \
  --exp_dirs {ROOT}/experiments/ellipsoids/{TASK} \
  --metrics mse \
  --decimals 4 \
  --out_tex {ROOT}/experiments/ellipsoids/ellipsoids_{TASK}_results.tex \
  --out_csv {ROOT}/experiments/ellipsoids/ellipsoids_{TASK}_results.csv

# (2) Wind velocity reconstruction

`MODEL_KEY` options: vdw, legs, gcn, gat, gin, egnn, tfn, dd-tnn

> 1. Ensure you have downloaded the Earth surface wind velocity data (see the README.md file for instructions) and saved the u- and v-wind files in `{ROOT}/data/` under the filenames `u-wind_1Jan2016_mean_10m.nc` and `v-wind_1Jan2016_mean_10m.nc`.
> 2. The DD-TNN model trains via a different script, so the cell below branches on `MODEL_KEY` to choose the script, config file, and extra arguments.

In [None]:
MODEL_KEY = "vdw"

if MODEL_KEY == "dd-tnn":
    py_script = "run_wind_tnn.py"
    config_path = f"{ROOT}/code/config/yaml_files/wind/experiment.yaml"
    extra_args = []
else:
    py_script = "run_wind_experiments.py"
    config_path = f"{ROOT}/code/config/yaml_files/wind/{MODEL_KEY}.yaml"
    extra_args = ["--local_pca_k", "10"]

cmd = [
    sys.executable,
    f"{ROOT}/code/scripts/python/{py_script}",
    "--config", config_path,
    "--root_dir", ROOT,
    "--dataset", "wind",
    "--replications", "5",
    "--knn_k", "3",
    "--sample_n", "2000",
    "--mask_prop", "0.3",
    "--do_rotation_eval",
]
cmd += extra_args

import shlex
import subprocess

# Echo the exact command (shell-quoted), then run it from ROOT.
print("Running:\n", shlex.join(cmd))
subprocess.run(cmd, check=True, cwd=ROOT)


## Aggregate results 

Here, `--root_dir` must point to the directory where the wind experiment results are saved, and `--single` must name the results directory inside each model subdirectory (auto-named for the sample size and mask proportion, e.g., `n2000p30`).
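
As a convenience, the directory name can be derived from the values passed to the training cell. The helper below is a hypothetical sketch: its naming rule (`n`, then the sample size, then `p`, then the mask proportion as a whole-number percentage) is inferred only from the single `n2000p30` example, so check it against the directories the repo actually creates.

```python
def wind_results_subdir(sample_n, mask_prop):
    """Guess the results directory name, e.g. (2000, 0.3) -> 'n2000p30'.

    Assumption: the proportion is written as a rounded integer percentage.
    """
    percent = round(mask_prop * 100)
    return f"n{sample_n}p{percent}"


print(wind_results_subdir(2000, 0.3))
```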

In [None]:
!{sys.executable} {ROOT}/code/scripts/python/summarize_wind_tables.py \
    --root_dir={ROOT}/experiments/wind \
    --single=n2000p30

# (3) Multi-channel neural recordings

`MODEL_KEY` options: vdw, marble, cebra, lfads

> Ensure you have the `kinematics.pkl`, `trial_ids.pkl`, and `rate_data_20ms_100ms.pkl` data files saved in `{ROOT}/data/` (details on how to download these are in the README.md).

In [None]:
MODEL_KEY = "vdw"

!{sys.executable} {ROOT}/code/scripts/python/run_macaque_multiday_cv.py \
  --model {MODEL_KEY} \
  --days 0-43 \
  --root_dir {ROOT}

## Aggregate results

In [None]:
!{sys.executable} {ROOT}/code/scripts/python/summarize_day_model_results.py \
    --root {ROOT}/experiments/macaque_reaching \
    --uncertainty mean_std \
    --decimals 2 \
    --bold_best

## Generate the Wilcoxon paired tests plot

> Note: if you have not run each of VDW-GNN, MARBLE, CEBRA, and LFADS exactly once, or wish to compare a different combination of models, you will need to modify the model comparison settings at the top of `plot_macaque_wilcoxon.py` (subsequent model runs are saved in directories named `<model>_1`, `<model>_2`, etc.).
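
Before editing the comparison settings, it can help to see which run directories exist for each model. The sketch below lists them. It assumes that the first run is saved in an unsuffixed `<model>` directory and later runs in `<model>_1`, `<model>_2`, and so on, directly inside the results directory. The function name is hypothetical and is not part of the repo.

```python
import re
from pathlib import Path


def list_model_runs(results_dir, model):
    """Return run directory names for `model`, in run order."""
    # Match `<model>` or `<model>_<integer>` exactly, nothing else.
    pattern = re.compile(rf"^{re.escape(model)}(?:_(\d+))?$")
    runs = []
    for entry in Path(results_dir).iterdir():
        match = pattern.match(entry.name)
        if entry.is_dir() and match:
            # The unsuffixed directory sorts first (run index 0).
            runs.append((int(match.group(1) or 0), entry.name))
    return [name for _, name in sorted(runs)]
```

For example, `list_model_runs(results_dir, "vdw")` returns only directories named `vdw` or `vdw_<n>`, sorted by run number, where `results_dir` is wherever your macaque experiment results were written.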

In [None]:
!{sys.executable} {ROOT}/code/scripts/python/plot_macaque_wilcoxon.py \
    --results_dir {ROOT}/experiments/macaque \
    --results_filename summary_records_mean_std.pkl \
    --metric test_accuracy \
    --save_filename wilcoxon_plot