# **Experiment Demo Notebook**

This notebook lets you run our experiments directly.

## **Setup**

Before running anything, we check whether the notebook is running on Google Colab. To do so, run the following cell:

In [1]:
# Check if using Colab or not
import os
from pathlib import Path

try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    from getpass import getpass
    from google.colab import drive # Access the drive
    drive.mount('/content/drive')
    pat = ''  # GitHub personal access token; only needed if the repository is private
    repo_name = 'Ioana-Simion/egnn-jax'
    url = f"https://{pat}@github.com/{repo_name}.git"
    !git clone --branch main {url}
    print("\nCurrent Directory:")
    %cd egnn-jax
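
With `pat` left empty, the cell above builds a URL of the form `https://@github.com/...`. A minimal sketch of a helper that embeds the token only when one is provided follows; `build_clone_url` is a hypothetical name used for illustration and is not part of the repository.

```python
def build_clone_url(repo_name: str, pat: str = "") -> str:
    """Return an HTTPS clone URL, embedding the token only if one is given."""
    # Hypothetical helper for illustration; not part of the egnn-jax repository.
    auth = f"{pat}@" if pat else ""
    return f"https://{auth}github.com/{repo_name}.git"


# Without a token, the URL contains no credentials at all
public_url = build_clone_url("Ioana-Simion/egnn-jax")
print(public_url)
```

On Colab, the token could be read interactively with `getpass("GitHub token: ")` (already imported in the cell above) so that it is never stored in the notebook.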

Now we can set up the environment and the working directory. If you're not using Colab, you need to install the environment yourself using one of the `.yml` files; which one to use depends on your device.

In [None]:
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    # Read the requirements.txt file
    with open('requirements.txt') as f:
        requirements = f.read().splitlines()

    # Check if each requirement is installed, if not, install it
    import pkg_resources
    installed_packages = {pkg.key for pkg in pkg_resources.working_set}
    for requirement in requirements:
        if not any(requirement.split('==')[0] in pkg for pkg in installed_packages):
            !pip install {requirement}

    !pip install datasets


else:  # check that the cwd is the project root or one of its direct subdirectories
    curdir = Path.cwd()
    print("Current Directory", curdir)
    assert curdir.name == "egnn-jax" or curdir.parent.name == "egnn-jax", "Notebook cwd has to be on the project root"
    if curdir.name == "notebooks":
        %cd ..
        print("New Current Directory:", Path.cwd())
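
The Colab branch above relies on `pkg_resources`, which is deprecated in recent `setuptools` releases, and on a substring test, so a requirement such as `jax` would also match an installed `jaxlib`. One possible alternative, sketched here with the standard-library `importlib.metadata`, compares exact distribution names. The helper names `requirement_name` and `missing_requirements` are hypothetical, and the parsing is deliberately simple: it assumes plain `name==version`-style lines and skips pip option lines.

```python
import re
from importlib import metadata


def requirement_name(line: str) -> str:
    """Extract the bare distribution name from a requirements.txt line."""
    line = line.split("#", 1)[0].strip()  # drop trailing comments
    # Cut at the first version specifier, extras bracket, marker, or space
    return re.split(r"[=<>!~\[; ]", line, maxsplit=1)[0].strip()


def missing_requirements(lines):
    """Return the requirement lines whose distribution is not installed."""
    missing = []
    for line in lines:
        name = requirement_name(line)
        if not name or name.startswith("-"):
            continue  # skip blank lines, comments, and pip options such as "-r"
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(line)
    return missing
```

The returned lines could then be passed to `pip install` one by one, as the cell above does.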

### **Generating the N-body Dataset**

Because the N-body dataset is generated by simulation, you can create it locally, which should not take long. To do so, run the following script:

In [None]:
!python ../n_body/dataset/generate_dataset.py --initial_vel 1 --num-train 3000 --length 1000 --length_test 1000 --sufix "small"

The data for N-body should now be in `n_body/dataset/data`. Alternatively, you can download the dataset [here](https://drive.google.com/drive/folders/1xfigu6ZJHvw7smx4J_-p8uRryIYGUjK7?usp=sharing).

The QM9 data is already available through `torch_geometric`, so no further preparation is needed for it.

## **Experiments**

Now we can begin with the experiments, starting with QM9.

The adjustable parameters are documented in the `README` file.

For the EGNN:

In [None]:
!python ../main_qm9.py

Next, the N-body experiments.

N-body EGNN (on Colab, it is best to use the Colab notebook):

In [3]:
!python ../nbody_egnn_trainer.py --nbody_path ../n_body/dataset/data/

Random seed set as 42
Parameters: 117575
[Epoch  1] Training mse: 0.009367193095386028, Validation mse: 0.008887327276170254
	   (New best performance, saving model...)
Figure(1000x600)
[Epoch  2] Training mse: 0.008097128011286259, Validation mse: 0.007526990957558155
	   (New best performance, saving model...)
[Epoch  3] Training mse: 0.006653978023678064, Validation mse: 0.006075744982808828
	   (New best performance, saving model...)
[Epoch  4] Training mse: 0.005744718946516514, Validation mse: 0.00560001889243722
	   (New best performance, saving model...)
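
To compare runs, the per-epoch losses can be pulled out of the printed log. The following is a minimal sketch that assumes the log keeps exactly the line format shown above; `parse_training_log` is a hypothetical helper, not something the trainer script provides.

```python
import re

# Matches lines such as "[Epoch  1] Training mse: 0.0093, Validation mse: 0.0088"
EPOCH_RE = re.compile(
    r"\[Epoch\s+(\d+)\]\s+Training mse:\s*([0-9.eE+-]+),\s*Validation mse:\s*([0-9.eE+-]+)"
)


def parse_training_log(text):
    """Return a list of (epoch, train_mse, val_mse) tuples found in text."""
    return [
        (int(epoch), float(train), float(val))
        for epoch, train, val in EPOCH_RE.findall(text)
    ]


# Two epochs copied from the output above; other lines are ignored by the regex
sample = """\
[Epoch  1] Training mse: 0.009367193095386028, Validation mse: 0.008887327276170254
       (New best performance, saving model...)
[Epoch  2] Training mse: 0.008097128011286259, Validation mse: 0.007526990957558155
"""
records = parse_training_log(sample)
best = min(records, key=lambda r: r[2])  # epoch with the lowest validation mse
print(f"Best validation mse {best[2]:.6f} at epoch {best[0]}")
```

The same function can be applied to the full captured output of the training cell, for example after saving it to a text file.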
