# Preparation Phase for the Tutorial

In the following steps we first get the system and package configuration ready for the tutorial.

## 1.  External MD Driver Requirement Checking
This tutorial requires CP2K patched with PLUMED, where PLUMED is built with neural-network support (the biased runs rely on `PYTORCH_MODEL`).

If either executable is missing, the rest of the notebook **cannot run properly**.

In [1]:
!which plumed
!which cp2k.popt
import sys
print(sys.executable)

/rwthfs/rz/cluster/home/yy508225/myplumed/plumed2.9.0/bin/plumed
/home/yy508225/mycp2k/cp2k-2023.1/exe/local/cp2k.popt
/cvmfs/software.hpc.rwth.de/Linux/RH8/x86_64/intel/skylake_avx512/software/Python/3.10.8-GCCcore-12.2.0/bin/python
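The `which` calls above only print paths and do not flag a missing executable. A minimal sketch of a programmatic check follows, assuming the executable names `plumed` and `cp2k.popt` used in this tutorial; the helper name `find_missing_executables` is hypothetical.

```python
import shutil

def find_missing_executables(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]

# Executable names taken from the cell above; adjust them if your build
# uses different names.
missing = find_missing_executables(["plumed", "cp2k.popt"])
if missing:
    print("Missing executables:", ", ".join(missing))
else:
    print("All required executables were found.")
```

Collecting the missing names in a list makes it easy to stop early with a clear message instead of failing later in the workflow.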


## 2. Skewencoder Installation
Create a Python virtual environment for this tutorial if necessary.

The Python package `skewencoder` will be installed in this virtual environment.

In [3]:
import os

# Specify the path to your virtual environment
venv_dir = "tutorial4loxodynamics"

# Check if the virtual environment directory exists
if os.path.exists(venv_dir):
    print(f"Virtual environment '{venv_dir}' for this tutorial already exists.")
else:
    print(f"Virtual environment '{venv_dir}' for this tutorial does not exist. Creating now...")
    !python -m venv {venv_dir}
    print(f"Virtual environment '{venv_dir}' for this tutorial created.")

Virtual environment 'tutorial4loxodynamics' for this tutorial does not exist. Creating now...
Virtual environment 'tutorial4loxodynamics' for this tutorial created.


Install `ipykernel` in the newly created virtual environment so that it can be used as a kernel for this notebook.

The `pip install pytz python-dateutil` step may be unnecessary on other systems; it is needed specifically for the Python distribution on the RWTH cluster.

In [4]:
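# Install ipykernel in the virtual environment.
# Each `!` command runs in its own subshell, so `source .../activate` would have
# no lasting effect; the venv's executables are called by path instead.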

!{venv_dir}/bin/pip install pytz python-dateutil

!{venv_dir}/bin/pip install ipykernel

!{venv_dir}/bin/python -m ipykernel install --user --name={venv_dir} --display-name="Python ({venv_dir})"

import subprocess

# Define your desired kernel name
desired_kernel_name = venv_dir

# Get the list of installed kernels
result = subprocess.run(['jupyter', 'kernelspec', 'list'], stdout=subprocess.PIPE)

# Decode and split the result into lines
kernels_list = result.stdout.decode('utf-8').split('\n')

# Check if the desired kernel name is in the list
kernel_exists = any(desired_kernel_name in line for line in kernels_list)

if kernel_exists:
    print(f"Kernel '{desired_kernel_name}' already exists.")
else:
    print(f"Kernel '{desired_kernel_name}' does not exist. Proceeding with installation...")
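    # Register the kernel by calling the venv's Python directly.
    # (`exec` cannot run IPython `!` shell syntax, so subprocess is used instead.)
    subprocess.run([f"{venv_dir}/bin/python", "-m", "ipykernel", "install", "--user",
                    f"--name={desired_kernel_name}",
                    f"--display-name=Python ({desired_kernel_name})"], check=True)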

print(f"Kernel added for virtual environment {venv_dir}.")

Collecting pytz
  Using cached pytz-2025.2-py2.py3-none-any.whl (509 kB)
Collecting python-dateutil
  Using cached python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
Collecting six>=1.5
  Using cached six-1.17.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: pytz, six, python-dateutil
Successfully installed python-dateutil-2.9.0.post0 pytz-2025.2 six-1.17.0

[notice] A new release of pip available: 22.2.2 -> 25.1.1
[notice] To update, run: python -m pip install --upgrade pip
Collecting ipykernel
  Using cached ipykernel-6.29.5-py3-none-any.whl (117 kB)
Collecting debugpy>=1.6.5
  Using cached debugpy-1.8.14-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.6 MB)
Collecting ipython>=7.23.1
  Using cached ipython-8.36.0-py3-none-any.whl (831 kB)
Collecting pyzmq>=24
  Using cached
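The kernel check above matches the kernel name as a substring of each output line, which could also match a path that happens to contain the name. A minimal sketch of an exact-name check follows, assuming `jupyter kernelspec list --json` is available; the helper names are hypothetical and the JSON sample is hand-written for illustration.

```python
import json
import subprocess

def kernel_names_from_json(text):
    """Extract the kernel names from `jupyter kernelspec list --json` output."""
    return set(json.loads(text).get("kernelspecs", {}))

def kernel_is_registered(name):
    """Return True if a kernel with exactly this name is registered."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return name in kernel_names_from_json(result.stdout)

# Offline example with an illustrative JSON snippet (not real output).
sample = '{"kernelspecs": {"python3": {}, "tutorial4loxodynamics": {}}}'
print("tutorial4loxodynamics" in kernel_names_from_json(sample))
```

Calling `kernel_is_registered(venv_dir)` before installing would make the check independent of how the plain-text listing is formatted.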

**TODO: Manually switch the notebook kernel to the new virtual environment.**

Select the kernel named `Python ({your_venv_dir})` in the kernel menu.

Run the following cell to check that the Python executable and pip now point to the virtual environment.

In [21]:
import sys
import os

if sys.prefix != sys.base_prefix:
    print("Kernel switched to venv. Skewencoder will be installed in the default virtual environment.")
    virtual_env = sys.prefix
    # Set the VIRTUAL_ENV environment variable
    os.environ['VIRTUAL_ENV'] = virtual_env
    # Prepend the virtual environment's bin directory to PATH
    os.environ['PATH'] = f"{virtual_env}/bin:" + os.environ['PATH']
else:
    print("Kernel wasn't switched. Skewencoder might be installed globally.")

import subprocess

# Get site-packages directory using pip show command
result = subprocess.run(['pip', 'show', 'pip'], stdout=subprocess.PIPE)
output = result.stdout.decode()

for line in output.split('\n'):
    if line.startswith('Location'):
        print(f"Packages will be installed at: {line.split(': ')[1]}")

!which pip
!which python
# Get the current working directory
current_directory = os.getcwd()

# Print the current working directory
print("Current working directory for tutorial:", current_directory)

Kernel switched to venv. Skewencoder will be installed in the default virtual environment.
Packages will be installed at: /rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/lib/python3.10/site-packages
/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/bin/pip
/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/bin/python
Current working directory for tutorial: /rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial
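The check above relies on `pip` being resolved through `PATH`. A minimal sketch of an alternative follows, which ties pip to the interpreter running the kernel via `python -m pip`; the helper names `in_virtualenv` and `pip_command` are hypothetical.

```python
import sys

def in_virtualenv():
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix

def pip_command(*args):
    """Build a pip command bound to the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", *args]

print("Inside a venv:", in_virtualenv())
print("Install command:", " ".join(pip_command("install", "-e", "../skewencoder")))
```

Because the command starts with `sys.executable`, packages end up in the environment of the active kernel even if `PATH` still points elsewhere.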


**Install skewencoder**

If the current working directory is still the copied `Tutorial` folder, run the following cell directly to install skewencoder and its dependencies.

Otherwise, first set `PATH_to_SKEWENCODER_Rep` to the location of the `skewencoder` repository, then run the cell.


In [31]:
PATH_to_SKEWENCODER_Rep = "../skewencoder"

absolute_path = os.path.abspath(PATH_to_SKEWENCODER_Rep)

def is_package_installed(package_name):
    # Get the list of installed packages
    result = subprocess.run(['pip', 'list'], stdout=subprocess.PIPE)
    pip_list_output = result.stdout.decode()

    # Check if the package is in the list
    return package_name.lower() in pip_list_output.lower()

package_name = "skewencoder"

# Check if the package is installed
needs_installation = not is_package_installed(package_name)

if needs_installation:
    print(f"{package_name} is not installed. Needs installation.")
    print(f"Start to install {package_name}...")
    !pip install -e {PATH_to_SKEWENCODER_Rep}
else:
    print(f"{package_name} is already installed.")

sys.path.append(absolute_path)


skewencoder is already installed.



[notice] A new release of pip available: 22.2.2 -> 25.1.1
[notice] To update, run: pip install --upgrade pip
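The installation check above searches the whole `pip list` output for the substring `skewencoder`, which may also match another package name or an editable-install path containing that word. A minimal sketch of an exact check with the standard library follows; the helper name `is_distribution_installed` is hypothetical.

```python
from importlib import metadata

def is_distribution_installed(name):
    """Return True if a distribution with exactly this name is installed."""
    try:
        metadata.version(name)
    except metadata.PackageNotFoundError:
        return False
    return True

print("skewencoder installed:", is_distribution_installed("skewencoder"))
```

This queries the metadata of the interpreter running the kernel, so it reflects the environment in which `import skewencoder` will actually be attempted.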


# Attention! Before starting:

Make sure that the input files for the MD driver are in the same folder as this notebook.

For CP2K simulations, both the original `job.inp` and the `job_restart.inp` used for restarting are required.

In [18]:
# Define the file names
file_names = ["job.inp", "job_restart.inp"]

# Check if each file exists in the current directory
files_exist = {file_name: os.path.isfile(file_name) for file_name in file_names}

# Print results
for file_name, exists in files_exist.items():
    if exists:
        print(f"{file_name} is present in the current folder.")
    else:
        print(f"{file_name} is not present in the current folder.")

job.inp is present in the current folder.
job_restart.inp is present in the current folder.
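The check above prints one line per file. A minimal sketch of a variant that returns the missing files follows, so that a caller can decide whether to stop; the helper name `missing_input_files` is hypothetical.

```python
from pathlib import Path

def missing_input_files(names, folder="."):
    """Return the entries of `names` that are not regular files in `folder`."""
    base = Path(folder)
    return [name for name in names if not (base / name).is_file()]

missing = missing_input_files(["job.inp", "job_restart.inp"])
if missing:
    print("Missing CP2K inputs:", ", ".join(missing))
else:
    print("All CP2K inputs are present.")
```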


# Step-by-step tutorial for Loxodynamics

We take the workflow for the chabazite catalytic system as an example.

The whole workflow can be separated into three main parts:

1. **Training**: implemented with PyTorch and Lightning.
2. **Generating PLUMED input files**: for both the unbiased and the biased simulation.
3. **Running the simulation**: interfacing with external MD drivers.
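The three parts repeat once per iteration. A minimal sketch of that control flow follows, with stub callables standing in for the real stages; `run_workflow` and its arguments are hypothetical names used only for illustration.

```python
def run_workflow(n_iterations, train, write_plumed, simulate):
    """Call the three stages once per iteration and collect simulate() results."""
    results = []
    for iteration in range(n_iterations):
        model = train(iteration)              # 1. training
        write_plumed(iteration, model)        # 2. PLUMED input generation
        results.append(simulate(iteration))   # 3. run the MD driver
    return results

# Stub stages that only record the order in which they are called.
calls = []
run_workflow(
    2,
    train=lambda i: calls.append(("train", i)) or f"model_{i}",
    write_plumed=lambda i, m: calls.append(("plumed", i, m)),
    simulate=lambda i: calls.append(("simulate", i)) or i,
)
print(calls)
```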

We therefore first need to prepare some functions for these tasks.

We first load the basic modules necessary for the demo.

In [22]:
import numpy as np
import pandas as pd
import torch
from scipy import stats
from collections.abc import Sequence

# Locate the script for storage consistency
SCRIPT_DIR = os.getcwd()

# Choose the shell prefix for subprocess calls based on the OS type.

win_bash_exe_prefix = ["bash","-c"]
zsh_prefix = ["/bin/zsh", "-c"]

# current_os = "windows"
current_os = "zsh"

if current_os == "windows":
    bash_prefix = win_bash_exe_prefix
else:
    bash_prefix = zsh_prefix
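The cell above selects the shell by hand. A minimal sketch of picking the first shell that actually exists on `PATH` follows; the helper name `pick_shell_prefix` and the candidate order are assumptions, not part of the tutorial.

```python
import shutil

def pick_shell_prefix(candidates=("zsh", "bash", "sh")):
    """Return [shell_path, "-c"] for the first candidate shell found on PATH."""
    for shell in candidates:
        path = shutil.which(shell)
        if path is not None:
            return [path, "-c"]
    raise RuntimeError(f"None of these shells were found: {candidates}")

try:
    print(pick_shell_prefix())
except RuntimeError as err:
    print(err)
```

The returned list has the same shape as `bash_prefix`, so it could be unpacked into `subprocess.run` in the same way.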

**The following tutorial demonstrates a customized workflow.**
> Note: "Customized" means that all input descriptors are defined manually, just as in the Chabazite demo.

We then load the skewencoder-related modules that apply to our system.

We also need to define the folders that store the trajectories and the trained model of every iteration.

In [32]:
print(sys.executable)
!pip list | grep skewencoder
!pip show skewencoder
import skewencoder.state_detection as STADECT
from skewencoder.io import load_dataframe, load_data
import skewencoder.switchfunction as sf
from skewencoder.model_skewencoder import skewencoder_model_init, skewencoder_model_trainer, skewencoder_model_normalization, cv_eval

RESULTS_FOLDER = "./results"
UNBIASED_FOLDER = "./unbiased"
LIGHTNING_LOGS = "./lightning_logs"

/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/bin/python
skewencoder              0.1                 /rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/skewencoder

[notice] A new release of pip available: 22.2.2 -> 25.1.1
[notice] To update, run: pip install --upgrade pip
Name: skewencoder
Version: 0.1
Summary: skewencoder setting up
Home-page: 
Author: GiovanniMaria Piccini
Author-email: Zhikun Zhang <zhikun.zhang@ltt.rwth-aachen.de>
License: MIT
Location: /rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/lib/python3.10/site-packages
Requires: KDEpy, lightning, matplotlib, mlcolvar, networkx, numpy, pandas, scipy, torch
Required-by: 


### 1. Preparation: Training procedure.
We first define the function for the **training** procedure.

In [33]:
def chaba_training(state_detection: STADECT.State_detection, iter: int, encoder_layers : Sequence[int], loss_coeff: float, batch_size: int):
    """
    Trains a model using the Chaba training algorithm with the specified state detection mechanism.

    Parameters:
    ----------
    state_detection : STADECT.State_detection
        An instance of the State_detection class that detects states and decides whether to apply the warm-start training strategy.

    iter : int
        The index of the current iteration. It selects which COLVAR files are loaded and where the outputs are stored.

    encoder_layers : Sequence[int]
        A sequence of integers representing the number of neurons in each layer of the encoder.
        This defines the architecture of the encoder network used in training.

    loss_coeff : float
        The weight of the skewness loss in the loss function during optimization.
        Adjusting this value can influence model performance and convergence behavior.

    batch_size : int
        Batch size for training.

    Returns:
    -------
    state_detection : STADECT.State_detection
        The updated state detection instance after training.

    model : MultiTaskCV
        The trained model resulting from the training process.

    ITER_FOLDER : str
        The path to the folder containing the outputs of this iteration.

    skewness_dataset : DictDataset
        The dataset returned by `load_data` that is later used to evaluate the skewness of the trained CV.

    break_flag : bool
        A flag indicating whether training was interrupted (always `False` in this version).

    Example:
    --------
    >>> state_detection, model, ITER_FOLDER, skewness_dataset, break_flag = chaba_training(state_detection, iter, encoder_layers, loss_coeff, batch_size)

    """
    ITER_FOLDER = RESULTS_FOLDER + f"/iter_{iter}"
    # `-p` avoids an error if the folder already exists from a previous run
    subprocess.run([*bash_prefix,f"mkdir -p {ITER_FOLDER}"], cwd=SCRIPT_DIR)
    break_flag = False

    if iter == 0:
        filenames_iter = [f"{UNBIASED_FOLDER}/COLVAR"]
        filenames_all = filenames_iter
    else:
        filenames_all = [f"{RESULTS_FOLDER}/iter_{i}/COLVAR" for i in range(iter) ]
        filenames_all.append(f"{UNBIASED_FOLDER}/COLVAR")
        filenames_iter = [f"{RESULTS_FOLDER}/iter_{iter-1}/COLVAR"]
    AE_dataset, skewness_dataset, datamodule, _, _ = load_data(filenames_iter,filenames_all,multiple=(iter + 1), bs=batch_size)

    if iter == 0:
        is_stable_state, is_new_state = state_detection(filenames_iter[0])
        model = skewencoder_model_init(AE_dataset,encoder_layers, loss_coeff)
    else:
        PREV_ITER_FOLDER = f"{RESULTS_FOLDER}/iter_{iter-1}" # TODO: Might use os.path.dirname
        is_stable_state, is_new_state = state_detection(filenames_iter[0])
        apply_warm_start = not is_stable_state
        
        if not apply_warm_start:
            print("****************************")
            print("Restart from Scratch")
            print("Restart from Scratch")
            print("Restart from Scratch")
            print("****************************")
            model = skewencoder_model_init(AE_dataset,encoder_layers, loss_coeff)
        else:
            print("****************************")
            print("Apply Warm Start")
            print("Apply Warm Start")
            print("Apply Warm Start")
            print("****************************")
            model = skewencoder_model_init(AE_dataset,encoder_layers, loss_coeff,iter=iter,PREV_ITER_FOLDER=PREV_ITER_FOLDER)

    metrics = skewencoder_model_trainer(model, datamodule, iter_folder=ITER_FOLDER)

    model = skewencoder_model_normalization(model, AE_dataset)

    traced_model = model.to_torchscript(file_path=f'{ITER_FOLDER}/model_autoencoder_{iter}.pt', method='trace')

    return state_detection, model, ITER_FOLDER,skewness_dataset, break_flag
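The file selection at the start of `chaba_training` can be read in isolation: the latest trajectory is checked for state changes, while all trajectories so far (biased and unbiased) form the full training set. A minimal sketch of just that selection follows; the helper name `colvar_files` is hypothetical and the default folder names mirror `RESULTS_FOLDER` and `UNBIASED_FOLDER`.

```python
def colvar_files(iteration, results_folder="./results", unbiased_folder="./unbiased"):
    """Return (latest_files, all_files) as assembled in chaba_training."""
    unbiased = f"{unbiased_folder}/COLVAR"
    if iteration == 0:
        # Only the unbiased run exists before the first biased iteration.
        return [unbiased], [unbiased]
    biased = [f"{results_folder}/iter_{i}/COLVAR" for i in range(iteration)]
    return [biased[-1]], biased + [unbiased]

latest, everything = colvar_files(2)
print(latest)
print(everything)
```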

### 2. Preparation: PLUMED input generator.

Next we define the functions that generate the PLUMED input files for both the **unbiased** and the **biased** simulation.

Note that the input descriptors are manually defined and must be printed in the COLVAR files.

In the function `chaba_simulation`, we show how the loxodynamics wall is applied.
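Every descriptor needs a `DISTANCE` line and a matching entry in the `ARG` list of `PRINT`, and the two must stay in sync. A minimal sketch of generating both from one list of atom pairs follows; the helper name `distance_lines` is hypothetical, and the atom indices are a small subset of those used in this tutorial's PLUMED input.

```python
def distance_lines(atoms, pairs):
    """Return PLUMED DISTANCE lines and the matching comma-separated ARG list."""
    lines = [f"{a}{b}: DISTANCE ATOMS={atoms[a]},{atoms[b]}" for a, b in pairs]
    labels = ",".join(f"{a}{b}" for a, b in pairs)
    return lines, labels

# A few atom labels and indices from this tutorial's system.
atoms = {"o4": 34, "h7": 46, "h4": 43}
lines, labels = distance_lines(atoms, [("o4", "h7"), ("o4", "h4")])
print("\n".join(lines))
print(f"PRINT FMT=%g STRIDE=10 FILE=COLVAR ARG={labels}")
```

Deriving both pieces from the same list avoids a descriptor being defined but not printed, or printed but not defined.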

In [34]:
def gen_plumed_chaba_unbiased(file_path = SCRIPT_DIR, simulation_folder = UNBIASED_FOLDER):

    file_path = f'{file_path}/plumed.dat'
    file = open(file_path, 'w')
    input=f'''# vim:ft=plumed
UNITS LENGTH=A TIME=0.001  #Angstrom, hartree, fs
# O(BAS): o1: 17, o2: 22, o3: 26, o4: 34,
# O(MeOH) o5: 38
# H(CH3): h4: 43, h5: 44, h7: 46
# H(OH): h2: 39
# H(CH2): h3: 42, h6: 45
# H(BAS): h1: 37
# C: c1: 40, c2: 41
# DISTANCES between O(BAS) and H(CH3)
o4h7: DISTANCE ATOMS=34,46
o4h4: DISTANCE ATOMS=34,43
o4h5: DISTANCE ATOMS=34,44

o2h7: DISTANCE ATOMS=22,46
o2h4: DISTANCE ATOMS=22,43
o2h5: DISTANCE ATOMS=22,44

o3h7: DISTANCE ATOMS=26,46
o3h4: DISTANCE ATOMS=26,43
o3h5: DISTANCE ATOMS=26,44

o1h7: DISTANCE ATOMS=17,46
o1h4: DISTANCE ATOMS=17,43
o1h5: DISTANCE ATOMS=17,44

# DISTANCES between O(BAS) and H(CH2)
o4h3: DISTANCE ATOMS=34,42
o4h6: DISTANCE ATOMS=34,45

o2h3: DISTANCE ATOMS=22,42
o2h6: DISTANCE ATOMS=22,45

o3h3: DISTANCE ATOMS=26,42
o3h6: DISTANCE ATOMS=26,45

o1h3: DISTANCE ATOMS=17,42
o1h6: DISTANCE ATOMS=17,45

# DISTANCES between O(BAS) and H(OH)
o4h2: DISTANCE ATOMS=34,39
o2h2: DISTANCE ATOMS=22,39
o3h2: DISTANCE ATOMS=26,39
o1h2: DISTANCE ATOMS=17,39

# DISTANCES between O(BAS) and H(BAS)
o4h1: DISTANCE ATOMS=34,37
o2h1: DISTANCE ATOMS=22,37
o3h1: DISTANCE ATOMS=26,37
o1h1: DISTANCE ATOMS=17,37

# DISTANCES between O(MeOH) and C
o5c1: DISTANCE ATOMS=38,40
o5c2: DISTANCE ATOMS=38,41

# DISTANCES between O(MeOH) and H(CH3)
o5h7: DISTANCE ATOMS=38,46
o5h4: DISTANCE ATOMS=38,43
o5h5: DISTANCE ATOMS=38,44

# DISTANCES between O(MeOH) and H(CH2)
o5h3: DISTANCE ATOMS=38,42
o5h6: DISTANCE ATOMS=38,45

# DISTANCES between O(MeOH) and H(O)
o5h1: DISTANCE ATOMS=38,37
o5h2: DISTANCE ATOMS=38,39


# DISTANCE between atom 7 and 38
d17: DISTANCE ATOMS=7,38

# Apply upper wall to the distance between 7 and 38
uwall: UPPER_WALLS ARG=d17 AT=3.5 KAPPA=200.0

# PRINT all variables

PRINT FMT=%g STRIDE=10 FILE={simulation_folder}/COLVAR ARG=o4h7,o4h4,o4h5,o2h7,o2h4,o2h5,o3h7,o3h4,o3h5,o1h7,o1h4,o1h5,o4h3,o4h6,o2h3,o2h6,o3h3,o3h6,o1h3,o1h6,o4h2,o2h2,o3h2,o1h2,o4h1,o2h1,o3h1,o1h1,o5c1,o5c2,o5h7,o5h4,o5h5,o5h3,o5h6,o5h1,o5h2'''
    print(input, file=file)
    file.close()



def gen_plumed_chaba_biased(model_name : str,
                         file_path : str,
                         simulation_folder,
                         pos,
                         skew,
                         kappa,
                         offset):

    file_path = f'{file_path}/plumed.dat'
    file = open(file_path, 'w')
    input=f'''# vim:ft=plumed
UNITS LENGTH=A TIME=0.001  #Angstrom, hartree, fs
# O(BAS): o1: 17, o2: 22, o3: 26, o4: 34,
# O(MeOH) o5: 38
# H(CH3): h4: 43, h5: 44, h7: 46
# H(OH): h2: 39
# H(CH2): h3: 42, h6: 45
# H(BAS): h1: 37
# C: c1: 40, c2: 41
# DISTANCES between O(BAS) and H(CH3)
o4h7: DISTANCE ATOMS=34,46
o4h4: DISTANCE ATOMS=34,43
o4h5: DISTANCE ATOMS=34,44

o2h7: DISTANCE ATOMS=22,46
o2h4: DISTANCE ATOMS=22,43
o2h5: DISTANCE ATOMS=22,44

o3h7: DISTANCE ATOMS=26,46
o3h4: DISTANCE ATOMS=26,43
o3h5: DISTANCE ATOMS=26,44

o1h7: DISTANCE ATOMS=17,46
o1h4: DISTANCE ATOMS=17,43
o1h5: DISTANCE ATOMS=17,44

# DISTANCES between O(BAS) and H(CH2)
o4h3: DISTANCE ATOMS=34,42
o4h6: DISTANCE ATOMS=34,45

o2h3: DISTANCE ATOMS=22,42
o2h6: DISTANCE ATOMS=22,45

o3h3: DISTANCE ATOMS=26,42
o3h6: DISTANCE ATOMS=26,45

o1h3: DISTANCE ATOMS=17,42
o1h6: DISTANCE ATOMS=17,45

# DISTANCES between O(BAS) and H(OH)
o4h2: DISTANCE ATOMS=34,39
o2h2: DISTANCE ATOMS=22,39
o3h2: DISTANCE ATOMS=26,39
o1h2: DISTANCE ATOMS=17,39

# DISTANCES between O(BAS) and H(BAS)
o4h1: DISTANCE ATOMS=34,37
o2h1: DISTANCE ATOMS=22,37
o3h1: DISTANCE ATOMS=26,37
o1h1: DISTANCE ATOMS=17,37

# DISTANCES between O(MeOH) and C
o5c1: DISTANCE ATOMS=38,40
o5c2: DISTANCE ATOMS=38,41

# DISTANCES between O(MeOH) and H(CH3)
o5h7: DISTANCE ATOMS=38,46
o5h4: DISTANCE ATOMS=38,43
o5h5: DISTANCE ATOMS=38,44

# DISTANCES between O(MeOH) and H(CH2)
o5h3: DISTANCE ATOMS=38,42
o5h6: DISTANCE ATOMS=38,45

# DISTANCES between O(MeOH) and H(O)
o5h1: DISTANCE ATOMS=38,37
o5h2: DISTANCE ATOMS=38,39


# DISTANCE between atom 7 and 38
d17: DISTANCE ATOMS=7,38

# Apply upper wall to the distance between 7 and 38
uwall: UPPER_WALLS ARG=d17 AT=3.5 KAPPA=200.0
cv: PYTORCH_MODEL FILE={model_name} ARG=o4h7,o4h4,o4h5,o2h7,o2h4,o2h5,o3h7,o3h4,o3h5,o1h7,o1h4,o1h5,o4h3,o4h6,o2h3,o2h6,o3h3,o3h6,o1h3,o1h6,o4h2,o2h2,o3h2,o1h2,o4h1,o2h1,o3h1,o1h1,o5c1,o5c2,o5h7,o5h4,o5h5,o5h3,o5h6,o5h1,o5h2

# UPPER_WALLS ARG=c1c2 AT=+8.5 KAPPA=250.0 EXP=2 LABEL=constr_c1c2 # Wall for potential constraints
    '''
    print(input, file=file)
    file.close()
    walltype=""
    if skew < 0:
        walltype = "UPPER_WALLS"
        offset = -offset
    else:
        walltype = "LOWER_WALLS"
    with open(file_path,"a") as f:
        print(f"""
# Energy wall for aes cv
wall: {walltype} ARG=cv.node-0 AT={pos+offset} KAPPA={kappa} EXP=2 EPS=1 OFFSET=0.0
PRINT FMT=%g STRIDE=10 FILE={simulation_folder}/COLVAR ARG=o4h7,o4h4,o4h5,o2h7,o2h4,o2h5,o3h7,o3h4,o3h5,o1h7,o1h4,o1h5,o4h3,o4h6,o2h3,o2h6,o3h3,o3h6,o1h3,o1h6,o4h2,o2h2,o3h2,o1h2,o4h1,o2h1,o3h1,o1h1,o5c1,o5c2,o5h7,o5h4,o5h5,o5h3,o5h6,o5h1,o5h2,cv.*""",file=f)
        



def chaba_simulation(iter_folder, model_name, model, dataset, kappa, offset):
    nn_output = cv_eval(model, dataset).flatten()
    mu_sknn = np.mean(nn_output)
    var_sknn = np.var(nn_output)
    skew_sknn = stats.skew(nn_output)
    offset += np.sqrt(var_sknn)
    gen_plumed_chaba_biased(model_name=model_name,
                         file_path=".",
                         simulation_folder=iter_folder,
                         pos=mu_sknn,
                         skew=skew_sknn,
                         kappa=kappa,
                         offset=offset)
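The wall placement in `chaba_simulation` and `gen_plumed_chaba_biased` can be summarized as: shift the wall away from the mean of the CV by `offset` plus one standard deviation, on the side chosen by the sign of the skewness. A minimal pure-Python sketch of that arithmetic follows, as a stand-in for the `np.mean`, `np.var`, and `stats.skew` calls; the helper name `wall_from_samples` is hypothetical.

```python
import math
import statistics

def wall_from_samples(values, offset):
    """Return (wall_type, wall_position) for CV samples, mirroring chaba_simulation."""
    values = [float(v) for v in values]
    mean = statistics.fmean(values)
    # Population variance and biased skewness, matching np.var and
    # scipy.stats.skew with their default arguments. Assumes var > 0.
    var = statistics.fmean((v - mean) ** 2 for v in values)
    third = statistics.fmean((v - mean) ** 3 for v in values)
    skew = third / var ** 1.5
    shift = offset + math.sqrt(var)
    if skew < 0:
        return "UPPER_WALLS", mean - shift
    return "LOWER_WALLS", mean + shift

print(wall_from_samples([0, 0, 0, 0, 10], offset=1.0))
```

For a right-skewed sample the result is a lower wall above the mean, and for a left-skewed sample an upper wall below the mean, which is what the generated `wall:` line encodes.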

### 3. Main Workflow

Now we can start our main workflow.

1. Parameter initialization.

In [49]:
# User customized parameters
kappa = 500
n_max_iter = 8
loss_coeff = 0.1
batch_size = 100
offset = 1.0

Optional: to reproduce the results of the preprint, fix the Torch seed to the following value.

In [36]:
torch.manual_seed(22)

<torch._C.Generator at 0x7ffa28e47c50>

2. Clear the history data. 

In [43]:
# DO NOT RUN in the demo: this deletes previous results, logs, and the unbiased trajectory.
subprocess.run([*bash_prefix,f"rm -rf {RESULTS_FOLDER}"], cwd=SCRIPT_DIR)
subprocess.run([*bash_prefix,f"rm -rf {LIGHTNING_LOGS}"], cwd=SCRIPT_DIR)
subprocess.run([*bash_prefix,f"rm -rf {UNBIASED_FOLDER}"], cwd=SCRIPT_DIR)
subprocess.run([*bash_prefix, f"rm -f {kappa}_iter* all*.pdb"], cwd=SCRIPT_DIR)

subprocess.run([*bash_prefix, f"mkdir -p {UNBIASED_FOLDER}"])

subprocess.run([*bash_prefix, f"echo '******************************************************'"], cwd=SCRIPT_DIR)
subprocess.run([*bash_prefix, f"echo Start unbiased simulation"], cwd=SCRIPT_DIR)
subprocess.run([*bash_prefix, f"echo '******************************************************'"], cwd=SCRIPT_DIR)

zsh:1: no matches found: 500_iter*


******************************************************
Start unbiased simulation
******************************************************


CompletedProcess(args=['/bin/zsh', '-c', "echo '******************************************************'"], returncode=0)
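The `no matches found` message above comes from zsh refusing to run a command whose glob pattern matches nothing, in which case none of the patterns in that `rm` call are removed. A minimal sketch of the same cleanup in pure Python follows, which simply skips patterns without matches; the helper name `clear_history` is hypothetical.

```python
import glob
import os
import shutil

def clear_history(base_dir, folders, patterns):
    """Delete `folders` and files matching `patterns` under `base_dir`.

    Returns the list of paths that were actually removed.
    """
    removed = []
    for folder in folders:
        path = os.path.join(base_dir, folder)
        if os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(path)
    for pattern in patterns:
        for path in glob.glob(os.path.join(base_dir, pattern)):
            if os.path.isfile(path):
                os.remove(path)
                removed.append(path)
    return removed

# Hypothetical usage with the names from this tutorial (destructive, so commented out):
# clear_history(SCRIPT_DIR, ["results", "lightning_logs", "unbiased"],
#               [f"{kappa}_iter*", "all*.pdb"])
```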

3. Generate PLUMED input file for unbiased simulation.

In [44]:
gen_plumed_chaba_unbiased()

4. Run unbiased simulation using CP2K.

In [45]:
!cp2k.popt job.inp > output.log

!echo "Unbiased simulation finished"

[login23-x-1.hpc.itc.rwth-aachen.de:168699] mca_base_component_repository_open: unable to open mca_mtl_ofi: libefa.so.1: cannot open shared object file: No such file or directory (ignored)
[login23-x-1.hpc.itc.rwth-aachen.de:168699] pml_ucx.c:309  Error: Failed to create UCP worker
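The shell call above does not report whether CP2K exited successfully. A minimal sketch of running a command with its output captured in a log file and its exit code returned follows; the helper name `run_logged` is hypothetical, and unlike the `>` redirect above it also writes stderr to the log.

```python
import os
import subprocess
import sys
import tempfile

def run_logged(command, log_path, cwd=None):
    """Run `command`, write stdout and stderr to `log_path`, return the exit code."""
    with open(log_path, "w") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT, cwd=cwd)
    return result.returncode

# Stand-in command so the sketch runs without CP2K installed;
# in this tutorial the command would be ["cp2k.popt", "job.inp"].
demo_log = os.path.join(tempfile.gettempdir(), "output_demo.log")
code = run_logged([sys.executable, "-c", "print('simulation placeholder')"], demo_log)
print("exit code:", code)
```

A nonzero exit code can then be used to stop the workflow before the trajectory files are moved or deleted.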


5. Organize the data for further use.

**Do not run the following snippet in the demo.**

In [46]:
subprocess.run([*bash_prefix, "mv Chaba-1.restart newiter.restart"], cwd = SCRIPT_DIR)
subprocess.run([*bash_prefix, "rm -f Chaba*.restart"], cwd=SCRIPT_DIR)
subprocess.run([*bash_prefix, f"mv Chaba-pos-1.pdb {kappa}_iteration_Chaba_unbiased-pos.pdb"], cwd = SCRIPT_DIR)
subprocess.run([*bash_prefix, f"cat {kappa}_iteration_Chaba_unbiased-pos.pdb > all_{kappa}.pdb"], cwd=SCRIPT_DIR)
subprocess.run([*bash_prefix,"rm -f PLUMED.OUT Chaba*"], cwd=SCRIPT_DIR)
subprocess.run([*bash_prefix, f"mkdir -p {RESULTS_FOLDER}"])

CompletedProcess(args=['/bin/zsh', '-c', 'mkdir -p ./results'], returncode=0)
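The combined trajectory `all_{kappa}.pdb` starts as a copy of the unbiased trajectory; the biased loop later in this tutorial appends each new trajectory with `tail -n +3`, that is, without its first two lines. A minimal sketch of the same operation in pure Python follows; the helper name `append_trajectory` is hypothetical.

```python
def append_trajectory(source, destination, skip_lines=0):
    """Append `source` to `destination`, dropping the first `skip_lines` lines."""
    with open(source) as src, open(destination, "a") as dst:
        for index, line in enumerate(src):
            if index >= skip_lines:
                dst.write(line)

# Hypothetical usage with the file names from this tutorial (kappa = 500):
# append_trajectory("500_iteration_Chaba_unbiased-pos.pdb", "all_500.pdb")
# append_trajectory("500_iteration_Chaba_0-pos.pdb", "all_500.pdb", skip_lines=2)
```

Because the function always appends, the destination should not exist before the first call if a fresh combined file is wanted.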

6. Parse the initial unbiased sampling to determine the bond types and the number of input descriptors automatically.


In [47]:
bond_type_dict, n_descriptors,heavy_atom_pairs_list = STADECT.parse_unbiased_colvar(colvar_file = f"{UNBIASED_FOLDER}/COLVAR")
encoder_layers = [n_descriptors, 90, 40, 20, 5, 1]

7. Initialize state detection object.

   This object stores the current and historical states visited by the system during the simulation.

   The initial state must be stable. Otherwise, further thermalization is needed.

In [48]:
state_detection = STADECT.State_detection((0.3, 0.7), bond_type_dict=bond_type_dict, n_heavy_atom_pairs=n_descriptors)

### Biased Simulation

In [50]:
for iter in range(n_max_iter):
    # Train the current model
    state_detection, model, ITER_FOLDER, skewness_dataset, break_flag = chaba_training(state_detection, iter, encoder_layers, loss_coeff, batch_size)
    # For logging
    subprocess.run([*bash_prefix, f"echo '******************************************************'"], cwd=SCRIPT_DIR)
    subprocess.run([*bash_prefix, f"echo At the iteration {iter} training step, "], cwd=SCRIPT_DIR)
    subprocess.run([*bash_prefix, f"echo The current state is {state_detection.current_state}"], cwd=SCRIPT_DIR)
    subprocess.run([*bash_prefix, f"echo '******************************************************'"], cwd=SCRIPT_DIR)

    # Generate PLUMED input files for biased simulation
    model_name = f"{ITER_FOLDER}/model_autoencoder_{iter}.pt"
    chaba_simulation(ITER_FOLDER, model_name, model, skewness_dataset, kappa, offset)
    
    # Run Biased Simulation
    subprocess.run([*bash_prefix,"cp2k.popt job_restart.inp > output.log"], cwd=SCRIPT_DIR)

    # Organize trajectories
    subprocess.run([*bash_prefix, "mv Chaba-1.restart newiter.restart"], cwd = SCRIPT_DIR)
    subprocess.run([*bash_prefix, "rm -f Chaba*.restart"], cwd=SCRIPT_DIR)
    subprocess.run([*bash_prefix, f"mv Chaba-pos-1.pdb {kappa}_iteration_Chaba_{iter}-pos.pdb"], cwd = SCRIPT_DIR)
    subprocess.run([*bash_prefix, f"tail -n +3 {kappa}_iteration_Chaba_{iter}-pos.pdb >> all_{kappa}.pdb"], cwd=SCRIPT_DIR)
    subprocess.run([*bash_prefix,"rm -f PLUMED.OUT Chaba*"], cwd=SCRIPT_DIR)

    # logging
    subprocess.run([*bash_prefix, f"echo CP2K simulation at iteration {iter} with plumed ends"], cwd=SCRIPT_DIR)

current key is time
current key is o4h7
current bond type: h-o
iter = 0, last_rows_mean=0.000898228280932532, key = o4h7
current key is o4h4
current bond type: h-o
iter = 1, last_rows_mean=0.00034557638478034767, key = o4h4
current key is o4h5
current bond type: h-o
iter = 2, last_rows_mean=0.0002530601201132472, key = o4h5
current key is o2h7
current bond type: h-o
iter = 3, last_rows_mean=0.0006335084141182632, key = o2h7
current key is o2h4
current bond type: h-o
iter = 4, last_rows_mean=0.0002009941892113689, key = o2h4
current key is o2h5
current bond type: h-o
iter = 5, last_rows_mean=0.0003105629228685085, key = o2h5
current key is o3h7
current bond type: h-o
iter = 6, last_rows_mean=0.0007031435186932757, key = o3h7
current key is o3h4
current bond type: h-o
iter = 7, last_rows_mean=0.00037918194640370576, key = o3h4
current key is o3h5
current bond type: h-o
iter = 8, last_rows_mean=0.00035721330850010816, key = o3h5
current key is o1h7
current bond type: h-o
iter = 9, last_ro

/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/lib/python3.10/site-packages/lightning/fabric/plugins/environments/slurm.py:204: The `srun` command is available on your system but is not used. HINT: If your intention is to run Lightning on SLURM, prepend your python command with `srun` like so: srun python /rwthfs/rz/cluster/home/yy508225/test_skewencoder_de ...
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/logger_connector/logger_connector.py:76: Starting from v1.9.0, `tensorboardX` has been removed as a dependency of the `lightning.pytorch` package, due to potential conflicts with other packages in the ML ecosystem. For this reason, `logger=True` will use `CSVLogger` as the default logger, unless the `tensorboard` or `tensorboardX` p

Sanity Checking: | 0/? [00:00<?, ?it/s]
dataset_len is [100, 100]
batch_size is [100, 100]
n_batches is [1, 1]

/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py:310: The number of training batches (4) is smaller than the logging interval Trainer(log_every_n_steps=10). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.


dataset_len is [400, 400]
batch_size is [100, 100]
n_batches is [4, 4]
Epoch 0: 100%|██████████| 4/4 [00:00<00:00, 41.99it/s, v_num=0]
Validation DataLoader 0: 100%|██████████████████████████████████████████

[login23-x-1.hpc.itc.rwth-aachen.de:95236] mca_base_component_repository_open: unable to open mca_mtl_ofi: libefa.so.1: cannot open shared object file: No such file or directory (ignored)
[login23-x-1.hpc.itc.rwth-aachen.de:95236] pml_ucx.c:309  Error: Failed to create UCP worker


CP2K simulation at iteration 0 with plumed ends
current key is time
current key is o4h7
current bond type: h-o
iter = 0, last_rows_mean=9.987445281455561e-05, key = o4h7
current key is o4h4
current bond type: h-o
iter = 1, last_rows_mean=0.00023102729091321306, key = o4h4
current key is o4h5
current bond type: h-o
iter = 2, last_rows_mean=0.00019688339781075273, key = o4h5
current key is o2h7
current bond type: h-o
iter = 3, last_rows_mean=0.0001266778395010454, key = o2h7
current key is o2h4
current bond type: h-o
iter = 4, last_rows_mean=0.0001758584459703818, key = o2h4
current key is o2h5
current bond type: h-o
iter = 5, last_rows_mean=0.00010501239849975972, key = o2h5
current key is o3h7
current bond type: h-o
iter = 6, last_rows_mean=0.00011882039136431783, key = o3h7
current key is o3h4
current bond type: h-o
iter = 7, last_rows_mean=0.0002750255590826081, key = o3h4
current key is o3h5
current bond type: h-o
iter = 8, last_rows_mean=0.00026381620230841513, key = o3h5
current k

/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/lib/python3.10/site-packages/lightning/fabric/plugins/environments/slurm.py:204: The `srun` command is available on your system but is not used. HINT: If your intention is to run Lightning on SLURM, prepend your python command with `srun` like so: srun python /rwthfs/rz/cluster/home/yy508225/test_skewencoder_de ...
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Sanity Checking: | 0/? [00:00<?, ?it/s]
dataset_len is [200, 100]
batch_size is [200, 100]
n_batches is [1, 1]

/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/lib/python3.10/site-packages/lightning/pytorch/utilities/data.py:79: Trying to infer the `batch_size` from an ambiguous collection. The batch size we found is 200. To avoid any miscalculations, use `self.log(..., batch_size=batch_size)`.
/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py:310: The number of training batches (4) is smaller than the logging interval Trainer(log_every_n_steps=10). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.


dataset_len is [800, 400]
batch_size is [200, 100]
n_batches is [4, 4]
Epoch 0: 100%|██████████| 4/4 [00:00<00:00, 104.77it/s, v_num=1]
Validation DataLoader 0: 100%|██████████████████████████████████████████

[login23-x-1.hpc.itc.rwth-aachen.de:180459] mca_base_component_repository_open: unable to open mca_mtl_ofi: libefa.so.1: cannot open shared object file: No such file or directory (ignored)
[login23-x-1.hpc.itc.rwth-aachen.de:180459] pml_ucx.c:309  Error: Failed to create UCP worker


CP2K simulation at iteration 1 with plumed ends


current key is time
current key is o4h7
current bond type: h-o
iter = 0, last_rows_mean=0.0005924947525961449, key = o4h7
current key is o4h4
current bond type: h-o
iter = 1, last_rows_mean=0.01306176740474022, key = o4h4
current key is o4h5
current bond type: h-o
iter = 2, last_rows_mean=0.0021055346580593297, key = o4h5
current key is o2h7
current bond type: h-o
iter = 3, last_rows_mean=0.00042406747882687526, key = o2h7
current key is o2h4
current bond type: h-o
iter = 4, last_rows_mean=0.0032973295868675085, key = o2h4
current key is o2h5
current bond type: h-o
iter = 5, last_rows_mean=0.00029686296694043365, key = o2h5
current key is o3h7
current bond type: h-o
iter = 6, last_rows_mean=0.00021816725003917246, key = o3h7
current key is o3h4
current bond type: h-o
iter = 7, last_rows_mean=0.001500284616164011, key = o3h4
current key is o3h5
current bond type: h-o
iter = 8, last_rows_mean=0.000499211069401021, key = o3h5
current key is o1h7
current bond type: h-o
iter = 9, last_rows_

/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/lib/python3.10/site-packages/lightning/pytorch/utilities/data.py:79: Trying to infer the `batch_size` from an ambiguous collection. The batch size we found is 300. To avoid any miscalculations, use `self.log(..., batch_size=batch_size)`.


dataset_len is [1200, 400]
batch_size is [300, 100]
n_batches is [4, 4]
Epoch 0: 100%|██████████| 4/4 [00:00<00:00, 110.68it/s, v_num=2]
Validation DataLoader 0: 100%|█████████████████████████████████████████

[login23-x-1.hpc.itc.rwth-aachen.de:141004] mca_base_component_repository_open: unable to open mca_mtl_ofi: libefa.so.1: cannot open shared object file: No such file or directory (ignored)
[login23-x-1.hpc.itc.rwth-aachen.de:141004] pml_ucx.c:309  Error: Failed to create UCP worker


CP2K simulation at iteration 2 with plumed ends


current key is time
current key is o4h7
current bond type: h-o
iter = 0, last_rows_mean=0.00012639809213490222, key = o4h7
current key is o4h4
current bond type: h-o
iter = 1, last_rows_mean=0.0001517219630931726, key = o4h4
current key is o4h5
current bond type: h-o
iter = 2, last_rows_mean=0.00017755979224570072, key = o4h5
current key is o2h7
current bond type: h-o
iter = 3, last_rows_mean=0.0001858736410693539, key = o2h7
current key is o2h4
current bond type: h-o
iter = 4, last_rows_mean=0.00014504166776114534, key = o2h4
current key is o2h5
current bond type: h-o
iter = 5, last_rows_mean=0.0001309708056361408, key = o2h5
current key is o3h7
current bond type: h-o
iter = 6, last_rows_mean=0.0001638903280453386, key = o3h7
current key is o3h4
current bond type: h-o
iter = 7, last_rows_mean=0.00016512495773865645, key = o3h4
current key is o3h5
current bond type: h-o
iter = 8, last_rows_mean=0.00020906845029736677, key = o3h5
current key is o1h7
current bond type: h-o
iter = 9, last

/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/lib/python3.10/site-packages/lightning/pytorch/utilities/data.py:79: Trying to infer the `batch_size` from an ambiguous collection. The batch size we found is 400. To avoid any miscalculations, use `self.log(..., batch_size=batch_size)`.


dataset_len is [1600, 400]
batch_size is [400, 100]
n_batches is [4, 4]
Epoch 0: 100%|██████████| 4/4 [00:00<00:00, 100.18it/s, v_num=3]
Validation DataLoader 0: 100%|█████████████████████████████████████████

[login23-x-1.hpc.itc.rwth-aachen.de:249442] mca_base_component_repository_open: unable to open mca_mtl_ofi: libefa.so.1: cannot open shared object file: No such file or directory (ignored)
[login23-x-1.hpc.itc.rwth-aachen.de:249442] pml_ucx.c:309  Error: Failed to create UCP worker


CP2K simulation at iteration 3 with plumed ends
current key is time
current key is o4h7
current bond type: h-o
iter = 0, last_rows_mean=0.0014591610269816852, key = o4h7
current key is o4h4
current bond type: h-o
iter = 1, last_rows_mean=0.002161520873598603, key = o4h4
current key is o4h5
current bond type: h-o
iter = 2, last_rows_mean=0.03108809101855617, key = o4h5
current key is o2h7
current bond type: h-o
iter = 3, last_rows_mean=0.00033021143626307657, key = o2h7
current key is o2h4
current bond type: h-o
iter = 4, last_rows_mean=0.0021639410763627906, key = o2h4
current key is o2h5
current bond type: h-o
iter = 5, last_rows_mean=0.0038974004474634986, key = o2h5
current key is o3h7
current bond type: h-o
iter = 6, last_rows_mean=0.0004969889494219978, key = o3h7
current key is o3h4
current bond type: h-o
iter = 7, last_rows_mean=0.000595206080537296, key = o3h4
current key is o3h5
current bond type: h-o
iter = 8, last_rows_mean=0.00789198397413223, key = o3h5
current key is o1h7

Sanity Checking: | 0/? [00:00<?, ?it/s]
dataset_len is [500, 100]
batch_size is [500, 100]
n_batches is [1, 1]

/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/lib/python3.10/site-packages/lightning/pytorch/utilities/data.py:79: Trying to infer the `batch_size` from an ambiguous collection. The batch size we found is 500. To avoid any miscalculations, use `self.log(..., batch_size=batch_size)`.


dataset_len is [2000, 400]
batch_size is [500, 100]
n_batches is [4, 4]
Epoch 0: 100%|██████████| 4/4 [00:00<00:00, 102.42it/s, v_num=4]
Validation DataLoader 0: 100%|█████████████████████████████████████████

[login23-x-1.hpc.itc.rwth-aachen.de:138523] mca_base_component_repository_open: unable to open mca_mtl_ofi: libefa.so.1: cannot open shared object file: No such file or directory (ignored)
[login23-x-1.hpc.itc.rwth-aachen.de:138523] pml_ucx.c:309  Error: Failed to create UCP worker


CP2K simulation at iteration 4 with plumed ends


current key is time
current key is o4h7
current bond type: h-o
iter = 0, last_rows_mean=0.00034527930614822427, key = o4h7
current key is o4h4
current bond type: h-o
iter = 1, last_rows_mean=0.00012250924548636155, key = o4h4
current key is o4h5
current bond type: h-o
iter = 2, last_rows_mean=0.00019322919450015687, key = o4h5
current key is o2h7
current bond type: h-o
iter = 3, last_rows_mean=0.00014024077036310493, key = o2h7
current key is o2h4
current bond type: h-o
iter = 4, last_rows_mean=8.381036882648743e-05, key = o2h4
current key is o2h5
current bond type: h-o
iter = 5, last_rows_mean=0.00010805703815962167, key = o2h5
current key is o3h7
current bond type: h-o
iter = 6, last_rows_mean=0.0003958362599314929, key = o3h7
current key is o3h4
current bond type: h-o
iter = 7, last_rows_mean=0.0001759468042716569, key = o3h4
current key is o3h5
current bond type: h-o
iter = 8, last_rows_mean=0.0005837363501820287, key = o3h5
current key is o1h7
current bond type: h-o
iter = 9, last

/rwthfs/rz/cluster/home/yy508225/test_skewencoder_demo/Tutorial/tutorial4loxodynamics/lib/python3.10/site-packages/lightning/pytorch/utilities/data.py:79: Trying to infer the `batch_size` from an ambiguous collection. The batch size we found is 600. To avoid any miscalculations, use `self.log(..., batch_size=batch_size)`.


dataset_len is [2400, 400]
batch_size is [600, 100]
n_batches is [4, 4]
Epoch 0: 100%|██████████| 4/4 [00:00<00:00, 94.72it/s, v_num=5]
Validation DataLoader 0: 100%|█████████████████████████████████████████

[login23-x-1.hpc.itc.rwth-aachen.de:235037] mca_base_component_repository_open: unable to open mca_mtl_ofi: libefa.so.1: cannot open shared object file: No such file or directory (ignored)
[login23-x-1.hpc.itc.rwth-aachen.de:235037] pml_ucx.c:309  Error: Failed to create UCP worker


KeyboardInterrupt: 