# Introduction

This notebook serves as a comprehensive guide for using the `amaceing_toolkit` to set up and execute various computational simulations. 
It includes step-by-step instructions for building input files, running simulations, and analyzing outputs for different tools such as CP2K, MACE, MatterSim, and SevenNet.
Both the interactive and command-line interfaces are covered, providing flexibility for users with different preferences.

## Table of Contents

1. [CP2K Input Files: GEO_OPT](#A1)
2. [MACE Input Files: MD](#A2)
3. [MACE Input Files: Finetuning](#A3)
4. [MatterSim Input Files: MD](#A4)
5. [MatterSim Input Files: Finetuning](#A5)
6. [SevenNet Input Files: MD](#A6)
7. [SevenNet Input Files: Finetuning](#A7)
8. [MACE Recalculating: Reference Trajectories](#A8)

# Installation 
The installation process takes a few minutes.

In [1]:
try:
    from amaceing_toolkit import amaceing_cp2k
except ImportError:
    print("amaceing_toolkit not found. Please install it using: pip install amaceing_toolkit")

In [2]:
# PREAMBLE
import subprocess
import os
import time

def run_atk(command, answers):
    """
    Run an ATK command with the given answers.
    
    Args:
        command (list[str]): The ATK command to run, given as an argument list.
        answers (str): A multi-line string with one answer per line, in the order the prompts appear.
        
    Returns:
        str | None: The captured standard output, or None if the command failed.
    """

    try:
        # Start the ATK process
        process = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
        
        # Send the answers to the process all at once
        output, error = process.communicate(input=answers)
        
        # Check for errors
        if process.returncode != 0:
            print("Error running command")
            print(error)
            return None
        
        return output
        
    except Exception as e:
        print(f"Error running command: {e}")
        return None
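
The cells in this notebook repeatedly create a directory, change into it, and change back with `os.chdir("..")`. If a cell fails midway, the notebook stays in the wrong directory. A minimal sketch of a context manager that restores the starting directory even on errors could look like this; the name `working_directory` is hypothetical and not part of `amaceing_toolkit`.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Create `path` if needed, chdir into it, and always chdir back afterwards."""
    previous = os.getcwd()
    os.makedirs(path, exist_ok=True)  # no error if the directory already exists
    os.chdir(path)
    try:
        yield os.getcwd()
    finally:
        os.chdir(previous)  # runs even if the body raises

# Usage with a throwaway directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    start = os.getcwd()
    with working_directory(os.path.join(tmp, "cp2k")) as inside:
        print("working in", inside)
    print("returned to start:", os.getcwd() == start)
```

With such a helper, a cell body would be wrapped in `with working_directory("cp2k"):` instead of the paired `os.chdir` calls.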

# A1

## Q&A Process for Building CP2K Input Files

In this section, we will walk through the **Q&A process** used by the `amaceing_cp2k` function to build CP2K input files. This process is normally interactive and asks the user a series of questions to gather the necessary information for generating the input file. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook all answers have to be given in the code cells, but in a real-world scenario, the user would be prompted to answer these questions interactively.

---

### 1. **Coordinate File**
   - **Question**: *What is the name of the coordinate file (or reference trajectory)?*
   - **Example Answers**:
     - `system.xyz` (a file containing atomic coordinates)
     - `traj.xyz` (a reference trajectory file)
     - `/path/to/train.xyz` (an absolute path to the coordinate file)
   - **Tutorial Answer**: `../data/system.xyz`

   The coordinate file is essential for defining the atomic structure of the system. Ensure the file exists in the specified path.

---

### 2. **Box Shape**
   - **Question**: *Is the box cubic? (y/n/pbc)*
   - **Example Answers**:
     - `y` (Yes, the box is cubic)
     - `n` (No, the box is not cubic; dimensions will be specified separately)
     - `pbc` (Provide a file with periodic boundary conditions)
   - **Tutorial Answer**: `y`

   If the box is cubic, the same dimension will be used for all three axes. Otherwise, the dimensions for each axis must be specified.

---

### 3. **Box Dimensions**
   - **Question**: *What is the length of the box in Å?*
   - **Example Answers**:
     - `10.0` (A cubic box with a side length of 10 Å)
     - `15.0` (A cubic box with a side length of 15 Å)
   - **Tutorial Answer**: `14.2067`

   This specifies the size of the simulation box. For non-cubic boxes, dimensions for each axis (x, y, z) would be requested separately.
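
The one-command form used later in this notebook passes the box as a string of three lengths, `'[14.2067 14.2067 14.2067]'`. As a small sketch, the string for a cubic box can be derived from a single side length; the helper name `cubic_pbc_list` is an assumption made for illustration only.

```python
def cubic_pbc_list(length):
    """Return the three-entry box string for a cubic cell of side `length` (in Å)."""
    if length <= 0:
        raise ValueError("box length must be positive")
    # Same length repeated for x, y and z, separated by spaces
    return "[" + " ".join(str(length) for _ in range(3)) + "]"

print(cubic_pbc_list(14.2067))  # [14.2067 14.2067 14.2067]
```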

---

### 4. **Calculation Type**
   - **Question**: *What type of calculation do you want to run? (1=GEO_OPT, 2=CELL_OPT, 3=MD, 4=REFTRAJ, 5=ENERGY)*
   - **Example Answers**:
     - `1` (Geometry optimization)
     - `2` (Cell optimization)
     - `3` (Molecular dynamics)
     - `4` (Reference trajectory recalculation)
     - `5` (Energy Calculation)
   - **Tutorial Answer**: `1` (Geometry optimization)

   The calculation type determines the purpose of the simulation, such as optimizing the geometry, running molecular dynamics, or calculating energy.

---

### 5. **Project Name**
   - **Question**: *What is the name of the project?*
   - **Example Answers**:
     - `MyProject` (A custom project name)
     - `GEO_OPT_20250416` (A default name based on the calculation type and date)
   - **Tutorial Answer**: `MyProject`

   The project name is used to name the output files and organize the results.

---

### 6. **Use Default Input Settings**
   - **Question**: *Do you want to use the default input settings? (y/n)*
   - **Example Answers**:
     - `y` (Yes, use the default settings)
     - `n` (No, customize the settings)
   - **Tutorial Answer**: `n`

   - **Note**: In the Q&A Process the default settings will be presented to the user.

   Choosing `y` will use predefined settings for the selected calculation type. If `n` is chosen, additional questions will be asked to customize the input.

---

### 7. **Small Changes to Default Settings**
   - **Question**: *Do you want to make small changes to the default settings? (y/n)*
   - **Example Answers**:
     - `y` (Yes, make small changes)
     - `n` (No, keep the default settings)
   - **Tutorial Answer**: `y`

   This option allows minor adjustments to the default settings without full customization. The script prints all changeable parameters.
   - **Available Changes**:
     - `(1) max_iter: 1000`
     - `(2) print_forces: n`
     - `(3) xc_functional: BLYP`

--- 

### 8. **Custom Settings**
   - **Question**: *Which setting do you want to change? (Enter the number)*
   - **Example Answers**:
     - `1` (Change `max_iter`)
     - `2` (Change `print_forces`)
     - `3` (Change `xc_functional`)
   - **Tutorial Answer**: `1` (Change `max_iter`)
   - **New Value**: `200` (Set the maximum number of iterations to 200)

--- 
### 9. **Other Custom Settings**
   - **Question**: *Do you want to change another setting? (y/n)*
   - **Example Answers**:
     - `y` (Yes, change another setting)
     - `n` (No, finish customization)
   - **Tutorial Answer**: `n`


### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the CP2K input file.
2. **Validation**: The script validates the user input (e.g., checks if the coordinate file exists).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Input File Creation**: The script writes the CP2K input file using the gathered information and predefined templates.
5. **Output**: The generated input file is saved in the current directory, and a log file is created to document the configuration.
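
The validation step can also be done up front in the notebook, before any answers are piped to the tool. The following is a minimal sketch of such a check; it is not the toolkit's own validation, and `check_coordinate_file` is a hypothetical helper.

```python
import os
import tempfile
from pathlib import Path

def check_coordinate_file(path):
    """Return the resolved path if it is an existing, non-empty file; raise otherwise."""
    p = Path(path).expanduser()
    if not p.is_file():
        raise FileNotFoundError(f"coordinate file not found: {p}")
    if p.stat().st_size == 0:
        raise ValueError(f"coordinate file is empty: {p}")
    return p.resolve()

# Demonstration on a temporary file standing in for ../data/system.xyz
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "system.xyz")
    with open(demo, "w") as handle:
        handle.write("1\ncomment\nH 0.0 0.0 0.0\n")
    print(check_coordinate_file(demo).name)
```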

---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:

1. **Coordinate File**: `../data/system.xyz`
2. **Box Shape**: `y`
3. **Box Dimensions**: `14.2067`
4. **Calculation Type**: `1` (Geometry optimization)
5. **Project Name**: `MyProject`
6. **Use Default Input Settings**: `n`
7. **Small Changes to Default Settings**: `y`
8. **Custom Settings**: `1` (Change `max_iter`) & **New Value**: `200`
9. **Other Custom Settings**: `n`

By following these steps, the script will generate a CP2K input file tailored to the specified parameters. This file can then be used to run simulations with CP2K.
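
Because the answers are sent to the tool as one newline-separated string, their order must match the order of the prompts exactly. A small sketch that builds this string from an annotated list may make the mapping easier to review; `build_answers` is a hypothetical helper, not part of the toolkit.

```python
def build_answers(answers):
    """Join the answers in prompt order, one per line, with a trailing newline."""
    lines = [str(a) for a in answers]
    for line in lines:
        if "\n" in line:
            raise ValueError(f"answer must be a single line: {line!r}")
    return "\n".join(lines) + "\n"

cp2k_answers = build_answers([
    "../data/system.xyz",  # 1. coordinate file
    "y",                   # 2. cubic box
    14.2067,               # 3. box length in Å
    1,                     # 4. GEO_OPT
    "MyProject",           # 5. project name
    "n",                   # 6. do not accept the defaults unchanged
    "y",                   # 7. make small changes
    1,                     # 8. change max_iter ...
    200,                   #    ... to 200
    "n",                   # 9. no further changes
])
print(cp2k_answers, end="")
```

The resulting string can be passed as the `answers` argument of `run_atk`.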

In [3]:
# Set up the directory for the project
try:
    os.mkdir("cp2k")
except FileExistsError:
    print("Directory cp2k already exists, using it.")
os.chdir("cp2k")

# Define the command to run amaceing_cp2k
command = ["amaceing_cp2k"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """../data/system.xyz
y
14.2067
1
MyProject
n
y
1
200
n
"""

# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    


Welcome to the CP2K input file builder!
This tool will help you build input files for CP2K calculations.
Please answer

## Overview of the amaceing_cp2k Output

The `amaceing_cp2k` function generates a CP2K input file based on the provided parameters. The output consists of several key components:

1. **Input File**: `geoopt_cp2k.inp` The main CP2K input file containing all the necessary settings for the calculation.
2. **Log File**: `cp2k_input.log` A log file documenting the configuration and parameters used in the input file. (Gives the possibility to recreate the input file, ...)
3. **Runscript**: `runscript.sh` A shell script to execute the CP2K calculation on the compute node using the generated input file.
4. **Logger**: This run was logged with the implemented logger!

The CP2K calculation can be submitted with one of the following commands, depending on the workload manager:
```bash
# LSF workload manager
bsub < runscript.sh
# SLURM workload manager
sbatch runscript.sh
```
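
Which of the two commands applies depends on the scheduler installed on the cluster. As a rough sketch, the choice can be made by looking for `sbatch` or `bsub` on the `PATH`; the function name and the preference for SLURM when both exist are assumptions for illustration.

```python
import shutil

def submission_command(runscript="runscript.sh", which=shutil.which):
    """Return (argv, needs_stdin) for the first scheduler found on PATH, or None."""
    if which("sbatch"):
        # SLURM takes the script as an argument: sbatch runscript.sh
        return ["sbatch", runscript], False
    if which("bsub"):
        # LSF reads the script from standard input: bsub < runscript.sh
        return ["bsub"], True
    return None

# The result depends on the machine this runs on
print(submission_command())
```

When `needs_stdin` is true, the runscript would be opened and passed as `stdin` to `subprocess.run`.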

## End of Q&A Process
The CP2K calculation is now set up and ready to run! Because of the complex installation of CP2K, the process will not be started in this notebook.


## One-Command Process
The same process can be done in one command using the `amaceing_cp2k` function. Here’s how to use it:

```bash
amaceing_cp2k --run_type="GEO_OPT" --config="{'project_name': 'MyProject', 'coord_file': '../data/system.xyz', 'pbc_list': '[14.2067 14.2067 14.2067]', 'max_iter': '200', 'print_forces': 'OFF', 'xc_functional': 'BLYP', 'cp2k_newer_than_2023x': 'y'}"
```
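
The `--config` value has the same form as the text representation of a Python dictionary with string values. A sketch that builds the argument list from a dictionary avoids hand-written nested quotes and does not need `shell=True`. The helper `one_command_args` is hypothetical; the keys and values are the ones from this section.

```python
config = {
    "project_name": "MyProject",
    "coord_file": "../data/system.xyz",
    "pbc_list": "[14.2067 14.2067 14.2067]",
    "max_iter": "200",
    "print_forces": "OFF",
    "xc_functional": "BLYP",
    "cp2k_newer_than_2023x": "y",
}

def one_command_args(tool, run_type, config):
    """Build an argv list; formatting a dict yields the single-quoted form used by --config."""
    return [tool, f"--run_type={run_type}", f"--config={config}"]

args = one_command_args("amaceing_cp2k", "GEO_OPT", config)
print(args[2])
# To execute (requires the toolkit on PATH): subprocess.run(args, check=True)
```

Passing a list to `subprocess.run` hands each argument to the program unchanged, so the shell-level double quotes of the one-line command are not needed.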


In [4]:
try:
    os.mkdir("cp2k_1command")
except FileExistsError:
    print("Directory cp2k_1command already exists, skipping creation.")
os.chdir("cp2k_1command")

command = """amaceing_cp2k --run_type="GEO_OPT" --config="{'project_name': 'MyProject', 'coord_file': '../data/system.xyz', 'pbc_list': '[14.2067 14.2067 14.2067]', 'max_iter': '200', 'print_forces': 'OFF', 'xc_functional': 'BLYP', 'cp2k_newer_than_2023x': 'y'}" """

# Run 1-Command amaceing_cp2k
subprocess.run(command, shell=True)

os.chdir("..")



    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
Input file geoopt_cp2k.inp created.
Runscript created: runscript.sh

    ┌
    │ If you use aMACEing_toolkit in your res

---
---

# A2

## Q&A Process for Building MACE Input Files: MD

In this section, we will walk through the **Q&A process** used by the `amaceing_mace` function to build MACE input files to run a MD. This process is normally interactive and asks the user a series of questions to gather the necessary information for generating the input file. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook all answers have to be given in the code cells, but in a real-world scenario, the user would be prompted to answer these questions interactively.

---

### 1. **Coordinate File**
   - **Question**: *What is the name of the coordinate file (or reference trajectory)?*
   - **Example Answers**:
     - `system.xyz` (a file containing atomic coordinates)
     - `traj.xyz` (a reference trajectory file)
     - `/path/to/train.xyz` (an absolute path to the coordinate file)
   - **Tutorial Answer**: `../data/system.xyz`

   The coordinate file is essential for defining the atomic structure of the system. Ensure the file exists in the specified path.

---

### 2. **Box Shape**
   - **Question**: *Is the box cubic? (y/n/pbc)*
   - **Example Answers**:
     - `y` (Yes, the box is cubic)
     - `n` (No, the box is not cubic; dimensions will be specified separately)
     - `pbc` (Provide a file with periodic boundary conditions)
   - **Tutorial Answer**: `y`

   If the box is cubic, the same dimension will be used for all three axes. Otherwise, the dimensions for each axis must be specified.

---

### 3. **Box Dimensions**
   - **Question**: *What is the length of the box in Å?*
   - **Example Answers**:
     - `10.0` (A cubic box with a side length of 10 Å)
     - `15.0` (A cubic box with a side length of 15 Å)
   - **Tutorial Answer**: `14.2067`

   This specifies the size of the simulation box. For non-cubic boxes, dimensions for each axis (x, y, z) would be requested separately.

---

### 4. **Calculation Type**
   - **Question**: *What type of calculation do you want to run? (1=GEO_OPT, 2=CELL_OPT, 3=MD, 4=MULTI_MD, 5=RECALC, 6=FINETUNE, 7=FINETUNE_MULTIHEAD)*
   - **Example Answers**:
     - `1` (Geometry optimization)
     - `2` (Cell optimization)
     - `3` (Molecular dynamics)
     - `4` (Multi-configuration MDs)
     - `5` (Reference trajectory evaluation)
     - `6` (Finetuning)
     - `7` (Multihead Finetuning)
   - **Tutorial Answer**: `3` (Molecular dynamics)

   The calculation type determines the purpose of the process, such as optimizing the geometry, running molecular dynamics, or finetuning.

---

### 5. **Simulation Environment**
   - **Question**: *Do you want to use the ASE atomic simulation environment (y) or LAMMPS (n)? (y/n)*
   - **Example Answers**:
     - `y` (for ASE simulations)
     - `n` (for LAMMPS simulations)
   - **Tutorial Answer**: `y`

   This specifies the simulation environment to be used for running the calculations.

---

### 6. **Project Name**
   - **Question**: *What is the name of the project?*
   - **Example Answers**:
     - `MyProject` (A custom project name)
     - `MD_20250422` (A default name based on the calculation type and date)
   - **Tutorial Answer**: `MyMD`

   The project name is used to name the output files and organize the results.

---

### 7. **Use Default Input Settings**
   - **Question**: *Do you want to use the default input settings? (y/n)*
   - **Example Answers**:
     - `y` (Yes, use the default settings)
     - `n` (No, customize the settings)
   - **Tutorial Answer**: `n`

   - **Note**: In the Q&A Process the default settings will be presented to the user.

   Choosing `y` will use predefined settings for the selected calculation type. If `n` is chosen, additional questions will be asked to customize the input.

---

### 8. **Small Changes to Default Settings**
   - **Question**: *Do you want to make small changes to the default settings? (y/n)*
   - **Example Answers**:
     - `y` (Yes, make small changes)
     - `n` (No, keep the default settings)
   - **Tutorial Answer**: `y`

   This option allows minor adjustments to the default settings without full customization. The script prints all changeable parameters.
   - **Available Changes**:
     - `(1) foundation_model: mace_mp`
     - `(2) model_size: small`
     - `(3) dispersion_via_ase: n`
     - `(4) temperature: 300`
     - `(5) pressure: 1.0`
     - `(6) thermostat: Langevin`
     - `(7) nsteps: 2000000`
     - `(8) write_interval: 10`
     - `(9) timestep: 0.5`
     - `(10) log_interval: 100`
     - `(11) print_ext_traj: y`


--- 

### 9. **Custom Settings**
   - **Question**: *Which setting do you want to change? (Enter the number)*
   - **Example Answers**:
     - `1` (Change `foundation_model`)
     - `4` (Change `temperature`)
     - `9` (Change `timestep`)
   - **Tutorial Answer**: `7` (Change `nsteps`)
   - **New Value**: `20` (Set the number of simulation steps to 20)
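
To see what changing `nsteps` means in practice, the settings can be turned into a simulated time and a number of written frames. This is a back-of-the-envelope sketch under two assumptions that the settings list does not state explicitly: that `timestep` is given in femtoseconds, and that `write_interval` and `log_interval` are counted in MD steps.

```python
def md_run_summary(nsteps, timestep_fs, write_interval, log_interval):
    """Summarise an MD run; assumes a timestep in fs and intervals counted in steps."""
    return {
        "simulated_time_fs": nsteps * timestep_fs,
        "frames_written": nsteps // write_interval,  # ignores a possible initial frame
        "log_lines": nsteps // log_interval,
    }

# Tutorial value (nsteps = 20) versus the listed default (nsteps = 2000000)
print(md_run_summary(nsteps=20, timestep_fs=0.5, write_interval=10, log_interval=100))
print(md_run_summary(nsteps=2_000_000, timestep_fs=0.5, write_interval=10, log_interval=100))
```

Under these assumptions the tutorial run covers 10 fs, while the default would cover 1 ns, which explains why the step count is reduced for a quick demonstration.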

--- 
### 10. **Other Custom Settings**
   - **Question**: *Do you want to change another setting? (y/n)*
   - **Example Answers**:
     - `y` (Yes, change another setting)
     - `n` (No, finish customization)
   - **Tutorial Answer**: `n`


### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the MACE input file.
2. **Validation**: The script validates the user input (e.g., checks if the coordinate file exists).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Input File Creation**: The script writes the MACE input file using the gathered information and predefined templates.
5. **Output**: The generated input file is saved in the current directory, and a log file is created to document the configuration.

---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:

1. **Coordinate File**: `../data/system.xyz`
2. **Box Shape**: `y`
3. **Box Dimensions**: `14.2067`
4. **Calculation Type**: `3` (Molecular dynamics)
5. **Simulation Environment**: `y` (ASE)
6. **Project Name**: `MyMD`
7. **Use Default Input Settings**: `n`
8. **Small Changes to Default Settings**: `y`
9. **Custom Settings**: `7` (Change `nsteps`) & **New Value**: `20`
10. **Other Custom Settings**: `n`

By following these steps, the script will generate a MACE input file tailored to the specified parameters. This file can then be used to run simulations with MACE.

In [5]:
# Set up the directory for the project
try:
    os.mkdir("mace_md")
except FileExistsError:
    print("Directory mace_md already exists, skipping creation.")
os.chdir("mace_md")

# Define the command to run amaceing_mace
command = ["amaceing_mace"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """../data/system.xyz
y
14.2067
3
y
MyMD
n
y
7
20
n
"""
# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    


WELCOME TO THE MACE INPUT WRITER!
This tool will help you build input files for the mace framework.
Please answer the 

## Overview of the amaceing_mace Output

The `amaceing_mace` function generates a MACE input file based on the provided parameters. The output consists of several key components:

1. **Input File**: `md_mace.py` The main MACE input file containing all the necessary settings for the calculation.
2. **Log File**: `mace_input.log` A log file documenting the configuration and parameters used in the input file. (Gives the possibility to recreate the input file, ...)
3. **Runscript**: `runscript.sh` A shell script to execute the MACE calculation on the compute node using the generated input file.
4. **GPU-Runscript**: `gpu_script.job` An HPC runscript for specially configured GPU nodes.
5. **Logger**: This run was logged with the implemented logger!

The MACE calculation can be submitted with one of the following commands, depending on the workload manager:
```bash
# LSF workload manager
bsub < runscript.sh
# SLURM workload manager
sbatch runscript.sh
```

## End of Q&A Process
The MACE calculation is now set up and ready to run! We start it on the compute nodes by submitting the `runscript.sh` file, or locally by running the `md_mace.py` file with the command:
```bash
python md_mace.py
```

In [6]:
# Starting the md_mace run
os.chdir("mace_md")
command = "python md_mace.py"

# Run the command
subprocess.run(command, shell=True)
os.chdir("..")

python: can't open file '/scratch/joha4087/scripting/amaceing_toolkit/tutorials/mace_md/md_mace.py': [Errno 2] No such file or directory
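
The error message above reports that `md_mace.py` was not present in `mace_md` when the cell ran. A small guard, sketched here with a hypothetical helper name, checks for the script first and lists the Python files that are actually in the directory, which makes a failed or differently named output easier to spot.

```python
import sys
from pathlib import Path

def python_launch_args(workdir, script_name):
    """Return an argv list for running `script_name` in `workdir`, or raise with a listing."""
    workdir = Path(workdir)
    script = workdir / script_name
    if not script.is_file():
        found = sorted(p.name for p in workdir.glob("*.py")) if workdir.is_dir() else []
        raise FileNotFoundError(
            f"{script} not found; Python files present in {workdir}: {found or 'none'}"
        )
    # Use the interpreter that runs the notebook rather than whatever `python` resolves to
    return [sys.executable, script.name]

try:
    args = python_launch_args("mace_md", "md_mace.py")
    print("would run:", args, "with cwd=mace_md")
except FileNotFoundError as err:
    print(err)
```

The returned list is meant to be passed to `subprocess.run(args, cwd="mace_md")`.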


## One-Command Process
The same process can be done in one command using the `amaceing_mace` function. Here’s how to use it:

```bash
amaceing_mace --run_type="MD" --config="{'project_name': 'MyMD', 'coord_file': '../data/system.xyz', 'pbc_list': '[14.2067 14.2067 14.2067]', 'foundation_model': 'mace_mp', 'model_size': 'small', 'dispersion_via_simenv': 'n', 'temperature': '300', 'pressure': '1.0', 'thermostat': 'Langevin', 'nsteps': '20', 'write_interval': 10, 'timestep': 0.5, 'log_interval': 100, 'print_ext_traj': 'y', 'simulation_environment': 'ase'}"
```

In [7]:
try:
    os.mkdir("mace_md_1command")
except FileExistsError:
    print("Directory mace_md_1command already exists, skipping creation.")
os.chdir("mace_md_1command")

command = """amaceing_mace --run_type="MD" --config="{'project_name': 'MyMD', 'coord_file': '../data/system.xyz', 'pbc_list': '[14.2067 14.2067 14.2067]', 'foundation_model': 'mace_mp', 'model_size': 'small', 'dispersion_via_simenv': 'n', 'temperature': '300', 'pressure': '1.0', 'thermostat': 'Langevin', 'nsteps': '20', 'write_interval': 10, 'timestep': 0.5, 'log_interval': 100, 'print_ext_traj': 'y', 'simulation_environment': 'ase'}" """

# Run 1-Command amaceing_mace 
subprocess.run(command, shell=True)

os.chdir("..")



    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
ASE input file written to: md.py
Runscripts written to: runscript.sh and gpu_script.job
ASE input file written to: md.py

# A3

---
---
## Q&A Process for Building MACE Input Files: Finetuning

In this section, we will walk through the **Q&A process** used by the `amaceing_mace` function to build MACE input files to finetune a foundation model. This process is normally interactive and asks the user a series of questions to gather the necessary information for generating the input file. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook all answers have to be given in the code cells, but in a real-world scenario, the user would be prompted to answer these questions interactively.

---

### 1. **Coordinate File**
   - **Question**: *What is the name of the coordinate file (or reference trajectory)?*
   - **Example Answers**:
     - `system.xyz` (a file containing atomic coordinates)
     - `traj.xyz` (a reference trajectory file)
     - `/path/to/train.xyz` (an absolute path to the coordinate file)
   - **Tutorial Answer**: `../data/train_mace.xyz`

   The coordinate file is essential for defining the atomic structure of the system. Ensure the file exists in the specified path.

---

### 2. **Box Shape**
   - **Question**: *Is the box cubic? (y/n/pbc)*
   - **Example Answers**:
     - `y` (Yes, the box is cubic)
     - `n` (No, the box is not cubic; dimensions will be specified separately)
     - `pbc` (Provide a file with periodic boundary conditions)
   - **Tutorial Answer**: `y`

   If the box is cubic, the same dimension will be used for all three axes. Otherwise, the dimensions for each axis must be specified.

---

### 3. **Box Dimensions**
   - **Question**: *What is the length of the box in Å?*
   - **Example Answers**:
     - `10.0` (A cubic box with a side length of 10 Å)
     - `15.0` (A cubic box with a side length of 15 Å)
   - **Tutorial Answer**: `14.2067`

   This specifies the size of the simulation box. For non-cubic boxes, dimensions for each axis (x, y, z) would be requested separately.

---

### 4. **Calculation Type**
   - **Question**: *What type of calculation do you want to run? (1=GEO_OPT, 2=CELL_OPT, 3=MD, 4=MULTI_MD, 5=RECALC, 6=FINETUNE, 7=FINETUNE_MULTIHEAD)*
   - **Example Answers**:
     - `1` (Geometry optimization)
     - `2` (Cell optimization)
     - `3` (Molecular dynamics)
     - `4` (Multi-configuration MDs)
     - `5` (Reference trajectory evaluation)
     - `6` (Finetuning)
     - `7` (Multihead Finetuning)
   - **Tutorial Answer**: `6` (Finetuning)

   The calculation type determines the purpose of the process, such as optimizing the geometry, running molecular dynamics, or finetuning.

---

### 5. **Project Name**
   - **Question**: *What is the name of the resulting model?*
   - **Example Answers**:
     - `MyProject` (A custom project name)
     - `FINETUNE_20250422` (A default name based on the calculation type and date)
   - **Tutorial Answer**: `MyFTModel`

---
### 6. **Define train file**
   - **Question**: *Do you want to create a training dataset from a force & a position file (y) or did you define it already (n)?*
   - **Example Answers**:
     - `y` (Yes, create a training dataset)
     - `n` (No, use an existing training dataset)
   - **Tutorial Answer**: `n`

---

### 7. **Reduce train dataset size**
   - **Question**: *Do you want to use only a fraction of the dataset (e.g. for testing purposes)? (y/n)*
   - **Example Answers**:
     - `y` (Yes, reduce the size)
     - `n` (No, keep the original size)
   - **Tutorial Answer**: `n`

   This option allows for reducing the size of the training dataset to speed up the finetuning process. If `y` is chosen, additional questions will be asked to specify the reduction parameters.
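
For a quick test it can also help to thin out a trajectory by hand before giving it to the tool. The sketch below is not the toolkit's own reduction; it assumes the plain multi-frame XYZ layout (atom count line, comment line, then one line per atom) and keeps every `stride`-th frame. Both function names are hypothetical.

```python
def split_xyz_frames(text):
    """Split multi-frame XYZ text into frames (each a list of lines)."""
    lines = text.splitlines()
    frames, i = [], 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        natoms = int(lines[i].split()[0])
        frame = lines[i:i + natoms + 2]  # count line + comment line + atom lines
        if len(frame) != natoms + 2:
            raise ValueError(f"truncated frame starting at line {i + 1}")
        frames.append(frame)
        i += natoms + 2
    return frames

def keep_every(frames, stride):
    """Keep every `stride`-th frame, starting with the first."""
    return frames[::stride]

# Six single-atom frames as stand-in data
sample = "\n".join(f"1\nframe {k}\nH 0.0 0.0 {k}.0" for k in range(6))
subset = keep_every(split_xyz_frames(sample), 3)
print([frame[1] for frame in subset])  # ['frame 0', 'frame 3']
```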

---

### 8. **Use Default Input Settings**
   - **Question**: *Do you want to use the default input settings? (y/n)*
   - **Example Answers**:
     - `y` (Yes, use the default settings)
     - `n` (No, customize the settings)
   - **Tutorial Answer**: `n`

   - **Note**: In the Q&A Process the default settings will be presented to the user.

   Choosing `y` will use predefined settings for the selected calculation type. If `n` is chosen, additional questions will be asked to customize the input.

---

### 9. **Small Changes to Default Settings**
   - **Question**: *Do you want to make small changes to the default settings? (y/n)*
   - **Example Answers**:
     - `y` (Yes, make small changes)
     - `n` (No, keep the default settings)
   - **Tutorial Answer**: `y`

   This option allows minor adjustments to the default settings without full customization. The script prints all changeable parameters.
   - **Available Changes**:
     - `(1) device: cuda`
     - `(2) stress_weight: 0.0`
     - `(3) forces_weight: 10.0`
     - `(4) energy_weight: 0.1`
     - `(5) foundation_model: mace_mp`
     - `(6) model_size: small`
     - `(7) prevent_catastrophic_forgetting: n`
     - `(8) batch_size: 5`
     - `(9) valid_batch_size: 2`
     - `(10) valid_fraction: 0.1`
     - `(11) epochs: 200`
     - `(12) seed: 1`
     - `(13) lr: 0.01`
     - `(14) dir: MACE_models`
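
The interplay of `valid_fraction`, `batch_size`, and `valid_batch_size` can be estimated before starting a finetuning run. The following is a rough sketch only: the dataset size of 1000 configurations is a made-up example, and the rounding used for the split is an assumption rather than the behavior of the MACE training code.

```python
import math

def finetune_batches(n_configs, valid_fraction=0.1, batch_size=5, valid_batch_size=2):
    """Estimate split sizes and batches per epoch; the rounding rule is an assumption."""
    n_valid = int(n_configs * valid_fraction)
    n_train = n_configs - n_valid
    return {
        "train_configs": n_train,
        "valid_configs": n_valid,
        "train_batches_per_epoch": math.ceil(n_train / batch_size),
        "valid_batches": math.ceil(n_valid / valid_batch_size),
    }

# Hypothetical dataset of 1000 configurations with the listed default settings
print(finetune_batches(1000))
```

Multiplying the training batches per epoch by `epochs` gives a feel for the total number of optimizer steps, which is why the tutorial lowers `epochs` for a short demonstration.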

--- 

### 10. **Custom Settings**
   - **Question**: *Which setting do you want to change? (Enter the number)*
   - **Example Answers**:
     - `1` (Change `device`)
     - `5` (Change `foundation_model`)
     - `9` (Change `valid_batch_size`)
   - **Tutorial Answer**: `11` (Change `epochs`)
   - **New Value**: `2` (Set the number of epochs to 2)

--- 
### 11. **Other Custom Settings**
   - **Question**: *Do you want to change another setting? (y/n)*
   - **Example Answers**:
     - `y` (Yes, change another setting)
     - `n` (No, finish customization)
   - **Tutorial Answer**: `n`

---
### 12. **XC-Functional of training dataset**
   - **Question**: *What is the exchange-correlation functional used in the production of the training dataset? (1: PBE, 2: PBE_SR, 3: BLYP, 4: BLYP_SR)*
   - **Example Answers**:
     - `1` (PBE: A common functional)
     - `2` (PBE_SR: A short-range version of PBE)
     - `3` (BLYP: A common functional)
     - `4` (BLYP_SR: A short-range version of BLYP)
   - **Tutorial Answer**: `3` (BLYP)

   This specifies the exchange-correlation functional used in the training dataset. It is important for accurate energy and force calculations.

---

### 13. **Log the Model**
   - **Question**: *Do you want to log the model? (y/n)*
   - **Example Answers**:
     - `y` (Yes, log the model)
     - `n` (No, do not log the model)
   - **Tutorial Answer**: `n`

   This option allows for logging the model during the finetuning process. It is useful for tracking changes and performance over time.


### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the MACE input file.
2. **Validation**: The script validates the user input (e.g., checks if the coordinate file exists).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Input File Creation**: The script writes the MACE input file using the gathered information and predefined templates.
5. **Output**: The generated input file is saved in the current directory, and a log file is created to document the configuration.

---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:

1. **Coordinate File**: `../data/train_mace.xyz`
2. **Box Shape**: `y`
3. **Box Dimensions**: `14.2067`
4. **Calculation Type**: `6` (Finetuning)
5. **Project Name**: `MyFTModel`
6. **Define train file**: `n`
7. **Reduce train dataset size**: `n`
8. **Use Default Input Settings**: `n`
9. **Small Changes to Default Settings**: `y`
10. **Custom Settings**: `11` (Change `epochs`) & **New Value**: `2`
11. **Other Custom Settings**: `n`
12. **XC-Functional of training dataset**: `3`
13. **Log the Model**: `n`

By following these steps, the script will generate a MACE input file tailored to the specified parameters. This file can then be used to finetune the foundation model with MACE.

In [8]:
# Set up the directory for the project
try:
    os.mkdir("mace_ft")
except FileExistsError:
    print("Directory mace_ft already exists, skipping creation.")
os.chdir("mace_ft")

# Define the command to run amaceing_mace
command = ["amaceing_mace"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """../data/train_mace.xyz
y
14.2067
6
MyFTModel
n
n
n
y
11
2
n
3
n
"""
# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    


WELCOME TO THE MACE INPUT WRITER!
This tool will help you build input files for the mace framework.
Please answer the 

## Overview of the amaceing_mace Finetune Output

The `amaceing_mace` function generates a MACE input file based on the provided parameters. The output consists of several key components:

1. **Input File**: `finetune.py` The main MACE input file containing all the necessary settings for the calculation.
2. **Config File**: `config_MyFTModel.yml` A configuration file containing all the parameters used in the calculation.
3. **Log File**: `mace_input.log` A log file documenting the configuration and parameters used in the input file. It can be used, for example, to recreate the input file.
4. **Runscript**: `runscript.sh` A shell script to execute the MACE calculation on the compute node using the generated input file.
5. **GPU-Runscript**: `gpu_script.job` An HPC runscript for specially configured GPU nodes.
6. **Model Logger**: This model was logged with the implemented logger for faster reuse of the model.

The MACE finetuning can be submitted with one of the following commands, depending on the workload manager:
```bash
# LSF workload manager
bsub < runscript.sh 
# SLURM workload manager
sbatch runscript.sh
```
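Which of the two submit commands applies depends on the cluster. The following sketch picks the command based on which scheduler client is found on `PATH`; the helper name `submit_command` is hypothetical, and the check only covers the two workload managers mentioned above:

```python
import shutil


def submit_command(script="runscript.sh"):
    """Return the submit command for the first workload manager found on PATH.

    Returns None when neither `bsub` (LSF) nor `sbatch` (SLURM) is available.
    """
    if shutil.which("bsub"):
        # LSF reads the job script from stdin, hence the shell redirection
        return f"bsub < {script}"
    if shutil.which("sbatch"):
        return f"sbatch {script}"
    return None


cmd = submit_command()
print(cmd if cmd else "No LSF or SLURM client found; run the script locally instead.")
```

Because the LSF variant relies on input redirection, the returned string has to be executed through a shell, for example with `subprocess.run(cmd, shell=True)`.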

## End of Q&A Process
The MACE calculation is now set up and ready to run! Start it on the compute nodes by submitting the `runscript.sh` or `gpu_script.job` file, or run it locally by executing the `finetune.py` file directly:
```bash
python finetune.py 
```
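Before launching the run, it can help to confirm that the generated files are actually present in the project directory. This is a minimal sketch; the helper `missing_outputs` is hypothetical, and the file names are the ones listed in the output overview above:

```python
from pathlib import Path

# File names as listed in the output overview above
EXPECTED_FILES = ("finetune.py", "config_MyFTModel.yml", "runscript.sh", "gpu_script.job")


def missing_outputs(directory, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in expected if not (base / name).is_file()]


missing = missing_outputs("mace_ft")
if missing:
    print("Missing files, rerun the input writer first:", ", ".join(missing))
else:
    print("All generated files found; the finetuning run can be started.")
```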

In [None]:
# Starting the finetune run
os.chdir("mace_ft")
command = "python finetune.py"

# Run the command
subprocess.run(command, shell=True)
os.chdir("..")

python: can't open file '/scratch/joha4087/scripting/amaceing_toolkit/tutorials/mace_ft/finetune_mace.py': [Errno 2] No such file or directory


## One-Command Process
The same process can be done in one command using the `amaceing_mace` function. Here’s how to use it:

```bash
amaceing_mace --run_type="FINETUNE" --config="{'project_name': 'MyMD', 'train_file': '../data/train_mace.xyz', 'device': 'cuda', 'stress_weight': 0.0, 'forces_weight': 10.0, 'energy_weight': 0.1, 'foundation_model': 'mace_mp', 'model_size': 'small', 'batch_size': 5, 'prevent_catastrophic_forgetting': 'n', 'valid_fraction': 0.1, 'valid_batch_size': 2, 'epochs': '2', 'seed': 1, 'lr': 0.01, 'dir': 'MACE_models', 'xc_functional_of_dataset': 'BLYP'}"
```
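The nested quotes in this call are easy to break when the command is edited by hand. One possible approach, sketched below, is to keep the settings in a Python dictionary, let `str()` produce the single-quoted form, and pass the arguments as a list so that no shell quoting is involved. The keys and values are copied from the command above; the sketch only builds the argument list and does not call the tool:

```python
import ast

config = {
    "project_name": "MyMD",
    "train_file": "../data/train_mace.xyz",
    "device": "cuda",
    "stress_weight": 0.0,
    "forces_weight": 10.0,
    "energy_weight": 0.1,
    "foundation_model": "mace_mp",
    "model_size": "small",
    "batch_size": 5,
    "prevent_catastrophic_forgetting": "n",
    "valid_fraction": 0.1,
    "valid_batch_size": 2,
    "epochs": "2",
    "seed": 1,
    "lr": 0.01,
    "dir": "MACE_models",
    "xc_functional_of_dataset": "BLYP",
}

# str() on a dict yields the single-quoted form used on the command line
config_str = str(config)

# Argument list for subprocess.run(argv), no shell=True required
argv = ["amaceing_mace", "--run_type=FINETUNE", f"--config={config_str}"]

# Sanity check: the string parses back into the same dictionary
assert ast.literal_eval(config_str) == config
print(argv[-1])
```

Running `subprocess.run(argv)` hands the tool the same arguments as the shell command, because the shell strips the outer double quotes before the program sees them.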

In [None]:
try:
    os.mkdir("mace_ft_1command")
except FileExistsError:
    print("Directory mace_ft_1command already exists, skipping creation.")
os.chdir("mace_ft_1command")

command = """amaceing_mace --run_type="FINETUNE" --config="{'project_name': 'MyMD', 'train_file': '../data/train_mace.xyz', 'device': 'cuda', 'stress_weight': 0.0, 'forces_weight': 10.0, 'energy_weight': 0.1, 'foundation_model': 'mace_mp', 'model_size': 'small', 'batch_size': 5, 'prevent_catastrophic_forgetting': 'n', 'valid_fraction': 0.1, 'valid_batch_size': 2, 'epochs': '2', 'seed': 1, 'lr': 0.01, 'dir': 'MACE_models', 'xc_functional_of_dataset': 'BLYP'}" """

# Run 1-Command amaceing_mace 
subprocess.run(command, shell=True)

os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
0/3 are missing in the precalculated E0 dictionary...
E0 dictionary created using the precalculated data:  {8: -427.8360

# A4

---
---
## Q&A Process for Building MatterSim Input Files: MD

In this section, we will walk through the **Q&A process** used by the `amaceing_mattersim` function to build MatterSim input files for an MD run. This process is normally interactive and asks the user a series of questions to gather the necessary information for generating the input file. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook, all answers are supplied in the code cells; in a real-world scenario, the user would be prompted to answer these questions interactively.

---

### 1. **Coordinate File**
   - **Question**: *What is the name of the coordinate file (or reference trajectory)?*
   - **Example Answers**:
     - `system.xyz` (a file containing atomic coordinates)
     - `traj.xyz` (a reference trajectory file)
     - `/path/to/train.xyz` (an absolute path to the coordinate file)
   - **Tutorial Answer**: `../data/system.xyz`

   The coordinate file is essential for defining the atomic structure of the system. Ensure the file exists in the specified path.

---

### 2. **Box Shape**
   - **Question**: *Is the box cubic? (y/n/pbc)*
   - **Example Answers**:
     - `y` (Yes, the box is cubic)
     - `n` (No, the box is not cubic; dimensions will be specified separately)
     - `pbc` (Provide a file with periodic boundary conditions)
   - **Tutorial Answer**: `y`

   If the box is cubic, the same dimension will be used for all three axes. Otherwise, the dimensions for each axis must be specified.

---

### 3. **Box Dimensions**
   - **Question**: *What is the length of the box in Å?*
   - **Example Answers**:
     - `10.0` (A cubic box with a side length of 10 Å)
     - `15.0` (A cubic box with a side length of 15 Å)
   - **Tutorial Answer**: `14.2067`

   This specifies the size of the simulation box. For non-cubic boxes, dimensions for each axis (x, y, z) would be requested separately.
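For a cubic box, the single length corresponds to a diagonal 3×3 cell matrix. The one-command interface shown later in this section passes this matrix as a flattened, space-separated `pbc_list` string. The sketch below reproduces that string from the box length; the helper name `cubic_pbc_list` is hypothetical, and the string layout is copied from the one-command example in this notebook:

```python
def cubic_pbc_list(length):
    """Return the flattened 3x3 cell matrix of a cubic box as a string."""
    rows = [[length if i == j else 0 for j in range(3)] for i in range(3)]
    flat = [str(value) for row in rows for value in row]
    return "[" + " ".join(flat) + "]"


print(cubic_pbc_list(14.2067))
# [14.2067 0 0 0 14.2067 0 0 0 14.2067]
```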

---

### 4. **Calculation Type**
   - **Question**: *What type of calculation do you want to run? (1=GEO_OPT, 2=CELL_OPT, 3=MD, 4=MULTI_MD, 5=RECALC, 6=FINETUNE)*
   - **Example Answers**:
     - `1` (Geometry optimization)
     - `2` (Cell optimization)
     - `3` (Molecular dynamics)
     - `4` (Multi-configuration MDs)
     - `5` (Reference trajectory evaluation)
     - `6` (Finetuning)
   - **Tutorial Answer**: `3` (Molecular dynamics)

   The calculation type determines the purpose of the process, such as running molecular dynamics or finetuning.
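The numeric answers correspond to the run type names listed in the question. In this notebook, the names `MD` and `FINETUNE` also appear as values of `--run_type` in the one-command interface. A small lookup, written here as a hypothetical helper that is not part of the toolkit, makes the mapping explicit:

```python
RUN_TYPES = {
    "1": "GEO_OPT",
    "2": "CELL_OPT",
    "3": "MD",
    "4": "MULTI_MD",
    "5": "RECALC",
    "6": "FINETUNE",
}


def run_type_from_answer(answer):
    """Translate the numeric menu answer into the run type name."""
    try:
        return RUN_TYPES[answer.strip()]
    except KeyError:
        raise ValueError(f"Unknown calculation type: {answer!r}") from None


print(run_type_from_answer("3"))  # MD
```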

---

### 5. **Project Name**
   - **Question**: *What is the name of the project?*
   - **Example Answers**:
     - `MyProject` (A custom project name)
     - `MD_20250422` (A default name based on the calculation type and date)
   - **Tutorial Answer**: `MyMD`

   The project name is used to name the output files and organize the results.

---

### 6. **Use Default Input Settings**
   - **Question**: *Do you want to use the default input settings? (y/n)*
   - **Example Answers**:
     - `y` (Yes, use the default settings)
     - `n` (No, customize the settings)
   - **Tutorial Answer**: `y`

   - **Note**: In the Q&A Process the default settings will be presented to the user.

   Choosing `y` will use predefined settings for the selected calculation type. If `n` is chosen, additional questions will be asked to customize the input.


### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the MatterSim input file.
2. **Validation**: The script validates the user input (e.g., checks if the coordinate file exists).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Input File Creation**: The script writes the MatterSim input file using the gathered information and predefined templates.
5. **Output**: The generated input file is saved in the current directory, and a log file is created to document the configuration.

---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:

1. **Coordinate File**: `../data/system.xyz`
2. **Box Shape**: `y`
3. **Box Dimensions**: `14.2067`
4. **Calculation Type**: `3` (Molecular dynamics)
5. **Project Name**: `MyMD`
6. **Use Default Input Settings**: `y`

By following these steps, the script will generate a MatterSim input file tailored to the specified parameters. This file can then be used to run simulations with MatterSim.


In [11]:
# Set up the directory for the project
try:
    os.mkdir("mattersim_md")
except FileExistsError:
    print("Directory mattersim_md already exists, skipping creation.")
os.chdir("mattersim_md")

# Define the command to run amaceing_mattersim
command = ["amaceing_mattersim"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """../data/system.xyz
y
14.2067
3
MyMD
y
"""
# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    


WELCOME TO THE MATTERSIM INPUT WRITER!
This tool will help you build input files for the mattersim framework.
Please a

## Overview of the amaceing_mattersim Output

The `amaceing_mattersim` function generates a MatterSim input file based on the provided parameters. The output consists of several key components:

1. **Input File**: `md_mattersim.py` The main MatterSim input file containing all the necessary settings for the calculation.
2. **Log File**: `mattersim_input.log` A log file documenting the configuration and parameters used in the input file. It can be used, for example, to recreate the input file.
3. **Runscript**: `runscript.sh` A shell script to execute the MatterSim calculation on the compute node using the generated input file.
4. **GPU-Runscript**: `gpu_script.job` An HPC runscript for specially configured GPU nodes.
5. **Logger**: This run was logged with the implemented logger!

The MatterSim calculation can be submitted with one of the following commands, depending on the workload manager:
```bash
# LSF workload manager
bsub < runscript.sh
# SLURM workload manager
sbatch runscript.sh
```

## End of Q&A Process
The MatterSim calculation is now set up and ready to run! Because of a dependency conflict between the `mattersim` and `mace` packages, the run is not started in this notebook. (It can be run in a separate environment with only the `mattersim` and `sevennet` packages installed.)
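To see which of the packages are importable in the active environment, a quick check with `importlib` can be used. This is a sketch only: the top-level import names `mace`, `mattersim`, and `sevenn` are assumptions and may differ from the names of the installed distributions.

```python
from importlib.util import find_spec


def installed(modules):
    """Map each top-level module name to True if it can be imported here."""
    return {name: find_spec(name) is not None for name in modules}


# Import names are assumptions; adjust them to the installed distributions
status = installed(["mace", "mattersim", "sevenn"])
for name, available in status.items():
    print(f"{name}: {'found' if available else 'not found'}")
```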

## One-Command Process
The same process can be done in one command using the `amaceing_mattersim` function. Here’s how to use it:

```bash
amaceing_mattersim --run_type="MD" --config="{'project_name': 'MyFTModel', 'coord_file': '../data/system.xyz', 'pbc_list': '[14.2067 0 0 0 14.2067 0 0 0 14.2067]', 'foundation_model': 'large', 'temperature': '300', 'pressure': '1.0', 'thermostat': 'Langevin', 'nsteps': 2000000, 'write_interval': 10, 'timestep': 0.5, 'log_interval': 100, 'print_ext_traj': 'y'}"
```
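The length of the run and the size of its output follow directly from `nsteps`, `timestep`, `write_interval`, and `log_interval`. The short calculation below assumes that `timestep` is given in femtoseconds (the unit is not stated in this notebook) and that the intervals are counted in MD steps:

```python
nsteps = 2_000_000      # 'nsteps' from the config above
timestep_fs = 0.5       # 'timestep', assumed to be in femtoseconds
write_interval = 10     # 'write_interval', assumed to be in steps
log_interval = 100      # 'log_interval', assumed to be in steps

simulated_ns = nsteps * timestep_fs / 1e6   # 1 ns = 1e6 fs
n_frames = nsteps // write_interval
n_log_entries = nsteps // log_interval

print(f"simulated time: {simulated_ns} ns")
print(f"trajectory frames: {n_frames}, log entries: {n_log_entries}")
```

Under these assumptions the settings describe a 1 ns trajectory with 200000 written frames, which is useful to know when estimating runtime and disk usage.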

In [12]:
try:
    os.mkdir("mattersim_md_1command")
except FileExistsError:
    print("Directory mattersim_md_1command already exists, skipping creation.")
os.chdir("mattersim_md_1command")

command = """amaceing_mattersim --run_type="MD" --config="{'project_name': 'MyFTModel', 'coord_file': '../data/system.xyz', 'pbc_list': '[14.2067 0 0 0 14.2067 0 0 0 14.2067]', 'foundation_model': 'large', 'temperature': '300', 'pressure': '1.0', 'thermostat': 'Langevin', 'nsteps': 2000000, 'write_interval': 10, 'timestep': 0.5, 'log_interval': 100, 'print_ext_traj': 'y'}" """

# Run 1-Command amaceing_mattersim 
subprocess.run(command, shell=True)

os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
ASE input file written to: md.py
Runscripts written to: runscript.sh and gpu_script.job
ASE input file written to: md.py

# A5

---
---
## Q&A Process for Building MatterSim Input Files: Finetuning

In this section, we will walk through the **Q&A process** used by the `amaceing_mattersim` function to build MatterSim input files to finetune a foundation model. This process is normally interactive and asks the user a series of questions to gather the necessary information for generating the input file. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook all answers have to be given in the code cells, but in a real-world scenario, the user would be prompted to answer these questions interactively.

---

### 1. **Coordinate File**
   - **Question**: *What is the name of the coordinate file (or reference trajectory)?*
   - **Example Answers**:
     - `system.xyz` (a file containing atomic coordinates)
     - `traj.xyz` (a reference trajectory file)
     - `/path/to/train.xyz` (an absolute path to the coordinate file)
   - **Tutorial Answer**: `../data/dft_energies.xyz`

   The coordinate file is essential for defining the atomic structure of the system. Ensure the file exists in the specified path.

---

### 2. **Box Shape**
   - **Question**: *Is the box cubic? (y/n/pbc)*
   - **Example Answers**:
     - `y` (Yes, the box is cubic)
     - `n` (No, the box is not cubic; dimensions will be specified separately)
     - `pbc` (Provide a file with periodic boundary conditions)
   - **Tutorial Answer**: `y`

   If the box is cubic, the same dimension will be used for all three axes. Otherwise, the dimensions for each axis must be specified.

---

### 3. **Box Dimensions**
   - **Question**: *What is the length of the box in Å?*
   - **Example Answers**:
     - `10.0` (A cubic box with a side length of 10 Å)
     - `15.0` (A cubic box with a side length of 15 Å)
   - **Tutorial Answer**: `14.2067`

   This specifies the size of the simulation box. For non-cubic boxes, dimensions for each axis (x, y, z) would be requested separately.

---

### 4. **Calculation Type**
   - **Question**: *What type of calculation do you want to run? (1=GEO_OPT, 2=CELL_OPT, 3=MD, 4=MULTI_MD, 5=RECALC, 6=FINETUNE)*
   - **Example Answers**:
     - `1` (Geometry optimization)
     - `2` (Cell optimization)
     - `3` (Molecular dynamics)
     - `4` (Multi-configuration MDs)
     - `5` (Reference trajectory evaluation)
     - `6` (Finetuning)
   - **Tutorial Answer**: `6` (Finetuning)

   The calculation type determines the purpose of the process, such as optimizing the geometry, running molecular dynamics, or finetuning.

---

### 5. **Project Name**
   - **Question**: *What is the name of the resulting model?*
   - **Example Answers**:
     - `MyProject` (A custom project name)
     - `FINETUNE_20250422` (A default name based on the calculation type and date)
   - **Tutorial Answer**: `MyFTModel`

---
### 6. **Define train file**
   - **Question**: *Do you want to create a training dataset from a force & a position file (y) or did you define it already (n)?*
   - **Example Answers**:
     - `y` (Yes, create a training dataset)
     - `n` (No, use an existing training dataset)
   - **Tutorial Answer**: `y`

---
### 7. **Name of the force file**
   - **Question**: *What is the name of the force file?*
   - **Example Answers**:
     - `force.xyz` (A file containing forces)
     - `forces.xyz` (Another file containing forces)
   - **Tutorial Answer**: `../data/dft_forces.xyz`
   - **Note**: The created training dataset will be saved in the current directory as `dataset.xyz` and `dataset_trainset.xyz`. (The `dataset_trainset.xyz` file is used for the finetuning process with MatterSim because it contains other keywords, such as `energy` and `forces`.)

   This specifies the name of the force file that will be used to create the training dataset. Ensure the file exists in the specified path.
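Since the training dataset is built from a position file and a force file, both files should describe the same number of frames. The sketch below counts frames under the assumption that the files follow the plain XYZ layout (atom count, comment line, then one line per atom); the helper `count_xyz_frames` is hypothetical and not part of the toolkit:

```python
def count_xyz_frames(lines):
    """Count frames in XYZ-formatted lines (atom count, comment, atom lines)."""
    iterator = iter(lines)
    frames = 0
    for header in iterator:
        if not header.strip():
            continue  # tolerate blank lines between frames
        n_atoms = int(header)
        # Skip the comment line and the n_atoms atom lines of this frame
        for _ in range(n_atoms + 1):
            next(iterator, None)
        frames += 1
    return frames


demo = """2
frame 1
H 0.0 0.0 0.0
H 0.0 0.0 0.74
2
frame 2
H 0.0 0.0 0.0
H 0.0 0.0 0.75
"""
print(count_xyz_frames(demo.splitlines()))  # 2
```

For real files, an open file handle can be passed directly, for example `with open("../data/dft_forces.xyz") as f: n = count_xyz_frames(f)`, and the result compared with the count for the position file.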

---

### 8. **Reduce train dataset size**
   - **Question**: *Do you want to use only a fraction of the dataset (e.g. for testing purposes)? (y/n)*
   - **Example Answers**:
     - `y` (Yes, reduce the size)
     - `n` (No, keep the original size)
   - **Tutorial Answer**: `n`

   This option allows for reducing the size of the training dataset to speed up the finetuning process. If `y` is chosen, additional questions will be asked to specify the reduction parameters.

---

### 9. **Use Default Input Settings**
   - **Question**: *Do you want to use the default input settings? (y/n)*
   - **Example Answers**:
     - `y` (Yes, use the default settings)
     - `n` (No, customize the settings)
   - **Tutorial Answer**: `y`

   - **Note**: In the Q&A Process the default settings will be presented to the user.

   Choosing `y` will use predefined settings for the selected calculation type. If `n` is chosen, additional questions will be asked to customize the input.

---

### 10. **Log the Model**
   - **Question**: *Do you want to log the model? (y/n)*
   - **Example Answers**:
      - `y` (Yes, log the model)
      - `n` (No, do not log the model)
   - **Tutorial Answer**: `n`

   This option allows for logging the model during the finetuning process. It is useful for tracking changes and performance over time.


### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the MatterSim input file.
2. **Validation**: The script validates the user input (e.g., checks if the coordinate file exists).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Input File Creation**: The script writes the MatterSim input file using the gathered information and predefined templates.
5. **Output**: The generated input file is saved in the current directory, and a log file is created to document the configuration.

---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:

1. **Coordinate File**: `../data/dft_energies.xyz`
2. **Box Shape**: `y`
3. **Box Dimensions**: `14.2067`
4. **Calculation Type**: `6` (Finetuning)
5. **Project Name**: `MyFTModel`
6. **Define train file**: `y`
7. **Name of the force file**: `../data/dft_forces.xyz`
8. **Reduce train dataset size**: `n`
9. **Use Default Input Settings**: `y`
10. **Log the Model**: `n`

By following these steps, the script will generate a MatterSim input file tailored to the specified parameters. This file can then be used to run simulations with MatterSim.

In [13]:
# Set up the directory for the project
try:
    os.mkdir("mattersim_ft")
except FileExistsError:
    print("Directory mattersim_ft already exists, skipping creation.")
os.chdir("mattersim_ft")

# Define the command to run amaceing_mattersim
command = ["amaceing_mattersim"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """../data/dft_energies.xyz
y
14.2067
6
MyFTModel
y
../data/dft_forces.xyz
n
y
n
"""

# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")

Error running command
Traceback (most recent call last):
  File "/home/joha4087/anaconda3/envs/atk2/bin/amaceing_mattersim", line 8, in <module>
    sys.exit(amaceing_mattersim())
  File "/scratch/joha4087/scripting/amaceing_toolkit/src/amaceing_toolkit/workflow/input_wrapper.py", line 1487, in atk_mattersim
    writer.main()
  File "/scratch/joha4087/scripting/amaceing_toolkit/src/amaceing_toolkit/workflow/input_wrapper.py", line 90, in main
    self._interactive_mode()
  File "/scratch/joha4087/scripting/amaceing_toolkit/src/amaceing_toolkit/workflow/input_wrapper.py", line 270, in _interactive_mode
    self._configure_finetune(coord_file, pbc_mat, project_name, base_config)
  File "/scratch/joha4087/scripting/amaceing_toolkit/src/amaceing_toolkit/workflow/input_wrapper.py", line 576, in _configure_finetune
    path_to_training_file = create_dataset(coord_file, force_file, self.run_type, pbc_mat)
TypeError: create_dataset() takes 3 positional arguments but 4 were given

None
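The traceback above ends in a `TypeError`: `create_dataset()` accepts three positional arguments but is called with four, which points to a mismatch between the caller and the function definition in the installed toolkit. The stand-alone sketch below reproduces the same class of error with a hypothetical stand-in function (it is not the toolkit's `create_dataset`) and shows how `inspect.signature` reveals the accepted parameters:

```python
import inspect


def create_dataset_stub(coord_file, force_file, pbc_mat):
    """Hypothetical stand-in with three positional parameters."""
    return coord_file, force_file, pbc_mat


print(inspect.signature(create_dataset_stub))

try:
    # Four positional arguments for a three-parameter function
    create_dataset_stub("pos.xyz", "frc.xyz", "FINETUNE", [14.2067])
except TypeError as err:
    message = str(err)
    print(message)
```

When this kind of error appears, checking that the installed `amaceing_toolkit` is a single consistent version is a reasonable first step.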


## Overview of the amaceing_mattersim Finetune Output

The `amaceing_mattersim` function generates a MatterSim input file based on the provided parameters. The output consists of several key components:

1. **GPU-Runscript**: `gpu_script.job` A shell script to execute the MatterSim finetuning process on the compute node with one `torchrun` command.
2. **Dataset**: `dataset.xyz` and `dataset_trainset.xyz` The MatterSim input training files.
3. **Logger**: This run was logged with the implemented logger!

## End of Q&A Process
The MatterSim calculation is now set up and ready to run! Because of a dependency conflict between the `mattersim` and `mace` packages, the run is not started in this notebook. (It can be run in a separate environment with only the `mattersim` and `sevennet` packages installed.)

## One-Command Process
The same process can be done in one command using the `amaceing_mattersim` function. Here’s how to use it:

```bash
amaceing_mattersim --run_type="FINETUNE" --config="{'project_name': 'MyFTModel', 'train_file': '../data/train_7net.xyz', 'device': 'cuda', 'force_loss_ratio': 10.0, 'foundation_model': 'small', 'batch_size': 5, 'save_checkpoint': 'y', 'ckpt_interval': 25, 'epochs': 200, 'seed': 1, 'lr': 0.01, 'save_path': 'MatterSim_models', 'early_stopping': 'n'}"
```

In [14]:
try:
    os.mkdir("mattersim_ft_1command")
except FileExistsError:
    print("Directory mattersim_ft_1command already exists, skipping creation.")
os.chdir("mattersim_ft_1command")

command = """amaceing_mattersim --run_type="FINETUNE" --config="{'project_name': 'MyFTModel', 'train_file': '../data/train_7net.xyz', 'device': 'cuda', 'force_loss_ratio': 10.0, 'foundation_model': 'small', 'batch_size': 5, 'save_checkpoint': 'y', 'ckpt_interval': 25, 'epochs': 200, 'seed': 1, 'lr': 0.01, 'save_path': 'MatterSim_models', 'early_stopping': 'n'}" """

# Run 1-Command amaceing_mattersim 
subprocess.run(command, shell=True)

os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
Runscript for gpu has been written to gpu_script.job
The infos of this model was saved into the finetuned_models.log for

# A6

---
---
## Q&A Process for Building SevenNet Input Files: MD

In this section, we will walk through the **Q&A process** used by the `amaceing_sevennet` function to build SevenNet input files for an MD run. This process is normally interactive and asks the user a series of questions to gather the necessary information for generating the input file. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook, all answers are supplied in the code cells; in a real-world scenario, the user would be prompted to answer these questions interactively.

---

### 1. **Coordinate File**
   - **Question**: *What is the name of the coordinate file (or reference trajectory)?*
   - **Example Answers**:
     - `system.xyz` (a file containing atomic coordinates)
     - `traj.xyz` (a reference trajectory file)
     - `/path/to/train.xyz` (an absolute path to the coordinate file)
   - **Tutorial Answer**: `../data/system.xyz`

   The coordinate file is essential for defining the atomic structure of the system. Ensure the file exists in the specified path.

---

### 2. **Box Shape**
   - **Question**: *Is the box cubic? (y/n/pbc)*
   - **Example Answers**:
     - `y` (Yes, the box is cubic)
     - `n` (No, the box is not cubic; dimensions will be specified separately)
     - `pbc` (Provide a file with periodic boundary conditions)
   - **Tutorial Answer**: `y`

   If the box is cubic, the same dimension will be used for all three axes. Otherwise, the dimensions for each axis must be specified.

---

### 3. **Box Dimensions**
   - **Question**: *What is the length of the box in Å?*
   - **Example Answers**:
     - `10.0` (A cubic box with a side length of 10 Å)
     - `15.0` (A cubic box with a side length of 15 Å)
   - **Tutorial Answer**: `14.2067`

   This specifies the size of the simulation box. For non-cubic boxes, dimensions for each axis (x, y, z) would be requested separately.

---

### 4. **Calculation Type**
   - **Question**: *What type of calculation do you want to run? (1=GEO_OPT, 2=CELL_OPT, 3=MD, 4=MULTI_MD, 5=RECALC, 6=FINETUNE)*
   - **Example Answers**:
     - `1` (Geometry optimization)
     - `2` (Cell optimization)
     - `3` (Molecular dynamics)
     - `4` (Multi-configuration MDs)
     - `5` (Reference trajectory evaluation)
     - `6` (Finetuning)
   - **Tutorial Answer**: `3` (Molecular dynamics)

   The calculation type determines the purpose of the process, such as running molecular dynamics or finetuning.

---

### 5. **Simulation Environment**
   - **Question**: *Do you want to use the ASE atomic simulation environment (y) or LAMMPS (n)? (y/n)*
   - **Example Answers**:
     - `y` (for ASE simulations)
     - `n` (for LAMMPS simulations)
   - **Tutorial Answer**: `y`

   This specifies the simulation environment to be used for running the calculations.

---

### 6. **Project Name**
   - **Question**: *What is the name of the project?*
   - **Example Answers**:
     - `MyProject` (A custom project name)
     - `MD_20250422` (A default name based on the calculation type and date)
   - **Tutorial Answer**: `MyMD`

   The project name is used to name the output files and organize the results.

---

### 7. **Use Default Input Settings**
   - **Question**: *Do you want to use the default input settings? (y/n)*
   - **Example Answers**:
     - `y` (Yes, use the default settings)
     - `n` (No, customize the settings)
   - **Tutorial Answer**: `y`

   - **Note**: In the Q&A Process the default settings will be presented to the user.

   Choosing `y` will use predefined settings for the selected calculation type. If `n` is chosen, additional questions will be asked to customize the input.


### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the SevenNet input file.
2. **Validation**: The script validates the user input (e.g., checks if the coordinate file exists).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Input File Creation**: The script writes the SevenNet input file using the gathered information and predefined templates.
5. **Output**: The generated input file is saved in the current directory, and a log file is created to document the configuration.

---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:

1. **Coordinate File**: `../data/system.xyz`
2. **Box Shape**: `y`
3. **Box Dimensions**: `14.2067`
4. **Calculation Type**: `3` (Molecular dynamics)
5. **Simulation Environment**: `y` (ASE)
6. **Project Name**: `MyMD`
7. **Use Default Input Settings**: `y`

By following these steps, the script will generate a SevenNet input file tailored to the specified parameters. This file can then be used to run simulations with SevenNet.
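The code cells in this notebook change into the project directory with `os.chdir` and back again afterwards. If an exception is raised in between, the notebook stays in the subdirectory and relative paths such as `../data/system.xyz` no longer resolve in later cells. A context manager that always restores the previous directory avoids this; the name `working_directory` is hypothetical, and the demo uses a temporary directory so that it leaves no files behind:

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Create `path` if needed, change into it, and always change back."""
    previous = os.getcwd()
    os.makedirs(path, exist_ok=True)
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(os.path.join(tmp, "sevennet_md")):
        print("inside:", os.path.basename(os.getcwd()))
print("restored:", os.getcwd() == start)
```

In the notebook, the pattern would read `with working_directory("sevennet_md"): print(run_atk(command, answers))`.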


In [None]:
# Set up the directory for the project
try:
    os.mkdir("sevennet_md")
except FileExistsError:
    print("Directory sevennet_md already exists, skipping creation.")
os.chdir("sevennet_md")

# Define the command to run amaceing_sevennet
command = ["amaceing_sevennet"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """../data/system.xyz
y
14.2067
3
y
MyMD
y
"""
# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")

Error running command
Traceback (most recent call last):
  File "/home/joha4087/anaconda3/envs/atk2/bin/amaceing_sevennet", line 8, in <module>
    sys.exit(amaceing_sevennet())
  File "/scratch/joha4087/scripting/amaceing_toolkit/src/amaceing_toolkit/workflow/input_wrapper.py", line 1479, in atk_sevennet
    writer.main()
  File "/scratch/joha4087/scripting/amaceing_toolkit/src/amaceing_toolkit/workflow/input_wrapper.py", line 90, in main
    self._interactive_mode()
  File "/scratch/joha4087/scripting/amaceing_toolkit/src/amaceing_toolkit/workflow/input_wrapper.py", line 274, in _interactive_mode
    self._configure_simulation_run(coord_file, pbc_mat, project_name, base_config)
  File "/scratch/joha4087/scripting/amaceing_toolkit/src/amaceing_toolkit/workflow/input_wrapper.py", line 432, in _configure_simulation_run
    self.config, use_default = self._default_config_loader()
  File "/scratch/joha4087/scripting/amaceing_toolkit/src/amaceing_toolkit/workflow/input_wrapper.py", line 81

## Overview of the amaceing_sevennet Output

The `amaceing_sevennet` function generates a SevenNet input file based on the provided parameters. The output consists of several key components:

1. **Input File**: `md_sevennet.py` The main SevenNet input file containing all the necessary settings for the calculation.
2. **Log File**: `sevennet_input.log` A log file documenting the configuration and parameters used in the input file. (It can be used to recreate the input file, ...)
3. **Runscript**: `runscript.sh` A shell script to execute the SevenNet calculation on the compute node using the generated input file.
4. **GPU-Runscript**: `gpu_script.job` An HPC runscript for specially configured GPU nodes.
5. **Logger**: This run was logged with the implemented logger!

The SevenNet calculation can be run using one of the following commands:
```bash
# LSF workload manager
bsub < runscript.sh
# SLURM workload manager
sbatch runscript.sh
```
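
Which of the two commands applies depends on the workload manager installed on the cluster. As a rough sketch, the hypothetical helper `submit_command` (not part of `amaceing_toolkit`) looks for `sbatch` or `bsub` on the `PATH` and returns the matching call. It assumes that at most one of the two schedulers is present and prefers SLURM if both are found:

```python
import shutil


def submit_command(runscript="runscript.sh", which=shutil.which):
    """Return (argv, stdin_file) for the first scheduler found, or None."""
    if which("sbatch"):
        # SLURM takes the script as an argument: sbatch runscript.sh
        return ["sbatch", runscript], None
    if which("bsub"):
        # LSF reads the script from standard input: bsub < runscript.sh
        return ["bsub"], runscript
    return None


# Possible usage:
# found = submit_command()
# if found is None:
#     print("Neither sbatch nor bsub was found on this machine.")
```

The `which` parameter exists only so that the lookup can be replaced in a test; in normal use the default `shutil.which` is sufficient.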

## End of Q&A Process
The SevenNet calculation is now set up and ready to run! Because the `sevennet` and `mace` packages have conflicting dependencies, the calculation is not started in this notebook. (It can be run in a separate environment in which only the `mattersim` and `sevennet` packages are installed.)

## One-Command Process
The same process can be done in one command using the `amaceing_sevennet` function. Here’s how to use it:

```bash
amaceing_sevennet --run_type="MD" --config="{'project_name': 'MyMD', 'coord_file': '../data/system.xyz', 'pbc_list': '[14.2067 0 0 0 14.2067 0 0 0 14.2067]', 'foundation_model': '7net-mf-ompa', 'modal': 'mpa', 'dispersion_via_simenv': 'n', 'temperature': '300', 'pressure': '1.0', 'thermostat': 'Langevin', 'nsteps': 2000000, 'write_interval': 10, 'timestep': 0.5, 'log_interval': 100, 'print_ext_traj': 'y', 'simulation_environment': 'ase'}"
```
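
The `--config` value is written as a Python dictionary literal, so it can also be assembled from a real dictionary instead of being typed as one long string. The following sketch uses a hypothetical helper, `build_atk_command`, and returns an argument list that can be passed to `subprocess.run` without `shell=True`, which avoids the nested shell quoting. The keys and values are copied from the command above:

```python
def build_atk_command(tool, run_type, config):
    """Return an argv list equivalent to: tool --run_type="..." --config="{...}"."""
    # str(dict) produces the single-quoted dictionary literal used above
    return [tool, f"--run_type={run_type}", f"--config={config}"]


config = {
    "project_name": "MyMD",
    "coord_file": "../data/system.xyz",
    "pbc_list": "[14.2067 0 0 0 14.2067 0 0 0 14.2067]",
    "foundation_model": "7net-mf-ompa",
    "modal": "mpa",
    "dispersion_via_simenv": "n",
    "temperature": "300",
    "pressure": "1.0",
    "thermostat": "Langevin",
    "nsteps": 2000000,
    "write_interval": 10,
    "timestep": 0.5,
    "log_interval": 100,
    "print_ext_traj": "y",
    "simulation_environment": "ase",
}

command = build_atk_command("amaceing_sevennet", "MD", config)
print(command[2])

# Possible usage: subprocess.run(command)
```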

In [16]:
try:
    os.mkdir("sevennet_md_1command")
except FileExistsError:
    print("Directory sevennet_md_1command already exists, skipping creation.")
os.chdir("sevennet_md_1command")

command = """amaceing_sevennet --run_type="MD" --config="{'project_name': 'MyMD', 'coord_file': '../data/system.xyz', 'pbc_list': '[14.2067 0 0 0 14.2067 0 0 0 14.2067]', 'foundation_model': '7net-mf-ompa', 'modal': 'mpa', 'dispersion_via_simenv': 'n', 'temperature': '300', 'pressure': '1.0', 'thermostat': 'Langevin', 'nsteps': 2000000, 'write_interval': 10, 'timestep': 0.5, 'log_interval': 100, 'print_ext_traj': 'y', 'simulation_environment': 'ase'}" """

# Run 1-Command amaceing_sevennet
subprocess.run(command, shell=True)

os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
ASE input file written to: md.py
Runscripts written to: runscript.sh and gpu_script.job
ASE input file written to: md.py

# A7

---
---
## Q&A Process for Building SevenNet Input Files: Finetuning

In this section, we will walk through the **Q&A process** used by the `amaceing_sevennet` function to build SevenNet input files to finetune a foundation model. This process is normally interactive and asks the user a series of questions to gather the necessary information for generating the input file. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook all answers have to be given in the code cells, but in a real-world scenario, the user would be prompted to answer these questions interactively.

---

### 1. **Coordinate File**
   - **Question**: *What is the name of the coordinate file (or reference trajectory)?*
   - **Example Answers**:
     - `system.xyz` (a file containing atomic coordinates)
     - `traj.xyz` (a reference trajectory file)
     - `/path/to/train.xyz` (an absolute path to the coordinate file)
   - **Tutorial Answer**: `../data/train_7net.xyz`

   The coordinate file is essential for defining the atomic structure of the system. Ensure the file exists in the specified path.

---

### 2. **Box Shape**
   - **Question**: *Is the box cubic? (y/n/pbc)*
   - **Example Answers**:
     - `y` (Yes, the box is cubic)
     - `n` (No, the box is not cubic; dimensions will be specified separately)
     - `pbc` (Provide a file with periodic boundary conditions)
   - **Tutorial Answer**: `y`

   If the box is cubic, the same dimension will be used for all three axes. Otherwise, the dimensions for each axis must be specified.

---

### 3. **Box Dimensions**
   - **Question**: *What is the length of the box in Å?*
   - **Example Answers**:
     - `10.0` (A cubic box with a side length of 10 Å)
     - `15.0` (A cubic box with a side length of 15 Å)
   - **Tutorial Answer**: `14.2067`

   This specifies the size of the simulation box. For non-cubic boxes, dimensions for each axis (x, y, z) would be requested separately.

---

### 4. **Calculation Type**
   - **Question**: *What type of calculation do you want to run? (1=GEO_OPT, 2=CELL_OPT, 3=MD, 4=MULTI_MD, 5=RECALC, 6=FINETUNE)*
   - **Example Answers**:
     - `1` (Geometry optimization)
     - `2` (Cell optimization)
     - `3` (Molecular dynamics)
     - `4` (Multi-configuration MDs)
     - `5` (Reference trajectory evaluation)
     - `6` (Finetuning)

   - **Tutorial Answer**: `6` (Finetuning)

   The calculation type determines the purpose of the process, such as optimizing the geometry, running molecular dynamics, or finetuning.

---

### 5. **Project Name**
   - **Question**: *What is the name of the resulting model?*
   - **Example Answers**:
     - `MyProject` (A custom project name)
     - `FINETUNE_20250422` (A default name based on the calculation type and date)
   - **Tutorial Answer**: `MyFTModel`

---
### 6. **Define train file**
   - **Question**: *Do you want to create a training dataset from a force & a position file (y) or did you define it already (n)?*
   - **Example Answers**:
     - `y` (Yes, create a training dataset)
     - `n` (No, use an existing training dataset)
   - **Tutorial Answer**: `n`

---

### 7. **Reduce train dataset size**
   - **Question**: *Do you want to use only a fraction of the dataset (e.g. for testing purposes)? (y/n)*
   - **Example Answers**:
     - `y` (Yes, reduce the size)
     - `n` (No, keep the original size)
   - **Tutorial Answer**: `n`

   This option allows for reducing the size of the training dataset to speed up the finetuning process. If `y` is chosen, additional questions will be asked to specify the reduction parameters.
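
How the toolkit selects the fraction is not shown in this tutorial. Purely as an illustration of the idea, the sketch below splits XYZ text into frames and keeps every n-th frame. The function names are hypothetical, and it assumes the usual XYZ layout in which each frame consists of an atom-count line, a comment line, and one line per atom:

```python
def split_xyz_frames(text):
    """Split XYZ text into frames (atom count, comment line, atom lines)."""
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip blank lines between frames or at the end of the file
            continue
        n_atoms = int(lines[i].split()[0])
        frames.append(lines[i:i + n_atoms + 2])
        i += n_atoms + 2
    return frames


def take_every(frames, stride):
    """Keep every `stride`-th frame, starting with the first one."""
    return frames[::stride]


# Possible usage:
# with open("../data/train_7net.xyz") as handle:
#     frames = split_xyz_frames(handle.read())
# reduced = take_every(frames, 10)
```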

### 8. **Use Default Input Settings**
   - **Question**: *Do you want to use the default input settings? (y/n)*
   - **Example Answers**:
     - `y` (Yes, use the default settings)
     - `n` (No, customize the settings)
   - **Tutorial Answer**: `y`

   - **Note**: In the Q&A Process the default settings will be presented to the user.

   Choosing `y` will use predefined settings for the selected calculation type. If `n` is chosen, additional questions will be asked to customize the input.

---

### 9. **Log the Model**
   - **Question**: *Do you want to log the model? (y/n)*
   - **Example Answers**:
      - `y` (Yes, log the model)
      - `n` (No, do not log the model)
   - **Tutorial Answer**: `n`

   This option allows for logging the model during the finetuning process. It is useful for tracking changes and performance over time.


### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the SevenNet input file.
2. **Validation**: The script validates the user input (e.g., checks if the coordinate file exists).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Input File Creation**: The script writes the SevenNet input file using the gathered information and predefined templates.
5. **Output**: The generated input file is saved in the current directory, and a log file is created to document the configuration.

---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:

1. **Coordinate File**: `../data/train_7net.xyz`
2. **Box Shape**: `y`
3. **Box Dimensions**: `14.2067`
4. **Calculation Type**: `6` (Finetuning)
5. **Project Name**: `MyFTModel`
6. **Define train file**: `n`
7. **Reduce train dataset size**: `n`
8. **Use Default Input Settings**: `y`
9. **Log the Model**: `n`

By following these steps, the script will generate a SevenNet input file tailored to the specified parameters. This file can then be used to finetune the foundation model with SevenNet.
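
Because the answers are passed to the tool as one newline-separated string, their order matters. As a small sketch, the string can be built from labelled pairs so that each answer stays next to the question it belongs to. The labels are only for the reader, and the values are the ones passed in the code cell of this section:

```python
# (question label, answer) pairs in the order the tool asks them
qa_pairs = [
    ("Coordinate File", "../data/train_7net.xyz"),
    ("Box Shape", "y"),
    ("Box Dimensions", "14.2067"),
    ("Calculation Type", "6"),
    ("Project Name", "MyFTModel"),
    ("Define train file", "n"),
    ("Reduce train dataset size", "n"),
    ("Use Default Input Settings", "y"),
    ("Log the Model", "n"),
]

# One answer per line, each terminated by a newline
answers = "".join(f"{answer}\n" for _, answer in qa_pairs)
print(answers)
```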

In [17]:
# Set up the directory for the project
try:
    os.mkdir("sevennet_ft")
except FileExistsError:
    print("Directory sevennet_ft already exists, skipping creation.")
os.chdir("sevennet_ft")

# Define the command to run amaceing_sevennet
command = ["amaceing_sevennet"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """../data/train_7net.xyz
y
14.2067
6
MyFTModel
n
n
y
n
"""

# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    


WELCOME TO THE SEVENNET INPUT WRITER!
This tool will help you build input files for the sevennet framework.
Please ans

## Overview of the amaceing_sevennet Finetune Output

The `amaceing_sevennet` function generates a SevenNet input file based on the provided parameters. The output consists of several key components:

1. **Input File**: `finetune_sevennet.py` The main SevenNet input file containing all the necessary settings for the calculation.
2. **Log File**: `sevennet_input.log` A log file documenting the configuration and parameters used in the input file. (It can be used to recreate the input file, ...)
3. **Runscript**: `runscript.sh` A shell script to execute the SevenNet calculation on the compute node using the generated input file.
4. **GPU-Runscript**: `gpu_script.job` An HPC runscript for specially configured GPU nodes.
5. **Model Logger**: This model was logged with the implemented logger for faster reuse of the model.

The SevenNet finetuning can be run using one of the following commands:
```bash
# LSF workload manager
bsub < runscript.sh 
# SLURM workload manager
sbatch runscript.sh
```

## End of Q&A Process
The SevenNet finetuning is now set up and ready to run! Because the `sevennet` and `mace` packages have conflicting dependencies, the finetuning is not started in this notebook. (It can be run in a separate environment in which only the `mattersim` and `sevennet` packages are installed.)

## One-Command Process
The same process can be done in one command using the `amaceing_sevennet` function. Here’s how to use it:

```bash
amaceing_sevennet --run_type="FINETUNE" --config="{'project_name': 'MyFTModel', 'foundation_model': '7net-0', 'device': 'cuda', 'train_file': '../data/train_7net.xyz', 'batch_size': 4, 'epochs': 100, 'seed': 1, 'lr': 0.01, 'force_loss_ratio': 1.0}"
```

In [18]:
try:
    os.mkdir("sevennet_ft_1command")
except FileExistsError:
    print("Directory sevennet_ft_1command already exists, skipping creation.")
os.chdir("sevennet_ft_1command")

command = """amaceing_sevennet --run_type="FINETUNE" --config="{'project_name': 'MyFTModel', 'foundation_model': '7net-0', 'device': 'cuda', 'train_file': '../data/train_7net.xyz', 'batch_size': 4, 'epochs': 100, 'seed': 1, 'lr': 0.01, 'force_loss_ratio': 1.0}" """

# Run 1-Command amaceing_sevennet 
subprocess.run(command, shell=True)

os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
Runscript for gpu has been written to gpu_script.job
The infos of this model was saved into the finetuned_models.log for

# A8

---
---
## Q&A Process for Recalculating Reference Trajectories with MACE

In this section, we will walk through the **Q&A process** used by the `amaceing_mace` function to run a MACE Recalculation. This process is normally interactive and asks the user a series of questions to gather the necessary information for generating the input file. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook all answers have to be given in the code cells, but in a real-world scenario, the user would be prompted to answer these questions interactively.

---
### 1. **Coordinate File**
   - **Question**: *What is the name of the coordinate file (or reference trajectory)?*
   - **Example Answers**:
     - `system.xyz` (a file containing atomic coordinates)
     - `traj.xyz` (a reference trajectory file)
     - `/path/to/train.xyz` (an absolute path to the coordinate file)
   - **Tutorial Answer**: `../data/traj.xyz`

   The coordinate file is essential for defining the atomic structure of the system. Ensure the file exists in the specified path.

---

### 2. **Box Shape**
   - **Question**: *Is the box cubic? (y/n/pbc)*
   - **Example Answers**:
     - `y` (Yes, the box is cubic)
     - `n` (No, the box is not cubic; dimensions will be specified separately)
     - `pbc` (Provide a file with periodic boundary conditions)
   - **Tutorial Answer**: `y`

   If the box is cubic, the same dimension will be used for all three axes. Otherwise, the dimensions for each axis must be specified.

---

### 3. **Box Dimensions**
   - **Question**: *What is the length of the box in Å?*
   - **Example Answers**:
     - `10.0` (A cubic box with a side length of 10 Å)
     - `15.0` (A cubic box with a side length of 15 Å)
   - **Tutorial Answer**: `14.2067`

   This specifies the size of the simulation box. For non-cubic boxes, dimensions for each axis (x, y, z) would be requested separately.
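
The one-command examples in this tutorial describe the same cubic box through a `pbc_list` of nine numbers, with the box length on the diagonal and zeros elsewhere. As a sketch under that assumption, the hypothetical helper `cubic_pbc_list` builds this string from a single box length:

```python
def cubic_pbc_list(length):
    """Return the nine cell-matrix entries of a cubic box as a bracketed string."""
    entries = [
        str(length) if row == col else "0"
        for row in range(3)
        for col in range(3)
    ]
    return "[" + " ".join(entries) + "]"


print(cubic_pbc_list(14.2067))  # [14.2067 0 0 0 14.2067 0 0 0 14.2067]
```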

---

### 4. **Calculation Type**
   - **Question**: *What type of calculation do you want to run? (1=GEO_OPT, 2=CELL_OPT, 3=MD, 4=MULTI_MD, 5=RECALC, 6=FINETUNE, 7=FINETUNE_MULTIHEAD)*
   - **Example Answers**:
     - `1` (Geometry optimization)
     - `2` (Cell optimization)
     - `3` (Molecular dynamics)
     - `4` (Multi-configuration MDs)
     - `5` (Reference trajectory evaluation)
     - `6` (Finetuning)
     - `7` (Multihead Finetuning)
   - **Tutorial Answer**: `5` (Recalculation)

   The calculation type determines the purpose of the process, such as optimizing the geometry, running molecular dynamics, or finetuning.

---
### 5. **Simulation Environment**
   - **Question**: *Do you want to use the ASE atomic simulation environment (y) or LAMMPS (n)? (y/n)*
   - **Example Answers**:
     - `y` (for ASE simulations)
     - `n` (for LAMMPS simulations)
   - **Tutorial Answer**: `y`

   This specifies the simulation environment to be used for running the calculations. 

---
### 6. **Project Name**
   - **Question**: *What is the name of the project?*
   - **Example Answers**:
     - `MyProject` (A custom project name)
     - `RECALC_20250422` (A default name based on the calculation type and date)
   - **Tutorial Answer**: `MyRecalc`

   The project name is used to name the output files and organize the results.

---

### 7. **Use Default Input Settings**
   - **Question**: *Do you want to use the default input settings? (y/n)*
   - **Example Answers**:
     - `y` (Yes, use the default settings)
     - `n` (No, customize the settings)
   - **Tutorial Answer**: `y`

   - **Note**: In the Q&A Process the default settings will be presented to the user.

   Choosing `y` will use predefined settings for the selected calculation type. If `n` is chosen, additional questions will be asked to customize the input.


### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the MACE input file.
2. **Validation**: The script validates the user input (e.g., checks if the coordinate file exists).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Recalculation**: After the configuration is set, the script will run the MACE recalculation using the specified parameters. 
5. **Output**: The generated output files are saved in the current directory, and a log file is created to document the configuration.


---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:

1. **Coordinate File**: `../data/traj.xyz`
2. **Box Shape**: `y`
3. **Box Dimensions**: `14.2067`
4. **Calculation Type**: `5` (Recalculation)
5. **Simulation Environment**: `y` (ASE)
6. **Project Name**: `MyRecalc`
7. **Use Default Input Settings**: `y`

By following these steps, the script will recalculate the reference trajectory with MACE.

In [25]:
# Set up the directory for the project
try:
    os.mkdir("mace_recalc")
except FileExistsError:
    print("Directory mace_recalc already exists, skipping creation.")
os.chdir("mace_recalc")

# Define the command to run amaceing_mace
command = ["amaceing_mace"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """../data/traj.xyz
y
14.2067
5
y
MyRecalc
y
n
"""
# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    


WELCOME TO THE MACE INPUT WRITER!
This tool will help you build input files for the mace framework.
Please answer the 

## Overview of the amaceing_mace Output

The `amaceing_mace` function performs a recalculation with MACE based on the provided parameters. The output consists of several key components:

1. **Input File**: `recalc_mace.py` The main MACE input file containing all the necessary settings for the calculation.
2. **Log File**: `mace_input.log` A log file documenting the configuration and parameters used in the input file. (It can be used to recreate the input file, ...)
3. **Energies**: `energies_recalc_with_mace_model_<PROJECT_NAME>` The MACE output file containing the recalculated energies.
4. **Forces**: `forces_recalc_with_mace_model_<PROJECT_NAME>.xyz` The MACE output file containing the recalculated forces.
5. **Logger**: This run was logged with the implemented logger!
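
After the run, it can be useful to confirm that the result files are actually present. The sketch below uses two hypothetical helpers and takes the log, energy, and force file names from the list above. Those names are an assumption here and may differ between toolkit versions, so adjust them to what appears in your directory:

```python
from pathlib import Path


def expected_recalc_outputs(project_name):
    """File names as listed above (assumed; they may differ between versions)."""
    return [
        "mace_input.log",
        f"energies_recalc_with_mace_model_{project_name}",
        f"forces_recalc_with_mace_model_{project_name}.xyz",
    ]


def missing_files(directory, names):
    """Return the names that do not exist inside `directory`."""
    return [name for name in names if not (Path(directory) / name).exists()]


# Possible usage:
# print(missing_files("mace_recalc", expected_recalc_outputs("MyRecalc")))
```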


## End of Q&A Process

## One-Command Process
The same process can be done in one command using the `amaceing_mace` function. Here’s how to use it:

```bash
amaceing_mace --run_type="RECALC" --config="{'project_name': 'MyRecalc', 'coord_file': '../data/traj.xyz', 'pbc_list': '[14.2067 0 0 0 14.2067 0 0 0 14.2067]', 'foundation_model': 'mace_mp', 'model_size': 'small', 'dispersion_via_simenv': 'n', 'simulation_environment': 'ase'}"
```
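
A missing quote or brace in the long `--config` string is easy to overlook. Since the value is written as a Python dictionary literal, it can be checked with `ast.literal_eval` from the standard library before the command is launched. This is only a sketch with a hypothetical helper, `parse_config`; how the toolkit itself parses the string is not shown here:

```python
import ast


def parse_config(config_text):
    """Parse a --config dictionary literal and return it as a dict."""
    config = ast.literal_eval(config_text)  # raises on malformed literals
    if not isinstance(config, dict):
        raise ValueError("--config must be a dictionary literal")
    return config


config_text = (
    "{'project_name': 'MyRecalc', 'coord_file': '../data/traj.xyz', "
    "'pbc_list': '[14.2067 0 0 0 14.2067 0 0 0 14.2067]', "
    "'foundation_model': 'mace_mp', 'model_size': 'small', "
    "'dispersion_via_simenv': 'n', 'simulation_environment': 'ase'}"
)

config = parse_config(config_text)
print(sorted(config))
```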

In [26]:
try:
    os.mkdir("mace_recalc_1command")
except FileExistsError:
    print("Directory mace_recalc_1command already exists, skipping creation.")
os.chdir("mace_recalc_1command")

command = """amaceing_mace --run_type="RECALC" --config="{'project_name': 'MyRecalc', 'coord_file': '../data/traj.xyz', 'pbc_list': '[14.2067 0 0 0 14.2067 0 0 0 14.2067]', 'foundation_model': 'mace_mp', 'model_size': 'small', 'dispersion_via_simenv': 'n', 'simulation_environment': 'ase'}" """

# Run 1-Command amaceing_mace 
subprocess.run(command, shell=True)

os.chdir("..")



    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
ASE input file written to: recalc.py
Runscripts written to: runscript.sh and gpu_script.job
ASE input file written to: r