# Introduction

This notebook is a guide to using the `amaceing_toolkit` to analyze trajectories and to run basic and more complex functions from the Utils module.
It includes step-by-step instructions.
Both the interactive and command-line interfaces are covered, providing flexibility for users with different preferences.

## Table of Contents

1. [Benchmarking: MACE, MatterSim and SevenNet](#B1)
2. [Analysis of a Trajectory](#B2)
3. [Prepare Error Evaluation](#B3)
4. [Error Evaluation of a Finetuned Model](#B4)
5. [Get Citation for a MACE run](#B5)

# Installation 
The installation process takes a few minutes.

In [1]:
# try:
#     from amaceing_toolkit import amaceing_cp2k
# except ImportError:
#     print("amaceing_toolkit not found. Please install it using: pip install amaceing_toolkit")

#GOOGLE COLAB
# %cd
# !rm -r amaceing_toolkit
# !git clone https://github.com/jhaens/amaceing_toolkit.git
# %cd amaceing_toolkit

# !pip install ase==3.24.0 cycler==0.12.1 mace_torch==0.3.10 numpy==2.0.2 scipy==1.14.1 pybind11==2.13.6
# !pip install .
# %cd tutorials

In [2]:
# PREAMBLE
import subprocess
import os
import time

def run_atk(command, answers):
    """
    Run an ATK command with the given answers.

    Args:
        command (list[str]): The ATK command to run, given as an argument list.
        answers (str): A multi-line string with one answer per line, in the order the questions are asked.

    Returns:
        str | None: The standard output of the command, or None if it failed.
    """

    try:
        # Start the ATK process
        process = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

        # Send the answers to the process all at once
        output, error = process.communicate(input=answers)

        # Check for errors
        if process.returncode != 0:
            print("Error running command")
            print(error)
            return None

        return output

    except Exception as e:
        print(f"Error running command: {e}")
        return None
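
To see how `run_atk` feeds a block of answers to an interactive program, here is a minimal sketch that uses a stand-in child process instead of a toolkit command. The child script and the helper name `demo_piped_answers` are hypothetical and only serve the illustration: each call to `input()` in the child consumes one line of the answers string.

```python
import subprocess
import sys

# Stand-in for an interactive tool: a child Python process that reads two
# answers with input() and echoes them. This is NOT an aMACEing_toolkit command.
child_code = (
    "first = input()\n"
    "second = input()\n"
    "print(f'got {first} and {second}')\n"
)

def demo_piped_answers(answers):
    """Send newline-separated answers to the stand-in process and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        input=answers,          # delivered to the child's stdin all at once
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

output = demo_piped_answers("5\nmace_mp\n")
print(output)
```

The order of the lines matters: the first line answers the first question, the second line the second question, and so on.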


# B1

## Q&A Process for Building Benchmark Input Files

In this section, we will walk through the **Q&A process** used by the `amaceing_utils` function. This process is normally interactive and asks the user a series of questions to gather the necessary information for generating the input file. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook all answers have to be given in the code cells, but in a real-world scenario, the user would be prompted to answer these questions interactively.

---
### 1. **Calculation Type**
   - **Question**: *Which type of calculation do you want to run? (1=EVAL_ERROR, 2=PREPARE_EVAL_ERROR, 3=EXTRACT_XYZ, 4=MACE_CITATIONS, 5=BENCHMARK)*
   - **Example Answers**:
      - `1` (Evaluation of Error)
      - `2` (Prepare Error Evaluation)
      - `3` (Extract Frames out of an XYZ-File)
      - `4` (Get Citation for MACE run)
      - `5` (Benchmarking)
   - **Tutorial Answer**: `5`
    
    The calculation type determines the specific calculation to be performed. 

---

### 2. **Reference Trajectory File**
   - **Question**: *What is the name of the reference trajectory?*
   - **Example Answers**:
     - `system.xyz` (a file containing atomic coordinates)
     - `traj.xyz` (a reference trajectory file)
     - `/path/to/train.xyz` (an absolute path to the coordinate file)
   - **Tutorial Answer**: `../data/dft_energies.xyz`

   The reference trajectory file is essential for defining the atomic structure of the system. Ensure the file exists in the specified path.

---

### 3. **Box Shape**
   - **Question**: *Is the box cubic? (y/n/pbc)*
   - **Example Answers**:
     - `y` (Yes, the box is cubic)
     - `n` (No, the box is not cubic; dimensions will be specified separately)
     - `pbc` (Provide a file with periodic boundary conditions)
   - **Tutorial Answer**: `y`

   If the box is cubic, the same dimension will be used for all three axes. Otherwise, the dimensions for each axis must be specified.

---

### 4. **Box Dimensions**
   - **Question**: *What is the length of the box in Å?*
   - **Example Answers**:
     - `10.0` (A cubic box with a side length of 10 Å)
     - `15.0` (A cubic box with a side length of 15 Å)
   - **Tutorial Answer**: `14.2067`

   This specifies the size of the simulation box. For non-cubic boxes, dimensions for each axis (x, y, z) would be requested separately.
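
For a cubic box, the single edge length is simply repeated for all three axes. The sketch below formats it in the bracketed, space-separated layout that the `pbc_list` entry of the one-command example in this tutorial uses; the helper name `cubic_pbc_list` is hypothetical, and the layout is an assumption for anything other than the cubic case.

```python
def cubic_pbc_list(length):
    """Return a pbc_list-style string for a cubic box with the given edge length in Å."""
    # Assumption: the bracketed, space-separated layout mirrors the 'pbc_list'
    # value shown in this tutorial; non-cubic boxes are not covered here.
    return "[" + " ".join(f"{length}" for _ in range(3)) + "]"

pbc_text = cubic_pbc_list(14.2067)
print(pbc_text)
```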

---
### 5. **MD Simulations or Recalculations**
   - **Question**: *Do you want to run a MD simulation (y) or a recalculation of the AIMD trajectory (n)?*
   - **Example Answers**:
     - `y` (Molecular Dynamics simulations)
     - `n` (Recalculating the AIMD trajectory)
   - **Tutorial Answer**: `n`

   This indicates whether the function builds input files for MD simulations or does recalculations.

---

### 6. **Force File**
   - **Question**: *What is the name of the force file from the AIMD?*
   - **Example Answers**:
     - `forces.xyz` (a file containing atomic forces)
   - **Tutorial Answer**: `../data/dft_forces.xyz`

   The force file is essential for defining the error evaluation of the models. Ensure the file exists in the specified path.

---

### 7. **MACE Model**
   - **Question**: *Which MACE foundational model do you want to use?*
   - **Example Answers**:
     - `mace_off` (a pre-trained MACE model for organic molecules)
     - `mace_mp` (a pre-trained MACE model on the Materials Project)
   - **Tutorial Answer**: `mace_mp`

---

### 8. **MACE Model Size**
   - **Question**: *Which MACE model size do you want to use?*
   - **Example Answers**:
      - `small` (a small MACE model)
      - `medium` (a medium MACE model)
      - `large` (a large MACE model)
   - **Tutorial Answer**: `small`

---

### 9. **MatterSim Model**
   - **Question**: *Which MatterSim model do you want to use?*
   - **Example Answers**:
      - `small` (a pre-trained MatterSim model: MatterSim-v1.0.0-1M.pth)
      - `large` (a pre-trained MatterSim model: MatterSim-v1.0.0-5M.pth)
   - **Tutorial Answer**: `small`

---

### 10. **SevenNet Model**
   - **Question**: *Which SevenNet model do you want to use?*
   - **Example Answers**:
      - `1` (7net-mf-ompa: multi-fidelity model trained on Materials Project data, Alexandria data and Meta Open Materials 2024 data)
      - `2` (7net-omat: model trained on Meta Open Materials 2024 data)
      - `3` (7net-l3i5: model trained on Materials Project data (increased maximum spherical harmonics degree to 3))
      - `4` (7net-0: model trained on Materials Project data)
   - **Tutorial Answer**: `4`


### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the Utils function.
2. **Validation**: The script validates the user input (e.g., checks if the coordinate file exists).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Input File Creation**: The script writes the input file using the gathered information and predefined templates.
5. **Recalculation**: Once the configuration is set, the script runs the recalculation with the specified parameters for every installed package. For packages that are not installed, the corresponding recalculation command is printed instead.
6. **Output**: The generated input files are saved in their own directories, and a log file is created to document the configuration.
7. **Error Evaluation**: The script evaluates the error of the models based on the generated input files and the reference trajectory. Only installed packages are evaluated; for the others, the command is printed.

---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:

1. **Calculation Type**: `5`
2. **Reference Trajectory File**: `../data/dft_energies.xyz`
3. **Box Shape**: `y`
4. **Box Dimensions**: `14.2067`
5. **MD Simulations or Recalculations**: `n`
6. **Force File**: `../data/dft_forces.xyz`
7. **MACE Model**: `mace_mp`
8. **MACE Model Size**: `small`
9. **MatterSim Model**: `small`
10. **SevenNet Model**: `4`

By following these steps, you will be able to generate the input files needed for the benchmarking process.
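
The walkthrough above maps directly onto the newline-separated string that is piped into the tool. As a minimal sketch (the helper name `build_answers` is hypothetical), the answers can be kept next to their question labels and joined in order:

```python
# Ordered (question, answer) pairs taken from the walkthrough above.
benchmark_answers = [
    ("Calculation Type", "5"),
    ("Reference Trajectory File", "../data/dft_energies.xyz"),
    ("Box Shape", "y"),
    ("Box Dimensions", "14.2067"),
    ("MD Simulations or Recalculations", "n"),
    ("Force File", "../data/dft_forces.xyz"),
    ("MACE Model", "mace_mp"),
    ("MACE Model Size", "small"),
    ("MatterSim Model", "small"),
    ("SevenNet Model", "4"),
]

def build_answers(pairs):
    """Join the answers in order, one per line, with a trailing newline."""
    return "\n".join(answer for _, answer in pairs) + "\n"

answers = build_answers(benchmark_answers)
print(answers)
```

Keeping the labels beside the answers makes it easier to spot an answer that has slipped out of order.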


In [3]:
# Set up the directory for the project
try:
    os.mkdir("utils_benchmark")
except FileExistsError:
    print("Directory utils_benchmark already exists, skipping creation.")
os.chdir("utils_benchmark")

# Define the command to run amaceing_utils
command = ["amaceing_utils"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """5
../data/dft_energies.xyz
y
14.2067
n
../data/dft_forces.xyz
mace_mp
small
small
4
"""#+"""y
#1
#/home/USER/anaconda3/etc/profile.d/conda.sh
#""" # Pseudo input to configure the HPC setup
# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
Which type of calculation do you want to run? (1=EVAL_ERROR, 2=PREPARE_EVAL_ERROR, 3=EXTRACT_XYZ, 4=MACE_CITATIONS, 5=BE

## Overview of the amaceing_utils Output

The `amaceing_utils` function runs a recalculation with MACE, MatterSim and SevenNet based on the provided parameters. The output consists of three main folders: `mace`, `mattersim` and `sevennet`. The directories of installed packages contain:

1. **Input File**: `recalc_<PACKAGE_NAME>.py` The main input file containing all the necessary settings for the calculation.
2. **Log File**: `<PACKAGE_NAME>_input.log` A log file documenting the configuration and parameters used in the input file. (Gives the possibility to recreate the input file, ...)
3. **Energies**: `energies_recalc_with_<PACKAGE_NAME>_model_benchmark` The output file containing the recalculated energies.
4. **Forces**: `forces_recalc_with_<PACKAGE_NAME>_model_benchmark.xyz` The output file containing the recalculated forces.
5. **Errors**: `errors.txt` The output file containing the mean absolute & relative force errors and the mean absolute energy error.
6. **Logger**: The run is also recorded by the toolkit's built-in logger.

The other directories are empty and ready for running the given commands inside them. The commands for the packages that are not installed are printed out.
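
A quick way to see which package directories were actually filled is to list their files. The sketch below is not part of the toolkit: the helper name `summarize_outputs` is hypothetical, and the demo runs on a throwaway directory tree with a placeholder file rather than on real output.

```python
import tempfile
from pathlib import Path

# Directory names taken from the overview above.
PACKAGES = ("mace", "mattersim", "sevennet")

def summarize_outputs(root):
    """Map each package directory to the sorted names of the files it contains."""
    summary = {}
    for package in PACKAGES:
        package_dir = Path(root) / package
        if package_dir.is_dir():
            summary[package] = sorted(p.name for p in package_dir.iterdir() if p.is_file())
        else:
            summary[package] = None  # directory does not exist at all
    return summary

# Demo on a throwaway tree (made-up layout, not real toolkit output).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "mace").mkdir()
    (Path(tmp) / "mace" / "errors.txt").write_text("placeholder\n")
    (Path(tmp) / "mattersim").mkdir()
    demo_summary = summarize_outputs(tmp)

print(demo_summary)
```

An empty list marks a directory that exists but still waits for its recalculation; in the tutorial you would call `summarize_outputs("utils_benchmark")` instead.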


## End of Q&A Process

## One-Command Process
The same process can be done in one command using the `amaceing_utils` function. Here’s how to use it:

```bash
amaceing_utils --run_type="BENCHMARK" --config="{'mode': 'RECALC', 'coord_file': '../data/dft_energies.xyz', 'pbc_list': '[14.2067 14.2067 14.2067]', 'force_nsteps': '../data/dft_forces.xyz', 'mace_model': '['mace_mp' 'small']', 'mattersim_model': 'small', 'sevennet_model': '['7net-0' '']'}"
```
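
The nested quotes in this command are easier to follow once you see what the shell hands to the program. The sketch below uses Python's `shlex.split`, which follows POSIX shell quoting rules, to show the resulting argument list; it does not run the toolkit.

```python
import shlex

# The one-command call from above, kept as a single shell string.
command = """amaceing_utils --run_type="BENCHMARK" --config="{'mode': 'RECALC', 'coord_file': '../data/dft_energies.xyz', 'pbc_list': '[14.2067 14.2067 14.2067]', 'force_nsteps': '../data/dft_forces.xyz', 'mace_model': '['mace_mp' 'small']', 'mattersim_model': 'small', 'sevennet_model': '['7net-0' '']'}" """

# The outer double quotes are removed, the inner single quotes stay part of the argument.
argv = shlex.split(command)
for position, argument in enumerate(argv):
    print(position, argument)

# Passing this list to subprocess.run(argv) (without shell=True) would start the
# same call without any further shell quoting.
```

The whole configuration dictionary arrives as one argument, which is why the outer quotes must be double quotes when the inner ones are single quotes.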

In [27]:
try:
    os.mkdir("benchmark_1command")
except FileExistsError:
    print("Directory benchmark_1command already exists, skipping creation.")
os.chdir("benchmark_1command")

command = """amaceing_utils --run_type="BENCHMARK" --config="{'mode': 'RECALC', 'coord_file': '../data/dft_energies.xyz', 'pbc_list': '[14.2067 14.2067 14.2067]', 'force_nsteps': '../data/dft_forces.xyz', 'mace_model': '['mace_mp' 'small']', 'mattersim_model': 'small', 'sevennet_model': '['7net-0' '']'}" """

# Run 1-Command amaceing_utils 
subprocess.run(command, shell=True)

os.chdir("..")

Directory benchmark_1command already exists, skipping creation.

    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
{'mode': 'RECALC', 'coord_file': '../data/dft_energies.x

  _Jd, _W3j_flat, _W3j_indices = torch.load(os.path.join(os.path.dirname(__file__), 'constants.pt'))
  torch.load(f=model_path, map_location=device)





 STARTING THE RECALCULATION OF THE REFERENCE TRAJECTORY WITH mattersim



Mattersim is currently not installed (in this environment). Please install it first or change to the respective environment.
The MatterSim Run can be run via: 
amaceing_mattersim --run_type="RECALC" --config="{project_name: benchmark, coord_file: ../../data/dft_energies.xyz, pbc_list: [14.2067 14.2067 14.2067], foundation_model: small, dispersion_via_ase: n}" 



 STARTING THE RECALCULATION OF THE REFERENCE TRAJECTORY WITH sevennet



SevenNet is currently not installed (in this environment). Please install it first or change to the respective environment.
The SevenNet Run can be run via: 
amaceing_sevennet --run_type="RECALC" --config="{'project_name': benchmark, 'coord_file': ../../data/dft_energies.xyz, 'pbc_list': [14.2067 14.2067 14.2067], 'foundation_model': 7net-0, 'modal': , 'dispersion_via_ase': n}" 

Running the EVAL_ERROR workflow for mace...

The mean absolute force error is 0.10021999 eV/Angstrom.
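
To make the reported metric concrete, here is a minimal sketch of a mean absolute force error. It averages over all Cartesian force components, which is an assumption: the notebook does not state whether the toolkit averages per component or per atom. The function name and the force values are made up for the illustration.

```python
def mean_absolute_force_error(reference, predicted):
    """Mean absolute error over all Cartesian force components (same units as the input)."""
    diffs = [
        abs(f_ref - f_pred)
        for atom_ref, atom_pred in zip(reference, predicted)
        for f_ref, f_pred in zip(atom_ref, atom_pred)
    ]
    return sum(diffs) / len(diffs)

# Two atoms with made-up force vectors in eV/Angstrom (illustrative numbers only).
reference = [(0.10, -0.20, 0.00), (0.00, 0.30, -0.10)]
predicted = [(0.20, -0.20, 0.10), (0.00, 0.10, -0.10)]

mae = mean_absolute_force_error(reference, predicted)
print(f"{mae:.4f} eV/Angstrom")
```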

# B2

## Q&A Process for Analyzing a Trajectory

In this section, we will walk through the **Q&A process** used by the `amaceing_ana` function. This process is normally interactive and asks the user a series of questions to gather the necessary information for generating the input file. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook all answers have to be given in the code cells, but in a real-world scenario, the user would be prompted to answer these questions interactively.

--- 
### 1. **Trajectory File**
   - **Question**: *What is the name of the trajectory?*
   - **Example Answers**:
     - `traj.xyz` (a trajectory file)
     - `/path/to/traj.xyz` (an absolute path to the coordinate file)
   - **Tutorial Answer**: `koh1.xyz`

   The trajectory file is essential for defining the atomic structure of the system. Ensure the file exists in the specified path.

---

### 2. **Box Shape**
   - **Question**: *Is the box cubic? (y/n/pbc)*
   - **Example Answers**:
     - `y` (Yes, the box is cubic)
     - `n` (No, the box is not cubic; dimensions will be specified separately)
     - `pbc` (Provide a file with periodic boundary conditions)
   - **Tutorial Answer**: `y`

   If the box is cubic, the same dimension will be used for all three axes. Otherwise, the dimensions for each axis must be specified.

---

### 3. **Box Dimensions**
   - **Question**: *What is the length of the box in Å?*
   - **Example Answers**:
     - `10.0` (A cubic box with a side length of 10 Å)
     - `15.0` (A cubic box with a side length of 15 Å)
   - **Tutorial Answer**: `13.8452`

   This specifies the size of the simulation box. For non-cubic boxes, dimensions for each axis (x, y, z) would be requested separately.

---

### 4. **Timestep**
   - **Question**: *What is the timestep in fs?*
   - **Example Answers**:
      - `0.5` (A timestep of 0.5 fs)
      - `1.0` (A timestep of 1.0 fs)
   - **Tutorial Answer**: `50.0`
    
    This specifies the time step used in the trajectory file.

---

### 5. **Single/Multiple Analysis**
   - **Question**: *How many trajectories do you want to analyze (with the same setup)?*
   - **Example Answers**:
        - `1` (Single trajectory analysis)
        - `5` (Multiple trajectory analysis)
   - **Tutorial Answer**: `1`
     
     This indicates whether the function analyzes a single trajectory or multiple trajectories. If you choose multiple, you will be prompted to provide the names of the additional trajectory files and you will be asked to give keywords for the trajectories.

---

### 6. **Analysis from scratch or Smart Proposal**
   - **Question**: *Do you want to configure the whole analysis from scratch (y) or do you want to accept or refine the smart proposal (n) for the analysis?*
   - **Example Answers**:
          - `y` (Yes, configure the analysis from scratch)
          - `n` (No, accept or refine the smart proposal)
   - **Tutorial Answer**: `n`
   - **Note**: The smart proposal will be printed to the screen.

    If you configure the analysis from scratch, you will be asked to provide the specific parameters for the analysis.

--- 

### 7. **Refine Smart Proposal**
   - **Question**: *Do you want to refine the analysis?*
   - **Example Answers**:
          - `y` (Yes, refine the smart proposal)
          - `n` (No, use the smart proposal as is)
   - **Tutorial Answer**: `n`
   - **Note**: To do single-particle MSD please use `y` and refine the smart proposal.

--- 

### 8. **Evaluate Diffusion Coefficients**
   - **Question**: *Do you want to calculate the diffusion coefficient for the MSD runs?*
   - **Example Answers**:
        - `y` (Yes, calculate the diffusion coefficients)
        - `n` (No, do not evaluate the diffusion coefficients)
   - **Tutorial Answer**: `y`

---
### 9. **Visualization**
   - **Question**: *Do you want to visualize the analysis?*
   - **Example Answers**:
        - `y` (Yes, visualize the analysis)
        - `n` (No, do not visualize the analysis)
   - **Tutorial Answer**: `y`

### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the Analysis function.
2. **Validation**: The script validates the user input (e.g., checks if the coordinate file exists).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Analysis**: The script writes the input file using the gathered information and predefined templates.
5. **Output**: The generated input files are saved in their own directories, and a log file is created to document the configuration.
6. **Visualization**: The script builds plots and a corresponding tex file for the analysis. You can create a PDF from the tex file with `pdflatex`; run it twice so that the references resolve correctly.
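
Running `pdflatex` twice can also be scripted from the notebook. The following is a minimal sketch, assuming `pdflatex` is on the PATH and the tex file is called `analysis.tex`; the helper names are hypothetical, and `-interaction=nonstopmode` is added only so that LaTeX does not wait for keyboard input.

```python
import shutil
import subprocess

def pdflatex_commands(tex_file, passes=2):
    """Return one pdflatex command per pass; the second pass resolves references."""
    return [["pdflatex", "-interaction=nonstopmode", tex_file] for _ in range(passes)]

def compile_report(tex_file="analysis.tex"):
    """Run all passes in order, or return False if pdflatex is not available."""
    if shutil.which("pdflatex") is None:
        return False
    for command in pdflatex_commands(tex_file):
        subprocess.run(command, check=True)
    return True

commands = pdflatex_commands("analysis.tex")
print(commands)
```

Call `compile_report()` from inside the analysis directory once the tex file has been written.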

---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:
1. **Trajectory File**: `koh1.xyz`
2. **Box Shape**: `y`
3. **Box Dimensions**: `13.8452`
4. **Timestep**: `50.0`
5. **Single/Multiple Analysis**: `1`
6. **Analysis from scratch or Smart Proposal**: `n`
7. **Refine Smart Proposal**: `n`
8. **Evaluate Diffusion Coefficients**: `y`
9. **Visualization**: `y`

By following these steps, you will be able to analyze the trajectory.

In [18]:
# Set up the directory for the project
try:
    os.mkdir("analysis")
except FileExistsError:
    print("Directory analysis already exists, skipping creation.")
os.chdir("analysis")

Directory analysis already exists, skipping creation.


In [19]:
#wget https://cloud.tu-ilmenau.de/s/wDRecAYSpPxXiZk/download/koh1.xyz

In [20]:
# Define the command to run amaceing_ana
command = ["amaceing_ana"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """koh1.xyz
y
13.8452
50
1
n
n
y
y
"""
# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
Welcome to the aMACEing toolkit trajectory analyzer!
This tool will help you analyze your trajectory files.
Please answe

## Overview of the amaceing_ana Output

The `amaceing_ana` function analyzes the given trajectory with the provided parameters. The output consists of the analysis results, the plots and the tex file.

1. **RDF Output Files**: `rdf_<ATOM_PAIR>.csv` The output file of a RDF analysis.
2. **RDF Plot Files**: `rdf_<ATOM_PAIR>_plot.pdf` The plot file of a RDF analysis.
3. **MSD Output Files**: `msd_<ATOM_TYPE>.csv` The output file of a MSD analysis.
4. **MSD Plot Files**: `msd_<ATOM_TYPE>_plot.pdf` The plot file of a MSD analysis.
5. **Diffusion Coefficient Output Files**: `diff_coeff_<ATOM_TYPE>.csv` The output file of the diffusion coefficient evaluation (containing the value and the standard deviation).
6. **Diffusion Coefficient Overview File**: `overview_diffcoeff.csv` The output file of all diffusion coefficient evaluations.
7. **Tex File**: `analysis.tex` The tex file containing the plots and the analysis results.
8. **Image Folder**: `img_dir/` The folder containing the images for the tex file.
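
The link between the MSD files and the diffusion coefficients is the Einstein relation, MSD(t) = 2·d·D·t with d = 3 dimensions, so D follows from the slope of a straight-line fit. The sketch below is an illustration on synthetic data, not the toolkit's implementation: the fit window, the column layout of the CSV files and the function name are assumptions.

```python
def diffusion_coefficient(times_ps, msd_A2, dimensions=3):
    """Estimate D from a least-squares line through MSD(t), using MSD = 2*d*D*t."""
    n = len(times_ps)
    mean_t = sum(times_ps) / n
    mean_m = sum(msd_A2) / n
    covariance = sum((t - mean_t) * (m - mean_m) for t, m in zip(times_ps, msd_A2))
    variance = sum((t - mean_t) ** 2 for t in times_ps)
    slope = covariance / variance          # in A**2/ps
    return slope / (2 * dimensions)

# Synthetic MSD that grows exactly linearly, built with D = 0.5 A**2/ps.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
msd = [6 * 0.5 * t for t in times]

d_est = diffusion_coefficient(times, msd)
print(f"D = {d_est} A**2/ps")
```

With real data the short-time part of the MSD is usually not linear, so the fit should be restricted to the diffusive regime.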

## End of Q&A Process

## One-Command Process
The same process can be done in one command using the `amaceing_ana` function. Here’s how to use it:

```bash
amaceing_ana -f="koh1.xyz" -p="../data/pbc_ana" -t="50" -v="y"
```

In [24]:
try:
    os.mkdir("analysis_1command")
except FileExistsError:
    print("Directory analysis_1command already exists, skipping creation.")
os.chdir("analysis_1command")

Directory analysis_1command already exists, skipping creation.


In [25]:
#wget https://cloud.tu-ilmenau.de/s/wDRecAYSpPxXiZk/download/koh1.xyz

In [26]:
command = """amaceing_ana -f="koh1.xyz" -p="../data/pbc_ana" -t="50" -v="y" """

# Run 1-Command amaceing_ana
subprocess.run(command, shell=True)

os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    

You selected the following parameters for the analysis: koh1.xyz, ../data/pbc_ana, 50, y

The analysis plan (based on s

Processing step: 0
Processing step: 0


Data saved to msd_O.csv
Data saved to msd_H.csv
Analysis finished at: 09:46:15 (duration: 0.236565 s)

The diffusion coefficient for O is: 0.5276899856031971 A**2/ps
The diffusion coefficient for H is: 0.5770918451889415 A**2/ps
LaTeX file analysis.tex created.
You can compile it the analysis.tex with the following command:
pdflatex analysis.tex
(To get the table of contents on the first page, run pdflatex twice.)

    ┌
    │ If you use aMACEing_toolkit in your research, please cite the following publication:
    │
    │ PREPRINT AVAILABLE SOON!
    └
    


# B3

## Q&A Process for Preparing Error Evaluation

In this section, we will walk through the **Q&A process** used by the `amaceing_utils` function. This process is normally interactive and asks the user a series of questions to gather the necessary information for generating the input file. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook all answers have to be given in the code cells, but in a real-world scenario, the user would be prompted to answer these questions interactively.

---
### 1. **Calculation Type**
   - **Question**: *Which type of calculation do you want to run? (1=EVAL_ERROR, 2=PREPARE_EVAL_ERROR, 3=EXTRACT_XYZ, 4=MACE_CITATIONS, 5=BENCHMARK)*
   - **Example Answers**:
      - `1` (Evaluation of Error)
      - `2` (Prepare Error Evaluation)
      - `3` (Extract Frames out of an XYZ-File)
      - `4` (Get Citation for MACE run)
      - `5` (Benchmarking)
   - **Tutorial Answer**: `2`
    
    The calculation type determines the specific calculation to be performed. 

---

### 2. **Reference Trajectory File**
   - **Question**: *What is the name of the MLIP trajectory (.traj-File)?*
   - **Example Answers**:
     - `ref.traj` (an ASE file containing atomic coordinates)
   - **Tutorial Answer**: `../data/trajectory.traj`

   The MLIP trajectory file is essential for defining the atomic structure of the system. Ensure the file exists in the specified path.

---

### 3. **Fraction of Trajectory**
   - **Question**: *Which n-th frame do you want to extract from the file?*
   - **Example Answers**:
     - `1` (Every frame will be extracted)
     - `2` (Each second frame will be extracted)
     - `1000` (Each 1000-th frame will be extracted)
   - **Tutorial Answer**: `1`
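
The effect of the n-th frame setting can be pictured with plain list slicing. This is a sketch of the idea only: that the extraction starts with the first frame is an assumption, and the helper name `every_nth` is hypothetical.

```python
def every_nth(frames, n):
    """Keep every n-th frame, starting with the first one."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return frames[::n]

frames = list(range(10))          # stand-in for ten trajectory frames

print(every_nth(frames, 1))       # all frames
print(every_nth(frames, 2))       # every second frame
print(every_nth(frames, 1000))    # only the first frame remains
```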

---

### 4. **Run CP2K Force Recalculation**
   - **Question**: *Do you want to run the CP2K calculation for the error evaluation now?*
   - **Example Answers**:
     - `y` (Starts CP2K Input generation with `amaceing_cp2k`)
     - `n` (Prints out the command for `amaceing_cp2k`)
   - **Tutorial Answer**: `y`

---
### 5. **Training data XC-Functional**
   - **Question**: *Which XC functional do you want to use for the CP2K calculation? ('PBE', 'PBE_SR', 'BLYP', 'BLYP_SR')*
   - **Example Answers**:
      - `PBE` (PBE functional)
      - `PBE_SR` (PBE functional with short-range correction)
      - `BLYP` (BLYP functional)
   - **Tutorial Answer**: `BLYP`


### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the Utils function.
2. **Validation**: The script validates the user input (e.g., checks if the coordinate file exists).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Input File Creation**: The script writes the input file using the gathered information and predefined templates.
5. **Output**: The generated input file is saved in its own directory, and a log file is created to document the configuration.

---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:

1. **Calculation Type**: `2`
2. **Reference Trajectory File**: `../data/trajectory.traj`
3. **Fraction of Trajectory**: `1`
4. **Run CP2K Force Recalculation**: `y`
5. **Training data XC-Functional**: `BLYP`

By following these steps, you will be able to generate the input files needed for the error evaluation.


In [11]:
# Set up the directory for the project
try:
    os.mkdir("utils_prepare_erroreval")
except FileExistsError:
    print("Directory utils_prepare_erroreval already exists, skipping creation.")
os.chdir("utils_prepare_erroreval")

# Define the command to run amaceing_utils
command = ["amaceing_utils"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """2
../data/trajectory.traj
1
y
BLYP
"""
# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
Which type of calculation do you want to run? (1=EVAL_ERROR, 2=PREPARE_EVAL_ERROR, 3=EXTRACT_XYZ, 4=MACE_CITATIONS, 5=BE

## Overview of the amaceing_utils Output

The `amaceing_utils` function prepares the error evaluation. The output consists of several files and one folder, `eval_data`. Inside this folder you will find the following files:

1. **Input File**: `reftraj_cp2k.inp` The CP2K input file for the Force recalculation using first principles methods containing all the necessary settings for the calculation.
2. **Log File**: `cp2k_input.log` A log file documenting the configuration and parameters used in the input file. (Gives the possibility to recreate the input file, ...)
3. **Runscript**: `runscript.sh` The runscript containing information specified for the compute nodes.
4. **Coordinate File**: `eval_run_frame0.xyz` The coordinate file containing the first frame of the reference trajectory.

In the main folder you will find the following files:
5. **PBC-File**: `pbc` The PBC file containing the periodic boundary conditions.
6. **Reference XYZ-File**: The reference XYZ file containing the reference trajectory.
7. **Reference Force-File**: The reference force file containing the reference forces.
8. **Log File**: `utils.log` A log file documenting the configuration and parameters used in the input file. (Gives the possibility to recreate the input file, ...)

The other directories are empty and ready for running the given commands inside them. The commands for the packages that are not installed are printed out.


## End of Q&A Process

## One-Command Process
The same process can be done in one command using the `amaceing_utils` function. Here’s how to use it:

```bash
amaceing_utils --run_type="PREPARE_EVAL_ERROR" --config="{'traj_file': '../data/trajectory.traj', 'each_nth_frame': '1', 'start_cp2k': 'y', 'log_file': '', 'xc_functional': 'BLYP'}"
```
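
Because every value in this configuration is a plain string, the `--config` text can be produced from a Python dictionary. The sketch below relies on `str()` of a dict, which prints string keys and values in single quotes; it only builds the argument list and does not run the toolkit. It does not carry over to configurations whose values contain single quotes themselves, such as the benchmark example.

```python
# Configuration values copied from the command above.
config = {
    "traj_file": "../data/trajectory.traj",
    "each_nth_frame": "1",
    "start_cp2k": "y",
    "log_file": "",
    "xc_functional": "BLYP",
}

# str() of a dict with plain string values reproduces the single-quoted layout.
config_text = str(config)
argv = ["amaceing_utils", "--run_type=PREPARE_EVAL_ERROR", f"--config={config_text}"]

print(config_text)
```

Passing `argv` as a list to `subprocess.run` avoids the extra layer of shell quoting.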

In [12]:
try:
    os.mkdir("prepare_evalerror_1command")
except FileExistsError:
    print("Directory prepare_evalerror_1command already exists, skipping creation.")
os.chdir("prepare_evalerror_1command")

command = """amaceing_utils --run_type="PREPARE_EVAL_ERROR" --config="{'traj_file': '../data/trajectory.traj', 'each_nth_frame': '1', 'start_cp2k': 'y', 'log_file': '', 'xc_functional': 'BLYP'}" """

# Run 1-Command amaceing_utils 
subprocess.run(command, shell=True)

os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
Wrote the file  mace_coord.xyz
Wrote the file  mace_force.xyz
Extracted every 1 frame from the file ../data/trajectory.t

# B4

## Q&A Process for Error Evaluation

In this section, we will walk through the **Q&A process** used by the `amaceing_utils` function. This process is normally interactive and asks the user a series of questions to gather the information needed for the error evaluation. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook all answers have to be given in the code cells, but in a real-world scenario, the user would be prompted to answer these questions interactively.

---
### 1. **Calculation Type**
   - **Question**: *Which type of calculation do you want to run? (1=EVAL_ERROR, 2=PREPARE_EVAL_ERROR, 3=EXTRACT_XYZ, 4=MACE_CITATIONS, 5=BENCHMARK)*
   - **Example Answers**:
      - `1` (Evaluation of Error)
      - `2` (Prepare Error Evaluation)
      - `3` (Extract Frames out of an XYZ-File)
      - `4` (Get Citation for MACE run)
      - `5` (Benchmarking)
   - **Tutorial Answer**: `1`

   The calculation type determines the specific calculation to be performed.

---

### 2. **Ground Truth Energy File**
   - **Question**: *What is the name of the ground truth energy file?*
   - **Example Answers**:
     - `traj.xyz` (A file containing atomic coordinates)
   - **Tutorial Answer**: `../data/dft_energies.xyz`

---

### 3. **Ground Truth Force File**
   - **Question**: *What is the name of the ground truth force file?*
   - **Example Answers**:
     - `force.xyz` (A file containing atomic forces)
   - **Tutorial Answer**: `../data/dft_forces.xyz`

---

### 4. **Comparison Energy File**
   - **Question**: *What is the name of the comparison energy file (energy file or trajectory)?*
   - **Example Answers**:
     - `energy.xyz` (XYZ-File with Energy Keyword in Comment-Line)
     - `energy.txt` (TXT-File with the Energy for each frame)
   - **Tutorial Answer**: `../data/mace_energies.txt`

---
### 5. **Comparison Force File**
   - **Question**: *What is the name of the comparison force file?*
   - **Example Answers**:
     - `force.xyz` (A file containing atomic forces)
   - **Tutorial Answer**: `../data/mace_forces.xyz`


### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the Utils function.
2. **Validation**: The script validates the user input (e.g., checks if the given files exist).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Output**: The function calculates the errors and generates a text file containing the errors.

---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:

1. **Calculation Type**: `1`
2. **Ground Truth Energy File**: `../data/dft_energies.xyz`
3. **Ground Truth Force File**: `../data/dft_forces.xyz`
4. **Comparison Energy File**: `../data/mace_energies.txt`
5. **Comparison Force File**: `../data/mace_forces.xyz`

By following these steps, you will be able to calculate the errors.
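
The answers listed above are passed to the interactive prompt as one string with one answer per line. A minimal sketch of how to assemble that string from separate values follows; `build_answers` is a hypothetical helper, not part of the toolkit.

```python
def build_answers(*answers):
    """Join the answers into one string, one answer per line (hypothetical helper)."""
    # Each answer is terminated by a newline, as if the user had pressed Enter.
    return "".join(f"{answer}\n" for answer in answers)


answers = build_answers(
    1,                            # Calculation Type: EVAL_ERROR
    "../data/dft_energies.xyz",   # Ground Truth Energy File
    "../data/dft_forces.xyz",     # Ground Truth Force File
    "../data/mace_energies.txt",  # Comparison Energy File
    "../data/mace_forces.xyz",    # Comparison Force File
)
```

The resulting string has the same shape as the `answers` argument that the `run_atk` helper from the preamble expects.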


In [17]:
# Set up the directory for the project
try:
    os.mkdir("utils_evalerror")
except FileExistsError:
    print("Directory utils_evalerror already exists, skipping creation.")
os.chdir("utils_evalerror")

# Define the command to run amaceing_utils
command = ["amaceing_utils"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """1
../data/dft_energies.xyz
../data/dft_forces.xyz
../data/mace_energies.txt
../data/mace_forces.xyz
"""
# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")

Directory utils_evalerror already exists, skipping creation.

    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
Which type of calculation do you want to run? (1=EVAL_ERROR

## Overview of the amaceing_utils Output

The `amaceing_utils` function calculates the errors of the model being evaluated. The output consists of the following files:

1. **Error-File**: `error.txt` The output file containing the mean absolute and mean relative force errors and the mean absolute energy error.
2. **Log File**: `utils.log` A log file documenting the configuration and parameters used in the input file. (Gives the possibility to recreate the input file, ...)
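
To make the reported quantities concrete, here is a minimal sketch of a mean absolute error and a mean relative error over flat lists of values (for forces, one entry per Cartesian component). The toolkit's exact definitions are not stated in this tutorial, so both formulas, and the relative error in particular, are assumptions made for illustration.

```python
def mean_absolute_error(reference, predicted):
    """Mean of |predicted - reference| over all entries."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for r, p in zip(reference, predicted)) / len(reference)


def mean_relative_error(reference, predicted):
    """Mean of |predicted - reference| / |reference| (assumed definition).

    Entries with a zero reference value are skipped to avoid division by zero.
    """
    if len(reference) != len(predicted):
        raise ValueError("inputs must have equal length")
    ratios = [abs(p - r) / abs(r) for r, p in zip(reference, predicted) if r != 0]
    if not ratios:
        raise ValueError("no non-zero reference values")
    return sum(ratios) / len(ratios)
```

With forces in eV/Angstrom, the absolute error carries the same unit, while the relative error is dimensionless, which matches the units shown in the tool's printed output.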


## End of Q&A Process

## One-Command Process
The same process can be done in one command using the `amaceing_utils` function. Here’s how to use it:

```bash
amaceing_utils --run_type="EVAL_ERROR" --config="{'ener_filename_ground_truth': '../data/dft_energies.xyz', 'force_filename_ground_truth': '../data/dft_forces.xyz', 'ener_filename_compare': '../data/mace_energies.txt', 'force_filename_compare': '../data/mace_forces.xyz'}"
```

In [14]:
try:
    os.mkdir("evalerror_1command")
except FileExistsError:
    print("Directory evalerror_1command already exists, skipping creation.")
os.chdir("evalerror_1command")

command = """amaceing_utils --run_type="EVAL_ERROR" --config="{'ener_filename_ground_truth': '../data/dft_energies.xyz', 'force_filename_ground_truth': '../data/dft_forces.xyz', 'ener_filename_compare': '../data/mace_energies.txt', 'force_filename_compare': '../data/mace_forces.xyz'}" """

# Run 1-Command amaceing_utils 
subprocess.run(command, shell=True)

os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
The mean absolute force error is 0.10021998 eV/Angstrom.
The mean relative force error is 0.14060365.
The mean absolute 

# B5

## Q&A Process for Getting MACE Citations

In this section, we will walk through the **Q&A process** used by the `amaceing_utils` function. This process is normally interactive and asks the user a series of questions to gather the information needed to look up the citations. Below, we enumerate the questions, provide example answers, and highlight the answers we will use in this tutorial. In this notebook all answers have to be given in the code cells, but in a real-world scenario, the user would be prompted to answer these questions interactively.

---
### 1. **Calculation Type**
   - **Question**: *Which type of calculation do you want to run? (1=EVAL_ERROR, 2=PREPARE_EVAL_ERROR, 3=EXTRACT_XYZ, 4=MACE_CITATIONS, 5=BENCHMARK)*
   - **Example Answers**:
      - `1` (Evaluation of Error)
      - `2` (Prepare Error Evaluation)
      - `3` (Extract Frames out of an XYZ-File)
      - `4` (Get Citation for MACE run)
      - `5` (Benchmarking)
   - **Tutorial Answer**: `4`

---

### 2. **Log File**
   - **Question**: *What is the name of the log file of the MACE run?*
   - **Example Answers**:
     - `mace_input.log` (The log file of a MACE run)
   - **Tutorial Answer**: `../data/mace_input.log`


### Process Overview
1. **Interactive Input**: The script prompts the user with a series of questions to gather the required information for the Utils function.
2. **Validation**: The script validates the user input (e.g., checks if the log file exists).
3. **Configuration**: Based on the answers, the script generates a configuration dictionary that contains all the necessary parameters.
4. **Output**: The function prints the citations that apply to the MACE run.

---

### Example Walkthrough
Here’s how the Q&A process will look in this tutorial:

1. **Calculation Type**: `4`
2. **Log File**: `../data/mace_input.log`

By following these steps, you will be able to get the citations.
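
The cells in this tutorial create a working directory, change into it, run the tool, and change back. A minimal sketch of the same pattern as a context manager follows, so that the original directory is restored even if the command fails. `working_directory` is a hypothetical helper, not part of the toolkit.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Create path if needed, change into it, and always change back."""
    previous = os.getcwd()
    os.makedirs(path, exist_ok=True)  # no error if the directory already exists
    os.chdir(path)
    try:
        yield path
    finally:
        # Runs on normal exit and on exceptions alike
        os.chdir(previous)
```

A cell could then wrap its call as `with working_directory("utils_citation"): ...` instead of pairing `os.chdir("utils_citation")` with `os.chdir("..")` by hand.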


In [15]:
# Set up the directory for the project
try:
    os.mkdir("utils_citation")
except FileExistsError:
    print("Directory utils_citation already exists, skipping creation.")
os.chdir("utils_citation")

# Define the command to run amaceing_utils
command = ["amaceing_utils"]

# Define the answers to the questions as a string (each answer separated by a newline)
answers = """4
../data/mace_input.log
"""
# Run the command with the provided answers
print(run_atk(command, answers))
os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
Which type of calculation do you want to run? (1=EVAL_ERROR, 2=PREPARE_EVAL_ERROR, 3=EXTRACT_XYZ, 4=MACE_CITATIONS, 5=BE

## Overview of the amaceing_utils Output

The `amaceing_utils` function prints the citations for a specific MACE run. The output consists of the following file:

1. **Log File**: `utils.log` A log file documenting the configuration and parameters used in the input file. (Gives the possibility to recreate the input file, ...)


## End of Q&A Process

## One-Command Process
The same process can be done in one command using the `amaceing_utils` function. Here’s how to use it:

```bash
amaceing_utils --run_type="MACE_CITATIONS" --config="{'log_file': '../data/mace_input.log'}"
```

In [16]:
try:
    os.mkdir("citation_1command")
except FileExistsError:
    print("Directory citation_1command already exists, skipping creation.")
os.chdir("citation_1command")

command = """amaceing_utils --run_type="MACE_CITATIONS" --config="{'log_file': '../data/mace_input.log'}" """

# Run 1-Command amaceing_utils 
subprocess.run(command, shell=True)

os.chdir("..")


    ┌──────────────────────────────────────────────────────────────────────────────────────────┐
    │              __  ______   _____________                 __              ____   _ __      │
    │       ____ _/  |/  /   | / ____/ ____(_)___  ____ _    / /_____  ____  / / /__(_) /_     │
    │      / __ `/ /|_/ / /| |/ /   / __/ / / __ \/ __ `/   / __/ __ \/ __ \/ / //_/ / __/     │
    │     / /_/ / /  / / ___ / /___/ /___/ / / / / /_/ /   / /_/ /_/ / /_/ / / ,< / / /_       │
    │     \__,_/_/  /_/_/  |_\____/_____/_/_/ /_/\__, /____\__/\____/\____/_/_/|_/_/\__/       │
    │                                           /____/_____/                                   │
    │     by Jonas Hänseroth, Theoretical Solid-State Physics, Ilmenau University of Technology│
    └──────────────────────────────────────────────────────────────────────────────────────────┘ 
    
../data/mace_input.log

Citations for MACE:
 1. Ilyes Batatia, David Peter Kovacs, Gregor N. C. Simm, Christoph Ortner, 