# 3. Remote computing with 4C

In this tutorial, you will use the `Grid` iterator to probe the response surface of a 4C model.
The 4C simulations will run remotely on a cluster.  

> **Disclaimer:**
> You will only be able to execute the following notebook if you have access to a remote computing resource and fill in the placeholders accordingly.

## Set up the remote machine

Remote computing with QUEENS is enabled via SSH port forwarding, so a few initial steps are necessary:

1. Make sure both your local and your remote machine have an SSH key under `~/.ssh/id_rsa.pub`.
    In case either of them does not have one yet, you can generate an SSH key on the respective machine via:
    ```bash
    # execute on local or remote machine:
    ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa
    ```

1. Connecting via SSH from the local to the remote machine needs to work without a password.   
    Therefore, you need to copy the public key of the local machine to the `authorized_keys` file of the remote machine:
    ```bash
    # execute on local machine:
    ssh-copy-id -i ~/.ssh/id_rsa.pub <username>@<machine_you_would_like_to_access_passwordless>
    ```

1. To enable passwordless access on the remote machine itself, you also need to append the public SSH key of the remote machine to its own `authorized_keys` file:
    ```bash
    # execute on remote machine:
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    ```
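
To confirm that the key-based login works before starting QUEENS, you can attempt a non-interactive SSH connection. The following is a minimal sketch and not part of QUEENS: the helper names `build_ssh_check_command` and `can_login_without_password` are hypothetical, and it assumes that an OpenSSH client is available on your `PATH`. The `BatchMode=yes` option makes `ssh` fail instead of prompting for a password.

```python
import subprocess


def build_ssh_check_command(user, host, timeout_s=10):
    """Build an ssh command that fails instead of asking for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",  # never prompt for a password
        "-o", f"ConnectTimeout={timeout_s}",
        f"{user}@{host}",
        "true",  # harmless remote command; only the exit code matters
    ]


def can_login_without_password(user, host, timeout_s=10):
    """Return True if the SSH login succeeds without any interaction."""
    command = build_ssh_check_command(user, host, timeout_s)
    completed = subprocess.run(command, capture_output=True, check=False)
    return completed.returncode == 0


# Usage (fill in your own values):
# can_login_without_password("<username>", "<hostname>")
```

If this returns `False`, the troubleshooting hints on file permissions are a good place to start.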

> **Troubleshooting**:
> If you are still asked for your password after these steps, verify that:
> - The `~/.ssh` directory has permissions `700`.
>   To set the permissions correctly, execute `chmod -R 700 ~/.ssh`.
> - The home directory on your remote machine has permissions `700`.
>   To set the permissions correctly, execute `chmod 700 ~` on the remote machine.
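
The permission checks above can also be scripted. This is a minimal sketch using only the Python standard library; the helper names `permission_bits` and `report_permissions` are hypothetical and not part of QUEENS. Run it on the machine whose directories you want to inspect.

```python
import stat
from pathlib import Path


def permission_bits(path):
    """Return the permission bits of a path as an octal string, e.g. '700'."""
    return format(stat.S_IMODE(Path(path).stat().st_mode), "03o")


def report_permissions(paths, expected="700"):
    """Map each path to True if its permission bits equal `expected`."""
    return {str(path): permission_bits(path) == expected for path in paths}


# Usage: check the home directory and ~/.ssh of the current user
# print(report_permissions([Path.home(), Path.home() / ".ssh"]))
```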

4. Clone the QUEENS repository on the remote machine.

5. On the remote machine, install the same QUEENS environment, under the same name, that you use on your local machine.

> **Subclassing**:
> If you run QUEENS with custom classes that inherit from QUEENS objects, you need to ensure that these classes are also available in the remote QUEENS environment.
> You can do this in one of the following ways:
> 1. Include the custom classes in the QUEENS repository. They will be automatically copied to the remote QUEENS repository at the start of the QUEENS runs.
> 1. Manually synchronize the custom classes between your local and your remote machine, e.g., via an additional repository that you install in the remote QUEENS environment.

## Set up the QUEENS experiment

In the following, we will set up the QUEENS experiment for cluster usage:
1. To evaluate 4C on a cluster, use a `Jobscript` driver instead of a `Fourc` driver and adjust the paths:
    - **`input_templates`**: The local path to your 4C input template. QUEENS will copy it to the remote machine for you.
    - **`jobscript_template`**: The local path to your jobscript template. For inspiration, check out our templates under `templates/jobscripts/`.
    - **`executable`**: The absolute path to the 4C executable on the remote machine.

In [None]:
jobscript_driver_kwargs = {
    "jobscript_template": "<PATH_TO_JOBSCRIPT_TEMPLATE>",
    "executable": "<PATH_TO_4C_EXECUTABLE>",
    "extra_options": {},  # optional
}


2. Switch to the `Cluster` scheduler and set up a `RemoteConnection` for it:
    - **`remote_python`**: The absolute path to the Python executable in your conda/mamba environment on the remote machine. You can check your available environments via `conda info --envs`. The path typically looks like `/home/<user>/<conda_install>/envs/queens/bin/python`.
    - **`remote_queens_repository`**: The absolute path to your QUEENS repository on the remote machine.

In [None]:
remote_connection_kwargs = {
    "host": "<REMOTE_HOSTNAME>",
    "user": "<REMOTE_USERNAME>",
    "remote_python": "<PATH_TO_REMOTE_PYTHON>", 
    "remote_queens_repository": "<PATH_TO_REMOTE_QUEENS_REPOSITORY>",
    "gateway": None,  # optional
}
cluster_scheduler_kwargs = {
    "workload_manager": "<WORKLOAD_MANAGER>", 
    "queue": "<QUEUE>",
    "cluster_internal_address": "<CLUSTER_INTERNAL_ADDRESS>",
}
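
Since the notebook only runs once all placeholders are filled in, a quick check before starting the experiment can save a failed run. The sketch below is not part of QUEENS; the helper `find_unfilled_placeholders` is hypothetical and simply assumes that placeholders have the form `<UPPER_CASE_NAME>`, as in the cells above.

```python
import re

# Matches values such as "<REMOTE_HOSTNAME>" or "<PATH_TO_4C_EXECUTABLE>"
PLACEHOLDER_PATTERN = re.compile(r"^<[A-Z0-9_]+>$")


def find_unfilled_placeholders(**kwargs_by_name):
    """Return the entries whose value still is a '<PLACEHOLDER>' string."""
    unfilled = []
    for dict_name, kwargs in kwargs_by_name.items():
        for key, value in kwargs.items():
            if isinstance(value, str) and PLACEHOLDER_PATTERN.match(value):
                unfilled.append(f"{dict_name}[{key!r}]")
    return unfilled


# Example with a partially filled dictionary
example_kwargs = {"host": "<REMOTE_HOSTNAME>", "user": "jane", "gateway": None}
print(find_unfilled_placeholders(example_kwargs=example_kwargs))
```

In the notebook, you could pass `jobscript_driver_kwargs`, `remote_connection_kwargs`, and `cluster_scheduler_kwargs` as keyword arguments in the same way.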

### Run the example

In [None]:
experiment_name = "grid_iterator_4C_remote"
output_dir = "./"

In [None]:
from queens.data_processors import PvdFile
from queens.distributions import Uniform
from queens.drivers import Jobscript
from queens.global_settings import GlobalSettings
from queens.iterators import Grid
from queens.main import run_iterator
from queens.models import Simulation
from queens.parameters.parameters import Parameters
from queens.schedulers import Cluster
from queens.utils.remote_operations import RemoteConnection
from queens.utils.path import relative_path_from_root


with GlobalSettings(
    experiment_name=experiment_name, output_dir=output_dir, debug=False
) as gs:
    # Parameters parameterizing a Neumann BC
    parameter_1 = Uniform(lower_bound=0.0, upper_bound=1.0)
    parameter_2 = Uniform(lower_bound=0.0, upper_bound=1.0)
    parameters = Parameters(parameter_1=parameter_1, parameter_2=parameter_2)

    # The data processor extracts the displacement vectors (with x, y, z component) of all nodes at 
    # the last time step of the simulation
    data_processor = PvdFile(
        field_name="displacement",
        file_name_identifier="*.pvd",
        file_options_dict={},
    )

    # Establish an SSH connection to the cluster
    remote_connection = RemoteConnection(**remote_connection_kwargs) 
    
    scheduler = Cluster(
        experiment_name,
        walltime="00:10:00", 
        remote_connection=remote_connection, 
        num_jobs=1,
        min_jobs=1,
        num_procs=1, 
        num_nodes=1,
        **cluster_scheduler_kwargs,
    )

    # The driver handles the actual evaluation of 4C
    driver = Jobscript(
        parameters=parameters,
        data_processor=data_processor,
        input_templates=relative_path_from_root("tutorials/solid_runtime_hex8.4C.yaml"), 
        **jobscript_driver_kwargs,
    )
    
    model = Simulation(scheduler, driver)
    
    # Analysis setup
    grid_design = {
            "parameter_1": {
                "num_grid_points": 3,
                "axis_type": "lin",
                "data_type": "FLOAT",
            },
            "parameter_2": {
                "num_grid_points": 3,
                "axis_type": "lin",
                "data_type": "FLOAT",
            },
        }
    iterator = Grid(
        model,
        parameters,
        global_settings=gs,
        grid_design=grid_design,
        result_description={"write_results": True, "plot_results": False},
    )

    # Run the analysis
    run_iterator(iterator, gs)
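
To see which parameter combinations this grid design requests, the sketch below rebuilds the 3 x 3 grid with NumPy, independently of QUEENS. It assumes that `"axis_type": "lin"` places the grid points equidistantly between the bounds of the `Uniform` distributions, including both bounds; the order in which QUEENS evaluates the points may differ.

```python
import itertools

import numpy as np

# Bounds taken from the Uniform distributions defined above
bounds = {"parameter_1": (0.0, 1.0), "parameter_2": (0.0, 1.0)}
num_grid_points = 3

# One equidistant axis per parameter (assumption for "lin" axes)
axes = [np.linspace(lower, upper, num_grid_points) for lower, upper in bounds.values()]

# Cartesian product of the axes: one row per simulation
grid_points = np.array(list(itertools.product(*axes)))
print(grid_points.shape)  # (9, 2)
print(grid_points)
```

Each of the 9 rows corresponds to one 4C simulation, which is why the result array in the next section has a leading dimension of 9.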

### Evaluate the results

Look at the results and analyze them.

In [None]:
import numpy as np
from pathlib import Path

from queens.utils.io import load_result

# Load the results
result_file = Path(output_dir) / f"{experiment_name}.pickle"
results = load_result(result_file)

# This yields the displacement components (x, y, z) for each of the 9 grid points on each node 
# of each element 
# (Here: 2 elements with 8 nodes each, the output is written for each element individually), 
# so the resulting array is expected to have the shape (9, 16, 3)
raw_displacements = results["raw_output_data"]["result"]

# Compute the displacement magnitudes for each run on each node of each element.
# The resulting array has shape (9, 16).
point_wise_displacement_magnitudes = np.sqrt(np.sum(raw_displacements ** 2, axis=-1))

# Finally, we compute the maximum displacement that was achieved in each run.
# The resulting array has shape (9,).
max_displacement_magnitude_per_run = np.max(point_wise_displacement_magnitudes, axis=1)
print(max_displacement_magnitude_per_run)
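
If you do not have results at hand yet, you can follow the array reductions on synthetic data. The sketch below uses random numbers in place of the 4C output (the values are not simulation results) and only assumes the shape `(9, 16, 3)` stated above. It uses `np.linalg.norm`, which is equivalent to the square root of the summed squares.

```python
import numpy as np

# Synthetic stand-in for results["raw_output_data"]["result"]
rng = np.random.default_rng(seed=0)
fake_displacements = rng.normal(size=(9, 16, 3))

# Euclidean norm over the (x, y, z) components -> shape (9, 16)
magnitudes = np.linalg.norm(fake_displacements, axis=-1)

# Maximum over all nodes of each run -> shape (9,)
max_per_run = magnitudes.max(axis=1)

print(magnitudes.shape, max_per_run.shape)  # (9, 16) (9,)
```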

## Where do I find all the data on the cluster?

As for local runs, the data is stored in a folder with the following naming scheme:  
`$HOME/queens-experiments/<experiment_name>/<job_id>`

For example, you can find the data of the first simulation of this QUEENS experiment in the folder `$HOME/queens-experiments/grid_iterator_4C_remote/1`.

Feel free to take a look around and to find the logged 4C console output of one of the simulations.
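
The folder naming scheme can be written down as a small helper. This is an illustrative sketch only: `remote_job_dir` is a hypothetical function, not a QUEENS API, and `$HOME` is kept as a literal string that the shell on the cluster expands.

```python
from pathlib import PurePosixPath


def remote_job_dir(experiment_name, job_id, remote_home="$HOME"):
    """Build the job folder path following the naming scheme above."""
    return PurePosixPath(remote_home) / "queens-experiments" / experiment_name / str(job_id)


print(remote_job_dir("grid_iterator_4C_remote", 1))
# $HOME/queens-experiments/grid_iterator_4C_remote/1
```

You can paste the printed path into an SSH session on the cluster, e.g., as an argument to `ls`.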

## Lessons learned:

You have learned how to run 4C simulations on a cluster:

1. Use a `Jobscript` driver and ensure correct paths to the executables:
    - `executable` now refers to a path on the cluster.
    - `input_templates` still refers to a local path, which makes it easy to adjust the file.
1. Use a `Cluster` scheduler and supply all necessary options.
1. The location of the QUEENS data on the cluster is `$HOME/queens-experiments`.