Commit

👌IMPROVE: running processes sections (#431)
mbercx committed Sep 26, 2022
1 parent fec8a75 commit be2ca1f
Showing 2 changed files with 53 additions and 16 deletions.
16 changes: 10 additions & 6 deletions docs/sections/running_processes/basics.md
@@ -241,11 +241,11 @@ SSSP/1.1/PBE/precision pseudo.family.sssp 85
SSSP/1.1/PBE/efficiency pseudo.family.sssp 85
:::

You can see above that the AiiDAlab cluster already comes with
The list of pseudopotential families might differ for you, depending on where you are running the tutorial.

::::{note}

If you are using the Quantum Mobile virtual machine, you will need to install the `SSSP` [pseudopotentials][pseudopotentials].
If you do not see any pseudopotential families in the list, you will need to install the `SSSP` [pseudopotentials][pseudopotentials].
Luckily, doing it with `aiida-pseudo` is easy!
All you need to do is run:

@@ -296,13 +296,13 @@ We will now show you how to do it using a *builder*, which is a tool that is par
The simplest way to get a builder for a calculation is from a code node, so load the one we checked at the beginning of this module:

:::{margin}
**Remember:** you need to replace `<CODE_PK>` with the PK of the `pw.x` code in your database!
You can also use the label.
**Remember:** you need to replace `<CODE_LABEL>` with the label of the `pw.x` code in your database!
You can also use the PK, but the label is probably easier to remember.
:::

:::{code-block} ipython

In [1]: code = load_code(<CODE_PK>)
In [1]: code = load_code(<CODE_LABEL>)

:::

@@ -424,7 +424,7 @@ The `builder.parameters` port requires a `Dict` node (you can verify this by run

:::{code-block} ipython

In [9]: builder.parameters = Dict(dict=parameters)
In [9]: builder.parameters = Dict(parameters)

:::

@@ -625,6 +625,10 @@ Out[3]: -310.56907438957

Moreover, you can also easily access the input and output files of the calculation using the `verdi` CLI:

:::{margin}
The `<PK>` here should correspond to the one of your calculation node.
:::

:::{code-block} console
$ verdi calcjob inputls <PK> # Shows the list of input files
$ verdi calcjob inputcat <PK> # Shows the input file of the calculation
53 changes: 43 additions & 10 deletions docs/sections/running_processes/errors.md
@@ -3,9 +3,9 @@
# Troubleshooting

In this section we will intentionally introduce some bad input parameters when setting up our calculation.
This will allow us to illustrate how to 'manually' debug problems that might arise while managing your computations with AiiDA.
This will allow us to illustrate how to manually debug problems that might arise while managing your computations with AiiDA.
You will learn where to look for the information of the errors and the steps you need to take to correct the issue.
In subsequent tutorial sections you can then learn how to systematize this error handling when designing complex workflows.
In later tutorial sections you can then learn how to systematize this error handling when designing complex workflows.

:::{attention}

@@ -52,7 +52,7 @@ If you are running these tests on a cluster, you may need to set up account perm

For the `structure` you can download the following {download}`silicon crystal<include/data/Si.cif>` and import it into your database.
If you have already done so previously (as it is used in other tutorial sections), you may want to use that pre-existing node instead of saving a new node with repeated information.
To do so you may search for its PK by running `verdi data structure list` and then use the function `load_node()` to retrieve it.
To do so you may search for its PK by running `verdi data core.structure list` and then use the function `load_node()` to retrieve it.

For the `pseudos` (or [pseudopotentials](https://en.wikipedia.org/wiki/Pseudopotential)), you can use the `SSSP/1.1/PBE/efficiency` family of the `aiida-pseudo` package.
If you already have it installed, it is enough to use the `load_group()` function and then the `get_pseudos()` method of the loaded pseudo group.
@@ -90,7 +90,7 @@ Finally, wrap the standard Python dictionary `parameters_dictionary` in an AiiDA

```{code-block} ipython
In [4]: builder.parameters = Dict(dict=parameters_dictionary)
In [4]: builder.parameters = Dict(parameters_dictionary)
```

@@ -173,8 +173,8 @@ $ verdi process list -a -p1
```

Your calculation should end up in a finished state, but with some error: this was expected in this case, since we used an invalid key in the input parameters.
You will see this represented by a non-zero error code in brackets near the "Finished" status of the Process State.
Your calculation should end up in a `Finished` state, but in this case the _exit code_ of the calculation, found after the `Finished` state, is `[305]` instead of zero (`[0]`).
This indicates that the process did not complete successfully, which was expected since we used an invalid key in the input parameters.
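
As a rough illustration of the zero versus non-zero convention described above, the following sketch splits a state string such as `Finished [305]` into its state and exit code. The helper names `parse_state` and `finished_ok`, and the exact `State [code]` layout of the string, are assumptions made for this example; they are not part of AiiDA or the `verdi` CLI.

```python
import re

# Hypothetical helper, not part of AiiDA: the "State [code]" layout is assumed.
STATE_PATTERN = re.compile(r"(?P<state>[A-Za-z]+)\s*\[(?P<code>\d+)\]")


def parse_state(text):
    """Return (state, exit_code) from a string like 'Finished [305]', or None."""
    match = STATE_PATTERN.search(text)
    if match is None:
        return None
    return match.group("state"), int(match.group("code"))


def finished_ok(text):
    """Return True only for a Finished state with exit code zero."""
    return parse_state(text) == ("Finished", 0)
```

With these helpers, `finished_ok("Finished [305]")` evaluates to `False`, matching the failed calculation discussed here, while `finished_ok("Finished [0]")` evaluates to `True`.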

:::{note}

@@ -242,7 +242,18 @@ $ verdi calcjob outputcat <PK> | less
```

You will see an error message complaining about the `mickeymouse` line in the input.
You will see an error message complaining about the `mickeymouse` line in the input:

```{code-block}
Reading input from aiida.in
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Error in routine read_namelists (1):
bad line in namelist &system: " mickeymouse = 2.4000000000d+02" (error could be in the previous line)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
stopping ...
```
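
If you would rather not scroll through a long output file, a small script can pull out the error block for you. The sketch below assumes the error message is always framed by two lines made up only of `%` characters, as in the excerpt above; the function names `extract_error_block` and `routine_name` are hypothetical and not part of `aiida-quantumespresso`.

```python
import re

# Sample text modelled on the excerpt above (delimiter lines shortened).
SAMPLE_OUTPUT = """\
     Reading input from aiida.in
 %%%%%%%%%%%%%%%%%%%%
     Error in routine read_namelists (1):
     bad line in namelist &system: " mickeymouse = 2.4000000000d+02" (error could be in the previous line)
 %%%%%%%%%%%%%%%%%%%%
     stopping ...
"""


def extract_error_block(output):
    """Return the stripped lines between the first pair of '%'-only lines."""
    block = []
    inside = False
    for line in output.splitlines():
        stripped = line.strip()
        is_delimiter = len(stripped) >= 3 and set(stripped) == {"%"}
        if is_delimiter:
            if inside:
                return block  # closing delimiter reached
            inside = True  # opening delimiter reached
        elif inside:
            block.append(stripped)
    return block  # no block, or an unterminated one: return what was collected


def routine_name(block):
    """Return the routine named in an 'Error in routine ...' line, or None."""
    for line in block:
        match = re.match(r"Error in routine (\S+)", line)
        if match:
            return match.group(1)
    return None
```

Calling `extract_error_block(SAMPLE_OUTPUT)` returns the two lines of the message, and `routine_name` applied to that result gives `read_namelists`. In practice you could pass in the text printed by `verdi calcjob outputcat <PK>` instead of the sample string.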

### Checking the inputs

@@ -283,14 +294,27 @@ In [9]: parameters_dictionary = {
...: 'electron_maxstep': 3,
...: }
...: }
...: builder.parameters = Dict(dict=parameters_dictionary)
...: builder.parameters = Dict(parameters_dictionary)
...: calculation = submit(builder)
```

Use `verdi process list -a -p1` to verify that the error code is different now.
You can check again the outputs and the reports with the tools explained in this section and try to fix it yourself before going on to the next.

:::{note}

Curious which exit codes are defined for the `PwCalculation` plugin?
You can see all exit codes for a calculation job using the `verdi plugin list` command:

```{code-block} console
verdi plugin list aiida.calculations quantumespresso.pw
```

:::


## Restarting calculations

@@ -332,7 +356,7 @@ The `aiida-quantumespresso` plugin supports restarting a calculation by setting
In [12]: parameters['CONTROL']['restart_mode'] = 'restart'
...: restart_builder.parent_folder = failed_calculation.outputs.remote_folder
...: restart_builder.parameters = Dict(dict=parameters)
...: restart_builder.parameters = Dict(parameters)
```

@@ -350,9 +374,18 @@ Finally, let's label this calculation as a restarted one and submit the new calc
In [13]: from aiida.engine import submit
...: restart_builder.metadata.label = 'Restart from PwCalculation<{}>'.format(failed_calculation.pk)
...: calculation = submit(restart_builder)
...: calcjob_node = submit(restart_builder)
```

Inspect the restarted calculation to verify that, this time, it completes successfully.
You should see a "Finished" status with exit code zero (`0`) when running `verdi process list -a -p1`.

:::{important} **Key takeaways**

- When a process fails, it will return a non-zero **exit code**.
- The logs of any process can be shown using `verdi process report`.
More clues of what went wrong in a failed calculation can be obtained by exploring the output files using `verdi calcjob outputcat`.
- A fully populated builder with the same inputs can be obtained with the `get_builder_restart()` method of any calcjob node.

:::
