# Quick start: ground state calculation for benzene

Postopus (the POST-processing of Octopus data) is an environment that can help with the analysis of data that was computed using the Octopus TDDFT package.

In this notebook we give a few quick examples of how easily you can access Octopus data using Postopus.
The notebook assumes that Octopus is installed and that the `octopus` executable is in your (bash) path.


## Running the simulation
Octopus takes an input file describing the systems and the simulation parameters.
The input file is called `inp` and is a text file with no extension.

(The `!command` IPython Magic lets us execute bash commands in the notebook.)

Create the directory in which we run octopus:

In [None]:
!mkdir -p "examples/benzene"

In [None]:
cd "examples/benzene"

Now create the `inp` file. This is achieved with the [`%%writefile`](https://ipython.readthedocs.io/en/stable/interactive/magics.html#cellmagic-writefile) magic command:

In [None]:
%%writefile inp

stdout = "gs_stdout.txt"
stderr = "gs_stderr.txt"

CalculationMode = gs
UnitsOutput = eV_Angstrom

Radius = 5*angstrom
Spacing = 0.15*angstrom

%Output 
  wfs 
  density 
  elf 
  potential
%

OutputFormat = cube
OutputDuringSCF = yes
OutputInterval = 5

XYZCoordinates = "benzene.xyz"
FromScratch = true


Similarly we define the geometry file (benzene.xyz):

In [None]:
%%writefile benzene.xyz
12
Geometry of benzene (in Angstrom)
   C  0.000  1.396  0.000
   C  1.209  0.698  0.000
   C  1.209 -0.698  0.000
   C  0.000 -1.396  0.000
   C -1.209 -0.698  0.000
   C -1.209  0.698  0.000
   H  0.000  2.479  0.000
   H  2.147  1.240  0.000
   H  2.147 -1.240  0.000
   H  0.000 -2.479  0.000
   H -2.147 -1.240  0.000
   H -2.147  1.240  0.000

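As a quick sanity check of the geometry, the following minimal sketch parses the XYZ text with plain Python and prints two bond lengths. The helper `parse_xyz` is a hypothetical name introduced here for illustration; it is not part of Octopus or Postopus.

```python
import math

# Same geometry as benzene.xyz above, embedded so the sketch is self-contained.
XYZ_TEXT = """12
Geometry of benzene (in Angstrom)
   C  0.000  1.396  0.000
   C  1.209  0.698  0.000
   C  1.209 -0.698  0.000
   C  0.000 -1.396  0.000
   C -1.209 -0.698  0.000
   C -1.209  0.698  0.000
   H  0.000  2.479  0.000
   H  2.147  1.240  0.000
   H  2.147 -1.240  0.000
   H  0.000 -2.479  0.000
   H -2.147 -1.240  0.000
   H -2.147  1.240  0.000
"""


def parse_xyz(text):
    """Return a list of (element, (x, y, z)) tuples from XYZ-format text."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])  # first line: number of atoms
    atoms = []
    for line in lines[2 : 2 + n_atoms]:  # second line is a free-form comment
        element, x, y, z = line.split()
        atoms.append((element, (float(x), float(y), float(z))))
    return atoms


atoms = parse_xyz(XYZ_TEXT)
carbons = [pos for element, pos in atoms if element == "C"]
hydrogens = [pos for element, pos in atoms if element == "H"]

# The first two carbons are ring neighbours; the first hydrogen is bonded
# to the first carbon.
cc_bond = math.dist(carbons[0], carbons[1])
ch_bond = math.dist(carbons[0], hydrogens[0])
print(f"C-C: {cc_bond:.3f} Angstrom, C-H: {ch_bond:.3f} Angstrom")
```
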

Now we can invoke octopus:

In [None]:
!octopus

Octopus stores the calculation results in the project directory.
If you were to take a look at the `benzene` project folder, you would see something like:

    ├── benzene.xyz                     # Geometry of the molecules (input)
    ├── exec                            # Runtime information
    │   ├── initial_coordinates.xyz
    │   ├── messages
    │   ├── oct-status-finished
    │   └── parser.log                  # Full set of variables used for the run
    ├── inp                             # Our input file 
    ├── out_gs.log                      # Log file
    ├── output_iter                     # Output for each iteration in scf/td calculation
    │   ├── scf.0005
    │   │   ├── density.cube
    │   │   └── ...
    │   └── scf.0010
    │       ├── density.cube
    │       └── ...
    ├── restart
    │   └── gs                          # Ground state calculations
    │       ├── 0000000001.obf          # Checkpoint file to restart calculation in case of job abortion
    │       ├── 0000000002.obf
    │       ├── ...
    │       └── wfns
    └── static                          # Ground state observables
        ├── info                        # Result of ground state calculation (like energy eigenvalues and forces)
        ├── convergence
        ├── density.cube
        ├── ...
        └── wf-st0015.z=0

These new files are the results of the simulation.
We will now analyse the output files to examine the simulation results. The Postopus package helps in this part.
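
Before analysing anything, it can be worth checking that the files we rely on actually exist. The sketch below does this with `pathlib`. The file names are taken from the directory listing above, the helper `missing_outputs` is a hypothetical name, and treating `exec/oct-status-finished` as a marker of a completed run is an assumption based on its name.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_OUTPUTS = [
    "exec/oct-status-finished",  # assumed to mark a completed run
    "static/info",
    "static/convergence",
]


def missing_outputs(project_dir=".", expected=EXPECTED_OUTPUTS):
    """Return the expected output files that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in expected if not (root / name).exists()]


missing = missing_outputs(".")
if missing:
    print("Missing output files:", ", ".join(missing))
else:
    print("All expected output files are present.")
```

An empty list means every listed file was found; anything else is a hint to inspect the Octopus log before continuing.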

## Postopus

First, we have to import the Octopus run using Postopus:

In [None]:
from postopus import Run

run = Run(".")  # or equivalent run = Run()

### System selection

Now we have to select the system. To list available systems we print the run object. If the [multisystem](https://octopus-code.org/documentation/13/tutorial/multisystem/introduction/) feature of Octopus is not used, Postopus uses the name "default".

In [None]:
run

### Calculation modes

We can take a look at the different types of calculations done in this system by running:

In [None]:
run.default

This tells us that this project had only a self-consistent field simulation (`scf`).

### Outputs

Printing the calculation mode gives us a list of available outputs:

In [None]:
run.default.scf

### Autocompletion

The tab completion feature available in Jupyter notebooks and IPython supports navigating the systems, calculation modes and outputs.
Here is an example image of how the tab completion feature works:

![images/tab_completion_eg.png](images/tab_completion_eg.png)


## Output files

Postopus makes all the files of the Octopus simulation easily accessible. The data is mostly provided as a [pandas](https://pandas.pydata.org/) DataFrame or an [xarray](https://docs.xarray.dev/en/stable/) DataArray/Dataset.

### Table-like structured files
We can take a look at the convergence of the system's energy after each SCF iteration through:

In [None]:
run.default.scf.convergence()

To visualize the convergence of the energy, we can run the following:

In [None]:
# Plot of energy at each iteration showing its convergence
fig = run.default.scf.convergence()["energy"].plot(
    title="Energy convergence",
    marker=".",
    markersize=10,
)
fig.set(xlabel="Iteration", ylabel="Energy (eV)");

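Because the convergence data is a DataFrame, ordinary pandas operations apply to it. The sketch below computes the energy change between consecutive iterations and finds the first iteration where it drops below a threshold. The energies, the `iteration` index and the tolerance are hypothetical stand-ins so that the example runs without Octopus output; only the `energy` column name is taken from the cell above.

```python
import pandas as pd

# Hypothetical energies (eV), standing in for run.default.scf.convergence().
convergence = pd.DataFrame(
    {"energy": [-1000.0, -1010.0, -1012.0, -1012.1, -1012.105]},
    index=pd.Index(range(1, 6), name="iteration"),
)

# Absolute energy change between consecutive iterations; the first entry is
# NaN because there is no previous iteration to compare with.
energy_change = convergence["energy"].diff().abs()

tolerance = 0.01  # eV, hypothetical threshold
below = energy_change < tolerance

# idxmax returns the label of the first True value; guard against the case
# where no iteration satisfies the threshold.
first_converged = below.idxmax() if below.any() else None

print(energy_change)
print("First iteration below tolerance:", first_converged)
```
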
### Fields

We can simply load the data of the output across all iterations (including the output in `static/`):

In [None]:
density_data = run.default.scf.density("cube")
density_data

The data object is an `xarray` object containing not only the data values, but also the coordinates for which these values are defined:

In [None]:
density_data.coords

For more information on the units take a look at the [units tutorial](units.ipynb).

## Working with the data: Xarray

First, let us access the data from the last iteration only:

In [None]:
xa = density_data.isel(step=-1)
xa

The first line of output of the previous cell gives an overview of the Xarray data inside the `density` field, including the number of sampling points in each direction:

    xarray.DataArray 'density' (x: 95, y: 99, z: 67)

So there are 95 sampling points in the x-direction and similarly 67 for the z-direction.
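
These numbers can be cross-checked against the input file with a little arithmetic. The sketch below is a back-of-the-envelope estimate, not a description of how Octopus builds its mesh: it assumes that the simulation box reaches a distance `Radius` beyond the outermost atom along each axis and that grid points sit at integer multiples of `Spacing` around the origin. The helper `points_along_axis` is a hypothetical name.

```python
import math

radius = 5.0    # Radius from inp, in Angstrom
spacing = 0.15  # Spacing from inp, in Angstrom

# Largest |coordinate| of any atom along each axis, read off benzene.xyz.
max_atom_extent = {"x": 2.147, "y": 2.479, "z": 0.0}


def points_along_axis(extent, radius, spacing):
    """Estimate the number of grid points along one axis (assumptions above)."""
    # Number of whole grid steps that fit between the origin and the box edge.
    half_points = math.floor((extent + radius) / spacing)
    # Same count on both sides, plus the point at the origin.
    return 2 * half_points + 1


shape = tuple(
    points_along_axis(max_atom_extent[axis], radius, spacing) for axis in "xyz"
)
print(shape)  # (95, 99, 67)

# With an odd number of points, the middle index is the one at the origin.
centre_index = shape[2] // 2
print(centre_index)  # 33
```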
Suppose we want to view the density of benzene in the x-y plane (i.e. at $z=0$).
We can do so by selecting the slice at $z=0$. One way to do this for this particular simulation is to ask Xarray for the slice at the central index $i_z=\lfloor 67/2 \rfloor=33$:


In [None]:
s0 = xa.isel(z=33)  # Viewing the slice at index 33 of the z-axis
s0

This slice is another Xarray object. Note that the coordinate value for `z` is `3e-06`, i.e. effectively zero, which is the value of the `z` coordinate at index 33 (the 34th sampling point, since indices start at 0) in the z-direction.
We can now plot this slice using the `plot()` method of the Xarray object:

(The `;` after `s0.plot()` is a trick used to prevent IPython from printing the result of `s0.plot()` which would show something like `<matplotlib.collections.QuadMesh at 0x7f181bbf8a10>` above the actual plot.)

In [None]:
s0.plot();

Note that the plot has the x and y axes swapped. One can change this by passing the `x='x'` argument to the `plot()` method.

In [None]:
s0.plot(x="x");

Another way to slice the data is by using the `sel()` method of the Xarray object, where you can pass the *coordinate value* instead of the *index*.

For example, to get the same slice as above, we can use:

In [None]:
s1 = xa.sel(z=0, method="nearest")
s1.plot(x="x");

Note that plots automatically display the value of the position of the slice in the z-direction as well as the step number. 
This is possible because Xarray maintains the metadata of the coordinates even after slicing.
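
The difference between index-based and value-based selection can be seen on a small synthetic array. The sketch below is a minimal illustration with arbitrary values and a made-up 3x3x3 grid standing in for the density; it uses only standard xarray calls (`isel`, `sel` with `method="nearest"`, `identical`).

```python
import numpy as np
import xarray as xr

# Small synthetic field standing in for the density; values are arbitrary.
axis = [-0.15, 0.0, 0.15]
field = xr.DataArray(
    np.arange(27, dtype=float).reshape(3, 3, 3),
    dims=("x", "y", "z"),
    coords={"x": axis, "y": axis, "z": axis},
    name="density",
)

by_index = field.isel(z=1)  # select by position along z
by_value = field.sel(z=0.02, method="nearest")  # select by coordinate value

# Both selections pick the same slice, including its coordinates.
print(by_index.identical(by_value))

# The selected z value survives as a scalar coordinate on the 2D slice.
print(float(by_value.coords["z"]))
```

The scalar `z` coordinate kept on the slice is the metadata that plots use to label the position of the slice.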

One can have a side view of the molecule by slicing the data in the y-z plane (e.g. at $x=0$) in a similar fashion:

    xa.sel(x=0, method='nearest')

Using a `for` loop and some commands from the `matplotlib` plotting library, we can plot 6 different slices of the data in a 2x3 grid (2 rows, 3 columns) at different values of x.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# create a subplot with 2 rows, 3 columns
fig, axs = plt.subplots(2, 3, figsize=(10, 5))

# x positions from -7.5 to 5 in 6 steps
x_positions = np.linspace(-7.5, 5, 6)

for ax, x in zip(axs.flat, x_positions):
    # plot the slice nearest to x position
    xa.sel(x=x, method="nearest").plot(ax=ax, x="y")

fig.tight_layout()  # avoid overlap of labels

The `sel` method can also be used to select multiple coordinates at once. For example, to select the data at $x=0$ and $z=0$ we can do:

    xa.sel(x=0, z=0, method='nearest')

We can then plot the variation of density along the y axis. Because we have restricted two (`x` and `z`) of the three spatial dimensions, we obtain a 'line plot' of the density along the remaining coordinate dimension `y`.

In [None]:
xa.sel(x=0, z=0, method="nearest").plot()

A more detailed exploration of plotting with Xarray is available in the [postopus plotting tutorial](https://octopus-code.gitlab.io/postopus/notebooks/xarray-plots1.html#).

## Plotting with holoviews

A plotting library that is built for interactive/multidimensional plots is [holoviews](https://holoviews.org/).
Converting xarray objects (the most common output type in postopus) to holoviews objects is simple, as we will see. Holoviews supports many backends for plotting. In this tutorial we will use bokeh (the default plotting backend for holoviews).

The holoviews approach to plotting may seem unconventional (and perhaps confusing) at first, but it is very powerful. In particular, holoviews allows the selective and/or interactive visualization of data that is defined in more dimensions than we can easily visualize. To use that power, we need to provide some additional information to the plotting commands: for example, which of the many dimensions we want to plot, and for which we would like an interactive slider.

We provide a few examples that can be used as templates to cover typical use cases below. 


First, let us import the holoviews library as well as the plotting backends we will use:

In [None]:
import holoviews as hv
from holoviews import opts  # For setting defaults

hv.extension("bokeh")  # Allow for interactive plots

We then choose the color map we want to use for the plots. We use `viridis`.

In [None]:
# Choose color scheme similar to matplotlib
opts.defaults(
    opts.Image(cmap="viridis", width=400, height=400),
)

Now we can plot our density data including sliders to easily explore the data:

In [None]:
ds = hv.Dataset(density_data.isel(step=-1))
im = ds.to(hv.Image, kdims=["x", "y"]).opts(colorbar=True, width=500)
im

Specifying the argument `dynamic=True` in the `to` method computes each frame on demand as it is selected, instead of precomputing frames for the entire range. This may be necessary for larger datasets.

Here is a similar plot for the y-z plane (with a slider for x and the iteration step):

In [None]:
# Note for web users: You should have an active notebook session to interact with the plot
hv.Dataset(density_data).to(hv.Image, kdims=["y", "z"], dynamic=True).opts(
    colorbar=True, width=500
)

A more detailed exploration of plotting with holoviews is available in the [postopus plotting tutorial](https://octopus-code.gitlab.io/postopus/notebooks/holoviews_with_postopus.html), or in the [holoviews documentation](https://holoviews.org/getting_started/Gridded_Datasets.html).