SlothPy makes extensive use of the HDF5 file format (a powerful and portable binary format for fast I/O and big data) to store all the data related to the software. The core file format, .slt, is in fact an .hdf5/.h5 file in disguise, so it can also be opened and modified with programs such as HDFView and HDFCompass, or accessed directly through the h5py Python wrapper. First, let us import the package:

In [None]:
import slothpy as slt

To start using SlothPy, one has to create at least one instance of the core class, Compound. A Compound is intrinsically associated with a .slt file stored on your disk. You can create a Compound from an output file of quantum chemistry software, access an existing .slt file, or add more data to it. These operations are handled by the Compound creation methods. One .rassi.h5 file produced by a MOLCAS calculation is included in the "examples" folder of our repository on GitHub. We will use it as an example in this tutorial. To create a Compound from it, we have to provide its relative path and use the slt.compound_from_molcas() method. Note: always check the documentation of the methods you use on our website or in your editor (e.g. Shift+Tab in Jupyter Notebooks).

In [None]:
NdCo = slt.compound_from_molcas(".", "Nd_tutorial", "bas3", "./examples", "NdCo_DG_bas3")

After the creation, you can check what is inside your file using print() or simply by evaluating the object like this:

In [None]:
NdCo

You can see the list of HDF5 groups and datasets contained in the file together with their Description attributes. This is how SlothPy will save all your results. Once you have a .slt file on your disk, you can add more ab initio results to it (just use the path and name of the existing file) or access it at a later point using the slt.compound_from_slt() method.

In [None]:
NdCo = slt.compound_from_slt(".", "Nd_tutorial")
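
Because a .slt file is ordinary HDF5, the same kind of listing can also be produced outside SlothPy with h5py. The sketch below is not part of SlothPy: it builds a small stand-in file with hypothetical names (demo.slt, demo_group, demo_dataset) so that it runs anywhere, and it assumes that descriptions are stored under an attribute named "Description", as the printout above suggests. The helper list_contents is likewise only illustrative.

```python
import os
import tempfile

import h5py

# Build a small stand-in file so the example is self-contained.
# All names below are hypothetical and only mimic the general layout.
path = os.path.join(tempfile.mkdtemp(), "demo.slt")
with h5py.File(path, "w") as f:
    group = f.create_group("demo_group")
    group.attrs["Description"] = "A hypothetical group"
    dataset = group.create_dataset("demo_dataset", data=[1.0, 2.0, 3.0])
    dataset.attrs["Description"] = "A hypothetical dataset"


def list_contents(file_path):
    """Return (name, kind, description) for every group and dataset."""
    entries = []

    def visit(name, obj):
        kind = "group" if isinstance(obj, h5py.Group) else "dataset"
        # Assumption: descriptions live in an attribute called "Description".
        entries.append((name, kind, obj.attrs.get("Description", "")))

    with h5py.File(file_path, "r") as f:
        f.visititems(visit)
    return entries


for name, kind, description in list_contents(path):
    print(f"{kind:8s}{name}: {description}")
```

To inspect a real file, pass its path (for example "Nd_tutorial.slt") to list_contents instead of the stand-in.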

All available methods are accessed through an instance of the Compound class that constitutes the user interface and API documented in the Reference Manual. Let us start by computing molar powder-averaged magnetisation for our Nd-based compound. Firstly, we need some imports to create a range of magnetic field and temperature values:

In [None]:
from numpy import linspace

fields_mth = linspace(0.0001, 7, 50)
temperatures_mth = linspace(1, 10, 10)

The method for computing magnetisation as a function of field and temperature is called calculate_mth().

In [None]:
%%time
mth = NdCo.calculate_mth("bas3", fields_mth, 4, temperatures_mth, 10)

In [None]:
mth

We can, for example, plot it as a function of the magnetic field, M(H), iterating over the different temperatures:

In [None]:
from matplotlib.pyplot import plot, show

for mh in mth:
    plot(fields_mth, mh)
show()
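
The loop above relies on the layout of the returned array: iterating over mth yields one M(H) curve per temperature. As a minimal sketch of picking out a single curve, the snippet below uses a zero-filled placeholder with the same shape instead of real results. The placeholder, the assumed (temperatures, fields) layout, and the helper name curve_at are illustrative assumptions, not SlothPy API.

```python
import numpy as np

fields_mth = np.linspace(0.0001, 7, 50)
temperatures_mth = np.linspace(1, 10, 10)

# Placeholder standing in for the array returned by calculate_mth():
# one row per temperature, one column per field value (assumed layout,
# inferred from the plotting loop). The zeros are not real results.
mth_placeholder = np.zeros((temperatures_mth.size, fields_mth.size))


def curve_at(mth, temperatures, target):
    """Return (temperature, M(H) row) for the grid temperature closest to target."""
    index = int(np.abs(temperatures - target).argmin())
    return temperatures[index], mth[index]


temperature, mh = curve_at(mth_placeholder, temperatures_mth, 2.0)
print(temperature, mh.shape)
```

With a real result, replace mth_placeholder by mth and plot the selected row against fields_mth.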

Now, let us run the above calculation once again, but this time we will use a better, denser grid and include more SO-states. It will take a little more time. Additionally, we will save the results to our Nd_tutorial.slt file using the slt keyword (consult the documentation for a comprehensive description of all the options).

In [None]:
%%time
mth = NdCo.calculate_mth("bas3", fields_mth, 6, temperatures_mth, 32, slt="bas3")

If you invoke the representation of the Nd_tutorial.slt file once again, you can see that a new group "bas3_magnetisation" was created. It contains datasets for magnetisation (bas3_mth), magnetic fields (bas3_fields), and temperatures (bas3_temperatures), all with Description attributes.

In [None]:
NdCo

SlothPy provides an array-like interface for reading and writing data from and to .slt files. As an example, let us read the magnetisation into another variable, mth_custom_read, together with the field values:

In [None]:
mth_custom_read = NdCo["bas3_magnetisation", "bas3_mth"]
field_values = NdCo["bas3_magnetisation", "bas3_fields"]

Here, we provide a full group and dataset name to access the data. Now we can do what we want with the arrays. Let us confirm that they indeed represent magnetisation once again by plotting them:

In [None]:
for mh in mth_custom_read:
    plot(field_values, mh)
show()

Since we have our data saved in the .slt file, we can plot it using the built-in function (such functions are available for various methods; they all start with "plot" and have plenty of customization parameters, see the documentation).

In [None]:
NdCo.plot_mth("bas3")

When you invoke plotting functions for specific methods, you do not need to provide a suffix such as "_magnetisation"; the program will handle it for you.

If you need to, you can even create your own custom groups with datasets (in the form of NumPy ndarrays) in the file and use them in your scripts.

In [None]:
one_to_ten = linspace(1, 10, 10)
NdCo["my_custom_group", "one_to_ten_dataset", "My description of the dataset",
     "My description of the whole group"] = one_to_ten
NdCo

The last two strings, which describe the dataset and the group, are optional. Later, you can add more datasets to the existing group or create datasets without a group (they have to be at least 1-dimensional ArrayLike):

In [None]:
NdCo["my_custom_group", "123_dataset"] = [1,2,3]
NdCo["my_dataset_without_a_group"] = [1]
NdCo

If you now try to re-run the previous calculation, you should see a SlothPyError, because a group with this name already exists (SlothPy prevents you from accidentally overwriting the data).

In [None]:
%%time
mth = NdCo.calculate_mth("bas3", fields_mth, 6, temperatures_mth, 32, slt="bas3")

We can use delete_group_dataset to manually remove datasets/groups from the .slt file.

In [None]:
NdCo.delete_group_dataset("my_dataset_without_a_group")
NdCo.delete_group_dataset("my_custom_group", "123_dataset")
NdCo.delete_group_dataset("bas3_magnetisation")
NdCo

As you may have noticed when reading the docstring of the calculate_mth method, SlothPy gives you full control over the number of CPUs you assign to a calculation and the number of threads to be used. Computationally demanding methods are parallelized over a number of separate processes: the number of processes used is (number of CPUs) // (number of threads). Additionally, the Note section of each method tells the user over which quantity the job is parallelized. In the case of calculate_mth, the work is distributed over the field values (50 in the calculations above). So with, for example, 10 parallel processes, each will compute the magnetisation for 5 values of the field. By default, SlothPy uses all available logical cores with 1 thread for the linear algebra libraries.

For jobs with a very high number of points to be parallelized, you should benefit from a greater number of processes (not counting the time Python's multiprocessing needs to set up a huge amount of data). On the other hand, as the matrix size increases, it becomes beneficial to use more threads for operations such as diagonalization. Choosing good settings for very demanding calculations is not a trivial task, which is why SlothPy provides the autotune module to do it for you. It tests all meaningful setups and gives you a time estimate for each of them. This takes some time (because it actually performs part of the calculation to benchmark it), so it is advised to use it for jobs that will take hours.

To demonstrate it with the very small matrices provided in our file (they are 364 x 364, which is the number of SO states), we will run two examples using all available CPUs on your machine (if you want to leave some free, change number_cpu = 0 to your desired number):

In [None]:
fields_process = linspace(0.0001, 7, 60)
fields_threads = linspace(0.0001, 7, 3)

In [None]:
%%time
mth = NdCo.calculate_mth("bas3", fields_process, 6, temperatures_mth, 364, number_cpu=0, autotune=True)

In [None]:
%%time
mth = NdCo.calculate_mth("bas3", fields_threads, 6, temperatures_mth, 364, number_cpu=0, autotune=True)

In the first case, the autotune module should choose (depending on your hardware) more processes and fewer threads than in the second one, where we parallelize over only 3 field points. The time estimates include only the pure calculation steps; in our tests on a variety of methods, they should fall within 15-20% (maximal error) of the overall execution time. After autotuning, you can run the calculation manually with the chosen settings to see how much time it takes compared to our estimate. Can you choose better settings by yourself?

In [None]:
num_of_cpu = #fill here the number you were autotuning for
num_of_threads = #fill here the number of threads chosen by the autotune module (for fields_process and _threads)

In [None]:
%%time
mth = NdCo.calculate_mth("bas3", fields_process, 6, temperatures_mth, 364, num_of_cpu, num_of_threads)

In [None]:
%%time
mth = NdCo.calculate_mth("bas3", fields_threads, 6, temperatures_mth, 364, num_of_cpu, num_of_threads)
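
The rule stated earlier, (number of CPUs) // (number of threads), and the resulting split of field values can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic: the function name plan_parallel_job is hypothetical, and the way the remainder is spread over processes is an assumption, not necessarily how SlothPy distributes the work.

```python
def plan_parallel_job(number_cpu, number_threads, n_points):
    """Return (n_processes, points_per_process) for a job split over n_points."""
    if number_cpu < 1 or number_threads < 1 or n_points < 1:
        raise ValueError("all arguments must be positive integers")
    # Rule from the text: processes = CPUs // threads.
    n_processes = number_cpu // number_threads
    if n_processes < 1:
        raise ValueError("number_threads cannot exceed number_cpu")
    # Processes beyond the number of points would have nothing to do.
    n_active = min(n_processes, n_points)
    # Assumption: points are spread as evenly as possible.
    base, extra = divmod(n_points, n_active)
    chunks = [base + 1 if i < extra else base for i in range(n_active)]
    return n_processes, chunks


# The example from the text: 10 processes, 50 field values -> 5 each.
print(plan_parallel_job(10, 1, 50))
# 60 field values on 16 CPUs with 2 threads per process.
print(plan_parallel_job(16, 2, 60))
# Only 3 field values: at most 3 processes can be kept busy.
print(plan_parallel_job(16, 2, 3))
```

The last call shows why a job with only 3 field points gains little from many processes, so spending the cores on threads instead can pay off.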

The need for the autotune module becomes apparent for matrices with more than 500-1000, or even 2000+, states, when calculations with certain settings (number of field values and grid size) can take many hours or even days. It also depends on your hardware, e.g. on how many possible setups there are to check. While writing this tutorial, I am testing it on a CPU with 128 logical cores (64 physical), so there are many possible combinations of threads and processes. This makes the task harder and more time-consuming for the module, but it is still better than trying all the possibilities manually.
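
To get a feeling for how the number of setups grows with the core count, the sketch below lists the (processes, threads) pairs that use every logical core exactly, i.e. thread counts that divide the number of CPUs. This is a deliberate simplification and not the autotune module's actual search, which may consider other combinations; the function name even_splits is hypothetical.

```python
def even_splits(number_cpu):
    """List (processes, threads) pairs with processes * threads == number_cpu."""
    if number_cpu < 1:
        raise ValueError("number_cpu must be a positive integer")
    return [
        (number_cpu // threads, threads)
        for threads in range(1, number_cpu + 1)
        if number_cpu % threads == 0
    ]


print(even_splits(8))
print(len(even_splits(128)))
```

Even under this restriction, each candidate has to be benchmarked separately, which is why tuning takes longer on machines with many cores.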