# <font color='green'> TUTORIAL </font>
## <font color='green'> 1. Installation in Unix </font> 

  - Install conda. Download the Miniconda installer by typing the following command in your console:
   ```bash
   wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
   ```
   
  - then run the installer and add miniconda to your path
   ```bash
   bash miniconda.sh -b -p $HOME/miniconda
   export PATH="$HOME/miniconda/bin:$PATH"
   ```
   
  
  - create a new virtual environment
   ```bash
   conda create -q -n qmworks python=3.5
   ```
    
  - install dependencies
   ```bash 
   conda install --name qmworks -c anaconda hdf5
   conda install --name qmworks -c https://conda.anaconda.org/rdkit rdkit
   ```
    
  - activate the environment
   ```bash
   source activate qmworks
   ``` 
   
  - install **qmworks** and **plams**
    ```bash
     pip install https://github.com/SCM-NV/qmworks/tarball/master#egg=qmworks https://github.com/SCM-NV/plams/tarball/master#egg=plams --upgrade
    ```
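
To confirm that the installation succeeded, a minimal check from Python is sketched below. It only asks the import system whether the two top-level packages can be located; the helper name `missing_modules` is made up for this example and is not part of QMworks.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level package names installed by the pip command above.
missing = missing_modules(["qmworks", "plams"])
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("qmworks and plams are importable")
```

Run this inside the activated `qmworks` environment; outside of it the packages will normally be reported as not importable.
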
### You are ready to start!

# <font color='green'> Starting the environment  </font>
Once *QMWORKS* has been installed, run `source activate qmworks` to initialize the environment.

To leave the environment, use `source deactivate`.

One last preparation before running QMworks: if you don't want the results to end up in the current working directory, create a new results folder.

In [19]:
mkdir -p tutorial_results


# <font color='green'> What is QMworks?</font>
QMworks is a Python library for executing complicated workflows of interdependent quantum chemical (QM) calculations. It aims to provide a common interface to multiple QM packages, enabling easy and systematic generation of the calculation inputs, as well as facilitating automatic analysis of the results. Furthermore, it is built on top of the powerful Noodles framework, which executes the calculations in parallel where possible.

# <font color='green'> The basics: calling packages</font> 
Currently **QMWORKS** offers an interface with the following simulation software:
* #### SCM (ADF and DFTB)
* #### CP2K
* #### ORCA
* #### GAMESS-US
* #### DIRAC

<font color='red'> Please make sure that the packages you want to use in QMworks are installed and active. On most supercomputers the simulation packages are made available with a command like the following (consult your system administrator):
```bash
module load superAwesomeQuantumPackage/3.1421
```
Some simulation packages also require you to configure a `scratch` folder. For instance, *Orca* requires a **SCR** folder to be defined, while *ADF* calls it **SCM_TMPDIR**.
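
Whether the scratch variables are set can be checked from Python before a workflow is started. The sketch below is only an illustration: the two variable names are the ones mentioned in the text, and the helper `unset_scratch_variables` is a made-up name, not a QMworks function.

```python
import os

# Variable names taken from the text above; other packages may use different ones.
SCRATCH_VARIABLES = {"orca": "SCR", "adf": "SCM_TMPDIR"}


def unset_scratch_variables(environ=None):
    """Return {package: variable} for every scratch variable that is unset or empty."""
    environ = os.environ if environ is None else environ
    return {pkg: var for pkg, var in SCRATCH_VARIABLES.items() if not environ.get(var)}


for package, variable in unset_scratch_variables().items():
    print("{}: environment variable {} is not set".format(package, variable))
```
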


<font color='black'> With ``qmworks`` you can write a Python script that simply calls one of the package objects 
**adf, dftb, cp2k, orca, gamess** or **dirac**.
As arguments to the call, you need to provide a ``settings`` object defining the input of a calculation, a molecular geometry, and, optionally, a job name that lets you locate the "raw" data of the calculation later on.

Let's see how this works:

First we define a molecule, for example by reading one from an xyz file:

In [20]:
from plams import Molecule
acetonitrile = Molecule("files/acetonitrile.xyz")
print(acetonitrile)

  Atoms: 
    1         C      2.419290      0.606560      0.000000 
    2         C      1.671470      1.829570      0.000000 
    3         N      1.065290      2.809960      0.000000 
    4         H      2.000000      0.000000      1.000000 
    5         H      2.000000      0.000000     -1.000000 
    6         H      3.600000      0.800000      0.000000 



Then we can perform a geometry optimization on the molecule by a call to the dftb package object:

In [21]:
from qmworks import dftb, templates, run
job = dftb(templates.geometry, acetonitrile)
print(job)

<noodles.interface.decorator.PromisedObject object at 0x7f9fb895e9b0>


As you can see, "job" is a so-called "promised object": it must first be "run" by the Noodles scheduler before it yields a normal Python object.
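
The idea of a promised object can be illustrated with a few lines of plain Python. This is a toy sketch of deferred evaluation, not how Noodles is implemented; the names `Promise`, `schedule` and `run_promise` are invented for this example.

```python
class Promise:
    """Record a function call without executing it (illustrative only)."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args


def schedule(func):
    """Decorator: calling the wrapped function returns a Promise, not a result."""
    def wrapper(*args):
        return Promise(func, *args)
    return wrapper


def run_promise(value):
    """Evaluate a Promise, first evaluating any Promise among its arguments."""
    if not isinstance(value, Promise):
        return value
    args = [run_promise(arg) for arg in value.args]
    return value.func(*args)


@schedule
def add(a, b):
    return a + b


job = add(1, add(2, 3))      # nothing is computed yet
print(type(job).__name__)    # Promise
print(run_promise(job))      # 6
```

The point to take away is that building `job` is cheap and side-effect free; the work happens only when the runner walks the recorded calls.
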

In [22]:
result = run(job, path="tutorial_results")
print(result)

╭─(running jobs)
│ Running dftb ...                                 (retrieved)
╰─(success)
<qmworks.packages.SCM.DFTB_Result object at 0x7f9fb88ed048>


We can easily retrieve the calculated properties from the DFTB calculation such as the dipole or the optimized geometry for use in subsequent calculations.

In [23]:
print("Dipole: ", result.dipole)
print(result.molecule)

Dipole:  [1.0864213029, -1.9278296041, -0.0]
  Atoms: 
    1         C      2.366998      0.579794     -0.000000 
    2         C      1.660642      1.834189      0.000000 
    3         N      1.089031      2.847969      0.000000 
    4         H      2.100157      0.010030      0.887206 
    5         H      2.100157      0.010030     -0.887206 
    6         H      3.439065      0.764079     -0.000000 



# <font color='green'> Settings and templates</font> 
In the above example ``templates.geometry`` was actually a predefined Settings object.
You can define and manipulate Settings in a completely flexible manner as will be explained in this section. To facilitate combining different packages in one script, QMworks defines a set of commonly used generic keywords, which can be combined with package specific keywords, to provide maximum flexibility.

In [24]:
from qmworks import Settings
s = Settings()
s.basis = "DZP"
s.specific.adf.basis.core = "large"
s.freeze = [1,2,3]
print(s)

basis: 	DZP
freeze: 	[1, 2, 3]
specific: 	
         adf: 	
             basis: 	
                   core: 	large



This code snippet illustrates that ``Settings`` can be specified in two ways, using generic or specific keywords. Generic keywords represent input properties that are present in most simulation packages, like a *basis set*, while *specific* keywords let the user set options of a particular package that have no generic equivalent.

<font color='red'> Expert info: *Settings* are a subclass of Python [dictionaries](https://docs.python.org/3.5/tutorial/datastructures.html#dictionaries) used to represent hierarchical structures, like
<img src="files/simpleTree.png"> </font>

In QMworks/PLAMS multiple settings objects can be combined using the ``overlay`` function.

In [25]:
merged_settings = templates.geometry.overlay(s)
print(merged_settings)

basis: 	DZP
freeze: 	[1, 2, 3]
specific: 	
         adf: 	
             basis: 	
                   core: 	large
                   type: 	SZ
             geometry: 	
                      optim: 	delocal
             integration: 	
                         accint: 	6.0
             scf: 	
                 converge: 	1e-06
                 iterations: 	100
             xc: 	
                __block_replace: 	True
                lda: 	
         cp2k: 	
         dftb: 	
              dftb: 	
                   resourcesdir: 	DFTB.org/3ob-3-1
              task: 	
                   runtype: 	GO
         dirac: 	
         gamess: 	
                basis: 	
                      gbasis: 	n21
                      ngauss: 	3
                contrl: 	
                       dfttyp: 	pbe
                       runtyp: 	optimize
                       scftyp: 	rhf
         orca: 	
              basis: 	
                    basis: 	sto_sz
              method: 	
                     functional

The *overlay* method merged the template containing default settings for geometry optimizations with different packages with the arguments provided by the user 
<img src="files/merged.png">

resulting in:
<img src="files/result_merged.png" width="700">
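
The merge can be pictured as a recursive update of nested dictionaries. The sketch below uses plain dictionaries and a free function named `overlay` for illustration; it assumes that, when both sides define the same leaf, the value from the argument wins. That conflict rule is an assumption of this sketch, since the example above contains no conflicting keys.

```python
import copy


def overlay(base, other):
    """Return a new nested dict: `base` updated recursively with `other`.

    Assumption: on conflicting leaves the value from `other` wins.
    """
    result = copy.deepcopy(base)
    for key, value in other.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = overlay(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


template = {"specific": {"adf": {"basis": {"type": "SZ"}}}}
user = {"basis": "DZP", "specific": {"adf": {"basis": {"core": "large"}}}}
merged = overlay(template, user)
print(merged["specific"]["adf"]["basis"])
```

Neither input is modified, so the same template can be reused for several calculations.
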

Note that the generic and specific keywords still exist next to each other and may not be consistent (e.g. different basis sets are defined in generic and specific keywords). Upon calling a package with a Settings object, the generic keywords are first translated into package-specific keywords and combined with the relevant user-defined specific keywords. In this step, the settings defined in generic keywords take precedence. Subsequently, the input file(s) for the given package are generated by the [PLAMS software](https://www.scm.com/doc/plams/index.html) from the keywords under **specific.[package]**.
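
The translation step can be sketched with plain dictionaries. The table `GENERIC_TO_ADF` and the function `translate_generic` are hypothetical names used only for illustration; the real translation tables in QMworks are more extensive and may be organized differently.

```python
import copy

# Hypothetical translation table: generic keyword -> path inside specific.<package>.
GENERIC_TO_ADF = {"basis": ("basis", "type")}


def translate_generic(settings, package, table):
    """Copy generic keywords into settings['specific'][package]; generic values win."""
    result = copy.deepcopy(settings)
    node = result.setdefault("specific", {}).setdefault(package, {})
    for keyword, path in table.items():
        if keyword not in settings:
            continue
        target = node
        for part in path[:-1]:
            target = target.setdefault(part, {})
        # Overwrite any package-specific value already present.
        target[path[-1]] = settings[keyword]
    return result


merged = {"basis": "DZP", "specific": {"adf": {"basis": {"core": "large", "type": "SZ"}}}}
translated = translate_generic(merged, "adf", GENERIC_TO_ADF)
print(translated["specific"]["adf"]["basis"])
```

Here the generic `basis: DZP` replaces the package-specific `type: SZ`, which mirrors the precedence rule described in the text.
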


In [26]:
from qmworks import adf
print(adf.generic2specific(merged_settings))

basis: 	DZP
freeze: 	[1, 2, 3]
specific: 	
         adf: 	
             basis: 	
                   core: 	large
                   type: 	DZP
             constraints: 	
                         atom 2: 	
                         atom 3: 	
                         atom 4: 	
             geometry: 	
                      optim: 	cartesian
             integration: 	
                         accint: 	6.0
             scf: 	
                 converge: 	1e-06
                 iterations: 	100
             xc: 	
                __block_replace: 	True
                lda: 	
         cp2k: 	
         dftb: 	
              dftb: 	
                   resourcesdir: 	DFTB.org/3ob-3-1
              task: 	
                   runtype: 	GO
         dirac: 	
         gamess: 	
                basis: 	
                      gbasis: 	n21
                      ngauss: 	3
                contrl: 	
                       dfttyp: 	pbe
                       runtyp: 	optimize
                       scftyp:

For ADF, the above keywords result in the following input file:

In [27]:
result = run(adf(merged_settings, acetonitrile, job_name='adf_acetonitrile'), path="tutorial_results", folder="adf")
print(open('tutorial_results/adf/adf_acetonitrile/adf_acetonitrile.in').read())

[18:50:54] PLAMS working folder: /home/lars/workspace/qmworks/jupyterNotebooks/tutorial_results/adf
╭─(running jobs)
│ Running adf adf_acetonitrile...                  (retrieved)
╰─(success)
Atoms
      1         C      2.419290      0.606560      0.000000 
      2         C      1.671470      1.829570      0.000000 
      3         N      1.065290      2.809960      0.000000 
      4         H      2.000000      0.000000      1.000000 
      5         H      2.000000      0.000000     -1.000000 
      6         H      3.600000      0.800000      0.000000 
End

Basis
  Core large
  Type DZP
End

Constraints
  Atom 2
  Atom 3
  Atom 4
End

Geometry
  Optim cartesian
End

Integration
  Accint 6.0
End

Scf
  Converge 1e-06
  Iterations 100
End

Xc
  Lda
End

End Input
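
The correspondence between the nested keywords and the block-structured input can be sketched as follows. This is a simplified illustration and not the PLAMS input generator; the function name `render_blocks` is made up, and special cases such as the `Atoms` block or valueless keys are only handled as far as shown.

```python
def render_blocks(settings):
    """Render {block: {key: value}} as 'Block / Key value / End' text."""
    lines = []
    for block in sorted(settings):
        lines.append(block.capitalize())
        for key in sorted(settings[block]):
            value = settings[block][key]
            # rstrip() drops the trailing space for keys that carry no value.
            lines.append("  {} {}".format(key.capitalize(), value).rstrip())
        lines.append("End")
        lines.append("")
    return "\n".join(lines)


adf_keywords = {
    "basis": {"core": "large", "type": "DZP"},
    "geometry": {"optim": "cartesian"},
}
print(render_blocks(adf_keywords))
```
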






# <font color='green'> Combining multiple jobs </font>


Multiple jobs can be combined, while calling the run function only once. The script below combines components outlined above:

In [28]:
from plams import Molecule
from qmworks import dftb, adf, templates, run, Settings

acetonitrile = Molecule("files/acetonitrile.xyz")

dftb_opt = dftb(templates.geometry, acetonitrile, job_name="dftb_opt")

s = Settings()
s.basis = "DZP"
s.specific.adf.basis.core = "large"
print(dftb_opt.molecule)
adf_single = adf(templates.singlepoint.overlay(s), dftb_opt.molecule, job_name="adf_single")

adf_result = run(adf_single, path="tutorial_results", folder="workflow")
print(dftb_opt.molecule)

print(adf_result.energy)

<noodles.interface.decorator.PromisedObject object at 0x7f9fb88edba8>
[18:50:54] PLAMS working folder: /home/lars/workspace/qmworks/jupyterNotebooks/tutorial_results/workflow
╭─(running jobs)
│ Running dftb dftb_opt...                         (retrieved)
│ Running adf adf_single...                        (retrieved)
╰─(success)
<noodles.interface.decorator.PromisedObject object at 0x7f9fb88a10f0>





-1.4094874734528824


In this case the second task, adf_single, reads the molecule optimized in the first job, dftb_opt. Note that dftb_opt as well as dftb_opt.molecule are promised objects. When **run** is applied to the adf_single job, Noodles builds a graph of dependencies and makes sure that all the calculations required to obtain **adf_result** are performed.
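
The ordering that follows from such a dependency graph can be sketched in a few lines. This is a toy depth-first traversal, not the Noodles scheduler; `execution_order` is a hypothetical helper and it assumes the graph has no cycles.

```python
def execution_order(dependencies, target):
    """Return the jobs needed for `target`, each listed after its prerequisites.

    `dependencies` maps a job name to the names of the jobs it depends on.
    Assumes the graph is acyclic.
    """
    order = []

    def visit(job):
        if job in order:
            return
        for prerequisite in sorted(dependencies.get(job, ())):
            visit(prerequisite)
        order.append(job)

    visit(target)
    return order


# adf_single needs the molecule produced by dftb_opt.
dependencies = {"adf_single": ["dftb_opt"], "dftb_opt": []}
print(execution_order(dependencies, "adf_single"))  # ['dftb_opt', 'adf_single']
```

Jobs that the target does not depend on never appear in the result, which is why asking only for the final job is enough to trigger everything it needs and nothing more.
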

All data related to the calculations, i.e. the input files generated by QMworks and the output files generated by the QM packages, are stored in folders named after the job names, inside the results folder:

In [29]:
ls tutorial_results

adf/  plams.6350/  workflow/


In [30]:
ls tutorial_results/workflow

adf_single/  dftb_opt/  workflow.log


In [31]:
ls tutorial_results/workflow/adf_single

adf_single.dill  adf_single.in   adf_single.run*  logfile  t21.H
adf_single.err   adf_single.out  adf_single.t21   t21.C    t21.N
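
The same folder layout can be inspected from Python, for instance to loop over all jobs of a workflow. The sketch below uses only the standard library; `job_folders` is a made-up helper name and the path is the one used in this tutorial.

```python
from pathlib import Path


def job_folders(results_dir):
    """Return the sorted names of the sub-folders directly inside `results_dir`."""
    root = Path(results_dir)
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())


print(job_folders("tutorial_results/workflow"))
```
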
