# Getting started
This example illustrates a sample workflow using the pre-implemented dummy engine. To use it, configure the `.strucscan` configuration file and copy it to your home directory. You can use the template that ships with the repository and point the absolute paths for the structure and resource repositories to the default ones in the repository.

In [1]:
! cat ../.strucscan

PROJECT_PATH: "data"    # corresponds to the top node of your data tree
STRUCTURES_PATH: "structures"
RESOURCE_PATH: "resources"

DEBUG: FALSE                # Default: FALSE
STRUCT_FILE_FORMAT: cfg     # Default: cfg
SLEEP_TIME: 45              # Default: 45

In general, you will want to adapt `.strucscan` to your own structure and resource directories and set up a resource directory accordingly. For now, you do not need to worry about this; we will discuss it later. \
This example uses the Jupyter notebook interface. The second example demonstrates how to use strucscan from the command line.

## The `input` dictionary
All information about the calculations you want to perform is handed over to strucscan in the form of a Python dictionary. This `input` dictionary accepts several keys, some mandatory and others optional. If optional keys are left out, strucscan falls back to default values. Each value must be a string unless it is a boolean. \
Let's have a look at the general mandatory keys that every engine requires:
## Mandatory keys

In [2]:
from strucscan.resources.inputyaml import GENERAL

GENERAL().MANDATORY

{'species': 'str',
 'engine': 'str',
 'machine': 'str',
 'ncores': 'str',
 'nnodes': 'str',
 'queuename': 'str',
 'potential': 'str',
 'properties': 'str',
 'prototypes': 'str',
 'settings': 'str'}

As you see, this gives you an idea which value type to enter for each key. Let's have a detailed look at the mandatory keys:

* `species`: (str) chemical species. You can enter multiple species separated by spaces, e.g. "Ni Al".
* `engine`: (str) the material simulation code. This depends on the implemented interfaces. As a 'dummy' engine and an interface to VASP are already implemented, possible values are `'dummy'` or `'VASP'`.
* `machine`: (str) name of the machine. This refers to the configurations you have made in the resource directory.
* `ncores`: (str) number of cores per node, given as a string, e.g. `'1'`.
* `nnodes`: (str) number of nodes, given as a string, e.g. `'1'`.
* `queuename`: (str) name of the queue. When running on queuing systems, this refers to a template file that you deposit in `resources/machineconfig/<machine>/<scheduler>/<queuename>.<scheduler_suffix>`.
* `potential`: (str) in the case of VASP, this refers to the exchange-correlation functional (`'PBE'` or `'LDA'`). For VASP, please enter one potential per species. For LAMMPS or any other engine, this refers to a file that you deposit in `resources/engine/<engine>/potentials`.
* `settings`: (str) this refers to a template engine settings file that you deposit in `resources/engine/<engine>/settings`. Strucscan checks this template and adapts tags and values if necessary.
* `properties`: (str) material properties to calculate. You can enter multiple properties separated by spaces. We will discuss the available properties below.
* `prototypes`: (str) names of the structure files you want to analyse. Strucscan will look in `resources/structures` for the structure files. You can enter the full names of the structure files separated by spaces or on multiple lines. You can also enter a directory containing structures by enclosing it in `'<', '>'`, e.g. `<bulk/fcc/>`. Please make sure that every structure has a unique name.
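
The space-separated convention and the `<...>` directory notation can be illustrated with a small, self-contained sketch. The helpers `split_values` and `split_prototypes` are hypothetical and not part of strucscan; they only mirror the rules described in the list above, and strucscan's own parsing may differ.

```python
def split_values(value):
    """Split a space- or newline-separated input string into its items."""
    return value.split()


def split_prototypes(value):
    """Separate structure file names from directories marked as <dir/>.

    Hypothetical helper for illustration only; not strucscan's own parser.
    """
    files, directories = [], []
    for item in split_values(value):
        if item.startswith("<") and item.endswith(">"):
            directories.append(item[1:-1])  # strip the angle brackets
        else:
            files.append(item)
    return files, directories


species = split_values("Ni Al")
files, directories = split_prototypes("fcc.cfg L1_2\n<bulk/fcc/>")
print(species)      # ['Ni', 'Al']
print(files)        # ['fcc.cfg', 'L1_2']
print(directories)  # ['bulk/fcc/']
```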

## Optional keys
Next, let's look at the optional keys.

In [3]:
from strucscan.resources.inputyaml import GENERAL
GENERAL().OPTIONAL

{'initial atvolume': 'default',
 'verbose': False,
 'monitor': True,
 'submit': True,
 'collect': True}

* `initial atvolume`: (str) initial scaling of the structures. Enter one float per species, e.g. `10. 12.`, or type `d` or `default` to use the default atomic volumes deposited in `strucscan.resources.atomicvolumes.py`.
* `verbose`: (bool) toggles command line output.
* `monitor`: (bool) if true, strucscan will check the status of each job.
* `submit`: (bool) if true, strucscan will submit the jobs it creates.
* `collect`: (bool) if true, strucscan will collect the data from the data tree.
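
The fallback to default values can be pictured as a simple dictionary merge. The following is a minimal sketch, not strucscan's implementation: the helper `fill_optional_defaults` is hypothetical, and the defaults are copied from the `GENERAL().OPTIONAL` output above.

```python
# Default values copied from the GENERAL().OPTIONAL output shown above.
OPTIONAL_DEFAULTS = {
    "initial atvolume": "default",
    "verbose": False,
    "monitor": True,
    "submit": True,
    "collect": True,
}


def fill_optional_defaults(user_input, defaults=OPTIONAL_DEFAULTS):
    """Return a copy of user_input with missing optional keys filled in.

    Hypothetical helper; strucscan performs its own check internally.
    """
    completed = dict(user_input)
    for key, default in defaults.items():
        if key not in completed:
            print(f"Optional key '{key}' not provided. "
                  f"Default value will be used: {default}")
            completed[key] = default
    return completed


completed = fill_optional_defaults({"species": "Al", "verbose": True})
print(completed["verbose"], completed["monitor"])
```

Keys that the user provides are left untouched, so an explicit `'verbose': True` overrides the default `False`.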

## Properties
Now let's see which properties are available.

In [4]:
from strucscan.resources.properties import *

Strucscan distinguishes between properties that require a condition and properties that run without any prerequisites. For example, calculating the energy of a structure or optimizing the structure in some way requires no condition. These are the tasks `static` (calculate the energy only), `atomic` (optimize the internal degrees of freedom of the structure), and `total` (fully optimize the structure).

In [5]:
OPTIMIZATIONS

['static', 'atomic', 'total']

An example of a task that requires a condition is an energy-volume calculation. Usually, you pre-process the structure before you create the strained images. The task `ev`, therefore, belongs to the advanced tasks. Advanced tasks and their conditions are defined in `properties.yaml` in `strucscan.resources`, which is read by the `properties` module and stored in `properties_conifg_dict`:

In [6]:
from pprint import pprint
pprint(properties_conifg_dict)

{'default_option': 'atomic', 'dos': 'total', 'eos': 'static, atomic, total'}


The dictionary is built such that each key is the name of an advanced task and its value lists the conditions for that task. The key `default_option` can be configured in `properties.yaml` and is used whenever no condition is specified for a task. If it is not configured, strucscan sets it to the default value `atomic`. You can update the dictionary like this:

In [7]:
properties_conifg_dict.update({"default_option": "total"})
pprint(properties_conifg_dict)

{'default_option': 'total', 'dos': 'total', 'eos': 'static, atomic, total'}
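
To make the structure of this dictionary concrete, here is a minimal sketch of how the conditions for a task could be looked up. It works on a plain copy of the dictionary as first printed above. The helper `allowed_conditions`, the comma-splitting, the fallback to `default_option` for unlisted tasks, and the task name `"unlisted_task"` are all assumptions made for illustration; strucscan may resolve conditions differently.

```python
# Plain copy of the dictionary as first printed above (before the update).
properties_config = {
    "default_option": "atomic",
    "dos": "total",
    "eos": "static, atomic, total",
}


def allowed_conditions(task, config):
    """Return the conditions configured for an advanced task as a list.

    Assumptions: several conditions are comma-separated, and a task
    without an entry falls back to the value of 'default_option'.
    """
    raw = config.get(task, config["default_option"])
    return [item.strip() for item in raw.split(",")]


print(allowed_conditions("eos", properties_config))            # ['static', 'atomic', 'total']
print(allowed_conditions("dos", properties_config))            # ['total']
print(allowed_conditions("unlisted_task", properties_config))  # ['atomic']
```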


## Input example: VASP
Now that we have an idea of what the `input` dictionary looks like, let's make it more concrete by looking at an example for VASP.

In [8]:
from strucscan.resources.inputyaml import VASP
VASP().EXAMPLE

{'species': 'Ni Al_pv',
 'engine': 'VASP 5.4',
 'machine': 'example_vasp',
 'ncores': '1',
 'nnodes': '1',
 'queuename': 'none',
 'potential': 'PBE',
 'properties': 'atomic',
 'prototypes': 'L1_2',
 'settings': '500_SP.incar',
 'magnetic configuration': 'SP',
 'initial magnetic moments': '2.0 0.',
 'kdens': '0.15',
 'kmesh': 'Monkhorst-pack',
 'initial atvolume': 'default',
 'verbose': False,
 'monitor': True,
 'submit': True,
 'collect': False}

You can see that VASP accepts keys for the magnetic moments and configuration as well as for the k-point distribution. If we look at the mandatory keys that VASP requires, we see that the keys on magnetism are mandatory because VASP is sensitive to them. The keys on the k-point distribution are optional.

In [9]:
list(VASP().MANDATORY.keys())

['species',
 'engine',
 'machine',
 'ncores',
 'nnodes',
 'queuename',
 'potential',
 'properties',
 'prototypes',
 'settings',
 'magnetic configuration',
 'initial magnetic moments']

In [10]:
list(VASP().OPTIONAL.keys())

['initial atvolume',
 'verbose',
 'monitor',
 'submit',
 'collect',
 'kdens',
 'kmesh',
 'k points file']
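
A quick way to see what an engine adds on top of the general keys is to compare the key lists. The sketch below uses plain lists copied from the outputs above, so it runs without strucscan installed; with strucscan available, the same comparison could be made on `GENERAL().MANDATORY` and `VASP().MANDATORY` directly.

```python
# Key lists copied from the outputs shown above.
GENERAL_MANDATORY = [
    "species", "engine", "machine", "ncores", "nnodes",
    "queuename", "potential", "properties", "prototypes", "settings",
]
VASP_MANDATORY = GENERAL_MANDATORY + [
    "magnetic configuration", "initial magnetic moments",
]
GENERAL_OPTIONAL = ["initial atvolume", "verbose", "monitor", "submit", "collect"]
VASP_OPTIONAL = GENERAL_OPTIONAL + ["kdens", "kmesh", "k points file"]

# Keys that only the VASP interface knows about, in their original order.
vasp_only_mandatory = [k for k in VASP_MANDATORY if k not in GENERAL_MANDATORY]
vasp_only_optional = [k for k in VASP_OPTIONAL if k not in GENERAL_OPTIONAL]

print(vasp_only_mandatory)  # ['magnetic configuration', 'initial magnetic moments']
print(vasp_only_optional)   # ['kdens', 'kmesh', 'k points file']
```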

## Input example: dummy
Since VASP requires an available installation, let's move on to the pre-implemented 'dummy' engine to get started with strucscan. Instead of calculating any energies, it only copies the initial structure file, `structure.cfg`, to the final file, `final.cfg`, writes a short log file, and waits briefly (`sleep 1` in the configuration shown below). You can configure this command in the machine configuration file:

In [11]:
import yaml
with open("../resources/machineconfig/dummy/config.yaml", "r") as stream:
    config = yaml.safe_load(stream)
pprint(config)

{'DUMMY': {'parallel': 'cp structure.cfg final.cfg | echo "This is a dummy log '
                       'file." > log.out | sleep 1\n',
           'serial': 'cp structure.cfg final.cfg | echo "This is a dummy log '
                     'file." > log.out | sleep 1\n'},
 'scheduler': 'noqueue',
 'smallest queue': None}
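
For readers less familiar with shell one-liners, the effect of the dummy command can be reproduced in plain Python. This is only an illustration of the three steps (copy, write log, sleep); the function `run_dummy_job` is hypothetical, and strucscan itself executes the shell command from the configuration file.

```python
import shutil
import tempfile
import time
from pathlib import Path


def run_dummy_job(job_dir, sleep_seconds=1.0):
    """Mimic the dummy command: copy the structure, write a log, wait."""
    job_dir = Path(job_dir)
    shutil.copy(job_dir / "structure.cfg", job_dir / "final.cfg")
    (job_dir / "log.out").write_text("This is a dummy log file.\n")
    time.sleep(sleep_seconds)


# Run the sketch in a throwaway directory with a placeholder structure file.
with tempfile.TemporaryDirectory() as tmp:
    job_dir = Path(tmp)
    (job_dir / "structure.cfg").write_text("placeholder structure\n")
    run_dummy_job(job_dir, sleep_seconds=0.0)
    created = sorted(path.name for path in job_dir.iterdir())

print(created)  # ['final.cfg', 'log.out', 'structure.cfg']
```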


The machine configuration is set up for machines without a queuing system, so you can test strucscan with the 'dummy' engine right away on your local machine without any further prerequisites. \
Let's check the example `input` dictionary for our dummy:

In [12]:
from strucscan.resources.inputyaml import DUMMY
DUMMY().EXAMPLE

{'species': 'Al',
 'engine': 'dummy',
 'machine': 'dummy',
 'initial atvolume': 'default',
 'ncores': '1',
 'nnodes': '1',
 'queuename': 'none',
 'properties': 'static atomic total eos',
 'prototypes': 'fcc.cfg',
 'potential': 'none',
 'settings': 'none'}

Settings and potential are set to `'none'`. The 'dummy' engine does not require a potential or settings file and will not look for one in the resource directory.

In [13]:
list(DUMMY().MANDATORY)

['species',
 'engine',
 'machine',
 'ncores',
 'nnodes',
 'queuename',
 'potential',
 'properties',
 'prototypes',
 'settings']

In [14]:
list(DUMMY().OPTIONAL)

['initial atvolume', 'verbose', 'monitor', 'submit', 'collect']

Let's use this input example and set `verbose` to `True` so that we get a little more insight into what strucscan is doing.

In [15]:
input_dict = DUMMY().EXAMPLE
input_dict.update({'verbose': True})
input_dict

{'species': 'Al',
 'engine': 'dummy',
 'machine': 'dummy',
 'initial atvolume': 'default',
 'ncores': '1',
 'nnodes': '1',
 'queuename': 'none',
 'properties': 'static atomic total eos',
 'prototypes': 'fcc.cfg',
 'potential': 'none',
 'settings': 'none',
 'verbose': True}

## The JobManager
Once the `input` dictionary is set up properly, you can hand it over to the `JobManager`, which is the main class of strucscan. Initialising the `JobManager` with your `input` starts the process. For the 'dummy' example, the process includes the following steps:

1. **checking your input:** strucscan checks your input for mandatory and optional keys. If you left out any optional key, strucscan falls back to the default value.
2. **initializing the job list:** strucscan creates a list of all jobs, assembled from a loop over the prototypes and a loop over the properties you entered.
3. **updating the job list:** after initialization, strucscan updates each job in the list by checking its status. If a job does not exist, strucscan creates it and, if `submit` is true, submits it. If a job exists, strucscan checks it for possible errors and attempts troubleshooting where possible. If the job is already running, strucscan does not touch the job files.
4. **monitoring:** if `monitor` is true, strucscan repeats the update process until every job in the list has either finished completely or been restarted the maximum of three times. If `collect` is enabled, strucscan collects the data from the data tree after each cycle.
5. **exiting:** if `collect` is enabled, strucscan collects the data from the data tree one more time.
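
Steps 2 to 4 can be sketched as a toy state machine. This is not the `JobManager` code: `build_job_list` and `update_jobs` are hypothetical helpers, the status names are invented for illustration, and error handling and restarts are left out. Each toy job simply finishes one cycle after it has been submitted.

```python
from itertools import product


def build_job_list(prototypes, properties):
    """Create one job per (prototype, property) pair."""
    return [
        {"prototype": proto, "property": prop, "status": "missing"}
        for proto, prop in product(prototypes.split(), properties.split())
    ]


def update_jobs(jobs):
    """Run one update cycle over the job list."""
    for job in jobs:
        if job["status"] == "missing":
            job["status"] = "running"   # create and submit the job
        elif job["status"] == "running":
            job["status"] = "finished"  # the toy job finishes after one cycle


jobs = build_job_list("fcc.cfg", "static atomic total eos")

# Monitoring: repeat the update until every job has finished.
cycles = 0
while any(job["status"] != "finished" for job in jobs):
    update_jobs(jobs)
    cycles += 1

print(len(jobs), cycles)  # 4 2
```

With one prototype and four properties, as in the dummy example, the list contains four jobs.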

In [16]:
from strucscan.core.jobmanager import JobManager

JobManager(input_dict)

Data tree path:                /home/users/pietki8q/git/strucscan-master/data
Structure repository:          /home/users/pietki8q/git/strucscan-master/structures
Resource repository:           /home/users/pietki8q/git/strucscan-master/resources

Optional key 'monitor' not provided. Default value will be used: True
Optional key 'submit' not provided. Default value will be used: True
Optional key 'collect' not provided. Default value will be used: True


key:                           : your input                                         what strucscan reads                              
----------------------------------------------------------------------------------------------------
species                        : Al                                                 Al                                                
engine                         : dummy                                              dummy                                             
machine                        : dumm

<strucscan.core.jobmanager.JobManager at 0x7f106ae135f8>