# Database Control

This Notebook collects the commands for managing the calculation database used for high-throughput calculations. These commands cover:

1. Defining databases for easy access.

2. Specifying the local run_directories where calculations will be placed/performed.

3. Uploading/updating the reference records to a database based on the iprPy/library.

4. Checking the number and status of records within a database.

5. Cleaning records in a database by resetting errored calculations and removing excess \*.bid files.

6. Copying/removing database records.

7. Forgetting stored database information.

__Global workflow details:__

The commands offered by this Notebook are outside the global workflow, with the exception that new databases can be defined here before use in the other Notebooks.

**Library imports**

In [1]:
# Standard Python libraries
from __future__ import (print_function, division, absolute_import,
                        unicode_literals)

# https://github.com/usnistgov/iprPy
import iprPy
print('iprPy version', iprPy.__version__)

iprPy version 0.8.3


## 1. Define databases

Settings for accessing databases can be stored under simple names for easy access later.

The **list_databases()** function returns a list of all of the names for the stored databases.

In [2]:
print(iprPy.list_databases())

['iprhub', 'PN', 'iprhub_local', 'master', 'demo', 'HEA']


The **set_database()** function allows for database access information to be saved under a simple name.

In [3]:
# Specify local directory to save files to
host = 'C:\\Users\\lmh1\\Documents\\calculations\\ipr\\potential_testing'

# Define local-style database called potential_testing
iprPy.set_database(name='potential_testing', style='local', host=host)

Enter any other database parameters as key, value
Exit by leaving key blank
key: 
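
The doubled backslashes in `host` are Python string escapes, not part of the path itself. As a small aside, the sketch below shows two equivalent ways of writing the same Windows path using only the standard library. The variable names `host_escaped`, `host_raw`, and `host_parts` are illustrative and do not appear elsewhere in this Notebook.

```python
from pathlib import PureWindowsPath

# Escaped backslashes, as written in the cell above
host_escaped = 'C:\\Users\\lmh1\\Documents\\calculations\\ipr\\potential_testing'

# Raw string: backslashes are taken literally, so no doubling is needed
host_raw = r'C:\Users\lmh1\Documents\calculations\ipr\potential_testing'

# PureWindowsPath assembles the same path from its parts on any operating system
host_parts = PureWindowsPath('C:\\', 'Users', 'lmh1', 'Documents',
                             'calculations', 'ipr', 'potential_testing')

print(host_escaped == host_raw)
print(str(host_parts) == host_raw)
```

Note that a raw string cannot end in a single backslash, so a path written with a trailing separator still needs the escaped form or a join.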


The **load_database()** function accesses the database information associated with a database's name and returns an iprPy.Database object.

In [4]:
database = iprPy.load_database('potential_testing')
print(database)

database style local at C:\Users\lmh1\Documents\calculations\ipr\potential_testing
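
To make the idea of "settings stored under a simple name" concrete, here is a minimal standard-library sketch of a name-to-settings store kept in a JSON file. It is not iprPy's implementation: the helpers `save_settings`, `read_settings`, `list_names`, and `forget_settings`, as well as the JSON file layout, are assumptions made purely for illustration.

```python
import json
import tempfile
from pathlib import Path


def _read(store):
    """Return the stored name -> settings mapping (empty if no file yet)."""
    return json.loads(store.read_text()) if store.exists() else {}


def save_settings(store, name, **settings):
    """Save settings under a simple name (hypothetical helper)."""
    data = _read(store)
    data[name] = settings
    store.write_text(json.dumps(data, indent=2))


def read_settings(store, name):
    """Return the settings saved under a name."""
    return _read(store)[name]


def list_names(store):
    """Return all saved names in alphabetical order."""
    return sorted(_read(store))


def forget_settings(store, name):
    """Remove the saved settings for a name; nothing else is touched."""
    data = _read(store)
    del data[name]
    store.write_text(json.dumps(data, indent=2))


# Demonstration in a temporary folder so that no real settings are modified
store = Path(tempfile.mkdtemp()) / 'settings.json'
save_settings(store, 'potential_testing', style='local',
              host=r'C:\Users\lmh1\Documents\calculations\ipr\potential_testing')

print(list_names(store))
print(read_settings(store, 'potential_testing')['style'])
```

In this sketch, forgetting a name deletes only the entry in the settings file, which mirrors the distinction between stored access information and the data it points to.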


## 2. Define run directories

The high-throughput calculations are prepared and executed using local directories.  The paths to these directories can be saved and stored using simple names for easy access later.

The **list_run_directories()** function returns a list of all of the names for the stored run directories.

In [5]:
print(iprPy.list_run_directories())

['iprhub1', 'PN1', 'master_1', 'master_2', 'master_3', 'master_4', 'demo_1', 'demo_2', 'demo_3', 'demo_4', 'HEA_1', 'HEA_2', 'HEA_3', 'HEA_4']


The **set_run_directory()** function allows for a local run directory to be saved under a simple name. For best functionality, dedicate each run_directory to a single database and a single number of cores.

In [6]:
# Define run directories for up to four cores
torun = 'C:\\Users\\lmh1\\Documents\\calculations\\ipr\\torun\\potential_testing\\'
iprPy.set_run_directory(name='potential_testing_1', path=torun + '1')
iprPy.set_run_directory(name='potential_testing_2', path=torun + '2')
iprPy.set_run_directory(name='potential_testing_3', path=torun + '3')
iprPy.set_run_directory(name='potential_testing_4', path=torun + '4')

The **load_run_directory()** function accesses the stored directory path associated with a run directory's name.

In [7]:
run_directory = iprPy.load_run_directory('potential_testing_1')
print(run_directory)

C:\Users\lmh1\Documents\calculations\ipr\torun\potential_testing\1
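
The four nearly identical **set_run_directory()** calls above follow a simple pattern: the name is the database name plus an index, and the path is a numbered subfolder. A hedged sketch of generating those pairs in a loop is shown below. The helper `run_directory_settings` is hypothetical, and the iprPy call is left as a comment so that the sketch runs without iprPy installed.

```python
import ntpath  # Windows path rules, regardless of the OS running this sketch


def run_directory_settings(database_name, root, count):
    """Return (name, path) pairs, one per run directory (hypothetical helper)."""
    return [('{}_{}'.format(database_name, i), ntpath.join(root, str(i)))
            for i in range(1, count + 1)]


torun = 'C:\\Users\\lmh1\\Documents\\calculations\\ipr\\torun\\potential_testing'

for name, path in run_directory_settings('potential_testing', torun, 4):
    # With iprPy imported, each pair could be saved as in the cell above:
    # iprPy.set_run_directory(name=name, path=path)
    print(name, path)
```

Joining the path parts avoids relying on a trailing backslash in the root path.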


## 3. Build database by copying reference records into it

The **build_refs()** method copies the reference records in iprPy/library to the database for use in high-throughput calculations. 

Destroy any library records in the database that you want to reset and replace.

In [8]:
# Crystal prototypes
#database.destroy_records('crystal_prototype')

# Interatomic potentials
#database.destroy_records('potential_LAMMPS')
#database.destroy_records('potential_openKIM_LAMMPS')

# Pre-defined defect parameters
#database.destroy_records('free_surface')
#database.destroy_records('point_defect')
#database.destroy_records('dislocation_monopole')
#database.destroy_records('stacking_fault')

Add/append any missing library records.

In [9]:
database.build_refs()
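
The reason for destroying records before rebuilding is that an "add missing" copy leaves existing records untouched. The sketch below illustrates that behavior with plain files in temporary folders. It assumes, purely for illustration, that records are files matched by name; the helper `copy_missing` and the record file names are placeholders and do not describe how **build_refs()** is implemented.

```python
import shutil
import tempfile
from pathlib import Path


def copy_missing(library, database):
    """Copy files from library into database unless the name already exists."""
    added = []
    for src in sorted(library.iterdir()):
        dst = database / src.name
        if not dst.exists():
            shutil.copy(str(src), str(dst))
            added.append(src.name)
    return added


# Temporary stand-ins for a reference library and a local database
root = Path(tempfile.mkdtemp())
library = root / 'library'
database = root / 'database'
library.mkdir()
database.mkdir()

# Placeholder record files: the library holds newer content for one record
(library / 'A1--Cu--fcc.json').write_text('{"version": 2}')
(library / 'A2--W--bcc.json').write_text('{"version": 2}')
(database / 'A1--Cu--fcc.json').write_text('{"version": 1}')  # stale copy

added = copy_missing(library, database)
print(added)
print((database / 'A1--Cu--fcc.json').read_text())
```

Only the record that was absent is added, and the stale copy keeps its old content. Under this assumption, a record has to be removed first for the library version to replace it.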

## 4. Check record numbers and status

The **check_records()** method checks how many records of a given style are stored in the database.  If the style is a calculation record style, it also displays how many records are unfinished, how many issued errors, and how many finished successfully.

In [10]:
database.check_records('potential_LAMMPS')

In database style local at C:\Users\lmh1\Documents\calculations\ipr\potential_testing :
- 281 of style potential_LAMMPS
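
For a calculation record style, the summary additionally breaks the total down by status. A minimal sketch of such a tally is given below. The record names, the status labels, and the printed layout are all assumptions for illustration and are not taken from iprPy.

```python
from collections import Counter

# Hypothetical (record name, status) pairs; names and labels are placeholders
records = [
    ('calc-1', 'finished'),
    ('calc-2', 'finished'),
    ('calc-3', 'not calculated'),
    ('calc-4', 'error'),
    ('calc-5', 'finished'),
]

# Count how many records carry each status
counts = Counter(status for _, status in records)

print('- {} of style {}'.format(len(records), 'calculation_E_vs_r_scan'))
for status in ('not calculated', 'error', 'finished'):
    print('  - {} {}'.format(counts[status], status))
```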


## 5. Clean calculation records

The **clean_records()** method resets errored calculations of a specified record style.  Cleaning a record style means:

- Resetting any calculations that issued errors back into a run_directory

- Removing any .bid files in the calculation folders in the run_directory

This is useful for resetting and rerunning calculations that may have failed for reasons external to the calculation's method, e.g. runners terminated early, parameter conflicts for a limited number of potentials, or debugging calculations.

__WARNING:__ Conflicts may occur if you clean a run_directory that active runners are operating on, because the .bid files are what prevent multiple runners from working on the same calculation at the same time.

In [11]:
#database.clean_records('calculation_E_vs_r_scan', 'potential_testing_1')
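
The warning above can be illustrated with a simplified file-based claim. In this sketch, a runner claims a calculation folder by exclusively creating a single `claim.bid` file, and cleaning deletes every .bid file. This is an assumed, simplified mechanism rather than iprPy's actual bidding logic, and `try_claim`, `remove_bids`, `claim.bid`, and the folder names are hypothetical.

```python
import glob
import os
import tempfile


def try_claim(calc_dir, runner_id):
    """Claim a calculation folder by exclusively creating a .bid file.

    Returns True if the bid was created, False if one already exists.
    """
    bid_path = os.path.join(calc_dir, 'claim.bid')
    try:
        # O_EXCL makes creation fail if the file is already there
        fd = os.open(bid_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, 'w') as f:
        f.write(str(runner_id))
    return True


def remove_bids(run_dir):
    """Delete every .bid file found in the calculation folders of run_dir."""
    removed = 0
    for bid_path in glob.glob(os.path.join(run_dir, '*', '*.bid')):
        os.remove(bid_path)
        removed += 1
    return removed


# Temporary stand-in for a run directory holding one calculation folder
demo_run_directory = tempfile.mkdtemp()
calc_dir = os.path.join(demo_run_directory, 'calc_0001')
os.mkdir(calc_dir)

first = try_claim(calc_dir, runner_id=1)    # runner 1 claims the folder
second = try_claim(calc_dir, runner_id=2)   # runner 2 is turned away
removed = remove_bids(demo_run_directory)   # cleaning while runner 1 is active
third = try_claim(calc_dir, runner_id=2)    # runner 2 can now claim it too

print(first, second, removed, third)
```

After the bids are removed, a second runner can claim a folder that the first runner is still working on, which is the conflict the warning describes.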

## 6. Destroy calculation records

The **destroy_records()** method deletes all records of a specified style.  This is useful if you want to reset library records or rerun calculations with different parameters.

**WARNING:** This deletion is permanent, even for local database styles.

In [12]:
#database.destroy_records('calculation_E_vs_r_scan')

## 7. Forget database information

The **unset_database()** and **unset_run_directory()** functions remove the saved settings for databases and run directories, respectively.

**NOTE:** Only the stored access information is removed; the records in a database and the files in a run_directory remain.

In [13]:
# Clear out existing definitions
#iprPy.unset_database(name='potential_testing')
#iprPy.unset_run_directory(name='potential_testing_1')
#iprPy.unset_run_directory(name='potential_testing_2')
#iprPy.unset_run_directory(name='potential_testing_3')
#iprPy.unset_run_directory(name='potential_testing_4')