In [1]:
from dbmanager import Manager

# Basics

- The database manager is a local tool for writing simulation data, organizing parametric runs, and visualizing results.
It relies mostly on HDF5, accessed through `h5py`, for storing data.
- A manager object is built per directory: each directory represents a block of data that makes sense to group and compare. Its constructor therefore requires the path to that directory, which may be a new path if you are creating a new database.

In [2]:
db = Manager('out')

# Creating a simulation

- Creating a simulation creates a new directory for it, containing a `.h5` file that holds the parameters and metadata such as the time of creation.
- The argument `uid` is a unique identifier for the simulation and can be any name. If none is specified, one is assigned automatically.
- The executable script can be copied for reproducibility. It must follow the structure described in the section "The execution script".
- A bash/sbatch script can be created for either local execution or submission on the Euler cluster.
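
The first two points can be pictured with a minimal, standard-library-only sketch. It is not the `dbmanager` implementation: `create_simulation_dir` is a hypothetical helper, the metadata is written to a JSON file instead of the `.h5` file, and the format of the automatically assigned identifier is an assumption.

```python
import json
import tempfile
import uuid
from datetime import datetime
from pathlib import Path


def create_simulation_dir(db_path, parameters, uid=None):
    """Hypothetical stand-in for what creating a simulation involves."""
    # Fall back to a generated identifier when none is given (assumed format).
    uid = uid or uuid.uuid4().hex[:8]
    sim_dir = Path(db_path) / uid
    sim_dir.mkdir(parents=True, exist_ok=False)  # fail loudly on a reused uid

    # Store parameters and creation time; JSON replaces the .h5 file here.
    metadata = {
        "uid": uid,
        "parameters": parameters,
        "time_stamp": datetime.now().isoformat(timespec="seconds"),
    }
    (sim_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return sim_dir


with tempfile.TemporaryDirectory() as tmp:
    named = create_simulation_dir(tmp, {"timesteps": 5, "exponent": 2}, uid="example_1")
    auto = create_simulation_dir(tmp, {"timesteps": 5, "exponent": 1.4})
    named_uid = json.loads((named / "metadata.json").read_text())["uid"]
    auto_uid_length = len(auto.name)
```

Refusing to reuse an existing directory (`exist_ok=False`) is one way to keep identifiers unique within a database directory; the actual package may handle this differently.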

In [3]:
parameters = {
    'timesteps': 5,
    'exponent': 2,
}

sim1 = db.create_simulation(uid='example_1', parameters=parameters, skip_duplicate_check=True)

sim1.change_note('What does this do?')

sim1.copy_executable('example_script.py')

sim1.create_batch_script(euler=False, ntasks=1)

In [4]:
parameters = {
    'timesteps': 5,
    'exponent': 1.4,
}

sim2 = db.create_simulation(uid='example_2', parameters=parameters)

sim2.change_note('this must have some purpose right?')

sim2.copy_executable('example_script.py')

sim2.create_batch_script(euler=False, ntasks=1)

The parameter space may already exist. Here are the duplicates:
          id                               notes    status  exponent  \
1  example_2  this must have some purpose right?  Finished       1.4   

   processors  submitted           time_stamp  timesteps  
1           1      False  2023-06-28 17:01:00          5  
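
The message above comes from the duplicate check, which the first cell switched off with `skip_duplicate_check=True`. How the package implements the check is not shown here. The sketch below assumes that two simulations count as duplicates when their parameter dictionaries are equal; `find_duplicates` and the `existing` mapping are hypothetical names used only for illustration.

```python
def find_duplicates(new_parameters, existing):
    """Return the ids of stored simulations whose parameters equal the new ones.

    `existing` maps a simulation id to its parameter dict (hypothetical layout).
    """
    return [uid for uid, params in existing.items() if params == new_parameters]


existing = {
    "example_1": {"timesteps": 5, "exponent": 2},
    "example_2": {"timesteps": 5, "exponent": 1.4},
}

duplicates = find_duplicates({"timesteps": 5, "exponent": 1.4}, existing)
no_duplicates = find_duplicates({"timesteps": 10, "exponent": 1.4}, existing)
print(duplicates)
```

Exact equality on floating-point parameters is a simplification; a tolerance-based comparison may be more appropriate for computed values.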


In [5]:
sim = db[0]
db.df

| | id | notes | status | exponent | processors | submitted | time_stamp | timesteps |
|---|---|---|---|---|---|---|---|---|
| 0 | example_1 | What does this do? | Initiated | 2.0 | 1 | False | 2023-06-28 17:41:00 | 5 |
| 1 | example_2 | this must have some purpose right? | Initiated | 1.4 | 1 | False | 2023-06-28 17:42:00 | 5 |


# The execution script

**Requirements**

- The script must use `argparse` and accept the following two arguments:
    - `path`: the path to the database
    - `uid`: the unique identifier of a simulation

Both arguments are passed automatically when the script is launched through a batch script created with `create_batch_script`. Otherwise, they serve as a compact way of passing all required information to the execution script.


```python
import numpy as np
import argparse

from dbmanager_v2.simulation import SimulationWriter

def main(path, uid):

    # initiate the database simulation writer
    writer = SimulationWriter(uid, path)
    writer.register_git_attributes()
    writer.add_metadata()

    # to load the parameters of this simulation use
    parameters = writer.parameters
    
    # do what you have to do
    # here we create a simple mesh
    coords = np.array([
        [0., 0.],
        [1., 0.],
        [2., 0.],
        [0., 1.],
        [1., 2.]
    ])
    
    initial_values = np.random.randint(0, 10, 5)
    
    for t in range(parameters['timesteps']):

        result = initial_values * t**parameters['exponent']
        
        # write the coordinates
        # if you have a mesh with coords and connectivity, you can add a global mesh with
        # writer.add_mesh(coords, conn) outside this for loop
        writer.add_field('coords', coords, time=t)
        
        # write the data
        writer.add_field('mult_with_t', result, time=t)

        # write a global quantity
        writer.add_global_field('sum', np.sum(result))        
        
        # finish the step, required
        writer.finish_step()

    # finish sim, not required
    writer.finish_sim()


if __name__ == '__main__':

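    # both arguments are needed to locate the simulation, so make them required
    parser = argparse.ArgumentParser()
    parser.add_argument('--path', type=str, required=True, help='path to the database')
    parser.add_argument('--uid', type=str, required=True, help='unique identifier of the simulation')
    args = parser.parse_args()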

    main(args.path, args.uid)
```
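
To run such a script by hand rather than through a generated batch script, the two arguments have to be supplied on the command line. The sketch below only assembles that command; `build_command` is a hypothetical helper, and the contents of the generated batch script are not shown here, so this is not a description of what it does.

```python
import sys


def build_command(script, path, uid):
    """Assemble the argument list that starts the execution script."""
    return [sys.executable, script, "--path", path, "--uid", uid]


command = build_command("example_script.py", "out", "example_1")
print(" ".join(command[1:]))
# To actually launch it: subprocess.run(command, check=True)
```

Passing the command as a list avoids shell quoting problems when the database path contains spaces.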

# Viewing data

In [11]:
db = Manager('out')
db.df

| | id | notes | exponent | processors | status | submitted | time_stamp | timesteps |
|---|---|---|---|---|---|---|---|---|
| 0 | example_1 | What does this do? | 2.0 | 1 | Finished | False | 2023-06-28 17:01:00 | 5 |
| 1 | example_2 | this must have some purpose right? | 1.4 | 1 | Finished | False | 2023-06-28 17:01:00 | 5 |


In [40]:
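sim = db.sim('example_1')  # by identifier, or e.g.
sim = db.sim(db.df[db.df.exponent == 2.0].id.iloc[0])  # first match of a parameter query (positional, so it also works when the match is not row 0), or e.g.
sim = db.sim(db.df.id.iloc[0])  # first row of the table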

sim.change_note('Ah I see what you are doing...')

In [41]:
sim.data_info

| | coords | mult_with_t |
|---|---|---|
| dtype | float32 | float32 |
| shape | (5, 2) | (5, 1) |
| steps | 5 | 5 |


In [43]:
sim.data('mult_with_t')

namespace(t=[0, 1, 2, 3, 4],
          data=[array([[0.],
                       [0.],
                       [0.],
                       [0.],
                       [0.]], dtype=float32),
                array([[7.],
                       [8.],
                       [6.],
                       [2.],
                       [1.]], dtype=float32),
                array([[28.],
                       [32.],
                       [24.],
                       [ 8.],
                       [ 4.]], dtype=float32),
                array([[63.],
                       [72.],
                       [54.],
                       [18.],
                       [ 9.]], dtype=float32),
                array([[112.],
                       [128.],
                       [ 96.],
                       [ 32.],
                       [ 16.]], dtype=float32)])

In [45]:
sim.globals

| | sum |
|---|---|
| 0 | 0.0 |
| 1 | 24.0 |
| 2 | 96.0 |
| 3 | 216.0 |
| 4 | 384.0 |
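
The global field `sum` can be cross-checked against the stored `mult_with_t` field. The sketch below is a stand-in rather than a call into the package: it rebuilds the namespace printed by `sim.data('mult_with_t')` with nested lists in place of NumPy arrays, then sums each step.

```python
from types import SimpleNamespace

# Stand-in for the value returned by sim.data('mult_with_t');
# nested lists replace the float32 NumPy arrays of shape (5, 1).
field = SimpleNamespace(
    t=[0, 1, 2, 3, 4],
    data=[
        [[0.0], [0.0], [0.0], [0.0], [0.0]],
        [[7.0], [8.0], [6.0], [2.0], [1.0]],
        [[28.0], [32.0], [24.0], [8.0], [4.0]],
        [[63.0], [72.0], [54.0], [18.0], [9.0]],
        [[112.0], [128.0], [96.0], [32.0], [16.0]],
    ],
)

# Sum the single column of every step, mirroring np.sum(result) in the script.
sums = [sum(row[0] for row in step) for step in field.data]
for t, total in zip(field.t, sums):
    print(t, total)
```

The per-step totals agree with the `sum` column of `sim.globals`. With the real arrays, `[step.sum() for step in field.data]` would give the same numbers.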


In [49]:
print(sim.git)


----- REMOTE ------ 
origin	git@gitlab.com:mohitpundir/database-manager.git (fetch)
origin	git@gitlab.com:mohitpundir/database-manager.git (push)

----- BRANCH ------ 
* florez 29c4c2e added functionality to copy executable, added creation of bash script for local and cluster use
  main   6324fea changed link to main

----- LAST COMMIT ------ 
-v

----- STATUS ------ 
On branch florez
Your branch is up to date with 'origin/florez'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	./

nothing added to commit but untracked files present (use "git add" to track)
----- DIFFERENCE ------ 

