# Basic usage of the ABCD database

In [1]:
%load_ext autoreload

In [2]:
%autoreload 2

In [7]:
from abcd import ABCD

First, define the URL of the database. It can be local or remote:

- direct access: url = 'mongodb://localhost:27017'
- api access: url = 'http://localhost/api'
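
The two access modes differ only in the URL scheme. As a rough sketch of how a client could tell them apart, the standard library's `urllib.parse` is enough. The `describe_url` helper is hypothetical and is not part of `abcd`; how the library actually dispatches on the URL is not shown in this notebook.

```python
from urllib.parse import urlparse


def describe_url(url):
    """Classify a database URL by its scheme (hypothetical helper, not the abcd API)."""
    parts = urlparse(url)
    if parts.scheme == 'mongodb':
        mode = 'direct'
    elif parts.scheme in ('http', 'https'):
        mode = 'api'
    else:
        raise ValueError(f'unsupported scheme: {parts.scheme!r}')
    # hostname and port come straight from the URL; port is None when omitted
    return mode, parts.hostname, parts.port


direct = describe_url('mongodb://localhost:27017')
api = describe_url('http://localhost/api')
```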

Use a `with` statement so that raised exceptions are caught. You can skip it, but then you have to handle every unexpected event yourself (cannot connect to the database, lost connection, wrong filter, wrong URL, etc.).
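
To illustrate why a `with` block helps, here is a minimal sketch of a context manager that always releases its resource and deals with connection errors in one place. The `Connection` class is a hypothetical stand-in, not the real `ABCD` class, and the choice to suppress `ConnectionError` is an assumption made for the example.

```python
class Connection:
    """Hypothetical stand-in for a database handle (not the real ABCD class)."""

    def __init__(self, url):
        self.url = url
        self.is_open = False
        self.errors = []

    def __enter__(self):
        self.is_open = True
        return self

    def __exit__(self, exc_type, exc, traceback):
        # Always release the resource, whether or not the body failed.
        self.is_open = False
        if exc_type is not None and issubclass(exc_type, ConnectionError):
            # Record connection problems and suppress them (assumed policy).
            self.errors.append(str(exc))
            return True
        # Any other exception propagates to the caller.
        return False


conn = Connection('mongodb://localhost:27017')
with conn as db:
    raise ConnectionError('lost connection')
```

After the block, `conn.is_open` is `False` and the error message is stored in `conn.errors`, so the calling code does not need its own `try`/`except` around every database call.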

In [8]:
url = 'mongodb://localhost:27017'
# url = 'http://localhost:5000/api'
abcd = ABCD(url)

print(abcd)
abcd

MongoDatabase(url=localhost:27017, db=abcd, collection=atoms)


In [11]:
abcd.print_info()

      type: mongodb
      host: localhost
      port: 27017
        db: abcd
collection: atoms
number of confs: 0


## Cleanup 

WARNING: this removes all elements from the database.
It is only supported for local access.

In [12]:
with abcd as db:
    db.destroy()

## Uploading configurations

In [13]:
from pathlib import Path

from ase.io import iread, read
from utils.ext_xyz import XYZReader

directory = Path('utils/data/')
file = directory / 'bcc_bulk_54_expanded_2_high.xyz'
# file = directory / 'GAP_6.xyz'

Uploading configurations one by one, directly from ASE atoms objects:

In [14]:
%%time
with abcd as db:

    for atoms in iread(file.as_posix(), index=slice(None)):
        
        # Hack to fix the representation of forces
        atoms.calc.results['forces'] = atoms.arrays['force']
        atoms.arrays['force'] = None
            
        db.push(atoms)
    

CPU times: user 30.3 ms, sys: 2.79 ms, total: 33.1 ms
Wall time: 37.6 ms


Reading the trajectory from file:

In [15]:
%%time
traj = read(file.as_posix(), index=slice(None))
len(traj)

CPU times: user 11.7 ms, sys: 2.09 ms, total: 13.7 ms
Wall time: 12.9 ms


In [16]:
%%time
with XYZReader(file) as reader:
    traj = list(reader.read_atoms(forces_label='force'))


CPU times: user 8.18 ms, sys: 2.33 ms, total: 10.5 ms
Wall time: 8.52 ms
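
The difference between streaming frames one at a time and loading the whole trajectory into a list can be sketched with a plain generator. The parser below handles only the basic XYZ layout (atom count, comment line, then one line per atom). It is an illustration and not the `XYZReader` used above; the function name and sample data are made up.

```python
import io


def iter_xyz_frames(stream):
    """Yield (comment, symbols, positions) for each frame of a basic XYZ stream."""
    while True:
        header = stream.readline()
        if not header.strip():
            # End of file (or a blank line) ends the iteration.
            return
        natoms = int(header)
        comment = stream.readline().strip()
        symbols, positions = [], []
        for _ in range(natoms):
            fields = stream.readline().split()
            symbols.append(fields[0])
            positions.append(tuple(float(x) for x in fields[1:4]))
        yield comment, symbols, positions


# Made-up sample with two frames.
sample = io.StringIO(
    "2\nframe 0\nFe 0.0 0.0 0.0\nFe 1.4 1.4 1.4\n"
    "1\nframe 1\nH 0.0 0.0 0.5\n"
)

# Wrapping the generator in list() loads everything at once;
# iterating over it directly would process one frame at a time.
frames = list(iter_xyz_frames(sample))
```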


Pushing the whole trajectory to the database:

In [17]:
%%time
with abcd as db:
    db.push(traj)

CPU times: user 6.84 ms, sys: 1.69 ms, total: 8.53 ms
Wall time: 11 ms


Uploading a whole file and inserting it into the database on the server side:

In [18]:
%%time
with abcd as db:
    db.upload(file.as_posix())

CPU times: user 23.8 ms, sys: 1.77 ms, total: 25.6 ms
Wall time: 26.9 ms


An alternative way to upload a file to the database:

In [19]:
%%time
with abcd as db, XYZReader(file) as reader:
    db.push(reader.read_atoms(forces_label='force'))

CPU times: user 12.2 ms, sys: 1.89 ms, total: 14.1 ms
Wall time: 13.3 ms


In [20]:
abcd.info()

{'host': 'localhost',
 'port': 27017,
 'db': 'abcd',
 'collection': 'atoms',
 'number of confs': 56}

In [28]:
list(abcd.db.atoms.find())

[{'_id': ObjectId('5c530d3ef34bd2ac87c5ee31'),
  'arrays': {'numbers': [26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26,
    26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26,
    26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26],
   'positions': [[0.06017635, 0.05781819, 0.08460648],
    [1.40918009, 1.44551149, 1.41805636],
    [-0.00360893, -0.03751375, 3.01276241],
    [1.44248584, 1.40631291, 4.4529162],
    [-0.080846, -0.02919137, 5.83695942],
    [1.31644895, 1.39765164, 7.26587423],
    [-0.03131407, 2.98052992, 0.0504902],
    [1.34941525, 4.27114514, 1.58856784],
    [-0.01589813, 2.82405412, 2.98128968],
    [1.40023491, 4.20467382, 4.46953326],
    [0.01940004, 2.87493497, 5.75744705],
    [1.50073891, 4.402
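
The stored record keeps per-atom data under an `arrays` key. A minimal sketch of building a dictionary with that shape from plain Python lists follows. It covers only the two fields visible in the output above (`numbers` and `positions`); the `to_document` helper and the tiny lookup table are hypothetical, and the real record may contain more fields.

```python
# Small lookup table for illustration only; a real one covers the periodic table.
ATOMIC_NUMBERS = {'H': 1, 'Fe': 26}


def to_document(symbols, positions):
    """Build a dict shaped like the stored record (only the two visible fields)."""
    if len(symbols) != len(positions):
        raise ValueError('one position is required per atom')
    return {
        'arrays': {
            'numbers': [ATOMIC_NUMBERS[s] for s in symbols],
            'positions': [list(p) for p in positions],
        }
    }


doc = to_document(['Fe', 'Fe'], [(0.0, 0.0, 0.0), (1.4, 1.4, 1.4)])
```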

## Query data from the database

The key component is the query string, whose implementation is based on the GraphQL specification.

- query all
- query atoms
- query properties

- histograms/summaries?

In [None]:
query = {
    'elements': ['Fe', 'H'],
}

with abcd as db:
    traj = [atoms for atoms in db.pull(query)]
    
print(traj)
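
What an `elements` query selects can be sketched on in-memory records. This example assumes "contains every listed element" semantics, which the notebook does not confirm; the `matches` helper and the sample records are hypothetical.

```python
def matches(doc, query):
    """True if the configuration contains every element listed in the query.

    The 'contains all' semantics is an assumption made for this sketch.
    """
    wanted = set(query.get('elements', []))
    return wanted <= set(doc['elements'])


# Made-up records standing in for stored configurations.
docs = [
    {'id': 1, 'elements': ['Fe']},
    {'id': 2, 'elements': ['Fe', 'H']},
    {'id': 3, 'elements': ['H', 'Fe', 'C']},
]

query = {'elements': ['Fe', 'H']}
selected = [doc['id'] for doc in docs if matches(doc, query)]
```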

Query specific properties (such as all energies of the filtered atoms):

In [None]:
query = {
    'filter': {
        'elements': ['Fe', 'H'],
    },
    'fields': [
        'energy',
    ]
}

with abcd as db:
    data = db.query(query)
    
print(data)
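
The `filter` and `fields` parts of such a query play different roles: the first selects records, the second selects which values are returned for each one. A minimal in-memory sketch, assuming the same "contains every listed element" filter semantics and using made-up energies (`run_query` is hypothetical, not the `db.query` implementation):

```python
def run_query(docs, query):
    """Apply a {'filter': ..., 'fields': ...} query to in-memory dicts."""
    wanted = set(query.get('filter', {}).get('elements', []))
    fields = query.get('fields', [])
    rows = []
    for doc in docs:
        if wanted <= set(doc['elements']):
            # Keep only the requested fields of each matching record.
            rows.append({name: doc.get(name) for name in fields})
    return rows


# Made-up records and energies, for illustration only.
docs = [
    {'elements': ['Fe'], 'energy': -1.0},
    {'elements': ['Fe', 'H'], 'energy': -2.5},
]

rows = run_query(docs, {'filter': {'elements': ['Fe', 'H']}, 'fields': ['energy']})
```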

## Download the whole database

Dump the entire database into a local file:

In [None]:
# Open the target for writing; binary mode is an assumption about the dump format.
with open('dump.db', 'wb') as dump_file:
    with abcd as db:
        db.download(dump_file)

## Linking databases

Pull the data from one database into another. This function is useful when you want to build a local database by fetching data from other repositories.

In [None]:
with abcd as db:
    db.fetch_from(url='...', query={})
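
The idea behind linking can be sketched with two dictionaries standing in for the remote and local databases. Everything here is an assumption made for illustration: the standalone `fetch_from` function, the rule that ids already present locally are skipped, and the treatment of an empty query as "match everything".

```python
def fetch_from(local, remote, query):
    """Copy matching records from remote into local, skipping ids already present."""
    wanted = set(query.get('elements', []))
    copied = 0
    for key, doc in remote.items():
        if key in local:
            continue  # assumed policy: never overwrite an existing local record
        if wanted <= set(doc['elements']):
            local[key] = dict(doc)
            copied += 1
    return copied


# Made-up contents for both sides.
remote = {'a': {'elements': ['Fe']}, 'b': {'elements': ['Fe', 'H']}}
local = {'a': {'elements': ['Fe']}}

copied = fetch_from(local, remote, query={})
```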

In [None]:
print(abcd.info())

## Command line interface

In [None]:
!abcd --help

In [None]:
!abcd connect/login 

In [None]:
!abcd info

In [None]:
!abcd push --help

In [None]:
!abcd pull --help

In [None]:
!abcd query --help

In [None]:
!abcd download --help

In [None]:
# search? A specific query that returns only the ids.
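
A command layout like the one above is commonly built from subcommands. The following `argparse` sketch mirrors only the command names used in the cells above; the positional arguments are hypothetical, and the options of the real `abcd` command are whatever its `--help` output reports.

```python
import argparse


def build_parser():
    """Build a subcommand parser (a sketch; arguments are hypothetical)."""
    parser = argparse.ArgumentParser(prog='abcd')
    sub = parser.add_subparsers(dest='command', required=True)

    sub.add_parser('info', help='show connection details')

    push = sub.add_parser('push', help='upload configurations')
    push.add_argument('path', help='file to upload (hypothetical argument)')

    for name in ('pull', 'query'):
        cmd = sub.add_parser(name, help=f'{name} configurations')
        cmd.add_argument('filter', nargs='?', default='',
                         help='query string (hypothetical argument)')

    download = sub.add_parser('download', help='dump the database')
    download.add_argument('target', help='output file (hypothetical argument)')
    return parser


# The file name is a placeholder, not a file from this notebook.
args = build_parser().parse_args(['push', 'example.xyz'])
```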

## Web interface

flask web server
