# Examples

In [1]:
from mdb import MDBClient

client = MDBClient('localhost', 'postgres', '', 'madness')



## Get, Add, Update, Delete

There are four main methods in the client: `add`, `get`, `update` and `delete`. They are designed to be easy to use and accept several kinds of input, such as a dict or a pandas.DataFrame.

Let's say we want to add a molecule. To do so, we can use the `add` method and read the result back with `get`. The result is a pandas.DataFrame that corresponds to the table `molecule`. Note that with these methods you need to know the schema of the database.

In [2]:
data = {'smiles': 'C1C=CC=C1'}
client.add('molecule', data)
client.get('molecule')

100%|██████████| 1/1 [00:00<00:00,  6.63it/s]


Unnamed: 0,created_on,metadata,id,smiles,updated_on,uuid
0,2020-02-05 15:39:49.797484,{},12,C1C=CC=C1,2020-02-05 15:39:49.797484,7c99c0aa-a985-4055-8a89-62a569b6fccc


If we want to use a pandas.DataFrame as input, we can simply pass it instead of the dict. The columns of the dataframe have to correspond to the fields of the SQL table.

In [3]:
import pandas as pd

df = pd.DataFrame([{'smiles': 'C#N'}, {'smiles': 'CC'}])
df.head()

Unnamed: 0,smiles
0,C#N
1,CC


In [4]:
client.add('molecule', df)
df = client.get('molecule')
df

100%|██████████| 2/2 [00:00<00:00, 36.96it/s]


Unnamed: 0,created_on,metadata,id,smiles,updated_on,uuid
0,2020-02-05 15:39:49.797484,{},12,C1C=CC=C1,2020-02-05 15:39:49.797484,7c99c0aa-a985-4055-8a89-62a569b6fccc
1,2020-02-05 15:39:49.870092,{},13,C#N,2020-02-05 15:39:49.870092,1716c6af-47e4-4aa2-bc3d-068fa5e52a4f
2,2020-02-05 15:39:50.024713,{},14,CC,2020-02-05 15:39:50.024713,a02c4bca-eb9e-442a-b7a3-73ada714e3f7
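
Because the columns have to correspond to the fields of the table, it can help to check the input before calling `add`. The following is a minimal sketch, not part of `MDBClient`: the helper `unknown_fields` is hypothetical, and the field names are simply copied from the `molecule` output above.

```python
# Hypothetical pre-check, not part of MDBClient. The field names are an
# assumption copied from the columns of the `molecule` table output above.
MOLECULE_FIELDS = {'id', 'uuid', 'smiles', 'metadata', 'created_on', 'updated_on'}


def unknown_fields(records, allowed_fields):
    """Return the sorted keys used in `records` that are not in `allowed_fields`."""
    seen = set()
    for record in records:
        seen.update(record)  # collect the keys of every record
    return sorted(seen - set(allowed_fields))


# The second record contains a misspelled field name.
records = [{'smiles': 'C#N'}, {'smiles': 'CC', 'smile': 'typo'}]
print(unknown_fields(records, MOLECULE_FIELDS))  # ['smile']
```

For a dataframe, the same check can be applied to `df.columns` instead of the record keys.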


It is also possible to get a table as a dataframe, change some values and write it back to the database.

In [5]:
df.at[0, 'metadata'] = {'new_prop': 'prop_value'}
df

Unnamed: 0,created_on,metadata,id,smiles,updated_on,uuid
0,2020-02-05 15:39:49.797484,{'new_prop': 'prop_value'},12,C1C=CC=C1,2020-02-05 15:39:49.797484,7c99c0aa-a985-4055-8a89-62a569b6fccc
1,2020-02-05 15:39:49.870092,{},13,C#N,2020-02-05 15:39:49.870092,1716c6af-47e4-4aa2-bc3d-068fa5e52a4f
2,2020-02-05 15:39:50.024713,{},14,CC,2020-02-05 15:39:50.024713,a02c4bca-eb9e-442a-b7a3-73ada714e3f7


In [6]:
client.update('molecule', df)
df = client.get('molecule')
df

3it [00:00, 42.57it/s]


Unnamed: 0,created_on,metadata,id,smiles,updated_on,uuid
0,2020-02-05 15:39:49.797484,{'new_prop': 'prop_value'},12,C1C=CC=C1,2020-02-05 15:39:50.062466,7c99c0aa-a985-4055-8a89-62a569b6fccc
1,2020-02-05 15:39:49.870092,{},13,C#N,2020-02-05 15:39:50.210280,1716c6af-47e4-4aa2-bc3d-068fa5e52a4f
2,2020-02-05 15:39:50.024713,{},14,CC,2020-02-05 15:39:50.210280,a02c4bca-eb9e-442a-b7a3-73ada714e3f7


The `delete` method can take either a single uuid or a list of uuids:

In [7]:
df = client.get('molecule')
client.delete('molecule', df['uuid'].tolist())
client.get('molecule')
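
Accepting either a single uuid or a list usually comes down to normalising the argument first. The sketch below shows one way this could be done; `as_uuid_list` is a hypothetical helper for illustration and is not taken from the client's code.

```python
from uuid import UUID


def as_uuid_list(uuids):
    """Return `uuids` as a list, wrapping a single string or UUID in a list."""
    if isinstance(uuids, (str, UUID)):
        return [uuids]
    return list(uuids)


single = '7c99c0aa-a985-4055-8a89-62a569b6fccc'
several = [single, '1716c6af-47e4-4aa2-bc3d-068fa5e52a4f']

print(len(as_uuid_list(single)))   # 1
print(len(as_uuid_list(several)))  # 2
```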

## Objects with relations and helper methods

Of course, `add`, `get`, `update` and `delete` don't capture everything. The relationships between objects do not appear in the pandas.DataFrame, and it may be burdensome to add these relationships by hand. This is why some helper methods have been implemented.

It is recommended that you use these helper methods instead of the base methods when adding objects that have relationships.

In [8]:
# Adding fragments
client.add_fragment('ABC')
client.add_fragment('DEF')

# Adding molecule made of fragment ABC and DEF
client.add_molecule('ABCDEF', 
                    fragments_uuid=[client.get_uuid('fragment', smiles='ABC'),
                                    client.get_uuid('fragment', smiles='DEF')])

<sqlalchemy.ext.automap.eventstore at 0x111fd1be0>

In [9]:
# two fragments have been added
client.get('fragment')

Unnamed: 0,id,smiles,updated_on,uuid,created_on,properties
0,7,ABC,2020-02-05 15:39:50.419804,17396bee-dd8d-4b27-b7a4-0ebe6a509153,2020-02-05 15:39:50.419804,{}
1,8,DEF,2020-02-05 15:39:50.471218,134e7878-d8bb-4c6c-94a1-5094236fd3b0,2020-02-05 15:39:50.471218,{}


In [10]:
# as well as the molecule
client.get('molecule')

Unnamed: 0,created_on,metadata,id,smiles,updated_on,uuid
0,2020-02-05 15:39:50.490832,{},15,ABCDEF,2020-02-05 15:39:50.490832,b5c50345-69da-4857-a545-a9506c07c53b


In [11]:
# and the relationship binding the two
client.get('molecule_fragment')

Unnamed: 0,order,molecule_id,uuid,created_on,fragment_id,id,updated_on
0,0,b5c50345-69da-4857-a545-a9506c07c53b,2026d086-c84f-4d7b-b213-ffc31440127f,2020-02-05 15:39:50.521969,17396bee-dd8d-4b27-b7a4-0ebe6a509153,3,2020-02-05 15:39:50.521969
1,1,b5c50345-69da-4857-a545-a9506c07c53b,f46edfe8-4390-4ab0-8190-d1dd3074262c,2020-02-05 15:39:50.521969,134e7878-d8bb-4c6c-94a1-5094236fd3b0,4,2020-02-05 15:39:50.521969


**REMARK** The database does not check that the SMILES provided by the user is valid, or that it has been canonicalised.

## Filtering

All of this would be rather useless without an efficient way of filtering data. The PostgreSQL database makes this rather easy. By using `client.get_models()` you get the schema of the database, and you can filter on its models as if they were Python objects:

In [12]:
client.get('fragment', filters=[client.models.fragment.smiles == 'ABC'])

Unnamed: 0,id,smiles,updated_on,uuid,created_on,properties
0,7,ABC,2020-02-05 15:39:50.419804,17396bee-dd8d-4b27-b7a4-0ebe6a509153,2020-02-05 15:39:50.419804,{}
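
The filter above works because SQLAlchemy columns overload comparison operators: `==` builds an expression object instead of returning a bool. The toy classes below illustrate the idea in plain Python. `Column`, `Condition` and `apply_filters` are hypothetical names for this sketch only; SQLAlchemy compiles such expressions to SQL rather than evaluating them row by row in Python.

```python
class Condition:
    """A recorded comparison, e.g. smiles == 'ABC'."""

    def __init__(self, column, value):
        self.column = column
        self.value = value

    def matches(self, row):
        return row[self.column] == self.value


class Column:
    """Toy column whose == returns a Condition instead of a bool."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return Condition(self.name, value)


def apply_filters(rows, filters):
    """Keep the rows that satisfy every condition in `filters`."""
    return [row for row in rows if all(f.matches(row) for f in filters)]


smiles = Column('smiles')
rows = [{'smiles': 'ABC'}, {'smiles': 'DEF'}]
print(apply_filters(rows, [smiles == 'ABC']))  # [{'smiles': 'ABC'}]
```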


You can also do more complex filtering, joining tables that are related:

In [13]:
client.get(['molecule', 'molecule_fragment', 'fragment'],
           filters=[client.models.fragment.smiles == 'ABC',
                    client.models.molecule_fragment.order == 0])

Unnamed: 0,created_on,metadata,id,smiles,updated_on,uuid
0,2020-02-05 15:39:50.490832,{},15,ABCDEF,2020-02-05 15:39:50.490832,b5c50345-69da-4857-a545-a9506c07c53b


In the example above, you only get the first table, because a pandas.DataFrame is not well suited to relational objects. Fortunately, you can also return `sqlalchemy` objects that incorporate this mapping.

In [14]:
molecule = client.get(['molecule', 'molecule_fragment', 'fragment'], return_df=False)
print(molecule[0].smiles)
print(molecule[0].molecule_fragment_collection[0].order)
print(molecule[0].molecule_fragment_collection[0].fragment.smiles)

ABCDEF
0
ABC


## Rollback

The advantage of using event sourcing is that you can roll back to any point in time. Here is a quick example:

In [15]:
client.get('eventstore')

Unnamed: 0,type,id,uuid,data,event,timestamp
0,molecule,55,7c99c0aa-a985-4055-8a89-62a569b6fccc,"{'id': 12, 'smiles': 'C1C=CC=C1'}",create,2020-02-05 15:39:49.797484
1,molecule,56,1716c6af-47e4-4aa2-bc3d-068fa5e52a4f,"{'id': 13, 'smiles': 'C#N'}",create,2020-02-05 15:39:49.870092
2,molecule,57,a02c4bca-eb9e-442a-b7a3-73ada714e3f7,"{'id': 14, 'smiles': 'CC'}",create,2020-02-05 15:39:50.024713
3,molecule,58,7c99c0aa-a985-4055-8a89-62a569b6fccc,"{'id': 12, 'smiles': 'C1C=CC=C1', 'metadata': ...",update,2020-02-05 15:39:50.062466
4,molecule,59,1716c6af-47e4-4aa2-bc3d-068fa5e52a4f,"{'id': 13, 'smiles': 'C#N', 'metadata': {}}",update,2020-02-05 15:39:50.210280
5,molecule,60,a02c4bca-eb9e-442a-b7a3-73ada714e3f7,"{'id': 14, 'smiles': 'CC', 'metadata': {}}",update,2020-02-05 15:39:50.210280
6,molecule,61,7c99c0aa-a985-4055-8a89-62a569b6fccc,{},delete,2020-02-05 15:39:50.274993
7,molecule,62,1716c6af-47e4-4aa2-bc3d-068fa5e52a4f,{},delete,2020-02-05 15:39:50.372971
8,molecule,63,a02c4bca-eb9e-442a-b7a3-73ada714e3f7,{},delete,2020-02-05 15:39:50.372971
9,fragment,64,17396bee-dd8d-4b27-b7a4-0ebe6a509153,"{'id': 7, 'smiles': 'ABC'}",create,2020-02-05 15:39:50.419804


In [16]:
from datetime import datetime
client.rollback(datetime(1980, 4, 3))

<sqlalchemy.ext.automap.eventstore at 0x1120005c0>

In [17]:
client.get('eventstore').sort_values(by='id')

Unnamed: 0,type,id,uuid,data,event,timestamp
0,molecule,55,7c99c0aa-a985-4055-8a89-62a569b6fccc,"{'id': 12, 'smiles': 'C1C=CC=C1'}",create,2020-02-05 15:39:49.797484
1,molecule,56,1716c6af-47e4-4aa2-bc3d-068fa5e52a4f,"{'id': 13, 'smiles': 'C#N'}",create,2020-02-05 15:39:49.870092
2,molecule,57,a02c4bca-eb9e-442a-b7a3-73ada714e3f7,"{'id': 14, 'smiles': 'CC'}",create,2020-02-05 15:39:50.024713
3,molecule,58,7c99c0aa-a985-4055-8a89-62a569b6fccc,"{'id': 12, 'smiles': 'C1C=CC=C1', 'metadata': ...",update,2020-02-05 15:39:50.062466
4,molecule,59,1716c6af-47e4-4aa2-bc3d-068fa5e52a4f,"{'id': 13, 'smiles': 'C#N', 'metadata': {}}",update,2020-02-05 15:39:50.210280
5,molecule,60,a02c4bca-eb9e-442a-b7a3-73ada714e3f7,"{'id': 14, 'smiles': 'CC', 'metadata': {}}",update,2020-02-05 15:39:50.210280
6,molecule,61,7c99c0aa-a985-4055-8a89-62a569b6fccc,{},delete,2020-02-05 15:39:50.274993
7,molecule,62,1716c6af-47e4-4aa2-bc3d-068fa5e52a4f,{},delete,2020-02-05 15:39:50.372971
8,molecule,63,a02c4bca-eb9e-442a-b7a3-73ada714e3f7,{},delete,2020-02-05 15:39:50.372971
9,fragment,64,17396bee-dd8d-4b27-b7a4-0ebe6a509153,"{'id': 7, 'smiles': 'ABC'}",create,2020-02-05 15:39:50.419804


In [18]:
client.get('molecule')

In [19]:
client.get('fragment')
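
Conceptually, rolling back with event sourcing amounts to replaying the stored events up to the requested time. The sketch below illustrates that idea in memory. It is an assumption about the general technique, not the client's actual implementation: `replay` is a hypothetical function, and it assumes each `create` or `update` event carries the full data of the object.

```python
from datetime import datetime


def replay(events, until):
    """Rebuild {uuid: data} from the events whose timestamp is <= `until`."""
    state = {}
    for event in sorted(events, key=lambda e: e['timestamp']):
        if event['timestamp'] > until:
            break  # every later event is ignored
        if event['event'] == 'delete':
            state.pop(event['uuid'], None)
        else:  # 'create' or 'update': store the latest data
            state[event['uuid']] = event['data']
    return state


events = [
    {'uuid': 'a', 'event': 'create', 'data': {'smiles': 'CC'},
     'timestamp': datetime(2020, 2, 5, 15, 39, 49)},
    {'uuid': 'a', 'event': 'update',
     'data': {'smiles': 'CC', 'metadata': {'k': 'v'}},
     'timestamp': datetime(2020, 2, 5, 15, 39, 50)},
    {'uuid': 'a', 'event': 'delete', 'data': {},
     'timestamp': datetime(2020, 2, 5, 15, 39, 51)},
]

# Before any event happened, the state is empty.
print(replay(events, datetime(1980, 4, 3)))  # {}
# Just after the update, the object exists with its metadata.
print(replay(events, datetime(2020, 2, 5, 15, 39, 50)))
```

In this sketch the event list itself is never modified, which matches the output above: the `eventstore` table is unchanged after the rollback.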

# Experimental data

... TODO ...