# Examples

In [2]:
from mdb import MDBClient

client = MDBClient('localhost', 'postgres', '', 'molecdb')



## Get, Add, Update, Delete

The client has four main methods: `add`, `get`, `update` and `delete`. They are designed to be easy to use and accept several kinds of input, such as a dict or a pandas.DataFrame.

Let's say we want to add a molecule. We can use the `add` method and read the result back with `get`. The result is a pandas.DataFrame that corresponds to the `molecule` table. Note that these methods require you to know the schema of the database.

In [3]:
data = {'smiles': 'C1C=CC=C1'}
client.add('molecule', data)
client.get('molecule')

Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2020-01-19 19:27:37.197789,1,{},C1C=CC=C1,2020-01-19 19:27:37.197789,76c6d4c7-c8cc-4807-afa2-154541f5aca9


To use a pandas.DataFrame as input, simply pass it in place of the dict. The columns of the dataframe must correspond to the fields of the SQL table.

In [4]:
import pandas as pd

df = pd.DataFrame([{'smiles': 'C#N'}, {'smiles': 'CC'}])
df.head()

Unnamed: 0,smiles
0,C#N
1,CC
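
The column-matching requirement can be checked before calling `add`. The sketch below is only an illustration: `unknown_columns` and `MOLECULE_FIELDS` are hypothetical names that are not part of `mdb`, and the field list is copied from the `molecule` output shown earlier. How the client itself reacts to an unexpected column is not covered here.

```python
# Field names of the `molecule` table, copied from the output shown earlier.
MOLECULE_FIELDS = {"created_on", "id", "properties", "smiles", "updated_on", "uuid"}


def unknown_columns(columns, table_fields):
    """Return the column names that do not match any field of the table."""
    return sorted(set(columns) - set(table_fields))


# With a DataFrame you would pass `df.columns`; plain lists work the same way.
print(unknown_columns(["smiles"], MOLECULE_FIELDS))           # []
print(unknown_columns(["smiles", "smile"], MOLECULE_FIELDS))  # ['smile']
```

An empty list means every column maps to a field of the table.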


In [5]:
client.add('molecule', df)
df = client.get('molecule')
df

100%|██████████| 2/2 [00:00<00:00, 44.70it/s]


Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2020-01-19 19:27:37.197789,1,{},C1C=CC=C1,2020-01-19 19:27:37.197789,76c6d4c7-c8cc-4807-afa2-154541f5aca9
1,2020-01-19 19:27:37.298978,2,{},C#N,2020-01-19 19:27:37.298978,b6d9e64a-f6d2-4b49-85f3-e8ff74a7d9ab
2,2020-01-19 19:27:37.404864,3,{},CC,2020-01-19 19:27:37.404864,618cbef6-0d4c-4a2f-b35d-5e30afd1159f


It is also possible to get a table as a dataframe, change some values and write it back to the database.

In [6]:
df.at[0, 'properties'] = {'new_prop': 'prop_value'}
df

Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2020-01-19 19:27:37.197789,1,{'new_prop': 'prop_value'},C1C=CC=C1,2020-01-19 19:27:37.197789,76c6d4c7-c8cc-4807-afa2-154541f5aca9
1,2020-01-19 19:27:37.298978,2,{},C#N,2020-01-19 19:27:37.298978,b6d9e64a-f6d2-4b49-85f3-e8ff74a7d9ab
2,2020-01-19 19:27:37.404864,3,{},CC,2020-01-19 19:27:37.404864,618cbef6-0d4c-4a2f-b35d-5e30afd1159f


In [7]:
client.update('molecule', df)
df = client.get('molecule')
df

100%|██████████| 3/3 [00:00<00:00, 1469.28it/s]


Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2020-01-19 19:27:37.197789,1,{'new_prop': 'prop_value'},C1C=CC=C1,2020-01-19 19:27:37.436168,76c6d4c7-c8cc-4807-afa2-154541f5aca9
1,2020-01-19 19:27:37.298978,2,{},C#N,2020-01-19 19:27:37.436168,b6d9e64a-f6d2-4b49-85f3-e8ff74a7d9ab
2,2020-01-19 19:27:37.404864,3,{},CC,2020-01-19 19:27:37.436168,618cbef6-0d4c-4a2f-b35d-5e30afd1159f


The `delete` method can take either a single uuid or a list of uuids:

In [8]:
df = client.get('molecule')
client.delete('molecule', df['uuid'].tolist())
client.get('molecule')
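
Accepting "a single uuid or a list" usually comes down to normalising the argument first. The following is a minimal sketch of that idea, not the code used by `mdb`; `as_uuid_list` is a hypothetical helper.

```python
def as_uuid_list(uuids):
    """Wrap a single uuid string in a list; turn any other iterable into a list."""
    if isinstance(uuids, str):
        # A string is iterable too, so it has to be handled before `list()`.
        return [uuids]
    return list(uuids)


single = "76c6d4c7-c8cc-4807-afa2-154541f5aca9"
print(as_uuid_list(single))            # one-element list
print(as_uuid_list([single, single]))  # list passed through unchanged
```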

## Objects with relations and helper methods

Of course, `add`, `get`, `update` and `delete` don't capture everything. Relationships between objects do not appear in the pandas.DataFrame, and adding them by hand can be burdensome. This is why some helper methods have been implemented.

When adding objects that have relationships, it is recommended to use these helper methods instead of the base methods.

In [9]:
client.add_molecule(fragment=['C#N', 'CCC'], smiles='CCCC#N')

100%|██████████| 2/2 [00:00<00:00, 61.51it/s]


{'molecule': <sqlalchemy.ext.automap.eventstore at 0x1181e9208>,
 'molecule_fragment': [<sqlalchemy.ext.automap.eventstore at 0x1181fcbe0>,
  <sqlalchemy.ext.automap.eventstore at 0x1181e9ba8>]}

In [10]:
# two fragments have been added
client.get('fragment')

Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2020-01-19 19:27:37.795124,1,{},C#N,2020-01-19 19:27:37.795124,81a5b055-23f7-44b8-9ad2-bd75d4497143
1,2020-01-19 19:27:37.836063,2,{},CCC,2020-01-19 19:27:37.836063,5bb15bf7-7a7d-49b1-a335-ec744f85ace7


In [11]:
# as well as the molecule
client.get('molecule')

Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2020-01-19 19:27:37.739821,4,{},CCCC#N,2020-01-19 19:27:37.739821,4dbf7365-aad1-4759-8dd3-7a8a337bc6d1


In [12]:
# and the relationship binding the two
client.get('molecule_fragment')

Unnamed: 0,created_on,fragment_id,id,molecule_id,order,updated_on,uuid
0,2020-01-19 19:27:37.867158,81a5b055-23f7-44b8-9ad2-bd75d4497143,1,4dbf7365-aad1-4759-8dd3-7a8a337bc6d1,0,2020-01-19 19:27:37.867158,52bf8db5-c7e2-42b1-a407-42cb343ebfbd
1,2020-01-19 19:27:37.913660,5bb15bf7-7a7d-49b1-a335-ec744f85ace7,2,4dbf7365-aad1-4759-8dd3-7a8a337bc6d1,1,2020-01-19 19:27:37.913660,9bff32db-3ab0-4c70-8a4c-2143f3fd3afe
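
In the output above, each `molecule_fragment` row points to one fragment and one molecule by uuid, and `order` follows the position of the fragment in the input list. The sketch below reproduces only that bookkeeping with plain dicts. `build_link_rows` is a hypothetical function written for illustration; it is not how `add_molecule` is implemented.

```python
import uuid


def build_link_rows(molecule_uuid, fragment_uuids):
    """Build one link row per fragment, numbering them in input order."""
    return [
        {"molecule_id": molecule_uuid, "fragment_id": fragment_uuid, "order": position}
        for position, fragment_uuid in enumerate(fragment_uuids)
    ]


# Random uuids stand in for the ones generated by the database.
molecule_uuid = str(uuid.uuid4())
fragment_uuids = [str(uuid.uuid4()), str(uuid.uuid4())]

for row in build_link_rows(molecule_uuid, fragment_uuids):
    print(row["order"], row["fragment_id"], "->", row["molecule_id"])
```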


## Filtering

All of this would be rather useless without an efficient way of filtering data. The PostgreSQL database makes this easy. `client.get_models()` returns the schema of the database as models, and you can write filters on them as if they were Python objects:

In [13]:
models = client.get_models()

In [14]:
client.get('fragment', filters=[models.fragment.smiles == 'C#N'])

Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2020-01-19 19:27:37.795124,1,{},C#N,2020-01-19 19:27:37.795124,81a5b055-23f7-44b8-9ad2-bd75d4497143
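
The filter `models.fragment.smiles == 'C#N'` is not evaluated to `True` or `False` in Python. Assuming the models are SQLAlchemy classes (as the object reprs earlier suggest), the comparison operator is overloaded and returns an expression object that is later translated to SQL. The toy classes below illustrate that mechanism only. `Column` and `Condition` are hypothetical stand-ins, not SQLAlchemy's real classes.

```python
class Column:
    """Toy stand-in for a mapped column."""

    def __init__(self, table, name):
        self.table = table
        self.name = name

    def __eq__(self, value):
        # Return a description of the comparison instead of a boolean.
        return Condition(self, "=", value)


class Condition:
    """Stores a comparison so it can be applied or rendered later."""

    def __init__(self, column, operator, value):
        self.column = column
        self.operator = operator
        self.value = value

    def matches(self, row):
        """Evaluate the stored comparison against a plain dict row."""
        return row[self.column.name] == self.value

    def __repr__(self):
        return f"{self.column.table}.{self.column.name} {self.operator} {self.value!r}"


condition = Column("fragment", "smiles") == "C#N"
print(condition)  # fragment.smiles = 'C#N'

rows = [{"smiles": "C#N"}, {"smiles": "CCC"}]
print([row for row in rows if condition.matches(row)])  # [{'smiles': 'C#N'}]
```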


You can also do more complex filtering, joining tables that are related:

In [15]:
client.get(
    ['molecule', 'molecule_fragment', 'fragment'],
    filters=[models.fragment.smiles == 'C#N',
             models.molecule_fragment.order == 0],
)

Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2020-01-19 19:27:37.739821,4,{},CCCC#N,2020-01-19 19:27:37.739821,4dbf7365-aad1-4759-8dd3-7a8a337bc6d1


In the example above, you only get the first table, because a pandas.DataFrame is not well suited to relational objects. Fortunately, you can also output `sqlalchemy` objects that incorporate this mapping.

In [16]:
molecule = client.get(['molecule', 'molecule_fragment', 'fragment'], dataframe=False)
print(molecule[0].smiles)
print(molecule[0].molecule_fragment_collection[0].order)
print(molecule[0].molecule_fragment_collection[0].fragment.smiles)

CCCC#N
0
C#N


## Rollback

The advantage of using event sourcing is that you can roll back to any point in time. Here is a quick example:

In [17]:
client.get('eventstore')

Unnamed: 0,data,event,id,timestamp,type,user_id,uuid
0,"{'id': 1, 'smiles': 'C1C=CC=C1'}",create,2234,2020-01-19 19:27:37.197789,molecule,1,76c6d4c7-c8cc-4807-afa2-154541f5aca9
1,"{'id': 2, 'smiles': 'C#N'}",create,2235,2020-01-19 19:27:37.298978,molecule,1,b6d9e64a-f6d2-4b49-85f3-e8ff74a7d9ab
2,"{'id': 3, 'smiles': 'CC'}",create,2236,2020-01-19 19:27:37.404864,molecule,1,618cbef6-0d4c-4a2f-b35d-5e30afd1159f
3,"{'id': 1, 'smiles': 'C1C=CC=C1', 'properties':...",update,2237,2020-01-19 19:27:37.436168,molecule,1,76c6d4c7-c8cc-4807-afa2-154541f5aca9
4,"{'id': 2, 'smiles': 'C#N', 'properties': {}}",update,2238,2020-01-19 19:27:37.436168,molecule,1,b6d9e64a-f6d2-4b49-85f3-e8ff74a7d9ab
5,"{'id': 3, 'smiles': 'CC', 'properties': {}}",update,2239,2020-01-19 19:27:37.436168,molecule,1,618cbef6-0d4c-4a2f-b35d-5e30afd1159f
6,,delete,2240,2020-01-19 19:27:37.624624,molecule,1,76c6d4c7-c8cc-4807-afa2-154541f5aca9
7,,delete,2241,2020-01-19 19:27:37.624624,molecule,1,b6d9e64a-f6d2-4b49-85f3-e8ff74a7d9ab
8,,delete,2242,2020-01-19 19:27:37.624624,molecule,1,618cbef6-0d4c-4a2f-b35d-5e30afd1159f
9,"{'id': 4, 'smiles': 'CCCC#N'}",create,2243,2020-01-19 19:27:37.739821,molecule,1,4dbf7365-aad1-4759-8dd3-7a8a337bc6d1


In [18]:
from datetime import datetime
client.rollback(datetime(1980, 4, 3))

In [19]:
client.get('eventstore').sort_values(by='id')

Unnamed: 0,data,event,id,timestamp,type,user_id,uuid
0,"{'id': 1, 'smiles': 'C1C=CC=C1'}",create,2234,2020-01-19 19:27:37.197789,molecule,1,76c6d4c7-c8cc-4807-afa2-154541f5aca9
1,"{'id': 2, 'smiles': 'C#N'}",create,2235,2020-01-19 19:27:37.298978,molecule,1,b6d9e64a-f6d2-4b49-85f3-e8ff74a7d9ab
2,"{'id': 3, 'smiles': 'CC'}",create,2236,2020-01-19 19:27:37.404864,molecule,1,618cbef6-0d4c-4a2f-b35d-5e30afd1159f
3,"{'id': 1, 'smiles': 'C1C=CC=C1', 'properties':...",update,2237,2020-01-19 19:27:37.436168,molecule,1,76c6d4c7-c8cc-4807-afa2-154541f5aca9
4,"{'id': 2, 'smiles': 'C#N', 'properties': {}}",update,2238,2020-01-19 19:27:37.436168,molecule,1,b6d9e64a-f6d2-4b49-85f3-e8ff74a7d9ab
5,"{'id': 3, 'smiles': 'CC', 'properties': {}}",update,2239,2020-01-19 19:27:37.436168,molecule,1,618cbef6-0d4c-4a2f-b35d-5e30afd1159f
6,,delete,2240,2020-01-19 19:27:37.624624,molecule,1,76c6d4c7-c8cc-4807-afa2-154541f5aca9
7,,delete,2241,2020-01-19 19:27:37.624624,molecule,1,b6d9e64a-f6d2-4b49-85f3-e8ff74a7d9ab
8,,delete,2242,2020-01-19 19:27:37.624624,molecule,1,618cbef6-0d4c-4a2f-b35d-5e30afd1159f
9,"{'id': 4, 'smiles': 'CCCC#N'}",create,2243,2020-01-19 19:27:37.739821,molecule,1,4dbf7365-aad1-4759-8dd3-7a8a337bc6d1
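
The general idea behind rolling back with an event store is to replay the recorded events in order, stopping at the chosen date. The sketch below shows that idea on a small in-memory event list modelled on the `eventstore` rows above. It is an assumption-laden illustration: `replay` is a hypothetical function, the uuid `"mol-1"` is a placeholder, and `client.rollback` may work differently internally.

```python
from datetime import datetime

# Three events for one molecule, with timestamps taken from the table above.
events = [
    {"event": "create", "uuid": "mol-1", "data": {"smiles": "C1C=CC=C1"},
     "timestamp": datetime(2020, 1, 19, 19, 27, 37, 197789)},
    {"event": "update", "uuid": "mol-1", "data": {"properties": {"new_prop": "prop_value"}},
     "timestamp": datetime(2020, 1, 19, 19, 27, 37, 436168)},
    {"event": "delete", "uuid": "mol-1", "data": None,
     "timestamp": datetime(2020, 1, 19, 19, 27, 37, 624624)},
]


def replay(events, until):
    """Rebuild the table state from all events with timestamp <= `until`."""
    state = {}
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if event["timestamp"] > until:
            break
        if event["event"] == "create":
            state[event["uuid"]] = dict(event["data"])
        elif event["event"] == "update":
            state.setdefault(event["uuid"], {}).update(event["data"])
        elif event["event"] == "delete":
            state.pop(event["uuid"], None)
    return state


print(replay(events, datetime(1980, 4, 3)))    # {} (before any event)
print(replay(events, events[1]["timestamp"]))  # molecule with its updated properties
print(replay(events, events[2]["timestamp"]))  # {} (molecule deleted)
```

Because the events themselves are never modified, any earlier state can be rebuilt this way.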
