# Examples

In [1]:
from goldmine import GoldmineClient

client = GoldmineClient('localhost', 'postgres', '', 'molecdb')



## Get, Add, Update, Delete

The client has four main methods: `add`, `get`, `update` and `delete`. They are designed to be easy to use and accept several kinds of input, such as a dict or a pandas.DataFrame.

Let's say we want to add a molecule. To do so, we can use the `add` method and read the result back with `get`. The result is a pandas.DataFrame that corresponds to the table `molecule`. Note that with these methods you need to know the schema of the database.

In [2]:
data = {'smiles': 'C1C=CC=C1'}
client.add('molecule', data)
client.get('molecule')

Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2019-06-24 12:50:04.141505,1,{},C1C=CC=C1,2019-06-24 12:50:04.141505,d12075e9-b918-4b33-8135-ae2c3b04337b


To use a pandas.DataFrame as input, we can simply pass it in place of the dict. The columns of the dataframe have to correspond to the fields of the SQL table.

In [3]:
import pandas as pd

df = pd.DataFrame([{'smiles': 'C#N'}, {'smiles': 'CC'}])
df.head()

Unnamed: 0,smiles
0,C#N
1,CC


In [4]:
client.add('molecule', df)
df = client.get('molecule')
df

100%|██████████| 2/2 [00:00<00:00, 60.01it/s]


Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2019-06-24 12:50:04.141505,1,{},C1C=CC=C1,2019-06-24 12:50:04.141505,d12075e9-b918-4b33-8135-ae2c3b04337b
1,2019-06-24 12:50:04.215284,2,{},C#N,2019-06-24 12:50:04.215284,00f65f13-06fb-420e-b535-657d2a714f0a
2,2019-06-24 12:50:05.663229,3,{},CC,2019-06-24 12:50:05.663229,2d312c7e-b44d-4252-9110-673fc963f505
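Because the column names must match the table fields, it can help to check them before calling `add`. The following is a minimal sketch, not part of the client: `unknown_columns` and `MOLECULE_FIELDS` are hypothetical names, and the field list is simply copied from the `molecule` output shown above. With a dataframe, you would pass `df.columns` as the first argument.

```python
def unknown_columns(columns, table_fields):
    """Return the column names that are not fields of the table, in order."""
    allowed = set(table_fields)
    return [name for name in columns if name not in allowed]


# Field names copied from the `molecule` table output (assumption: complete).
MOLECULE_FIELDS = ["created_on", "id", "properties", "smiles", "updated_on", "uuid"]

print(unknown_columns(["smiles"], MOLECULE_FIELDS))           # []
print(unknown_columns(["smiles", "smile"], MOLECULE_FIELDS))  # ['smile']
```

An empty list means every column maps to a field of the table.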


It is also possible to get a table as a dataframe, change some values and write it back to the database.

In [5]:
df.at[0, 'properties'] = {'new_prop': 'prop_value'}
df

Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2019-06-24 12:50:04.141505,1,{'new_prop': 'prop_value'},C1C=CC=C1,2019-06-24 12:50:04.141505,d12075e9-b918-4b33-8135-ae2c3b04337b
1,2019-06-24 12:50:04.215284,2,{},C#N,2019-06-24 12:50:04.215284,00f65f13-06fb-420e-b535-657d2a714f0a
2,2019-06-24 12:50:05.663229,3,{},CC,2019-06-24 12:50:05.663229,2d312c7e-b44d-4252-9110-673fc963f505


In [6]:
client.update('molecule', df)
df = client.get('molecule')
df

100%|██████████| 3/3 [00:00<00:00, 959.72it/s]


Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2019-06-24 12:50:04.141505,1,{'new_prop': 'prop_value'},C1C=CC=C1,2019-06-24 12:50:05.696646,d12075e9-b918-4b33-8135-ae2c3b04337b
1,2019-06-24 12:50:04.215284,2,{},C#N,2019-06-24 12:50:05.696646,00f65f13-06fb-420e-b535-657d2a714f0a
2,2019-06-24 12:50:05.663229,3,{},CC,2019-06-24 12:50:05.696646,2d312c7e-b44d-4252-9110-673fc963f505


The `delete` method can take either a single uuid or a list of uuids:

In [7]:
df = client.get('molecule')
client.delete('molecule', df['uuid'].tolist())
client.get('molecule')

## Objects with relations and helper methods

Of course, `add`, `get`, `update` and `delete` don't capture everything. The relationships between objects do not appear in the pandas.DataFrame, and it may be burdensome to add these relationships by hand. This is why some helper methods have been implemented.

When adding objects that have relationships to the database, it is recommended that you use these helper methods instead of the base methods.

In [8]:
client.add_molecule(fragment=['C#N', 'CCC'], smiles='CCCC#N')

100%|██████████| 2/2 [00:00<00:00, 89.88it/s]


In [9]:
# two fragments have been added
client.get('fragment')

Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2019-06-24 12:50:08.580389,1,{},C#N,2019-06-24 12:50:08.580389,f89b3c7a-8350-48af-811c-4821448df1a5
1,2019-06-24 12:50:08.622059,2,{},CCC,2019-06-24 12:50:08.622059,b2ea629f-88c5-4f7d-a220-ccd4de8b0ab7


In [10]:
# as well as the molecule
client.get('molecule')

Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2019-06-24 12:50:07.814670,4,{},CCCC#N,2019-06-24 12:50:07.814670,024c4524-d24b-497b-b01a-71f29cbd5e8f


In [11]:
# and the relationship binding the two
client.get('molecule_fragment')

Unnamed: 0,created_on,fragment_id,id,molecule_id,order,updated_on,uuid
0,2019-06-24 12:50:08.649079,f89b3c7a-8350-48af-811c-4821448df1a5,1,024c4524-d24b-497b-b01a-71f29cbd5e8f,0,2019-06-24 12:50:08.649079,747fbf3c-64ca-4cc9-ad10-41e73d15910a
1,2019-06-24 12:50:08.678458,b2ea629f-88c5-4f7d-a220-ccd4de8b0ab7,2,024c4524-d24b-497b-b01a-71f29cbd5e8f,1,2019-06-24 12:50:08.678458,e71b32b7-e752-4644-b248-5f556cbf903a
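To make the structure of `molecule_fragment` concrete, here is a rough sketch of how such association rows could be built by hand. It is an illustration under assumptions, not the helper's actual implementation: `build_links` is a hypothetical function, and the only things taken from the output above are the column names and the fact that `order` follows the position of each fragment in the input list.

```python
import uuid


def build_links(molecule_uuid, fragment_uuids):
    """Return one association row per fragment, numbered in list order."""
    return [
        {
            "uuid": str(uuid.uuid4()),    # fresh identifier for the link row
            "molecule_id": molecule_uuid,
            "fragment_id": fragment_uuid,
            "order": position,            # 0 for the first fragment, 1 for the next
        }
        for position, fragment_uuid in enumerate(fragment_uuids)
    ]


# Placeholder identifiers stand in for real uuids.
links = build_links("mol-1", ["frag-a", "frag-b"])
for link in links:
    print(link["molecule_id"], link["fragment_id"], link["order"])
```

Writing these rows manually for every molecule is the kind of bookkeeping that a helper such as `add_molecule` spares you.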


## Filtering

All of this would be rather useless if we did not have an efficient way of filtering data. The PostgreSQL database makes this rather easy. By using `client.get_models()` you get the schema of the database and can filter on its tables as if they were Python objects:

In [12]:
models = client.get_models()

In [13]:
client.get('fragment', filters=[models.fragment.smiles == 'C#N'])

Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2019-06-24 12:50:08.580389,1,{},C#N,2019-06-24 12:50:08.580389,f89b3c7a-8350-48af-811c-4821448df1a5


You can also do more complex filtering, joining tables that are related:

In [14]:
client.get(
    ['molecule', 'molecule_fragment', 'fragment'],
    filters=[models.fragment.smiles == 'C#N',
             models.molecule_fragment.order == 0],
)

Unnamed: 0,created_on,id,properties,smiles,updated_on,uuid
0,2019-06-24 12:50:07.814670,4,{},CCCC#N,2019-06-24 12:50:07.814670,024c4524-d24b-497b-b01a-71f29cbd5e8f


In the example above, you only get the first table, because a pandas.DataFrame is not well suited to relational objects. Fortunately, you can also output a `sqlalchemy` object that incorporates this mapping.

In [15]:
molecule = client.get(['molecule', 'molecule_fragment', 'fragment'], dataframe=False)
print(molecule[0].smiles)
print(molecule[0].molecule_fragment_collection[0].order)
print(molecule[0].molecule_fragment_collection[0].fragment.smiles)

CCCC#N
0
C#N
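For readers who think in SQL, the joined filter used in this section corresponds roughly to a two-step join through the association table. The sketch below reproduces it with the standard-library `sqlite3` module on an in-memory database, so it runs without PostgreSQL. The simplified table layout, the short placeholder ids and the `molecules_with_fragment` function are assumptions made for illustration; the query the client actually generates may differ.

```python
import sqlite3

# In-memory database with a simplified version of the three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE molecule (uuid TEXT PRIMARY KEY, smiles TEXT);
    CREATE TABLE fragment (uuid TEXT PRIMARY KEY, smiles TEXT);
    CREATE TABLE molecule_fragment (
        molecule_id TEXT REFERENCES molecule(uuid),
        fragment_id TEXT REFERENCES fragment(uuid),
        "order" INTEGER
    );
    INSERT INTO molecule VALUES ('m1', 'CCCC#N');
    INSERT INTO fragment VALUES ('f1', 'C#N'), ('f2', 'CCC');
    INSERT INTO molecule_fragment VALUES ('m1', 'f1', 0), ('m1', 'f2', 1);
""")


def molecules_with_fragment(conn, fragment_smiles, order):
    """Return SMILES of molecules that have the given fragment at the given position."""
    query = """
        SELECT m.smiles
        FROM molecule AS m
        JOIN molecule_fragment AS mf ON mf.molecule_id = m.uuid
        JOIN fragment AS f ON f.uuid = mf.fragment_id
        WHERE f.smiles = ? AND mf."order" = ?
    """
    return [row[0] for row in conn.execute(query, (fragment_smiles, order))]


print(molecules_with_fragment(conn, "C#N", 0))  # ['CCCC#N']
print(molecules_with_fragment(conn, "CCC", 0))  # []
```

Only columns of the first table are selected, which mirrors why the dataframe result contains the `molecule` fields alone.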


## Rollback

The advantage of using event sourcing is that you can roll back to any point in time. Here is a quick example:

In [16]:
client.get('eventstore')

Unnamed: 0,data,event,id,timestamp,type,user_id,uuid
0,"{'id': 1, 'smiles': 'C1C=CC=C1'}",create,1,2019-06-24 12:50:04.141505,molecule,1,d12075e9-b918-4b33-8135-ae2c3b04337b
1,"{'id': 2, 'smiles': 'C#N'}",create,2,2019-06-24 12:50:04.215284,molecule,1,00f65f13-06fb-420e-b535-657d2a714f0a
2,"{'id': 3, 'smiles': 'CC'}",create,3,2019-06-24 12:50:05.663229,molecule,1,2d312c7e-b44d-4252-9110-673fc963f505
3,"{'id': 1, 'smiles': 'C1C=CC=C1', 'properties':...",update,4,2019-06-24 12:50:05.696646,molecule,1,d12075e9-b918-4b33-8135-ae2c3b04337b
4,"{'id': 2, 'smiles': 'C#N', 'properties': {}}",update,5,2019-06-24 12:50:05.696646,molecule,1,00f65f13-06fb-420e-b535-657d2a714f0a
5,"{'id': 3, 'smiles': 'CC', 'properties': {}}",update,6,2019-06-24 12:50:05.696646,molecule,1,2d312c7e-b44d-4252-9110-673fc963f505
6,,delete,7,2019-06-24 12:50:07.036020,molecule,1,d12075e9-b918-4b33-8135-ae2c3b04337b
7,,delete,8,2019-06-24 12:50:07.036020,molecule,1,00f65f13-06fb-420e-b535-657d2a714f0a
8,,delete,9,2019-06-24 12:50:07.036020,molecule,1,2d312c7e-b44d-4252-9110-673fc963f505
9,"{'id': 4, 'smiles': 'CCCC#N'}",create,10,2019-06-24 12:50:07.814670,molecule,1,024c4524-d24b-497b-b01a-71f29cbd5e8f


In [17]:
from datetime import datetime
client.rollback(datetime(1980, 4, 3))

In [18]:
client.get('eventstore').sort_values(by='id')

Unnamed: 0,data,event,id,timestamp,type,user_id,uuid
0,"{'id': 1, 'smiles': 'C1C=CC=C1'}",create,1,2019-06-24 12:50:04.141505,molecule,1,d12075e9-b918-4b33-8135-ae2c3b04337b
1,"{'id': 2, 'smiles': 'C#N'}",create,2,2019-06-24 12:50:04.215284,molecule,1,00f65f13-06fb-420e-b535-657d2a714f0a
2,"{'id': 3, 'smiles': 'CC'}",create,3,2019-06-24 12:50:05.663229,molecule,1,2d312c7e-b44d-4252-9110-673fc963f505
3,"{'id': 1, 'smiles': 'C1C=CC=C1', 'properties':...",update,4,2019-06-24 12:50:05.696646,molecule,1,d12075e9-b918-4b33-8135-ae2c3b04337b
4,"{'id': 2, 'smiles': 'C#N', 'properties': {}}",update,5,2019-06-24 12:50:05.696646,molecule,1,00f65f13-06fb-420e-b535-657d2a714f0a
5,"{'id': 3, 'smiles': 'CC', 'properties': {}}",update,6,2019-06-24 12:50:05.696646,molecule,1,2d312c7e-b44d-4252-9110-673fc963f505
6,,delete,7,2019-06-24 12:50:07.036020,molecule,1,d12075e9-b918-4b33-8135-ae2c3b04337b
7,,delete,8,2019-06-24 12:50:07.036020,molecule,1,00f65f13-06fb-420e-b535-657d2a714f0a
8,,delete,9,2019-06-24 12:50:07.036020,molecule,1,2d312c7e-b44d-4252-9110-673fc963f505
9,"{'id': 4, 'smiles': 'CCCC#N'}",create,10,2019-06-24 12:50:07.814670,molecule,1,024c4524-d24b-497b-b01a-71f29cbd5e8f
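The idea behind an event-sourced rollback can be sketched in plain Python: the state at a given time is whatever you obtain by replaying, in order, every event recorded up to that time. The code below is a generic illustration and not the client's implementation. `state_at` is a hypothetical function, and the three events are a simplified version of the `eventstore` rows above, with a placeholder uuid and timestamps rounded to the second.

```python
from datetime import datetime

# Simplified events modelled on the eventstore table (placeholder uuid "a").
events = [
    {"event": "create", "uuid": "a",
     "timestamp": datetime(2019, 6, 24, 12, 50, 4),
     "data": {"smiles": "C1C=CC=C1"}},
    {"event": "update", "uuid": "a",
     "timestamp": datetime(2019, 6, 24, 12, 50, 5),
     "data": {"smiles": "C1C=CC=C1", "properties": {"new_prop": "prop_value"}}},
    {"event": "delete", "uuid": "a",
     "timestamp": datetime(2019, 6, 24, 12, 50, 7),
     "data": None},
]


def state_at(events, cutoff):
    """Rebuild the {uuid: record} state by replaying events up to `cutoff`."""
    state = {}
    for ev in sorted(events, key=lambda e: e["timestamp"]):
        if ev["timestamp"] > cutoff:
            break  # later events are ignored
        if ev["event"] == "create":
            state[ev["uuid"]] = dict(ev["data"])
        elif ev["event"] == "update":
            state.setdefault(ev["uuid"], {}).update(ev["data"])
        elif ev["event"] == "delete":
            state.pop(ev["uuid"], None)
    return state


print(state_at(events, datetime(1980, 4, 3)))             # {}
print(state_at(events, datetime(2019, 6, 24, 12, 50, 6)))
```

A cutoff earlier than the first event, such as the 1980 date used in the rollback call, yields an empty state in this sketch, while a cutoff between the update and the delete yields the updated record.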
