# Test out Specification
We provide a specification object that describes the steps in our multi-fidelity search, as well as the models needed to calibrate the results of the lower-fidelity steps.

This notebook makes sure a specification parses correctly and that the utility operations work.

In [1]:
from moldesign.specify import MultiFidelitySearchSpecification
from moldesign.store.mongo import MoleculePropertyDB
import yaml



## Load the spec
Make sure it parses, then generate some derived properties.

In [2]:
with open('model-spec.yaml') as fp:
    spec = MultiFidelitySearchSpecification.parse_obj(yaml.safe_load(fp))
spec

MultiFidelitySearchSpecification(oxidation_state=<OxidationState.OXIDIZED: 'oxidized'>, target_level='nob-acn-smb-geom', model_levels=[ModelEnsemble(base_fidelity='smb-vacuum-vertical', model_type=<ModelType.SCHNET: 'schnet'>, model_pattern='../../ai-components/ip-multi-fidelity/ip-acn-nob-adia-smb/adiabatic/**/best_model', max_models=8, calibration=1.0, model_paths_=())], base_model=ModelEnsemble(base_fidelity=None, model_type=<ModelType.MPNN: 'mpnn'>, model_pattern='../../ai-components/ip-multi-fidelity/ip-acn-nob-adia-smb/vertical/**/best_model.h5', max_models=8, calibration=1, model_paths_=()))

Get the levels

In [3]:
spec.levels

['smb-vacuum-vertical', 'nob-acn-smb-geom']

Get the target property

In [4]:
spec.target_property

'oxidation_potential.nob-acn-smb-geom'
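The two derived properties above can be reproduced from the parsed fields alone. The following is a minimal sketch of one way they might be computed, not the actual `moldesign` implementation: `SpecSketch`, `ModelEnsembleSketch`, the `REDUCED` state, and the `reduction_potential` prefix are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class OxidationState(str, Enum):
    OXIDIZED = 'oxidized'
    REDUCED = 'reduced'  # Assumption: only 'oxidized' appears in the notebook output


@dataclass
class ModelEnsembleSketch:
    """Hypothetical stand-in for ModelEnsemble, keeping only the field used here."""
    base_fidelity: Optional[str]


@dataclass
class SpecSketch:
    """Hypothetical stand-in for MultiFidelitySearchSpecification."""
    oxidation_state: OxidationState
    target_level: str
    model_levels: List[ModelEnsembleSketch] = field(default_factory=list)

    @property
    def levels(self) -> List[str]:
        # Each calibration model starts from a lower fidelity; the target comes last
        return [m.base_fidelity for m in self.model_levels] + [self.target_level]

    @property
    def target_property(self) -> str:
        # Assumption: the property prefix follows from the oxidation state
        prefix = {
            OxidationState.OXIDIZED: 'oxidation_potential',
            OxidationState.REDUCED: 'reduction_potential',
        }[self.oxidation_state]
        return f'{prefix}.{self.target_level}'


sketch = SpecSketch(
    oxidation_state=OxidationState.OXIDIZED,
    target_level='nob-acn-smb-geom',
    model_levels=[ModelEnsembleSketch(base_fidelity='smb-vacuum-vertical')],
)
print(sketch.levels)
print(sketch.target_property)
```

With the values from the parsed specification, this sketch yields the same level list and target property name as the cells above.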

## Test gathering the training set
One key feature is that we can use the specification to draw the training sets needed for each level.

First, connect to MongoDB

In [5]:
db = MoleculePropertyDB.from_connection_info(port=27855)

Get the training set used for an MPNN that predicts the target property

In [6]:
base_training = spec.get_base_training_set(db)
print(f'Pulled {len(base_training)} molecules for the initial training set')

Pulled 3376 molecules for the initial training set


Get the training set for one of the calibration models

In [7]:
base_training = spec.get_calibration_training_set(0, db)
print(f'Pulled {len(base_training)} molecules for the initial training set')

Pulled 3331 molecules for the initial training set
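The calibration set is slightly smaller than the base set, which would be consistent with it requiring more properties per molecule. As a rough sketch of that idea, assuming (this is not confirmed by the notebook) that the base set needs only the target property while a calibration set needs both the lower-fidelity value and the target, the selection could look like the code below. The record layout and the helper names `has_property` and `select_records` are hypothetical, and the in-memory list stands in for the MongoDB collection.

```python
from typing import Dict, List


def has_property(record: Dict, name: str) -> bool:
    """Check whether a record holds a value for a 'group.level' property name."""
    group, level = name.split('.', 1)
    return level in record.get(group, {})


def select_records(records: List[Dict], required: List[str]) -> List[Dict]:
    """Keep only the records that have every required property."""
    return [r for r in records if all(has_property(r, p) for p in required)]


# Toy records; keys and values are made up for illustration
records = [
    {'key': 'a', 'oxidation_potential': {'smb-vacuum-vertical': 7.1, 'nob-acn-smb-geom': 5.3}},
    {'key': 'b', 'oxidation_potential': {'nob-acn-smb-geom': 4.9}},
    {'key': 'c', 'oxidation_potential': {'smb-vacuum-vertical': 6.8}},
]

target = 'oxidation_potential.nob-acn-smb-geom'
lower = 'oxidation_potential.smb-vacuum-vertical'

base_set = select_records(records, [target])
calibration_set = select_records(records, [lower, target])
print(f'{len(base_set)} base records, {len(calibration_set)} calibration records')
```

Under this assumption the calibration set is always a subset of the base set, so it can never be larger.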


## Load the models
Show that we can load the models for each level

In [8]:
print(f'Found {len(spec.base_model.model_paths)} models for the base level')

Found 8 models for the base level


In [10]:
next(spec.base_model.load_all_model_messages())

<moldesign.score.mpnn.MPNNMessage at 0x20791f6cd60>

Repeat the process for the next levels

In [11]:
for level in spec.model_levels:
    print(f'Found {len(level.model_paths)} models for {level.base_fidelity}')
    print(f'First model: {next(level.load_all_model_messages())}')

Found 8 models for smb-vacuum-vertical
First model: <moldesign.score.schnet.TorchMessage object at 0x00000207916AEF70>
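Both ensembles report 8 models, which matches `max_models=8` in the specification. A minimal sketch of how a recursive `model_pattern` could be resolved and capped is shown below; `find_model_paths` is a hypothetical helper, and sorting before truncating is an assumption made so the selection is deterministic. The directory layout is created in a temporary folder purely for the demonstration.

```python
import tempfile
from glob import glob
from pathlib import Path
from typing import List


def find_model_paths(pattern: str, max_models: int) -> List[str]:
    """Resolve a glob pattern ('**' matches nested directories) and keep at most max_models."""
    paths = sorted(glob(pattern, recursive=True))
    return paths[:max_models]


with tempfile.TemporaryDirectory() as tmp:
    # Create ten fake training runs, each with an empty model file
    for i in range(10):
        run_dir = Path(tmp) / 'vertical' / f'run-{i}'
        run_dir.mkdir(parents=True)
        (run_dir / 'best_model.h5').touch()

    pattern = str(Path(tmp) / 'vertical' / '**' / 'best_model.h5')
    found = find_model_paths(pattern, max_models=8)

print(f'Found {len(found)} models')
```

Here ten candidate files exist, but only eight paths are kept.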


## Preparing for Inference
Given a record, get the inputs needed for the next step.

Get the full record for a molecule that has not finished all levels of fidelity

In [12]:
record = db.get_eligible_molecule_records(['oxidation_potential.smb-vacuum-vertical'], [spec.target_property])[0]
record = db.get_molecule_record(record.key)

See the highest level completed so far

In [13]:
spec.get_current_step(record)

'smb-vacuum-vertical'
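Determining the current step amounts to walking the ordered list of levels and finding the last one the record already has a value for. The sketch below illustrates that logic with a plain dictionary; `current_step` and the record layout are hypothetical and are not taken from `moldesign`.

```python
from typing import Dict, List, Optional


def current_step(record: Dict, levels: List[str],
                 group: str = 'oxidation_potential') -> Optional[str]:
    """Return the highest level (by position in `levels`) present in the record."""
    completed = record.get(group, {})
    current = None
    for level in levels:  # Levels are ordered from lowest to highest fidelity
        if level in completed:
            current = level
    return current


levels = ['smb-vacuum-vertical', 'nob-acn-smb-geom']

# Toy record that has only the lowest-fidelity value (the number is made up)
partial_record = {'oxidation_potential': {'smb-vacuum-vertical': 7.1}}
print(current_step(partial_record, levels))
```

A record with no computed levels returns `None` in this sketch, which a caller could treat as "start from the base model".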

Get the inputs needed for the calibration model

In [14]:
spec.get_inference_inputs(record)

('smb-vacuum-vertical',
 '13\n0 1 InChI=1S/C3H5N3O2/c4-3-2(1-7)5-6-8-3/h7H,1,4H2_neutral\nN                     0.205762452359     1.536840246918    -0.805931613138\nC                     0.910389167616     0.679716572569    -0.046123943213\nC                     0.374419207421    -0.590489622807     0.334054354851\nC                    -0.914073356733    -1.148432544046    -0.153505813140\nO                    -1.948277494540    -0.081423366647    -0.238146307630\nN                     1.292642791341    -1.175704153662     1.131349865537\nN                     2.353921905696    -0.621180375622     1.391132538528\nO                     2.098005547990     0.929770676459     0.438067118103\nH                    -0.774756798334     1.285990794116    -0.965230323761\nH                     0.592326505748     2.433486523580    -1.068617752290\nH                    -1.235985934228    -1.985267824914     0.476944645771\nH                    -0.849792862226    -1.497562504084    -1.189250468111