In [None]:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = ''

## The dataset

Again, the job starts with loading the dataset.

Here we use the QM9 dataset ([doi:10.6084/m9.figshare.978904](https://figshare.com/collections/Quantum_chemistry_structures_and_properties_of_134_kilo_molecules/978904),
https://www.nature.com/articles/sdata201422),
a collection of quantum chemical calculations for about 134k small organic molecules. These molecules are a subset of the GDB-17 chemical universe of 166 billion organic molecules.


### A table of the properties

    Property  Unit         Description
    --------  -----------  --------------
    A         GHz          Rotational constant A
    B         GHz          Rotational constant B
    C         GHz          Rotational constant C
    mu        Debye        Dipole moment
    alpha     Bohr^3       Isotropic polarizability
    homo      Hartree      Energy of Highest occupied molecular orbital (HOMO)
    lumo      Hartree      Energy of Lowest unoccupied molecular orbital (LUMO)
    gap       Hartree      Gap, difference between LUMO and HOMO
    r2        Bohr^2       Electronic spatial extent
    zpve      Hartree      Zero point vibrational energy
    U0        Hartree      Internal energy at 0 K
    U         Hartree      Internal energy at 298.15 K
    H         Hartree      Enthalpy at 298.15 K
    G         Hartree      Free energy at 298.15 K
    Cv        cal/(mol K)  Heat capacity at 298.15 K
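
Most energies in the table are given in Hartree, while orbital energies and gaps are often quoted in eV or kcal/mol. The short sketch below converts a gap computed as LUMO minus HOMO, as described in the table. The orbital energies are made-up illustrative numbers, not values taken from QM9, and `gap_from_orbitals` is a hypothetical helper.

```python
# Conversion factors from Hartree (standard values, rounded).
HARTREE_TO_EV = 27.211386
HARTREE_TO_KCAL_PER_MOL = 627.5095


def gap_from_orbitals(homo, lumo):
    """Return the HOMO-LUMO gap (LUMO minus HOMO) in the unit of the inputs."""
    return lumo - homo


# Hypothetical orbital energies in Hartree, for illustration only.
homo, lumo = -0.25, 0.05

gap_hartree = gap_from_orbitals(homo, lumo)
gap_ev = gap_hartree * HARTREE_TO_EV
gap_kcal = gap_hartree * HARTREE_TO_KCAL_PER_MOL

print(f"gap = {gap_hartree:.3f} Ha = {gap_ev:.2f} eV = {gap_kcal:.1f} kcal/mol")
```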



In [9]:
import tensorflow as tf
from pinn.datasets.tfr import load_tfrecord
from helpers import *

dataset = load_tfrecord('qm9-train.tfr', format_dict=qm9_full)

For your convenience, we have already downloaded and converted the dataset.
You can inspect its format with the following cells.

In [10]:
show_dataset(dataset)


In [None]:
with tf.Session() as sess:
    print(sess.run(dataset.make_one_shot_iterator().get_next()))

## The model function

Next, define a model function as we did yesterday.

Here we also handle an extra mode, `EVAL`.
This lets the estimator evaluate the out-of-sample error
during training.
You will see the result in TensorBoard.

In [None]:
from pinn.networks import pinn_network

def model_fn(features, labels, mode, params):
    label = features['U0']  # Change to your desired output
    pred = pinn_network(features)

    if mode == tf.estimator.ModeKeys.TRAIN:
        ##### Complete this part by defining the training operation
        loss = ...      # TODO: loss between `pred` and `label`
        train_op = ...  # TODO: optimizer step that minimizes `loss`
        #####
        return tf.estimator.EstimatorSpec(
            mode, loss=loss, train_op=train_op)

    if mode == tf.estimator.ModeKeys.EVAL:
        ##### Also define the loss function and the metrics here
        loss = ...     # TODO: same loss as in training
        metrics = ...  # TODO: dict mapping a metric name to a tf.metrics result
        #####
        return tf.estimator.EstimatorSpec(
            mode, loss=loss, eval_metric_ops=metrics)

    if mode == tf.estimator.ModeKeys.PREDICT:
        predictions = pred
        return tf.estimator.EstimatorSpec(
            mode, predictions=predictions)
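
One common choice for the parts left to complete is a mean squared error loss, with the mean absolute error reported as an evaluation metric. The sketch below shows what these two quantities compute, in plain Python and on made-up numbers. In the model function itself, the corresponding TensorFlow 1.x calls would be `tf.losses.mean_squared_error`, an optimizer such as `tf.train.AdamOptimizer(...).minimize(loss, global_step=tf.train.get_global_step())`, and `tf.metrics.mean_absolute_error`. This is one possible completion, not necessarily the one the tutorial intends.

```python
def mean_squared_error(labels, preds):
    """Average of the squared differences; a typical regression loss."""
    return sum((p - y) ** 2 for y, p in zip(labels, preds)) / len(labels)


def mean_absolute_error(labels, preds):
    """Average of the absolute differences; easier to read as an error bar."""
    return sum(abs(p - y) for y, p in zip(labels, preds)) / len(labels)


# Made-up reference values and predictions (Hartree), for illustration only.
labels = [-40.5, -76.4, -115.7]
preds = [-40.0, -76.0, -116.0]

mse = mean_squared_error(labels, preds)
mae = mean_absolute_error(labels, preds)
print(f"MSE = {mse:.4f}, MAE = {mae:.4f}")
```

The MSE penalizes large errors more strongly, which is why it is often used as the loss, while the MAE has the same unit as the label and is convenient to monitor.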

## Optimization

Depending on what you are predicting, the model's performance might vary significantly.

For example, when training on potential energies, assigning
an "atomic dress" to each atom species will significantly boost the performance.

https://teoroo-cmc.github.io/PiNN_dev/notebooks/2_More_on_training.html#Atomic-dress
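
The idea behind an atomic dress can be sketched as a linear regression: fit one reference energy per element from the element counts of each molecule, then train the network on the much smaller residual. The code below is a toy illustration in plain Python with made-up compositions and energies. It is not PiNN's implementation, and `solve_linear` and `fit_atomic_dress` are hypothetical helpers.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_atomic_dress(counts, energies):
    """Least-squares fit of per-element energies via the normal equations."""
    n_elem = len(counts[0])
    ata = [[sum(row[i] * row[j] for row in counts) for j in range(n_elem)]
           for i in range(n_elem)]
    atb = [sum(row[i] * e for row, e in zip(counts, energies))
           for i in range(n_elem)]
    return solve_linear(ata, atb)


elems = ["H", "C", "O"]
# Element counts for CH4, H2O, CH3OH, C2H6, CO2.
counts = [[4, 1, 0], [2, 0, 1], [4, 1, 1], [6, 2, 0], [0, 1, 2]]

# Made-up energies: a per-element part plus a small "bonding" part.
toy_dress = [-0.5, -37.8, -75.0]
toy_residual = [-0.02, 0.01, 0.015, -0.01, 0.005]
energies = [sum(n * e for n, e in zip(row, toy_dress)) + r
            for row, r in zip(counts, toy_residual)]

dress = fit_atomic_dress(counts, energies)
residuals = [e - sum(n * d for n, d in zip(row, dress))
             for row, e in zip(counts, energies)]

print(dict(zip(elems, dress)))
print(residuals)
```

After subtracting the fitted dress, the targets are orders of magnitude smaller than the raw energies, which makes them easier for the network to fit.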

On the other hand, `pinn_network` sums all the atomic contributions when making predictions.
This might not be a good idea when predicting "homo" or "lumo",
so you'll need to tweak the network function to fit your needs.
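
To see why summation can be a poor fit for orbital energies, compare how a summed and an averaged prediction behave as the molecule grows. The sketch below uses made-up per-atom outputs. The `pool` function is hypothetical, and mean pooling is only one possible alternative, not a PiNN recommendation.

```python
def pool(atomic_values, mode="sum"):
    """Combine per-atom outputs into one molecular prediction."""
    if mode == "sum":
        return sum(atomic_values)
    if mode == "mean":
        return sum(atomic_values) / len(atomic_values)
    raise ValueError(f"unknown pooling mode: {mode}")


# Made-up per-atom outputs for a small and a four-times larger molecule.
small = [-0.25] * 5
large = [-0.25] * 20

# The sum grows with the number of atoms, as a total energy does.
print(pool(small, "sum"), pool(large, "sum"))

# The mean is independent of size, as an orbital energy roughly is.
print(pool(small, "mean"), pool(large, "mean"))
```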

At this point you might want to rewrite `pinn_network`.
PiNN tries to make this easier by making the constituent parts reusable:
layers are defined to perform tasks like constructing the neighbor lists
or performing pairwise operations.
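
As an illustration of what a neighbor-list step does conceptually, the sketch below lists all atom pairs closer than a cutoff radius. It is a brute-force version in plain Python for a single non-periodic structure with made-up coordinates. The `neighbor_list` function is hypothetical and is not how PiNN's layers are implemented.

```python
import math


def neighbor_list(coords, cutoff):
    """Return (i, j, distance) for every pair with i < j and distance < cutoff."""
    pairs = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.sqrt(sum((a - b) ** 2
                              for a, b in zip(coords[i], coords[j])))
            if d < cutoff:
                pairs.append((i, j, d))
    return pairs


# Three made-up atoms on a line at x = 0, 1 and 3.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
pairs = neighbor_list(coords, cutoff=1.5)
print(pairs)
```

Pairwise operations then act only on the listed pairs, so the cost of the later layers depends on the number of neighbors within the cutoff rather than on all possible pairs.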

Further reading:  
https://teoroo-cmc.github.io/PiNN_dev/concepts.html  
https://teoroo-cmc.github.io/PiNN_dev/networks/pinn.html  

You might also want to read the source code.

In [None]:
pinn_network??

You're welcome to further explore the optimal setup for a given task.
But now we'll move on to see some applications of atomic neural networks in action.