In [1]:
from keras.models import Model
from keras.layers import Input, Dense



### Important files:
1. `train_vae.py`: The code for training the VAE with/without property prediction.
2. `hyperparameters.py`: The main file, `train_vae.py`, reads its settings from this file, such as which loss functions are used, what their weights are, and a boolean for whether property prediction is done.
3. `models.py`: Contains model architecture (`encoder_model`, `decoder_model`, `property_predictor_model`).

### Input:
With or without property prediction, the input is the same: one-hot encoded, padded SMILES strings.
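
As a rough illustration of what "one-hot encoded, padded" means here, the sketch below pads a SMILES string with spaces to a fixed length and encodes each character as a one-hot row. The character set, the padding character, the maximum length, and the function name `one_hot_smiles` are all assumptions made for this example; they are not taken from `train_vae.py`.

```python
# Hypothetical character set and maximum length, chosen only for illustration.
CHARSET = ["C", "N", "O", "(", ")", "=", "1", " "]  # " " is the padding character
MAX_LEN = 8


def one_hot_smiles(smiles, charset=CHARSET, max_len=MAX_LEN):
    """Pad a SMILES string with spaces and one-hot encode each character."""
    if len(smiles) > max_len:
        raise ValueError("SMILES string is longer than max_len")
    padded = smiles.ljust(max_len)  # right-pad with spaces
    index = {ch: i for i, ch in enumerate(charset)}
    encoded = []
    for ch in padded:
        row = [0] * len(charset)
        row[index[ch]] = 1  # raises KeyError for characters outside the charset
        encoded.append(row)
    return encoded


x = one_hot_smiles("CC(=O)O")  # acetic acid, 7 characters plus 1 padding space
print(len(x), len(x[0]))  # rows = max_len, columns = charset size
```

Each molecule becomes a `max_len` by `len(charset)` matrix, so a batch of molecules forms a 3D array, which is the kind of shape `X_train` would have under these assumptions.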

### Output: 
`X_train`, `X_test`, `Y_train`, and `Y_test` are defined in the `vectorize_data` function in `train_vae.py`. Without property prediction, this function returns only `X_train` and `X_test`, and `Y_train = []`.

With property prediction, the property data for `X_train` is appended to `Y_train`:
```python
Y_train = []
Y_train.append(Y_reg_train)  # regression property values for the training set
```
The resulting training targets are stored in `model_train_targets` (in `train_vae.py`):
```python
# Targets are keyed by output name.
model_train_targets = {'x_pred': X_train, 'z_mean_log_var': np.ones((np.shape(X_train)[0], params['hidden_dim'] * 2))}
# Added only when property prediction is enabled.
model_train_targets['reg_prop_pred'] = Y_train[0]
```
Note that even without property prediction, the model trains on two targets: `'x_pred'` and `'z_mean_log_var'`.
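
The `'z_mean_log_var'` target is an array of ones, which suggests it is only a placeholder: a KL divergence loss can be computed from the predicted mean and log-variance alone. The sketch below shows this idea in plain Python. It assumes that the output concatenates the mean and the log-variance (hence `hidden_dim * 2` columns) and that the loss ignores its first argument; the function name `kl_loss_sketch` is hypothetical, and this is not the `kl_loss` defined in the repository.

```python
import math


def kl_loss_sketch(y_true_dummy, z_mean_log_var):
    """KL divergence between N(mean, exp(log_var)) and N(0, 1) for one sample.

    y_true_dummy is ignored: it exists only because a Keras-style loss
    takes (y_true, y_pred). The result is averaged over latent dimensions.
    """
    half = len(z_mean_log_var) // 2
    z_mean = z_mean_log_var[:half]      # assumed layout: first half is the mean
    z_log_var = z_mean_log_var[half:]   # assumed layout: second half is the log-variance
    terms = [1 + lv - m ** 2 - math.exp(lv) for m, lv in zip(z_mean, z_log_var)]
    return -0.5 * sum(terms) / half


# hidden_dim = 2, so the output has 4 values; the dummy target is all ones.
dummy = [1.0, 1.0, 1.0, 1.0]
at_prior = kl_loss_sketch(dummy, [0.0, 0.0, 0.0, 0.0])      # mean 0, variance 1
shifted = kl_loss_sketch(dummy, [1.0, 0.0, 0.0, 0.0])       # one mean moved to 1
print(at_prior, shifted)
```

Under these assumptions the loss is zero when the encoder output matches the standard normal prior and grows as the mean or variance moves away from it, regardless of what the dummy target contains.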

### Losses:

#### Loss measures:
With or without property prediction, the reconstruction loss and the KL divergence are always included. There is one loss per target.

In `train_vae.py` -> `main_property_run`:

```python
model_losses = {'x_pred': params['loss'], 'z_mean_log_var': kl_loss}
```
With property prediction, a property prediction loss is also included. According to `hyperparameters.py`, this loss is `'mse'`.

```python
model_losses['reg_prop_pred'] = params['reg_prop_pred_loss']
```



#### Loss weights:
Each loss has an associated weight. The total loss is the weighted sum of the individual losses, i.e. the sum of weight * loss over all targets.

In `hyperparameters.py`:

```python
'xent_loss_weight': 1.0,
'kl_loss_weight': 1.0,
'prop_pred_loss_weight': 0.5,
```
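
To make the weighted sum concrete, here is a small sketch that combines per-target losses using the weights listed above. The mapping from hyperparameter names to output names (for example `'xent_loss_weight'` to `'x_pred'`) is an assumption based on the names, the per-target loss values are made up, and `total_loss` is a hypothetical helper rather than code from the repository.

```python
# Weights copied from the hyperparameters above; the mapping to output names is assumed.
loss_weights = {
    "x_pred": 1.0,          # 'xent_loss_weight'
    "z_mean_log_var": 1.0,  # 'kl_loss_weight'
    "reg_prop_pred": 0.5,   # 'prop_pred_loss_weight'
}

# Made-up per-target loss values for a single batch.
losses = {"x_pred": 2.0, "z_mean_log_var": 0.4, "reg_prop_pred": 1.2}


def total_loss(losses, loss_weights):
    """Return the sum of weight * loss over all targets."""
    return sum(loss_weights[name] * value for name, value in losses.items())


total = total_loss(losses, loss_weights)
print(total)  # 1.0 * 2.0 + 1.0 * 0.4 + 0.5 * 1.2
```

Without property prediction, the `'reg_prop_pred'` entries would simply be absent from both dictionaries, and the total would contain only the first two terms.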

### Architecture:
In `train_vae.py` -> `load_models`:

As noted above, there are three targets when property prediction (regression only) is enabled:
```python
model_outputs = [x_out, z_mean_log_var_output]
model_outputs.append(reg_prop_pred)  # only with property prediction
AE_PP_model = Model(x_in, model_outputs)
```


The model is compiled and trained in `train_vae.py` -> `main_property_run`:

```python

# X and Y come from the vectorize_data function.
X_train, X_test, Y_train, Y_test = vectorize_data(params)

# As described in the Output section above, there are three targets: 1. reconstruction, 2. latent space, 3. property.
model_train_targets = {'x_pred':X_train, 'z_mean_log_var':np.ones((np.shape(X_train)[0], params['hidden_dim'] * 2))}
model_train_targets['reg_prop_pred'] = Y_train[0]


# The model is compiled with the three losses.
AE_PP_model.compile(loss=model_losses,
                    loss_weights=model_loss_weights,
                    optimizer=optim,
                    metrics={'x_pred': ['categorical_accuracy',
                                        vae_anneal_metric]})

# The model is fit on X_train (one-hot encoded, padded SMILES). Unlike a typical single-output Keras model, the targets are
# 'x_pred' and 'z_mean_log_var' without property prediction, plus 'reg_prop_pred' with property prediction.
AE_PP_model.fit(X_train, model_train_targets,
                batch_size=params['batch_size'],
                epochs=params['epochs'],
                initial_epoch=params['prev_epochs'],
                callbacks=callbacks,
                verbose=keras_verbose,
                validation_data=(X_test, model_test_targets))

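# Each sub-model is saved to its own file. Presumably this captures the trained weights because
# AE_PP_model is built from these sub-models and shares their layers (not verified here).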
encoder.save(params['encoder_weights_file'])
decoder.save(params['decoder_weights_file'])
property_predictor.save(params['prop_pred_weights_file'])

```
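
When targets, losses, and loss weights are passed to Keras as dictionaries, the keys are matched against the model's output names, so all three dictionaries need the same keys. The sketch below is a small consistency check in plain Python. The helpers `output_names` and `check_keys` are hypothetical, the dictionary values are placeholders, and the flag name `do_prop_pred` is an assumption standing in for the boolean in `hyperparameters.py`.

```python
def output_names(do_prop_pred):
    """Return the output names used as dictionary keys in the walkthrough above."""
    names = ["x_pred", "z_mean_log_var"]
    if do_prop_pred:
        names.append("reg_prop_pred")
    return names


def check_keys(names, **dicts):
    """Raise ValueError if any dictionary's keys differ from the output names."""
    for label, d in dicts.items():
        if set(d) != set(names):
            raise ValueError(
                f"{label} keys {sorted(d)} do not match outputs {sorted(names)}"
            )


# Placeholder values; only the keys matter for this check.
names = output_names(do_prop_pred=True)
targets = {"x_pred": "X_train", "z_mean_log_var": "ones", "reg_prop_pred": "Y_train[0]"}
losses = {"x_pred": "params['loss']", "z_mean_log_var": "kl_loss", "reg_prop_pred": "mse"}
weights = {"x_pred": 1.0, "z_mean_log_var": 1.0, "reg_prop_pred": 0.5}

check_keys(names, targets=targets, losses=losses, weights=weights)
print("all dictionaries match:", names)
```

A check like this could run just before `compile` and `fit` to catch a missing `'reg_prop_pred'` entry early, instead of relying on the error raised by Keras.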