# Inverse design with Keras Neural Networks

### Level: Intermediate

In this notebook, we show how to perform the inverse design of the inputs of an ANN, i.e., how to find the input parameters that make the network produce a specific output.

First, we create a basic neural network with optimized hyperparameters [1,2]. Then, we address the minimization of the mismatch between the desired output and the ANN output as a function of the ANN inputs.

References and additional documentation at the end of this notebook.

#### Dependencies

In [None]:
import keras
import keras_tuner as kt
import numpy as np
import pandas as pd
from scipy.optimize import minimize
from sklearn.model_selection import train_test_split

In [2]:
# fix random seeds
seed = 0
np_rng = np.random.default_rng(seed)  # [3]
keras.utils.set_random_seed(seed)

#### Data

Load the data and split it into train and test sets.

In [3]:
try:
    X = pd.read_parquet("features.parquet")
    y = pd.read_parquet("targets.parquet")
except FileNotFoundError:
    # For didactic purposes, we define a set of data with binary representation
    # of 0 to 3 as input and the decimal representation as output.
    X = pd.DataFrame(
        [
            [0, 0],
            [0, 1],
            [1, 0],
            [1, 0],
            [0, 1],
            [0, 1],
            [1, 1],
            [1, 0],
            [1, 1],
            [0, 0],
            [0, 0],
        ]
        * 4
    )
    y = pd.Series([0, 1, 2, 2, 1, 1, 3, 2, 3, 0, 0] * 4)


# we set a fixed random state for reproducibility and teaching purposes,
# but our results have to be consistent across multiple seeds to be relevant
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.9, test_size=0.1, random_state=seed
)

# shape of input features and number of output units; a scalar target has an
# empty shape, so we fall back to a single output unit
features_shape = X_train.iloc[0].shape
target_shape = y_train.iloc[0].shape[0] if y_train.iloc[0].shape else 1

#### Some parameters

In [4]:
# model optimizer
loss = "mae"
lr = {"min_value": 1e-4, "max_value": 1e-2}  # values from 0.0001 to 0.01
metrics = ["accuracy"]

# training
epochs = 200
batch_size = 64

# hp tuner
max_trials = 50

# early stopping
monitor = "val_loss"
patience = int(0.1 * epochs)

## Neural Network model

Hyperparameter optimization and NN model training.

The number of epochs is not tuned, since early stopping is considered sufficient to select it.
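
The patience rule behind early stopping can be sketched in plain Python. This is a simplified illustration and not the Keras implementation: `early_stopping_epoch` and the `history` values are hypothetical, and options such as `min_delta` are ignored.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the index of the epoch at which training would stop.

    Hypothetical helper: stop once the monitored value has not improved
    for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, val_loss in enumerate(val_losses):
        if val_loss < best:
            # improvement: remember it and reset the counter
            best = val_loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    # patience never exhausted: training runs through the last epoch
    return len(val_losses) - 1


# made-up validation losses, for illustration only
history = [1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.69]
stop = early_stopping_epoch(history, patience=3)
```

With these made-up values the loop stops before the last epoch, so the later improvement to 0.69 is never seen. This is the trade-off that the `patience` setting controls.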

In [5]:
def model_builder(hp):
    """Build a neural network model."""
    input_layer = keras.Input(shape=features_shape)
    inner_layer_1 = keras.layers.Dense(64, activation="selu")(input_layer)
    inner_layer_2 = keras.layers.Dense(32, activation="selu")(inner_layer_1)
    inner_layer_3 = keras.layers.Dense(16, activation="selu")(inner_layer_2)
    output_layer = keras.layers.Dense(target_shape)(inner_layer_3)

    model = keras.Model(inputs=input_layer, outputs=output_layer, name="NN_model")

    # Tune the learning rate for the optimizer, choose an optimal value [2]
    hp_learning_rate = hp.Float(
        "learning_rate", min_value=lr["min_value"], max_value=lr["max_value"]
    )

    model.compile(
        loss=loss,
        optimizer=keras.optimizers.Nadam(learning_rate=hp_learning_rate),
        metrics=metrics,
    )

    return model

In [None]:
# we set a fixed random state for reproducibility and teaching purposes
tuner = kt.GridSearch(
    hypermodel=model_builder,
    objective="val_loss",  # tune on the validation loss, as early stopping does
    max_trials=max_trials,
    tune_new_entries=True,
    allow_new_entries=True,
    seed=seed,
    project_name="KerasTuner",
    # executions_per_trial = 10
)

# set an early stopping
callbacks = [keras.callbacks.EarlyStopping(monitor=monitor, patience=patience)]

# finally, tune the hyperparameters using 10 percent of the train data for
# validation
tuner.search(
    X_train,
    y_train,
    batch_size=batch_size,
    epochs=epochs,
    validation_split=0.1,
    callbacks=callbacks,
)

# get the optimal hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]

# build the model with the optimal hyperparameters
model = tuner.hypermodel.build(best_hps)

# train the model again
model.fit(
    X_train,
    y_train,
    batch_size=batch_size,
    epochs=epochs,
    validation_split=0.1,
    callbacks=callbacks,
)

# save the model
model.save("../models/model__Inverse_design_with_NN_Keras.keras")

#### Alternatively, load a saved model

Load saved model and use it for the inverse design.

In [7]:
model = keras.saving.load_model("../models/model__Inverse_design_with_NN_Keras.keras")

## Inverse design

First, define the loss that measures the difference between our target and the output of the network as a function of the network inputs, i.e., define the objective function to be minimized.

We select the Mean Squared Error.
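
For a single scalar output, this objective can be written compactly. The symbols below are our own notation and do not appear in the code: $\mathbf{x}$ is the network input, $f_{\theta}$ is the trained network with fixed weights $\theta$, and $t$ is the target.

$$
\mathcal{L}(\mathbf{x}) = \left(t - f_{\theta}(\mathbf{x})\right)^{2}, \qquad \mathbf{x}^{*} = \arg\min_{\mathbf{x}} \mathcal{L}(\mathbf{x})
$$

Only the input $\mathbf{x}$ is varied during this minimization; the weights $\theta$ stay fixed.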

In [8]:
def mse_loss(NN_input: np.ndarray, model: keras.Model, target: float) -> float:
    """Obtain the Mean Squared Error between an ANN output and a given target.

    Intended for a single NN input, not a batch of inputs.
    """
    if len(NN_input.shape) > 2:
        raise ValueError("Function intended just for a single NN input. Terminated.")

    # add an additional leading dimension to `NN_input` for the batch size,
    # which is just one
    NN_input_batched = np.expand_dims(NN_input, axis=0)

    # obtain the NN output using the model
    NN_output = model(NN_input_batched)

    # reduce to a plain Python float, since `scipy.optimize.minimize` expects
    # a scalar objective value
    return float(np.mean((target - np.asarray(NN_output)) ** 2))

Then, specify the bounds of each input (used by bound-constrained solvers [4]) and store several random initializations for the initial guess of the best inputs, which increases the chance of reaching the global minimum.

In [9]:
n_initializations = 5
int_features_shape = features_shape[0]  # indexing specific of this example

# set wide bounds to stress test our application
# common bound for all inputs of the ANN
lower_bound = 0
higher_bound = 1
input_bounds = [[lower_bound, higher_bound]] * int_features_shape

# random initializations of input parameters
init_guesses = np_rng.uniform(
    low=lower_bound, high=higher_bound, size=(n_initializations, int_features_shape)
)

Next, we select `3-point` as the finite-difference scheme used to estimate the gradient [4,5].
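
A three-point (central) difference estimate of a gradient can be sketched as follows. This is a minimal illustration and not SciPy's implementation, which also chooses the step size and handles bounds: `central_difference_gradient` is a hypothetical helper, and `toy_objective` is an assumed stand-in for the network, using the binary-to-decimal mapping `2 * x[0] + x[1]` with target 2.

```python
def central_difference_gradient(func, x, step=1e-6):
    """Estimate the gradient of `func` at `x` with central differences."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += step
        backward[i] -= step
        # (f(x + h) - f(x - h)) / (2 h) for the i-th coordinate
        grad.append((func(forward) - func(backward)) / (2 * step))
    return grad


def toy_objective(x):
    """Squared error between target 2 and an assumed stand-in 'network'."""
    return (2.0 - (2.0 * x[0] + x[1])) ** 2


grad = central_difference_gradient(toy_objective, [0.5, 0.5])
```

For this quadratic toy objective, the analytic gradient at `[0.5, 0.5]` is `[-2, -1]`, which the estimate reproduces up to rounding error.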

Finally, we select the target to retrieve and use SciPy minimization to find the optimal input from each initialization.

In [None]:
# select the target as the desired output decimal number for this example
# we select something to test the retrieval of the closest input
target = 2
expected_opt = [1, 0]

retrieved_inputs = []
for init_guess in init_guesses:
    # `minimize` can be tuned through its `method` input; some methods accept
    # or require `bounds`, a sequence of (min, max) pairs for each element
    # in x. With `bounds` given and no `method`, SciPy defaults to L-BFGS-B.
    solver_output = minimize(
        mse_loss,
        init_guess,
        args=(model, target),
        bounds=input_bounds,
        jac="3-point",
    )
    print(f"\nSolver optimization success: {solver_output.success}")
    if not solver_output.success:
        print(f"Cause of termination: {solver_output.message}")

    # analyze obtained optimized inputs
    print(
        f"\nGuess: {init_guess} -> {np.round(init_guess, 0)}",
        f"\nExpected: {expected_opt}"
        f"\nObtained: {solver_output.x} -> {np.round(solver_output.x, 0)}",
    )
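
Once every initialization has been optimized, one possible way to choose among the runs is to keep the one with the lowest final loss. The sketch below illustrates this under stated assumptions: `toy_loss` is a hypothetical stand-in for the trained network (the binary-to-decimal mapping `2 * x[0] + x[1]`), so it runs without a saved model.

```python
import numpy as np
from scipy.optimize import minimize


def toy_loss(x, target):
    """Squared error for an assumed stand-in 'network' 2 * x[0] + x[1]."""
    prediction = 2.0 * x[0] + x[1]
    return float((target - prediction) ** 2)


rng = np.random.default_rng(0)
bounds = [(0.0, 1.0)] * 2
starts = rng.uniform(0.0, 1.0, size=(5, 2))

# one bounded minimization per random start
results = [
    minimize(toy_loss, start, args=(2.0,), bounds=bounds, jac="3-point")
    for start in starts
]

# keep the run with the lowest final objective value
best = min(results, key=lambda res: res.fun)
print(f"Best input: {best.x}, loss: {best.fun}")
```

In this toy problem, every input in the unit square that satisfies `2 * x[0] + x[1] = 2` has zero loss, so different starts may end at different inputs that are equally good. A trained network can show the same non-uniqueness, which is worth checking before trusting a single retrieved input.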

#### References

.. [1] In notebook folder, [`Training_NN_Keras.ipynb`](https://github.com/Eva-ortiz/ML_hodgepodge/blob/main/notebook/Training_NN_Keras.ipynb)

.. [2] In notebook folder, [`Hp_optimization_NN_Keras.ipynb`](https://github.com/Eva-ortiz/ML_hodgepodge/blob/main/notebook/Training_NN_Keras.ipynb)

.. [3] https://numpy.org/doc/stable/reference/random/index.html 

.. [4] https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html 

.. [5] https://www.vaia.com/en-us/textbooks/math/numerical-analysis-9-edition/chapter-4/problem-6-use-the-most-accurate-three-point-formula-to-deter/ 