Helmholtz equation over a 2D square domain: Hyper-parameter optimization

Finding proper hyper-parameters for PINNs is a common issue for practitioners. To address this concern, we apply hyper-parameter optimization (HPO) via Gaussian process (GP)-based Bayesian optimization.

This example is taken from: Hyper-parameter tuning of physics-informed neural networks: Application to Helmholtz problems, [Neurocomputing].

Note that this script can be easily adapted to other examples (either forward or inverse problems).

More scripts are available in the HPOMax GitHub repository.

Problem setup

We consider the same setting as in Helmholtz equation over a 2D square domain.

We apply GP-based Bayesian optimization via scikit-optimize (see the documentation) over 50 calls. We use Expected Improvement as the acquisition function and define the minimum test error reached during each call as the (outer) loss function for the HPO.

We optimize the following hyper-parameters:

  • Learning rate \alpha;
  • Depth L - 1: number of dense layers;
  • Width N: number of nodes per layer;
  • Activation function \sigma.

We define every configuration as:

\lambda := [\alpha, L - 1, N, \sigma]

and start with the initial setting \lambda_0 := [1e-3, 4, 50, \sin], i.e., 4 dense layers of 50 nodes each, trained with learning rate 1e-3 and the sine activation.

Implementation

We highlight the most important parts of the code. At each HPO iteration, a model is built from the current configuration and then trained. Therefore, we define:

def create_model(config):
    # Define the model
    return model
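
For concreteness, a minimal DeepXDE-based sketch of such a function is given below. The wave number, geometry, numbers of training points, and exact solution used for the test metric are illustrative assumptions; the authoritative implementation is the complete code at the end of this page.

import deepxde as dde
import numpy as np


def create_model(config):
    # Unpack the configuration lambda = [alpha, L - 1, N, sigma]
    learning_rate, num_dense_layers, num_dense_nodes, activation = config

    k0 = 2 * np.pi * 2  # assumed wave number

    def pde(x, y):
        # Helmholtz residual: -u_xx - u_yy - k0^2 u - f = 0
        dy_xx = dde.grad.hessian(y, x, i=0, j=0)
        dy_yy = dde.grad.hessian(y, x, i=1, j=1)
        f = k0 ** 2 * dde.backend.sin(k0 * x[:, 0:1]) * dde.backend.sin(k0 * x[:, 1:2])
        return -dy_xx - dy_yy - k0 ** 2 * y - f

    def solution(x):
        # Exact solution, used only to compute the test metric
        return np.sin(k0 * x[:, 0:1]) * np.sin(k0 * x[:, 1:2])

    geom = dde.geometry.Rectangle([0, 0], [1, 1])
    bc = dde.icbc.DirichletBC(geom, lambda x: 0, lambda _, on_boundary: on_boundary)
    data = dde.data.PDE(
        geom, pde, bc, num_domain=1024, num_boundary=128, solution=solution, num_test=2048
    )

    net = dde.nn.FNN(
        [2] + [num_dense_nodes] * num_dense_layers + [1], activation, "Glorot uniform"
    )
    model = dde.Model(data, net)
    model.compile("adam", lr=learning_rate, metrics=["l2 relative error"])
    return model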

The create_model function builds and compiles the network for a given configuration \lambda. Next, we define:

def train_model(model, config):
    # Train the model
    # Define the metric we seek to optimize
    return error
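
Again as a hedged sketch, the training budget and the exact definition of the returned error below are assumptions (the complete script is authoritative); here we take the smallest test metric observed during training:

def train_model(model, config):
    # Fixed training budget per HPO call (illustrative value)
    losshistory, train_state = model.train(iterations=5000)

    # Outer HPO loss: the smallest test metric observed during training
    error = np.array(losshistory.metrics_test).min()
    return error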

The train_model function returns the HPO loss for a given configuration; in our case, we seek to minimize the best test error reached during training. We are now ready to define the search space and the default parameters:

from skopt.space import Categorical, Integer, Real

dim_learning_rate = Real(low=1e-4, high=5e-2, name="learning_rate", prior="log-uniform")
dim_num_dense_layers = Integer(low=1, high=10, name="num_dense_layers")
dim_num_dense_nodes = Integer(low=5, high=500, name="num_dense_nodes")
dim_activation = Categorical(categories=["sin", "sigmoid", "tanh"], name="activation")

dimensions = [
    dim_learning_rate,
    dim_num_dense_layers,
    dim_num_dense_nodes,
    dim_activation,
]

default_parameters = [1e-3, 4, 50, "sin"]

Next, we define the fitness function, which is the objective passed to gp_minimize. The use_named_args decorator unpacks each candidate point into named arguments matching the dimensions defined above:

import numpy as np
from skopt.utils import use_named_args


@use_named_args(dimensions=dimensions)
def fitness(learning_rate, num_dense_layers, num_dense_nodes, activation):
    global ITERATION

    config = [learning_rate, num_dense_layers, num_dense_nodes, activation]

    # Print the hyper-parameters of the current call.
    print("Iteration:", ITERATION)
    print("learning rate: {0:.1e}".format(learning_rate))
    print("num_dense_layers:", num_dense_layers)
    print("num_dense_nodes:", num_dense_nodes)
    print("activation:", activation)
    print()

    # Create the neural network with these hyper-parameters and train it.
    model = create_model(config)
    error = train_model(model, config)

    # Guard against diverged runs: replace NaN by a large penalty value.
    if np.isnan(error):
        error = 10**5

    ITERATION += 1
    return error

The test error can yield NaN values, for instance when training diverges. We replace such values with a large penalty value of 10**5 so that the optimizer steers away from these configurations. Finally, we apply the GP-based HPO and plot the convergence results:

from skopt import gp_minimize
from skopt.plots import plot_convergence, plot_objective

n_calls = 50  # number of HPO calls (see the problem setup above)
ITERATION = 0

search_result = gp_minimize(
    func=fitness,
    dimensions=dimensions,
    acq_func="EI",  # Expected Improvement
    n_calls=n_calls,
    x0=default_parameters,
    random_state=1234,
)

# Best hyper-parameters found by the search
print(search_result.x)

plot_convergence(search_result)
plot_objective(search_result, show_points=True, size=3.8)
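
Beyond the best point, the OptimizeResult returned by gp_minimize also records the whole search history, which is convenient for a quick inspection of the run (a small usage sketch relying only on standard scikit-optimize attributes):

# Objective value (minimum test error) reached by the best configuration
print("Best error:", search_result.fun)

# All evaluated configurations and their objective values
for params, value in zip(search_result.x_iters, search_result.func_vals):
    print(params, "->", value)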

Complete code

.. literalinclude:: ../../../examples/pinn_forward/Helmholtz_Dirichlet_2d_HPO.py
  :language: python