# Example of an Optimization of a Neural Topic Model

In this notebook, we illustrate how to optimize the hyper-parameters of the ETM (Embedded Topic Model) model through Bayesian Optimization.

In [None]:
# Colab configuration

In [None]:
!pip install octis
from pathlib import Path
Path("/content/20NewsGroup").mkdir(parents=True, exist_ok=True)
!wget -P /content/20NewsGroup/ https://raw.githubusercontent.com/MIND-Lab/OCTIS/master/octis/preprocessed_datasets/20NewsGroup/corpus.txt
!wget -P /content/20NewsGroup/ https://raw.githubusercontent.com/MIND-Lab/OCTIS/master/octis/preprocessed_datasets/20NewsGroup/labels.txt
!wget -P /content/20NewsGroup/ https://raw.githubusercontent.com/MIND-Lab/OCTIS/master/octis/preprocessed_datasets/20NewsGroup/metadata.json
!wget -P /content/20NewsGroup/ https://raw.githubusercontent.com/MIND-Lab/OCTIS/master/octis/preprocessed_datasets/20NewsGroup/vocabulary.txt
import sys
sys.path.insert(0,'/content/OCTIS')
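
Before moving on, it can help to confirm that the four downloads actually landed in the target folder, since a failed `wget` would otherwise only surface later, when the dataset is loaded. The following is a minimal sketch using only the standard library; the helper name `missing_files` is hypothetical and not part of OCTIS.

```python
from pathlib import Path

# File names taken from the wget commands above.
EXPECTED_FILES = ["corpus.txt", "labels.txt", "metadata.json", "vocabulary.txt"]

def missing_files(folder, expected=EXPECTED_FILES):
    """Return the expected file names that are absent or empty in `folder`."""
    folder = Path(folder)
    return [
        name for name in expected
        if not (folder / name).is_file() or (folder / name).stat().st_size == 0
    ]

missing = missing_files("/content/20NewsGroup")
print("Missing files:", missing if missing else "none")
```

An empty list means every file is present and non-empty; anything else is worth re-downloading before continuing.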

Load the libraries.

In [None]:
from octis.models.ETM import ETM
from octis.dataset.dataset import Dataset
from octis.optimization.optimizer import Optimizer
from skopt.space.space import Real, Categorical
from octis.evaluation_metrics.coherence_metrics import Coherence

Choose a dataset.

In [None]:
dataset = Dataset()
dataset.load("/content/20NewsGroup")

Choose a model.

In [None]:
model = ETM(num_topics=25, num_epochs=30, bow_norm=1)

Choose the metric function to optimize.

In [None]:
metric_parameters = {
    'texts': dataset.get_corpus(),
    'topk': 10,
    'measure': 'c_npmi'
}
npmi = Coherence(metric_parameters)
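
For intuition about what an NPMI-based coherence score rewards, the sketch below computes normalized pointwise mutual information for word pairs from document-level co-occurrence counts. It is a simplification for illustration and not the OCTIS implementation: coherence measures of this family typically estimate probabilities over sliding windows and apply smoothing. The toy corpus and the function names are made up.

```python
import math
from itertools import combinations

def npmi(word_a, word_b, documents):
    """NPMI of two words, with probabilities estimated as document frequencies."""
    n_docs = len(documents)
    p_a = sum(word_a in doc for doc in documents) / n_docs
    p_b = sum(word_b in doc for doc in documents) / n_docs
    p_ab = sum(word_a in doc and word_b in doc for doc in documents) / n_docs
    if p_ab == 0:
        return -1.0  # the words never co-occur: lowest possible score
    if p_ab == 1:
        return 1.0   # the words co-occur in every document
    return math.log(p_ab / (p_a * p_b)) / -math.log(p_ab)

def topic_npmi(top_words, documents):
    """Average NPMI over all pairs of a topic's top words."""
    pairs = list(combinations(top_words, 2))
    return sum(npmi(a, b, documents) for a, b in pairs) / len(pairs)

# Toy corpus: each document is reduced to its set of distinct words.
toy_corpus = [
    {"space", "nasa", "orbit"},
    {"space", "nasa", "launch"},
    {"hockey", "team", "game"},
    {"hockey", "game", "season"},
]
print(topic_npmi(["space", "nasa"], toy_corpus))    # words that always appear together
print(topic_npmi(["space", "hockey"], toy_corpus))  # words that never appear together
```

Scores lie in the range [-1, 1]: topics whose top words tend to occur in the same documents score higher, which is why maximizing this metric favors more interpretable topics.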

Create the search space for optimization.

In [None]:
# Lists (rather than sets) keep the order of the categories fixed across runs.
search_space = {
    "num_layers": Categorical([1, 2, 3, 4]),
    "num_neurons": Categorical([100, 200, 300, 400]),
    "activation": Categorical(['sigmoid', 'relu', 'softplus', 'rrelu'])
}
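
Because all three dimensions are categorical with four values each, the space is small enough to enumerate. The sketch below uses only the standard library and mirrors the candidate values in plain lists (the `candidates` dictionary is illustrative and is not passed to OCTIS); it counts the configurations an exhaustive grid search would have to train.

```python
from itertools import product

# Same candidate values as the search space above, as plain lists.
candidates = {
    "num_layers": [1, 2, 3, 4],
    "num_neurons": [100, 200, 300, 400],
    "activation": ["sigmoid", "relu", "softplus", "rrelu"],
}

# Every combination of one value per hyper-parameter.
configurations = [
    dict(zip(candidates, values)) for values in product(*candidates.values())
]
print(len(configurations))  # 4 * 4 * 4 = 64
print(configurations[0])
```

With 64 combinations, any smaller evaluation budget covers only part of the grid, which is the situation where a guided search pays off.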

Select the path where the results (JSON file) will be saved.

In [None]:
save_path = 'results/test_etm/'

Select the number of optimization iterations and the number of model runs per iteration.

In [None]:
number_of_call = 20  # number of optimization iterations
model_runs = 5       # number of model runs for each iteration

Launch the optimization.

In [None]:
optimizer = Optimizer()
optimization_result = optimizer.optimize(model, dataset, npmi, search_space,
                                         number_of_call=number_of_call,
                                         model_runs=model_runs,
                                         save_path=save_path)
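
Conceptually, each call evaluates one configuration by training the model several times and summarizing the resulting scores. The sketch below mimics that loop in plain Python under loud assumptions: `fake_train_and_score` is a made-up stand-in for training ETM and computing NPMI, configurations are drawn at random rather than proposed by a surrogate model as in Bayesian Optimization, and the median is assumed as the summary over runs. It is not a description of the internals of `Optimizer.optimize`.

```python
import random
import statistics

CANDIDATES = {
    "num_layers": [1, 2, 3, 4],
    "num_neurons": [100, 200, 300, 400],
    "activation": ["sigmoid", "relu", "softplus", "rrelu"],
}

def fake_train_and_score(config, rng):
    """Hypothetical stand-in for one training run: returns a noisy score."""
    base = 0.05 - 0.01 * abs(config["num_layers"] - 2)
    return base + rng.gauss(0, 0.005)

def run_search(number_of_call, model_runs, seed=0):
    """Evaluate `number_of_call` configurations, `model_runs` times each."""
    rng = random.Random(seed)
    history = []
    for _ in range(number_of_call):
        # Random sampling here; Bayesian Optimization would instead pick this
        # point using a surrogate model fitted to the history so far.
        config = {name: rng.choice(values) for name, values in CANDIDATES.items()}
        scores = [fake_train_and_score(config, rng) for _ in range(model_runs)]
        # Assumed summary: the median dampens the effect of one unlucky run.
        history.append((config, statistics.median(scores)))
    best_config, best_score = max(history, key=lambda item: item[1])
    return history, best_config, best_score

history, best_config, best_score = run_search(number_of_call=20, model_runs=5)
print(len(history), "configurations evaluated,", 20 * 5, "training runs in total")
print("Best configuration:", best_config)
```

The sketch also makes the cost visible: 20 calls with 5 runs each amount to 100 training runs, so both settings directly scale the running time.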

You can save the main results of the optimization in a CSV file.

In [None]:
optimization_result.save_to_csv("results_etm.csv")