## Standard RMM experiments

This notebook collects the experiments for all benchmark datasets that standard reservoir memory machines (RMMs) can handle: the latch, copy, repeat copy, signal copy, and image copy tasks. FSM learning and associative recall require additional concepts and are covered in separate notebooks.

In [1]:
# in this first cell we set some experimental meta-parameters that are used across all
# datasets

# the number of training time series
N = 90
# the number of test time series
N_test = 10
# the number of repeats for the experiments
R = 20
# the names of the tasks to be performed
tasks = ['latch', 'copy', 'repeat_copy', 'signal_copy', 'image_copy']
# the number of neurons for each task
num_neurons = [64, 256, 256, 64, 512]
# the number of input dimensions for each task
ns = [1, 9, 9, 2, 28]
# the horizons for each task
Ts = [256, 24, 16, 312, 32]
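
The per-task lists above are index-aligned with `tasks`, so a list of the wrong length would silently misconfigure a task. A minimal sketch of a sanity check, with the values copied from the cell above; `task_config` is a name introduced here for illustration only and is not used elsewhere in the notebook:

```python
tasks = ['latch', 'copy', 'repeat_copy', 'signal_copy', 'image_copy']
num_neurons = [64, 256, 256, 64, 512]
ns = [1, 9, 9, 2, 28]
Ts = [256, 24, 16, 312, 32]

# every per-task list must have exactly one entry per task
assert len(num_neurons) == len(ns) == len(Ts) == len(tasks)

# bundle the settings per task (task_config is a hypothetical helper name)
task_config = {
    task: {'m': m, 'n': n, 'T': T}
    for task, m, n, T in zip(tasks, num_neurons, ns, Ts)
}
print(task_config['copy'])
```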

## Hyperparameter Optimization

In [2]:
# the number of hyperparameter combinations to be tested
hyper_R = 20
# the number of repeats for each hyperparameter combination
hyper_num_repeats = 3
# set the hyper-parameter ranges for all models
models = ['RMM_LMU']
hyperparam_ranges = {
    'ESN' : {
        'radius' : [0.5, 0.7, 0.9],
        'sparsity' : [0.1, 0.2, 0.5],
        'regul' : [1E-7, 1E-5, 1E-3]
    },
    'CRJ' : {
        'v' : [0.1, 0.3, 0.5],
        'w_c' : [0.1, 0.7, 0.9],
        'w_j' : [0.1, 0.2, 0.4],
        'l' : [4, 8, 16],
        'regul' : [1E-7, 1E-5, 1E-3]
    },
    'LMU' : {
        'regul' : [1E-7, 1E-5, 1E-3],
        #'T' : [16, 32, 128, 384]
    },
    'RMM_ESN' : {
        'radius' : [0.5, 0.7, 0.9],
        'sparsity' : [0.1, 0.2, 0.5],
        'regul' : [1E-7, 1E-5, 1E-3],
        'C' : [1., 100., 10000.],
        'svm_kernel' : ['linear', 'rbf'],
        'lr': [1E-6, 5e-6, 1E-5, 5e-5, 1E-4, 5e-4, 1E-3],
        'u': [0.1, 0.5, 0.8, 1.0]
    },
    'RMM_CRJ' : {
        'v' : [0.1, 0.3, 0.5],
        'w_c' : [0.1, 0.7, 0.9],
        'w_j' : [0.1, 0.2, 0.4],
        'l' : [4, 8, 16],
        'regul' : [1E-7, 1E-5, 1E-3],
        'C' : [1., 100., 10000.],
        'svm_kernel' : ['linear', 'rbf'],
        'lr': [1E-6, 5e-6, 1E-5, 5e-5, 1E-4, 5e-4, 1E-3],
        'u': [0.1, 0.5, 0.8, 1.0]
    },
    'RMM_LMU' : {
        'regul' : [1E-7, 1E-5, 1E-3],
        'C' : [1., 100., 10000.],
        #'T' : [16, 32, 128, 384],
        'svm_kernel' : ['linear', 'rbf'],
        'lr': [1E-6, 5e-6, 1E-5, 5e-5, 1E-4, 5e-4, 1E-3],
        'u': [0.1, 0.5, 0.8, 1.0]
    }
}

import numpy as np
import rmm2.esn as esn
import rmm2.crj as crj
import rmm2.lmu as lmu
import rmm2.rmm as rmm

# set up a function to initialize an instance for each model
def setup_model(model, m, n, hyperparams):
    # first, set up the correct reservoir and nonlinearity
    if model.endswith('ESN'):
        U, W = esn.initialize_reservoir(m, n, radius = hyperparams['radius'],
                                        sparsity = hyperparams['sparsity'])
        nonlin = np.tanh
    elif model.endswith('CRJ'):
        U = crj.setup_input_weight_matrix(n, m, v = hyperparams['v'])
        W = crj.setup_reservoir_matrix(m, w_c = hyperparams['w_c'],
                                       w_j = hyperparams['w_j'], l = hyperparams['l'])
        nonlin = np.tanh
    elif model.endswith('LMU'):
        degree = int(m/n)-1
        U, W = lmu.initialize_reservoir(n, degree, hyperparams['T'])
        nonlin = lambda x : x
    else:
        raise ValueError('Unknown model: %s' % model)
    # then, set up the model
    if not model.startswith('RMM_'):
        net = esn.ESN(U, W, regul = hyperparams['regul'], input_normalization = False,
                      nonlin = nonlin)
    else:
        net = rmm.RMM(U, W, lr = hyperparams['lr'], u = hyperparams['u'],
                      regul = hyperparams['regul'], input_normalization = False,
                      nonlin = nonlin, C = hyperparams['C'],
                      svm_kernel = hyperparams['svm_kernel'])
    return net
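
For the LMU variants, `setup_model` derives the polynomial degree from the neuron budget `m` and the input dimensionality `n`. The sketch below evaluates that formula for the five tasks. It also prints `(degree + 1) * n`, which is the largest multiple of `n` not exceeding `m`. Whether this product equals the actual reservoir size depends on `lmu.initialize_reservoir`, which is not shown here, so treat that reading as an assumption.

```python
tasks = ['latch', 'copy', 'repeat_copy', 'signal_copy', 'image_copy']
num_neurons = [64, 256, 256, 64, 512]
ns = [1, 9, 9, 2, 28]

degrees = {}
for task, m, n in zip(tasks, num_neurons, ns):
    # same formula as in setup_model; int() truncates, so the division rounds down
    degree = int(m / n) - 1
    degrees[task] = degree
    # largest multiple of n that does not exceed m
    covered = (degree + 1) * n
    print('%s: degree = %d, (degree + 1) * n = %d of m = %d' % (task, degree, covered, m))
```

For the copy, repeat copy, and image copy tasks `m` is not a multiple of `n`, so the product stays slightly below `m` (252 of 256 and 504 of 512).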

## Experiment

With the hyperparameter ranges and the model setup in place, we can now iterate
over all tasks. For each task we first perform hyperparameter optimization and
then run the actual experiment.
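
The selection scheme is a random search: a fixed number of settings is drawn from the ranges, each setting is evaluated several times, and the setting with the lowest mean error wins. The following is a minimal sketch of that scheme, not the notebook's implementation. It draws each value independently and uniformly instead of calling `_permutation_sampling`, and `toy_error` is a hypothetical objective that only exists so the sketch runs without `rmm2`.

```python
import numpy as np

# ranges copied from the 'RMM_LMU' entry of hyperparam_ranges
param_ranges = {
    'regul': [1E-7, 1E-5, 1E-3],
    'C': [1., 100., 10000.],
    'svm_kernel': ['linear', 'rbf'],
    'lr': [1E-6, 5e-6, 1E-5, 5e-5, 1E-4, 5e-4, 1E-3],
    'u': [0.1, 0.5, 0.8, 1.0],
}
hyper_R = 20
hyper_num_repeats = 3

# size of the full grid (3 * 3 * 2 * 7 * 4 = 504), of which only hyper_R are tested
grid_size = int(np.prod([len(values) for values in param_ranges.values()]))

rng = np.random.default_rng(0)

# stand-in sampler: independent uniform draws per key (NOT _permutation_sampling)
candidates = []
for r in range(hyper_R):
    params = {key: values[rng.integers(len(values))]
              for key, values in param_ranges.items()}
    params['errors'] = []
    candidates.append(params)

def toy_error(params, rng):
    # hypothetical objective: prefers lr close to 1E-3, plus a little noise
    return abs(np.log10(params['lr']) + 3.) + 0.01 * rng.random()

# evaluate every candidate once per repeat
for repeat in range(hyper_num_repeats):
    for params in candidates:
        params['errors'].append(toy_error(params, rng))

# select the candidate with the lowest mean error across repeats
best = min(candidates, key=lambda p: np.mean(p['errors']))
print('grid size: %d, tested: %d, selected lr: %g' % (grid_size, len(candidates), best['lr']))
```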

In [3]:
import json
import os
import random
import time
from dataset_generators import generate_data
from dataset_generators import _permutation_sampling

# iterate over all tasks
for task_idx, task in enumerate(tasks):
    print('------ Task %d of %d: %s -----' % (task_idx+1, len(tasks), task))
    m = num_neurons[task_idx]
    n = ns[task_idx]
    # try to load the selected hyperparameters from file
    hyperparam_path = '%s_hyperparams.json' % task
    if os.path.isfile(hyperparam_path):
        print('loading hyperparameters from %s' % hyperparam_path)
        with open(hyperparam_path, 'r') as hyperparam_file:
            hyperparams = json.load(hyperparam_file)
    else:
        # perform a hyperparameter optimization where we test hyper_R random
        # hyperparameter settings for each model and perform hyper_num_repeats
        # repeats to obtain statistics. The hyperparameters with the best mean
        # performance across repeats will be selected
        print('performing hyperparameter optimization (this may take a while)')
        # generate random parameter combinations for all models
        hyperparams = {}
        for model in models:
            # initialize a hyperparameter dictionary for each combination
            hyperparams[model] = []
            for r in range(hyper_R):
                hyperparams[model].append({})
            # then iterate over each key and sample the parameter values
            for key in hyperparam_ranges[model]:
                param_range = hyperparam_ranges[model][key]
                param_value_indices = _permutation_sampling(hyper_R, 0, len(param_range)-1)
                for r in range(hyper_R):
                    value = param_range[param_value_indices[r]]
                    hyperparams[model][r][key] = value
            for r in range(hyper_R):
                # set up an extra key for the errors
                hyperparams[model][r]['errors'] = []
                # for the signal_copy dataset, use the 'pseudo' SVM, because everything
                # else takes too long to train
                if task == 'signal_copy' and model.startswith('RMM'):
                    hyperparams[model][r]['svm_kernel'] = 'pseudo'
                # set time horizon for LMU models
                if model.endswith('LMU'):
                    hyperparams[model][r]['T'] = Ts[task_idx]

        for repeat in range(hyper_num_repeats):
            print('--- repeat %d of %d ---' % (repeat+1, hyper_num_repeats))
            # sample training and test data
            Xs, Qs, Ys = generate_data(N, task)
            Xs_test, Qs_test, Ys_test = generate_data(N_test, task)
            # now iterate over all models
            for model in models:
                print('-- model: %s --' % model)
                # and iterate over all parameter combinations for this model
                for params_r in hyperparams[model]:
                    # set up a model instance
                    net = setup_model(model, m, n, params_r)
                    # fit the model to the data
                    if model.startswith('RMM_'):
                        net.fit(Xs, Qs, Ys)
                    else:
                        net.fit(Xs, Ys)
                    # measure the RMSE on the test data
                    mse = 0.
                    for i in range(N_test):
                        Ypred = net.predict(Xs_test[i])
                        mse   += np.mean((Ypred - Ys_test[i]) ** 2)
                    rmse = np.sqrt(mse / N_test)
                    params_r['errors'].append(rmse)
                    print('error: %g' % rmse)
        # write the results to a JSON file
        with open(hyperparam_path, 'w') as hyperparam_file:
            json.dump(hyperparams, hyperparam_file)

    # select best hyperparameters for each model
    hyperparams_opt = {}
    for model in models:
        min_err = np.inf
        for params_r in hyperparams[model]:
            if np.mean(params_r['errors']) < min_err:
                min_err = np.mean(params_r['errors'])
                hyperparams_opt[model] = params_r
        print('\nSelected the following hyper-parameters for %s' % model)
        for key in hyperparams_opt[model]:
            print('%s: %s' % (key, str(hyperparams_opt[model][key])))
    # hyperparameter optimization complete
    
    # ACTUAL EXPERIMENT

    # initialize error and runtime arrays
    errors   = np.zeros((len(models), R))
    runtimes = np.zeros((len(models), R))
    # iterate over all experimental repeats
    for r in range(R):
        print('--- repeat %d of %d ---' % (r+1, R))
        # sample training and test data
        Xs, Qs, Ys = generate_data(N, task)
        Xs_test, Qs_test, Ys_test = generate_data(N_test, task)
        # now iterate over all models
        for model_idx, model in enumerate(models):
            # print('-- model: %s --' % model)
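            # start the timer; the measured runtime covers model setup,
            # training, and prediction on the test data
            start_time = time.time()
            # set up the model with the best selected hyperparameters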
            net = setup_model(model, m, n, hyperparams_opt[model])
            # fit the model to the data
            if model.startswith('RMM_'):
                net.fit(Xs, Qs, Ys)
            else:
                net.fit(Xs, Ys)
            # measure the RMSE on the test data
            mse = 0.
            for i in range(N_test):
                Ypred = net.predict(Xs_test[i])
                mse   += np.mean((Ypred - Ys_test[i]) ** 2)
            rmse = np.sqrt(mse / N_test)
            runtimes[model_idx, r] = time.time() - start_time
            errors[model_idx, r] = rmse
    # print results
    for model_idx, model in enumerate(models):
        print('%s: %g +- %g (took %g seconds)' % (
            model, np.mean(errors[model_idx, :]), np.std(errors[model_idx, :]),
            np.mean(runtimes[model_idx, :])))
    # write results to file
    np.savetxt('%s_errors.csv' % task, errors.T, delimiter='\t', header='\t'.join(models), comments='')
    np.savetxt('%s_runtimes.csv' % task, runtimes.T, delimiter='\t', header='\t'.join(models), comments='')

------ Task 1 of 5: latch -----
performing hyperparameter optimization (this may take a while)
--- repeat 1 of 3 ---
-- model: RMM_LMU --
State prediction recall: 0.989256; precision: 0.989256
error: 0.597197
State prediction recall: 0.959434; precision: 0.959434
error: 0.698116
State prediction recall: 0.999561; precision: 0.999561
error: 0.000558587
State prediction recall: 0.96974; precision: 0.96974
error: 0.708318
State prediction recall: 0.964587; precision: 0.964587
error: 0.70729
State prediction recall: 0.999561; precision: 0.999561
error: 0.000542949
State prediction recall: 0.96974; precision: 0.96974
error: 0.708177
State prediction recall: 0.989913; precision: 0.989913
error: 0.596429
State prediction recall: 0.999561; precision: 0.999561
error: 0.00152964
State prediction recall: 0.96974; precision: 0.96974
error: 0.70905
State prediction recall: 0.998575; precision: 0.998575
error: 0.00362348
State prediction recall: 0.96974; precision: 0.96974
error: 0.704969
State prediction recall: 0.982677; precision: 0.982677
error: 0.596237
State prediction recall: 0.96974; precision: 0.96974
error: 0.70856
State prediction recall: 0.999452; precision: 0.999452
error: 0.0104859
State prediction recall: 0.96974; precision: 0.96974
error: 0.708829
State prediction recall: 0.970398; precision: 0.970398
error: 0.708927
State prediction recall: 0.989256; precision: 0.989256
error: 0.597422
State prediction recall: 0.989256; precision: 0.989256
error: 0.59697
State prediction recall: 0.96974; precision: 0.96974
error: 0.709654
--- repeat 2 of 3 ---
-- model: RMM_LMU --
State prediction recall: 0.989511; precision: 0.989511
error: 0.779603
State prediction recall: 0.970459; precision: 0.970459
error: 0.984642
State prediction recall: 0.99315; precision: 0.99315
error: 0.191355
State prediction recall: 0.970459; precision: 0.970459
error: 0.71389
State prediction recall: 0.964572; precision: 0.964572
error: 0.695853
State prediction recall: 0.999572; precision: 0.999572
error: 0.135888
State prediction recall: 0.970459; precision: 0.970459
error: 0.679323
State prediction recall: 0.990046; precision: 0.990046
error: 0.546495
State prediction recall: 0.999572; precision: 0.999572
error: 0.149075
State prediction recall: 0.971101; precision: 0.971101
error: 0.67891
State prediction recall: 0.998609; precision: 0.998609
error: 0.0950184
State prediction recall: 0.970459; precision: 0.970459
error: 0.682689
State prediction recall: 0.986835; precision: 0.986835
error: 0.718199
State prediction recall: 0.970459; precision: 0.970459
error: 0.682111
State prediction recall: 0.999786; precision: 0.999786
error: 0.119642
State prediction recall: 0.970459; precision: 0.970459
error: 0.875502
State prediction recall: 0.971101; precision: 0.971101
error: 0.681032
State prediction recall: 0.989511; precision: 0.989511
error: 0.680761
State prediction recall: 0.989511; precision: 0.989511
error: 0.72524
State prediction recall: 0.970459; precision: 0.970459
error: 0.695015
--- repeat 3 of 3 ---
-- model: RMM_LMU --
State prediction recall: 0.989359; precision: 0.989359
error: 0.56657
State prediction recall: 0.970033; precision: 0.970033
error: 0.712119
State prediction recall: 0.999566; precision: 0.999566
error: 0.000319808
State prediction recall: 0.970033; precision: 0.970033
error: 0.709665
State prediction recall: 0.970033; precision: 0.970033
error: 0.707874
State prediction recall: 0.999783; precision: 0.999783
error: 0.000447316
State prediction recall: 0.970033; precision: 0.970033
error: 0.709713
State prediction recall: 0.990228; precision: 0.990228
error: 0.566785
State prediction recall: 0.999783; precision: 0.999783
error: 0.00135159
State prediction recall: 0.970901; precision: 0.970901
error: 0.710361
State prediction recall: 0.998914; precision: 0.998914
error: 0.0024086
State prediction recall: 0.970033; precision: 0.970033
error: 0.707165
State prediction recall: 0.986319; precision: 0.986319
error: 0.56564
State prediction recall: 0.970033; precision: 0.970033
error: 0.709804
State prediction recall: 0.999674; precision: 0.999674
error: 0.00292629
State prediction recall: 0.970033; precision: 0.970033
error: 0.710004
State prediction recall: 0.970901; precision: 0.970901
error: 0.709536
State prediction recall: 0.989359; precision: 0.989359
error: 0.565752
State prediction recall: 0.989359; precision: 0.989359
error: 0.566112
State prediction recall: 0.970033; precision: 0.970033
error: 0.711791

Selected the following hyper-parameters for RMM_LMU
regul: 1e-07
C: 100.0
svm_kernel: rbf
lr: 0.001
u: 0.5
errors: [0.003623477474822802, 0.09501838373196504, 0.002408599214835377]
T: 256
--- repeat 1 of 20 ---
State prediction recall: 0.998405; precision: 0.998405
--- repeat 2 of 20 ---
State prediction recall: 0.997927; precision: 0.997927
--- repeat 3 of 20 ---
State prediction recall: 0.998579; precision: 0.998579
--- repeat 4 of 20 ---
State prediction recall: 0.998763; precision: 0.998763
--- repeat 5 of 20 ---
State prediction recall: 0.99907; precision: 0.99907
--- repeat 6 of 20 ---
State prediction recall: 0.999362; precision: 0.999362
--- repeat 7 of 20 ---
State prediction recall: 0.998367; precision: 0.998367
--- repeat 8 of 20 ---
State prediction recall: 0.998678; precision: 0.998678
--- repeat 9 of 20 ---
State prediction recall: 0.999009; precision: 0.999009
--- repeat 10 of 20 ---
State prediction recall: 0.998702; precision: 0.998702
--- repeat 11 of 20 ---
State prediction recall: 0.999061; precision: 0.999061
--- repeat 12 of 20 ---
State prediction recall: 0.998544; precision: 0.998544
--- repeat 13 of 20 ---
State prediction recall: 0.998979; precision: 0.998979
--- repeat 14 of 20 ---
State prediction recall: 0.998828; precision: 0.998828
--- repeat 15 of 20 ---
State prediction recall: 0.999021; precision: 0.999021
--- repeat 16 of 20 ---
State prediction recall: 0.998824; precision: 0.998824
--- repeat 17 of 20 ---
State prediction recall: 0.998312; precision: 0.998312
--- repeat 18 of 20 ---
State prediction recall: 0.998974; precision: 0.998974
--- repeat 19 of 20 ---
State prediction recall: 0.999162; precision: 0.999162
--- repeat 20 of 20 ---
State prediction recall: 0.998545; precision: 0.998545
RMM_LMU: 0.00921672 +- 0.0367357 (took 1.91521 seconds)
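
The summary line reports a standard deviation (0.0367) that is several times larger than the mean (0.0092). For non-negative errors this suggests a skewed distribution, so it can be worth inspecting the per-repeat values stored in the results files. The sketch below writes a small file with the same `np.savetxt` call as the experiment cell and reads it back. The error values in it are placeholders, not results from this notebook, and the file lives in a temporary directory so nothing is overwritten.

```python
import os
import tempfile
import numpy as np

models = ['RMM_LMU']
# placeholder values for illustration only, NOT results from the notebook
errors = np.array([[0.001, 0.002, 0.2, 0.001]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'latch_errors.csv')
    # same call as in the experiment cell: one column per model, one row per repeat
    np.savetxt(path, errors.T, delimiter='\t', header='\t'.join(models), comments='')
    # the first line holds the model names because comments='' drops the '#' prefix
    with open(path) as results_file:
        header = results_file.readline().rstrip('\n').split('\t')
    # ndmin=2 keeps the (repeats, models) shape even with a single model
    loaded = np.loadtxt(path, delimiter='\t', skiprows=1, ndmin=2)

for j, model in enumerate(header):
    column = loaded[:, j]
    print('%s: mean %g, std %g, median %g' % (
        model, np.mean(column), np.std(column), np.median(column)))
```

Reporting the median next to the mean makes a single poor repeat visible without letting it dominate the summary.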
