# Forward-backward splitting for time-varying graphical lasso
This notebook shows how to minimise the time-varying graphical lasso with element-wise penalty norms across time-points.
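
For readers unfamiliar with the technique, forward-backward splitting alternates a gradient (forward) step on the smooth part of an objective with a proximal (backward) step on the non-smooth penalty. The sketch below applies it to a plain l1-penalised least-squares problem, not to the graphical lasso itself. The function names `soft_threshold` and `forward_backward` are illustrative and are not part of `regain`.

```python
import numpy as np


def soft_threshold(v, thresh):
    """Proximal operator of thresh * ||v||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)


def forward_backward(A, b, alpha, n_iter=500):
    """Minimise 0.5 * ||A x - b||^2 + alpha * ||x||_1 by proximal gradient."""
    # Step size 1 / L, where L is the Lipschitz constant of the gradient
    # (the squared spectral norm of A); this ensures the objective decreases.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T.dot(A.dot(x) - b)                       # forward step
        x = soft_threshold(x - step * grad, step * alpha)  # backward step
    return x


# Toy problem (assumed data): recover a sparse vector from noiseless measurements.
rng = np.random.RandomState(0)
A = rng.randn(50, 10)
x_true = np.zeros(10)
x_true[:3] = [2.0, -1.5, 1.0]
b = A.dot(x_true)
x_hat = forward_backward(A, b, alpha=0.1)
```

In the time-varying graphical lasso the smooth part would be the Gaussian log-likelihood term and the penalties would act on the precision matrices and on their differences across time-points, but the alternation between the two steps follows the same pattern.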

First of all, as always, let's create a bunch of data.
For this task, we generate each variable so that it evolves over time according to trigonometric functions, such as `sin` and `cos`.

In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

from scipy.spatial.distance import squareform
from regain import datasets, utils

from sklearn.datasets import load_iris
from sklearn.svm import SVC 
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV, ShuffleSplit

from skopt.searchcv import BayesSearchCV
from skopt.space import Real, Categorical, Integer

In [2]:
# np.random.seed(7)

# fs = 10e3
# N = 100
# amp = 2*np.sqrt(2)
# freq = 1.0
# noise_power = 0.001 * fs / 2
# time = np.arange(N) / fs
# z = amp*np.sin(2*np.pi*freq*time)
# z += np.random.normal(scale=np.sqrt(noise_power), size=time.shape)
# plt.plot(z);

In [3]:
# T = 4

# x = np.tile(np.linspace(0, T-1, T), (n_interactions, 1))
# zz = amp * signal.square(2 * np.pi * freq * x + phase, duty=.5)
# plt.plot(x.T, zz.T);

Generate the data starting from the inverse covariance matrices.

In [4]:
np.random.seed(7)

n_samples = 100
n_dim_obs = 50
T = 10

reload(datasets)
data = datasets.make_dataset(n_samples=n_samples, n_dim_obs=n_dim_obs, n_dim_lat=0, T=T,
                             time_on_axis='last',
                             mode='sin', shape='square', closeness=2.4, normalize=1)

# plt.step(np.array([squareform(y, checks=None) for y in data.thetas]), '-|');
# plt.savefig("/home/fede/Dropbox/Latent variables networks/forward backward time varying graphical lasso/smooth_signal.pdf")
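
To make the idea concrete, here is a minimal sketch, independent of `datasets.make_dataset`, of how one edge weight of a precision matrix can follow a sine wave over time and how samples can then be drawn from the corresponding Gaussian. All names and sizes are illustrative assumptions, and the dimensions are kept small for readability.

```python
import numpy as np

rng = np.random.RandomState(7)
n_samples, n_dim, T = 100, 5, 10

thetas = []
for t in range(T):
    theta = np.eye(n_dim)
    # One off-diagonal entry evolves as a sine wave. Its magnitude stays
    # below 1, so the matrix remains positive definite at every time point.
    w = 0.4 * np.sin(2 * np.pi * t / T)
    theta[0, 1] = theta[1, 0] = w
    thetas.append(theta)
thetas = np.array(thetas)  # shape (T, n_dim, n_dim)

# Draw samples at each time point from N(0, inv(theta_t)).
# Note: time is the first axis here, unlike time_on_axis='last' above.
X_toy = np.array([
    rng.multivariate_normal(np.zeros(n_dim), np.linalg.inv(theta), n_samples)
    for theta in thetas
])  # shape (T, n_samples, n_dim)
```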

### Let's run the algorithm

In [5]:
X = data.data
X_tr, X_ts = train_test_split(X)

In [8]:
from regain import prox; reload(prox);
from regain.forward_backward import time_graph_lasso_; reload(time_graph_lasso_)
tglfb = time_graph_lasso_.TimeGraphLassoForwardBackward(
    verbose=2, gamma=10, alpha='max', beta=5, eps=.5, delta=.5, choose='gamma',
    time_norm=1, max_iter=500, time_on_axis='last').fit(X)

NameError: global name 'tv1_1d' is not defined

In [197]:
from regain.utils import positive_definite
positive_definite(tglfb.precision_)

True
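
The implementation of `positive_definite` is not shown in this notebook. One common way to run such a check, given here only as a sketch under that caveat, is to attempt a Cholesky factorisation of every matrix in the stack. The helper name `is_positive_definite` is hypothetical, and the matrices are assumed to be symmetric.

```python
import numpy as np


def is_positive_definite(matrices):
    """Return True if every (symmetric) matrix in the stack is positive definite."""
    stack = np.asarray(matrices, dtype=float)
    # Accept either a single (n, n) matrix or a stack of shape (T, n, n).
    stack = stack.reshape((-1,) + stack.shape[-2:])
    for m in stack:
        try:
            np.linalg.cholesky(m)  # raises LinAlgError if m is not PD
        except np.linalg.LinAlgError:
            return False
    return True
```

For example, `is_positive_definite(np.eye(3))` holds, whereas the symmetric matrix `[[1, 2], [2, 1]]` has a negative eigenvalue and fails the check.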

In [208]:
utils.structure_error(data.thetas, tglfb.precision_, no_diagonal=0, thresholding=1, eps=1e-5)

{'accuracy': 0.57288,
 'average_precision': 0.6738125889144224,
 'balanced_accuracy': 0.5135658475800082,
 'dor': 1.3445593477335602,
 'f1': 0.7102306648575306,
 'fall_out': 0.8837690426932481,
 'false_omission_rate': 0.5087440381558028,
 'fdr': 0.4179861234655755,
 'fn': 1280,
 'fp': 9398,
 'miss_rate': 0.08909926214673534,
 'nlr': 0.7665708363012813,
 'npv': 0.4912559618441971,
 'plr': 1.0306999836488207,
 'precision': 0.5820138765344245,
 'prevalence': 0.57464,
 'recall': 0.9109007378532646,
 'specificity': 0.11623095730675193,
 'tn': 1236,
 'tp': 13086}
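
Most entries of this dictionary are simple functions of the four confusion counts. As a cross-check, the following sketch recomputes a few of them from the `tp`, `fp`, `fn` and `tn` values reported above, using the standard textbook definitions and not the `regain` source.

```python
tp, fp, fn, tn = 13086, 9398, 1280, 1236  # counts from the output above

total = tp + fp + fn + tn  # 10 time points x 50 x 50 matrix entries
precision = tp / float(tp + fp)
recall = tp / float(tp + fn)
specificity = tn / float(tn + fp)
accuracy = (tp + tn) / float(total)
balanced_accuracy = (recall + specificity) / 2
f1 = 2 * precision * recall / (precision + recall)

for name, value in [("precision", precision), ("recall", recall),
                    ("specificity", specificity), ("accuracy", accuracy),
                    ("balanced_accuracy", balanced_accuracy), ("f1", f1)]:
    print("%s: %.5f" % (name, value))
```

The high recall combined with a specificity close to 0.12 indicates that the estimate contains many more non-zero entries than the ground truth.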

### Bayesian optimisation
Since we have many hyper-parameters, we rely on Bayesian optimisation to select them, treating the scoring function of our algorithm as a black box modelled by the Gaussian process underlying the optimisation.

This procedure is performed via the `scikit-optimize` package.

In [213]:
from regain import utils; reload(utils)
from regain import prox; reload(prox);
reload(time_graph_lasso_)

from skopt import searchcv; reload(searchcv)
data_grid = np.array(data.data)

domain = {'alpha': Real(1e-1, 1e2, prior='log-uniform'),
          'beta': Real(1e-1, 1e2, prior='log-uniform'),
#           'time_norm': Categorical([1, 2])
         }

mdl = time_graph_lasso_.TimeGraphLassoForwardBackward(
    verbose=0, tol=1e-4, max_iter=10000, gamma=20, alpha='max', beta=5, time_norm=1,
    time_on_axis='last', eps=0.8, choose='gamma')
    
cv = ShuffleSplit(5, test_size=0.5)
    
ltgl = searchcv.BayesSearchCV(
    mdl, domain, n_iter=100, cv=cv, verbose=1, n_jobs=-1, iid=True, n_points=3)

# callback handler
def on_step(optim_result):
    score = ltgl.best_score_
    print("best score: %s" % score)
#     if score >= 0.98:
#         print('Interrupting!')
#         return True


ltgl.fit(data_grid, callback=on_step)
# mdl.fit(data_grid)

Fitting 5 folds for each of 3 candidates, totalling 15 fits
best score: -1363.1890618375712
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed:  6.6min finished


best score: -1363.1890618375712
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed:  1.2min finished


best score: -1363.1890618375712
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed:  1.1min finished


best score: -1363.1890618375712
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 19.5min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 26.7min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 35.3min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 28.6min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 29.0min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 37.4min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 33.3min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 29.4min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 32.5min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 34.0min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 39.3min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 28.4min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 26.7min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 31.7min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 42.6min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 43.6min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 44.4min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 60.4min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed:  1.1min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 27.7min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 32.1min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 39.1min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 45.5min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 20.8min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 25.6min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 25.9min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 23.3min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 30.4min finished


best score: -838.6641565094241
Fitting 5 folds for each of 3 candidates, totalling 15 fits


[Parallel(n_jobs=-1)]: Done  15 out of  15 | elapsed: 33.2min finished


best score: -838.6641565094241
Fitting 5 folds for each of 1 candidates, totalling 5 fits


[Parallel(n_jobs=-1)]: Done   5 out of   5 | elapsed: 13.4min finished


best score: -838.6641565094241


BayesSearchCV(cv=ShuffleSplit(n_splits=5, random_state=None, test_size=0.5, train_size=None),
       error_score='raise',
       estimator=TimeGraphLassoForwardBackward(alpha='max', assume_centered=False, beta=5,
               choose='gamma', compute_objective=True, delta=0.0001,
               eps=0.8, gamma=20, lamda_criterion='b', lamda_init=1,
               max_iter=10000, time_norm=1, time_on_axis='last',
               tol=0.0001, verbose=0),
       fit_params=None, iid=True, n_iter=100, n_jobs=-1, n_points=3,
       optimizer_kwargs=None, pre_dispatch='2*n_jobs', random_state=None,
       refit=True, return_train_score=False, scoring=None,
       search_spaces={'alpha': Real(low=0.1, high=100.0, prior='log-uniform', transform='identity'), 'beta': Real(low=0.1, high=100.0, prior='log-uniform', transform='identity')},
       verbose=1)
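
Both search dimensions use a log-uniform prior, which spreads candidate values evenly across orders of magnitude instead of across the raw interval. The sketch below illustrates that prior with plain NumPy. It only mimics the sampling distribution; it is not how `scikit-optimize` proposes points during the search, since those are guided by the surrogate model.

```python
import numpy as np

rng = np.random.RandomState(0)
low, high = 1e-1, 1e2  # same bounds as the search space above

# Sample uniformly in log10 space, then map back to the original scale.
samples = 10 ** rng.uniform(np.log10(low), np.log10(high), size=10000)

# Each of the three decades should receive roughly a third of the samples.
fractions = []
for lo, hi in [(1e-1, 1e0), (1e0, 1e1), (1e1, 1e2)]:
    frac = np.mean((samples >= lo) & (samples < hi))
    fractions.append(frac)
    print("[%g, %g): %.2f" % (lo, hi, frac))
```

With a plain uniform prior on the same interval, about 90% of the samples would instead fall in the last decade, leaving small penalty values almost unexplored.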

In [218]:
utils.structure_error(data.thetas, mdl.precision_, no_diagonal=0, thresholding=1, eps=1e-8)

{'accuracy': 0.55696,
 'average_precision': 0.6493819873359272,
 'balanced_accuracy': 0.5080440514116944,
 'dor': 1.1194835525548184,
 'f1': 0.6843365253077975,
 'fall_out': 0.8196351325935678,
 'false_omission_rate': 0.5516596540439458,
 'fdr': 0.4206157706785059,
 'fn': 2360,
 'fp': 8716,
 'miss_rate': 0.1642767645830433,
 'nlr': 0.9108024580688646,
 'npv': 0.4483403459560542,
 'plr': 1.0196283714345935,
 'precision': 0.579384229321494,
 'prevalence': 0.57464,
 'recall': 0.8357232354169567,
 'specificity': 0.1803648674064322,
 'tn': 1918,
 'tp': 12006}

In [217]:
mdl = ltgl.best_estimator_

In [132]:
mdl.score(data_grid)

-648.3521475107405

In [113]:
ltgl.cv_results_

defaultdict(list,
            {'mean_fit_time': [59.08424797058105,
              66.17720727920532,
              66.97275977134704,
              64.39573440551757,
              61.84775342941284,
              59.56634469032288,
              63.95092988014221,
              2507.8452457904814,
              65.97609577178955,
              66.9068528175354,
              712.355920791626,
              1271.9847212791442,
              1444.4501823425294,
              1413.6207597732543,
              1554.165789794922,
              1777.535684633255,
              2018.0050109863282,
              1872.2741337776183,
              2012.9741005420685,
              1825.9705711841584,
              2219.35692782402,
              1189.926883840561,
              1956.9852410316466,
              1975.1635250091554,
              2616.8063933849335,
              1731.3013670444489,
              2541.313151407242,
              1935.2838866233826,
              1868.329342603683

In [125]:
ltgl.best_params_

{'alpha': 2.1945375330861676, 'beta': 0.17319499438515196}

In [115]:
mdl.precision_

array([[[ 1.69545767, -0.06819919,  0.05624769, ...,  0.00866332,
         -0.06240769,  0.09895316],
        [-0.06819919,  1.59819469, -0.08103202, ..., -0.00466499,
          0.        , -0.08148264],
        [ 0.05624769, -0.08103202,  1.41736224, ...,  0.07821061,
          0.08692673,  0.        ],
        ...,
        [ 0.00866332, -0.00466499,  0.07821061, ...,  1.58992992,
         -0.03837632, -0.0036484 ],
        [-0.06240769,  0.        ,  0.08692673, ..., -0.03837632,
          1.12858971,  0.        ],
        [ 0.09895316, -0.08148264,  0.        , ..., -0.0036484 ,
          0.        ,  1.73790145]],

       [[ 1.63205302,  0.02653864,  0.01943224, ...,  0.        ,
          0.        ,  0.02423671],
        [ 0.02653864,  1.16472631, -0.02983537, ..., -0.00466499,
          0.08179029,  0.        ],
        [ 0.01943224, -0.02983537,  1.09253107, ...,  0.07670778,
          0.02475099, -0.07210559],
        ...,
        [ 0.        , -0.00466499,  0.07670778, ...,  

### GridSearchCV
For hyper-parameter tuning, one may instead fix a grid of parameter values and select the best combination.
For this we can use `GridSearchCV` from the `scikit-learn` library.

In [None]:
# data_grid = np.array(data.data).transpose(1,2,0)
param_grid = dict(alpha=np.logspace(-2, 0, 3), beta=np.logspace(-2, 0, 3),
                  gamma=np.logspace(-2, 0, 3), time_norm=[1, 2])

mdl = time_graph_lasso_.TimeGraphLassoForwardBackward(
    verbose=0, time_on_axis='last')
    
cv = ShuffleSplit(2, test_size=0.2)
ltgl = GridSearchCV(mdl, param_grid, cv=cv, verbose=1)
ltgl.fit(data_grid)
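
The cost of a grid search grows multiplicatively with the number of values per parameter. As a quick sanity check, the sketch below enumerates the grid defined above with `itertools.product`. It only counts the candidates and does not fit any model.

```python
from itertools import product

import numpy as np

param_grid = dict(alpha=np.logspace(-2, 0, 3), beta=np.logspace(-2, 0, 3),
                  gamma=np.logspace(-2, 0, 3), time_norm=[1, 2])

# Build every combination of parameter values as a dictionary.
names = sorted(param_grid)
candidates = [dict(zip(names, values))
              for values in product(*(param_grid[name] for name in names))]

n_splits = 2  # matches ShuffleSplit(2, test_size=0.2)
n_fits = len(candidates) * n_splits
print("%d candidates, %d fits" % (len(candidates), n_fits))
```

The grid contains 3 x 3 x 3 x 2 = 54 candidates, hence 108 fits with two splits, not counting the final refit on the full data that `GridSearchCV` performs by default.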