# Populate HmovModels parameter tables

Currently, the parameter tables `GlmParams()` and `SplineLNPParams()` are re-populated (or newly populated) from JSON files. Calling `glm._save_paramsets_dict()` on an already filled parameter table saves the parameter sets that have been used in the past as JSON files (see `glm._save_paramsets_dict()`). These JSON files can then be loaded and used to populate empty parameter tables. This is not the best solution and should eventually be replaced by a method in `glm.py` that automatically fills the parameter table with parameter sets known to work well. However, this requires a general collection of such parameter sets, and the parameter space still needs to be explored thoroughly.
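To make the save-and-reload idea concrete, here is a minimal sketch of the JSON round trip using only the standard library. It does not reproduce `glm._save_paramsets_dict()`, whose implementation is not shown in this notebook. The helper names `save_param_dicts` and `load_param_dicts`, the temporary directory, and the shortened example dictionary are hypothetical. The timestamp format simply mirrors the file names loaded further down.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path


def save_param_dicts(param_dicts, out_dir, prefix):
    """Write a list of parameter dicts to a timestamped JSON file (hypothetical helper)."""
    stamp = datetime.now().strftime("%Y-%m-%dT%H%M%S")
    path = Path(out_dir) / f"{prefix}_{stamp}.json"
    with open(path, "w") as write_file:
        json.dump(param_dicts, write_file, indent=2)
    return path


def load_param_dicts(path):
    """Read a list of parameter dicts back from a JSON file (hypothetical helper)."""
    with open(path, "r") as read_file:
        return json.load(read_file)


# Shortened, illustrative parameter set; a real one carries all table attributes.
example = [{"glm_paramset": 1, "glm_distr": "softplus", "glm_lambda": 0.00015}]

with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = save_param_dicts(example, tmp_dir, "GlmParam_dicts")
    restored = load_param_dicts(saved_path)
```

Because the parameter sets only hold strings, integers, and floats, they survive the round trip unchanged, which is what makes plain JSON a workable storage format here.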

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import warnings
warnings.filterwarnings("ignore")
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import json

from pathlib import Path

# Connect to the `dj_hmov` schema 

In [3]:
run -im djd.main -- --dbname=dj_hmov --r

For remote access to work, make sure to first open an SSH tunnel with MySQL
port forwarding. Run the `djdtunnel` script in a separate terminal, with
optional `--user` argument if your local and remote user names differ.
Or, open the tunnel manually with:
  ssh -NL 3306:huxley.neuro.bzm:3306 -p 1021 USERNAME@tunnel.bio.lmu.de
Connecting execute@localhost:3306
Connected to database 'dj_hmov' as 'execute@10.153.172.3'
For remote file access to work, make sure to first mount the filesystem at tunnel.bio.lmu.de:1021 via SSHFS with `hux -r`


In [4]:
from djd import glm

# SplineLNPParams()

The parameter table should be empty at this point:

In [5]:
SplineLNPParams()

| Attribute | Description |
| --- | --- |
| `spl_paramset` | parameter set ID |
| `spl_distr` | the nonlinearity for the LNP model |
| `spl_alpha` | the weighting between L2 and L1 penalty (alpha=1 only uses L1) |
| `spl_lambda` | regularization parameter of penalty term |
| `spl_lr` | initial learning rate for the JAX optimizer |
| `spl_max_iter` | maximum number of iterations for the solver |
| `spl_dt` | inverse of the sampling rate |
| `spl_spat_df` | degrees of freedom for the spatial domain, i.e. number of basis functions for the spatial component; depends on the height of the stimulus frame (num of pixels); the df for width is computed automatically as `spat_df*n_pixels_width/n_pixels_height` |
| `spl_temp_df` | degrees of freedom in the temporal domain, i.e. number of basis functions for the temporal component; depends on the length of the filter (number of time steps, nlag) |
| `spl_pshf` | fit a post-spike history filter; can be either True or False |
| `spl_pshf_len` | length of the post-spike history filter or None if 'spl_pshf' = 'False' |
| `spl_pshf_df` | number of basis functions for spline basis of the post-spike history filter |
| `spl_verb` | When verbose=0, progress is not printed. When verbose=n, progress will be printed in every n steps. |
| `spl_metric` | performance metric; can be 'None' or 'mse': mean squared error, 'r2': R2 score, 'corrcoef': correlation coefficient |
| `spl_norm_y` | normalize observed responses between 0 and 1; could be either True or False |
| `spl_nlag` | number of time steps of the kernel; see `glm._get_stimulus_design_mat()` |
| `spl_shift` | shift kernel to not predict itself; for more info see `glm._get_stimulus_design_mat()` |
| `spl_spat_scaling` | scaling factor to down or upsample the spatial resolution of the movie frames; `scaled_height = frame_orig_height * spl_spat_scaling` |

(empty table: no rows)


Load the respective parameter dictionaries:

In [5]:
with open('./data/param_dicts/SplineLNPParam_dicts_2020-12-02T110056.json', "r") as read_file:
    spl_param_dicts = json.load(read_file)
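The file name above is hard-coded, including its timestamp. As a hedged sketch, the most recent file for a given prefix could instead be selected by parsing the timestamp embedded in the name. The helper `latest_param_file` is hypothetical and not part of `djd`; the second file name in the demo is invented purely to show the selection.

```python
import re
import tempfile
from datetime import datetime
from pathlib import Path

# Matches the timestamp suffix used in the file names, e.g. "_2020-12-02T110056.json".
STAMP_RE = re.compile(r"_(\d{4}-\d{2}-\d{2}T\d{6})\.json$")


def latest_param_file(folder, prefix):
    """Return the path with the newest embedded timestamp (hypothetical helper)."""
    candidates = []
    for path in Path(folder).glob(f"{prefix}_*.json"):
        match = STAMP_RE.search(path.name)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H%M%S")
            candidates.append((stamp, path))
    if not candidates:
        raise FileNotFoundError(f"no timestamped '{prefix}' files in {folder}")
    return max(candidates, key=lambda item: item[0])[1]


# Demo in a temporary folder; the older file name is made up for illustration.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in (
        "SplineLNPParam_dicts_2020-11-30T090000.json",
        "SplineLNPParam_dicts_2020-12-02T110056.json",
    ):
        (Path(tmp_dir) / name).write_text("[]")
    latest_name = latest_param_file(tmp_dir, "SplineLNPParam_dicts").name
```

Parsing the timestamp rather than relying on file modification times keeps the choice stable when files are copied between machines.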

In [6]:
spl_param_dicts

[{'spl_paramset': 7,
  'spl_distr': 'softplus',
  'spl_alpha': 1.0,
  'spl_lambda': 1.0,
  'spl_lr': 0.01,
  'spl_max_iter': 2000,
  'spl_dt': 0.033,
  'spl_spat_df': 8,
  'spl_temp_df': 7,
  'spl_pshf': 'False',
  'spl_pshf_len': 0,
  'spl_pshf_df': 0,
  'spl_verb': 200,
  'spl_metric': 'corrcoef',
  'spl_norm_y': 'False',
  'spl_nlag': 8,
  'spl_shift': 1,
  'spl_spat_scaling': 0.06},
 {'spl_paramset': 8,
  'spl_distr': 'softplus',
  'spl_alpha': 1.0,
  'spl_lambda': 1.0,
  'spl_lr': 0.01,
  'spl_max_iter': 2000,
  'spl_dt': 0.033,
  'spl_spat_df': 8,
  'spl_temp_df': 7,
  'spl_pshf': 'True',
  'spl_pshf_len': 8,
  'spl_pshf_df': 6,
  'spl_verb': 200,
  'spl_metric': 'corrcoef',
  'spl_norm_y': 'False',
  'spl_nlag': 8,
  'spl_shift': 1,
  'spl_spat_scaling': 0.06}]

The file contains two parameter sets for the `SplineLNP()` model: one without a post-spike history filter (set 7) and one with it (set 8).
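Note that the loaded dictionaries store the boolean-like fields as the strings `'True'` and `'False'`. The following sketch shows one way to sanity-check the post-spike history filter fields before populating. The consistency rule (length and df are both non-zero when the filter is enabled, both zero or `None` otherwise) is an assumption read off the two sets above, not a rule taken from `glm.py`, and the function names are hypothetical.

```python
def uses_pshf(param_dict):
    # Compare against the string: bool("False") would evaluate to True.
    return param_dict["spl_pshf"] == "True"


def pshf_fields_consistent(param_dict):
    """Check that the filter length and df agree with the spl_pshf flag (assumed rule)."""
    length = param_dict["spl_pshf_len"]
    df = param_dict["spl_pshf_df"]
    if uses_pshf(param_dict):
        return bool(length) and bool(df)
    # 0 and None both count as "no filter".
    return not length and not df


# Only the fields relevant to the check, copied from the two sets above.
param_sets = [
    {"spl_paramset": 7, "spl_pshf": "False", "spl_pshf_len": 0, "spl_pshf_df": 0},
    {"spl_paramset": 8, "spl_pshf": "True", "spl_pshf_len": 8, "spl_pshf_df": 6},
]

summary = {
    p["spl_paramset"]: (uses_pshf(p), pshf_fields_consistent(p)) for p in param_sets
}
```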

Populate the parameter table:

In [10]:
for spl_param_dict in spl_param_dicts:
    SplineLNPParams().populate(spl_param_dict)

The parameter table should now contain both parameter sets:

In [7]:
SplineLNPParams()

Column descriptions are the same as in the empty table above.

| spl_paramset | spl_distr | spl_alpha | spl_lambda | spl_lr | spl_max_iter | spl_dt | spl_spat_df | spl_temp_df | spl_pshf | spl_pshf_len | spl_pshf_df | spl_verb | spl_metric | spl_norm_y | spl_nlag | spl_shift | spl_spat_scaling |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 7 | softplus | 1.0 | 1.0 | 0.01 | 2000 | 0.033 | 8 | 7 | False | 0 | 0 | 200 | corrcoef | False | 8 | 1 | 0.06 |
| 8 | softplus | 1.0 | 1.0 | 0.01 | 2000 | 0.033 | 8 | 7 | True | 8 | 6 | 200 | corrcoef | False | 8 | 1 | 0.06 |
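The descriptions of `spl_spat_scaling` and `spl_spat_df` contain two small formulas, worked through here with the values from the table. The original frame size of 400 x 600 pixels is hypothetical, chosen only to give round numbers, and rounding the scaled size to whole pixels is an assumption; how `glm.py` rounds is not shown in this notebook.

```python
def scaled_frame_size(orig_height, orig_width, spat_scaling):
    """scaled_height = frame_orig_height * spat_scaling, same for width."""
    # Rounding to whole pixels is an assumption, not taken from glm.py.
    return round(orig_height * spat_scaling), round(orig_width * spat_scaling)


def spatial_dfs(spat_df, n_pixels_height, n_pixels_width):
    """df for height is spat_df; df for width is spat_df*n_pixels_width/n_pixels_height."""
    return spat_df, spat_df * n_pixels_width / n_pixels_height


orig_height, orig_width = 400, 600  # hypothetical original frame size in pixels

scaled_height, scaled_width = scaled_frame_size(orig_height, orig_width, 0.06)
df_height, df_width = spatial_dfs(8, scaled_height, scaled_width)
```

With these assumed dimensions, the frame shrinks to 24 x 36 pixels and the width receives 12 basis functions for the 8 on the height, so the basis density is the same along both axes.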


# GlmParams()

Do the same as above for the `GlmParams()` table:

In [12]:
GlmParams()

| Attribute | Description |
| --- | --- |
| `glm_paramset` | parameter set ID |
| `glm_distr` | distribution family; can be 'gaussian', 'binomial', 'poisson', 'softplus', 'probit', 'gamma' |
| `glm_alpha` | weighting between L2 and L1 penalty (alpha = 1 only uses L1) |
| `glm_lambda` | regularization parameter of penalty term |
| `glm_solver` | optimization method; one of the following: 'batch-gradient' (vanilla batch gradient descent), 'cdfast' (Newton coordinate gradient descent) |
| `glm_lr` | learning rate for gradient descent |
| `glm_max_iter` | maximum number of iterations for the solver |
| `glm_tol` | convergence threshold or stopping criteria; optimization loop will stop when relative change in parameter norm is below the threshold |
| `glm_seed` | seed of the random number generator to initialize solution |
| `glm_norm_y` | normalize observed responses between 0 and 1; could be either 'True' or 'False' |
| `glm_nlag` | num of time steps of the kernel; see `_get_stimulus_design_mat` |
| `glm_shift` | shift kernel to not predict itself; for more info see `glm._get_stimulus_design_mat()` |
| `glm_spat_scaling` | scaling factor to down or upsample the spatial resolution of the movie frames; `scaled_height = frame_orig_height * glm_spat_scaling` |

(empty table: no rows)


In [8]:
with open('./data/param_dicts/GlmParam_dicts_2020-12-02T105150.json', "r") as read_file:
    glm_param_dicts = json.load(read_file)

In [9]:
glm_param_dicts

[{'glm_paramset': 1,
  'glm_distr': 'softplus',
  'glm_alpha': 1.0,
  'glm_lambda': 0.00015,
  'glm_solver': 'batch-gradient',
  'glm_lr': 0.7,
  'glm_max_iter': 1000,
  'glm_tol': 1e-06,
  'glm_seed': 0,
  'glm_norm_y': 'True',
  'glm_nlag': 8,
  'glm_shift': 1,
  'glm_spat_scaling': 0.06}]

In [10]:
for glm_param_dict in glm_param_dicts:
    GlmParams().populate(glm_param_dict)

In [10]:
GlmParams()

Column descriptions are the same as in the empty `GlmParams()` table above.

| glm_paramset | glm_distr | glm_alpha | glm_lambda | glm_solver | glm_lr | glm_max_iter | glm_tol | glm_seed | glm_norm_y | glm_nlag | glm_shift | glm_spat_scaling |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | softplus | 1.0 | 0.00015 | batch-gradient | 0.7 | 1000 | 1e-06 | 0 | True | 8 | 1 | 0.06 |
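The descriptions of `glm_alpha` and `glm_lambda` suggest an elastic-net penalty. Assuming the common parameterization (the exact scaling used by the underlying solver is not stated in this notebook), the penalty on the weight vector $\beta$ would read:

$$
P(\beta) = \lambda \left[ \alpha \, \lVert \beta \rVert_1 + \frac{1 - \alpha}{2} \, \lVert \beta \rVert_2^2 \right]
$$

Under that assumption, parameter set 1 (`glm_alpha = 1.0`, `glm_lambda = 0.00015`) drops the L2 term entirely and reduces to a pure L1 penalty, $P(\beta) = 0.00015 \, \lVert \beta \rVert_1$, which matches the note that alpha = 1 only uses L1.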
