# Sparse Hebbian Learning: basics

We are interested here in learning the "optimal" components of a set of images (say, some "natural", everyday images). As there is no supervisor to guide the learning, this is called unsupervised learning. Our basic hypothesis for finding the best ("optimal") components is to assume that, *a priori*, the sparsest representation is the most plausible. We implement the algorithm derived from this hypothesis in this set of scripts.
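
One common way to write this hypothesis down is as a reconstruction problem under an $\ell_0$ constraint. This is a sketch in generic notation, not necessarily the exact cost used by the package: $\mathbf{x}$ stands for an image patch, $\phi_i$ for the components, $a_i$ for the coefficients, and $L$ for the maximum number of non-zero coefficients (which we assume corresponds to the `l0_sparseness` parameter used further down).

$$
\min_{\{\phi_i\},\, \mathbf{a}} \; \Big\| \mathbf{x} - \sum_{i} a_i \, \phi_i \Big\|_2^2
\quad \text{subject to} \quad \| \mathbf{a} \|_0 \le L
$$

Under this reading, "sparse" means that each patch is explained by only a few active components out of the whole dictionary.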

Here, we will show the basic operations that are implemented in this package. 

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(precision=4, suppress=True)

## experiments

To test and control for the role of the different parameters, we first define an object (in the [shl_experiments.py](https://github.com/bicv/SparseHebbianLearning/blob/master/shl_scripts/shl_experiments.py) script) that controls a learning experiment. It holds all relevant parameters and can also keep a trace of the history of some statistics, which is useful for comparing the relative efficiency of the different solutions.


In [3]:
matname = 'basics'

In [4]:
from shl_scripts.shl_experiments import SHL

In [5]:
shl = SHL(homeo_method='HAP', DEBUG_DOWNSCALE=1, verbose=10)
help(shl)

Help on SHL in module shl_scripts.shl_experiments object:

class SHL(builtins.object)
 |  SHL(height=256, width=256, patch_width=21, N_patches=65536, datapath='../database/', name_database='kodakdb', do_mask=True, do_bandpass=True, over_patches=16, patch_ds=1, n_dictionary=676, learning_algorithm='mp', fit_tol=None, l0_sparseness=34, alpha_MP=0.95, one_over_F=True, n_iter=4097, eta=0.015, beta1=0.95, beta2=0.999, epsilon=1e-08, do_precision=False, eta_precision=0.0005, homeo_method='HAP', eta_homeo=0.01, alpha_homeo=2.5, C=3.0, nb_quant=128, P_cum=None, do_sym=False, seed=42, patch_norm=False, batch_size=1024, record_each=32, record_num_batches=1024, n_image=None, DEBUG_DOWNSCALE=1, verbose=0, cache_dir='cache_dir')
 |  
 |  Base class to define SHL experiments:
 |      - initialization
 |      - coding and learning
 |      - visualization
 |      - quantitative analysis
 |  
 |  Methods defined here:
 |  
 |  __init__(self, height=256, width=256, patch_width=21, N_patches=65536, datap

In [6]:
!ls -l {shl.cache_dir}/{matname}*

-rw-r--r--  1 laurentperrinet  staff  231154688 Sep 26  2018 cache_dir/basics_coding.npy
-rw-r--r--  1 laurentperrinet  staff  134185088 Sep 26  2018 cache_dir/basics_data.npy
-rw-r--r--  1 laurentperrinet  staff          0 Jul 10 16:30 cache_dir/basics_dico.pkl_lock
-rw-r--r--  1 laurentperrinet  staff          0 Jul 10 16:30 cache_dir/basics_dico.pkl_lock_pid-52870_host-fortytwo


In [7]:
!rm {shl.cache_dir}/{matname}*
!ls -l {shl.cache_dir}/{matname}*

ls: cache_dir/basics*: No such file or directory


In [8]:
data = shl.get_data(matname=matname)
print('number of patches, size of patches = ', data.shape)
print('average of patches = ', data.mean(), ' +/- ', data.mean(axis=1).std())
SE = np.sqrt(np.sum(data**2, axis=1))
print('average energy of data = ', SE.mean(), '+/-', SE.std())

Extracting data..No cache found cache_dir/basics_data: Extracting data... Extracting data..bittern62.png, reflection63.png, yose07.png, rocky10.png, koala52.png, craterlake12.png, clouds43.png, yellowleaves39.png, yose05.png, goldwater67.png, bird08.png, cattails70.png, flowers37.png, woods54.png, cucorn50.png, bora04.png, geyser27.png, flowerhill29.png, calcoast09.png, hibiscus30.png, Data is of shape : (65520, 441) - done in 25.41s.
Data is of shape : (65520, 441) - done in 27.46s.
number of patches, size of patches =  (65520, 441)
average of patches =  -4.1888600727021664e-05  +/-  0.006270387629074682
average energy of data =  5.477384347012861 +/- 1.557168782769748


## learning

The actual learning is done in a second object (here ``dico``), which gives access to another set of properties and functions (see the [shl_learn.py](https://github.com/bicv/SparseHebbianLearning/blob/master/shl_scripts/shl_learn.py) script):

In [None]:
list_figures = ['show_dico', 'time_plot_error', 'time_plot_logL', 'show_Pcum']#, 'plot_variance',  'plot_variance_histogram',  'time_plot_prob',  'time_plot_kurt',  'time_plot_var']
dico = shl.learn_dico(data=data, list_figures=list_figures, matname=matname)

No cache found cache_dir/basics_dico.pkl: Learning the dictionary with algo = mp 
 Training on 65520 patches
Iteration   1 /   4097 (elapsed time:  14s,   0mn  14s)
Iteration  33 /   4097 (elapsed time:  1279s,  21mn  19s)
Iteration  65 /   4097 (elapsed time:  2678s,  44mn  38s)
Iteration  97 /   4097 (elapsed time:  4042s,  67mn  22s)
Iteration  129 /   4097 (elapsed time:  5310s,  88mn  30s)
Iteration  161 /   4097 (elapsed time:  6503s,  108mn  23s)
Iteration  193 /   4097 (elapsed time:  7702s,  128mn  22s)
Iteration  225 /   4097 (elapsed time:  8887s,  148mn   7s)
Iteration  257 /   4097 (elapsed time:  10062s,  167mn  42s)
Iteration  289 /   4097 (elapsed time:  11193s,  186mn  33s)
Iteration  321 /   4097 (elapsed time:  12303s,  205mn   3s)
Iteration  353 /   4097 (elapsed time:  13415s,  223mn  35s)
Iteration  385 /   4097 (elapsed time:  14540s,  242mn  20s)
Iteration  417 /   4097 (elapsed time:  15616s,  260mn  16s)
Iteration  449 /   4097 (elapsed time:  16685s,  278mn  

In [None]:
help(dico)

In [None]:
print('size of dictionary = (number of filters, size of imagelets) = ', dico.dictionary.shape)
print('average of filters = ',  dico.dictionary.mean(axis=1).mean(), 
      '+/-',  dico.dictionary.mean(axis=1).std())
SE = np.sqrt(np.sum(dico.dictionary**2, axis=1))
print('average energy of filters = ', SE.mean(), '+/-', SE.std())

## coding

The learning itself is done via gradient descent but is highly dependent on the coding / decoding algorithm. Coding is handled by another function (in the [shl_encode.py](https://github.com/bicv/SparseHebbianLearning/blob/master/shl_scripts/shl_encode.py) script):
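
The learning log above reports `algo = mp`, which we take to mean matching pursuit. To illustrate what such a greedy coder does, here is a minimal NumPy sketch. It is not the code in `shl_encode.py`: the function name `matching_pursuit`, its `alpha` argument (loosely modelled on the `alpha_MP` parameter listed in the help above, whose exact semantics in the package we do not assume) and the random toy data are all assumptions made for illustration.

```python
import numpy as np


def matching_pursuit(X, D, l0_sparseness=5, alpha=1.0):
    """Greedy matching pursuit (illustrative sketch, not the package's code).

    X : (n_samples, n_pixels) array of patches.
    D : (n_atoms, n_pixels) dictionary whose rows have unit norm.
    Returns codes of shape (n_samples, n_atoms) with at most
    `l0_sparseness` non-zero coefficients per row.
    """
    n_samples = X.shape[0]
    rows = np.arange(n_samples)
    codes = np.zeros((n_samples, D.shape[0]))
    residual = X.astype(float).copy()
    for _ in range(l0_sparseness):
        corr = residual @ D.T                    # match of each residual with each atom
        best = np.abs(corr).argmax(axis=1)       # best-matching atom for each sample
        coeff = alpha * corr[rows, best]         # (possibly damped) coefficient
        codes[rows, best] += coeff               # accumulate it in the sparse code
        residual -= coeff[:, None] * D[best]     # remove the explained part
    return codes


# Toy data (hypothetical sizes, unrelated to the image database used here)
rng = np.random.default_rng(42)
D = rng.standard_normal((32, 16))
D /= np.linalg.norm(D, axis=1, keepdims=True)   # unit-norm atoms
X = rng.standard_normal((100, 16))

codes = matching_pursuit(X, D, l0_sparseness=5)
reconstruction = codes @ D
```

In this sketch each pass selects one atom per patch, so the number of passes directly bounds the number of active coefficients; the reconstruction is then a plain matrix product of the codes with the dictionary.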

In [None]:
sparse_code = shl.code(data, dico, matname=matname, l0_sparseness=45)
print('number of codes, size of codewords = ', sparse_code.shape)
print('average of codewords = ', sparse_code.mean())
print('average energy of codewords = ', sparse_code.std(axis=0).mean())
print('std of the average coefficient of each dictionary element = ', sparse_code.mean(axis=0).std())

In [None]:
patches = sparse_code @ dico.dictionary
print('number of codes, size of reconstructed images = ', patches.shape)

In [None]:
error = data - patches
print('average of residual patches = ', error.mean(), '+/-', error.mean(axis=1).std())
SE = np.sqrt(np.sum(error**2, axis=1))
print('average energy of residual = ', SE.mean(), '+/-', SE.std())
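
The residual computed above is also the quantity that drives the gradient descent on the dictionary. A minimal sketch of one such step follows, assuming a plain gradient update on the squared reconstruction error followed by renormalization of each filter. The function name `hebbian_step` and the toy data are hypothetical, and the update in `shl_learn.py` may differ (for instance in its step-size schedule or its homeostasis); only the default value of `eta` is taken from the help output above.

```python
import numpy as np


def hebbian_step(D, X, codes, eta=0.015):
    """One gradient step on the dictionary (illustrative sketch).

    D     : (n_atoms, n_pixels) current dictionary.
    X     : (n_samples, n_pixels) patches.
    codes : (n_samples, n_atoms) sparse coefficients for X.
    Returns the updated dictionary with unit-norm rows.
    """
    residual = X - codes @ D                     # what the current dictionary misses
    grad = codes.T @ residual / X.shape[0]       # Hebbian term: coefficient x residual
    D_new = D + eta * grad                       # move atoms toward the residual they code
    norms = np.linalg.norm(D_new, axis=1, keepdims=True)
    return D_new / np.maximum(norms, 1e-12)      # keep every atom at unit norm


# Toy data (hypothetical sizes): random sparse codes and random patches
rng = np.random.default_rng(0)
D0 = rng.standard_normal((32, 16))
D0 /= np.linalg.norm(D0, axis=1, keepdims=True)
toy_codes = rng.standard_normal((200, 32)) * (rng.random((200, 32)) < 0.1)
toy_X = rng.standard_normal((200, 16))

D1 = hebbian_step(D0, toy_X, toy_codes)
```

The update is "Hebbian" in the sense that each filter changes in proportion to the product of its own activity (the coefficient) and the input it failed to explain (the residual).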

## Version used

In [None]:
%load_ext version_information
%version_information numpy, shl_scripts