# Hopfield Network With Hashing

The following code takes .wav sound files as input and transforms each of them into a vector of MFCCs (mel-frequency cepstral coefficients). Each vector is then turned into a hash, and the hashes are used to train a Hopfield network. In this way, each sound becomes a memory pattern that can be retrieved even if it is slightly corrupted.

The idea is that if memory works like a Hopfield network whose memory patterns are fixed points of the network, then noisy or partly corrupted sounds can still be retrieved. Hashing each sound also reduces its dimensionality.

In [1]:
# First, we load some dependencies.
import numpy as np
import math
from python_speech_features import mfcc
import scipy.io.wavfile as wav
import sys
import glob
import random

In [2]:
# Folder with some wav files to test this script.
folder = "./waveforms/"

First, we need to transform the sounds into a numerical format that can be hashed.

In [3]:
# Go through the folder and find all (and only) files ending with .wav
# Here, we transform each .wav file into MFCCs and then flatten them into one vector
# We do this because we want one hash per .wav file
#
# Arguments: sound folder
# Returns: a list of flattened MFCC vectors

def make_mfcc(folder):
    V = []
    # recursive=True only matters for patterns containing "**", so it is omitted
    for file in glob.glob(folder + "*.wav"):
        (rate,sig) = wav.read(file)
        mfcc_feat = mfcc(sig,rate)
        vect = mfcc_feat.flatten()
        V.append(vect)
    return V
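To make the flattening step concrete, here is a minimal sketch that uses random arrays as stand-ins for the output of `mfcc`, so that it runs without audio files. The 13 coefficients per frame are an assumption based on the default settings of `python_speech_features.mfcc`, and the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the MFCC matrices of two recordings of different duration:
# one row per frame, 13 coefficients per row (assumed default of mfcc).
short_clip = rng.normal(size=(40, 13))
long_clip = rng.normal(size=(95, 13))

# flatten() concatenates the rows into a single 1-D vector
short_vect = short_clip.flatten()
long_vect = long_clip.flatten()

print(short_vect.shape)  # (520,)
print(long_vect.shape)   # (1235,)
```

Recordings of different duration give vectors of different length, which is why the hashing step reads the length `d` from each vector instead of fixing it in advance.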

Next, we turn each sound (that is, its MFCC matrix flattened into one vector) into a hash.

In [4]:
# Transform a vector of speech into a hash
# The hash is a vector of length k*m
# We choose k random units of the vector,
# look for the highest value among them and turn it into 1.
# Everything else is 0.
# We do this m times and concatenate the m blocks,
# which gives a sparse vector of length k*m with exactly m ones.
# Note: the random index sets are redrawn on every call.


def get_hash(vector, k, m):
    d = len(vector)
    # m random index sets, each holding k distinct positions of the vector
    p = np.zeros((m, k), dtype=int)
    for i in range(m):
        p[i] = np.random.permutation(d)[:k]

    h = np.zeros((m, k))
    for i in range(m):
        # Winner-take-all over the vector values at the chosen positions
        # (not over the positions themselves)
        ix = np.argmax(vector[p[i]])
        hi = np.zeros(k)
        hi[ix] = 1
        h[i] = hi
    h = np.hstack(h)
    return h

## Test:
# V = make_mfcc(folder)
# get_hash(V[1], 5, 3)

# Principle
# - Algo: inputs of dimension d, params k, m (hash dim=k*m)
#   - pre-processing: 
#       p=[]; 
#       for i=1:m: 
#           p[i] = random_perm(d)[:k]
#   - getting hash for X: 
#       h = []
#       for i=1:m:
#         ix = argmax(X[p[i]])
#         hi = zeros(k)
#         hi[ix] = 1
#         h = h + hi
#   -> i.e. there is a local WTA on m sets of 
#   randomly chosen k-tuple of dims -> hash is of length mk with exactly m ones.
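The principle above separates a pre-processing step (drawing the index sets once) from the hashing step. The following is a minimal sketch of that split, not the notebook's own implementation: the names `make_index_sets` and `wta_hash` and the `seed` parameter are hypothetical, and the input is a random vector used purely for illustration.

```python
import numpy as np

def make_index_sets(d, k, m, seed=0):
    """Pre-processing: draw m random k-tuples of distinct indices in range(d)."""
    rng = np.random.default_rng(seed)
    return np.stack([rng.permutation(d)[:k] for _ in range(m)])

def wta_hash(x, index_sets):
    """Local winner-take-all on each index set; returns a 0/1 vector of length m*k."""
    x = np.asarray(x)
    m, k = index_sets.shape
    h = np.zeros((m, k))
    # Position of the largest value within each k-tuple
    winners = np.argmax(x[index_sets], axis=1)
    h[np.arange(m), winners] = 1
    return h.ravel()

d, k, m = 50, 5, 3
index_sets = make_index_sets(d, k, m)
x = np.random.default_rng(1).normal(size=d)

h = wta_hash(x, index_sets)
# An order-preserving change of x leaves every winner unchanged
h_scaled = wta_hash(3.0 * x + 2.0, index_sets)
```

Only the rank order within each k-tuple matters, so `h_scaled` equals `h`. Reusing the same `index_sets` for every input is what makes two hashes comparable with each other.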

A Hopfield network consists of a symmetric recurrent weight matrix that is trained on the memory patterns we want to store (here, the hash vectors). Training sets the weights so that each pattern becomes a fixed point of the network. To "retrieve" a memory pattern, we start from a possibly corrupted input and let the network settle into one of these fixed points.
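As an illustration of retrieval from a corrupted input, here is a minimal sketch of the classic Hopfield model with patterns of +1/-1 values, Hebbian outer-product weights and a sign update. This is a textbook variant chosen for clarity, not the 0/c formulation used in this notebook, and the names `train_hebbian` and `recall` are hypothetical.

```python
import numpy as np

def train_hebbian(patterns):
    """Symmetric weights from the sum of outer products, with a zero diagonal."""
    patterns = np.asarray(patterns, dtype=float)
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, state, max_iter=20):
    """Synchronous sign updates until the state stops changing."""
    state = np.asarray(state, dtype=float).copy()
    for _ in range(max_iter):
        new_state = np.where(W @ state >= 0, 1.0, -1.0)
        if np.array_equal(new_state, state):
            break
        state = new_state
    return state

# Two orthogonal patterns over 16 units
p1 = np.array([1.0] * 8 + [-1.0] * 8)
p2 = np.array([1.0, -1.0] * 8)
W = train_hebbian([p1, p2])

# Corrupt p1 by flipping two of its units, then let the network settle
noisy = p1.copy()
noisy[[0, 9]] *= -1
recovered = recall(W, noisy)

print(np.array_equal(recovered, p1))  # True
```

Both stored patterns are fixed points of the update, and the corrupted copy of `p1` falls back onto `p1` after a single step.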

In [5]:
# Function for the matrix M (symmetric recurrent weight matrix)
#
# Arguments: 
# lmbda (eigenvalue, spelled lmbda because lambda is a Python keyword),
# alpha (fraction of active units),
# c (constant value of active components, inactive have 0), 
# N (number of neurons, must equal the hash length k*m),
# V (list of vectors in a hash form)

def get_m(lmbda, alpha, c, N, V):
    
    # n is a vector of ones
    n = np.ones(N)

    # Sum of the outer products v v^T over all stored patterns (N x N).
    # Summing the vectors themselves would not give a symmetric matrix.
    vect_sum = np.zeros((N, N))
    for vect in V:
        vect_sum += np.outer(vect, vect)
    m = (lmbda / (pow(c,2)*alpha*N*(1-alpha))) \
        * vect_sum \
        - (np.outer(n,n) / (alpha*N))

    return m

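We need to determine what a fixed point of the network is. This is done by applying the weight matrix to a state over and over until the state no longer changes, within a tolerance.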

In [19]:
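# Function for the fixed point
# Memory pattern satisfies v_m = F(M * v_m) (i.e. is a fixed point)
# In the code below, the argument F is the weight matrix itself and the
# update is the linear map x -> F.dot(x), with no extra nonlinearity.

def convergence_criterion(x0, x1, tau):
    # Element-wise comparison of two arrays; math.isclose only accepts scalars
    return np.allclose(x0, x1, rtol=tau)

def fixed_point(F, x0, tau, max_iter=1000):
    # Iterate instead of recursing; max_iter guards against non-convergence
    for _ in range(max_iter):
        x1 = F.dot(x0)
        if convergence_criterion(x0, x1, tau):
            return x1
        x0 = x1
    return x0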
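Fixed-point iteration is easiest to check on a case where the answer is known in closed form. The sketch below is a generic example, not the network from this notebook: the helper `iterate_to_fixed_point`, the matrix `A` and the vector `b` are all illustrative assumptions. It iterates an affine map and compares the result with the exact solution of (I - A) x = b.

```python
import numpy as np

def iterate_to_fixed_point(step, x0, tol=1e-10, max_iter=10_000):
    """Apply `step` repeatedly until successive states differ by less than tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_next = step(x)
        if np.linalg.norm(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter steps")

# Affine map x -> A x + b. The eigenvalues of A are below 1 in magnitude,
# so the iteration contracts towards a unique fixed point.
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
b = np.array([1.0, 2.0])

x_star = iterate_to_fixed_point(lambda x: A @ x + b, np.zeros(2))
x_exact = np.linalg.solve(np.eye(2) - A, b)  # closed-form fixed point

print(np.allclose(x_star, x_exact))  # True
```

Convergence here depends on the map being a contraction. A purely linear update x -> M x has no such guarantee in general, which is why a cap on the number of iterations is a sensible safeguard.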

In [20]:
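# Test:

k = 5
m = 3
lmbda = 0.1
alpha = 0.6
c = 1
N = k * m  # one neuron per hash component, so the matrix matches the hash length

mfccs_vectors = make_mfcc(folder)
# Hash every sound first, then train a single weight matrix on all hashes
V = [get_hash(v, k, m) for v in mfccs_vectors]
M = get_m(lmbda, alpha, c, N, V)  # named M so that the hash parameter m is not overwritten
for h in V:
    print("hash: ", h)
    f = fixed_point(M, h, 0.001)
#     print("matrix M", M)
    print(f)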



Recorded output of a run with N = 10:

hash:  [0. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0.]
(10, 10)
(15,)


ValueError: shapes (10,10) and (15,) not aligned: 10 (dim 1) != 15 (dim 0)
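The ValueError above is a size mismatch: the weight matrix was built for N = 10 neurons, while a hash has k*m = 5*3 = 15 components. The following minimal sketch reproduces the mismatch and shows that setting N = k*m removes it. The zero matrices are placeholders, since only the shapes matter here.

```python
import numpy as np

k, m = 5, 3
hash_len = k * m                      # length of one hash vector: 15

h = np.zeros(hash_len)
h[[1, 7, 12]] = 1                     # the hash printed in the output above

# A 10 x 10 matrix cannot multiply a vector of length 15
try:
    np.zeros((10, 10)).dot(h)
    mismatch = False
except ValueError:
    mismatch = True

# With N = k * m the product is defined and keeps the hash length
N = hash_len
W = np.zeros((N, N))                  # placeholder weights, only the shape matters
out = W.dot(h)

print(mismatch, out.shape)  # True (15,)
```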