## Implementing a neural network language model

### FNN architecture

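The architecture of the feedforward neural network (FNN) language model.

* $n$ is the context size: the model predicts a word from the $n-1$ words that precede it
* $m$ is the number of features associated with each word (e.g. with $m = 100$, each word is represented by a vector of size 100)
* $C$ is the word feature matrix, of size $|V|\times m$, where $|V|$ is the vocabulary size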

$$y = b + Wx + U\tanh(d + Hx)$$

Where:

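* $x = (C(w_{t-1}), C(w_{t-2}), \ldots, C(w_{t-n+1}))$ is the concatenation of the context word feature vectors, a vector of size $m\times(n-1)$
* $h$ is the number of hidden units
* $H$ is the weight matrix of the first dense (hidden) layer. $H$ has $h$ rows and $m\times(n-1)$ columns
* $d$ is the bias of the first dense layer, a vector of size $h$
* $U$ is the weight matrix of the second dense layer. $U$ has $|V|$ rows and $h$ columns
* $W$ is a dense matrix of direct connections from $x$ to the output, with $|V|$ rows and $m\times(n-1)$ columns **(can be equal to zero)**
* $b$ is the output bias, a vector of size $|V|$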
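The formula above can be sketched in NumPy as follows. This is only an illustration: the sizes `V`, `m`, `n`, `h`, the random initialization, and the helper name `fnn_scores` are assumptions chosen for the example, not values taken from this notebook, and $W$ is set to zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the notebook)
V, m, n, h = 50, 10, 4, 16

C = rng.standard_normal((V, m))            # word feature matrix, |V| x m
H = rng.standard_normal((h, m * (n - 1)))  # first dense layer weights
d = rng.standard_normal(h)                 # first dense layer bias
U = rng.standard_normal((V, h))            # second dense layer weights
W = np.zeros((V, m * (n - 1)))             # direct connections, here set to zero
b = rng.standard_normal(V)                 # output bias


def fnn_scores(context):
    """Return y = b + Wx + U tanh(d + Hx) for one context of n-1 word indices."""
    x = C[context, :].ravel()              # concatenated feature vectors, size m(n-1)
    return b + W @ x + U @ np.tanh(d + H @ x)


y = fnn_scores(np.array([4, 2, 10]))       # indices of (w_{t-1}, w_{t-2}, w_{t-3})
print(y.shape)                             # one score per word of the vocabulary
```

The vector `y` holds unnormalized scores; applying a softmax to it would give a probability for each word of the vocabulary.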


Total number of parameters:

$$|V|(1 + nm + h) + h(1 + (n - 1)m)$$
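As a sanity check, the count can be recovered by summing the sizes of $b$, $C$, $W$, $U$, $d$ and $H$. The sketch below does this for arbitrary example sizes; the function names `count_params` and `formula` and the values of `V`, `m`, `n`, `h` are illustrative assumptions.

```python
def count_params(V, m, n, h):
    """Sum the sizes of every parameter array of the model."""
    sizes = {
        "b": V,                  # output bias
        "C": V * m,              # word feature matrix
        "W": V * m * (n - 1),    # direct connections
        "U": V * h,              # second dense layer weights
        "d": h,                  # first dense layer bias
        "H": h * m * (n - 1),    # first dense layer weights
    }
    return sum(sizes.values())


def formula(V, m, n, h):
    """Closed-form count: |V|(1 + nm + h) + h(1 + (n - 1)m)."""
    return V * (1 + n * m + h) + h * (1 + (n - 1) * m)


# Example sizes (assumptions chosen for illustration)
V, m, n, h = 1000, 100, 4, 50
assert count_params(V, m, n, h) == formula(V, m, n, h)
print(count_params(V, m, n, h))
```

If $W$ is fixed to zero, the term `V * m * (n - 1)` drops out of the sum.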

### Input data

For $n=4$, each training example is a tuple of 4 word indices (3 context words and the word to predict):

$$D = [(2, 10, 3, 5), (8, 30, 2, 20), \ldots]$$
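One possible way to build such a dataset is to slide a window of $n$ consecutive word indices over an encoded text. The sketch below assumes exactly that; the helper `make_ngrams` and the list `corpus_ids` are hypothetical, and the notebook does not state how $D$ is actually constructed or which position in each tuple holds the target word.

```python
def make_ngrams(word_ids, n):
    """Return every window of n consecutive word indices as a tuple."""
    return [tuple(word_ids[i:i + n]) for i in range(len(word_ids) - n + 1)]


corpus_ids = [2, 10, 3, 5, 8, 30]   # hypothetical text, already encoded as word indices
D = make_ngrams(corpus_ids, n=4)
print(D)
```

A text of length $T$ yields $T - n + 1$ tuples, so a text shorter than $n$ words yields none.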

In [1]:
import csv
import itertools
import os
import re

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from tqdm import tqdm

from utility import text_preprocessing, create_unique_word_dict

%matplotlib inline

In [2]:
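np.random.seed(0)
m = 10      # number of features per word
sizeV = 5   # vocabulary size
C = np.random.randn(sizeV, m)
a = "ca|ca"
for i in range(len(a)):
    print(i)
a.replace(a[0], "")  # removes every occurrence of the first character, 'c'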

0
1
2
3
4


'a|a'

In [14]:
a="http://arxiv.org/abs/1303.6933v1|Hans Grauert (1930-2011)|Alan Huckleberry|math.HO|Hans Grauert died in September of 2011. This article reviews his life in mathematics and recalls some detail his major accomplishments.|2013-03-27T19:23:57Z|2013-03-27T19:23:57Z|math"

In [23]:
fields = a.split('|')
fields[1] + ' ' + fields[4]  # second field followed by fifth field

'Hans Grauert (1930-2011) Hans Grauert died in September of 2011. This article reviews his life in mathematics and recalls some detail his major accomplishments.'



In [None]:
print(C)

In [None]:
np.shape(C)

In [None]:
C[[2, 4, 3], :]

In [None]:
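X = [[1, 2, 4], [0, 3, 4]]  # two contexts of three word indices each
temp = C[X, :]              # shape (2, 3, m)
# Concatenate the feature vectors of each context into a single row
result = np.reshape(temp, (np.shape(X)[0], m * np.shape(X)[1]))
print(np.shape(result))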

In [None]:
np.random.seed(0)
C = np.random.randn(4, 10)

In [None]:
np.shape(np.ravel(C))

In [None]:
C.shape

In [None]:
X = np.array([[1, 2, 3],[0,2,1]])

In [None]:
X.shape

In [None]:
np.shape(np.reshape(C[X,:],(np.shape(X)[0],10*np.shape(X)[1])))

In [None]:
np.reshape((np.concatenate(C[X, :])),(2,30))

In [None]:
np.shape(np.concatenate(C[:, np.concatenate(X)]).reshape((X.shape[0], X.shape[1]*C.shape[0])))

In [None]:
np.shape(C[:, np.concatenate(X)])

In [None]:
class Project_and_concat():
    """
    Projection layer: select the rows of C for the context words and concatenate them.

    The input is a vector x = (w_{t-1}, w_{t-2}, ..., w_{t-n+1})
    For example, for n=4 the input vector x can be
    (4, 2, 10)
    where 4, 2 and 10 are the indexes of the corresponding words.
    """
    def __init__(self, nb_features, dict_size):  # V*m or m*V
        self.nb_features = nb_features
        self.dict_size = dict_size
        self.C = np.random.randn(dict_size, nb_features)
        self.nb_params = nb_features * dict_size  # number of parameters in the layer
        self.save_X = None  # input saved by forward() for use in backward()

    def set_params(self, params):
        # Set the layer parameters from a vector of size self.nb_params
        self.C = np.reshape(params, (self.dict_size, self.nb_features))

    def get_params(self):
        # Return a vector of size self.nb_params containing the layer parameters
        return np.ravel(self.C)

    def forward(self, X):
        # Forward pass; X is the vector of input word indices
        self.save_X = np.copy(X)
        return np.ravel(self.C[self.save_X, :])

    def backward(self, grad_sortie):
        # Backpropagate the gradient through the layer.
        # grad_sortie is the gradient at the layer output.
        # Returns:
        #   grad_local: vector of size self.nb_params, gradient w.r.t. the local parameters
        #   grad_entree: gradient w.r.t. the layer input
        grad_C = np.zeros_like(self.C)
        # Each selected row of C receives its slice of grad_sortie;
        # np.add.at accumulates correctly when a word index is repeated.
        np.add.at(grad_C, self.save_X, np.reshape(grad_sortie, (-1, self.nb_features)))
        grad_local = np.ravel(grad_C)
        grad_entree = None  # the input is integer word indices, so nothing is propagated further
        return grad_local, grad_entree

# This layer has 2 steps: selecting rows of C, then concatenating them.
# Open question: does the row selection enter into the computation of the input gradient?


In [None]:
A = np.array([[1, 2], [3, 4]])

In [None]:
A

In [None]:
np.dot(np.ones(4), np.concatenate(A))

In [None]:
a = np.array([[1, 2, 3], [4, 5, 6]])

a = np.exp(a)
s = np.sum(a, axis=1)  # one normalizing constant per row
print(a)
print(a.T)
print(s)
print(a.T / s)                  # row-wise softmax, stored transposed
print(np.sum(a.T / s, axis=0))  # each column sums to 1

In [None]:
a = np.array([[1, 2, 3], [4, 5, 6]])

print(a)

print(np.sum(a, axis=0))  # column sums

# Softmax along axis 0: each column of the result sums to 1
print(np.exp(a) / np.sum(np.exp(a), axis=0))

In [None]:
import Neuralword as Neur

print(np.shape(a))
print((Neur.ilogit(a)))

In [None]:
l=np.array([-5,-7,-10])
np.argmax(l)