# MLP Coding Example

Connect to the remote server over ssh. Create a virtualenv (the commands below come from virtualenvwrapper) with:
```
mkvirtualenv <your-env-name>
```
Activate your virtualenv with:
```
workon <your-env-name>
```
Install Jupyter:
```
pip install jupyter
```
In the folder that contains setup.py, install the project package:
```
pip install -e .
```
Command to serve Jupyter in the background:
```
nohup jupyter notebook --no-browser &
```
You may need a token. If so, look it up with:
```
jupyter notebook list
```




On your local machine, forward the appropriate port:
```
ssh -NfL localhost:<local-port>:localhost:<remote-port> <your-user>@<server-ip>
```
Usually:
```
ssh -NfL localhost:8888:localhost:8888 <your-user>@<server-ip>
```
Open localhost:8888 in your browser. To shut Jupyter down, click Quit on the localhost:8888 page, then free the local port with:
```
lsof -ti:8888 | xargs kill -9
```
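
Before opening the browser, it can help to confirm that something is actually listening on the forwarded port. The sketch below is one way to check this from Python using only the standard library. `port_is_open` is a hypothetical helper name, and port 8888 is simply the usual value assumed above.

```python
import socket

def port_is_open(host="localhost", port=8888, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False

# Check the default local end of the tunnel.
print("listening" if port_is_open() else "nothing is listening on localhost:8888")
```

If this reports that nothing is listening, the ssh tunnel is probably not running or was started with a different local port.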

## Imports

In [1]:
import numpy as np
from tqdm.notebook import tqdm
from perceptronac.context_training import context_training
from perceptronac.context_coding import context_coding
from perceptronac.perfect_AC import perfect_AC
import torch

## Generating correlated random data (replace with your own data)

In [2]:
# parameters
L = 100000  # number of samples in each half (training / coding)
N = 7  # order of the AR
# Np = N  # number of parameters to estimate

C0 = np.random.rand(1, 1)  # bias of the generating model
C = np.random.rand(N, 1)  # weights of the generating model

X = 2 * (np.random.rand(2*L, N) > 0.5) - 1  # correlated (context) signals in {-1, +1}

X = (X > 0).astype(int)  # map the contexts to {0, 1}

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

p = sigmoid(C0 + X @ C)
yy = (np.random.rand(2*L, 1) > (1 - p)).astype(int)  # signal: yy = 1 with probability p
yt = yy[0:L] > 0  # train on the first part
yc = yy[L:L+L] > 0  # encode the second part
Xt = X[0:L, 0:N]  # truncated X for training
Xc = X[L:L+L, 0:N]  # truncated X for coding
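
Because each label is drawn as a Bernoulli variable with probability `sigmoid(C0 + X @ C)` and every one of the 2^N binary contexts is equally likely, the best achievable average code length of this generator can be computed directly. The following is a minimal sketch of that calculation; `binary_entropy` and `generator_entropy` are hypothetical helper names, not part of `perceptronac`.

```python
import itertools
import numpy as np

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) source (p is clipped to avoid log2(0))."""
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def generator_entropy(C0, C):
    """Mean of H(sigmoid(C0 + x @ C)) over all 2^N equally likely binary contexts x."""
    C = np.asarray(C, dtype=float).reshape(-1)
    contexts = np.array(list(itertools.product([0, 1], repeat=C.size)))
    p = 1 / (1 + np.exp(-(float(np.squeeze(C0)) + contexts @ C)))
    return float(np.mean(binary_entropy(p)))

# With all-zero parameters every label is a fair coin: exactly 1 bit per symbol.
print(generator_entropy(0.0, np.zeros(7)))
```

Calling `generator_entropy(C0, C)` with the parameters drawn above gives a reference value that the measured code lengths can be compared against.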

## Entropia dos dados

In [3]:
# training set
perfect_AC(yt,context_coding(Xt,context_training(Xt,yt)))

0.6231081471345179

In [4]:
# test set
perfect_AC(yc,context_coding(Xc,context_training(Xc,yc)))

0.6211330550586928
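
The two cells above estimate the entropy by fitting a context table and measuring the code length an ideal arithmetic coder would achieve with it. The sketch below shows what such a calculation can look like in plain NumPy. It assumes that `context_training` gathers per-context statistics, `context_coding` turns them into probabilities, and `perfect_AC` averages `-log2` of the probability assigned to each symbol. That is a reading of how the functions are used here, not their actual implementation, and `ideal_code_length` and `context_frequencies` are hypothetical names.

```python
import numpy as np

def ideal_code_length(y, p):
    """Average bits per symbol for binary y when the coder is told P(y=1) = p."""
    y = np.asarray(y).reshape(-1).astype(bool)
    p = np.clip(np.asarray(p, dtype=float).reshape(-1), 1e-12, 1 - 1e-12)
    return float(np.mean(-np.log2(np.where(y, p, 1 - p))))

def context_frequencies(X, y):
    """For every row of binary X, the fraction of y=1 among rows sharing that context."""
    X = np.asarray(X).astype(np.int64)
    y = np.asarray(y).reshape(-1).astype(float)
    index = X @ (2 ** np.arange(X.shape[1], dtype=np.int64))  # context -> integer id
    n_contexts = 2 ** X.shape[1]
    ones = np.bincount(index, weights=y, minlength=n_contexts)
    total = np.bincount(index, minlength=n_contexts)
    return (ones / np.maximum(total, 1))[index]

# Tiny example: context 0 is a fair coin, context 1 always produces a 1.
X_demo = np.array([[0], [0], [1], [1]])
y_demo = np.array([0, 1, 1, 1])
print(ideal_code_length(y_demo, context_frequencies(X_demo, y_demo)))
```

Since the table is fitted on the same data it is evaluated on, an estimate of this kind is slightly optimistic, which is worth keeping in mind when comparing it with the trained model.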

## Training a Model in PyTorch with Batch Gradient Descent (when all the data fits in GPU memory at once)

In [5]:
import torch

In [6]:
class Perceptron(torch.nn.Module):
    def __init__(self,N):
        super().__init__()
        self.linear = torch.nn.Linear(N, 1)
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x = self.linear(x)
        x = self.sigmoid(x)
        return x

In [7]:
class Log2BCELoss(torch.nn.Module):
    def __init__(self,*args,**kwargs):
        super().__init__()
        self.bce_loss = torch.nn.BCELoss(*args,**kwargs)

    def forward(self, pred, target):
        return self.bce_loss(pred, target)/torch.log(torch.tensor(2,dtype=target.dtype,device=target.device))
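
The division by `torch.log(2)` in `Log2BCELoss` converts the loss from nats to bits, since `torch.nn.BCELoss` uses the natural logarithm. Written out for `reduction="sum"`, with $p_i$ the network output and $y_i$ the target for sample $i$ (this notation is introduced here for illustration):

```latex
\begin{aligned}
\mathrm{BCE}_{\text{sum}}
  &= -\sum_{i=1}^{L} \bigl[\, y_i \ln p_i + (1 - y_i) \ln (1 - p_i) \,\bigr]
  && \text{(nats)} \\
\frac{\mathrm{BCE}_{\text{sum}}}{\ln 2}
  &= -\sum_{i=1}^{L} \bigl[\, y_i \log_2 p_i + (1 - y_i) \log_2 (1 - p_i) \,\bigr]
  && \text{(bits)}
\end{aligned}
```

Dividing the second quantity by the number of samples gives the average code length in bits per symbol, which is the figure the training loops record.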

In [8]:
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self,X,y):
        self.X = X
        self.y = y
    def __len__(self):
        return len(self.y)
    def __getitem__(self,idx):
        return self.X[idx,:],self.y[idx,:]

In [9]:
net = Perceptron(N)

In [10]:
net = net.cuda()

In [11]:
trainset = CustomDataset(Xt,yt)
validset = CustomDataset(Xc,yc)

In [12]:
criterion = Log2BCELoss(reduction="sum")
optimizer = torch.optim.SGD(net.parameters(), lr=0.00001)

In [13]:
train_loss = []
# move the whole training set to the GPU once instead of on every epoch
X_all = torch.tensor(trainset.X).float().cuda()
y_all = torch.tensor(trainset.y).view(-1,1).float().cuda()
for epoch in range(20000):
    optimizer.zero_grad()
    outputs = net(X_all)
    loss = criterion(outputs, y_all)
    loss.backward()
    optimizer.step()
    train_loss.append(loss.item()/len(trainset))
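
To make explicit what each pass of the loop above does, here is a NumPy sketch of full-batch gradient descent on the same summed loss. It assumes plain SGD without momentum, as configured above, and uses the fact that the gradient of the summed loss in bits with respect to the logits is `(p - y) / ln 2`. The function name, the zero initialization, and the small synthetic dataset are illustrative choices, not taken from the notebook.

```python
import numpy as np

def batch_gradient_descent(X, y, lr=1e-5, epochs=200):
    """Full-batch gradient descent on the summed binary cross-entropy, measured in bits."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1)
    w = np.zeros(X.shape[1])  # assumption: zero init (torch.nn.Linear initializes randomly)
    b = 0.0
    history = []
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-(X @ w + b)))            # forward pass
        p_safe = np.clip(p, 1e-12, 1 - 1e-12)
        bits = -(y * np.log2(p_safe) + (1 - y) * np.log2(1 - p_safe))
        history.append(bits.sum() / len(y))           # average bits per symbol
        grad_logit = (p - y) / np.log(2)              # d(sum of bits) / d(logit)
        w -= lr * (X.T @ grad_logit)
        b -= lr * grad_logit.sum()
    return w, b, history

# Small synthetic problem (made-up coefficients) so the sketch runs quickly.
rng = np.random.default_rng(0)
X_small = rng.integers(0, 2, size=(2000, 3))
p_true = 1 / (1 + np.exp(-(0.1 + X_small @ np.array([1.0, -0.5, 0.3]))))
y_small = rng.random(2000) < p_true
w, b, history = batch_gradient_descent(X_small, y_small, lr=1e-4, epochs=500)
print(history[0], history[-1])
```

Because the loss is summed rather than averaged, the effective step size grows with the number of samples, which is one reason the learning rate in the notebook is as small as `0.00001`.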

In [14]:
print(f"""final average code length on the training dataset: {train_loss[-1]}
(compare with the entropy of the training dataset).""")

final average code length on the training dataset: 0.62384859375
(compare with the entropy of the training dataset).


### The learned weights are approximately the parameters used to generate the data

In [15]:
for param in net.parameters():
    print(param.data)

tensor([[0.9022, 0.6098, 0.7072, 0.0310, 0.1403, 0.0376, 0.9343]],
       device='cuda:0')
tensor([0.0158], device='cuda:0')


In [16]:
C.T, C0

(array([[0.89106724, 0.62633137, 0.70989337, 0.03960594, 0.1334368 ,
         0.03148984, 0.92488601]]),
 array([[0.01856426]]))
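
The comparison can also be made numerically. The sketch below copies the printed values from the two cells above and reports the largest absolute difference; the variable names are chosen here for illustration.

```python
import numpy as np

# Values copied from the printed outputs above.
learned_w = np.array([0.9022, 0.6098, 0.7072, 0.0310, 0.1403, 0.0376, 0.9343])
learned_b = 0.0158
true_C = np.array([0.89106724, 0.62633137, 0.70989337, 0.03960594,
                   0.1334368, 0.03148984, 0.92488601])
true_C0 = 0.01856426

max_weight_error = np.max(np.abs(learned_w - true_C))
bias_error = abs(learned_b - true_C0)
print(max_weight_error, bias_error)
```

For these printed values, every learned weight lies within 0.02 of the coefficient that generated the data.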

## Training a Model in PyTorch with Stochastic Gradient Descent (one chunk of the data in GPU memory at a time)

In [17]:
net = Perceptron(N)

In [18]:
net = net.cuda()

In [19]:
trainset = CustomDataset(Xt,yt)
validset = CustomDataset(Xc,yc)

In [20]:
criterion = Log2BCELoss(reduction="sum")
optimizer = torch.optim.SGD(net.parameters(), lr=0.00001)

In [21]:

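# equal to L here, so each training epoch is a single batch;
# lower it when the data does not fit in GPU memory at once
batch_size = 100000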

train_loss, valid_loss = [], []

for epoch in range(200):  # loop over the dataset multiple times

    for phase in ['train','valid']:

        if phase == 'train':
            net.train(True)
            dataloader = torch.utils.data.DataLoader(
                trainset,batch_size=batch_size,shuffle=True,num_workers=2)
        else:
            net.train(False)
            dataloader=torch.utils.data.DataLoader(
                validset,batch_size=batch_size,shuffle=False,num_workers=2)
            
        running_loss = 0.0
        for data in dataloader: #tqdm(dataloader):
            
            X_b,y_b= data
            X_b = X_b.float().cuda()
            y_b = y_b.float().cuda()
            
            if phase == 'train':
                optimizer.zero_grad()
                outputs = net(X_b)
                loss = criterion(outputs, y_b)
                loss.backward()
                optimizer.step()
            else:
                with torch.no_grad():
                    outputs = net(X_b)
                    loss = criterion(outputs, y_b)

            running_loss += loss.item()

        final_loss = running_loss / len(dataloader.dataset)
        if phase=='train':
            train_loss.append(final_loss)
        else:
            valid_loss.append(final_loss)
            
        print("epoch :" , epoch, ", phase :", phase, ", loss :", final_loss)

print('Finished Training')

epoch : 0 , phase : train , loss : 0.976436171875
epoch : 0 , phase : valid , loss : 0.65307
epoch : 1 , phase : train , loss : 0.65563890625
epoch : 1 , phase : valid , loss : 0.644643125
epoch : 2 , phase : train , loss : 0.647405546875
epoch : 2 , phase : valid , loss : 0.641886796875
epoch : 3 , phase : train , loss : 0.644700078125
epoch : 3 , phase : valid , loss : 0.640156015625
epoch : 4 , phase : train , loss : 0.64296703125
epoch : 4 , phase : valid , loss : 0.638736953125
epoch : 5 , phase : train , loss : 0.6415239453125
epoch : 5 , phase : valid , loss : 0.6374794921875
epoch : 6 , phase : train , loss : 0.6402335546875
epoch : 6 , phase : valid , loss : 0.6363410546875
epoch : 7 , phase : train , loss : 0.639059140625
epoch : 7 , phase : valid , loss : 0.6353037109375
epoch : 8 , phase : train , loss : 0.6379853125
epoch : 8 , phase : valid , loss : 0.6343559375
epoch : 9 , phase : train , loss : 0.6370019140625
epoch : 9 , phase : valid , loss : 0.6334887890625
...

epoch : 80 , phase : valid , loss : 0.6224168359375
epoch : 81 , phase : train , loss : 0.624288984375
epoch : 81 , phase : valid , loss : 0.6224030859375
epoch : 82 , phase : train , loss : 0.624272890625
epoch : 82 , phase : valid , loss : 0.622389921875
epoch : 83 , phase : train , loss : 0.6242574609375
epoch : 83 , phase : valid , loss : 0.6223772265625
epoch : 84 , phase : train , loss : 0.62424265625
epoch : 84 , phase : valid , loss : 0.6223649609375
epoch : 85 , phase : train , loss : 0.6242283203125
epoch : 85 , phase : valid , loss : 0.62235328125
epoch : 86 , phase : train , loss : 0.62421453125
epoch : 86 , phase : valid , loss : 0.6223419921875
epoch : 87 , phase : train , loss : 0.6242012890625
epoch : 87 , phase : valid , loss : 0.622331171875
epoch : 88 , phase : train , loss : 0.6241884765625
epoch : 88 , phase : valid , loss : 0.6223208203125
epoch : 89 , phase : train , loss : 0.6241762109375
epoch : 89 , phase : valid , loss : 0.622310859375
...

epoch : 160 , phase : train , loss : 0.6238734765625
epoch : 160 , phase : valid , loss : 0.622083125
epoch : 161 , phase : train , loss : 0.6238726171875
epoch : 161 , phase : valid , loss : 0.62208265625
epoch : 162 , phase : train , loss : 0.62387171875
epoch : 162 , phase : valid , loss : 0.6220822265625
epoch : 163 , phase : train , loss : 0.6238709375
epoch : 163 , phase : valid , loss : 0.622081796875
epoch : 164 , phase : train , loss : 0.62387015625
epoch : 164 , phase : valid , loss : 0.62208140625
epoch : 165 , phase : train , loss : 0.6238693359375
epoch : 165 , phase : valid , loss : 0.622081015625
epoch : 166 , phase : train , loss : 0.6238686328125
epoch : 166 , phase : valid , loss : 0.622080703125
epoch : 167 , phase : train , loss : 0.623867890625
epoch : 167 , phase : valid , loss : 0.6220803515625
epoch : 168 , phase : train , loss : 0.6238672265625
epoch : 168 , phase : valid , loss : 0.622080078125
epoch : 169 , phase : train , loss : 0.6238666015625
...
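
The loop reports `running_loss / len(dataloader.dataset)`, which works because the criterion uses `reduction="sum"`: per-batch sums add up to the dataset total regardless of the batch size. The sketch below illustrates this with made-up per-sample losses and a batch size that does not divide the dataset evenly. All values and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
per_sample_bits = rng.random(1050)  # stand-in for per-sample losses (made-up values)
batch_size = 100                    # the last batch holds only 50 samples

batches = [per_sample_bits[i:i + batch_size]
           for i in range(0, len(per_sample_bits), batch_size)]

# What the training loop does: accumulate per-batch sums, divide once by the dataset size.
running_loss = sum(batch.sum() for batch in batches)
average_from_sums = running_loss / len(per_sample_bits)

# What averaging per-batch means would give instead (the short last batch is over-weighted).
average_of_means = np.mean([batch.mean() for batch in batches])

print(average_from_sums, per_sample_bits.mean(), average_of_means)
```

The first two printed numbers agree up to floating-point rounding, so the reported loss is a true per-symbol average even when the last batch is smaller than the others.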

In [22]:
print(f"""final average code length on the training dataset: {train_loss[-1]}
(compare with the entropy of the training dataset).""")

print(f"""final average code length on the validation dataset: {valid_loss[-1]}
(compare with the entropy of the validation dataset).""")

final average code length on the training dataset: 0.6238546875
(compare with the entropy of the training dataset).
final average code length on the validation dataset: 0.622076015625
(compare with the entropy of the validation dataset).
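
The suggested comparison can be done in a couple of lines. The sketch below copies the entropy estimates and final code lengths printed in this notebook and reports the gap between them; the variable names are chosen here for illustration.

```python
# Values copied from the outputs printed in this notebook (bits per symbol).
train_entropy, valid_entropy = 0.6231081471345179, 0.6211330550586928
train_code_length, valid_code_length = 0.6238546875, 0.622076015625

train_gap = train_code_length - train_entropy
valid_gap = valid_code_length - valid_entropy
print(f"train gap: {train_gap:.6f} bits/symbol ({100 * train_gap / train_entropy:.3f}%)")
print(f"valid gap: {valid_gap:.6f} bits/symbol ({100 * valid_gap / valid_entropy:.3f}%)")
```

Both gaps are below 0.001 bits per symbol, so the perceptron codes the data almost as compactly as the entropy estimates allow.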


### The learned weights are approximately the parameters used to generate the data

In [23]:
for param in net.parameters():
    print(param.data)

tensor([[0.8959, 0.6037, 0.7009, 0.0253, 0.1345, 0.0318, 0.9280]],
       device='cuda:0')
tensor([0.0358], device='cuda:0')


In [24]:
C.T, C0

(array([[0.89106724, 0.62633137, 0.70989337, 0.03960594, 0.1334368 ,
         0.03148984, 0.92488601]]),
 array([[0.01856426]]))