# RNN

## Goal

What's the plan, Armand?


## Data

Two datasets are provided. The first contains temperature readings from 31 cities across the United States and Canada. It can be used for sequence classification (many-to-one), for example predicting the city from a temperature sequence, or for forecasting, by predicting the temperature at $t+1$. The second is a dataset of Trump speeches, which is mainly useful for forecasting.
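To make the forecasting setup concrete, here is a minimal sketch of how a single temperature series could be cut into (input window, next value) pairs. The helper name `make_forecasting_pairs` and the toy series are assumptions for illustration only; they are not part of the notebook's code.

```python
import numpy as np

def make_forecasting_pairs(series, seq_len):
    """Split a 1-D series into (window, next value) pairs (hypothetical helper)."""
    series = np.asarray(series, dtype=float)
    inputs, targets = [], []
    # each window of seq_len values is paired with the value right after it
    for start in range(len(series) - seq_len):
        inputs.append(series[start:start + seq_len])
        targets.append(series[start + seq_len])
    return np.array(inputs), np.array(targets)

# toy series standing in for one city's temperatures
toy_series = np.arange(6.0)
X, y = make_forecasting_pairs(toy_series, seq_len=3)
print(X.shape, y)  # (3, 3) [3. 4. 5.]
```

For the classification task, the target would instead be the index of the city the window was drawn from.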

In [1]:
import numpy as np
import torch
import torch.nn as nn

import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline
%load_ext autoreload
%autoreload 2

import models

In [85]:
temperatures_csv = pd.read_csv("data/tempAMAL_train.csv")
print("Nb exemples: {}, cities: {}".format(temperatures_csv.shape[0], temperatures_csv.shape[1]))

temperatures_csv.head(5)

Nb exemples: 11115, cities: 31


| Unnamed: 0 | datetime | Vancouver | Portland | San Francisco | Seattle | Los Angeles | San Diego | Las Vegas | Phoenix | Albuquerque | ... | Detroit | Jacksonville | Charlotte | Miami | Pittsburgh | Toronto | Philadelphia | New York | Montreal | Boston |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2012-10-01 13:00:00 | 284.63 | 282.08 | 289.48 | 281.8 | 291.87 | 291.53 | 293.41 | 296.6 | 285.12 | ... | 284.03 | 298.17 | 288.65 | 299.72 | 281.0 | 286.26 | 285.63 | 288.22 | 285.83 | 287.17 |
| 1 | 2012-10-01 14:00:00 | 284.629041 | 282.083252 | 289.474993 | 281.797217 | 291.868186 | 291.533501 | 293.403141 | 296.608509 | 285.154558 | ... | 284.069789 | 298.20523 | 288.650172 | 299.732518 | 281.024767 | 286.262541 | 285.663208 | 288.247676 | 285.83465 | 287.186092 |
| 2 | 2012-10-01 15:00:00 | 284.626998 | 282.091866 | 289.460618 | 281.789833 | 291.862844 | 291.543355 | 293.392177 | 296.631487 | 285.233952 | ... | 284.173965 | 298.299595 | 288.650582 | 299.766579 | 281.088319 | 286.269518 | 285.756824 | 288.32694 | 285.84779 | 287.231672 |
| 3 | 2012-10-01 16:00:00 | 284.624955 | 282.100481 | 289.446243 | 281.782449 | 291.857503 | 291.553209 | 293.381213 | 296.654466 | 285.313345 | ... | 284.27814 | 298.393961 | 288.650991 | 299.800641 | 281.15187 | 286.276496 | 285.85044 | 288.406203 | 285.860929 | 287.277251 |
| 4 | 2012-10-01 17:00:00 | 284.622911 | 282.109095 | 289.431869 | 281.775065 | 291.852162 | 291.563063 | 293.370249 | 296.677445 | 285.392738 | ... | 284.382316 | 298.488326 | 288.651401 | 299.834703 | 281.215421 | 286.283473 | 285.944057 | 288.485467 | 285.874069 | 287.322831 |


### Building the dataset

We will build a dataset and an associated DataLoader that hold our sequences and the corresponding batches. We proceed as follows:

- We randomly sample 10 cities


In [86]:
from sklearn.utils import shuffle

# only temperature columns are valid cities: exclude the index and timestamp columns if present
city_columns = temperatures_csv.columns.drop(["Unnamed: 0", "datetime"], errors="ignore")
cities = np.random.choice(city_columns, 10, replace=False)

seq_len = 20 # use a fixed sequence length for now

def sample_sequence(city):
    """Return a random window of seq_len consecutive readings for one city."""
    start = np.random.randint(0, temperatures_csv.shape[0] - seq_len)
    return np.array(temperatures_csv[city][start:start + seq_len])

sequences = []
labels = []

# 300 sequences per city, labelled with the city's index in `cities`
for i, city in enumerate(cities):
    for j in range(300):
        sequences.append(sample_sequence(city))
        labels.append(i)

sequences = np.array(sequences)
labels = np.array(labels)

sequences -= 273.15 # convert Kelvin to degrees Celsius

# shuffle sequences and labels together
sequences, labels = shuffle(sequences, labels, random_state=1997)

sequences = np.expand_dims(sequences, axis=2) # add a trailing feature dimension of size 1

assert sequences.shape[0] == labels.shape[0]
print(sequences.shape, labels.shape)

(3000, 20, 1) (3000,)


The RNN we implemented expects inputs of shape sequence_length x batch x dim. Since our sequences have length 20 and our data is one-dimensional, the input has shape 20 x batch x 1.

We therefore swap dimensions 0 and 1 of our sequence array:

In [88]:
print(sequences.shape)
sequences = np.swapaxes(sequences,0,1)
print(sequences.shape)

(3000, 20, 1)
(20, 3000, 1)
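The small sketch below illustrates why the axes are swapped rather than reshaped: both give the same shape, but only `np.swapaxes` keeps each sequence's time steps together. The toy array and its variable names are made up for illustration.

```python
import numpy as np

# toy array in batch-first layout: (batch=3, seq_len=2, dim=1)
batch_first = np.arange(6).reshape(3, 2, 1)

# time-first layout: (seq_len=2, batch=3, dim=1)
time_first = np.swapaxes(batch_first, 0, 1)

# a plain reshape yields the same shape but mixes values from different sequences
reshaped = batch_first.reshape(2, 3, 1)

# sequence number 1 was originally [2, 3]
print(time_first[:, 1, 0])  # [2 3]
print(reshaped[:, 1, 0])    # [1 4]
```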


## Model and experiments

In [89]:
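class Many2oneRNN(torch.nn.Module):
    """Many-to-one classifier: an RNN encoder followed by a decoder on the last hidden state."""
    def __init__(self, dim, latent, nbClass):
        super(Many2oneRNN, self).__init__()
        self.rnn = models.RNN(dim, latent)
        self.decoder = models.Decoder(latent, nbClass, layers=[8])

    def forward(self, x):
        """Map x of shape (seq_len, batch, dim) to class scores of shape (batch, nbClass)."""
        # keep only the hidden state of the last time step
        hT = self.rnn(x)[-1]
        return self.decoder(hT)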
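
The code of `models.RNN` is not shown in this notebook. As a rough illustration of what `self.rnn(x)[-1]` selects, here is a minimal NumPy sketch that assumes an Elman-style tanh recurrence returning one hidden state per time step. The function `elman_rnn_forward` and its weight names are hypothetical and may differ from the actual implementation.

```python
import numpy as np

def elman_rnn_forward(x, w_in, w_rec, bias):
    """Hidden states of a tanh RNN, shape (seq_len, batch, latent). Hypothetical sketch."""
    seq_len, batch, _ = x.shape
    latent = w_rec.shape[0]
    h = np.zeros((batch, latent))  # initial hidden state
    states = []
    for t in range(seq_len):
        # x[t] has shape (batch, dim): one time step for the whole batch
        h = np.tanh(x[t] @ w_in + h @ w_rec + bias)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 4, 1))          # (seq_len, batch, dim)
w_in = rng.normal(size=(1, 10)) * 0.1    # input-to-hidden weights
w_rec = rng.normal(size=(10, 10)) * 0.1  # hidden-to-hidden weights
bias = np.zeros(10)

hidden_states = elman_rnn_forward(x, w_in, w_rec, bias)
last_state = hidden_states[-1]           # what a many-to-one model would decode
print(hidden_states.shape, last_state.shape)  # (20, 4, 10) (4, 10)
```

Indexing with `[-1]` on the first axis only makes sense with the time-first layout, which is why the sequences are swapped to sequence_length x batch x dim beforehand.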

In [90]:
latentdim = 10
inputdim = 1

sequence_tensor = torch.from_numpy(sequences).float()
label_tensor = torch.from_numpy(labels).long() # CrossEntropyLoss expects int64 class indices

rnn = Many2oneRNN(inputdim, latentdim, len(cities))

criterion = nn.CrossEntropyLoss() # takes integer class indices, no one-hot encoding needed
optimizer = torch.optim.Adam(rnn.parameters(), lr=1e-3)

epochs = 50

for e in range(epochs):
    
    optimizer.zero_grad()
    
    preds = rnn(sequence_tensor)
    print(preds.size())
    loss = criterion(preds, label_tensor)
    loss.backward()
    optimizer.step()
    
    print("epoch {} training loss {}".format(e, loss.item()))

torch.Size([3000, 10])
epoch 0 training loss nan
torch.Size([3000, 10])
epoch 1 training loss nan
torch.Size([3000, 10])
epoch 2 training loss nan
torch.Size([3000, 10])
epoch 3 training loss nan
torch.Size([3000, 10])
epoch 4 training loss nan
torch.Size([3000, 10])
epoch 5 training loss nan


KeyboardInterrupt: 
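The loss is `nan` from the very first epoch. One possible cause, not confirmed here, is missing values in the sampled temperature windows: a single `nan` in the input propagates through the RNN to the loss. The sketch below shows one way to check for this and drop the affected sequences while the array is still in batch-first layout, that is before the axes are swapped. The helper `drop_nan_sequences` and the toy data are assumptions for illustration.

```python
import numpy as np

def drop_nan_sequences(sequences, labels):
    """Keep only the sequences (first axis) that contain no missing value."""
    # flatten everything but the first axis, so (n, seq_len) and (n, seq_len, 1) both work
    has_nan = np.isnan(sequences).reshape(len(sequences), -1).any(axis=1)
    return sequences[~has_nan], labels[~has_nan]

# toy data: the second sequence contains a missing reading
toy_sequences = np.array([[1.0, 2.0, 3.0],
                          [4.0, np.nan, 6.0],
                          [7.0, 8.0, 9.0]])
toy_labels = np.array([0, 1, 2])

clean_sequences, clean_labels = drop_nan_sequences(toy_sequences, toy_labels)
print(clean_sequences.shape, clean_labels)  # (2, 3) [0 2]
```

If no `nan` is found in the inputs, other usual suspects would be unnormalized inputs or exploding activations, which this check does not cover.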

In [50]:
preds = rnn(torch.from_numpy(sequences).float()) # `sequences` is the array built above