# Horoscope generation using Temporal Convolution Networks

The goal of this notebook is to implement a horoscope generator based on neural networks.

More specifically, the architecture used is a Temporal Convolution Network (TCN) based on the research paper ["An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling"](https://arxiv.org/abs/1803.01271). This architecture is fully convolutional and can therefore take arbitrary-length sequences as input. The main idea of the authors of this paper is to increase the receptive field of each successive layer using [dilated convolutions](https://github.com/vdumoulin/conv_arithmetic).

Most of the TCN code is taken from [the implementation](https://github.com/locuslab/TCN) linked in the paper.

The data used here are all the horoscopes published on [beliefnet.com](beliefnet.com) for the year 2011.

The network takes a sequence of `window_size` characters from a horoscope as input and outputs a sequence of `window_size` characters. The target that we use for training is the slice of the horoscope corresponding to the input slid one step to the right, which effectively asks the network what the next character should be at every position.
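
The input/target construction described above can be illustrated on a toy string. This is a minimal sketch: the text and the window size of 4 are chosen for illustration only, and the slicing mirrors the one used in the data loading code of this notebook.

```python
# Toy text and window size, for illustration only.
text        = "hello world"
window_size = 4

# Each input is a window of the text; its target is the same window
# slid one character to the right.
pairs = [(text[i : i + window_size], text[i + 1 : i + window_size + 1])
         for i in range(len(text) - window_size - 1)]

for inp, tgt in pairs[:3]:
    print(repr(inp), '->', repr(tgt))
# 'hell' -> 'ello'
# 'ello' -> 'llo '
# 'llo ' -> 'lo w'
```

At every position, the target character is the character that follows the corresponding input character in the text.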

## Imports

In [1]:
import random
import math

import pandas as pd

from tqdm import tnrange, tqdm_notebook
from tqdm import tqdm

import torch
import torch.nn            as nn
import torch.nn.functional as F
import torch.optim         as optim
from torch.nn.utils import weight_norm
from torch.autograd import Variable

## Training parameters

In [2]:
cuda          = True
file_path     = '../data/horoscope_2011.csv'
window_size   = 150
batch_size    = 700
print_every   = 500
test_seq_size = 600
epochs        = 7

## Data loading code

In [3]:
def load_data(path, window_size):
    df               = pd.read_csv(path)
    text             = ' '.join(df.TEXT).lower()
    characters       = set(text)
    n_characters     = len(characters)
    idx_to_character = dict(enumerate(characters))
    character_to_idx = {character : idx for idx, character in idx_to_character.items()}
    # As explained above, we use a rolling window on the horoscope text to create our
    # training data. An input is the content of window and the corresponding target
    # is the content of the window slid one step to the right.
    data             = [(text[i : i + window_size], text[i + 1 : i + window_size + 1])
                        for i in range(len(text) - window_size - 1)]
    
    return n_characters, idx_to_character, character_to_idx, data

In [4]:
def one_hot_encode(inp_tensor, length):
    # The following lines convert a tensor containing character identifiers
    # into a one-hot encoding.
    inp_                = torch.unsqueeze(inp_tensor, 2)
    batch_size, seq_len = inp_tensor.size()
    one_hot             = torch.FloatTensor(batch_size, seq_len, length).zero_()
    # As we are using a convolutional network, the expected input is of shape 
    # (batch, channel, sequence). We have to transpose the tensor that we create
    # in order to use the one hot encoding of the characters as channel (dim 1). 
    one_hot.scatter_(2, inp_, 1).transpose_(1, 2)
    
    return one_hot
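
To make the shapes concrete, here is a small sketch of the same scatter-based encoding on a toy batch, compared with `torch.nn.functional.one_hot`. The indices and the number of classes are made up for illustration, and the built-in function is shown only as a possible alternative, not as what the notebook uses.

```python
import torch
import torch.nn.functional as F

# Toy batch: 1 sequence of 4 character ids drawn from 3 possible characters.
indices   = torch.tensor([[0, 2, 1, 2]])   # shape (batch, sequence)
n_classes = 3

# Scatter-based encoding, as in one_hot_encode: write a 1 at the position
# given by each character id along the last dimension.
one_hot_scatter = torch.zeros(1, 4, n_classes)
one_hot_scatter.scatter_(2, indices.unsqueeze(2), 1.0)
# Move the one-hot dimension to the channel position: (batch, channel, sequence).
one_hot_scatter = one_hot_scatter.transpose(1, 2)

# Possible alternative using the built-in helper.
one_hot_builtin = F.one_hot(indices, num_classes = n_classes).float().transpose(1, 2)

print(one_hot_scatter.shape)                            # torch.Size([1, 3, 4])
print(torch.equal(one_hot_scatter, one_hot_builtin))    # True
```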

In [5]:
def encode_seq(seq, char_to_id):
    return [char_to_id[c] for c in seq]

In [6]:
def data_to_tensor(data, char_to_id, n_char):
    input_tensor    = torch.LongTensor([encode_seq(input_seq, char_to_id) 
                                         for input_seq, _ in data])
    one_hot_input   = one_hot_encode(input_tensor, n_char)
    target_tensor   = torch.LongTensor([encode_seq(target_seq, char_to_id) 
                                         for _, target_seq in data])

    return one_hot_input, target_tensor

In [7]:
def batch_generator(data, batch_size, n_char, char_to_id, shuffle = True):
    if shuffle:
        data = random.sample(data, len(data))
    
    return (data_to_tensor(data[i : i + batch_size], char_to_id, n_char) 
                 for i in range(0, len(data), batch_size))

## Model visualization code

The model is evaluated by asking it to generate a long sequence of characters. We randomly select an input as a base for our generation and extend it character by character using the model.

In [8]:
def test_model(tcn, final_sequence_size, window_size, n_char, 
               id_to_char, char_to_id, data):
    seq = list(random.choice(data)[0])
    while len(seq) < final_sequence_size:
        # As the network is able to take variable-length inputs, it could be
        # interesting not to limit ourselves to inputs of window_size.
        encoded_input = encode_seq(seq[-window_size:], char_to_id)
        input_tensor  = torch.LongTensor([encoded_input])
        one_hot_input = one_hot_encode(input_tensor, n_char)
        X             = Variable(one_hot_input)
        X             = X.cuda() if cuda else X 
        y_pred        = tcn(X)
        # It is important to take the maximum on the dim -2 as each channel of 
        # the output will correspond to the score associated to a character.
        char_pred_id  = y_pred[0].max(dim = -2)[1][-1].item()
        char_pred     = id_to_char[char_pred_id]
        seq.append(char_pred)
    
    return ''.join(seq)
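
The indexing used to pick the next character can be followed on a tiny example. This is a sketch with made-up scores for 3 characters and 3 time steps; `.item()` is used here to obtain a plain Python integer.

```python
import torch

# Made-up scores of shape (batch = 1, n_char = 3, sequence = 3).
scores = torch.tensor([[[0.1, 0.90, 0.2],
                        [0.7, 0.05, 0.3],
                        [0.2, 0.05, 0.5]]])

# max over dim -2 (the character/channel dimension) returns, for every time
# step, the best score and the id of the character that achieved it.
values, indices = scores[0].max(dim = -2)
print(indices)          # tensor([1, 0, 2])

# The prediction for the next character is read at the last time step.
next_id = indices[-1].item()
print(next_id)          # 2
```

Always taking the highest-scoring character makes the generation deterministic for a given base sequence.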

The following function generates a horoscope starting from a `base` supplied by the caller.

In [9]:
def genererate_long_sequence(tcn, final_sequence_size, n_char, id_to_char, char_to_id, base):
    seq = list(base)

    while len(seq) < final_sequence_size:
        # In this case we do not limit the size of the input to window_size.
        encoded_input = encode_seq(seq, char_to_id)
        input_tensor  = torch.LongTensor([encoded_input])
        one_hot_input = one_hot_encode(input_tensor, n_char)
        X             = Variable(one_hot_input)
        X             = X.cuda() if cuda else X 
        y_pred        = tcn(X)
        # It is important to take the maximum on the dim -2 as each channel of 
        # the output will correspond to the score associated to a character.
        char_pred_id  = y_pred[0].max(dim = -2)[1][-1].item()
        char_pred     = id_to_char[char_pred_id]
        seq.append(char_pred)
        
    return ''.join(seq)

## Model definition

The `Chomp1d` module removes the extra values that the padding of a convolution adds at the end of the sequence. As the TCN architecture uses dilated convolutions, the padding has to be increased in order to generate a long enough output. Since `nn.Conv1d` pads both ends of the sequence, we have to remove the trailing values so that the last value of our output is the result of a dilated convolution whose rightmost input was the last value of the input sequence.

In [10]:
class Chomp1d(nn.Module):
    def __init__(self, chomp_size):
        super(Chomp1d, self).__init__()
        self.chomp_size = chomp_size
        
    def forward(self, x):
        # Slicing returns a view of x that is not contiguous in memory, so we
        # call contiguous() to get a tensor that is stored contiguously.
        return x[:, :, :-self.chomp_size].contiguous()
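
The effect of the padding followed by the chomp can be checked numerically. The sketch below uses a single-channel convolution with arbitrary sizes and weights set to 1 (assumptions made for the demonstration) and verifies that changing the last input value only changes the last output value, that is, no output depends on future inputs.

```python
import torch
import torch.nn as nn

kernel_size, dilation = 2, 4
padding = (kernel_size - 1) * dilation

conv = nn.Conv1d(1, 1, kernel_size, padding = padding, dilation = dilation)
nn.init.constant_(conv.weight, 1.0)     # fixed weights, for reproducibility

x = torch.arange(20, dtype = torch.float32).view(1, 1, 20)

with torch.no_grad():
    padded_out = conv(x)                          # length 20 + padding
    causal_out = padded_out[:, :, :-padding]      # what Chomp1d keeps

    # Perturb only the last input value and run the same computation.
    x_perturbed            = x.clone()
    x_perturbed[0, 0, -1] += 1.0
    causal_out_perturbed   = conv(x_perturbed)[:, :, :-padding]

print(padded_out.shape[-1], causal_out.shape[-1])     # 24 20
# Only the last output position is affected by the last input value.
print(torch.allclose(causal_out[..., :-1], causal_out_perturbed[..., :-1]))   # True
```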

The `TemporalBlock` module is a residual block containing two weight-normalized dilated convolutions, each followed by a ReLU activation and `Dropout2d` (whole channels are dropped at once). The residual connection contains a 1x1 convolution when the input has to be transformed to the correct number of channels.

In [11]:
class TemporalBlock(nn.Module):
    def __init__(self, n_inputs, n_outputs, kernel_size, stride, dilation, padding, dropout = 0.2):
        super(TemporalBlock, self).__init__()
        conv_params = {
            'kernel_size' : kernel_size,
            'stride'      : stride,
            'padding'     : padding,
            'dilation'    : dilation
        }
        self.conv1    = weight_norm(nn.Conv1d(n_inputs, n_outputs, **conv_params))
        self.chomp1   = Chomp1d(padding)
        self.relu1    = nn.ReLU()
        self.dropout1 = nn.Dropout2d(dropout)
        self.conv2    = weight_norm(nn.Conv1d(n_outputs, n_outputs, **conv_params))
        self.chomp2   = Chomp1d(padding)
        self.relu2    = nn.ReLU()
        self.dropout2 = nn.Dropout2d(dropout)
        self.net      = nn.Sequential(
            self.conv1, 
            self.chomp1,
            self.relu1,
            self.dropout1,
            self.conv2,
            self.chomp2,
            self.relu2,
            self.dropout2
        )
        # If the number of input channels is equal to the number of output channels,
        # no transformation is required on the residual branch.
        self.downsample = nn.Conv1d(n_inputs, n_outputs, 1) if n_inputs != n_outputs else None
        self.relu       = nn.ReLU()
        self.init_weights()
        
    def forward(self, x):
        # Convolutional branch of the residual block
        out = self.net(x)
        # Residual branch of the residual block
        res = x if self.downsample is None else self.downsample(x)

        return self.relu(out + res)
    
    def init_weights(self):
        self.conv1.weight.data.normal_(0, 0.01)
        self.conv2.weight.data.normal_(0, 0.01)
        if self.downsample is not None:
            self.downsample.weight.data.normal_(0, 0.01)

The Temporal Convolution Network is a sequence of Temporal Blocks whose dilation is doubled at each step. The receptive field therefore grows exponentially with the number of blocks: if enough blocks are used, the network can use information from arbitrarily far back in the past to generate its prediction.

In [12]:
class TemporalConvNet(nn.Module):
    def __init__(self, num_inputs, num_channels, kernel_size = 2, dropout = 0.2):
        super(TemporalConvNet, self).__init__()
        layers     = []
        num_levels = len(num_channels)
        
        for i in range(num_levels):
            # The dilation is doubled at each layer to allow an exponential growth of 
            # the receptive field size.   
            dilation_size = 2 ** i
            in_channels   = num_inputs if i == 0 else num_channels[i - 1]
            out_channels  = num_channels[i]
            layers.append(
                TemporalBlock(
                    in_channels,
                    out_channels,
                    kernel_size,
                    stride   = 1,
                    dilation = dilation_size,
                    padding  = (kernel_size - 1) * dilation_size,
                    dropout  = dropout
                )
            )
        
        self.network = nn.Sequential(*layers)
        
    def forward(self, x):
        return self.network(x)
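
The growth of the receptive field with the number of blocks can be checked with a short calculation. This is a sketch that assumes two convolutions per block with stride 1 and a dilation doubled at each level, as in the classes above; `receptive_field` is a hypothetical helper that is not part of the original code.

```python
def receptive_field(num_levels, kernel_size = 2):
    # A single output value always sees the current input position.
    field = 1
    for level in range(num_levels):
        dilation = 2 ** level
        # Each block contains two dilated convolutions, and each of them
        # extends the receptive field by (kernel_size - 1) * dilation.
        field += 2 * (kernel_size - 1) * dilation
    return field

for num_levels in (1, 3, 7):
    print(num_levels, receptive_field(num_levels))
# 1 3
# 3 15
# 7 255
```

With the seven levels configured in the training section of this notebook and `kernel_size = 2`, this gives 255 characters, which is more than the `window_size` of 150 used for training.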

## Training

In [13]:
n_char, id_to_char, char_to_id, data = load_data(file_path, window_size)
num_channels                         = [512] * 6 + [n_char]
tcn                                  = TemporalConvNet(n_char, num_channels)
tcn                                  = tcn.cuda() if cuda else tcn
# We view the problem as a classification task in which the network tries
# to predict what class the following character should be.   
criterion                            = nn.CrossEntropyLoss()
# We use an Adam optimizer with the default learning rate of 1e-3.
optimizer                            = optim.Adam(tcn.parameters())
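
Because the network outputs one score per character at every position, the loss is computed on tensors with a sequence dimension. The sketch below, with arbitrary toy sizes, shows that `nn.CrossEntropyLoss` accepts scores of shape (batch, classes, sequence) with targets of shape (batch, sequence), and that the result matches the loss computed after flattening all positions into one large batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Arbitrary toy sizes, for illustration only.
batch, n_classes, seq_len = 2, 5, 7
scores  = torch.randn(batch, n_classes, seq_len)            # (batch, classes, sequence)
targets = torch.randint(0, n_classes, (batch, seq_len))     # (batch, sequence)

criterion = nn.CrossEntropyLoss()
loss      = criterion(scores, targets)

# Equivalent computation: treat every position as an independent example.
flat_scores = scores.transpose(1, 2).reshape(-1, n_classes)
flat_loss   = criterion(flat_scores, targets.reshape(-1))

print(torch.allclose(loss, flat_loss))   # True
```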

In [14]:
batch_per_epoch  = math.ceil(len(data) / batch_size)
loss_update_rate = 3

In [15]:
for epoch in tnrange(epochs, desc = 'epochs'):
    loss_pbar    = 0 
    running_loss = 0
    generator    = batch_generator(data, batch_size, n_char, char_to_id)

    with tqdm_notebook(
        enumerate(generator), 
        desc = 'batches', 
        total = batch_per_epoch, 
        unit = 'batch '
    ) as pbar:

        for i, (X, y) in pbar:
            X = Variable(X)
            y = Variable(y)
            X = X.cuda() if cuda else X
            y = y.cuda() if cuda else y
            optimizer.zero_grad()
            y_pred = tcn(X)
            loss   = criterion(y_pred, y)
            loss.backward()
            optimizer.step()

            loss_value    = loss.item()
            running_loss += loss_value
            loss_pbar    += loss_value

            if i % loss_update_rate == loss_update_rate - 1:
                pbar.set_postfix(loss = loss_pbar / loss_update_rate)
                loss_pbar = 0

            if i % print_every == print_every - 1:
                test_result = test_model(tcn, test_seq_size, window_size, n_char, 
                                         id_to_char, char_to_id, data)
                tqdm.write(f'Batch: {i + 1 : 6}, '
                           f'loss: {running_loss / print_every : .4f}\n'
                           f'{test_result}\n')
                running_loss = 0

Batch:    500, loss:  2.1264
l, you must seek spontaneous avenues of expression no matter how much your current obligations constrain you. ultimately, new ways of doing things today of the proble of the contro to a so to the to step to so your all than you to so so to so to a so so to stay be a so to the to see your feelings to start of your and hard to start to the prore to a so sore thing to start to starting to so so so to the prowerstep to the concerent of you can seer to so to stranging your concertation to so so the proble to the prose to shand to start of your feel to the prosted to the proble to so your concertati

Batch:   1000, loss:  1.4211
inner skin than yours. unfortunately, your self-confidence could be perceived by others as a false sense of authority, causing them to pull away from your current so you may be a strate your connection to your feelings and the moon in your concerns today and you can be a lot of the more in your control to your confront something you to be

Batch:    500, loss:  1.0508
is in your sign. you might think that you're being perfectly clear now, but your actions don't seem to support your words. it may be a better strategy is to stay on the present moment with an intensity today as your current situation is a strong enough to see the pressure to set your feelings that you can and then more than you can do something that you can be a way to start to the position of your future is a smart idea to set aside your production today and you may be so easy way to tal0t out of your plans and then still may be a bit of a position that you can do something that you can do to

Batch:   1000, loss:  1.0228
ience a relationship crisis. it's not easy to follow through on whatever you start today. you probably have already promised more than you can deliver on your plate today as you won0t want to someone you trust the stress of your plans and then someone else to stay on trac0to your own could be a situation today because you can set your emo

Batch:    500, loss:  0.9399
ely nothing to provoke this kind of rejection. be patient and let the situation unfold on its own. ultimately, you could discover that your angst stems to be pretty grown one that you can do now will set aside your personal listener is now that you might not be able to see how to get a healthy perspection of your feelings to a more complete and the sun in your sign today as you thin0th your agenda and stay opposition to start a problem that you are able to manage your words and then let you only to set yourself to be as possible if you are too hard to manage your desires today as the moon in y

Batch:   1000, loss:  0.9269
 possibilities when only yesterday everything seemed more problematic. however, you haven't been suddenly struck with answers to all of your questions about your discomfort with your friends and family minds you with a conscient day to reach your goals and then stic0ti0to your spiritual course of although you may be tempted to tale a soli

Batch:    500, loss:  0.8845
tionships from growing more complicated. consider the potential impact of what you do prior to taking action. practicing a bit of self-restraint now is to sharing your feelings and then the discussion by stepping out of your success as best approach to loo0th your concerted as you thin0th a tell you who will happy anyone else today and then someone who is complicated by imagination and support your feelings today as the moon moment is the best way to reach your goals and follow through on whate0th you see that you can do now is the best time to any sidestep a problem is that you can see the ho

Batch:   1000, loss:  0.8782
en to stop. thankfully, you can use your common sense to balance your need for connection with your desire to be alone. although you may believe that you can see the role of a relationship today as you thin0th house of communication is a small strategy that you can see the positi0t and enable your top parado0th someone you li0t of your cu

Batch:    500, loss:  0.8536
recedence today over encouraging the whimsical dreams of others. as much as you might want to help a friend turn a fantasy into reality, don't let your confidence will ultimately be for your personality or the moment way to manipulate your current stress is a smart strategy now that you can saster an inspiration of your dreams come true to your own who as long as you tall a strategy is to get your friends and associates than you will be able to do it alone today and your contemplation may be the situation by accepting a bit of a position to be an une0t an upcoming apparent that you ha0th a fan

Batch:   1000, loss:  0.8497
specifically wrong today, you still might feel bored and restless. don't be too judgmental about your current circumstances. it's okay if you want to promise your choices at wor0t on try not to deal with an impulsi common sense the position by simply tang the process of a few days before you easier to accomplishing what you want to do ney

Batch:    500, loss:  0.8334
hood days. unfortunately, it's hard to tell which memories are based upon truth and which ones are fully fabricated from the depths of your unconscious aries moon is in your sign today as the moon mo0th threaling shifts and be surprising a simple things to do it or you will go to concerned about your future is little down to the rest of your taste today and you may be feeling a bit uncert in a comple0s at wor0th a few obstacles that you must be cautious obligations as you can be so sure of yourself that you may not be able to engage in a relationship now that the needs of personal plans now an

Batch:   1000, loss:  0.8302
e the time to play. naturally, meeting your obligations remains a high priority, but your overall quality of work won't be as good if you don't also get benefit to your future is less that you can be scared of drawing what you should do new to ma0th a compromise more than you can deli0th the best courage to be follow you so fast that you 

Batch:    500, loss:  0.8185
t your calendar today, but all your efforts seem to be for naught. once the fabric of your day begins to unravel, nothing you do seems to put it all boost to reach your desire to do the situation is the best way to reach your destination as you can tell that you could be more interested in your future obstacles that can be a little completely problematic because you are able to catch your desires in the current circumstances may be undergorner when you begin to share your feelings with a smarter strategy to be a surprising possibilities about what you should be doing something that you set you

Batch:   1000, loss:  0.8166
t of creativity early on, you might get irritated with someone who tries to hijack your day with their agenda. but you're not interested in playing it to your position is now that the moon is bac0se by tomorrow and then step into your plan to get in the way of a difficult situation today is so much self0ti by steps into your personal disc

In [17]:
genererate_long_sequence(
    tcn, 
    3000, 
    n_char, 
    id_to_char, 
    char_to_id, 
    'your day will be terrible but you should stay optimistic because'
)

'your day will be terrible but you should stay optimistic because you try to listen to the finish your prodates and reality chec0s so you can deli0t a different direction now that shifts in your fantasies of the past and see where they can be an emotional outch out along the way0th a more meaning that it will be easier to learn something that you want to be accessed the role to your courage to focus on the edge with others isn0t a different situation by retreating up of change that you would be wise to tal0ti you intensional problems later on the shift of your plan and endle your world or may not be what you need to be in the long run0th solution could be totally unrape today as possible about your dreams that you need to do anything may be a position to someone else thin0th to support your producti0th ahead with a matter moti0t dri0t and then the sun in your sign lead you to ha0th your life of choice and prior an ally for affection to hear for a while to stay on trac0s and there may b