# <center>PROJECT SANDBOX</center>

## Documentation
The aim of this notebook is to provide a simple sandbox for testing different NN architectures for the project. Below is a short reference for the functions imported from the `scripts` folder:

- **`prepare_dataset(device,ratio=0.5,shuffle_ctx=False)`** :
    - **Input**:
        - device : a torch.device object
        - ratio : a float between 0 and 1 that sets the average proportion of modern English verses in the data loader
        - shuffle_ctx : if `True`, shuffle the contexts within a batch so that half of the `x_1` elements have a wrong context `ctx_1`. Useful for training the context recognizer model.
    - **Return** :
        - a torch Dataset | class : `Shakespeare`, inherited from `torch.utils.data.Dataset`
        - a Python word dictionary (aka tokenizer) | class : `dict`
    - **Tensors returned when loaded in the dataloader**:
        - x_1 : input verse (modern / shakespearian)
        - x_2 : output verse (modern / shakespearian)

        - ctx_1 : context of the input verse
        - ctx_2 : context of the output verse

        - len_x : length of the input verse
        - len_y : length of the output verse

        - len_ctx_x : length of the input verse context
        - len_ctx_y : length of the output verse context

        - label : label of the input verse (0 : modern, 1 : shakespearian)
        - label_ctx : label of the context (0 : wrong context, 1 : right context)
- **`string2code(string,dict)`** : 
    - **Input**:
        - string : a sentence
        - dict : a tokenizer
    - **Return** :
        - a `torch.LongTensor` (the tokenized sentence)
- **`code2string(torch.LongTensor,dict)`** : 
    - **Input**:
        - torch.LongTensor : a tokenized sentence
        - dict : a tokenizer (this notebook passes the inverted, code-to-word dictionary `dict_token`)
    - **Return** :
        - a string sentence
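
As a rough illustration of how the two conversion helpers relate, the sketch below mimics a word-level round trip in plain Python. The names `encode` and `decode` and the toy vocabulary are hypothetical stand-ins, not the functions from `scripts`. The real helpers work with `torch.LongTensor` objects, and how they treat unknown words is not documented here, so the `<unk>` fallback is an assumption.

```python
# Hypothetical stand-ins for string2code / code2string, using plain lists
# instead of torch.LongTensor so the sketch has no dependencies.

def encode(sentence, word2id, unk="<unk>"):
    """Map a whitespace-separated sentence to a list of integer codes."""
    return [word2id.get(word, word2id[unk]) for word in sentence.split()]


def decode(codes, id2word):
    """Map a list of integer codes back to a space-joined sentence."""
    return " ".join(id2word[code] for code in codes)


# Toy vocabulary (assumed format: word -> integer id).
word2id = {"<unk>": 0, "GOODBYE": 1, ",": 2, "SIR": 3, ".": 4}
# Decoding needs the inverse mapping (integer id -> word).
id2word = {idx: word for word, idx in word2id.items()}

codes = encode("GOODBYE , SIR .", word2id)
sentence = decode(codes, id2word)
print(codes)
print(sentence)
```

The point of the sketch is that decoding needs the inverted dictionary, which is why the notebook builds a separate code-to-word mapping before calling `code2string`.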

## Importing packages

In [1]:
from scripts.data_builders.prepare_dataset import prepare_dataset,string2code,code2string

import torch
import torchvision.datasets as datasets
import torch.nn.functional as F
from torch import nn
from torch import optim
from torch.utils.tensorboard import SummaryWriter
import ipdb
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print("device = ",device)

device =  cpu


## Preprocessing data

In [2]:
train_data, dict_words = prepare_dataset(device,ratio=0.5,shuffle_ctx=True) #check with shift+tab to look at the data structure
batch_size = 20
dict_token = {b:a for a,b in dict_words.items()} #dict for code2string

train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size,
                                           shuffle=True,collate_fn=train_data.collate)

Loading ...
- Shakespeare dataset length :  20316
- Corrupted samples (ignored) :  763
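
With `shuffle_ctx=True`, about half of the verses in a batch are paired with a wrong context, and `label_ctx` records which pairs are still intact. The sketch below shows one way this kind of in-batch corruption could be done. It is not the code from `scripts`: the function name `shuffle_contexts`, the rotation strategy, and the fixed seed are all assumptions made for illustration.

```python
import random


def shuffle_contexts(contexts, rng):
    """Give about half of the batch a wrong context.

    Returns the new context list and a label list
    (1: right context, 0: wrong context).
    """
    n = len(contexts)
    # Pick half of the positions to corrupt.
    corrupted = rng.sample(range(n), n // 2)
    new_contexts = list(contexts)
    labels = [1] * n
    if len(corrupted) < 2:
        return new_contexts, labels  # nothing to swap with
    # Rotate the contexts among the chosen positions by one step, so every
    # chosen position receives a context that belonged to another verse.
    rotated = corrupted[1:] + corrupted[:1]
    for dst, src in zip(corrupted, rotated):
        new_contexts[dst] = contexts[src]
        labels[dst] = 0
    return new_contexts, labels


rng = random.Random(0)  # fixed seed, only to make the sketch reproducible
batch_ctx = ["ctx A", "ctx B", "ctx C", "ctx D"]
new_ctx, label_ctx = shuffle_contexts(batch_ctx, rng)
print(new_ctx, label_ctx)
```

Rotating rather than drawing random replacements guarantees that a position labelled 0 never keeps its own context by accident, as long as the contexts in the batch are distinct.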


## Designing NN model

## Running model

In [5]:
for x, y, ctx_x, ctx_y, len_x, len_y, len_ctx_x, len_ctx_y, label, label_ctx in train_loader:
    
    for i in range(x.shape[0]):
        print("\n- x :")
        print(code2string(x[i],dict_token))
        print("- context of x :")
        print(code2string(ctx_x[i],dict_token))
        print("- context label :",label_ctx[i].item())
    break


- x :
DEAR MASTER , LISTEN TO ME .
- context of x :
HONEST IAGO’S ARRANGED THAT . WHAT , IS HE DEAD ? OH , HE’S BEEN BETRAYED , AND I’VE BEEN RUINED . SEND ME AWAY , MY LORD , BUT DON’T KILL ME !
- context label : 0

- x :
THOU HAST SOME SUIT TO CAESAR , HAST THOU NOT ?
- context of x :
MADAM , NOT YET . I GO TO TAKE MY STAND TO SEE HIM PASS ON TO THE CAPITOL . THAT I HAVE , LADY . IF IT WILL PLEASE CAESAR TO BE SO GOOD TO CAESAR AS TO HEAR ME , I SHALL BESEECH HIM TO BEFRIEND HIMSELF .
- context label : 1

- x :
GOODBYE , SIR .
- context of x :
IS TEMPERAMENTAL . BUT I WILL TELL YOU THAT LATELY THE DUKE HAS BEEN DISPLEASED WITH HIS NIECE , AND FOR NO OTHER REASON THAN THAT PEOPLE PRAISE HER VIRTUES AND PITY HER FOR HER FATHER’S SAKE . LATER , IN A BETTER WORLD THAN THIS , I’D LOVE TO GET TO KNOW YOU . I’M INDEBTED
- context label : 1

- x :
HOW IS MY JULIET ?
- context of x :
HIS BEARD WAS GRAY , RIGHT ? IT WAS JUST LIKE IN REAL LIFE , DARK BROWN WITH SILVER WHISKERS IN IT . MAYBE IT