<a href="https://colab.research.google.com/github/abulhasanat/NLP-Experiments/blob/master/bert_text_generation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **ShakesBERT**
BERT's BertForNextSentencePrediction class scores how likely it is that one sentence (or line) follows another. We can use this, for example, to construct a new sonnet from lines of existing Shakespeare sonnets. The new sonnet is more likely to make sense than one assembled from lines drawn purely at random, so next sentence prediction acts as a kind of sense discriminator.
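
The selection idea can be sketched independently of BERT as a simple rejection-sampling loop: for each line position, draw a random poem and accept its line only if a scorer rates it above a threshold. The sketch below is illustrative only; `assemble_poem`, `score_fn`, `max_tries`, the toy data, and the toy word-overlap scorer are all hypothetical stand-ins, not part of the notebook or of any library.

```python
import random


def assemble_poem(poems, score_fn, threshold, rng, max_tries=1000):
    """Build a poem by taking line i from a different source poem for each i.

    poems:    list of poems, each a list of lines (all the same length).
    score_fn: hypothetical callable (previous_line, candidate) -> float;
              higher means the candidate is a more plausible continuation.
    """
    n_lines = len(poems[0])
    first = rng.randrange(len(poems))
    used = {first}
    result = [poems[first][0]]
    for i in range(1, n_lines):
        for _ in range(max_tries):
            k = rng.randrange(len(poems))
            candidate = poems[k][i]
            # Accept only unused poems whose line scores above the threshold
            if k not in used and score_fn(result[-1], candidate) > threshold:
                used.add(k)
                result.append(candidate)
                break
        else:
            # Give up instead of looping forever when nothing is acceptable
            raise RuntimeError(f"no acceptable candidate for line {i}")
    return result


# Toy data and a toy scorer (number of shared words), purely for illustration
toy_poems = [[f"poem {p} line {i} shared" for i in range(3)] for p in range(5)]


def toy_score(prev, cand):
    return len(set(prev.split()) & set(cand.split()))


poem = assemble_poem(toy_poems, toy_score, threshold=1, rng=random.Random(0))
```

In the notebook, the role of `score_fn` is played by the next sentence prediction score from BERT. The `max_tries` cap is a design choice of this sketch that avoids an endless loop if no candidate ever passes the threshold.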

Sonnet lines are taken from [Poetry DB](http://poetrydb.org/index.html).

In [3]:
!pip install pytorch_pretrained_bert


In [4]:
import torch
from pytorch_pretrained_bert import BertTokenizer, BertForNextSentencePrediction

In [5]:
tokeniser = BertTokenizer.from_pretrained('bert-base-uncased')



In [6]:

model = BertForNextSentencePrediction.from_pretrained('bert-base-uncased')
model.eval()  # inference mode: disables dropout so scores are deterministic

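
Before a pair of lines can be scored, it has to be packed into BERT's sentence-pair layout: `[CLS] line A [SEP] line B [SEP]`, with segment id 0 for the first part and 1 for the second. A minimal sketch of that packing follows. `build_pair_inputs` is a hypothetical helper, and whitespace splitting stands in for the WordPiece tokeniser so the example runs without downloading anything.

```python
def build_pair_inputs(tokens_a, tokens_b):
    """Return (tokens, segment_ids) in BERT's sentence-pair layout."""
    first = ['[CLS]'] + list(tokens_a) + ['[SEP]']
    second = list(tokens_b) + ['[SEP]']
    tokens = first + second
    # Segment 0 covers [CLS], line A and its [SEP]; segment 1 covers the rest
    segment_ids = [0] * len(first) + [1] * len(second)
    return tokens, segment_ids


# Whitespace splitting is an assumption made for illustration only
tokens, segment_ids = build_pair_inputs(
    "shall i compare thee".split(),
    "to a summer 's day".split(),
)
```

With the real tokeniser, the token lists would come from `tokeniser.tokenize(...)` and the result would be passed through `tokeniser.convert_tokens_to_ids(...)` before building tensors.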


In [7]:
import urllib.request  # the request submodule must be imported explicitly
import json
from random import randint

url = 'http://poetrydb.org/author,linecount/Shakespeare;14/lines'
with urllib.request.urlopen(url) as response:
    data = json.load(response)   

    
poem_number = randint(0, len(data)-1)
previous_line = data[poem_number]['lines'][0]
print(previous_line.strip())

threshold = 3  # minimum "is next" logit a candidate line must exceed
poems_picked = [poem_number]

for line_number in range(1, 14):
    next_line_prediction = 0
    # Keep sampling until a line has been accepted for this position
    while line_number == len(poems_picked):
        poem_number = randint(0, len(data)-1)
        line_to_check = data[poem_number]['lines'][line_number]
        
        # BERT expects a sentence pair as: [CLS] line A [SEP] line B [SEP]
        tokens_1 = ['[CLS]'] + tokeniser.tokenize(previous_line) + ['[SEP]']
        tokens_2 = tokeniser.tokenize(line_to_check) + ['[SEP]']
        tokenized_text = tokens_1 + tokens_2

        indexed_tokens = tokeniser.convert_tokens_to_ids(tokenized_text)
        # Segment 0 covers [CLS], line A and its [SEP]; segment 1 covers line B and its [SEP]
        segments_ids = ([0] * len(tokens_1)) + ([1] * len(tokens_2))
        tokens_tensor = torch.tensor([indexed_tokens])
        segments_tensors = torch.tensor([segments_ids])
        
        # Inference only, so skip gradient tracking
        with torch.no_grad():
            predictions = model(tokens_tensor, segments_tensors)

        # Index 0 holds the logit for "line B follows line A"
        next_line_prediction = predictions[0, 0].item()
        # Take at most one line from each poem
        if poem_number not in poems_picked and next_line_prediction > threshold:
            poems_picked.append(poem_number)

    print(line_to_check.strip())
    previous_line = line_to_check

That god forbid, that made me first your slave,
That thereby beauty's rose might never die,
That love is merchandiz'd, whose rich esteeming,
And darkly bright, are bright in dark directed.
And each, though enemies to either's reign,
For thou art covetous, and he is kind;
For compound sweet; forgoing simple savour,
Was, sleeping, by a virgin hand disarm'd.
Love's not Time's fool, though rosy lips and cheeks
Mine eyes have drawn thy shape, and thine for me
But, like a sad slave, stay and think of nought
As tender nurse her babe from faring ill.
Yet so they mourn becoming of their woe,
Which, used, lives th' executor to be.
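
Note that the threshold of 3 is compared against a raw logit, not a probability. Assuming the model returns two logits per pair, with index 0 meaning "is next" and index 1 meaning "not next", a softmax over the two turns them into a probability that is easier to interpret. The helper below is a hypothetical sketch in plain Python, and the example logits are made up rather than taken from the model.

```python
import math


def is_next_probability(is_next_logit, not_next_logit):
    """Softmax over the two NSP logits, returning P(is next)."""
    # Subtract the maximum first for numerical stability
    m = max(is_next_logit, not_next_logit)
    e_next = math.exp(is_next_logit - m)
    e_not = math.exp(not_next_logit - m)
    return e_next / (e_next + e_not)


# Made-up logits for illustration; real values would come from predictions[0]
p = is_next_probability(3.0, -3.0)
```

Because the probability depends on both logits, a fixed cut-off on the first logit alone does not correspond to a fixed probability. Thresholding on the softmax output instead would be one possible variation.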


In [9]:
poem_number

3

In [13]:
data[poem_number]['lines'][0].strip()

'Unthrifty loveliness, why dost thou spend'