In [1]:
import pickle
import os

PIK = "claim_and_title.data"

if not os.path.isfile(PIK):
    # Download the pickled (claim, title) pairs if they are not already cached locally
    !wget https://benhoyle.github.io/notebooks/title_generation/claim_and_title.data

with open(PIK, "rb") as f:
    print("Loading data")
    data = pickle.load(f)
    print("{0} samples loaded".format(len(data)))
    
# Wrap each title in start/stop markers so the decoder can tell where a title begins and ends
print("\n\nAdding start and stop tokens to output")
data = [(c, "startseq {0} stopseq".format(t)) for c, t in data]

print("\n\nAn example title:", data[0][1])
print("----")
print("An example claim:", data[0][0])

Loading data
30000 samples loaded


Adding start and stop tokens to output


An example title: startseq System and method for session restoration at geo-redundant gateways stopseq
----
An example claim: 
1. A method for managing a backup service gateway (SGW) associated with a primary SGW, the method comprising:
periodically receiving from the primary SGW at least a portion of corresponding UE session state information, the received portion of session state information being sufficient to enable the backup SGW to indicate to an inquiring management entity that UEs having an active session supported by the primary SGW are in a live state; and
in response to a failure of the primary SGW, the backup SGW assuming management of IP addresses and paths associated with said primary SGW and transmitting a Downlink Data Notification (DDN) toward a Mobility Management Entity (MME) for each of said UEs having an active session supported by the failed primary SGW to detach from the network and reat
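
The titles above are wrapped in "startseq" and "stopseq" markers, so any generated title would presumably need those markers removed before it is shown to a reader. The following is a minimal sketch of that round trip. The helper names add_sequence_markers and strip_sequence_markers are hypothetical and are not part of the notebook or of ludwig_model.

```python
START, STOP = "startseq", "stopseq"

def add_sequence_markers(title):
    """Wrap a title the same way as the list comprehension in cell 1."""
    return "{0} {1} {2}".format(START, title, STOP)

def strip_sequence_markers(text):
    """Drop a leading start marker and everything from the first stop marker on."""
    tokens = text.split()
    if tokens and tokens[0] == START:
        tokens = tokens[1:]
    if STOP in tokens:
        tokens = tokens[:tokens.index(STOP)]
    return " ".join(tokens)

wrapped = add_sequence_markers("System and method for session restoration")
print(wrapped)
print(strip_sequence_markers(wrapped))
```

Cutting at the first stop marker also discards anything a decoder might emit after it has signalled the end of the title.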

In [2]:
from ludwig_model import LudwigModel

  from ._conv import register_converters as _register_converters
Using TensorFlow backend.


In [3]:
lw = LudwigModel(
    encoder_texts=[d[0] for d in data],
    decoder_texts=[d[1] for d in data],
    encoder_seq_length=300,
    decoder_seq_length=22,
    num_encoder_tokens=2500,
    num_decoder_tokens=2500,
    latent_dim=128,
    weights_file="class_ludwigmodel.hdf5",
    training_set_size=250
)

Fitting tokenizers
Our input data has shape (30000, 300) and our output data has shape (30000, 22)
Generating training and test data
Building model
Loading GloVe 100d embeddings from file
Found 400000 word vectors.
Building embedding matrix
Compiling model
Loaded weights
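
The log reports input data of shape (30000, 300) and output data of shape (30000, 22), which matches encoder_seq_length and decoder_seq_length. The internals of LudwigModel are not shown here, so the sketch below only illustrates the general idea in plain Python: a capped vocabulary (compare num_encoder_tokens=2500) and fixed-length integer sequences. The functions fit_vocab and encode are hypothetical, and padding at the end of the sequence is an arbitrary choice for this example.

```python
from collections import Counter

def fit_vocab(texts, num_tokens):
    """Map the most frequent words to integer ids 1..num_tokens-1 (0 is padding)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    most_common = [word for word, _ in counts.most_common(num_tokens - 1)]
    return {word: index for index, word in enumerate(most_common, start=1)}

def encode(text, vocab, seq_length):
    """Turn text into a fixed-length list of ids, dropping out-of-vocabulary words."""
    ids = [vocab[word] for word in text.lower().split() if word in vocab]
    ids = ids[:seq_length]                       # truncate long sequences
    return ids + [0] * (seq_length - len(ids))   # pad short ones with zeros

texts = ["a method for managing a gateway", "a data storage system"]
vocab = fit_vocab(texts, num_tokens=2500)
encoded = [encode(text, vocab, seq_length=300) for text in texts]
print(len(encoded), len(encoded[0]))
```

Every row has the same length regardless of how long the original claim was, which is what lets the samples be stacked into a single array.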


In [None]:
lw.train()

Training for epoch 0
Training on batch 0 to 250 of 24000
Train on 2639 samples, validate on 655 samples
Epoch 1/1
Training on batch 250 to 500 of 24000
Train on 2484 samples, validate on 658 samples
Epoch 1/1
Training on batch 500 to 750 of 24000
Train on 2586 samples, validate on 583 samples
Epoch 1/1
Training on batch 750 to 1000 of 24000
Train on 2613 samples, validate on 667 samples
Epoch 1/1
Training on batch 1000 to 1250 of 24000
Train on 2532 samples, validate on 687 samples
Epoch 1/1
Training on batch 1250 to 1500 of 24000
Train on 2481 samples, validate on 699 samples
Epoch 1/1
Training on batch 1500 to 1750 of 24000
Train on 2425 samples, validate on 625 samples
Epoch 1/1
Training on batch 1750 to 2000 of 24000
Train on 2575 samples, validate on 625 samples
Epoch 1/1
Training on batch 2000 to 2250 of 24000
Train on 2449 samples, validate on 622 samples
Epoch 1/1
Training on batch 2250 to 2500 of 24000
Train on 2568 samples, validate on 661 samples
Epoch 1/1
[...]

Training on batch 11000 to 11250 of 24000
Train on 2399 samples, validate on 669 samples
Epoch 1/1
Training on batch 11250 to 11500 of 24000
Train on 2476 samples, validate on 669 samples
Epoch 1/1
Training on batch 11500 to 11750 of 24000
Train on 2450 samples, validate on 648 samples
Epoch 1/1
Training on batch 11750 to 12000 of 24000
Train on 2553 samples, validate on 646 samples
Epoch 1/1
Training on batch 12000 to 12250 of 24000
Train on 2395 samples, validate on 595 samples
Epoch 1/1
Training on batch 12250 to 12500 of 24000
Train on 2497 samples, validate on 617 samples
Epoch 1/1
Training on batch 12500 to 12750 of 24000
Train on 2535 samples, validate on 641 samples
Epoch 1/1
Training on batch 12750 to 13000 of 24000
Train on 2426 samples, validate on 604 samples
Epoch 1/1
Training on batch 13000 to 13250 of 24000
Train on 2445 samples, validate on 605 samples
Epoch 1/1
Training on batch 13250 to 13500 of 24000
Train on 2408 samples, validate on 672 samples
Epoch 1/1
[...]

Training on batch 21750 to 22000 of 24000
Train on 2515 samples, validate on 693 samples
Epoch 1/1
Training on batch 22000 to 22250 of 24000
Train on 2586 samples, validate on 627 samples
Epoch 1/1
Training on batch 22250 to 22500 of 24000
Train on 2543 samples, validate on 676 samples
Epoch 1/1
Training on batch 22500 to 22750 of 24000
Train on 2487 samples, validate on 582 samples
Epoch 1/1
Training on batch 22750 to 23000 of 24000
Train on 2557 samples, validate on 610 samples
Epoch 1/1
Training on batch 23000 to 23250 of 24000
Train on 2364 samples, validate on 661 samples
Epoch 1/1
Training on batch 23250 to 23500 of 24000
Train on 2538 samples, validate on 637 samples
Epoch 1/1
Training on batch 23500 to 23750 of 24000
Train on 2373 samples, validate on 637 samples
Epoch 1/1
Training on batch 23750 to 24000 of 24000
Train on 2593 samples, validate on 590 samples
Epoch 1/1
Sample of claim text: 1 a data storage system for use by a file system consumer according to a file system in

Training on batch 6000 to 6250 of 24000
Train on 2531 samples, validate on 637 samples
Epoch 1/1
Training on batch 6250 to 6500 of 24000
Train on 2574 samples, validate on 602 samples
Epoch 1/1
Training on batch 6500 to 6750 of 24000
Train on 2479 samples, validate on 602 samples
Epoch 1/1
Training on batch 6750 to 7000 of 24000
Train on 2548 samples, validate on 625 samples
Epoch 1/1
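
The log steps through the 24000 training samples in chunks of 250 (training_set_size=250), yet each chunk trains on roughly 2400 to 2600 samples and validates on roughly 600 to 700. One reading, which is an assumption inferred from these numbers rather than something the notebook states, is that every title is expanded into one training sample per token to predict, and that the expanded set is then split roughly 80/20 for validation. The sketch below illustrates that reading. The functions chunk_ranges and expand_title are hypothetical.

```python
def chunk_ranges(num_samples, chunk_size):
    """Yield (start, end) index pairs like the 'batch 0 to 250 of 24000' lines."""
    for start in range(0, num_samples, chunk_size):
        yield start, min(start + chunk_size, num_samples)

def expand_title(token_ids):
    """Return (prefix, next_token) pairs, one per token after the first."""
    return [(token_ids[:i], token_ids[i]) for i in range(1, len(token_ids))]

ranges = list(chunk_ranges(24000, 250))
print(len(ranges), ranges[0], ranges[-1])

# A hypothetical encoded title: startseq=1, three words, stopseq=2
pairs = expand_title([1, 7, 8, 9, 2])
for prefix, target in pairs:
    print(prefix, "->", target)
```

Under this assumption a chunk of 250 titles averaging about 13 tokens each would yield on the order of 3300 samples, which is in line with the train and validation counts in the log.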