<a href="https://colab.research.google.com/github/AslanDevbrat/Seq2Seq/blob/main/seq2seq.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Character-level recurrent sequence-to-sequence model

**Author:** [fchollet](https://twitter.com/fchollet)<br>
**Date created:** 2017/09/29<br>
**Last modified:** 2020/04/26<br>
**Description:** Character-level recurrent sequence-to-sequence model.

## Introduction

This example demonstrates how to implement a basic character-level
recurrent sequence-to-sequence model. We apply it to transliterating
romanized Hindi words from the Dakshina dataset into native script,
character-by-character. Note that it is fairly unusual to
do character-level machine translation, as word-level
models are more common in that domain.

**Summary of the algorithm**

- We start with input sequences from a domain (e.g. English sentences)
    and corresponding target sequences from another domain
    (e.g. French sentences).
- An encoder LSTM turns input sequences to 2 state vectors
    (we keep the last LSTM state and discard the outputs).
- A decoder LSTM is trained to turn the target sequences into
    the same sequence but offset by one timestep in the future,
    a training process called "teacher forcing" in this context.
    It uses as initial state the state vectors from the encoder.
    Effectively, the decoder learns to generate `targets[t+1...]`
    given `targets[...t]`, conditioned on the input sequence.
- In inference mode, when we want to decode unknown input sequences, we:
    - Encode the input sequence into state vectors
    - Start with a target sequence of size 1
        (just the start-of-sequence character)
    - Feed the state vectors and 1-char target sequence
        to the decoder to produce predictions for the next character
    - Sample the next character using these predictions
        (we simply use argmax).
    - Append the sampled character to the target sequence
    - Repeat until we generate the end-of-sequence character or we
        hit the character limit.
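
The inference loop described above can be sketched without any deep-learning library. In this minimal sketch the trained decoder is replaced by a hypothetical lookup table, `TOY_NEXT_CHAR_PROBS`, which is not part of the notebook; a real decoder would also carry its recurrent state from one step to the next.

```python
# Hypothetical stand-in for a trained decoder: maps the last generated
# character to a probability distribution over the next character.
TOY_NEXT_CHAR_PROBS = {
    "\t": {"h": 0.9, "i": 0.1},
    "h": {"i": 0.8, "\n": 0.2},
    "i": {"\n": 0.7, "h": 0.3},
}


def greedy_decode(next_char_probs, max_len=10):
    """Greedy decoding: take the argmax until end-of-sequence or max_len."""
    decoded = "\t"                       # target sequence of size 1
    while len(decoded) - 1 < max_len:    # character limit (start char excluded)
        probs = next_char_probs[decoded[-1]]
        next_char = max(probs, key=probs.get)  # argmax "sampling"
        if next_char == "\n":            # end-of-sequence character
            break
        decoded += next_char             # append and repeat
    return decoded[1:]                   # strip the start character


result = greedy_decode(TOY_NEXT_CHAR_PROBS)
print(repr(result))
```

With a real model, the table lookup would be replaced by a call to the decoder with the current character and the state vectors returned by the previous step.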


## Setup


In [6]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Embedding, SimpleRNNCell, GRUCell, Dense, LSTMCell
from tensorflow.keras import Input
import pandas as pd
from numpy import argmax
from math import log

## Download the data


In [7]:
!wget https://storage.googleapis.com/gresearch/dakshina/dakshina_dataset_v1.0.tar
!tar -xf 'dakshina_dataset_v1.0.tar'
train_file_path = "/content/dakshina_dataset_v1.0/hi/lexicons/hi.translit.sampled.train.tsv"
val_file_path = "/content/dakshina_dataset_v1.0/hi/lexicons/hi.translit.sampled.test.tsv"
test_file_path = "/content/dakshina_dataset_v1.0/hi/lexicons/hi.translit.sampled.dev.tsv"

--2022-06-18 17:53:08--  https://storage.googleapis.com/gresearch/dakshina/dakshina_dataset_v1.0.tar
Resolving storage.googleapis.com (storage.googleapis.com)... 142.250.98.128, 142.250.97.128, 74.125.196.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|142.250.98.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2008340480 (1.9G) [application/x-tar]
Saving to: ‘dakshina_dataset_v1.0.tar’


2022-06-18 17:53:19 (171 MB/s) - ‘dakshina_dataset_v1.0.tar’ saved [2008340480/2008340480]



## Configuration


In [102]:
batch_size = 64  # Batch size for training.
epochs = 100  # Number of epochs to train for.
latent_dim = 256  # Latent dimensionality of the encoding space.
num_samples = 1000  # Number of samples to train on.
# Path to the data txt file on disk.
data_path = train_file_path


## Prepare the data


In [117]:
def processData(filename, input_chars=None, target_chars=None):
  # Copy the vocabularies: a mutable default such as `set()` would be shared
  # between calls and leak characters from one split into the next.
  input_chars = set(input_chars) if input_chars is not None else set()
  target_chars = set(target_chars) if target_chars is not None else set()
  input = []
  target = []
  with open(filename, "r", encoding="utf-8") as f:
    lines = f.read().split("\n")
  # Only the first 100 lines are used to keep the experiment small.
  for line in lines[:100]:
    if not line:
      continue  # skip blank lines, e.g. the one after a trailing newline
    t_text, i_text, attestation = line.split("\t")
    # We use "\t" as the "start sequence" character and "\n" as "end sequence" character for the target text.
    input.append(i_text)
    target.append("\t" + t_text + "\n")
    input_chars.update(i_text)
    target_chars.update(t_text)
  target_chars.add("\t")
  target_chars.add("\n")

  input_chars = sorted(input_chars)
  target_chars = sorted(target_chars)
  num_encoder_tokens = len(input_chars)
  num_decoder_tokens = len(target_chars)
  max_encoder_seq_length = max(len(txt) for txt in input)
  max_decoder_seq_length = max(len(txt) for txt in target)
  return input, target, input_chars, target_chars, num_encoder_tokens, num_decoder_tokens, max_encoder_seq_length, max_decoder_seq_length

In [118]:
# Load the training data and build the character vocabularies.
input,target,input_chars,target_chars,num_encoder_tokens,num_decoder_tokens, max_encoder_seq_length, max_decoder_seq_length=processData(train_file_path)
print("Number of samples:", len(input))
print("Number of unique input tokens:", num_encoder_tokens)
print("Number of unique output tokens:", num_decoder_tokens)
print("Max sequence length for inputs:", max_encoder_seq_length)
print("Max sequence length for outputs:", max_decoder_seq_length)

Number of samples: 100
Number of unique input tokens: 23
Number of unique output tokens: 34
Max sequence length for inputs: 18
Max sequence length for outputs: 17
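
To see what `processData` computes, the sketch below applies the same steps to a tiny in-memory sample. The three rows are made up for illustration (they are not taken from the Dakshina files), and the column order (target word, input word, count) is assumed from how the function unpacks each line.

```python
# Made-up sample in the assumed "<target>\t<input>\t<count>" layout.
sample = "अब\tab\t3\nघर\tghar\t2\nनल\tnal\t1"

inputs, targets = [], []
input_vocab, target_vocab = set(), {"\t", "\n"}
for line in sample.split("\n"):
    t_text, i_text, _count = line.split("\t")
    inputs.append(i_text)
    targets.append("\t" + t_text + "\n")  # add start/end markers
    input_vocab.update(i_text)
    target_vocab.update(t_text)

input_chars = sorted(input_vocab)
target_chars = sorted(target_vocab)
max_encoder_len = max(len(text) for text in inputs)
max_decoder_len = max(len(text) for text in targets)
print(input_chars, len(target_chars), max_encoder_len, max_decoder_len)
```

Note that the decoder length includes the two marker characters, which is why target sequences come out two characters longer than the words themselves.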


In [119]:
# Load the validation data, extending the vocabularies built from the training data.
validation_input,validation_target,input_chars,target_chars,num_encoder_tokens,num_decoder_tokens, validation_max_encoder_seq_length, validation_max_decoder_seq_length=processData(val_file_path,set(input_chars),set(target_chars))

print("Number of validation samples:", len(validation_input))
print("Number of unique input tokens:", num_encoder_tokens)
print("Number of unique output tokens:", num_decoder_tokens)
print("validation Max sequence length for inputs:", validation_max_encoder_seq_length)
print("validation Max sequence length for outputs:", validation_max_decoder_seq_length)

Number of validation samples: 100
Number of unique input tokens: 23
Number of unique output tokens: 46
validation Max sequence length for inputs: 12
validation Max sequence length for outputs: 13


In [120]:
# Load the test data.
test_input,test_target,test_input_chars,test_target_chars,test_num_encoder_tokens,test_num_decoder_tokens, test_max_encoder_seq_length, test_max_decoder_seq_length=processData(test_file_path)
print("Number of test samples:", len(test_input))
print("Test Max sequence length for inputs:", test_max_encoder_seq_length)
print("Test Max sequence length for outputs:", test_max_decoder_seq_length)

Number of test samples: 100
Test Max sequence length for inputs: 12
Test Max sequence length for outputs: 12


In [121]:
input_token = dict([(char, i) for i, char in enumerate(input_chars)])
target_token = dict([(char, i) for i, char in enumerate(target_chars)])

reverse_input_token = dict((i, char) for char, i in input_token.items())
reverse_target_token = dict((i, char) for char, i in target_token.items())


encoder_input_data = np.zeros(
    (len(input), max_encoder_seq_length, num_encoder_tokens), dtype="float32"
)
validation_encoder_input_data=np.zeros(
    (len(validation_input), max_encoder_seq_length, num_encoder_tokens), dtype="float32"
)
test_encoder_input_data=np.zeros(
    (len(test_input), max_encoder_seq_length, num_encoder_tokens), dtype="float32"
)
decoder_input_data = np.zeros(
    (len(input), max_decoder_seq_length, num_decoder_tokens), dtype="float32"
)
validation_decoder_input_data =np.zeros(
    (len(validation_input), max_decoder_seq_length, num_decoder_tokens), dtype="float32"
)
decoder_target_data = np.zeros(
    (len(input), max_decoder_seq_length, num_decoder_tokens), dtype="float32"
)
validation_decoder_target_data = np.zeros(
    (len(validation_input), max_decoder_seq_length, num_decoder_tokens), dtype="float32"
)

for i, (input_text, target_text) in enumerate(zip(input, target)):
    for t, char in enumerate(input_text):
        encoder_input_data[i, t, input_token[char]] = 1.0
    for t, char in enumerate(target_text):
        # decoder_target_data is ahead of decoder_input_data by one timestep
        decoder_input_data[i, t, target_token[char]] = 1.0
        if t > 0:
            # decoder_target_data will be ahead by one timestep
            # and will not include the start character.
            decoder_target_data[i, t - 1, target_token[char]] = 1.0
# for validation data
for i, (validation_input_text, validation_target_text) in enumerate(zip(validation_input, validation_target)):
    for t, char in enumerate(validation_input_text):
        validation_encoder_input_data[i, t, input_token[char]] = 1.0
    for t, char in enumerate(validation_target_text):
        # decoder_target_data is ahead of decoder_input_data by one timestep
        validation_decoder_input_data[i, t, target_token[char]] = 1.0
        if t > 0:
            # decoder_target_data will be ahead by one timestep
            # and will not include the start character.
            validation_decoder_target_data[i, t - 1, target_token[char]] = 1.0

# for test data
for i, (test_input_text, test_target_text) in enumerate(zip(test_input, test_target)):
    for t, char in enumerate(test_input_text):
        test_encoder_input_data[i, t, input_token[char]] = 1.0

In [122]:
encoder_input_data.shape

(100, 18, 23)
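
The one-timestep offset between decoder input and decoder target is easier to see on a single toy word. The sketch below is illustrative only: the word `"ab"`, the padded length of 5, and the helper `decode` are assumptions, not part of the notebook.

```python
import numpy as np

target_text = "\tab\n"             # start marker, two characters, end marker
chars = sorted(set(target_text))   # ['\t', '\n', 'a', 'b']
index = {c: i for i, c in enumerate(chars)}
max_len = 5                        # hypothetical padded length

decoder_input = np.zeros((max_len, len(chars)), dtype="float32")
decoder_target = np.zeros((max_len, len(chars)), dtype="float32")
for t, char in enumerate(target_text):
    decoder_input[t, index[char]] = 1.0
    if t > 0:
        # The target at step t - 1 is the input character at step t.
        decoder_target[t - 1, index[char]] = 1.0


def decode(one_hot):
    """Map one-hot rows back to characters; all-zero rows are padding."""
    return "".join(chars[int(row.argmax())] for row in one_hot if row.any())


print(repr(decode(decoder_input)))
print(repr(decode(decoder_target)))
```

The decoded target is the decoded input without the start character, which is exactly what teacher forcing trains the decoder to predict.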

## Build the model


In [123]:
class Seq2seq(tf.keras.Model):
  def __init__(self, num_encoder_tokens, num_decoder_tokens,embedding_dim,num_of_layers,unit_type, dropout , recurrent_dropout):
    super().__init__()
    self.encoder_inputs = Input(shape = (None,num_encoder_tokens))
    self.decoder_inputs = keras.Input(shape=(None, num_decoder_tokens))
    self.num_encoder_tokens = num_encoder_tokens
    # No embedding layer is used at the moment, so `embedding_dim` acts as
    # the number of units in every recurrent cell.
    self.embedding_dim = embedding_dim
    self.dropout = dropout
    self.recurrent_dropout = recurrent_dropout
    self.num_decoder_tokens = num_decoder_tokens
    self.num_of_encoder_layer = num_of_layers
    self.num_of_decoder_layer = num_of_layers
    self.type_encoder_unit = unit_type
    self.type_decoder_unit = unit_type
    self.build_graph()
    self.build_model()

  def get_embedding_layer(self, num_encoder_tokens, embedding_dim):
    return Embedding(num_encoder_tokens, embedding_dim, )

  def get_cell(self, cell_type = "lstm", num_of_cell = 1, name = None):
    if cell_type == "lstm":
      return LSTMCell(num_of_cell, dropout = self.dropout, recurrent_dropout = self.recurrent_dropout)
    elif cell_type == "rnn":
      return SimpleRNNCell(num_of_cell, dropout = self.dropout, recurrent_dropout = self.recurrent_dropout)
    elif cell_type =="gru":
      return GRUCell(num_of_cell, dropout = self.dropout, recurrent_dropout = self.recurrent_dropout)
    else:
      # Fail early instead of returning None, which would only surface later
      # as an obscure error inside StackedRNNCells.
      raise ValueError(f"Invalid cell type: {cell_type}")

  def get_encoder(self,latent_dim, cell_type = "lstm", num_of_layer = 1, ):
    return tf.keras.layers.RNN(tf.keras.layers.StackedRNNCells( [self.get_cell(cell_type, latent_dim) for i in range(num_of_layer)]), return_state = True,)

  def get_decoder(self,latent_dim ,cell_type = "lstm", num_of_layer = 1, name = None ):
    return tf.keras.layers.RNN(tf.keras.layers.StackedRNNCells( [self.get_cell(cell_type, latent_dim,) for i in range(num_of_layer)]), return_sequences=True, return_state=True)

  def get_dense_layer(self, num_decoder_tokens, activation = "softmax"):
    # Use the argument rather than the global variable of the same name.
    return Dense(num_decoder_tokens, activation= activation)

  def build_graph(self):
    # Wires the encoder and decoder together. (Avoid naming this method
    # `train_step`: `keras.Model` uses that name for its per-batch training logic.)
    # self.embedding_layer = self.get_embedding_layer( self.num_encoder_tokens, self.embedding_dim)
    # self.embedding_results = self.embedding_layer()

    self.encoder = self.get_encoder( self.embedding_dim,self.type_encoder_unit, self.num_of_encoder_layer , )
    encoder_results = self.encoder(self.encoder_inputs)

    # The first result is the encoder output; the rest are the per-layer states.
    self.encoder_outputs, self.encoder_states = encoder_results[0], encoder_results[1:]

    # self.embedding_layer2 = self.get_embedding_layer( self.num_decoder_tokens, self.embedding_dim)
    # self.embedding_results2 = self.embedding_layer2(self.decoder_inputs,)

    self.decoder = self.get_decoder( self.embedding_dim, self.type_decoder_unit, self.num_of_decoder_layer,)
    self.decoder_results = self.decoder(self.decoder_inputs, initial_state=self.encoder_states)

    self.decoder_output = self.decoder_results[0]
    self.decoder_dense = self.get_dense_layer(self.num_decoder_tokens)
    self.dense_output = self.decoder_dense(self.decoder_output)

  def build_model(self):
    # Functional model mapping [encoder input, decoder input] to per-step character probabilities.
    self.model = keras.Model([self.encoder_inputs, self.decoder_inputs], self.dense_output, name = "Seq2Seq_model")
    return self.model



In [124]:

seq2seq = Seq2seq(num_encoder_tokens,num_decoder_tokens, 64,4,"lstm", 0.0, 0.0).build_model()
seq2seq.summary()

Model: "Seq2Seq_model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_5 (InputLayer)           [(None, None, 23)]   0           []                               
                                                                                                  
 input_6 (InputLayer)           [(None, None, 46)]   0           []                               
                                                                                                  
 rnn_4 (RNN)                    [(None, 64),         121600      ['input_5[0][0]']                
                                 [(None, 64),                                                     
                                 (None, 64)],                                                     
                                 [(None, 64),                                         

In [None]:
def lstm_cell():
  return tf.keras.layers.LSTMCell(4)

In [None]:
num_encoder_tokens

32

In [5]:
# Define an input sequence and process it.
encoder_inputs = keras.Input(shape=(None, num_encoder_tokens))
encoder = keras.layers.LSTM(latent_dim, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)

# We discard `encoder_outputs` and only keep the states.
encoder_states = [state_h, state_c]

# Set up the decoder, using `encoder_states` as initial state.
decoder_inputs = keras.Input(shape=(None, num_decoder_tokens))

# We set up our decoder to return full output sequences,
# and to return internal states as well. We don't use the
# return states in the training model, but we will use them in inference.
decoder_lstm = keras.layers.LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = keras.layers.Dense(num_decoder_tokens, activation="softmax")
decoder_outputs = decoder_dense(decoder_outputs)

# Define the model that will turn
# `encoder_input_data` & `decoder_input_data` into `decoder_target_data`
model = keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)

In [6]:
model.summary()

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_1 (InputLayer)           [(None, None, 32)]   0           []                               
                                                                                                  
 input_2 (InputLayer)           [(None, None, 25)]   0           []                               
                                                                                                  
 lstm (LSTM)                    [(None, 256),        295936      ['input_1[0][0]']                
                                 (None, 256),                                                     
                                 (None, 256)]                                                     
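
The `Param #` values reported for the recurrent layers in the two model summaries can be checked by hand. The sketch below assumes the standard LSTM parameterization (four gates, each with an input kernel, a recurrent kernel, and a bias); `lstm_params` is a hypothetical helper, not a Keras function.

```python
def lstm_params(input_dim, units):
    """Trainable parameters of one LSTM layer under the assumption above."""
    return 4 * ((input_dim + units + 1) * units)


# Single 256-unit LSTM encoder fed 32-dimensional one-hot vectors.
single = lstm_params(32, 256)

# Four stacked 64-unit LSTM cells: the first sees 23-dimensional inputs,
# each later cell sees the 64-dimensional output of the cell below it.
stacked = lstm_params(23, 64) + 3 * lstm_params(64, 64)

print(single, stacked)
```

These totals agree with the 295936 shown for `lstm` and the 121600 shown for `rnn_4`.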
                                                                                              

In [None]:
encoder_inputs

<KerasTensor: shape=(None, None, 32) dtype=float32 (created by layer 'input_2')>

## Train the model


In [224]:
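# Forward pass on a single sample. The model takes only the encoder and
# decoder inputs; the targets are not an argument of the call.
predict = seq2seq([encoder_input_data[2:3], decoder_input_data[2:3]])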

In [225]:
predict[0].shape

TensorShape([21, 68])

In [25]:
class BeamSearch(keras.callbacks.Callback):
  """Reports exact-match accuracy on the validation set using beam search."""

  def __init__(self, beam_size):
    super().__init__()
    self.beam_size = beam_size

  def beam_search_decoder(self, data, k):
    # Each candidate is [token_ids, score]; the score is the negative
    # log-likelihood, so lower is better.
    sequences = [[list(), 0.0]]
    # walk over each step in sequence
    for row in data:
      all_candidates = list()
      # expand each current candidate
      for i in range(len(sequences)):
        seq, score = sequences[i]
        for j in range(len(row)):
          # A small epsilon keeps log() defined when a probability is exactly 0.
          candidate = [seq + [j], score - log(row[j] + 1e-12)]
          all_candidates.append(candidate)
      # order all candidates by score
      ordered = sorted(all_candidates, key=lambda tup:tup[1])
      # select k best
      sequences = ordered[:k]
    return sequences

  def on_epoch_end(self, epoch, logs = None):
    # The decoder is fed the ground-truth prefix here (teacher forcing).
    prediction = self.model.predict([validation_encoder_input_data , validation_decoder_input_data])
    print(prediction.shape)
    # Count matches over the whole validation set, not per sample.
    correct_prediction = 0
    for i, pred in enumerate(prediction):
      beam_search_prediction = self.beam_search_decoder(pred, self.beam_size)
      for seq, _ in beam_search_prediction:
        decoded = "".join([reverse_target_token[x] for x in seq])
        # Keep everything up to and including the first end-of-sequence
        # character; later timesteps are padding.
        end = decoded.find("\n")
        if end != -1:
          decoded = decoded[:end + 1]
        translated_word = "\t" + decoded
        if translated_word == validation_target[i]:
          correct_prediction += 1
          break
    print(f"Accuracy by Beam Search {correct_prediction/len(validation_target)}")
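
A small worked example may help when reading the beam search used in the callback above. The sketch below is a standalone re-implementation of the same idea, not the callback itself, and the probability table `toy_probs` is made up. Each row is treated as a fixed distribution for that timestep, as in the callback, where the rows come from a single teacher-forced prediction.

```python
from math import log


def beam_search(probs, k):
    """Keep the k lowest-cost sequences, where cost is the summed -log(p)."""
    beams = [([], 0.0)]
    for row in probs:
        candidates = [
            (seq + [j], cost - log(p))
            for seq, cost in beams
            for j, p in enumerate(row)
        ]
        beams = sorted(candidates, key=lambda item: item[1])[:k]
    return beams


# Made-up per-step distributions over a 3-token vocabulary.
toy_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
]
beams = beam_search(toy_probs, k=2)
best_sequences = [seq for seq, _ in beams]
print(best_sequences)
```

With `k=1` this reduces to greedy argmax decoding; larger beams keep lower-probability prefixes alive in case they lead to a better overall sequence.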


In [1]:
%%capture
!pip install wandb --upgrade

In [157]:
sweep_config = {
    
    'method':'bayes',
    'metric': {
        'name':'val_accuracy',
        'goal':'maximize'
    },
    'parameters':{
    
    "num_of_layer" : {'values': [1,2,3]},
    "unit_size": {"values":[16,32,64]},
    "unit_type": {"values":["lstm","rnn","gru"]},
    "dropout": {"values": [0.0, 0.2, 0.4]},
    'recurrent_dropout':{'values':[0.0,0.3]},
    "beam_size" : {"values":[1,2,3,4]},
    "epochs":{"value":2}               
                   }
}

import pprint

pprint.pprint(sweep_config)

{'method': 'bayes',
 'metric': {'goal': 'maximize', 'name': 'val_accuracy'},
 'parameters': {'beam_size': {'values': [1, 2, 3, 4]},
                'dropout': {'values': [0.0, 0.2, 0.4]},
                'epochs': {'value': 2},
                'num_of_layer': {'values': [1, 2, 3]},
                'recurrent_dropout': {'values': [0.0, 0.3]},
                'unit_size': {'values': [16, 32, 64]},
                'unit_type': {'values': ['lstm', 'rnn', 'gru']}}}
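
For a sense of scale, the number of distinct configurations this sweep can draw from can be counted directly. The sketch below copies the discrete choices from `sweep_config`; the `choices` dictionary is only a restatement for illustration, and `epochs` is left out because it has a single fixed value.

```python
from math import prod

# Same discrete choices as in sweep_config.
choices = {
    "num_of_layer": [1, 2, 3],
    "unit_size": [16, 32, 64],
    "unit_type": ["lstm", "rnn", "gru"],
    "dropout": [0.0, 0.2, 0.4],
    "recurrent_dropout": [0.0, 0.3],
    "beam_size": [1, 2, 3, 4],
}
num_configurations = prod(len(values) for values in choices.values())
print(num_configurations)
```

A Bayesian sweep only samples from this space, so the runs logged in this notebook cover a small fraction of it.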


In [151]:
import wandb
from wandb.keras import WandbCallback
wandb.login()

True

In [139]:
sweep_id = wandb.sweep(sweep_config, project="seq2seq")

Create sweep with ID: 6fur9pnn
Sweep URL: https://wandb.ai/aslan/seq2seq/sweeps/6fur9pnn


In [None]:
def train(config = None):
  with wandb.init(config=config):
    config = wandb.config
    #print(config)
    seq2seq = Seq2seq(num_encoder_tokens,num_decoder_tokens, config.unit_size, config.num_of_layer,config.unit_type , config.dropout,config.recurrent_dropout).build_model()
    seq2seq.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy",])
    seq2seq.fit(
        [encoder_input_data, decoder_input_data],
        decoder_target_data,
        batch_size=batch_size,
        epochs=config.epochs,
        validation_data =  ([validation_encoder_input_data , validation_decoder_input_data] ,validation_decoder_target_data),
        callbacks = [BeamSearch(config.beam_size), WandbCallback()],verbose = 1, 
        )


    
    
wandb.agent(sweep_id, train)

wandb: Agent Starting Run: 324vtdkz with config:
wandb:   beam_size: 4
wandb:   dropout: 0.4
wandb:   epochs: 2
wandb:   num_of_layer: 3
wandb:   recurrent_dropout: 0
wandb:   unit_size: 32
wandb:   unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.08882
best_epoch,1.0
best_val_loss,1.53178
epoch,1.0
loss,1.77208
val_accuracy,0.08294
val_loss,1.53178


wandb: Agent Starting Run: 2jm9p11k with config:
wandb:   beam_size: 4
wandb:   dropout: 0.4
wandb:   epochs: 2
wandb:   num_of_layer: 2
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 32
wandb:   unit_type: lstm


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,█▁
val_loss,█▁

Run summary:
accuracy,0.06706
best_epoch,1.0
best_val_loss,1.54294
epoch,1.0
loss,1.78177
val_accuracy,0.02118
val_loss,1.54294


wandb: Agent Starting Run: w9gnyzzn with config:
wandb:   beam_size: 3
wandb:   dropout: 0
wandb:   epochs: 2
wandb:   num_of_layer: 2
wandb:   recurrent_dropout: 0
wandb:   unit_size: 32
wandb:   unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.04
best_epoch,1.0
best_val_loss,1.53924
epoch,1.0
loss,1.77653
val_accuracy,0.05118
val_loss,1.53924


wandb: Sweep Agent: Waiting for job.
wandb: Job received.
wandb: Agent Starting Run: qt4qcde1 with config:
wandb:   beam_size: 4
wandb:   dropout: 0.4
wandb:   epochs: 2
wandb:   num_of_layer: 3
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 64
wandb:   unit_type: lstm


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.13235
best_epoch,1.0
best_val_loss,1.41562
epoch,1.0
loss,1.74625
val_accuracy,0.11706
val_loss,1.41562


wandb: Agent Starting Run: fl8l94we with config:
wandb:   beam_size: 3
wandb:   dropout: 0.2
wandb:   epochs: 2
wandb:   num_of_layer: 2
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 32
wandb:   unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.09176
best_epoch,1.0
best_val_loss,1.53676
epoch,1.0
loss,1.77119
val_accuracy,0.03235
val_loss,1.53676


wandb: Agent Starting Run: 3yqabi43 with config:
wandb:   beam_size: 2
wandb:   dropout: 0.4
wandb:   epochs: 2
wandb:   num_of_layer: 3
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 64
wandb:   unit_type: rnn


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.01647
best_epoch,1.0
best_val_loss,1.52969
epoch,1.0
loss,1.85858
val_accuracy,0.02059
val_loss,1.52969


wandb: Agent Starting Run: 1di30wc8 with config:
wandb:   beam_size: 2
wandb:   dropout: 0
wandb:   epochs: 2
wandb:   num_of_layer: 3
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 16
wandb:   unit_type: rnn


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,█▁
epoch,▁█
loss,█▁
val_accuracy,█▁
val_loss,█▁

Run summary:
accuracy,0.03118
best_epoch,1.0
best_val_loss,1.53547
epoch,1.0
loss,1.80503
val_accuracy,0.02235
val_loss,1.53547


wandb: Agent Starting Run: 8fmhq6d1 with config:
wandb:   beam_size: 4
wandb:   dropout: 0.2
wandb:   epochs: 2
wandb:   num_of_layer: 3
wandb:   recurrent_dropout: 0
wandb:   unit_size: 64
wandb:   unit_type: lstm


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,█▁
val_loss,█▁

Run summary:
accuracy,0.10294
best_epoch,1.0
best_val_loss,1.39573
epoch,1.0
loss,1.7385
val_accuracy,0.08
val_loss,1.39573


wandb: Agent Starting Run: rb5otp88 with config:
wandb:   beam_size: 2
wandb:   dropout: 0.2
wandb:   epochs: 2
wandb:   num_of_layer: 2
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 16
wandb:   unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.05588
best_epoch,1.0
best_val_loss,1.54746
epoch,1.0
loss,1.78525
val_accuracy,0.07765
val_loss,1.54746


wandb: Agent Starting Run: to7y0ys6 with config:
wandb:   beam_size: 4
wandb:   dropout: 0
wandb:   epochs: 2
wandb:   num_of_layer: 1
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 32
wandb:   unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.00941
best_epoch,1.0
best_val_loss,1.54532
epoch,1.0
loss,1.77931
val_accuracy,0.01529
val_loss,1.54532


wandb: Agent Starting Run: ex22y9cf with config:
wandb:   beam_size: 4
wandb:   dropout: 0.2
wandb:   epochs: 2
wandb:   num_of_layer: 1
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 32
wandb:   unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.06647
best_epoch,1.0
best_val_loss,1.54665
epoch,1.0
loss,1.77796
val_accuracy,0.03235
val_loss,1.54665


wandb: Agent Starting Run: dyv7tnfr with config:
wandb:   beam_size: 4
wandb:   dropout: 0.4
wandb:   epochs: 2
wandb:   num_of_layer: 2
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 64
wandb:   unit_type: rnn


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.01765
best_epoch,1.0
best_val_loss,1.51442
epoch,1.0
loss,1.85082
val_accuracy,0.03059
val_loss,1.51442


wandb: Agent Starting Run: ifzgo4r2 with config:
wandb:   beam_size: 3
wandb:   dropout: 0.2
wandb:   epochs: 2
wandb:   num_of_layer: 2
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 32
wandb:   unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.10471
best_epoch,1.0
best_val_loss,1.5394
epoch,1.0
loss,1.77631
val_accuracy,0.09
val_loss,1.5394


wandb: Sweep Agent: Waiting for job.
wandb: Job received.
wandb: Agent Starting Run: 4l7338jy with config:
wandb:   beam_size: 4
wandb:   dropout: 0.2
wandb:   epochs: 2
wandb:   num_of_layer: 2
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 32
wandb:   unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.08824
best_epoch,1.0
best_val_loss,1.54197
epoch,1.0
loss,1.77687
val_accuracy,0.09
val_loss,1.54197


wandb: Agent Starting Run: rvh0iep2 with config:
wandb:   beam_size: 4
wandb:   dropout: 0.2
wandb:   epochs: 2
wandb:   num_of_layer: 1
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 32
wandb:   unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.03
best_epoch,1.0
best_val_loss,1.54608
epoch,1.0
loss,1.77984
val_accuracy,0.01412
val_loss,1.54608


wandb: Agent Starting Run: yi183i1b with config:
wandb:   beam_size: 4
wandb:   dropout: 0.4
wandb:   epochs: 2
wandb:   num_of_layer: 2
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 32
wandb:   unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.08529
best_epoch,1.0
best_val_loss,1.53258
epoch,1.0
loss,1.7733
val_accuracy,0.11118
val_loss,1.53258


wandb: Agent Starting Run: 5hlzk00t with config:
wandb:   beam_size: 2
wandb:   dropout: 0.2
wandb:   epochs: 2
wandb:   num_of_layer: 3
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 16
wandb:   unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0

Run history:
accuracy,▁█
epoch,▁█
loss,█▁
val_accuracy,▁█
val_loss,█▁

Run summary:
accuracy,0.08588
best_epoch,1.0
best_val_loss,1.5473
epoch,1.0
loss,1.78508
val_accuracy,0.09588
val_loss,1.5473


wandb: Agent Starting Run: wz7shcel with config:
wandb:   beam_size: 4
wandb:   dropout: 0
wandb:   epochs: 2
wandb:   num_of_layer: 2
wandb:   recurrent_dropout: 0.3
wandb:   unit_size: 64
wandb:   unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.17529
best_epoch: 1.0
best_val_loss: 1.50928
epoch: 1.0
loss: 1.75534
val_accuracy: 0.13176
val_loss: 1.50928


wandb: Agent Starting Run: 464k8pev with config:
wandb: 	beam_size: 3
wandb: 	dropout: 0.2
wandb: 	epochs: 2
wandb: 	num_of_layer: 3
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.08588
best_epoch: 1.0
best_val_loss: 1.54385
epoch: 1.0
loss: 1.78115
val_accuracy: 0.08
val_loss: 1.54385


wandb: Agent Starting Run: x4j2cdbx with config:
wandb: 	beam_size: 3
wandb: 	dropout: 0.2
wandb: 	epochs: 2
wandb: 	num_of_layer: 3
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.04
best_epoch: 1.0
best_val_loss: 1.54367
epoch: 1.0
loss: 1.78235
val_accuracy: 0.04765
val_loss: 1.54367


wandb: Agent Starting Run: qji0f49l with config:
wandb: 	beam_size: 2
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 3
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.59059
best_epoch: 1.0
best_val_loss: 1.54394
epoch: 1.0
loss: 1.78004
val_accuracy: 0.64824
val_loss: 1.54394


wandb: Agent Starting Run: fhds7gj8 with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 3
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 32
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.10882
best_epoch: 1.0
best_val_loss: 1.53678
epoch: 1.0
loss: 1.77414
val_accuracy: 0.03824
val_loss: 1.53678


wandb: Sweep Agent: Waiting for job.
wandb: Job received.
wandb: Agent Starting Run: 4yrmzbj0 with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 2
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 32
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.06118
best_epoch: 1.0
best_val_loss: 1.53518
epoch: 1.0
loss: 1.77244
val_accuracy: 0.11765
val_loss: 1.53518


wandb: Agent Starting Run: t4mjitwz with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0.2
wandb: 	epochs: 2
wandb: 	num_of_layer: 2
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: █▁
val_loss: █▁

Run summary:
accuracy: 0.09765
best_epoch: 1.0
best_val_loss: 1.54674
epoch: 1.0
loss: 1.78261
val_accuracy: 0.08412
val_loss: 1.54674


wandb: Agent Starting Run: qhbvakt8 with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0.4
wandb: 	epochs: 2
wandb: 	num_of_layer: 2
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.06118
best_epoch: 1.0
best_val_loss: 1.54977
epoch: 1.0
loss: 1.78523
val_accuracy: 0.06118
val_loss: 1.54977


wandb: Agent Starting Run: 7jub4kta with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 3
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 32
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.11824
best_epoch: 1.0
best_val_loss: 1.5301
epoch: 1.0
loss: 1.77405
val_accuracy: 0.11118
val_loss: 1.5301


wandb: Agent Starting Run: 7zc0fk04 with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 3
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 32
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.11412
best_epoch: 1.0
best_val_loss: 1.52645
epoch: 1.0
loss: 1.76884
val_accuracy: 0.07235
val_loss: 1.52645


wandb: Agent Starting Run: gafh3ehd with config:
wandb: 	beam_size: 3
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 3
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 32
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.09294
best_epoch: 1.0
best_val_loss: 1.52884
epoch: 1.0
loss: 1.77082
val_accuracy: 0.11
val_loss: 1.52884


wandb: Agent Starting Run: oui27pc3 with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0.2
wandb: 	epochs: 2
wandb: 	num_of_layer: 3
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.10059
best_epoch: 1.0
best_val_loss: 1.54244
epoch: 1.0
loss: 1.78308
val_accuracy: 0.11294
val_loss: 1.54244


wandb: Agent Starting Run: avi0tl3q with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 2
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 32
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: █▁
val_loss: █▁

Run summary:
accuracy: 0.10412
best_epoch: 1.0
best_val_loss: 1.53731
epoch: 1.0
loss: 1.77538
val_accuracy: 0.07471
val_loss: 1.53731


wandb: Agent Starting Run: pojmsf3a with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 2
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: █▁
epoch: ▁█
loss: █▁
val_accuracy: █▁
val_loss: █▁

Run summary:
accuracy: 0.11
best_epoch: 1.0
best_val_loss: 1.5492
epoch: 1.0
loss: 1.78298
val_accuracy: 0.04706
val_loss: 1.5492


wandb: Agent Starting Run: x608jn1p with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 2
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.11647
best_epoch: 1.0
best_val_loss: 1.54657
epoch: 1.0
loss: 1.77993
val_accuracy: 0.09529
val_loss: 1.54657


wandb: Agent Starting Run: 1zeoq7py with config:
wandb: 	beam_size: 3
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 3
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: █▁
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.01824
best_epoch: 1.0
best_val_loss: 1.54377
epoch: 1.0
loss: 1.78195
val_accuracy: 0.08294
val_loss: 1.54377


wandb: Agent Starting Run: c29jhrik with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0.4
wandb: 	epochs: 2
wandb: 	num_of_layer: 1
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.03588
best_epoch: 1.0
best_val_loss: 1.55392
epoch: 1.0
loss: 1.79355
val_accuracy: 0.01235
val_loss: 1.55392


wandb: Agent Starting Run: 3hahioqs with config:
wandb: 	beam_size: 2
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 3
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.07882
best_epoch: 1.0
best_val_loss: 1.54484
epoch: 1.0
loss: 1.78123
val_accuracy: 0.03882
val_loss: 1.54484


wandb: Agent Starting Run: 312h38gm with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0.2
wandb: 	epochs: 2
wandb: 	num_of_layer: 3
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 32
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: █▁
val_loss: █▁

Run summary:
accuracy: 0.14765
best_epoch: 1.0
best_val_loss: 1.5233
epoch: 1.0
loss: 1.76557
val_accuracy: 0.12059
val_loss: 1.5233


wandb: Agent Starting Run: a8rs6tlb with config:
wandb: 	beam_size: 2
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 2
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: ▁█
val_loss: █▁

Run summary:
accuracy: 0.12647
best_epoch: 1.0
best_val_loss: 1.54587
epoch: 1.0
loss: 1.78194
val_accuracy: 0.09765
val_loss: 1.54587


wandb: Agent Starting Run: a4qq1jxc with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0
wandb: 	epochs: 2
wandb: 	num_of_layer: 2
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


Run history:
accuracy: ▁█
epoch: ▁█
loss: █▁
val_accuracy: █▁
val_loss: █▁

Run summary:
accuracy: 0.05412
best_epoch: 1.0
best_val_loss: 1.5457
epoch: 1.0
loss: 1.78323
val_accuracy: 0.03235
val_loss: 1.5457


wandb: Agent Starting Run: dsmcop94 with config:
wandb: 	beam_size: 4
wandb: 	dropout: 0.4
wandb: 	epochs: 2
wandb: 	num_of_layer: 1
wandb: 	recurrent_dropout: 0.3
wandb: 	unit_size: 16
wandb: 	unit_type: gru


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


In [127]:
seq2seq.compile(
    optimizer="rmsprop",
    loss="categorical_crossentropy",
    metrics=[tf.keras.metrics.CategoricalAccuracy(name="acc")],
)
seq2seq.metrics_names

[]

In [158]:

history = seq2seq.fit(
    [encoder_input_data, decoder_input_data],
    decoder_target_data,
    batch_size=batch_size,
    epochs=2,
    validation_data=(
        [validation_encoder_input_data, validation_decoder_input_data],
        validation_decoder_target_data,
    ),
    callbacks=[BeamSearch(3)],
)
# Save model
seq2seq.save("s2s")


Epoch 1/2
Accuracy by Beam Search 0.0
Epoch 2/2
Accuracy by Beam Search 0.0


INFO:tensorflow:Assets written to: s2s/assets


In [161]:
# Print the per-epoch values of every metric recorded during training.
for key in history.history.keys():
    print(key, history.history[key])
    # wandb.log({key: history.history[key]})

loss [1.3459969758987427, 1.3179850578308105]
acc [0.17529411613941193, 0.17176470160484314]
val_loss [1.3198268413543701, 1.3006969690322876]
val_acc [0.11705882102251053, 0.13294117152690887]


In [129]:
seq2seq.metrics_names

['loss', 'acc']

## Run inference (sampling)

1. Encode the input sequence and retrieve the initial decoder state.
2. Run one step of the decoder with this initial state and a
"start of sequence" token as the target.
The output is the next target token.
3. Repeat with the current target token and the current states until the
end-of-sequence token is produced or the maximum length is reached.
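
The training logs in this notebook report an "Accuracy by Beam Search" from a `BeamSearch` callback whose code is not shown in this section. As a rough illustration only, the sketch below shows how step 2 can keep several candidate sequences alive instead of committing to a single argmax choice. The `step_fn` interface, the toy probability table, and the token ids are hypothetical stand-ins for the decoder model, not the notebook's implementation, and no length normalization is applied.

```python
import numpy as np


def beam_search(step_fn, start_token, end_token, beam_size=3, max_len=10):
    """Return the highest-scoring token sequence found by beam search.

    step_fn(prefix) is a hypothetical callback returning a 1-D array of
    next-token probabilities given the list of tokens decoded so far.
    """
    # Each hypothesis is (cumulative log-probability, token list).
    beams = [(0.0, [start_token])]
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            probs = step_fn(seq)
            # Expand this hypothesis with its beam_size most likely tokens.
            for tok in np.argsort(probs)[::-1][:beam_size]:
                new_score = score + float(np.log(probs[tok] + 1e-12))
                candidates.append((new_score, seq + [int(tok)]))
        # Keep only the best beam_size hypotheses overall.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_size]:
            if seq[-1] == end_token:
                finished.append((score, seq))
            else:
                beams.append((score, seq))
        if not beams:
            break
    # Hypotheses still open at max_len compete with the finished ones.
    finished.extend(beams)
    return max(finished, key=lambda c: c[0])[1]


# Toy model over 4 tokens: 0 = start, 3 = end.
# Row i holds P(next token | last token = i). These numbers are made up.
TOY_PROBS = np.array([
    [0.0, 0.6, 0.4, 0.0],
    [0.0, 0.1, 0.5, 0.4],
    [0.0, 0.04, 0.06, 0.9],
    [0.0, 0.0, 0.0, 1.0],
])


def toy_step(prefix):
    return TOY_PROBS[prefix[-1]]


best = beam_search(toy_step, start_token=0, end_token=3, beam_size=2)
print(best)
```

On this toy table, greedy decoding would follow 0 → 1 → 2 → 3 (probability 0.6 × 0.5 × 0.9 = 0.27), while a beam of width 2 also explores 0 → 2 → 3 (0.4 × 0.9 = 0.36) and returns it as the more probable sequence.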


In [None]:
next(iter(train_dataset))[1]

<tf.Tensor: shape=(21,), dtype=int32, numpy=
array([ 1, 31, 11,  2,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0], dtype=int32)>

In [None]:
encoder_input_data[43].shape

(15, 32)

In [209]:
model_x.summary()

Model: "Seq2Seq_model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_96 (InputLayer)          [(None, None, 28)]   0           []                               
                                                                                                  
 input_97 (InputLayer)          [(None, None, 68)]   0           []                               
                                                                                                  
 rnn_47 (RNN)                   [(None, 64),         122880      ['input_96[0][0]']               
                                 [(None, 64),                                                     
                                 (None, 64)],                                                     
                                 [(None, 64),                                         

In [208]:
# Define sampling models
# Restore the model and construct the encoder and decoder.
import numpy as np
from tensorflow import keras

model_x = keras.models.load_model("s2s")

# Encoder: maps an input sequence to the final recurrent states.
encoder_inputs = model_x.input[0]
temp = model_x.layers[2].output  # encoder recurrent layer
encoder_outputs, state = temp[0], list(temp[1:])  # output first, then states
encoder_states = state
encoder_model = keras.Model(encoder_inputs, encoder_states)

# Decoder: runs the trained decoder layer starting from the given states.
decoder_inputs = model_x.input[1]
# The encoder state tensors double as the decoder model's state inputs.
decoder_states_inputs = state
decoder_lstm = model_x.layers[3]  # decoder recurrent layer
temp = decoder_lstm(decoder_inputs, initial_state=decoder_states_inputs)
decoder_outputs, state_dec = temp[0], list(temp[1:])
decoder_states = state_dec
decoder_dense = model_x.layers[4]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = keras.Model(
    [decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states
)

# Reverse-lookup token index to decode sequences back to
# something readable.
reverse_input_token = {i: char for char, i in input_token.items()}
reverse_target_token = {i: char for char, i in target_token.items()}


def decode_sequence(input_seq):
    """Greedily decode one input sequence (batch of size 1) into a string."""
    # Encode the input as state vectors.
    states_value = encoder_model.predict(input_seq)
    # predict() returns a bare array when the model has a single output.
    if not isinstance(states_value, list):
        states_value = [states_value]

    # Generate empty target sequence of length 1.
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    # Populate the first character of target sequence with the start character.
    target_seq[0, 0, target_token_index["\t"]] = 1.0

    # Sampling loop for a batch of sequences
    # (to simplify, here we assume a batch of size 1).
    stop_condition = False
    decoded_sentence = ""
    while not stop_condition:
        temp = decoder_model.predict([target_seq] + states_value)
        output_tokens, state = temp[0], temp[1:]

        # Sample a token (greedy: take the most probable character).
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        sampled_char = reverse_target_token[sampled_token_index]
        decoded_sentence += sampled_char

        # Exit condition: either hit max length
        # or find stop character.
        if sampled_char == "\n" or len(decoded_sentence) > max_decoder_seq_length:
            stop_condition = True

        # Update the target sequence (of length 1).
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_token_index] = 1.0

        # Update states
        states_value = state
    return decoded_sentence



You can now generate decoded sentences as follows:


In [207]:
for seq_index in range(20):
    # Take one sequence (part of the training set)
    # for trying out decoding.
    input_seq = encoder_input_data[seq_index : seq_index + 1]
    decoded_sentence = decode_sequence(input_seq)
    print("-")
    print("Input sentence:", input_texts[seq_index])
    print("Decoded sentence:", decoded_sentence)


-
Input sentence: an
Decoded sentence: अ

-
Input sentence: ankganit
Decoded sentence: अननतत

-
Input sentence: uncle
Decoded sentence: उननल

-
Input sentence: ankur
Decoded sentence: अनाा

-
Input sentence: ankuran
Decoded sentence: अन्काा

-
Input sentence: ankurit
Decoded sentence: अन्तता

-
Input sentence: aankush
Decoded sentence: आनाा

-
Input sentence: ankush
Decoded sentence: अनाक

-
Input sentence: ang
Decoded sentence: अं

-
Input sentence: anga
Decoded sentence: अं

-
Input sentence: agandh
Decoded sentence: आआाा

-
Input sentence: angad
Decoded sentence: अंा

-
Input sentence: angane
Decoded sentence: अनना

-
Input sentence: angbhang
Decoded sentence: अंिना

-
Input sentence: angarakshak
Decoded sentence: अं्कााााा

-
Input sentence: angrakshak
Decoded sentence: अं््काााा

-
Input sentence: angara
Decoded sentence: अंाा

-
Input sentence: angaare
Decoded sentence: अनाा

-
Input sentence: angare
Decoded sentence: अंा

-
Input sentence: angi
Decoded sentence: अं
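
To go beyond eyeballing samples like these, decoded strings can be scored against reference transliterations. The sketch below computes exact-match (whole-word) accuracy. The helper name `exact_match_accuracy` and the toy word lists are illustrative assumptions, and this section does not show whether the notebook's "Accuracy by Beam Search" figure is computed the same way.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference.

    Surrounding whitespace (such as a trailing end-of-sequence newline)
    is stripped from both sides before comparing.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    matches = sum(
        pred.strip() == ref.strip() for pred, ref in zip(predictions, references)
    )
    return matches / len(references)


# Hypothetical toy data: two of the four predictions match their reference.
preds = ["अं\n", "अनाा", "उननल", "आम"]
refs = ["अं", "अंकुर", "अंकल", "आम"]
score = exact_match_accuracy(preds, refs)
print(score)
```

Exact match is a strict metric: a prediction that is off by a single character counts as wrong, which is one reason a briefly trained model can show non-zero per-character accuracy while scoring 0.0 at the word level.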

