# **Encoder-decoder model for sequence to sequence translation from German to English**

## Import modules and packages
Import all the necessary modules and packages necessary to build, train, and evaluate the model. Also, set random seeds so that the program can be run for multiple seeds to check how variant the model's performance is.

In [17]:
#Add the Google Drive location of all files to system path
import os,sys
from google.colab import drive
drive.mount('/content/drive/')
GOOGLE_DRIVE_PATH_AFTER_MYDRIVE='Colab Notebooks/Sequence to sequence'
GOOGLE_DRIVE_PATH=os.path.join('drive','My Drive',GOOGLE_DRIVE_PATH_AFTER_MYDRIVE)
print(os.listdir(GOOGLE_DRIVE_PATH))
sys.path.append(GOOGLE_DRIVE_PATH)

Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).
['2 - Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.ipynb', '3 - Neural Machine Translation by Jointly Learning to Align and Translate.ipynb', 'requirements.txt', 'README.md', 'assets', 'legacy', '1 - Sequence to Sequence Learning with Neural Networks.ipynb']


In [18]:
!pip install datasets evaluate --upgrade
!python -m spacy download de_core_news_sm
!pip install -U torchtext==0.15.2
!pip install -r requirements.txt

#Download the tokeniers for Englisha and German
!python -m spacy download en_core_web_sm
!python -m spacy download de_core_news_sm

import collections,random,time
import numpy as np
import datasets
#import evaluate
import spacy
import tqdm
import torchtext
import torchtext.vocab as ttv
import torch
import torch.nn as nn
import torch.optim as optim
import torch.functional as F
from torch.utils.data import Dataset

#Set the environment time
os.environ['TZ/']='US/Central'
time.tzset()

#Set the seed
sd=random.randint(0,50)
sd=1234
random.seed(sd)
np.random.seed(sd)
torch.manual_seed(sd)
torch.cuda.manual_seed(sd)
torch.backends.cudnn.deterministic=True

#Check GPU availability
if torch.cuda.is_available():
  print('\n\nThe GPU available is a {}.'.format(torch.cuda.get_device_name(0)))
  print('\nProgram execution started on: {}'.format(time.ctime()))
else:print('To ensure GPU availability, try Edit -> Notebook Settings.')

Collecting de-core-news-sm==3.7.0
  Downloading https://github.com/explosion/spacy-models/releases/download/de_core_news_sm-3.7.0/de_core_news_sm-3.7.0-py3-none-any.whl (14.6 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m14.6/14.6 MB[0m [31m61.3 MB/s[0m eta [36m0:00:00[0m
[38;5;2m✔ Download and installation successful[0m
You can now load the package via spacy.load('de_core_news_sm')
[38;5;3m⚠ Restart to reload dependencies[0m
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.
Collecting torch==2.0.1 (from torchtext==0.15.2)
  Using cached torch-2.0.1-cp310-cp310-manylinux1_x86_64.whl.metadata (24 kB)
Collecting triton==2.0.0 (from torch==2.0.1->torchtext==0.15.2)
  Using cached triton-2.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.0 kB)
Using cached torch-2.0.1-cp310-cp310-ma

Collecting en-core-web-sm==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m12.8/12.8 MB[0m [31m68.8 MB/s[0m eta [36m0:00:00[0m
[38;5;2m✔ Download and installation successful[0m
You can now load the package via spacy.load('en_core_web_sm')
[38;5;3m⚠ Restart to reload dependencies[0m
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.
Collecting de-core-news-sm==3.7.0
  Using cached https://github.com/explosion/spacy-models/releases/download/de_core_news_sm-3.7.0/de_core_news_sm-3.7.0-py3-none-any.whl (14.6 MB)
[38;5;2m✔ Download and installation successful[0m
You can now load the package via spacy.load('de_core_news_sm')
[38;5;3m⚠ Restart to reload dependencies[0m


## The dataset
The dataset used is a subset of [Multi30k dataset](https://github.com/multi30k/dataset), which is from HuggingFace and hosted at [the page](https://huggingface.co/datasets/bentrevett/multi30k). The dataset contains about 30K parallel English and German sentences.


### Define the dataset
Download the 'train' and 'test' splits of the dataset, and inspect its characteristics.

In [19]:
#Download the 'train', the 'validation', and the 'test' splits of the dataset
trSet,valSet,tsSet=datasets.load_dataset('bentrevett/multi30k',split=['train','validation','test'])
print('The respective sizes of the train, the validation, and the test datasets are {}, {}, and {} respectively.'.format(trSet.num_rows,valSet.num_rows,tsSet.num_rows))
print('A sample picked from the train dataset would look as shown below:\n\n{}'.format(trSet[0]))

The respective sizes of the train, the validation, and the test datasets are 29000, 1014, and 1000 respectively.
A sample picked from the train dataset would look as shown below:

{'en': 'Two young, White males are outside near many bushes.', 'de': 'Zwei junge weiße Männer sind im Freien in der Nähe vieler Büsche.'}


### Tokenization
Convert the datasets from collections of strings to collections of tokens so that it is possible for any machine learning model to process them after the tokens are transformed into numbers.

In [20]:
#The function that tokenizes the datasets
def tokenize(dataset,enTokenizer,deTokenizer,maxLn,lower,sos,eos):
  enTokens=[token.text for token in enTokenizer.tokenizer(dataset['en'])][:maxLn]
  deTokens=[token.text for token in deTokenizer.tokenizer(dataset['de'])][:maxLn]
  if lower:                              #convert all uppercase tokens to lower if 'lower' is set
    enTokens=[token.lower() for token in enTokens]
    deTokens=[token.lower() for token in deTokens]
  enTokens=[sos]+enTokens+[eos]
  deTokens=[sos]+deTokens+[eos]
  return {'enTokens':enTokens,'deTokens':deTokens}

#Define the tokenizer and perform tokenization of the datasets
maxLn=1000                               #maximum length of each token
lower=True                               #whether to make all tokens lowercase or not
sosToken,eosToken='<sos>','<eos>'        #start and end of string tokens
enTokenizer=spacy.load('en_core_web_sm') #use 'tokenizer' in enTokenizer to tokenize
deTokenizer=spacy.load('de_core_news_sm')#use 'tokenizer' in deTokenizer to tokenize
print('Tokenizing the train dataset...')
trSet=trSet.map(tokenize,fn_kwargs={'enTokenizer':enTokenizer,'deTokenizer':deTokenizer,'maxLn':maxLn,'lower':lower,'sos':sosToken,'eos':eosToken})
print('Tokenizing the validation dataset...')
valSet=valSet.map(tokenize,fn_kwargs={'enTokenizer':enTokenizer,'deTokenizer':deTokenizer,'maxLn':maxLn,'lower':lower,'sos':sosToken,'eos':eosToken})
print('Tokenizing the test dataset...')
tsSet=tsSet.map(tokenize,fn_kwargs={'enTokenizer':enTokenizer,'deTokenizer':deTokenizer,'maxLn':maxLn,'lower':lower,'sos':sosToken,'eos':eosToken})
print('\nThe contents of the tokenized datasets are as follows:\n{}'.format(trSet))
print('\nA sample from the tokenized train dataset would look as shown below:\n1. English:\n{}'.format(trSet[0]['en']))
print('2. English tokens:\n{}'.format(trSet[0]['enTokens']))
print('3. German:\n{}'.format(trSet[0]['de']))
print('4. German tokens:\n{}'.format(trSet[0]['deTokens']))



Tokenizing the train dataset...
Tokenizing the validation dataset...


Map:   0%|          | 0/1014 [00:00<?, ? examples/s]

Tokenizing the test dataset...


Map:   0%|          | 0/1000 [00:00<?, ? examples/s]


The contents of the tokenized datasets are as follows:
Dataset({
    features: ['en', 'de', 'enTokens', 'deTokens'],
    num_rows: 29000
})

A sample from the tokenized train dataset would look as shown below:
1. English:
Two young, White males are outside near many bushes.
2. English tokens:
['<sos>', 'two', 'young', ',', 'white', 'males', 'are', 'outside', 'near', 'many', 'bushes', '.', '<eos>']
3. German:
Zwei junge weiße Männer sind im Freien in der Nähe vieler Büsche.
4. German tokens:
['<sos>', 'zwei', 'junge', 'weiße', 'männer', 'sind', 'im', 'freien', 'in', 'der', 'nähe', 'vieler', 'büsche', '.', '<eos>']


### Vocabulary
Convert the tokens into one-hot representations by defining a minimum threshold frequency of occurence for each token and the special tokens consisting of the unknown, the padding tokens, the SOS, and the EOS tokens.

In [21]:
#Define the vocabularies for English and German
threshold=2                                   #minimum frequency of occurence for a token to be assigned a one-hot representation
spclTokens=['<unk>','<pad>',sosToken,eosToken]#to represent tokens with frequency<threshold and for padding to make all sentences have the same lengths
enVocabulary=ttv.build_vocab_from_iterator(trSet['enTokens'],min_freq=threshold,specials=spclTokens)                       #English vocabulary
deVocabulary=ttv.build_vocab_from_iterator(trSet['deTokens'],min_freq=threshold,specials=spclTokens)                       #German vocabulary
print('The number of tokens in the vocabularies are as follows:\n1. English: {}\n2. German: {}'.format(len(enVocabulary),len(deVocabulary)))

#The special tokens in the vocabularies
enIdxUnk,enIdxPad,enIdxSos,enIdxEos=enVocabulary['<unk>'],enVocabulary['<pad>'],enVocabulary['<sos>'],enVocabulary['<eos>']#indices of special tokens in English vocabulary
deIdxUnk,deIdxPad,deIdxSos,deIdxEos=deVocabulary['<unk>'],deVocabulary['<pad>'],deVocabulary['<sos>'],deVocabulary['<eos>']#indices of special tokens in German vocabulary
assert enIdxUnk==deIdxUnk                     #in both vocabularies, the special tokens will be at indices 0,1,2, and 3 in the order they appear in 'spclTokens'. So, assert that
assert enIdxPad==deIdxPad
assert enIdxSos==deIdxSos
assert enIdxEos==deIdxEos
enVocabulary.set_default_index(enIdxUnk)      #set the default token for the English vocabulary
deVocabulary.set_default_index(deIdxUnk)      #set the default token for the German vocabulary

The number of tokens in the vocabularies are as follows:
1. English: 5893
2. German: 7853


### Numerize the vocabulary
Convert the tokens into a numerical format so that they can be processed by neural networks.

In [22]:
#The function that numerizes the tokenized dataset
def numerize(dataset,enVocab,deVocab):
  enIndices=enVocab.lookup_indices(dataset['enTokens'])
  deIndices=deVocab.lookup_indices(dataset['deTokens'])
  return {'enIndices':enIndices,'deIndices':deIndices}

#Numerize the datasets
print('Numerizing the training dataset...')
trSet=trSet.map(numerize,fn_kwargs={'enVocab':enVocabulary,'deVocab':deVocabulary})
print('Numerizing the validation dataset...')
valSet=valSet.map(numerize,fn_kwargs={'enVocab':enVocabulary,'deVocab':deVocabulary})
print('Numerizing the testing dataset...')
tsSet=tsSet.map(numerize,fn_kwargs={'enVocab':enVocabulary,'deVocab':deVocabulary})
print('The contents of the numerized datasets are as follows:\n{}.'.format(trSet))
print('\nA sample from the numerized dataset would look as shown below:\n1. English:\n{}'.format(trSet[0]['en']))
print('2. English tokens:\n{}'.format(trSet[0]['enTokens']))
print('3. English indices:\n{}'.format(trSet[0]['enIndices']))
print('4. German:\n{}'.format(trSet[0]['de']))
print('5. German tokens:\n{}'.format(trSet[0]['deTokens']))
print('6. German indices:\n{}'.format(trSet[0]['deIndices']))

#Convert the datasets to tensors after retaining only 'label' and 'indices'
trSet=trSet.with_format(type='torch',columns=['enIndices','deIndices'],output_all_columns=True)
valSet=valSet.with_format(type='torch',columns=['enIndices','deIndices'],output_all_columns=True)
tsSet=tsSet.with_format(type='torch',columns=['enIndices','deIndices'],output_all_columns=True)
print('\nThe numerized datasets\' contents after their conversion to tensors are as follows:\n{}.'.format(trSet))
print('\nA sample from the converted dataset would look as shown below:\n1. English:\n{}'.format(trSet[0]['en']))
print('2. English tokens:\n{}'.format(trSet[0]['enTokens']))
print('3. English indices:\n{}'.format(trSet[0]['enIndices']))
print('4. German:\n{}'.format(trSet[0]['de']))
print('5. German tokens:\n{}'.format(trSet[0]['deTokens']))
print('6. German indices:\n{}'.format(trSet[0]['deIndices']))

Numerizing the training dataset...
Numerizing the validation dataset...


Map:   0%|          | 0/1014 [00:00<?, ? examples/s]

Numerizing the testing dataset...


Map:   0%|          | 0/1000 [00:00<?, ? examples/s]

The contents of the numerized datasets are as follows:
Dataset({
    features: ['en', 'de', 'enTokens', 'deTokens', 'enIndices', 'deIndices'],
    num_rows: 29000
}).

A sample from the numerized dataset would look as shown below:
1. English:
Two young, White males are outside near many bushes.
2. English tokens:
['<sos>', 'two', 'young', ',', 'white', 'males', 'are', 'outside', 'near', 'many', 'bushes', '.', '<eos>']
3. English indices:
[2, 16, 24, 15, 25, 778, 17, 57, 80, 202, 1312, 5, 3]
4. German:
Zwei junge weiße Männer sind im Freien in der Nähe vieler Büsche.
5. German tokens:
['<sos>', 'zwei', 'junge', 'weiße', 'männer', 'sind', 'im', 'freien', 'in', 'der', 'nähe', 'vieler', 'büsche', '.', '<eos>']
6. German indices:
[2, 18, 26, 253, 30, 84, 20, 88, 7, 15, 110, 7647, 3171, 4, 3]

The numerized datasets' contents after their conversion to tensors are as follows:
Dataset({
    features: ['en', 'de', 'enTokens', 'deTokens', 'enIndices', 'deIndices'],
    num_rows: 29000
}).

A sam

### Data loader
The data loaders are iterated upon to return a batch of data. Each batch will be a dictionary containing numericalized English and German sentences represented as tensors.

In [23]:
#The class prepares batches of data that can be iterated over by the data loader.
#The batch is prepared by collating a group of data samples in the function
#'collate'. The function also pads the shorter sequences in the batch with '<pad>'
#to make their lengths equal to that of the batch's longest sequence. This
#ensures that the neural network model's I/P dimension remains fixed.
class languageDataset(Dataset):
  def __init__(self,dataset,batchSize,enIdxPad,deIdxPad,shuffle=False) -> None:
    self.dataset,self.batchSize=dataset,batchSize
    self.enIdxPad,self.deIdxPad=enIdxPad,deIdxPad
    self.shuffle=shuffle

  def collate(self,batch):
    batchEnIndices=[sample['enIndices'] for sample in batch]
    batchDeIndices=[sample['deIndices'] for sample in batch]
    batchEnIndices=nn.utils.rnn.pad_sequence(batchEnIndices,padding_value=self.enIdxPad)
    batchDeIndices=nn.utils.rnn.pad_sequence(batchDeIndices,padding_value=self.deIdxPad)
    print('Batch sizes for English and German are {} and {} respectively.'.format(batchEnIndices.shape,batchDeIndices.shape))
    return {'enIndices':batchEnIndices,'deIndices':batchDeIndices}

  def loadData(self):
    dataLoader=torch.utils.data.DataLoader(dataset=self.dataset,batch_size=self.batchSize,collate_fn=self.collate,shuffle=self.shuffle)
    return dataLoader

def dataIterator(set,dataset,params):
  if set=='train':obj=languageDataset(dataset,params['batchSize'],params['enIdxPad'],params['deIdxPad'],shuffle=True)
  elif set=='validation':obj=languageDataset(dataset,params['batchSize'],params['enIdxPad'],params['deIdxPad'],shuffle=True)
  elif set=='test':obj=languageDataset(dataset,params['batchSize'],params['enIdxPad'],params['deIdxPad'],shuffle=True)
  return obj.loadData()

## Sequence-to-sequence LSTM model for language translation
The language translation model will use an LSTM-based encoder-decoder network arcitecture.

### The encoder
The model's encoder consists of an embedding layer, which accepts index representations of the source language, and one or more layers of LSTM units that generate context vectors to feedi the model's decoder.

In [24]:
class Encoder(nn.Module):
  #Initialize the Encoder class
  def __init__(self,ipDim,embDim,hiddenDim,L,rate,bi=False,batchFirst=False) -> None:
    super(Encoder,self).__init__()
    self.hiddenDim,self.L=hiddenDim,L
    self.embed=nn.Embedding(ipDim,embDim)#I/P dimension is the length of the source language's vocabulary
    self.lstm=nn.LSTM(embDim,hiddenDim,L,bidirectional=bi,dropout=rate,batch_first=batchFirst)
    self.dropout=nn.Dropout(rate)

  #Forward propagates a batch of indices (denoted by 'x') of size
  #[seqLength,batchSize] through the network
  def forward(self,x):
    embedding=self.dropout(self.embed(x))#size is [seqLength,batchSize,embDim]
    #Output sizes of the LSTM layer are:
    #1. 'op': ht of all last layer units i.e., h^L_t; t=0,1,2,...,T. Its size is [seqLength,batchSize,hiddenDim*nDir]; 'nDir'=2 for bidirectional and 1 otherwise
    #2. 'h': hT of all layers stacked together i.e., ht of the last units of all layers. Its size is [L*nDir,batchSize,hiddenDim]; 'nDir'=2 for bidirectional and 1 otherwise
    #3. 'c': cT of all layers stacked together i.e., ct of the last units of all layers. Its size is [L*nDir,batchSize,hiddenDim]; 'nDir'=2 for bidirectional and 1 otherwise
    op,(h,c)=self.lstm(embedding)        #h and c together form z, the context vector feed for the decoder. If initial 'h' and 'c' not given, nn.LSTM assumes them to be all zeros
    return h,c

### The decoder
The model's decoder consists of an embedding layer that accepts index representations of the true/model-predicted target language (based on teacher-forcing learning) and the same number of LSTM layers as the encoder, which predict a target-language token for the current time step with the help of a dense layer mapping.

In [25]:
class Decoder(nn.Module):
  #Initializes the Decoder class
  def __init__(self,opDim,embDim,hiddenDim,L,rate,bi=False,batchFirst=False) -> None:
    super(Decoder,self).__init__()
    self.opDim,self.hiddenDim,self.L=opDim,hiddenDim,L
    self.embed=nn.Embedding(opDim,embDim) #O/P dimension is the length of the target language's vocabulary
    self.lstm=nn.LSTM(embDim,hiddenDim,L,bidirectional=bi,dropout=rate,batch_first=batchFirst)
    self.linear=nn.Linear(hiddenDim,opDim)#O/P dimension is the length of the target language's vocabulary
    self.dropout=nn.Dropout(rate)

  #Forward propagates a batch of context vectors (denoted by 'x') through the
  #network
  def forward(self,x,h,c):
    #Input sizes of the decoder are:
    #1. 'x': True/predicted (based on teacher-forcing) word for time t-1. Its size is [batchSize], and it covers only the immediate previous time step since decoder translates one time step at a time.
    #2. 'h': hT of all layers of the encoder stacked together i.e., ht of the last units of all layers in the encoder. Its size is [L*nDir,batchSize,hiddenDim]; 'nDir'=2 for bidirectional and 1 otherwise
    #3. 'c': cT of all layers of the encoder stacked together i.e., ct of the last units of all layers in the encoder. Its size is [L*nDir,batchSize,hiddenDim]; 'nDir'=2 for bidirectional and 1 otherwise
    #'h' and 'c' form the context vector I/P for the decoder
    x=x.unsqueeze(0)                      #x's size changed to [1,batchSize] to explicitly represent the sequence length dimension for processing (seqLength=1 as x covers only the immediate previous time step).
    embedding=self.dropout(self.embed(x)) #size is [1,batchSize,embDim]
    #Output sizes of the LSTM layer are:
    #1. 'op': ht of all last layer units i.e., h^L_t; t=0,1,2,...,T. Its size is [seqLength,batchSize,hiddenDim*nDir]; 'nDir' is always 1 for the decoder
    #2. 'h': hT of all layers stacked together i.e., ht of the last units of all layers. Its size is [L*nDir,batchSize,hiddenDim]; 'nDir' is always 1 for the decoder
    #3. 'c': cT of all layers stacked together i.e., ct of the last units of all layers. Its size is [L*nDir,batchSize,hiddenDim]; 'nDir' is always 1 for the decoder
    op,(h,c)=self.lstm(embedding,(h,c))   #initial 'h' and 'c' are not assumed to be zeros since they are passed explicity to nn.LSTM
    #As the decoder predicts the target sentence one word (current time-step) at
    #a time, 'op' is stripped of the sequence length dimension before it is fed
    #to the dense layer.
    yHat=self.linear(op.squeeze(0))        #size is [batchSize,opDim]
    return yHat,h,c

### The sequence-to-sequence architecture
The architecture's

In [26]:
class Seq2seq(nn.Module):
  #Initializes the sequence-to-sequence architecture class
  def __init__(self,encoder,decoder,device) -> None:
    super(Seq2seq,self).__init__()
    self.encoder=encoder
    self.decoder=decoder
    self.device=device
    assert(encoder.hiddenDim==decoder.hiddenDim),'The hidden dimensions of the encoder and the decoder must be equal'
    assert(encoder.L==decoder.L),'The number of hidden layers of the encoder and the decoder must be equal'

  #Forward propagates a batch of sentences in the source language, 'source',
  #through the network
  def forward(self,source,target,ratio=0):
    #I/P sizes of the network are:
    #1. 'source': A batch of sentences in the source language. Its size is [souSeqLength,batchSize].
    #2. 'target': The batch of true sentences corresponding to 'source' in the target language. Its size is [tarSeqLength,batchSize]
    targetLn,batchSize=target.shape[0],target.shape[1]
    targetVocabSize=self.decoder.opDim                #the target language's vocabulary size
    #Pass 'source' through the encoder
    h,c=self.encoder(source)
    #Pass the context vector and the teacher-forced target language word for the
    #immediate previous time-step through the decoder. Note that the decoder
    #predicts words one time step at a time, requiring the use of a loop
    ops=torch.zeros(targetLn,batchSize,targetVocabSize).to(self.device)#stores decoder outputs
    ip=target[0,:]                                    #decoder I/P for the first time-step for all sequences in the batch is <sos>
    for t in range(1,targetLn):
      #O/P sizes of the decoder have been explained previously
      op,h,c=self.decoder(ip,h,c)
      ops[t]=op                                       #store 'op' of size [batchSize,opDim] in 'ops' along the 'targetLn' dimension
      tfFlag=random.random()<ratio                    #flag for teacher-forcing learning
      if tfFlag:ip=target[t]                          #teacher-forced I/P is true immediate previous word if flag is set, and predicted immediate previous word otherwise
      else:ip=op.argmax(1)                            #token with the highest probability score is the prediction for the current time-step. It also becomes teacher-forced I/P for the next time-step is flag is not set
    return ops

## Model training and evaluation
The functions that train, validate, and evaluate the LSTM model are defined below.

In [27]:
#This function trains and validates the model
def train(enVocabulary,deVocabulary,trDataLoader,valDataLoader,params):
  #Define the model and initialize its weights
  encoder=Encoder(params['ipDim'],params['encEmbDim'],params['hiddenDim'],params['numLayers'],params['encDORate'])
  decoder=Decoder(params['opDim'],params['decEmbDim'],params['hiddenDim'],params['numLayers'],params['decDORate'])
  model=Seq2seq(encoder,decoder,params['device']).to(params['device'])
  model.apply(initWeights)
  #Define the cost-function and the optimizer
  optimizer=optim.Adam(model.parameters(),lr=params['alpha'])
  costFun=nn.CrossEntropyLoss(ignore_index=params['enIdxPad']).to(params['device'])
  #Start training the model
  metrics=collections.defaultdict(list)
  bestValLoss=float('inf')
  model.to(params['device'])                                  #move the model to the GPU
  model.train()                                               #configure the model for training
  for epoch in range(params['epochs']):
    model.train()                                             #configure the model for training
    trLosses=[]
    #Loop over each training batch
    for batch in tqdm.tqdm(trDataLoader,desc='Model training in progress...'):
      source=batch['deIndices'].to(params['device'])
      target=batch['enIndices'].to(params['device'])
      #Forward propagation
      ops=model(source,target,params['teacherForcingRatio'])  #size is [tarSeqLength,batchSize,opDim]
      ops=ops[1:].view(-1,params['opDim'])                    #strip off <sos> from predicted target before computing loss. New shape is [tarSeqLength-1,batchSize,opDim]
      target=target[1:].view(-1)                              #change size of 'target' from [tarSeqLength,batchSize] to [(tarSeqLength-1)*batchSize]
      #Backpropagation of loss
      loss=costFun(ops,target)
      optimizer.zero_grad()                                   #initialize all gradients to zero
      loss.backward()
      torch.nn.utils.clip_grad_norm_(model.parameters(),params['clip'])
      optimizer.step()
      #Update the losses
      trLosses.append(loss.item())
    metrics['train_losses'].append(np.mean(trLosses))
    #Validate the model after each epoch
    model.eval()                                              #configure the model for evaluation
    valLosses=[]
    with torch.no_grad():                                     #backpropagation not required
      for batch in tqdm.tqdm(valDataLoader,desc='Model validation in progress...'):
        source=batch['deIndices'].to(params['device'])
        target=batch['enIndices'].to(params['device'])
        #Forward propagation
        ops=model(source,target)                              #teacher-forcing is 0 for validation and size is [tarSeqLength,batchSize,opDim]
        ops=ops[1:].view(-1,params['opDim'])
        target=target[1:].view(-1)
        #Validation loss
        loss=costFun(ops,target)
        valLosses.append(loss.item())
    metrics['valid_losses'].append(np.mean(valLosses))
    if np.mean(valLosses)<bestValLoss:
      bestValLoss=np.mean(valLosses)
      torch.save(model.state_dict(),'seq2seq.pt')
    print(f'Epoch: {epoch}')
    print(f'Training loss: {np.mean(trLosses):.3f}')
    print(f'Validation loss: {np.mean(valLosses):.3f}\n\n')
  return model,costFun

#This function initializes the network's weights from a uniform distribution
def initWeights(model):
  for layer,weights in model.named_parameters():
    nn.init.uniform_(weights.data,-8e-2,8e-2)

#This function evaluates the model
def evaluate(model,enVocabulary,deVocabulary,tsDataLoader,costFun,params):
  model.eval()                                                #configure the model for evaluation
  tsLosses=[]
  with torch.no_grad():                                       #backpropagation not required
    for batch in tqdm.tqdm(tsDataLoader,desc='Model evaluation in progress...'):
      source=batch['deIndices'].to(params['device'])
      target=batch['enIndices'].to(params['device'])
      #Forward propagation
      ops=model(source,target)                              #teacher-forcing is 0 for validation and size is [tarSeqLength,batchSize,opDim]
      ops=ops[1:].view(-1,params['opDim'])
      target=target[1:].view(-1)
      #Evaluation loss
      loss=costFun(ops,target)
      tsLosses.append(loss.item())
  return np.mean(tsLosses)

#This function translates sentences in one langugage to another andcalculates
#the BLEU loss
def translate(model,dataset,sourceTokenizer,targetTokenizer,sourceVocab,
              targetVocab,sos,eos,lower,device,maxOPLn=25):
  model.eval()                                                #configure the model for evaluation
  translations=[]
  #Start translating the sentences in the dataset
  with torch.no_grad():                                       #backpropagation not required
    for sentence in tqdm.tqdm(dataset):
      sentence=sentence['de']
      if isinstance(sentence,str):
        tokens=[token.text for token in targetTokenizer.tokenizer(sentence)]
      else:tokens=[]
      if lower:tokens=[token.lower() for token in tokens]
      tokens=[sos]+tokens+[eos]
      indices=targetVocab.lookup_indices(tokens)
      indices=torch.LongTensor(indices).unsqueeze(-1).to(device)
      h,c=model.encoder(indices)
      ips=sourceVocab.lookup_indices([sos])
      for _ in range(maxOPLn):
        op,h,c=model.decoder(torch.LongTensor([ips[-1]]).to(device),h,c)
        yHat=op.argmax(-1).item()
        ips.append(yHat)
        if yHat==sourceVocab[eos]:break
      tokens=sourceVocab.lookup_tokens(ips)
    translations.append(tokens)
  #Form the translations from tokens
  translationsHat=[''.join(translation[1:-1]) for translation in translations]
  translationsTrue=[[sentence['en']] for sentence in dataset]
  return translationsHat,translationsTrue

#This function returns the tokenizer function to claculate the BLEU loss
def getTokenizer(tokenizer,lower):
  def tokenizerFun(sentence):
    tokens=[token.text for token in tokenizer.tokenizer(sentence)]
    if lower():tokens=[token.lower() for token in tokens]
    return tokens
  return tokenizerFun

## Main function

In [28]:
if __name__=='__main__':
  device=torch.device('cuda' if torch.cuda.is_available() else 'cpu')
  params={'device':device,
          'epochs':10,
          'alpha':1e-3,
          'batchSize':128,
          'clip':1,
          'teacherForcingRatio':5e-1,#percentage time for which immediate previous groud-truth is used by decoder instead of immediate previous prediction
          'ipDim':len(deVocabulary),
          'opDim':len(enVocabulary),
          'encEmbDim':256,
          'decEmbDim':256,
          'hiddenDim':512,
          'numLayers':2,
          'encDORate':5e-1,
          'decDORate':5e-1,
          'enIdxUnk':enIdxUnk,
          'enIdxPad':enIdxPad,
          'deIdxUnk':deIdxUnk,
          'deIdxPad':deIdxPad}
#Data iterators
trIterator=dataIterator('train',trSet,params)
valIterator=dataIterator('validation',valSet,params)
tsIterator=dataIterator('test',tsSet,params)
#Train the sequence-to-sequence model for language translation
model,costFun=train(enVocabulary,deVocabulary,trIterator,valIterator,params)
print('Model training is complete, and the model has {} learnable parameters.\n\n'.format(sum(weight.numel() for weight in model.parameters() if weight.requires_grad==True)))
#Evaluate the sequence-to-sequence model
model.load_state_dict(torch.load('seq2seq.pt'))
tsLoss=evaluate(model,enVocabulary,deVocabulary,tsIterator,costFun,params)
print(f'Testing loss: {tsLoss:.3f}\n\n')
#Evaluate the sequence-to-sequence model's langugage translation performance
#translations=[translate(model,sentence['de'],enTokenizer,deTokenizer,
#                        enVocabulary,deVocabulary,sosToken,eosToken,lower,
#                        params['device']) for sentence in tqdm.tqdm(tsSet)]
bleu=evaluate.load('bleu')
transHat,trans=translate(model,tsSet,enTokenizer,deTokenizer,enVocabulary,
                       deVocabulary,sosToken,eosToken,lower,params['device'])
tokenizerFun=getTokenizer(enTokenizer,lower)
evaluation=bleu.compute(predictions=transHat,references=trans,tokenizer=tokenizerFun)

Model training in progress...:   0%|          | 1/227 [00:00<00:45,  4.95it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:   1%|▏         | 3/227 [00:00<00:38,  5.84it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:   2%|▏         | 5/227 [00:00<00:37,  5.88it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:   3%|▎         | 7/227 [00:01<00:35,  6.20it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:   4%|▍         | 9/227 [00:01<00:35,  6.22it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:   5%|▍         | 11/227 [00:01<00:32,  6.65it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:   6%|▌         | 13/227 [00:02<00:33,  6.31it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:   7%|▋         | 15/227 [00:02<00:32,  6.60it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:   7%|▋         | 17/227 [00:02<00:34,  6.04it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:   8%|▊         | 19/227 [00:03<00:33,  6.21it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:   9%|▉         | 21/227 [00:03<00:31,  6.49it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  10%|█         | 23/227 [00:03<00:29,  6.95it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  11%|█         | 25/227 [00:03<00:32,  6.13it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  12%|█▏        | 27/227 [00:04<00:31,  6.31it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  13%|█▎        | 29/227 [00:04<00:30,  6.48it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  14%|█▎        | 31/227 [00:04<00:31,  6.17it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  15%|█▍        | 33/227 [00:05<00:31,  6.14it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  15%|█▌        | 35/227 [00:05<00:32,  5.84it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  16%|█▋        | 37/227 [00:05<00:33,  5.75it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  17%|█▋        | 39/227 [00:06<00:34,  5.39it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  18%|█▊        | 41/227 [00:06<00:30,  6.16it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  19%|█▉        | 43/227 [00:06<00:31,  5.93it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  20%|█▉        | 45/227 [00:07<00:31,  5.81it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  21%|██        | 47/227 [00:07<00:27,  6.54it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  22%|██▏       | 49/227 [00:07<00:27,  6.52it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  22%|██▏       | 51/227 [00:08<00:28,  6.26it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  23%|██▎       | 53/227 [00:08<00:29,  5.80it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  24%|██▍       | 55/227 [00:08<00:28,  6.02it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  25%|██▌       | 57/227 [00:09<00:26,  6.46it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  26%|██▌       | 59/227 [00:09<00:25,  6.55it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  27%|██▋       | 61/227 [00:09<00:25,  6.44it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  28%|██▊       | 63/227 [00:10<00:26,  6.30it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  29%|██▊       | 65/227 [00:10<00:24,  6.69it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  30%|██▉       | 67/227 [00:10<00:26,  6.03it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  30%|███       | 69/227 [00:11<00:26,  5.90it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  31%|███▏      | 71/227 [00:11<00:25,  6.04it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  32%|███▏      | 73/227 [00:11<00:27,  5.70it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  33%|███▎      | 75/227 [00:12<00:24,  6.33it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  34%|███▍      | 77/227 [00:12<00:25,  5.87it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  35%|███▍      | 79/227 [00:12<00:24,  6.15it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  36%|███▌      | 81/227 [00:13<00:25,  5.82it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  37%|███▋      | 83/227 [00:13<00:25,  5.59it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  37%|███▋      | 84/227 [00:13<00:24,  5.85it/s]

Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  38%|███▊      | 86/227 [00:14<00:24,  5.84it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  39%|███▉      | 88/227 [00:14<00:23,  5.90it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  40%|███▉      | 90/227 [00:14<00:21,  6.28it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  41%|████      | 92/227 [00:15<00:22,  6.11it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  41%|████▏     | 94/227 [00:15<00:21,  6.17it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  42%|████▏     | 96/227 [00:15<00:21,  6.22it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  43%|████▎     | 98/227 [00:15<00:20,  6.30it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  44%|████▍     | 100/227 [00:16<00:21,  5.91it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  45%|████▍     | 102/227 [00:16<00:20,  6.18it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  46%|████▌     | 104/227 [00:16<00:19,  6.37it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  47%|████▋     | 106/227 [00:17<00:19,  6.26it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([41, 128]) respectively.


Model training in progress...:  48%|████▊     | 108/227 [00:17<00:21,  5.63it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  48%|████▊     | 110/227 [00:17<00:19,  5.97it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([22, 128]) respectively.


Model training in progress...:  49%|████▉     | 112/227 [00:18<00:18,  6.30it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  50%|█████     | 114/227 [00:18<00:17,  6.60it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  51%|█████     | 116/227 [00:18<00:16,  6.54it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  52%|█████▏    | 118/227 [00:19<00:17,  6.16it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  53%|█████▎    | 120/227 [00:19<00:18,  5.67it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  54%|█████▎    | 122/227 [00:19<00:16,  6.19it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  55%|█████▍    | 124/227 [00:20<00:18,  5.68it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  56%|█████▌    | 126/227 [00:20<00:17,  5.74it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  56%|█████▌    | 127/227 [00:20<00:16,  6.18it/s]

Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  57%|█████▋    | 129/227 [00:21<00:16,  5.79it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  58%|█████▊    | 131/227 [00:21<00:15,  6.33it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  59%|█████▊    | 133/227 [00:21<00:16,  5.77it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  59%|█████▉    | 135/227 [00:22<00:17,  5.41it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  60%|█████▉    | 136/227 [00:22<00:16,  5.61it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  61%|██████    | 138/227 [00:22<00:15,  5.78it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  62%|██████▏   | 140/227 [00:23<00:15,  5.75it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  63%|██████▎   | 142/227 [00:23<00:14,  6.05it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  63%|██████▎   | 144/227 [00:23<00:13,  6.17it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  64%|██████▍   | 146/227 [00:23<00:12,  6.59it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  65%|██████▌   | 148/227 [00:24<00:11,  6.70it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  66%|██████▌   | 150/227 [00:24<00:11,  6.63it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  67%|██████▋   | 152/227 [00:24<00:12,  6.19it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  68%|██████▊   | 154/227 [00:25<00:12,  6.02it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  69%|██████▊   | 156/227 [00:25<00:11,  6.20it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  70%|██████▉   | 158/227 [00:25<00:10,  6.47it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  70%|███████   | 160/227 [00:26<00:10,  6.66it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  71%|███████▏  | 162/227 [00:26<00:10,  6.40it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  72%|███████▏  | 164/227 [00:26<00:10,  6.14it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  73%|███████▎  | 166/227 [00:27<00:09,  6.29it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([43, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  74%|███████▍  | 168/227 [00:27<00:09,  6.03it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  75%|███████▍  | 170/227 [00:27<00:09,  6.27it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  76%|███████▌  | 172/227 [00:28<00:08,  6.20it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  77%|███████▋  | 174/227 [00:28<00:08,  6.23it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  78%|███████▊  | 176/227 [00:28<00:08,  6.16it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  78%|███████▊  | 178/227 [00:29<00:07,  6.44it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  79%|███████▉  | 180/227 [00:29<00:07,  6.44it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  80%|████████  | 182/227 [00:29<00:07,  6.43it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  81%|████████  | 184/227 [00:30<00:06,  6.45it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([46, 128]) respectively.


Model training in progress...:  82%|████████▏ | 186/227 [00:30<00:06,  6.21it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  83%|████████▎ | 188/227 [00:30<00:06,  6.34it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  84%|████████▎ | 190/227 [00:30<00:05,  6.69it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  85%|████████▍ | 192/227 [00:31<00:05,  6.37it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  85%|████████▌ | 194/227 [00:31<00:05,  6.24it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  86%|████████▌ | 195/227 [00:31<00:05,  5.99it/s]

Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  87%|████████▋ | 197/227 [00:32<00:05,  5.71it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  88%|████████▊ | 199/227 [00:32<00:04,  6.19it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  89%|████████▊ | 201/227 [00:32<00:04,  6.46it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  89%|████████▉ | 203/227 [00:33<00:03,  6.06it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  90%|█████████ | 205/227 [00:33<00:03,  6.22it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  91%|█████████ | 207/227 [00:33<00:03,  6.22it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  92%|█████████▏| 209/227 [00:34<00:02,  6.21it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  93%|█████████▎| 211/227 [00:34<00:02,  5.73it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  94%|█████████▍| 213/227 [00:34<00:02,  5.82it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  95%|█████████▍| 215/227 [00:35<00:02,  5.65it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  96%|█████████▌| 217/227 [00:35<00:01,  5.82it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  96%|█████████▋| 219/227 [00:35<00:01,  6.23it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  97%|█████████▋| 221/227 [00:36<00:01,  5.93it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  98%|█████████▊| 222/227 [00:36<00:00,  6.19it/s]

Batch sizes for English and German are torch.Size([39, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  99%|█████████▊| 224/227 [00:36<00:00,  5.73it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...: 100%|█████████▉| 226/227 [00:37<00:00,  5.63it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 72]) and torch.Size([28, 72]) respectively.


Model training in progress...: 100%|██████████| 227/227 [00:37<00:00,  6.10it/s]
Model validation in progress...:  25%|██▌       | 2/8 [00:00<00:00, 14.57it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.


Model validation in progress...:  50%|█████     | 4/8 [00:00<00:00, 13.76it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.


Model validation in progress...: 100%|██████████| 8/8 [00:00<00:00, 14.11it/s]


Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 118]) and torch.Size([28, 118]) respectively.
Epoch: 0
Training loss: 5.043
Validation loss: 5.047




Model training in progress...:   0%|          | 1/227 [00:00<00:38,  5.85it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:   1%|▏         | 3/227 [00:00<00:34,  6.43it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:   2%|▏         | 5/227 [00:00<00:37,  5.92it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:   3%|▎         | 7/227 [00:01<00:36,  6.07it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:   4%|▍         | 9/227 [00:01<00:33,  6.48it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:   5%|▍         | 11/227 [00:01<00:32,  6.58it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([41, 128]) respectively.


Model training in progress...:   6%|▌         | 13/227 [00:02<00:36,  5.87it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([43, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:   7%|▋         | 15/227 [00:02<00:38,  5.45it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:   7%|▋         | 17/227 [00:02<00:39,  5.38it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:   8%|▊         | 19/227 [00:03<00:35,  5.85it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:   9%|▉         | 21/227 [00:03<00:32,  6.31it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([39, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  10%|█         | 23/227 [00:03<00:36,  5.66it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  11%|█         | 25/227 [00:04<00:33,  6.01it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  12%|█▏        | 27/227 [00:04<00:34,  5.76it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  13%|█▎        | 29/227 [00:04<00:32,  6.01it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  14%|█▎        | 31/227 [00:05<00:31,  6.16it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  15%|█▍        | 33/227 [00:05<00:30,  6.29it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  15%|█▌        | 35/227 [00:05<00:28,  6.65it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  16%|█▋        | 37/227 [00:06<00:31,  6.06it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  17%|█▋        | 39/227 [00:06<00:31,  6.06it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  18%|█▊        | 41/227 [00:06<00:28,  6.53it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  19%|█▉        | 43/227 [00:07<00:30,  5.99it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  20%|█▉        | 45/227 [00:07<00:29,  6.21it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  21%|██        | 47/227 [00:07<00:29,  6.11it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  22%|██▏       | 49/227 [00:08<00:29,  6.01it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  22%|██▏       | 51/227 [00:08<00:27,  6.35it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  23%|██▎       | 53/227 [00:08<00:25,  6.70it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  24%|██▍       | 55/227 [00:09<00:27,  6.15it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  25%|██▌       | 57/227 [00:09<00:29,  5.77it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  26%|██▌       | 59/227 [00:09<00:27,  6.03it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  27%|██▋       | 61/227 [00:10<00:27,  6.10it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  28%|██▊       | 63/227 [00:10<00:31,  5.27it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([46, 128]) respectively.


Model training in progress...:  28%|██▊       | 64/227 [00:10<00:29,  5.55it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  29%|██▉       | 66/227 [00:10<00:26,  6.04it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  30%|██▉       | 68/227 [00:11<00:26,  5.91it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  31%|███       | 70/227 [00:11<00:26,  5.82it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  32%|███▏      | 72/227 [00:11<00:24,  6.42it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  33%|███▎      | 74/227 [00:12<00:24,  6.14it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  33%|███▎      | 76/227 [00:12<00:24,  6.15it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  34%|███▍      | 78/227 [00:12<00:23,  6.41it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  35%|███▌      | 80/227 [00:13<00:26,  5.52it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  36%|███▌      | 82/227 [00:13<00:23,  6.14it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  37%|███▋      | 84/227 [00:13<00:24,  5.92it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  38%|███▊      | 86/227 [00:14<00:25,  5.60it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  39%|███▉      | 88/227 [00:14<00:22,  6.16it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  40%|███▉      | 90/227 [00:14<00:22,  5.98it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  41%|████      | 92/227 [00:15<00:22,  5.94it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  41%|████▏     | 94/227 [00:15<00:22,  5.99it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  42%|████▏     | 96/227 [00:15<00:20,  6.35it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  43%|████▎     | 98/227 [00:16<00:19,  6.60it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  44%|████▍     | 100/227 [00:16<00:20,  6.27it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  45%|████▍     | 102/227 [00:16<00:19,  6.45it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  46%|████▌     | 104/227 [00:17<00:19,  6.37it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  47%|████▋     | 106/227 [00:17<00:19,  6.26it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  48%|████▊     | 108/227 [00:17<00:18,  6.37it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  48%|████▊     | 110/227 [00:18<00:18,  6.24it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  49%|████▉     | 112/227 [00:18<00:20,  5.51it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  50%|█████     | 114/227 [00:18<00:18,  5.97it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  51%|█████     | 116/227 [00:19<00:18,  6.15it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  52%|█████▏    | 118/227 [00:19<00:18,  5.82it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  53%|█████▎    | 120/227 [00:19<00:16,  6.40it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  54%|█████▎    | 122/227 [00:20<00:16,  6.56it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  54%|█████▍    | 123/227 [00:20<00:16,  6.38it/s]

Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  55%|█████▌    | 125/227 [00:20<00:18,  5.44it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  56%|█████▌    | 127/227 [00:21<00:17,  5.76it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  57%|█████▋    | 129/227 [00:21<00:16,  5.81it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  58%|█████▊    | 131/227 [00:21<00:16,  5.89it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  59%|█████▊    | 133/227 [00:22<00:15,  6.20it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  59%|█████▉    | 135/227 [00:22<00:15,  5.98it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  60%|██████    | 137/227 [00:22<00:15,  5.93it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  61%|██████    | 139/227 [00:23<00:14,  6.08it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  62%|██████▏   | 141/227 [00:23<00:14,  6.07it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  63%|██████▎   | 143/227 [00:23<00:14,  6.00it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  64%|██████▍   | 145/227 [00:24<00:13,  6.10it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  65%|██████▍   | 147/227 [00:24<00:13,  5.95it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  66%|██████▌   | 149/227 [00:24<00:13,  5.81it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  67%|██████▋   | 151/227 [00:25<00:12,  6.16it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  67%|██████▋   | 153/227 [00:25<00:12,  5.91it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  68%|██████▊   | 155/227 [00:25<00:11,  6.50it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  69%|██████▉   | 157/227 [00:25<00:10,  6.51it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  70%|███████   | 159/227 [00:26<00:10,  6.49it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  71%|███████   | 161/227 [00:26<00:10,  6.33it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  72%|███████▏  | 163/227 [00:26<00:10,  5.95it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  73%|███████▎  | 165/227 [00:27<00:09,  6.29it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  74%|███████▎  | 167/227 [00:27<00:09,  6.04it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  74%|███████▍  | 169/227 [00:27<00:08,  6.45it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  75%|███████▌  | 171/227 [00:28<00:09,  6.13it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  76%|███████▌  | 173/227 [00:28<00:08,  6.37it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  77%|███████▋  | 175/227 [00:28<00:08,  6.35it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  78%|███████▊  | 177/227 [00:29<00:08,  6.19it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  79%|███████▉  | 179/227 [00:29<00:07,  6.01it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  80%|███████▉  | 181/227 [00:29<00:06,  6.61it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  81%|████████  | 183/227 [00:30<00:06,  6.37it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  81%|████████▏ | 185/227 [00:30<00:07,  5.82it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  82%|████████▏ | 186/227 [00:30<00:07,  5.81it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  83%|████████▎ | 188/227 [00:31<00:06,  5.93it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  84%|████████▎ | 190/227 [00:31<00:06,  5.45it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  84%|████████▍ | 191/227 [00:31<00:06,  5.89it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  85%|████████▌ | 193/227 [00:31<00:05,  5.76it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  86%|████████▌ | 195/227 [00:32<00:05,  6.16it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  87%|████████▋ | 197/227 [00:32<00:04,  6.42it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  88%|████████▊ | 199/227 [00:32<00:04,  5.67it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  89%|████████▊ | 201/227 [00:33<00:04,  5.90it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  89%|████████▉ | 203/227 [00:33<00:03,  6.17it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  90%|█████████ | 205/227 [00:33<00:03,  6.06it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  91%|█████████ | 207/227 [00:34<00:03,  5.73it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  92%|█████████▏| 209/227 [00:34<00:03,  5.83it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  93%|█████████▎| 211/227 [00:34<00:02,  5.84it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  94%|█████████▍| 213/227 [00:35<00:02,  5.58it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  94%|█████████▍| 214/227 [00:35<00:02,  5.64it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  95%|█████████▌| 216/227 [00:35<00:01,  5.59it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  96%|█████████▌| 218/227 [00:36<00:01,  5.54it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  97%|█████████▋| 220/227 [00:36<00:01,  5.83it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  98%|█████████▊| 222/227 [00:36<00:00,  5.65it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  99%|█████████▊| 224/227 [00:37<00:00,  5.55it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...: 100%|█████████▉| 226/227 [00:37<00:00,  5.94it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 72]) and torch.Size([30, 72]) respectively.


Model training in progress...: 100%|██████████| 227/227 [00:37<00:00,  6.02it/s]
Model validation in progress...:  25%|██▌       | 2/8 [00:00<00:00, 12.65it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([35, 128]) respectively.


Model validation in progress...:  50%|█████     | 4/8 [00:00<00:00, 13.03it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.


Model validation in progress...: 100%|██████████| 8/8 [00:00<00:00, 13.61it/s]


Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 118]) and torch.Size([31, 118]) respectively.
Epoch: 1
Training loss: 4.460
Validation loss: 4.838




Model training in progress...:   0%|          | 1/227 [00:00<00:38,  5.80it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:   1%|▏         | 3/227 [00:00<00:35,  6.28it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:   2%|▏         | 5/227 [00:00<00:33,  6.61it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:   3%|▎         | 7/227 [00:01<00:33,  6.63it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:   4%|▍         | 9/227 [00:01<00:37,  5.78it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:   5%|▍         | 11/227 [00:01<00:35,  6.07it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:   6%|▌         | 13/227 [00:02<00:34,  6.21it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:   7%|▋         | 15/227 [00:02<00:32,  6.52it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:   7%|▋         | 17/227 [00:02<00:32,  6.45it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([46, 128]) respectively.


Model training in progress...:   8%|▊         | 19/227 [00:03<00:36,  5.75it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([43, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:   9%|▉         | 21/227 [00:03<00:37,  5.55it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  10%|█         | 23/227 [00:03<00:34,  5.99it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  11%|█         | 25/227 [00:04<00:31,  6.50it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  12%|█▏        | 27/227 [00:04<00:30,  6.57it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  13%|█▎        | 29/227 [00:04<00:32,  6.03it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  14%|█▎        | 31/227 [00:05<00:31,  6.20it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  15%|█▍        | 33/227 [00:05<00:29,  6.59it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  15%|█▌        | 35/227 [00:05<00:29,  6.55it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  16%|█▋        | 37/227 [00:05<00:30,  6.15it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([22, 128]) respectively.


Model training in progress...:  17%|█▋        | 39/227 [00:06<00:30,  6.07it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  18%|█▊        | 41/227 [00:06<00:31,  5.90it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  19%|█▉        | 43/227 [00:07<00:31,  5.80it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  20%|█▉        | 45/227 [00:07<00:28,  6.40it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  21%|██        | 47/227 [00:07<00:32,  5.62it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  21%|██        | 48/227 [00:07<00:32,  5.56it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  22%|██▏       | 50/227 [00:08<00:28,  6.27it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  23%|██▎       | 52/227 [00:08<00:27,  6.36it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  24%|██▍       | 54/227 [00:08<00:27,  6.31it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  24%|██▍       | 55/227 [00:08<00:26,  6.46it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  25%|██▌       | 57/227 [00:09<00:30,  5.61it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  26%|██▌       | 58/227 [00:09<00:33,  5.10it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  26%|██▋       | 60/227 [00:09<00:31,  5.24it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  27%|██▋       | 62/227 [00:10<00:30,  5.46it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  28%|██▊       | 64/227 [00:10<00:28,  5.62it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  29%|██▉       | 66/227 [00:11<00:29,  5.41it/s]

Batch sizes for English and German are torch.Size([39, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  30%|██▉       | 67/227 [00:11<00:28,  5.64it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  30%|███       | 69/227 [00:11<00:25,  6.17it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  31%|███▏      | 71/227 [00:11<00:26,  5.82it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  32%|███▏      | 73/227 [00:12<00:24,  6.29it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  33%|███▎      | 75/227 [00:12<00:23,  6.54it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  34%|███▍      | 77/227 [00:12<00:22,  6.59it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  35%|███▍      | 79/227 [00:13<00:22,  6.65it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  36%|███▌      | 81/227 [00:13<00:23,  6.23it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  37%|███▋      | 83/227 [00:13<00:24,  5.88it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  37%|███▋      | 85/227 [00:14<00:22,  6.34it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  38%|███▊      | 87/227 [00:14<00:22,  6.21it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  39%|███▉      | 89/227 [00:14<00:23,  5.99it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  40%|████      | 91/227 [00:15<00:24,  5.47it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  41%|████      | 93/227 [00:15<00:23,  5.77it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  42%|████▏     | 95/227 [00:15<00:23,  5.52it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  43%|████▎     | 97/227 [00:16<00:22,  5.86it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  44%|████▎     | 99/227 [00:16<00:21,  5.95it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  44%|████▍     | 101/227 [00:16<00:20,  6.04it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  45%|████▌     | 103/227 [00:17<00:20,  6.15it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  46%|████▋     | 105/227 [00:17<00:19,  6.14it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  47%|████▋     | 107/227 [00:17<00:19,  6.04it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  48%|████▊     | 109/227 [00:18<00:19,  5.98it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  49%|████▉     | 111/227 [00:18<00:19,  5.88it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([23, 128]) respectively.


Model training in progress...:  50%|████▉     | 113/227 [00:18<00:17,  6.47it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  51%|█████     | 115/227 [00:19<00:17,  6.31it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  52%|█████▏    | 117/227 [00:19<00:16,  6.71it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  52%|█████▏    | 119/227 [00:19<00:17,  6.28it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  53%|█████▎    | 121/227 [00:20<00:16,  6.39it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  54%|█████▍    | 123/227 [00:20<00:16,  6.25it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  55%|█████▌    | 125/227 [00:20<00:17,  5.92it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  56%|█████▌    | 127/227 [00:21<00:17,  5.71it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  57%|█████▋    | 129/227 [00:21<00:17,  5.71it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  58%|█████▊    | 131/227 [00:21<00:15,  6.24it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  59%|█████▊    | 133/227 [00:22<00:14,  6.35it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  59%|█████▉    | 135/227 [00:22<00:15,  5.87it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  60%|██████    | 137/227 [00:22<00:14,  6.15it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  61%|██████    | 139/227 [00:23<00:14,  6.06it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  62%|██████▏   | 141/227 [00:23<00:14,  6.13it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  63%|██████▎   | 143/227 [00:23<00:14,  5.74it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  64%|██████▍   | 145/227 [00:24<00:14,  5.67it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  65%|██████▍   | 147/227 [00:24<00:12,  6.16it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  66%|██████▌   | 149/227 [00:24<00:12,  6.05it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  67%|██████▋   | 151/227 [00:25<00:13,  5.75it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  67%|██████▋   | 152/227 [00:25<00:12,  5.99it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  68%|██████▊   | 154/227 [00:25<00:12,  5.83it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  69%|██████▊   | 156/227 [00:25<00:11,  5.98it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  70%|██████▉   | 158/227 [00:26<00:11,  6.14it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  70%|███████   | 160/227 [00:26<00:10,  6.16it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  71%|███████▏  | 162/227 [00:26<00:11,  5.75it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  72%|███████▏  | 163/227 [00:27<00:10,  5.93it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  73%|███████▎  | 165/227 [00:27<00:10,  5.92it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  74%|███████▎  | 167/227 [00:27<00:10,  5.72it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([41, 128]) respectively.


Model training in progress...:  74%|███████▍  | 169/227 [00:28<00:10,  5.60it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  75%|███████▌  | 171/227 [00:28<00:09,  6.14it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  76%|███████▌  | 173/227 [00:28<00:08,  6.19it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  77%|███████▋  | 175/227 [00:29<00:08,  5.97it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  78%|███████▊  | 177/227 [00:29<00:08,  6.12it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  79%|███████▉  | 179/227 [00:29<00:07,  6.28it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  80%|███████▉  | 181/227 [00:30<00:07,  6.10it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  81%|████████  | 183/227 [00:30<00:07,  6.09it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  81%|████████▏ | 185/227 [00:30<00:06,  6.45it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  82%|████████▏ | 187/227 [00:31<00:06,  6.27it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  83%|████████▎ | 189/227 [00:31<00:05,  6.61it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  84%|████████▍ | 191/227 [00:31<00:05,  6.36it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  85%|████████▌ | 193/227 [00:32<00:05,  5.86it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  86%|████████▌ | 195/227 [00:32<00:05,  5.82it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  87%|████████▋ | 197/227 [00:32<00:04,  6.06it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  88%|████████▊ | 199/227 [00:33<00:04,  6.09it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  89%|████████▊ | 201/227 [00:33<00:04,  6.01it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  89%|████████▉ | 203/227 [00:33<00:03,  6.06it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  90%|█████████ | 205/227 [00:33<00:03,  6.20it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  91%|█████████ | 207/227 [00:34<00:02,  6.70it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  92%|█████████▏| 209/227 [00:34<00:02,  6.24it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  93%|█████████▎| 210/227 [00:34<00:02,  6.22it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  93%|█████████▎| 212/227 [00:35<00:02,  5.92it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  94%|█████████▍| 214/227 [00:35<00:02,  6.01it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  95%|█████████▌| 216/227 [00:35<00:01,  6.45it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  96%|█████████▌| 218/227 [00:36<00:01,  6.17it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  97%|█████████▋| 220/227 [00:36<00:01,  6.33it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  98%|█████████▊| 222/227 [00:36<00:00,  6.28it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  99%|█████████▊| 224/227 [00:37<00:00,  6.46it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...: 100%|█████████▉| 226/227 [00:37<00:00,  6.60it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([22, 72]) and torch.Size([28, 72]) respectively.


Model training in progress...: 100%|██████████| 227/227 [00:37<00:00,  6.07it/s]
Model validation in progress...:  25%|██▌       | 2/8 [00:00<00:00, 14.20it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.


Model validation in progress...:  50%|█████     | 4/8 [00:00<00:00, 13.86it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model validation in progress...: 100%|██████████| 8/8 [00:00<00:00, 13.81it/s]


Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 118]) and torch.Size([31, 118]) respectively.
Epoch: 2
Training loss: 4.166
Validation loss: 4.605




Model training in progress...:   0%|          | 0/227 [00:00<?, ?it/s]

Batch sizes for English and German are torch.Size([39, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:   1%|          | 2/227 [00:00<00:38,  5.78it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:   2%|▏         | 4/227 [00:00<00:40,  5.56it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:   3%|▎         | 6/227 [00:01<00:42,  5.24it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:   4%|▎         | 8/227 [00:01<00:36,  5.99it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:   4%|▍         | 10/227 [00:01<00:34,  6.26it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:   5%|▌         | 12/227 [00:02<00:32,  6.55it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:   6%|▌         | 14/227 [00:02<00:34,  6.26it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:   7%|▋         | 16/227 [00:02<00:35,  5.95it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:   8%|▊         | 18/227 [00:03<00:36,  5.73it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:   9%|▉         | 20/227 [00:03<00:35,  5.78it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  10%|▉         | 22/227 [00:03<00:33,  6.10it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  11%|█         | 24/227 [00:04<00:34,  5.96it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  11%|█▏        | 26/227 [00:04<00:33,  6.07it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  12%|█▏        | 28/227 [00:04<00:33,  5.95it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  13%|█▎        | 30/227 [00:05<00:31,  6.24it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  14%|█▍        | 32/227 [00:05<00:34,  5.74it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  15%|█▍        | 34/227 [00:05<00:33,  5.77it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  16%|█▌        | 36/227 [00:06<00:32,  5.83it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  17%|█▋        | 38/227 [00:06<00:30,  6.23it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  18%|█▊        | 40/227 [00:06<00:28,  6.51it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  19%|█▊        | 42/227 [00:06<00:28,  6.55it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  19%|█▉        | 44/227 [00:07<00:27,  6.69it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  20%|██        | 46/227 [00:07<00:28,  6.31it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  21%|██        | 48/227 [00:08<00:31,  5.65it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  22%|██▏       | 50/227 [00:08<00:30,  5.88it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  23%|██▎       | 52/227 [00:08<00:32,  5.41it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  24%|██▍       | 54/227 [00:09<00:31,  5.42it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  25%|██▍       | 56/227 [00:09<00:29,  5.88it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  26%|██▌       | 58/227 [00:09<00:27,  6.09it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  26%|██▋       | 60/227 [00:10<00:29,  5.68it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  27%|██▋       | 61/227 [00:10<00:28,  5.73it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  28%|██▊       | 63/227 [00:10<00:27,  5.95it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  29%|██▊       | 65/227 [00:10<00:26,  6.18it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  30%|██▉       | 67/227 [00:11<00:25,  6.34it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  30%|███       | 69/227 [00:11<00:24,  6.35it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  31%|███▏      | 71/227 [00:11<00:25,  6.18it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  32%|███▏      | 73/227 [00:12<00:24,  6.36it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  33%|███▎      | 75/227 [00:12<00:25,  6.04it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  34%|███▍      | 77/227 [00:12<00:24,  6.04it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([41, 128]) respectively.


Model training in progress...:  35%|███▍      | 79/227 [00:13<00:25,  5.73it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([43, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  36%|███▌      | 81/227 [00:13<00:25,  5.75it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  37%|███▋      | 83/227 [00:13<00:24,  5.87it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  37%|███▋      | 85/227 [00:14<00:22,  6.33it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  38%|███▊      | 87/227 [00:14<00:21,  6.62it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  39%|███▉      | 89/227 [00:14<00:24,  5.60it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  40%|████      | 91/227 [00:15<00:23,  5.90it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  41%|████      | 93/227 [00:15<00:21,  6.19it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  41%|████▏     | 94/227 [00:15<00:21,  6.12it/s]

Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  42%|████▏     | 96/227 [00:16<00:22,  5.90it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  43%|████▎     | 98/227 [00:16<00:23,  5.55it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  44%|████▍     | 100/227 [00:16<00:20,  6.29it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  44%|████▍     | 101/227 [00:16<00:19,  6.36it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  45%|████▌     | 103/227 [00:17<00:21,  5.71it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  46%|████▋     | 105/227 [00:17<00:19,  6.11it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  47%|████▋     | 107/227 [00:17<00:19,  6.05it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  48%|████▊     | 109/227 [00:18<00:21,  5.44it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  49%|████▉     | 111/227 [00:18<00:19,  5.96it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  50%|████▉     | 113/227 [00:18<00:17,  6.51it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  51%|█████     | 115/227 [00:19<00:19,  5.80it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  51%|█████     | 116/227 [00:19<00:19,  5.79it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  52%|█████▏    | 118/227 [00:19<00:17,  6.16it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  53%|█████▎    | 120/227 [00:20<00:17,  6.28it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  54%|█████▎    | 122/227 [00:20<00:16,  6.23it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  55%|█████▍    | 124/227 [00:20<00:15,  6.49it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  56%|█████▌    | 126/227 [00:21<00:15,  6.41it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  56%|█████▋    | 128/227 [00:21<00:16,  6.12it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  57%|█████▋    | 130/227 [00:21<00:16,  5.97it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  58%|█████▊    | 132/227 [00:22<00:16,  5.88it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  59%|█████▉    | 134/227 [00:22<00:14,  6.28it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  60%|█████▉    | 136/227 [00:22<00:15,  5.73it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  61%|██████    | 138/227 [00:23<00:14,  6.18it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  62%|██████▏   | 140/227 [00:23<00:14,  6.18it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  63%|██████▎   | 142/227 [00:23<00:13,  6.49it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  63%|██████▎   | 144/227 [00:23<00:13,  6.31it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  64%|██████▍   | 146/227 [00:24<00:13,  6.23it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  65%|██████▌   | 148/227 [00:24<00:12,  6.34it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  66%|██████▌   | 150/227 [00:24<00:12,  6.12it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  67%|██████▋   | 152/227 [00:25<00:11,  6.35it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  68%|██████▊   | 154/227 [00:25<00:11,  6.57it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  68%|██████▊   | 155/227 [00:25<00:10,  6.57it/s]

Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  69%|██████▉   | 157/227 [00:26<00:12,  5.70it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  70%|███████   | 159/227 [00:26<00:11,  5.93it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  71%|███████   | 161/227 [00:26<00:10,  6.22it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  72%|███████▏  | 163/227 [00:27<00:10,  5.99it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  73%|███████▎  | 165/227 [00:27<00:09,  6.55it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  74%|███████▎  | 167/227 [00:27<00:09,  6.07it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  74%|███████▍  | 169/227 [00:28<00:10,  5.71it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  75%|███████▌  | 171/227 [00:28<00:09,  5.86it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  76%|███████▌  | 173/227 [00:28<00:08,  6.16it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  77%|███████▋  | 175/227 [00:29<00:08,  6.22it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  78%|███████▊  | 177/227 [00:29<00:07,  6.37it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  79%|███████▉  | 179/227 [00:29<00:07,  6.05it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  80%|███████▉  | 181/227 [00:30<00:07,  5.95it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([46, 128]) respectively.


Model training in progress...:  81%|████████  | 183/227 [00:30<00:07,  5.51it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  81%|████████▏ | 185/227 [00:30<00:07,  5.57it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  82%|████████▏ | 187/227 [00:31<00:06,  5.74it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  83%|████████▎ | 189/227 [00:31<00:06,  6.23it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  84%|████████▍ | 191/227 [00:31<00:05,  6.01it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  85%|████████▌ | 193/227 [00:32<00:05,  5.68it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  86%|████████▌ | 195/227 [00:32<00:05,  5.84it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  87%|████████▋ | 197/227 [00:32<00:05,  5.90it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  88%|████████▊ | 199/227 [00:33<00:04,  6.20it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  89%|████████▊ | 201/227 [00:33<00:04,  6.31it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  89%|████████▉ | 203/227 [00:33<00:03,  6.28it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  90%|█████████ | 205/227 [00:34<00:03,  5.75it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  91%|█████████ | 207/227 [00:34<00:03,  5.49it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  92%|█████████▏| 209/227 [00:34<00:03,  5.58it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  93%|█████████▎| 211/227 [00:35<00:02,  6.07it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  94%|█████████▍| 213/227 [00:35<00:02,  6.33it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  95%|█████████▍| 215/227 [00:35<00:02,  5.84it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  96%|█████████▌| 217/227 [00:36<00:01,  5.71it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  96%|█████████▋| 219/227 [00:36<00:01,  5.88it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  97%|█████████▋| 221/227 [00:36<00:01,  5.95it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  98%|█████████▊| 223/227 [00:37<00:00,  6.04it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  99%|█████████▉| 225/227 [00:37<00:00,  6.25it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...: 100%|██████████| 227/227 [00:37<00:00,  6.00it/s]


Batch sizes for English and German are torch.Size([28, 72]) and torch.Size([27, 72]) respectively.


Model validation in progress...:   0%|          | 0/8 [00:00<?, ?it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model validation in progress...:  25%|██▌       | 2/8 [00:00<00:00, 14.24it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([35, 128]) respectively.


Model validation in progress...:  50%|█████     | 4/8 [00:00<00:00, 13.18it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([32, 128]) respectively.


Model validation in progress...:  75%|███████▌  | 6/8 [00:00<00:00, 13.80it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.


Model validation in progress...: 100%|██████████| 8/8 [00:00<00:00, 14.24it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 118]) and torch.Size([23, 118]) respectively.





Epoch: 3
Training loss: 3.954
Validation loss: 4.372




Model training in progress...:   0%|          | 0/227 [00:00<?, ?it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([46, 128]) respectively.


Model training in progress...:   1%|          | 2/227 [00:00<00:38,  5.81it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:   2%|▏         | 4/227 [00:00<00:36,  6.04it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:   3%|▎         | 6/227 [00:00<00:33,  6.55it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:   4%|▎         | 8/227 [00:01<00:35,  6.19it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:   4%|▍         | 10/227 [00:01<00:34,  6.34it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:   5%|▌         | 12/227 [00:01<00:32,  6.61it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:   6%|▌         | 14/227 [00:02<00:32,  6.58it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:   7%|▋         | 16/227 [00:02<00:36,  5.76it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:   8%|▊         | 18/227 [00:02<00:32,  6.35it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:   9%|▉         | 20/227 [00:03<00:33,  6.18it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  10%|▉         | 22/227 [00:03<00:33,  6.18it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  11%|█         | 24/227 [00:03<00:34,  5.96it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  11%|█▏        | 26/227 [00:04<00:30,  6.65it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  12%|█▏        | 28/227 [00:04<00:30,  6.46it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  13%|█▎        | 30/227 [00:04<00:33,  5.88it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  14%|█▎        | 31/227 [00:05<00:32,  6.04it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  15%|█▍        | 33/227 [00:05<00:30,  6.43it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([41, 128]) respectively.


Model training in progress...:  15%|█▌        | 35/227 [00:05<00:32,  5.94it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  16%|█▋        | 37/227 [00:05<00:29,  6.49it/s]

Batch sizes for English and German are torch.Size([23, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  17%|█▋        | 39/227 [00:06<00:28,  6.59it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  18%|█▊        | 40/227 [00:06<00:27,  6.73it/s]

Batch sizes for English and German are torch.Size([43, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  19%|█▊        | 42/227 [00:06<00:30,  6.02it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  19%|█▉        | 44/227 [00:07<00:28,  6.45it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  20%|██        | 46/227 [00:07<00:29,  6.13it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  21%|██        | 48/227 [00:07<00:29,  6.10it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  22%|██▏       | 50/227 [00:08<00:28,  6.25it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  23%|██▎       | 52/227 [00:08<00:28,  6.25it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  24%|██▍       | 54/227 [00:08<00:27,  6.34it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  25%|██▍       | 56/227 [00:09<00:27,  6.20it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  26%|██▌       | 58/227 [00:09<00:28,  5.87it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  26%|██▋       | 60/227 [00:09<00:27,  6.12it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  27%|██▋       | 62/227 [00:10<00:26,  6.26it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  28%|██▊       | 64/227 [00:10<00:29,  5.52it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  29%|██▉       | 66/227 [00:10<00:26,  5.98it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  30%|██▉       | 68/227 [00:11<00:26,  5.92it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  31%|███       | 70/227 [00:11<00:27,  5.75it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  32%|███▏      | 72/227 [00:11<00:26,  5.90it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  33%|███▎      | 74/227 [00:12<00:24,  6.23it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  33%|███▎      | 76/227 [00:12<00:24,  6.25it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  34%|███▍      | 78/227 [00:12<00:23,  6.32it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  35%|███▌      | 80/227 [00:13<00:24,  5.96it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  36%|███▌      | 82/227 [00:13<00:22,  6.33it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  37%|███▋      | 84/227 [00:13<00:23,  6.17it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  38%|███▊      | 86/227 [00:14<00:22,  6.27it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  39%|███▉      | 88/227 [00:14<00:24,  5.75it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  40%|███▉      | 90/227 [00:14<00:21,  6.49it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  41%|████      | 92/227 [00:14<00:21,  6.27it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  41%|████▏     | 94/227 [00:15<00:22,  5.91it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  42%|████▏     | 96/227 [00:15<00:21,  6.16it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  43%|████▎     | 98/227 [00:15<00:20,  6.35it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  44%|████▍     | 100/227 [00:16<00:21,  5.98it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  45%|████▍     | 102/227 [00:16<00:21,  5.93it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  46%|████▌     | 104/227 [00:16<00:19,  6.28it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  47%|████▋     | 106/227 [00:17<00:19,  6.22it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  48%|████▊     | 108/227 [00:17<00:19,  6.24it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  48%|████▊     | 110/227 [00:17<00:20,  5.71it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  49%|████▉     | 112/227 [00:18<00:19,  5.92it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  50%|█████     | 114/227 [00:18<00:19,  5.77it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  51%|█████     | 115/227 [00:18<00:18,  6.08it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  52%|█████▏    | 117/227 [00:19<00:17,  6.26it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  52%|█████▏    | 119/227 [00:19<00:16,  6.65it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  53%|█████▎    | 120/227 [00:19<00:16,  6.44it/s]

Batch sizes for English and German are torch.Size([39, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  54%|█████▎    | 122/227 [00:19<00:19,  5.49it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  54%|█████▍    | 123/227 [00:20<00:18,  5.66it/s]

Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  55%|█████▌    | 125/227 [00:20<00:19,  5.26it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  56%|█████▌    | 127/227 [00:20<00:17,  5.86it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  57%|█████▋    | 129/227 [00:21<00:16,  5.86it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  58%|█████▊    | 131/227 [00:21<00:15,  6.12it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  59%|█████▊    | 133/227 [00:21<00:15,  5.92it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  59%|█████▉    | 135/227 [00:22<00:15,  5.80it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  60%|██████    | 137/227 [00:22<00:14,  6.20it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  61%|██████    | 139/227 [00:22<00:14,  6.22it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  62%|██████▏   | 141/227 [00:23<00:13,  6.49it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  63%|██████▎   | 143/227 [00:23<00:13,  6.11it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  64%|██████▍   | 145/227 [00:23<00:13,  6.15it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  65%|██████▍   | 147/227 [00:24<00:13,  5.82it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  66%|██████▌   | 149/227 [00:24<00:13,  5.93it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  67%|██████▋   | 151/227 [00:24<00:11,  6.38it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  67%|██████▋   | 153/227 [00:25<00:12,  5.92it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  68%|██████▊   | 154/227 [00:25<00:12,  5.95it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  69%|██████▊   | 156/227 [00:25<00:11,  6.12it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  70%|██████▉   | 158/227 [00:25<00:11,  6.04it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  70%|███████   | 160/227 [00:26<00:11,  5.66it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  71%|███████▏  | 162/227 [00:26<00:10,  5.98it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  72%|███████▏  | 164/227 [00:26<00:09,  6.33it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  73%|███████▎  | 166/227 [00:27<00:09,  6.29it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  74%|███████▍  | 168/227 [00:27<00:09,  6.17it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  75%|███████▍  | 170/227 [00:27<00:09,  6.14it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  76%|███████▌  | 172/227 [00:28<00:09,  5.79it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  76%|███████▌  | 173/227 [00:28<00:09,  5.98it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  77%|███████▋  | 175/227 [00:28<00:09,  5.42it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  78%|███████▊  | 177/227 [00:29<00:08,  5.82it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  79%|███████▉  | 179/227 [00:29<00:07,  6.24it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  80%|███████▉  | 181/227 [00:29<00:08,  5.31it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  80%|████████  | 182/227 [00:30<00:08,  5.33it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  81%|████████  | 184/227 [00:30<00:07,  5.97it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  82%|████████▏ | 186/227 [00:30<00:07,  5.65it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  83%|████████▎ | 188/227 [00:31<00:06,  6.09it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  84%|████████▎ | 190/227 [00:31<00:06,  5.70it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  85%|████████▍ | 192/227 [00:31<00:05,  6.00it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  85%|████████▌ | 194/227 [00:32<00:05,  5.77it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  86%|████████▋ | 196/227 [00:32<00:05,  6.12it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  87%|████████▋ | 198/227 [00:32<00:04,  6.26it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  88%|████████▊ | 200/227 [00:33<00:04,  6.19it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([22, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  89%|████████▉ | 202/227 [00:33<00:04,  5.83it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  89%|████████▉ | 203/227 [00:33<00:03,  6.10it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  90%|████████▉ | 204/227 [00:33<00:04,  5.75it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  91%|█████████ | 206/227 [00:34<00:03,  5.47it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  92%|█████████▏| 208/227 [00:34<00:03,  5.49it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  93%|█████████▎| 210/227 [00:34<00:02,  5.74it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  93%|█████████▎| 212/227 [00:35<00:02,  6.24it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  94%|█████████▍| 214/227 [00:35<00:02,  6.20it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  95%|█████████▌| 216/227 [00:35<00:01,  6.33it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  96%|█████████▌| 218/227 [00:36<00:01,  5.94it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  97%|█████████▋| 220/227 [00:36<00:01,  6.30it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  98%|█████████▊| 222/227 [00:36<00:00,  6.17it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  99%|█████████▊| 224/227 [00:37<00:00,  5.96it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...: 100%|█████████▉| 226/227 [00:37<00:00,  6.04it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 72]) and torch.Size([32, 72]) respectively.


Model training in progress...: 100%|██████████| 227/227 [00:37<00:00,  6.05it/s]
Model validation in progress...:  25%|██▌       | 2/8 [00:00<00:00, 13.67it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model validation in progress...:  50%|█████     | 4/8 [00:00<00:00, 14.29it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model validation in progress...: 100%|██████████| 8/8 [00:00<00:00, 13.89it/s]


Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 118]) and torch.Size([35, 118]) respectively.
Epoch: 4
Training loss: 3.783
Validation loss: 4.283




Model training in progress...:   0%|          | 1/227 [00:00<00:40,  5.61it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:   1%|▏         | 3/227 [00:00<00:40,  5.50it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:   2%|▏         | 4/227 [00:00<00:42,  5.23it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:   2%|▏         | 5/227 [00:00<00:41,  5.29it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:   3%|▎         | 7/227 [00:01<00:38,  5.69it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:   4%|▍         | 9/227 [00:01<00:35,  6.09it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:   5%|▍         | 11/227 [00:01<00:35,  6.11it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:   6%|▌         | 13/227 [00:02<00:35,  6.04it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:   6%|▌         | 14/227 [00:02<00:34,  6.20it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:   7%|▋         | 16/227 [00:02<00:36,  5.82it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:   8%|▊         | 18/227 [00:03<00:35,  5.93it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:   9%|▉         | 20/227 [00:03<00:34,  5.96it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  10%|▉         | 22/227 [00:03<00:34,  5.96it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  11%|█         | 24/227 [00:04<00:33,  6.08it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  11%|█▏        | 26/227 [00:04<00:31,  6.37it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  12%|█▏        | 28/227 [00:04<00:31,  6.40it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  13%|█▎        | 30/227 [00:04<00:30,  6.43it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  14%|█▍        | 32/227 [00:05<00:31,  6.19it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  15%|█▍        | 34/227 [00:05<00:30,  6.23it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  16%|█▌        | 36/227 [00:05<00:29,  6.42it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  17%|█▋        | 38/227 [00:06<00:31,  6.00it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  17%|█▋        | 39/227 [00:06<00:30,  6.17it/s]

Batch sizes for English and German are torch.Size([43, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  18%|█▊        | 41/227 [00:06<00:33,  5.48it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  19%|█▉        | 43/227 [00:07<00:32,  5.73it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  20%|█▉        | 45/227 [00:07<00:31,  5.85it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  21%|██        | 47/227 [00:07<00:32,  5.62it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([46, 128]) respectively.


Model training in progress...:  21%|██        | 48/227 [00:08<00:31,  5.74it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([22, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  22%|██▏       | 50/227 [00:08<00:30,  5.90it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  23%|██▎       | 52/227 [00:08<00:30,  5.75it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  24%|██▍       | 54/227 [00:09<00:29,  5.93it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  25%|██▍       | 56/227 [00:09<00:27,  6.29it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  26%|██▌       | 58/227 [00:09<00:27,  6.25it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  26%|██▋       | 60/227 [00:10<00:27,  6.18it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  27%|██▋       | 62/227 [00:10<00:28,  5.78it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  28%|██▊       | 64/227 [00:10<00:27,  5.86it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  29%|██▉       | 66/227 [00:11<00:27,  5.76it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  30%|██▉       | 68/227 [00:11<00:28,  5.62it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  31%|███       | 70/227 [00:11<00:28,  5.58it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  32%|███▏      | 72/227 [00:12<00:25,  5.99it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  33%|███▎      | 74/227 [00:12<00:24,  6.28it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  33%|███▎      | 76/227 [00:12<00:24,  6.16it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([21, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  34%|███▍      | 78/227 [00:13<00:22,  6.52it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  35%|███▌      | 80/227 [00:13<00:23,  6.31it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  36%|███▌      | 82/227 [00:13<00:22,  6.38it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  37%|███▋      | 84/227 [00:14<00:25,  5.66it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  38%|███▊      | 86/227 [00:14<00:23,  5.93it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  39%|███▉      | 88/227 [00:14<00:24,  5.69it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  40%|███▉      | 90/227 [00:15<00:22,  5.96it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  41%|████      | 92/227 [00:15<00:24,  5.58it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  41%|████▏     | 94/227 [00:15<00:22,  5.90it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  42%|████▏     | 96/227 [00:16<00:21,  6.08it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  43%|████▎     | 98/227 [00:16<00:22,  5.84it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  44%|████▍     | 100/227 [00:16<00:23,  5.37it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  45%|████▍     | 102/227 [00:17<00:21,  5.81it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  46%|████▌     | 104/227 [00:17<00:20,  6.10it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  47%|████▋     | 106/227 [00:17<00:20,  6.00it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  48%|████▊     | 108/227 [00:18<00:19,  6.05it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  48%|████▊     | 110/227 [00:18<00:18,  6.36it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([39, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  49%|████▉     | 112/227 [00:18<00:19,  6.00it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  50%|█████     | 114/227 [00:19<00:18,  6.26it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  51%|█████     | 116/227 [00:19<00:17,  6.46it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  52%|█████▏    | 118/227 [00:19<00:17,  6.39it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  53%|█████▎    | 120/227 [00:20<00:15,  6.69it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  54%|█████▎    | 122/227 [00:20<00:15,  6.68it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  55%|█████▍    | 124/227 [00:20<00:16,  6.33it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  56%|█████▌    | 126/227 [00:21<00:17,  5.70it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  56%|█████▋    | 128/227 [00:21<00:17,  5.74it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  57%|█████▋    | 129/227 [00:21<00:17,  5.69it/s]

Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  58%|█████▊    | 131/227 [00:21<00:16,  5.67it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  59%|█████▊    | 133/227 [00:22<00:15,  6.01it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  59%|█████▉    | 135/227 [00:22<00:14,  6.52it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  60%|██████    | 137/227 [00:22<00:13,  6.44it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  61%|██████    | 139/227 [00:23<00:13,  6.57it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  62%|██████▏   | 141/227 [00:23<00:13,  6.15it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  63%|██████▎   | 143/227 [00:23<00:14,  5.94it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  64%|██████▍   | 145/227 [00:24<00:14,  5.55it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  64%|██████▍   | 146/227 [00:24<00:13,  6.04it/s]

Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([41, 128]) respectively.


Model training in progress...:  65%|██████▌   | 148/227 [00:24<00:14,  5.60it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  66%|██████▌   | 150/227 [00:25<00:12,  5.97it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  67%|██████▋   | 152/227 [00:25<00:12,  5.95it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  68%|██████▊   | 154/227 [00:25<00:11,  6.21it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  69%|██████▊   | 156/227 [00:26<00:11,  6.08it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  70%|██████▉   | 158/227 [00:26<00:11,  6.14it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  70%|███████   | 160/227 [00:26<00:12,  5.50it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  71%|███████▏  | 162/227 [00:27<00:11,  5.80it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  72%|███████▏  | 164/227 [00:27<00:10,  5.82it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  73%|███████▎  | 166/227 [00:27<00:10,  5.82it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([23, 128]) respectively.


Model training in progress...:  74%|███████▍  | 168/227 [00:28<00:09,  6.52it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  75%|███████▍  | 170/227 [00:28<00:09,  6.18it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  76%|███████▌  | 172/227 [00:28<00:09,  5.84it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  77%|███████▋  | 174/227 [00:29<00:09,  5.65it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  78%|███████▊  | 176/227 [00:29<00:08,  6.17it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  78%|███████▊  | 178/227 [00:29<00:07,  6.35it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  79%|███████▉  | 180/227 [00:30<00:08,  5.73it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  80%|████████  | 182/227 [00:30<00:07,  5.69it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  81%|████████  | 184/227 [00:30<00:07,  5.81it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  82%|████████▏ | 186/227 [00:31<00:06,  6.41it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  83%|████████▎ | 188/227 [00:31<00:06,  6.28it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  84%|████████▎ | 190/227 [00:31<00:05,  6.67it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  85%|████████▍ | 192/227 [00:32<00:05,  6.62it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  85%|████████▌ | 194/227 [00:32<00:05,  6.40it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  86%|████████▋ | 196/227 [00:32<00:05,  5.64it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  87%|████████▋ | 198/227 [00:33<00:04,  6.36it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  88%|████████▊ | 200/227 [00:33<00:04,  5.76it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  89%|████████▊ | 201/227 [00:33<00:04,  5.90it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  89%|████████▉ | 203/227 [00:33<00:03,  6.16it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  90%|█████████ | 205/227 [00:34<00:03,  6.58it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  91%|█████████ | 207/227 [00:34<00:03,  6.02it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  92%|█████████▏| 209/227 [00:34<00:03,  5.88it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  93%|█████████▎| 211/227 [00:35<00:02,  6.62it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  94%|█████████▍| 213/227 [00:35<00:02,  6.56it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  95%|█████████▍| 215/227 [00:35<00:01,  6.18it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  96%|█████████▌| 217/227 [00:36<00:01,  5.78it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  96%|█████████▋| 219/227 [00:36<00:01,  5.89it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  97%|█████████▋| 221/227 [00:36<00:01,  5.88it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  98%|█████████▊| 223/227 [00:37<00:00,  6.29it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  99%|█████████▉| 225/227 [00:37<00:00,  5.86it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...: 100%|██████████| 227/227 [00:37<00:00,  6.01it/s]


Batch sizes for English and German are torch.Size([24, 72]) and torch.Size([26, 72]) respectively.


Model validation in progress...:   0%|          | 0/8 [00:00<?, ?it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([23, 128]) and torch.Size([24, 128]) respectively.


Model validation in progress...:  25%|██▌       | 2/8 [00:00<00:00, 15.37it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([34, 128]) respectively.


Model validation in progress...:  50%|█████     | 4/8 [00:00<00:00, 13.77it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.


Model validation in progress...:  75%|███████▌  | 6/8 [00:00<00:00, 13.41it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 118]) and torch.Size([35, 118]) respectively.


Model validation in progress...: 100%|██████████| 8/8 [00:00<00:00, 13.68it/s]


Epoch: 5
Training loss: 3.642
Validation loss: 4.202




Model training in progress...:   0%|          | 0/227 [00:00<?, ?it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:   1%|          | 2/227 [00:00<00:43,  5.22it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:   2%|▏         | 4/227 [00:00<00:39,  5.69it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:   3%|▎         | 6/227 [00:01<00:36,  5.99it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:   4%|▎         | 8/227 [00:01<00:34,  6.35it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:   4%|▍         | 10/227 [00:01<00:37,  5.81it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:   5%|▌         | 12/227 [00:02<00:36,  5.85it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:   6%|▌         | 14/227 [00:02<00:37,  5.62it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:   7%|▋         | 16/227 [00:02<00:37,  5.66it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:   8%|▊         | 18/227 [00:03<00:35,  5.89it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:   9%|▉         | 20/227 [00:03<00:35,  5.85it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  10%|▉         | 22/227 [00:03<00:34,  5.94it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  11%|█         | 24/227 [00:04<00:33,  6.09it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  11%|█▏        | 26/227 [00:04<00:32,  6.24it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  12%|█▏        | 28/227 [00:04<00:31,  6.35it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  13%|█▎        | 30/227 [00:05<00:30,  6.37it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  14%|█▍        | 32/227 [00:05<00:33,  5.81it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([46, 128]) respectively.


Model training in progress...:  15%|█▍        | 33/227 [00:05<00:35,  5.53it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  15%|█▌        | 35/227 [00:05<00:32,  6.00it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  16%|█▋        | 37/227 [00:06<00:31,  6.02it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  17%|█▋        | 39/227 [00:06<00:30,  6.15it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  18%|█▊        | 41/227 [00:06<00:31,  5.89it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  19%|█▊        | 42/227 [00:07<00:31,  5.81it/s]

Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  19%|█▉        | 44/227 [00:07<00:33,  5.53it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  20%|██        | 46/227 [00:07<00:30,  5.94it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  21%|██        | 48/227 [00:08<00:28,  6.20it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  22%|██▏       | 50/227 [00:08<00:30,  5.78it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  23%|██▎       | 52/227 [00:08<00:28,  6.09it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  24%|██▍       | 54/227 [00:09<00:30,  5.73it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  24%|██▍       | 55/227 [00:09<00:29,  5.89it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  25%|██▌       | 57/227 [00:09<00:27,  6.24it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  26%|██▌       | 59/227 [00:09<00:28,  5.95it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  27%|██▋       | 61/227 [00:10<00:27,  5.95it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  28%|██▊       | 63/227 [00:10<00:26,  6.11it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  29%|██▊       | 65/227 [00:10<00:25,  6.37it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  30%|██▉       | 67/227 [00:11<00:25,  6.19it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  30%|███       | 69/227 [00:11<00:24,  6.34it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  31%|███▏      | 71/227 [00:11<00:24,  6.49it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  32%|███▏      | 73/227 [00:12<00:26,  5.78it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  33%|███▎      | 75/227 [00:12<00:26,  5.76it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  34%|███▍      | 77/227 [00:12<00:24,  6.04it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  35%|███▍      | 79/227 [00:13<00:23,  6.30it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  36%|███▌      | 81/227 [00:13<00:24,  5.85it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  37%|███▋      | 83/227 [00:13<00:23,  6.15it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  37%|███▋      | 85/227 [00:14<00:23,  6.14it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  38%|███▊      | 87/227 [00:14<00:24,  5.79it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  39%|███▉      | 89/227 [00:14<00:23,  5.77it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  40%|████      | 91/227 [00:15<00:23,  5.78it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  41%|████      | 93/227 [00:15<00:22,  5.97it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  42%|████▏     | 95/227 [00:15<00:22,  5.97it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  43%|████▎     | 97/227 [00:16<00:22,  5.90it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  44%|████▎     | 99/227 [00:16<00:22,  5.78it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  44%|████▍     | 101/227 [00:17<00:20,  6.01it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  45%|████▌     | 103/227 [00:17<00:21,  5.87it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  46%|████▋     | 105/227 [00:17<00:19,  6.14it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  47%|████▋     | 107/227 [00:17<00:19,  6.27it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  48%|████▊     | 109/227 [00:18<00:18,  6.31it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  49%|████▉     | 111/227 [00:18<00:20,  5.78it/s]

Batch sizes for English and German are torch.Size([39, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  49%|████▉     | 112/227 [00:18<00:18,  6.07it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  50%|█████     | 114/227 [00:19<00:18,  6.00it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  51%|█████     | 116/227 [00:19<00:16,  6.54it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  52%|█████▏    | 118/227 [00:19<00:16,  6.44it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  53%|█████▎    | 120/227 [00:20<00:17,  6.25it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  54%|█████▎    | 122/227 [00:20<00:18,  5.58it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([43, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  55%|█████▍    | 124/227 [00:20<00:18,  5.52it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  56%|█████▌    | 126/227 [00:21<00:16,  6.00it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  56%|█████▋    | 128/227 [00:21<00:16,  6.01it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  57%|█████▋    | 130/227 [00:21<00:15,  6.08it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  58%|█████▊    | 132/227 [00:22<00:14,  6.42it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  59%|█████▊    | 133/227 [00:22<00:15,  6.20it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  59%|█████▉    | 135/227 [00:22<00:15,  5.87it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  60%|██████    | 137/227 [00:22<00:14,  6.39it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  61%|██████    | 139/227 [00:23<00:13,  6.69it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  62%|██████▏   | 141/227 [00:23<00:13,  6.26it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  63%|██████▎   | 143/227 [00:23<00:14,  5.91it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  63%|██████▎   | 144/227 [00:24<00:14,  5.82it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  64%|██████▍   | 146/227 [00:24<00:12,  6.36it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  65%|██████▌   | 148/227 [00:24<00:12,  6.31it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  66%|██████▌   | 150/227 [00:24<00:12,  6.02it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  67%|██████▋   | 152/227 [00:25<00:12,  6.02it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  68%|██████▊   | 154/227 [00:25<00:12,  5.99it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  69%|██████▊   | 156/227 [00:26<00:12,  5.86it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  70%|██████▉   | 158/227 [00:26<00:11,  5.76it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  70%|███████   | 160/227 [00:26<00:11,  5.80it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  71%|███████▏  | 162/227 [00:27<00:11,  5.85it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  72%|███████▏  | 164/227 [00:27<00:10,  6.14it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  73%|███████▎  | 166/227 [00:27<00:10,  5.86it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  74%|███████▍  | 168/227 [00:28<00:10,  5.83it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  75%|███████▍  | 170/227 [00:28<00:09,  6.20it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  76%|███████▌  | 172/227 [00:28<00:09,  5.80it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  77%|███████▋  | 174/227 [00:29<00:08,  6.20it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([41, 128]) respectively.


Model training in progress...:  78%|███████▊  | 176/227 [00:29<00:09,  5.62it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  78%|███████▊  | 178/227 [00:29<00:08,  5.54it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  79%|███████▉  | 180/227 [00:30<00:07,  6.35it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  80%|████████  | 182/227 [00:30<00:07,  5.99it/s]

Batch sizes for English and German are torch.Size([23, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  81%|████████  | 184/227 [00:30<00:07,  6.02it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  82%|████████▏ | 186/227 [00:31<00:06,  6.25it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  83%|████████▎ | 188/227 [00:31<00:06,  6.50it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  84%|████████▎ | 190/227 [00:31<00:05,  6.70it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  85%|████████▍ | 192/227 [00:31<00:05,  6.28it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  85%|████████▌ | 194/227 [00:32<00:05,  6.13it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  86%|████████▋ | 196/227 [00:32<00:05,  5.92it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  87%|████████▋ | 198/227 [00:33<00:04,  5.80it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  88%|████████▊ | 200/227 [00:33<00:04,  6.08it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  89%|████████▉ | 202/227 [00:33<00:03,  6.35it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  90%|████████▉ | 204/227 [00:34<00:04,  5.71it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  90%|█████████ | 205/227 [00:34<00:03,  5.68it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  91%|█████████ | 207/227 [00:34<00:03,  5.66it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  92%|█████████▏| 209/227 [00:34<00:02,  6.09it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  93%|█████████▎| 211/227 [00:35<00:02,  6.21it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  94%|█████████▍| 213/227 [00:35<00:02,  6.08it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  95%|█████████▍| 215/227 [00:35<00:01,  6.18it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([23, 128]) respectively.


Model training in progress...:  96%|█████████▌| 217/227 [00:36<00:01,  6.46it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  96%|█████████▋| 219/227 [00:36<00:01,  6.58it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  97%|█████████▋| 221/227 [00:36<00:01,  5.97it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  98%|█████████▊| 223/227 [00:37<00:00,  5.89it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  99%|█████████▉| 225/227 [00:37<00:00,  6.08it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...: 100%|██████████| 227/227 [00:37<00:00,  6.01it/s]


Batch sizes for English and German are torch.Size([35, 72]) and torch.Size([33, 72]) respectively.


Model validation in progress...:   0%|          | 0/8 [00:00<?, ?it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([23, 128]) respectively.


Model validation in progress...:  25%|██▌       | 2/8 [00:00<00:00, 13.85it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model validation in progress...:  50%|█████     | 4/8 [00:00<00:00, 13.11it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([33, 128]) respectively.


Model validation in progress...:  75%|███████▌  | 6/8 [00:00<00:00, 12.32it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model validation in progress...: 100%|██████████| 8/8 [00:00<00:00, 12.82it/s]


Batch sizes for English and German are torch.Size([32, 118]) and torch.Size([34, 118]) respectively.
Epoch: 6
Training loss: 3.510
Validation loss: 4.178




Model training in progress...:   0%|          | 1/227 [00:00<00:38,  5.82it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:   1%|▏         | 3/227 [00:00<00:34,  6.40it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:   2%|▏         | 5/227 [00:00<00:36,  6.12it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:   3%|▎         | 7/227 [00:01<00:37,  5.80it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:   4%|▍         | 9/227 [00:01<00:38,  5.60it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:   5%|▍         | 11/227 [00:01<00:36,  5.86it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:   6%|▌         | 13/227 [00:02<00:35,  6.09it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([43, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:   7%|▋         | 15/227 [00:02<00:37,  5.60it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:   7%|▋         | 17/227 [00:02<00:34,  6.03it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:   8%|▊         | 19/227 [00:03<00:32,  6.39it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:   9%|▉         | 21/227 [00:03<00:32,  6.27it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  10%|█         | 23/227 [00:03<00:31,  6.52it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  11%|█         | 25/227 [00:04<00:32,  6.25it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  12%|█▏        | 27/227 [00:04<00:32,  6.13it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  13%|█▎        | 29/227 [00:04<00:35,  5.60it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  14%|█▎        | 31/227 [00:05<00:32,  6.00it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  15%|█▍        | 33/227 [00:05<00:33,  5.80it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  15%|█▌        | 35/227 [00:05<00:33,  5.69it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  16%|█▋        | 37/227 [00:06<00:33,  5.63it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  17%|█▋        | 39/227 [00:06<00:31,  6.01it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  18%|█▊        | 41/227 [00:06<00:30,  6.16it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  19%|█▊        | 42/227 [00:07<00:29,  6.24it/s]

Batch sizes for English and German are torch.Size([39, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  19%|█▉        | 44/227 [00:07<00:32,  5.67it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  20%|██        | 46/227 [00:07<00:29,  6.12it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  21%|██        | 48/227 [00:08<00:30,  5.92it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  22%|██▏       | 50/227 [00:08<00:29,  5.97it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  23%|██▎       | 52/227 [00:08<00:31,  5.55it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  24%|██▍       | 54/227 [00:09<00:30,  5.64it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  25%|██▍       | 56/227 [00:09<00:28,  5.98it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  26%|██▌       | 58/227 [00:09<00:27,  6.14it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  26%|██▋       | 60/227 [00:10<00:26,  6.39it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  27%|██▋       | 62/227 [00:10<00:28,  5.72it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  28%|██▊       | 64/227 [00:10<00:28,  5.71it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  29%|██▉       | 66/227 [00:11<00:26,  6.12it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  30%|██▉       | 68/227 [00:11<00:25,  6.34it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  31%|███       | 70/227 [00:11<00:25,  6.22it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  32%|███▏      | 72/227 [00:12<00:24,  6.26it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  33%|███▎      | 74/227 [00:12<00:25,  6.06it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  33%|███▎      | 76/227 [00:12<00:23,  6.30it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  34%|███▍      | 78/227 [00:13<00:24,  6.03it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  35%|███▌      | 80/227 [00:13<00:27,  5.38it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  36%|███▌      | 82/227 [00:13<00:23,  6.05it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  37%|███▋      | 84/227 [00:14<00:23,  6.16it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  38%|███▊      | 86/227 [00:14<00:21,  6.57it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  39%|███▉      | 88/227 [00:14<00:21,  6.55it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  40%|███▉      | 90/227 [00:15<00:21,  6.36it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  40%|████      | 91/227 [00:15<00:21,  6.35it/s]

Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  41%|████      | 93/227 [00:15<00:24,  5.45it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  42%|████▏     | 95/227 [00:15<00:23,  5.68it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  43%|████▎     | 97/227 [00:16<00:20,  6.23it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  44%|████▎     | 99/227 [00:16<00:22,  5.60it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  44%|████▍     | 100/227 [00:16<00:22,  5.74it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([41, 128]) respectively.


Model training in progress...:  45%|████▍     | 102/227 [00:17<00:21,  5.81it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  46%|████▌     | 104/227 [00:17<00:20,  6.02it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  47%|████▋     | 106/227 [00:17<00:21,  5.64it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  47%|████▋     | 107/227 [00:18<00:22,  5.43it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  48%|████▊     | 108/227 [00:18<00:20,  5.83it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  48%|████▊     | 110/227 [00:18<00:19,  5.94it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  49%|████▉     | 111/227 [00:18<00:19,  5.92it/s]

Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  50%|████▉     | 113/227 [00:19<00:21,  5.43it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  51%|█████     | 115/227 [00:19<00:18,  5.96it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  52%|█████▏    | 117/227 [00:19<00:19,  5.68it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  52%|█████▏    | 119/227 [00:20<00:18,  5.82it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  53%|█████▎    | 121/227 [00:20<00:17,  5.95it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  54%|█████▍    | 123/227 [00:20<00:17,  5.89it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  55%|█████▍    | 124/227 [00:20<00:17,  5.99it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  56%|█████▌    | 126/227 [00:21<00:17,  5.91it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  56%|█████▋    | 128/227 [00:21<00:16,  5.85it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([23, 128]) respectively.


Model training in progress...:  57%|█████▋    | 130/227 [00:21<00:15,  6.32it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  58%|█████▊    | 132/227 [00:22<00:14,  6.45it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  59%|█████▉    | 134/227 [00:22<00:15,  6.08it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  60%|█████▉    | 136/227 [00:22<00:15,  5.94it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  61%|██████    | 138/227 [00:23<00:14,  5.94it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  62%|██████▏   | 140/227 [00:23<00:15,  5.72it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  62%|██████▏   | 141/227 [00:23<00:14,  5.92it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  63%|██████▎   | 142/227 [00:23<00:13,  6.12it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  63%|██████▎   | 144/227 [00:24<00:15,  5.27it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  64%|██████▍   | 145/227 [00:24<00:14,  5.69it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  65%|██████▍   | 147/227 [00:24<00:14,  5.58it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  66%|██████▌   | 149/227 [00:25<00:14,  5.55it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  67%|██████▋   | 151/227 [00:25<00:12,  5.92it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  67%|██████▋   | 153/227 [00:25<00:12,  5.91it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  68%|██████▊   | 155/227 [00:26<00:12,  5.68it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  69%|██████▊   | 156/227 [00:26<00:12,  5.51it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  70%|██████▉   | 158/227 [00:26<00:12,  5.65it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  70%|███████   | 160/227 [00:27<00:11,  6.04it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  71%|███████▏  | 162/227 [00:27<00:10,  6.10it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  72%|███████▏  | 164/227 [00:27<00:10,  6.20it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  73%|███████▎  | 166/227 [00:28<00:09,  6.34it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  74%|███████▍  | 168/227 [00:28<00:08,  6.57it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  75%|███████▍  | 170/227 [00:28<00:09,  6.13it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  76%|███████▌  | 172/227 [00:29<00:09,  5.59it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([46, 128]) respectively.


Model training in progress...:  76%|███████▌  | 173/227 [00:29<00:09,  5.96it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  77%|███████▋  | 175/227 [00:29<00:08,  6.44it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  78%|███████▊  | 177/227 [00:29<00:07,  6.35it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  79%|███████▉  | 179/227 [00:30<00:08,  5.92it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  79%|███████▉  | 180/227 [00:30<00:08,  5.84it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  80%|████████  | 182/227 [00:30<00:07,  5.91it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  81%|████████  | 184/227 [00:30<00:06,  6.23it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  81%|████████▏ | 185/227 [00:31<00:07,  5.95it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  82%|████████▏ | 187/227 [00:31<00:07,  5.42it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  83%|████████▎ | 189/227 [00:31<00:06,  5.68it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  84%|████████▍ | 191/227 [00:32<00:05,  6.03it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  85%|████████▌ | 193/227 [00:32<00:05,  6.29it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  86%|████████▌ | 195/227 [00:32<00:05,  6.17it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  87%|████████▋ | 197/227 [00:33<00:04,  6.11it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  88%|████████▊ | 199/227 [00:33<00:04,  5.94it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  89%|████████▊ | 201/227 [00:33<00:04,  6.01it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  89%|████████▉ | 203/227 [00:34<00:03,  6.11it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  90%|█████████ | 205/227 [00:34<00:03,  6.29it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  91%|█████████ | 207/227 [00:34<00:03,  5.99it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  92%|█████████▏| 209/227 [00:35<00:02,  6.19it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  93%|█████████▎| 211/227 [00:35<00:02,  6.11it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  94%|█████████▍| 213/227 [00:35<00:02,  6.27it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  95%|█████████▍| 215/227 [00:36<00:01,  6.06it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  96%|█████████▌| 217/227 [00:36<00:01,  6.40it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  96%|█████████▋| 219/227 [00:36<00:01,  6.43it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  97%|█████████▋| 221/227 [00:37<00:00,  6.32it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  98%|█████████▊| 223/227 [00:37<00:00,  6.26it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  99%|█████████▉| 225/227 [00:37<00:00,  6.32it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...: 100%|██████████| 227/227 [00:37<00:00,  5.98it/s]


Batch sizes for English and German are torch.Size([26, 72]) and torch.Size([24, 72]) respectively.


Model validation in progress...:   0%|          | 0/8 [00:00<?, ?it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([23, 128]) respectively.


Model validation in progress...:  25%|██▌       | 2/8 [00:00<00:00, 12.97it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model validation in progress...:  50%|█████     | 4/8 [00:00<00:00, 13.15it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.


Model validation in progress...: 100%|██████████| 8/8 [00:00<00:00, 13.60it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 118]) and torch.Size([34, 118]) respectively.





Epoch: 7
Training loss: 3.373
Validation loss: 4.074




Model training in progress...:   0%|          | 1/227 [00:00<00:34,  6.48it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:   1%|▏         | 3/227 [00:00<00:37,  6.05it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:   2%|▏         | 5/227 [00:00<00:36,  6.15it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:   3%|▎         | 7/227 [00:01<00:33,  6.64it/s]

Batch sizes for English and German are torch.Size([23, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:   4%|▍         | 9/227 [00:01<00:34,  6.38it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([39, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:   5%|▍         | 11/227 [00:01<00:36,  5.85it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:   6%|▌         | 13/227 [00:02<00:34,  6.16it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:   7%|▋         | 15/227 [00:02<00:39,  5.43it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:   7%|▋         | 16/227 [00:02<00:36,  5.82it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:   8%|▊         | 18/227 [00:02<00:33,  6.17it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:   9%|▉         | 20/227 [00:03<00:31,  6.55it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  10%|▉         | 22/227 [00:03<00:33,  6.20it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  11%|█         | 24/227 [00:03<00:33,  6.00it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  11%|█▏        | 26/227 [00:04<00:34,  5.88it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  12%|█▏        | 27/227 [00:04<00:33,  5.96it/s]

Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  12%|█▏        | 28/227 [00:04<00:37,  5.37it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  13%|█▎        | 30/227 [00:05<00:35,  5.60it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  14%|█▍        | 32/227 [00:05<00:33,  5.78it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  15%|█▍        | 34/227 [00:05<00:31,  6.07it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  16%|█▌        | 36/227 [00:06<00:31,  6.08it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  17%|█▋        | 38/227 [00:06<00:34,  5.51it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  17%|█▋        | 39/227 [00:06<00:32,  5.86it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  18%|█▊        | 41/227 [00:06<00:29,  6.21it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  19%|█▉        | 43/227 [00:07<00:29,  6.29it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  20%|█▉        | 45/227 [00:07<00:29,  6.18it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  21%|██        | 47/227 [00:07<00:27,  6.48it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  22%|██▏       | 49/227 [00:08<00:26,  6.77it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  22%|██▏       | 51/227 [00:08<00:26,  6.67it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  23%|██▎       | 53/227 [00:08<00:25,  6.77it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  24%|██▍       | 55/227 [00:08<00:25,  6.70it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  25%|██▌       | 57/227 [00:09<00:27,  6.22it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  26%|██▌       | 59/227 [00:09<00:27,  6.10it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  27%|██▋       | 61/227 [00:09<00:28,  5.92it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  28%|██▊       | 63/227 [00:10<00:27,  6.03it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  29%|██▊       | 65/227 [00:10<00:26,  6.05it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  30%|██▉       | 67/227 [00:10<00:25,  6.18it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  30%|██▉       | 68/227 [00:11<00:25,  6.27it/s]

Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([41, 128]) respectively.


Model training in progress...:  31%|███       | 70/227 [00:11<00:27,  5.70it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  32%|███▏      | 72/227 [00:11<00:28,  5.37it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  32%|███▏      | 73/227 [00:12<00:28,  5.45it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  33%|███▎      | 74/227 [00:12<00:26,  5.69it/s]

Batch sizes for English and German are torch.Size([43, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  33%|███▎      | 76/227 [00:12<00:27,  5.46it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([35, 128]) respectively.


Model training in progress...:  34%|███▍      | 78/227 [00:13<00:27,  5.39it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  35%|███▌      | 80/227 [00:13<00:27,  5.34it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  36%|███▌      | 81/227 [00:13<00:28,  5.14it/s]

Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  36%|███▌      | 82/227 [00:13<00:26,  5.56it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  37%|███▋      | 84/227 [00:14<00:25,  5.69it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  38%|███▊      | 86/227 [00:14<00:24,  5.86it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  39%|███▉      | 88/227 [00:14<00:23,  5.92it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  40%|███▉      | 90/227 [00:15<00:20,  6.64it/s]

Batch sizes for English and German are torch.Size([23, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  41%|████      | 92/227 [00:15<00:21,  6.21it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  41%|████▏     | 94/227 [00:15<00:22,  5.88it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  42%|████▏     | 96/227 [00:16<00:21,  6.11it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  43%|████▎     | 98/227 [00:16<00:22,  5.83it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  44%|████▍     | 100/227 [00:16<00:22,  5.62it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  44%|████▍     | 101/227 [00:16<00:21,  5.83it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  45%|████▌     | 103/227 [00:17<00:19,  6.23it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  46%|████▋     | 105/227 [00:17<00:21,  5.56it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  47%|████▋     | 107/227 [00:17<00:20,  5.81it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  48%|████▊     | 109/227 [00:18<00:20,  5.63it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  49%|████▉     | 111/227 [00:18<00:20,  5.74it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  50%|████▉     | 113/227 [00:18<00:19,  5.71it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  51%|█████     | 115/227 [00:19<00:18,  6.20it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  52%|█████▏    | 117/227 [00:19<00:18,  5.96it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  52%|█████▏    | 119/227 [00:19<00:18,  5.92it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  53%|█████▎    | 120/227 [00:20<00:17,  6.09it/s]

Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  54%|█████▎    | 122/227 [00:20<00:17,  5.84it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  55%|█████▍    | 124/227 [00:20<00:17,  6.00it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  56%|█████▌    | 126/227 [00:21<00:16,  6.00it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  56%|█████▋    | 128/227 [00:21<00:16,  6.17it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  57%|█████▋    | 130/227 [00:21<00:16,  5.91it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  58%|█████▊    | 132/227 [00:22<00:16,  5.85it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  59%|█████▉    | 134/227 [00:22<00:14,  6.20it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  60%|█████▉    | 136/227 [00:22<00:16,  5.44it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  60%|██████    | 137/227 [00:23<00:15,  5.84it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  61%|██████    | 139/227 [00:23<00:14,  6.08it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  62%|██████▏   | 141/227 [00:23<00:13,  6.27it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  63%|██████▎   | 143/227 [00:23<00:13,  6.44it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  64%|██████▍   | 145/227 [00:24<00:13,  6.02it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  65%|██████▍   | 147/227 [00:24<00:14,  5.58it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  66%|██████▌   | 149/227 [00:25<00:13,  5.60it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  67%|██████▋   | 151/227 [00:25<00:13,  5.69it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  67%|██████▋   | 153/227 [00:25<00:12,  6.02it/s]

Batch sizes for English and German are torch.Size([23, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  68%|██████▊   | 155/227 [00:26<00:11,  6.27it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  69%|██████▉   | 157/227 [00:26<00:11,  6.04it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  70%|██████▉   | 158/227 [00:26<00:11,  5.98it/s]

Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  70%|███████   | 160/227 [00:26<00:12,  5.36it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  71%|███████▏  | 162/227 [00:27<00:11,  5.48it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  72%|███████▏  | 164/227 [00:27<00:10,  5.92it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([46, 128]) respectively.


Model training in progress...:  73%|███████▎  | 166/227 [00:27<00:10,  5.84it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  74%|███████▍  | 168/227 [00:28<00:09,  6.04it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  75%|███████▍  | 170/227 [00:28<00:09,  6.12it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  76%|███████▌  | 172/227 [00:28<00:08,  6.37it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  77%|███████▋  | 174/227 [00:29<00:09,  5.61it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  77%|███████▋  | 175/227 [00:29<00:09,  5.56it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  78%|███████▊  | 177/227 [00:29<00:08,  5.83it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  79%|███████▉  | 179/227 [00:30<00:08,  5.91it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  80%|███████▉  | 181/227 [00:30<00:07,  6.10it/s]

Batch sizes for English and German are torch.Size([23, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  81%|████████  | 183/227 [00:30<00:07,  5.94it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  81%|████████▏ | 185/227 [00:31<00:06,  6.42it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  82%|████████▏ | 187/227 [00:31<00:06,  6.38it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  83%|████████▎ | 189/227 [00:31<00:06,  6.27it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  84%|████████▎ | 190/227 [00:31<00:05,  6.37it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  85%|████████▍ | 192/227 [00:32<00:06,  5.56it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  85%|████████▌ | 194/227 [00:32<00:05,  5.75it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  86%|████████▋ | 196/227 [00:32<00:05,  5.87it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  87%|████████▋ | 198/227 [00:33<00:04,  6.24it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  88%|████████▊ | 200/227 [00:33<00:04,  6.14it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  89%|████████▉ | 202/227 [00:33<00:04,  6.11it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  90%|████████▉ | 204/227 [00:34<00:03,  6.25it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  91%|█████████ | 206/227 [00:34<00:03,  6.04it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  92%|█████████▏| 208/227 [00:34<00:03,  6.19it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  93%|█████████▎| 210/227 [00:35<00:02,  6.56it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  93%|█████████▎| 212/227 [00:35<00:02,  5.98it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  94%|█████████▍| 214/227 [00:35<00:02,  6.45it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  95%|█████████▌| 216/227 [00:36<00:01,  6.15it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  96%|█████████▌| 218/227 [00:36<00:01,  5.66it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  97%|█████████▋| 220/227 [00:36<00:01,  5.53it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  97%|█████████▋| 221/227 [00:37<00:01,  5.78it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  98%|█████████▊| 223/227 [00:37<00:00,  5.78it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  99%|█████████▉| 225/227 [00:37<00:00,  6.10it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...: 100%|██████████| 227/227 [00:38<00:00,  5.97it/s]


Batch sizes for English and German are torch.Size([24, 72]) and torch.Size([30, 72]) respectively.


Model validation in progress...:   0%|          | 0/8 [00:00<?, ?it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([34, 128]) respectively.


Model validation in progress...:  25%|██▌       | 2/8 [00:00<00:00, 12.97it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model validation in progress...:  50%|█████     | 4/8 [00:00<00:00, 12.90it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.


Model validation in progress...:  75%|███████▌  | 6/8 [00:00<00:00, 12.80it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 118]) and torch.Size([26, 118]) respectively.


Model validation in progress...: 100%|██████████| 8/8 [00:00<00:00, 12.95it/s]


Epoch: 8
Training loss: 3.257
Validation loss: 3.993




Model training in progress...:   0%|          | 1/227 [00:00<00:38,  5.92it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:   1%|▏         | 3/227 [00:00<00:40,  5.55it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:   2%|▏         | 5/227 [00:00<00:37,  5.96it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:   3%|▎         | 7/227 [00:01<00:39,  5.55it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:   4%|▎         | 8/227 [00:01<00:38,  5.62it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:   4%|▍         | 10/227 [00:01<00:39,  5.52it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:   5%|▌         | 12/227 [00:02<00:35,  6.02it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:   6%|▌         | 13/227 [00:02<00:36,  5.91it/s]

Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([41, 128]) respectively.


Model training in progress...:   7%|▋         | 15/227 [00:02<00:37,  5.66it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:   7%|▋         | 17/227 [00:02<00:35,  5.98it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:   8%|▊         | 19/227 [00:03<00:35,  5.94it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:   9%|▉         | 21/227 [00:03<00:34,  6.03it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  10%|█         | 23/227 [00:04<00:36,  5.65it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([43, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  11%|█         | 25/227 [00:04<00:38,  5.31it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  12%|█▏        | 27/227 [00:04<00:34,  5.82it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  13%|█▎        | 29/227 [00:05<00:33,  5.92it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([46, 128]) respectively.


Model training in progress...:  14%|█▎        | 31/227 [00:05<00:33,  5.82it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  15%|█▍        | 33/227 [00:05<00:33,  5.83it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  15%|█▌        | 35/227 [00:06<00:30,  6.25it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  16%|█▋        | 37/227 [00:06<00:30,  6.22it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  17%|█▋        | 39/227 [00:06<00:28,  6.60it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  18%|█▊        | 41/227 [00:06<00:28,  6.56it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  19%|█▉        | 43/227 [00:07<00:31,  5.79it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  20%|█▉        | 45/227 [00:07<00:31,  5.84it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  21%|██        | 47/227 [00:08<00:30,  5.89it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  22%|██▏       | 49/227 [00:08<00:32,  5.46it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  22%|██▏       | 51/227 [00:08<00:30,  5.71it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  23%|██▎       | 53/227 [00:09<00:31,  5.48it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  24%|██▍       | 55/227 [00:09<00:30,  5.71it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  25%|██▌       | 57/227 [00:09<00:28,  6.04it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  26%|██▌       | 59/227 [00:10<00:27,  6.03it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  27%|██▋       | 61/227 [00:10<00:27,  6.00it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  28%|██▊       | 63/227 [00:10<00:26,  6.16it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  29%|██▊       | 65/227 [00:11<00:25,  6.38it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  30%|██▉       | 67/227 [00:11<00:26,  5.95it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  30%|███       | 69/227 [00:11<00:27,  5.69it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  31%|███       | 70/227 [00:11<00:28,  5.58it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  32%|███▏      | 72/227 [00:12<00:26,  5.87it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([42, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  33%|███▎      | 74/227 [00:12<00:27,  5.65it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  33%|███▎      | 76/227 [00:13<00:27,  5.48it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  34%|███▍      | 78/227 [00:13<00:24,  6.13it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  35%|███▌      | 80/227 [00:13<00:23,  6.14it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  36%|███▌      | 82/227 [00:14<00:23,  6.14it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  37%|███▋      | 84/227 [00:14<00:22,  6.34it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  38%|███▊      | 86/227 [00:14<00:22,  6.37it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  39%|███▉      | 88/227 [00:14<00:23,  6.03it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  40%|███▉      | 90/227 [00:15<00:24,  5.62it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([39, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  41%|████      | 92/227 [00:15<00:24,  5.57it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  41%|████▏     | 94/227 [00:16<00:23,  5.74it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  42%|████▏     | 96/227 [00:16<00:21,  6.18it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  43%|████▎     | 98/227 [00:16<00:20,  6.30it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  44%|████▍     | 100/227 [00:16<00:19,  6.53it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  45%|████▍     | 102/227 [00:17<00:19,  6.32it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  46%|████▌     | 104/227 [00:17<00:20,  5.87it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  47%|████▋     | 106/227 [00:17<00:18,  6.41it/s]

Batch sizes for English and German are torch.Size([24, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([38, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  48%|████▊     | 108/227 [00:18<00:20,  5.92it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  48%|████▊     | 110/227 [00:18<00:19,  5.98it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  49%|████▉     | 112/227 [00:19<00:19,  5.93it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  50%|█████     | 114/227 [00:19<00:18,  6.26it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  51%|█████     | 116/227 [00:19<00:18,  5.92it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  52%|█████▏    | 118/227 [00:19<00:17,  6.28it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([23, 128]) respectively.


Model training in progress...:  53%|█████▎    | 120/227 [00:20<00:16,  6.60it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  54%|█████▎    | 122/227 [00:20<00:17,  5.88it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([23, 128]) respectively.


Model training in progress...:  54%|█████▍    | 123/227 [00:20<00:16,  6.22it/s]

Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  55%|█████▌    | 125/227 [00:21<00:18,  5.42it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  56%|█████▌    | 127/227 [00:21<00:17,  5.85it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  57%|█████▋    | 129/227 [00:21<00:15,  6.23it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  58%|█████▊    | 131/227 [00:22<00:15,  6.11it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  59%|█████▊    | 133/227 [00:22<00:14,  6.52it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  59%|█████▉    | 135/227 [00:22<00:14,  6.35it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  60%|██████    | 137/227 [00:23<00:15,  5.83it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([37, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  61%|██████    | 139/227 [00:23<00:14,  5.87it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  62%|██████▏   | 141/227 [00:23<00:13,  6.18it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  63%|██████▎   | 143/227 [00:24<00:13,  6.01it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  64%|██████▍   | 145/227 [00:24<00:13,  6.01it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  65%|██████▍   | 147/227 [00:24<00:13,  5.82it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([36, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  66%|██████▌   | 149/227 [00:25<00:13,  5.92it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  67%|██████▋   | 151/227 [00:25<00:12,  6.18it/s]

Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  67%|██████▋   | 153/227 [00:25<00:12,  5.76it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  68%|██████▊   | 155/227 [00:26<00:12,  5.62it/s]

Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  69%|██████▉   | 157/227 [00:26<00:12,  5.78it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  70%|███████   | 159/227 [00:26<00:11,  6.00it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  71%|███████   | 161/227 [00:27<00:11,  5.96it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([23, 128]) respectively.


Model training in progress...:  72%|███████▏  | 163/227 [00:27<00:10,  6.35it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([24, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([22, 128]) respectively.


Model training in progress...:  73%|███████▎  | 165/227 [00:27<00:09,  6.28it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  74%|███████▎  | 167/227 [00:28<00:10,  5.68it/s]

Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  74%|███████▍  | 169/227 [00:28<00:10,  5.57it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  75%|███████▌  | 171/227 [00:28<00:09,  5.63it/s]

Batch sizes for English and German are torch.Size([33, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  76%|███████▌  | 172/227 [00:29<00:09,  5.84it/s]

Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  77%|███████▋  | 174/227 [00:29<00:09,  5.65it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  78%|███████▊  | 176/227 [00:29<00:08,  6.07it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([33, 128]) respectively.


Model training in progress...:  78%|███████▊  | 178/227 [00:30<00:07,  6.24it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([34, 128]) respectively.


Model training in progress...:  79%|███████▉  | 180/227 [00:30<00:08,  5.85it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  80%|████████  | 182/227 [00:30<00:07,  5.78it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  81%|████████  | 184/227 [00:31<00:07,  5.99it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  82%|████████▏ | 186/227 [00:31<00:06,  6.08it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([25, 128]) respectively.


Model training in progress...:  83%|████████▎ | 188/227 [00:31<00:06,  5.63it/s]

Batch sizes for English and German are torch.Size([36, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  84%|████████▎ | 190/227 [00:32<00:05,  6.23it/s]

Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  85%|████████▍ | 192/227 [00:32<00:05,  5.98it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([31, 128]) respectively.


Model training in progress...:  85%|████████▌ | 194/227 [00:32<00:05,  5.65it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...:  86%|████████▋ | 196/227 [00:33<00:05,  5.75it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  87%|████████▋ | 198/227 [00:33<00:05,  5.62it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  88%|████████▊ | 200/227 [00:33<00:05,  5.30it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([28, 128]) respectively.


Model training in progress...:  89%|████████▉ | 202/227 [00:34<00:04,  5.74it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([36, 128]) respectively.


Model training in progress...:  90%|████████▉ | 204/227 [00:34<00:03,  6.14it/s]

Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([25, 128]) respectively.
Batch sizes for English and German are torch.Size([41, 128]) and torch.Size([45, 128]) respectively.


Model training in progress...:  91%|█████████ | 206/227 [00:34<00:03,  5.72it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([25, 128]) and torch.Size([26, 128]) respectively.


Model training in progress...:  92%|█████████▏| 208/227 [00:35<00:03,  6.21it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  93%|█████████▎| 210/227 [00:35<00:02,  6.86it/s]

Batch sizes for English and German are torch.Size([23, 128]) and torch.Size([23, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([32, 128]) respectively.


Model training in progress...:  93%|█████████▎| 212/227 [00:35<00:02,  6.44it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([27, 128]) respectively.


Model training in progress...:  94%|█████████▍| 214/227 [00:36<00:02,  6.04it/s]

Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([37, 128]) and torch.Size([38, 128]) respectively.


Model training in progress...:  95%|█████████▌| 216/227 [00:36<00:01,  5.81it/s]

Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 128]) and torch.Size([22, 128]) respectively.


Model training in progress...:  96%|█████████▌| 218/227 [00:36<00:01,  5.88it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 128]) and torch.Size([24, 128]) respectively.


Model training in progress...:  97%|█████████▋| 220/227 [00:37<00:01,  6.07it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([32, 128]) respectively.
Batch sizes for English and German are torch.Size([40, 128]) and torch.Size([37, 128]) respectively.


Model training in progress...:  98%|█████████▊| 222/227 [00:37<00:00,  5.69it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([28, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([30, 128]) respectively.


Model training in progress...:  99%|█████████▊| 224/227 [00:37<00:00,  5.48it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.


Model training in progress...: 100%|█████████▉| 226/227 [00:38<00:00,  5.76it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([27, 128]) respectively.
Batch sizes for English and German are torch.Size([27, 72]) and torch.Size([28, 72]) respectively.


Model training in progress...: 100%|██████████| 227/227 [00:38<00:00,  5.92it/s]
Model validation in progress...:  25%|██▌       | 2/8 [00:00<00:00, 13.00it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([34, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([30, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([32, 128]) respectively.


Model validation in progress...:  75%|███████▌  | 6/8 [00:00<00:00, 13.69it/s]

Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([31, 128]) respectively.
Batch sizes for English and German are torch.Size([32, 128]) and torch.Size([35, 128]) respectively.
Batch sizes for English and German are torch.Size([26, 128]) and torch.Size([26, 128]) respectively.


Model validation in progress...: 100%|██████████| 8/8 [00:00<00:00, 13.39it/s]


Batch sizes for English and German are torch.Size([34, 128]) and torch.Size([33, 128]) respectively.
Batch sizes for English and German are torch.Size([28, 118]) and torch.Size([28, 118]) respectively.
Epoch: 9
Training loss: 3.127
Validation loss: 3.979


Model training is complete, and the model has 13898501 learnable parameters.




Model evaluation in progress...:  25%|██▌       | 2/8 [00:00<00:00, 13.32it/s]

Batch sizes for English and German are torch.Size([29, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([30, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([33, 128]) respectively.


Model evaluation in progress...:  50%|█████     | 4/8 [00:00<00:00, 12.24it/s]

Batch sizes for English and German are torch.Size([35, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([28, 128]) respectively.


Model evaluation in progress...: 100%|██████████| 8/8 [00:00<00:00, 13.30it/s]

Batch sizes for English and German are torch.Size([31, 128]) and torch.Size([29, 128]) respectively.
Batch sizes for English and German are torch.Size([24, 104]) and torch.Size([28, 104]) respectively.
Testing loss: 3.972







AttributeError: 'function' object has no attribute 'load'