# Practice session 1

## Training a toy sequence copy model with Sockeye

Now that we have installed Sockeye (see the instructions in the slides), we will use it to train a very small sequence-to-sequence model. Its task is simple: it has to learn to copy a sequence of numbers.

`1 3 4 1 1 8 7 9 6` $\rightarrow$ `1 3 4 1 1 8 7 9 6`

### 1. Generate training & validation data

Sockeye takes parallel files with source and target data as input. This means that we need two files containing training data, where each line of the target file is a translation (in our case, simply a copy) of the line with the same number in the source file. The same goes for the validation data.
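
As a minimal sketch of what "parallel" means here, the helper below reads a source file and a target file, checks that they have the same number of lines, and pairs them up line by line. The name `check_parallel` is hypothetical and not part of Sockeye; it is only meant as a sanity check you could run once the files exist.

```python
from pathlib import Path


def check_parallel(source_path, target_path):
    """Return (source, target) line pairs; raise if the files are not line-aligned."""
    source_lines = Path(source_path).read_text(encoding="utf-8").splitlines()
    target_lines = Path(target_path).read_text(encoding="utf-8").splitlines()
    if len(source_lines) != len(target_lines):
        raise ValueError(
            f"{source_path} has {len(source_lines)} lines, "
            f"{target_path} has {len(target_lines)}"
        )
    # Line i of the source file is paired with line i of the target file.
    return list(zip(source_lines, target_lines))
```

For the copy task, every returned pair should consist of two identical strings.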

**Task 1.** Create 4 files: `data/train.source`, `data/train.target`, `data/dev.source`, and `data/dev.target`. The train files should contain 99000 lines each, the dev files 1000 lines each. Source and target files should be identical. Each line should consist of random integers from 0 to 10, delimited by spaces. The number of integers in each line should also be determined randomly and be between 5 and 15.

In [None]:
### YOUR CODE ###
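
If you get stuck, here is one possible sketch of Task 1 using only the Python standard library. It assumes that both ranges in the task are inclusive (integers 0 to 10, line lengths 5 to 15). The function names `random_sequence` and `write_copy_data` and the seed value are arbitrary choices made for this sketch, not part of the exercise.

```python
import random
from pathlib import Path


def random_sequence(rng, min_len=5, max_len=15, low=0, high=10):
    """Return one line of space-delimited random integers."""
    length = rng.randint(min_len, max_len)  # randint includes both bounds
    return " ".join(str(rng.randint(low, high)) for _ in range(length))


def write_copy_data(out_dir, name, num_lines, rng):
    """Write identical <name>.source and <name>.target files with num_lines lines."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    text = "".join(random_sequence(rng) + "\n" for _ in range(num_lines))
    # Source and target are the same text because the task is to copy.
    for suffix in ("source", "target"):
        (out_dir / f"{name}.{suffix}").write_text(text, encoding="utf-8")


if __name__ == "__main__":
    rng = random.Random(1)  # fixed, arbitrary seed so the data is reproducible
    write_copy_data("data", "train", 99000, rng)
    write_copy_data("data", "dev", 1000, rng)
```

Using a single `random.Random` instance for both calls means the train and dev sets are drawn from the same stream, so they differ from each other while remaining reproducible.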

### 2. Train the model

We will now train a recurrent neural network to copy the integer sequences. It will be a 1-layer RNN model with a bidirectional LSTM as the encoder and a unidirectional LSTM as the decoder. The RNNs have 16 hidden units, and we learn word embeddings of size 4.

**Task 2.** Run the command below. It may take 10 to 15 minutes to finish. Check what the log shows.

In [None]:
!python -m sockeye.train --source data/train.source \
                         --target data/train.target \
                         --validation-source data/dev.source \
                         --validation-target data/dev.target \
                         --encoder rnn --decoder rnn \
                         --num-layers 1:1 \
                         --num-embed 4 \
                         --rnn-num-hidden 16 \
                         --rnn-attention-type dot \
                         --use-cpu \
                         --metrics perplexity accuracy \
                         --initial-learning-rate 0.002 \
                         --checkpoint-interval 500 \
                         --max-num-checkpoint-not-improved 3 \
                         --max-num-epochs 20 \
                         --output copy_model

### 3. Look at the results

**Task 3.** Translate a line with the trained model. Is the input reproduced correctly?

In [None]:
!echo "1 3 3 5 1 9 0 8 8 0" | python -m sockeye.translate -m copy_model --use-cpu
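
To answer this question for more than one line, a small helper can compare inputs and outputs token by token. This is only a sketch: `is_exact_copy` and `copy_accuracy` are hypothetical names, and the hypotheses are assumed to be the output lines of `sockeye.translate` collected into a list, one per input line.

```python
def is_exact_copy(source, hypothesis):
    """True if both lines contain the same tokens in the same order."""
    # split() ignores differences in leading, trailing, or repeated whitespace.
    return source.split() == hypothesis.split()


def copy_accuracy(sources, hypotheses):
    """Fraction of lines that the model reproduced exactly."""
    if len(sources) != len(hypotheses):
        raise ValueError("sources and hypotheses must have the same length")
    if not sources:
        return 0.0
    correct = sum(is_exact_copy(s, h) for s, h in zip(sources, hypotheses))
    return correct / len(sources)
```

For example, `copy_accuracy(["1 3 3", "4 5"], ["1 3 3", "4 4"])` gives `0.5`, because only the first line is copied correctly.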

**Task 4.** We have translated our input with the parameters of the best checkpoint. With the `--checkpoint` parameter, we can also translate with any other checkpoint saved before the model converged. The command below uses checkpoint 2.

In [None]:
!echo "1 3 3 5 1 9 0 8 8 0" | python -m sockeye.translate -m copy_model --use-cpu --checkpoint 2

**Task 5.** Look in the model folder (`copy_model`) and check which files it contains.