Hi, Shamil Chollampatt.
In the Model and Training Details section of your paper, you said that "Each of the source and target vocabularies consists of 30K most frequent BPE tokens from the source and target side of the parallel data, respectively." However, according to this line in the preprocessing script (i.e., training/preprocess.sh), it seems that you only use the target-side data to learn the BPE codes, and then apply them to both the source and target data.
The BPE model is trained with 30,000 merge operations on the target side of the training data, as in the line you pointed to. Separately, the source and target vocabularies for the encoder-decoder model consist of the 30,000 most frequent subwords (i.e., BPE-segmented tokens) from the source and target sides of the parallel data, respectively (see line).
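To make the distinction between the two steps concrete, here is a minimal shell sketch, assuming the subword-nmt toolkit and hypothetical file names (train.src, train.tgt); it is an illustration of the procedure described above, not the exact contents of training/preprocess.sh:

```sh
# Step 1: learn 30K BPE merge operations from the TARGET side only
# (assumes the subword-nmt package is installed; file names are hypothetical)
subword-nmt learn-bpe -s 30000 < train.tgt > bpe.codes

# Step 2: apply the same codes to BOTH sides of the parallel data
subword-nmt apply-bpe -c bpe.codes < train.src > train.src.bpe
subword-nmt apply-bpe -c bpe.codes < train.tgt > train.tgt.bpe

# Step 3: build separate 30K vocabularies, one per side, by counting
# subword frequencies over the BPE-segmented source and target data
tr ' ' '\n' < train.src.bpe | sort | uniq -c | sort -rn | head -n 30000 > vocab.src
tr ' ' '\n' < train.tgt.bpe | sort | uniq -c | sort -rn | head -n 30000 > vocab.tgt
```

So the codes are learned from the target side only, but each vocabulary is still built from its own side of the BPE-segmented parallel data, which is consistent with the statement in the paper.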