Weird result on processing tokenization

### Description

I ran the Transformer model on my own dataset, but `t2t-datagen` produced strange output while tokenizing it, as shown in the screenshot below. Could this be caused by the size of my dataset? The dataset is WikiEd: http://romang.home.amu.edu.pl/wiked/wiked.v1.pdf

### Environment information
Ubuntu
tensorflow-gpu 1.13.1


![image](https://user-images.githubusercontent.com/32641072/54509608-17c86d80-4985-11e9-814c-a8a42c2ec6e8.png)


