This repository was archived by the owner on Jul 7, 2023. It is now read-only.

translate_enzh_wmt8k dev data path error #445

@zyshin

> t2t-datagen \
>   --data_dir=$DATA_DIR \
>   --tmp_dir=$TMP_DIR \
>   --problem=$PROBLEM
INFO:tensorflow:Generating problems:
    translate:
      * translate_enzh_wmt8k
INFO:tensorflow:Generating data for translate_enzh_wmt8k.
INFO:tensorflow:Found vocab file: /home/translate/t2t_data/vocab.enzh-en.8192
INFO:tensorflow:Found vocab file: /home/translate/t2t_data/vocab.enzh-zh.8192
INFO:tensorflow:Not downloading, file already found: /tmp/t2t_datagen/training-parallel-nc-v12.tgz
INFO:tensorflow:Found vocab file: /home/translate/t2t_data/vocab.enzh-en.8192
INFO:tensorflow:Found vocab file: /home/translate/t2t_data/vocab.enzh-zh.8192
INFO:tensorflow:Not downloading, file already found: /tmp/t2t_datagen/dev.tgz
Traceback (most recent call last):
  File "/home/translate/env/bin/t2t-datagen", line 212, in <module>
    tf.app.run()
  File "/home/translate/env/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/home/translate/env/bin/t2t-datagen", line 178, in main
    generate_data_for_registered_problem(problem)
  File "/home/translate/env/bin/t2t-datagen", line 208, in generate_data_for_registered_problem
    task_id=task_id)
  File "/home/translate/env/lib/python3.5/site-packages/tensor2tensor/data_generators/problem.py", line 638, in generate_data
    self.generator(data_dir, tmp_dir, False), dev_paths)
  File "/home/translate/env/lib/python3.5/site-packages/tensor2tensor/data_generators/translate_enzh.py", line 88, in generator
    "wmt_enzh_tok_%s" % tag)
  File "/home/translate/env/lib/python3.5/site-packages/tensor2tensor/data_generators/translate.py", line 246, in compile_data
    line1, line2 = lang1_file.readline(), lang2_file.readline()
  File "/home/translate/env/lib/python3.5/site-packages/tensorflow/python/lib/io/file_io.py", line 177, in readline
    self._preread_check()
  File "/home/translate/env/lib/python3.5/site-packages/tensorflow/python/lib/io/file_io.py", line 79, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "/home/translate/env/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: /tmp/t2t_datagen/dev/newsdev2017-zhen-src.en.sgm; No such file or directory

The dev archive does not contain the `zhen` file names the generator asks for. Changing the dev paths on line 52 of data_generators/enzh_wmt8k.py from
dev/newsdev2017-zhen-src.en.sgm and dev/newsdev2017-zhen-ref.zh.sgm to
dev/newsdev2017-enzh-src.en.sgm and dev/newsdev2017-enzh-ref.zh.sgm
makes data generation work.
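To confirm which names the extracted archive actually contains before editing the generator, a small check like the one below may help. It is only a sketch: the helper names `find_missing` and `list_candidates` are hypothetical and not part of tensor2tensor, and it assumes dev.tgz has already been extracted under the tmp directory (/tmp/t2t_datagen in the log above).

```python
import os


def find_missing(tmp_dir, relative_paths):
    """Return the relative paths that are not regular files under tmp_dir."""
    return [
        path for path in relative_paths
        if not os.path.isfile(os.path.join(tmp_dir, path))
    ]


def list_candidates(tmp_dir, subdir="dev", prefix="newsdev2017"):
    """List file names in tmp_dir/subdir that start with prefix, sorted."""
    dev_dir = os.path.join(tmp_dir, subdir)
    if not os.path.isdir(dev_dir):
        return []
    return sorted(
        name for name in os.listdir(dev_dir) if name.startswith(prefix)
    )


if __name__ == "__main__":
    # Paths taken from the traceback and the proposed fix in this issue.
    tmp_dir = "/tmp/t2t_datagen"
    expected = [
        "dev/newsdev2017-zhen-src.en.sgm",
        "dev/newsdev2017-zhen-ref.zh.sgm",
    ]
    print("missing:", find_missing(tmp_dir, expected))
    print("present in dev/:", list_candidates(tmp_dir))
```

If the first list is non-empty and the second shows the `enzh` variants, the mismatch is in the file names requested by the generator rather than in the download.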
