Failed to train #1

albertma-evotec · 2020-10-09T15:44:36Z

I ran preprocess.sh and got the following output.

[2020-10-09 10:52:15,369 INFO] Extracting features...
[2020-10-09 10:52:15,371 INFO]  * number of source features: 0.
[2020-10-09 10:52:15,371 INFO]  * number of target features: 0.
[2020-10-09 10:52:15,371 INFO] Building `Fields` object...
[2020-10-09 10:52:15,371 INFO] Building & saving training data...
[2020-10-09 10:52:15,372 INFO] Reading source and target files: data/ChEMBL/src-train data/ChEMBL/tgt-train.
[2020-10-09 10:52:15,810 INFO] Splitting shard 0.
[2020-10-09 10:52:16,380 INFO] Building shard 0.
[2020-10-09 10:53:27,915 INFO]  * saving 0th train data shard to data/ChEMBL/.train.0.pt.
[2020-10-09 10:53:59,229 INFO] Building & saving validation data...
[2020-10-09 10:53:59,231 INFO] Reading source and target files: data/ChEMBL/src-val data/ChEMBL/tgt-val.
[2020-10-09 10:53:59,267 INFO] Splitting shard 0.
[2020-10-09 10:53:59,331 INFO] Building shard 0.
[2020-10-09 10:54:08,047 INFO]  * saving 0th valid data shard to data/ChEMBL/.valid.0.pt.
[2020-10-09 10:54:11,926 INFO] Building & saving vocabulary...
[2020-10-09 10:54:15,444 INFO]  * reloading data/ChEMBL/.train.0.pt.
[2020-10-09 10:54:20,820 INFO]  * tgt vocab size: 34.
[2020-10-09 10:54:20,820 INFO]  * src vocab size: 50.
[2020-10-09 10:54:20,820 INFO]  * merging src and tgt vocab...

But the subsequent training.sh then failed to run and gave me this traceback:

Traceback (most recent call last):
  File "train.py", line 118, in <module>
    main(opt)
  File "train.py", line 51, in main
    single_main(opt, 0)
  File "/home/UK/ama/Development/SyntaLinker/onmt/train_single.py", line 100, in main
    first_dataset = next(lazily_load_dataset("train", opt))
  File "/home/UK/ama/Development/SyntaLinker/onmt/inputters/inputter.py", line 551, in lazily_load_dataset
    yield _lazy_dataset_loader(pt, corpus_type)
  File "/home/UK/ama/Development/SyntaLinker/onmt/inputters/inputter.py", line 538, in _lazy_dataset_loader
    dataset = torch.load(pt_file)
  File "/home/UK/ama/.conda/envs/SyntaLinker/lib/python3.6/site-packages/torch/serialization.py", line 419, in load
    f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'data/ChEMBL/ChEMBL.train.pt'

These are the only files in the data/ChEMBL directory:

total 304132
drwxr-xr-x 2 ama domain users      4096 Oct  9 10:54 .
drwxr-xr-x 3 ama domain users      4096 Oct  8 11:01 ..
-rw-r--r-- 1 ama domain users   6511683 Oct  8 11:01 src-test.txt
-rw-r--r-- 1 ama domain users  52028301 Oct  8 11:01 src-train
-rw-r--r-- 1 ama domain users   6502030 Oct  8 11:01 src-val
-rw-r--r-- 1 ama domain users   8071432 Oct  8 11:01 tgt-test.txt
-rw-r--r-- 1 ama domain users  64500626 Oct  8 11:01 tgt-train
-rw-r--r-- 1 ama domain users   8060092 Oct  8 11:01 tgt-val
-rw-r--r-- 1 ama domain users 146295349 Oct  9 10:54 .train.0.pt
-rw-r--r-- 1 ama domain users  18167444 Oct  9 10:54 .valid.0.pt
-rw-r--r-- 1 ama domain users      1355 Oct  9 10:54 .vocab.pt
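
The listing and the traceback disagree about file names: preprocessing wrote `.train.0.pt` (no name before the first dot), while training asked for `ChEMBL.train.pt`. A minimal sketch of why that can happen, assuming the loader first globs for numbered shards under the data prefix and falls back to a single unsharded file when it finds none. `candidate_dataset_files` is a hypothetical helper written for illustration, not a function from the repository.

```python
import glob
import os
import tempfile


def candidate_dataset_files(data_prefix, corpus_type):
    """Return the files a shard-then-fallback loader would try to open.

    Assumption: shards are named <prefix>.<corpus_type>.<N>.pt and the
    fallback is <prefix>.<corpus_type>.pt.
    """
    pattern = data_prefix + "." + corpus_type + ".[0-9]*.pt"
    shards = sorted(glob.glob(pattern))
    if shards:
        return shards
    # No shard matched: fall back to the single-file name, which may not exist.
    return [data_prefix + "." + corpus_type + ".pt"]


# Reproduce the situation from the listing in a scratch directory.
with tempfile.TemporaryDirectory() as data_dir:
    open(os.path.join(data_dir, ".train.0.pt"), "wb").close()

    # Prefix used at training time (ends in "ChEMBL"): the shard is not found.
    train_prefix = os.path.join(data_dir, "ChEMBL")
    for path in candidate_dataset_files(train_prefix, "train"):
        print(os.path.basename(path), os.path.exists(path))

    # Prefix that preprocessing effectively used (empty name): the shard is found.
    for path in candidate_dataset_files(data_dir + os.sep, "train"):
        print(os.path.basename(path), os.path.exists(path))
```

If this assumption holds, the shard exists but sits under a different prefix from the one training uses, so the loader ends up requesting a fallback file that was never written.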

danielkovtun · 2021-05-27T17:43:37Z

I get the same thing. In the train_single_model.py script, there is a call to this inputter function that tries to lazily load a .pt file that doesn't exist: it was never committed to the repository and is not generated by the preprocessing script.

Any help would be appreciated.

YuYaoYang2333 · 2021-05-29T01:09:11Z

I guess this error may be caused by a wrong filename. Could you check the filenames again?

caiyingchun · 2021-10-13T02:16:12Z

I fixed this error by adding dataset=ChEMBL to preprocess.sh.
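
For anyone who has already run the slow preprocessing step without that variable set, renaming the existing output may be enough instead of rerunning it. A minimal sketch, assuming the only problem is the missing `ChEMBL` prefix on the generated `.pt` files and that training looks for `<prefix>.train.N.pt`, `<prefix>.valid.N.pt` and `<prefix>.vocab.pt`. `add_prefix` is a hypothetical helper, not part of the repository, and it has not been tested against it.

```python
import os


def add_prefix(data_dir, prefix):
    """Rename files such as `.train.0.pt` to `<prefix>.train.0.pt`.

    Only files whose names start with a dot and end in `.pt` are touched;
    existing targets are never overwritten. Returns (old, new) name pairs.
    """
    renamed = []
    for name in sorted(os.listdir(data_dir)):
        if name.startswith(".") and name.endswith(".pt"):
            src = os.path.join(data_dir, name)
            dst = os.path.join(data_dir, prefix + name)
            if not os.path.exists(dst):
                os.rename(src, dst)
                renamed.append((name, prefix + name))
    return renamed
```

Calling `add_prefix("data/ChEMBL", "ChEMBL")` from the repository root would turn `.train.0.pt`, `.valid.0.pt` and `.vocab.pt` into `ChEMBL.train.0.pt`, `ChEMBL.valid.0.pt` and `ChEMBL.vocab.pt`. Setting `dataset=ChEMBL` in preprocess.sh is still the fix reported to work in this thread.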
