Failed to train #1

Open
albertma-evotec opened this issue Oct 9, 2020 · 3 comments

Comments

@albertma-evotec

I ran preprocess.sh and got the following output:

[2020-10-09 10:52:15,369 INFO] Extracting features...
[2020-10-09 10:52:15,371 INFO]  * number of source features: 0.
[2020-10-09 10:52:15,371 INFO]  * number of target features: 0.
[2020-10-09 10:52:15,371 INFO] Building `Fields` object...
[2020-10-09 10:52:15,371 INFO] Building & saving training data...
[2020-10-09 10:52:15,372 INFO] Reading source and target files: data/ChEMBL/src-train data/ChEMBL/tgt-train.
[2020-10-09 10:52:15,810 INFO] Splitting shard 0.
[2020-10-09 10:52:16,380 INFO] Building shard 0.
[2020-10-09 10:53:27,915 INFO]  * saving 0th train data shard to data/ChEMBL/.train.0.pt.
[2020-10-09 10:53:59,229 INFO] Building & saving validation data...
[2020-10-09 10:53:59,231 INFO] Reading source and target files: data/ChEMBL/src-val data/ChEMBL/tgt-val.
[2020-10-09 10:53:59,267 INFO] Splitting shard 0.
[2020-10-09 10:53:59,331 INFO] Building shard 0.
[2020-10-09 10:54:08,047 INFO]  * saving 0th valid data shard to data/ChEMBL/.valid.0.pt.
[2020-10-09 10:54:11,926 INFO] Building & saving vocabulary...
[2020-10-09 10:54:15,444 INFO]  * reloading data/ChEMBL/.train.0.pt.
[2020-10-09 10:54:20,820 INFO]  * tgt vocab size: 34.
[2020-10-09 10:54:20,820 INFO]  * src vocab size: 50.
[2020-10-09 10:54:20,820 INFO]  * merging src and tgt vocab...

However, the subsequent training.sh failed to run and gave me this traceback:

Traceback (most recent call last):
  File "train.py", line 118, in <module>
    main(opt)
  File "train.py", line 51, in main
    single_main(opt, 0)
  File "/home/UK/ama/Development/SyntaLinker/onmt/train_single.py", line 100, in main
    first_dataset = next(lazily_load_dataset("train", opt))
  File "/home/UK/ama/Development/SyntaLinker/onmt/inputters/inputter.py", line 551, in lazily_load_dataset
    yield _lazy_dataset_loader(pt, corpus_type)
  File "/home/UK/ama/Development/SyntaLinker/onmt/inputters/inputter.py", line 538, in _lazy_dataset_loader
    dataset = torch.load(pt_file)
  File "/home/UK/ama/.conda/envs/SyntaLinker/lib/python3.6/site-packages/torch/serialization.py", line 419, in load
    f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'data/ChEMBL/ChEMBL.train.pt'

These are the only files in the data/ChEMBL directory:

total 304132
drwxr-xr-x 2 ama domain users      4096 Oct  9 10:54 .
drwxr-xr-x 3 ama domain users      4096 Oct  8 11:01 ..
-rw-r--r-- 1 ama domain users   6511683 Oct  8 11:01 src-test.txt
-rw-r--r-- 1 ama domain users  52028301 Oct  8 11:01 src-train
-rw-r--r-- 1 ama domain users   6502030 Oct  8 11:01 src-val
-rw-r--r-- 1 ama domain users   8071432 Oct  8 11:01 tgt-test.txt
-rw-r--r-- 1 ama domain users  64500626 Oct  8 11:01 tgt-train
-rw-r--r-- 1 ama domain users   8060092 Oct  8 11:01 tgt-val
-rw-r--r-- 1 ama domain users 146295349 Oct  9 10:54 .train.0.pt
-rw-r--r-- 1 ama domain users  18167444 Oct  9 10:54 .valid.0.pt
-rw-r--r-- 1 ama domain users      1355 Oct  9 10:54 .vocab.pt
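
One way to read this listing: the generated files all start with a dot, which suggests they were saved with an empty dataset name in front of `.train.0.pt`. The following is a minimal Python sketch of that check, assuming the shard naming `<prefix>.<train|valid>.<N>.pt` seen in the preprocessing log above. The helper name `shard_prefixes` is hypothetical and not part of SyntaLinker or OpenNMT-py.

```python
import re

# Pattern for shard names such as "ChEMBL.train.0.pt" or ".train.0.pt".
# Assumption: the naming scheme is inferred from the preprocessing log above.
SHARD_RE = re.compile(r"^(?P<prefix>.*)\.(?P<split>train|valid)\.(?P<index>\d+)\.pt$")


def shard_prefixes(filenames):
    """Return the set of dataset prefixes found among shard file names."""
    prefixes = set()
    for name in filenames:
        match = SHARD_RE.match(name)
        if match:
            prefixes.add(match.group("prefix"))
    return prefixes


# File names copied from the directory listing in this issue.
listing = [
    "src-test.txt", "src-train", "src-val",
    "tgt-test.txt", "tgt-train", "tgt-val",
    ".train.0.pt", ".valid.0.pt", ".vocab.pt",
]

found = shard_prefixes(listing)
if "" in found:
    print("Shards were saved with an empty dataset name (hidden dotfiles).")
```

For this listing the only prefix found is the empty string, so the message is printed.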

@danielkovtun

I get the same thing. In the train_single_model.py script, there is a call to this inputter function that tries to lazily load a .pt file that doesn't exist: it was never committed to the repository and is not generated by the preprocessing script.

Any help would be appreciated.
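
The failing lookup can be sketched roughly as follows. This is a simplified illustration, not the actual code in onmt/inputters/inputter.py: it assumes the loader first globs for numbered shards under the data prefix given to training and otherwise falls back to a single `<prefix>.train.pt` file, which would be consistent with the path in the traceback. The function name `candidate_files` is hypothetical.

```python
import glob
import os
import tempfile


def candidate_files(data_prefix, corpus_type):
    """Return numbered shards for the prefix, else the single-file fallback path."""
    # glob.escape guards against special characters in the directory part.
    pattern = glob.escape(data_prefix) + "." + corpus_type + ".[0-9]*.pt"
    shards = sorted(glob.glob(pattern))
    if shards:
        return shards
    # No shards matched: fall back to one unsharded file, which may not exist.
    return [data_prefix + "." + corpus_type + ".pt"]


# Recreate the situation from this issue in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    data_dir = os.path.join(root, "data", "ChEMBL")
    os.makedirs(data_dir)
    # Preprocessing wrote a shard with an empty dataset name.
    open(os.path.join(data_dir, ".train.0.pt"), "wb").close()

    # Training looks under the prefix ".../ChEMBL/ChEMBL" and finds no shards.
    wanted = candidate_files(os.path.join(data_dir, "ChEMBL"), "train")
    missing = [path for path in wanted if not os.path.exists(path)]
    print("missing:", [os.path.basename(path) for path in missing])
```

Under these assumptions, the shard `.train.0.pt` never matches the `ChEMBL.train.*` pattern, so the fallback path is the one that ends up in the `FileNotFoundError`.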

@YuYaoYang2333
Owner

I guess this error may be caused by a filename mismatch. Could you check your filenames again?

@caiyingchun

I fixed this error by adding dataset=ChEMBL to preprocess.sh.
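
Why that fix plausibly works can be illustrated with a short Python sketch. The contents of preprocess.sh are not shown in this thread, so this assumes the script builds its save prefix by appending a `dataset` variable to the data directory; `save_prefix` and `shard_name` are hypothetical helpers used only for illustration.

```python
def save_prefix(data_dir, dataset):
    """Join the data directory and dataset name into a save prefix (assumed form)."""
    return data_dir + "/" + dataset


def shard_name(prefix, corpus_type, index):
    """Build a shard file name in the form seen in the preprocessing log."""
    return "{}.{}.{}.pt".format(prefix, corpus_type, index)


# An unset shell variable expands to an empty string, giving a dotfile name.
unset_case = shard_name(save_prefix("data/ChEMBL", ""), "train", 0)
print(unset_case)

# With dataset=ChEMBL, the shard name carries the ChEMBL prefix.
fixed_case = shard_name(save_prefix("data/ChEMBL", "ChEMBL"), "train", 0)
print(fixed_case)
```

The first name matches the `data/ChEMBL/.train.0.pt` path in the log above; the second carries the `data/ChEMBL/ChEMBL` prefix that appears in the training error message.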
