Failure to train the model using classy #4

Closed
cnut1648 opened this issue Jul 28, 2022 · 4 comments

@cnut1648

Hi, nice work!

I would like to reproduce your training process. I installed the dependencies and downloaded the dataset following the README, then renamed aida-train-kilt.jsonl to train.jsonl, and so on for the other files. I ran the following command from the root directory:

classy train qa data/aida -n my-model-name --profile aida-longformer-large-gam -pd extend

and got the following error:

Error executing job with overrides: ['device=cuda', 'exp_name=my-model-name', 'data.datamodule.dataset_path=data/aida']
Traceback (most recent call last):
  File "/home/ICT2000/jxu/miniconda3/envs/extend/lib/python3.8/site-packages/classy/scripts/cli/train.py", line 620, in <lambda>
    lambda cfg: _main_mock(cfg, blames=blames if args.print else None)
  File "/home/ICT2000/jxu/miniconda3/envs/extend/lib/python3.8/site-packages/classy/scripts/cli/train.py", line 208, in _main_mock
    train(cfg)
  File "/home/ICT2000/jxu/miniconda3/envs/extend/lib/python3.8/site-packages/classy/scripts/model/train.py", line 22, in train
    pl_data_module.prepare_data()
  File "/home/ICT2000/jxu/miniconda3/envs/extend/lib/python3.8/site-packages/pytorch_lightning/core/datamodule.py", line 474, in wrapped_fn
    fn(*args, **kwargs)
  File "/home/ICT2000/jxu/miniconda3/envs/extend/lib/python3.8/site-packages/classy/data/data_modules.py", line 161, in prepare_data
    shuffle_and_store_dataset(
  File "/home/ICT2000/jxu/miniconda3/envs/extend/lib/python3.8/site-packages/classy/utils/data.py", line 39, in shuffle_and_store_dataset
    samples = shuffle_dataset(dataset_path, data_driver)
  File "/home/ICT2000/jxu/miniconda3/envs/extend/lib/python3.8/site-packages/classy/utils/data.py", line 29, in shuffle_dataset
    samples = load_dataset(dataset_path, data_driver)
  File "/home/ICT2000/jxu/miniconda3/envs/extend/lib/python3.8/site-packages/classy/utils/data.py", line 22, in load_dataset
    return list(data_driver.read_from_path(dataset_path))
  File "/home/ICT2000/jxu/miniconda3/envs/extend/lib/python3.8/site-packages/classy/data/data_drivers.py", line 620, in read
    yield QASample(**json.loads(line))
TypeError: __init__() missing 2 required positional arguments: 'context' and 'question'

I see that extend/data contains its own data_drivers module, but classy still used its built-in version. Since I am new to classy, I am not sure how to proceed.
Thank you!

@edobobo
Collaborator

edobobo commented Jul 28, 2022

Hi, thanks a lot!

Actually, you should rename the files so that they have the ".aida" extension (e.g. train.aida).
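
For anyone who hits the same traceback: it shows the .jsonl files being parsed by classy's own reader in classy/data/data_drivers.py, which builds a QASample from each JSON line, presumably because the reader is picked from the file extension. Below is a minimal sketch of the rename, assuming the files were already renamed to train.jsonl and so on as described in the first comment. The helper name rename_jsonl_to_aida is made up for illustration and is not part of classy or this repository.

```python
from pathlib import Path


def rename_jsonl_to_aida(data_dir):
    """Rename every *.jsonl file in data_dir so it ends in .aida.

    Hypothetical helper for illustration only; it is not part of classy.
    Returns the list of new paths, sorted by file name.
    """
    renamed = []
    for src in sorted(Path(data_dir).glob("*.jsonl")):
        dst = src.with_suffix(".aida")
        # Never silently overwrite a file that is already in place.
        if dst.exists():
            raise FileExistsError(f"refusing to overwrite {dst}")
        src.rename(dst)
        renamed.append(dst)
    return renamed


# Example call, using the dataset directory from the command above:
# rename_jsonl_to_aida("data/aida")
```

After the rename, the same classy train command can be rerun against data/aida.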

@cnut1648
Author

Thanks! I somehow missed that 😂. I also found that I need to provide the full path to the profile; otherwise classy complains that the profile is not found.
I have another question, though: would you mind sharing how you debug this classy application? Since it is a CLI, I'm not sure how to debug it from an IDE such as PyCharm or VS Code. I asked in litus-ai/classy#87 but haven't received feedback yet. It would be a great help, since other researchers and I could then debug any issues we run into.
Thank you again!

@edobobo
Collaborator

edobobo commented Aug 3, 2022

I usually use PyCharm and select classy as the entry point in the run configuration.
Basically I set the "Script path" to be, for example "/home/edobobo/miniconda3/envs/extend/bin/classy".
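
For editors other than PyCharm, a roughly equivalent trick is a small launcher file that runs the installed script in the same process, so breakpoints set in classy or in this repository are hit. This is only a sketch, not something taken from the classy documentation: run_script_in_process is a hypothetical helper, and it assumes that the classy file in the environment's bin directory is an ordinary Python script, which is how pip usually generates console scripts on Linux.

```python
import runpy
import shutil
import sys


def run_script_in_process(script_path, argv):
    """Execute a Python script file as __main__ inside the current process.

    Hypothetical helper: script_path is the path to an installed console
    script, argv is the list of command-line arguments to hand to it.
    """
    if script_path is None:
        raise FileNotFoundError("script not found; is the environment active?")
    saved_argv = sys.argv
    # The script reads its arguments from sys.argv, as if run from a shell.
    sys.argv = [script_path, *argv]
    try:
        runpy.run_path(script_path, run_name="__main__")
    finally:
        # Restore the launcher's own arguments even if the script exits.
        sys.argv = saved_argv


# Example (assumes classy is installed in the active environment):
# run_script_in_process(
#     shutil.which("classy"),
#     ["train", "qa", "data/aida", "-n", "my-model-name",
#      "--profile", "aida-longformer-large-gam", "-pd", "extend"],
# )
```

Saving this as a file and starting it under the debugger of your choice should behave like the "Script path" setup described above. Console scripts typically finish by calling sys.exit, so a SystemExit at the end of the run is expected.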

@cnut1648
Author

cnut1648 commented Aug 4, 2022

Oh I see. Thanks for the help @edobobo !!

cnut1648 closed this as completed Aug 4, 2022
2 participants