Loading Tox21 through the dataloader. #25

Closed
cmvcordova opened this issue Mar 9, 2022 · 1 comment

cmvcordova commented Mar 9, 2022

Hey guys,

I was trying to load the Tox21 dataset through the get_data_loaders() function with the following call:

Tox21_dataloader, _, _ = get_data_loaders(featurizer, batch_size=32, task_name='Tox', dataset_name='Tox21')

That call, however, raised the following error:

ValueError: Please select a label name. You can use tdc.utils.retrieve_label_name_list('tox21') to retrieve all available label names.
In call to configurable 'data' (<function get_data_split at 0x7fb3f49b3e50>)

When I added the assay_name parameter to the call, as hinted by tox21-nr-ar.gin, like this:

Tox21_dataloader, _, _ = get_data_loaders(featurizer, batch_size=32, task_name='Tox', dataset_name='Tox21', assay_name='NR-AR')

I got the following error:

TypeError: get_data_loaders() got an unexpected keyword argument 'assay_name'

When I inspected the signature of get_data_loaders() with inspect.getfullargspec(get_data_loaders), from Python's inspect module, I got:

FullArgSpec(args=['featurizer'], varargs=None, varkw=None, defaults=None, kwonlyargs=['batch_size', 'num_workers', 'cache_encodings', 'task_name', 'dataset_name'], kwonlydefaults={'num_workers': 0, 'cache_encodings': False, 'task_name': None, 'dataset_name': None}, annotations={'return': typing.Tuple[torch.utils.data.dataloader.DataLoader, torch.utils.data.dataloader.DataLoader, torch.utils.data.dataloader.DataLoader], 'featurizer': <class 'src.huggingmolecules.featurization.featurization_api.PretrainedFeaturizerMixin'>, 'batch_size': <class 'int'>, 'num_workers': <class 'int'>, 'cache_encodings': <class 'bool'>, 'task_name': <class 'str'>, 'dataset_name': <class 'str'>})

Since varkw is None, the function doesn't seem to have any way to accept additional parameters in the call.
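The same check can be reproduced without the library installed. The sketch below is only an illustration: load_stub and accepts_extra_kwargs are hypothetical names, and load_stub merely mimics the keyword-only signature reported above, not the real get_data_loaders() implementation.

```python
import inspect


def load_stub(featurizer, *, batch_size, num_workers=0,
              cache_encodings=False, task_name=None, dataset_name=None):
    """Hypothetical stand-in that mirrors the reported keyword-only signature."""
    return task_name, dataset_name


def accepts_extra_kwargs(func):
    """Return True if func declares a **kwargs catch-all parameter."""
    return inspect.getfullargspec(func).varkw is not None


spec = inspect.getfullargspec(load_stub)
print(spec.kwonlyargs)                  # the keyword-only parameter names
print(accepts_extra_kwargs(load_stub))  # no **kwargs, so extras are rejected

try:
    # An undeclared keyword argument fails before the function body runs.
    load_stub(None, batch_size=32, assay_name='NR-AR')
except TypeError as err:
    print(err)
```

A function without a **kwargs parameter rejects any keyword it does not declare, which matches the TypeError raised for assay_name.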

Maybe I'm overthinking this. Could I get a few pointers on how to successfully load the NR-AR Tox21 dataset with a base get_data_loaders() call?

Regards,

César Miguel

@panpiort8 (Collaborator)

Sorry for the delay :) You are right: kwargs are missing. I'm adding them in the above MR.
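For readers wondering what "adding kwargs" could look like, here is a minimal sketch of keyword forwarding. It is not the code from the MR: get_data_loaders_stub and get_data_split_stub are hypothetical stand-ins, and the assumption that the extra arguments are passed on to the split function is inferred only from the error messages quoted in this issue.

```python
def get_data_split_stub(task_name, dataset_name, assay_name=None):
    """Hypothetical stand-in for the split step; Tox21 needs a label name."""
    if dataset_name == 'Tox21' and assay_name is None:
        raise ValueError('Please select a label name.')
    return {'task': task_name, 'dataset': dataset_name, 'assay': assay_name}


def get_data_loaders_stub(featurizer, *, batch_size, task_name=None,
                          dataset_name=None, **kwargs):
    """Hypothetical loader that forwards unknown keyword arguments downstream."""
    # Anything the loader does not consume itself (e.g. assay_name)
    # is handed to the split function unchanged.
    split = get_data_split_stub(task_name, dataset_name, **kwargs)
    return split, batch_size


split, _ = get_data_loaders_stub(None, batch_size=32, task_name='Tox',
                                 dataset_name='Tox21', assay_name='NR-AR')
print(split['assay'])
```

With a **kwargs catch-all in place, the original call with assay_name no longer fails at the loader's signature, and the label name reaches the function that needs it.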
