AttributeError while using DataSilo._load_data for loading data from dict #373
Comments
Hey @lucalila, thanks for using FARM. Let's figure this out, but I need more information, since this is a unique use case I haven't seen before. Could you describe the need for parsing the data as dicts inside the data_silo with
Hi @Timoeller, thank you for the quick reply! Sure, you're right, I can use the file option as well. Currently, I'm preprocessing my file(s), and the idea was to pass the data straight into a data silo instead of saving the preprocessed file and loading it again - just for convenience. To do so, I came across the
Hey @lucalila, you are totally right: data_silo._load_data(train_dicts=dicts) should work on dicts as well. But there is a small issue with the naming of the keys inside the dict. Thanks for pointing out the missing label tensor name, by the way. That's how I realized that the name conversion done through the "label_column_name" parameter of the TextClassificationProcessor is not applied when the data silo loads dicts directly. So you have to name your label key "text_classification_label".
Could you report back if that works for your case?
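The key naming described in this comment can be sketched in plain Python. This is only an illustration, not FARM code: the helper `rename_label_key`, the source key `"label"`, and the sample texts are hypothetical, and the `"text"` key is an assumption about what the processor expects. Only the target key `"text_classification_label"` comes from the thread.

```python
from typing import Any, Dict, List

# Target key named in the comment above; everything else here is illustrative.
FARM_LABEL_KEY = "text_classification_label"


def rename_label_key(
    dicts: List[Dict[str, Any]],
    old_key: str,
    new_key: str = FARM_LABEL_KEY,
) -> List[Dict[str, Any]]:
    """Return copies of the input dicts with `old_key` renamed to `new_key`.

    The input dicts are left unmodified.
    """
    renamed = []
    for index, sample in enumerate(dicts):
        if old_key not in sample:
            raise KeyError(f"sample {index} has no key {old_key!r}")
        # Copy every field except the old label key, then add the new one.
        item = {k: v for k, v in sample.items() if k != old_key}
        item[new_key] = sample[old_key]
        renamed.append(item)
    return renamed


# Hypothetical preprocessed samples; "text" is assumed to be the text key.
raw_dicts = [
    {"text": "first example document", "label": "sports"},
    {"text": "second example document", "label": "politics"},
]

train_dicts = rename_label_key(raw_dicts, old_key="label")
```

The resulting `train_dicts` could then be handed to `data_silo._load_data(train_dicts=train_dicts)`, the call discussed in this thread.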
Nice, this is working for me! Thanks a lot!!
Sorry, not sure what I did this morning, but with the same config as above and the new label key, I'm here now:
Maybe you can help one more time?
Describe the bug
I'm using the multilabel text classification prediction head to train on a custom dataset. Earlier, when using the FARM example code with my data, everything worked fine. Now, when trying to re-train in a new environment with the same dataset, I'm facing the following issue:
Error message
To Reproduce
What I changed is basically the following: I'm no longer loading my dataset from a local dir; instead, I'm passing a dict, based on the suggestion in #127.
Any ideas what I'm missing? Thanks for any help.
Additional context
Seems like my data_silo is missing some information. E.g., in the first version I had:
When loading from dict, I've got:
System: