Connecting to files in S3 should recognize parquet as a valid file format #32
Comments
So there is an option to override the file types in the config. I'm trying to test it out, but can you run connect_hub(hub_path, file_format = "parquet"), and if so, does it work? Agreed, though, that overriding it manually isn't ideal. I'm not sure whether always looking for parquet files to open is something we would want to allow. It moves away from the config, but off the top of my head I can't think why that would be a problem.
@annakrystalli it does work, thank you for that! (Though if you continue to have trouble using connect_hub() against these files, please let me know.)
I've been thinking about this too. The config files, validations, etc. are critical to the functioning of a hub, but I'm not as clear on how strict our tools should be on the other end. I'm gonna close this because there's already a way to use hubData with parquet files on a .csv-only hub (and it was right there in the docs, sorry for missing that!)
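For reference, the workaround discussed above might look roughly like the following. This is a minimal sketch, not a tested recipe: the bucket name is hypothetical, and it assumes the hub's S3 bucket is publicly readable and that the arrow and dplyr packages are installed alongside hubData.

```r
# Minimal sketch; the bucket name below is hypothetical.
library(hubData)

# Point at the hub's cloud storage. arrow::s3_bucket() returns a
# SubTreeFileSystem, which connect_hub() accepts as hub_path.
hub_path <- arrow::s3_bucket("example-hub-bucket")

# Override the file formats listed in admin.json and open only parquet files.
hub_con <- connect_hub(hub_path, file_format = "parquet")

# Queries are lazy; collect() pulls the model-output data into memory.
model_output <- dplyr::collect(hub_con)
```

The only part that differs from a default connection is the explicit file_format argument, which is why the hub's admin.json does not need to list parquet for this to work.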
Great!
Will do! Excited to test it out 😃
When working with a cloud-enabled hub that doesn't have parquet specified as a valid file_format in admin.json, hubData is unable to connect to the model-output files in S3. I assume that's because hubData checks files against the listed file_formats.
But for hubs on the cloud, I imagine we'd want hubData to recognize parquet files, regardless of the formats accepted by the hub during the submission process.