Add filenames to parquet reading exceptions #15429
Comments
This one is tricky because it would require some refactoring: the function that actually hits the error doesn't necessarily have the filename. As a workaround you can do something like
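A minimal sketch of what such a troubleshooting pass could look like, assuming the idea is to load each input file on its own so that a failure can be tied to a single path. The helper names `find_bad_files` and `find_bad_parquet_files` are hypothetical and not part of polars; the only polars call used is `pl.scan_parquet(...).collect()`. It also assumes the failure surfaces as a regular `Exception` subclass.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Union

PathLike = Union[str, Path]


def find_bad_files(
    paths: Iterable[PathLike],
    load: Callable[[PathLike], object],
) -> Dict[str, Exception]:
    """Call `load` on each path separately and collect the paths that raise.

    Hypothetical helper: `load` is whatever reads one file, so a failure
    can be attributed to exactly one input.
    """
    failures: Dict[str, Exception] = {}
    for path in paths:
        try:
            load(path)
        except Exception as exc:  # broad on purpose: this is a diagnostic pass
            failures[str(path)] = exc
    return failures


def find_bad_parquet_files(paths: Iterable[PathLike]) -> Dict[str, Exception]:
    """Troubleshooting only: fully reads every file one at a time, so it is slow."""
    import polars as pl  # imported lazily so find_bad_files has no dependencies

    # Assumption: a malformed file raises once it is scanned and collected alone.
    return find_bad_files(paths, lambda p: pl.scan_parquet(p).collect())
```

For example, `find_bad_parquet_files(Path("data").glob("*.parquet"))` (the `data` directory is a placeholder) returns a dict mapping each failing path to the exception it raised. Because every file is read in full and one at a time, this is a diagnostic step rather than a replacement for a single multi-file `scan_parquet`.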
@deanm0000 doing that will ruin the performance though.
#10481 (expose filepath/name as a column via bulk reader methods) was recently accepted. Just linking it for reference, as that work seems like a stepping stone for this.
@deanm0000 I don't see how that's a workaround? Eg. if I have code that looks like
How does your suggestion help me identify which input parquet is at fault? @cmdlineluser I can see how that ticket would require the same groundwork.
When you do If you do @ion-elgreco I only mean it as a troubleshooting step, not as an all-the-time replacement for the normal way of using
Description
I construct large, lazy queries sourced from scan_parquet across multiple files. Sometimes the input files are malformed. Sample exceptions below. They would be much more useful if they had the filename of the dodgy parquet so that I could easily tell to which of the many input files the error pertains.