read_parquet binary not working any more (since 0.20.0) #13099
Comments
You should pass the S3 path directly to the reader. This is an intended change. Try:
@stinodego can we raise a more informative error here? We essentially get an object we don't expect, right?
OK, thanks! Yes, that works now! It would be good, however, to at least rework some of the documentation / type hints for the parameter; see https://pola-rs.github.io/polars/py-polars/html/reference/api/polars.read_parquet.html
They are supported, but only for local filesystems. I think I actually made a mistake in rewriting the function and your input should still work. Will take a look later today or tomorrow. You can use Ritchie's workaround for now - it's the recommended usage anyway.
Checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.
Reproducible example
Log output
Issue description
The issue was introduced in Polars 0.20.0; it worked up to and including 0.19.19.
The read_parquet function supposedly works (and did work) with binary input (from reading S3 data).
Now the normalize_filepath function is called on the binary input (which does not make sense). Also, read_parquet, which by its signature supports BinaryIO, internally calls scan_parquet, which does not (any more).
Expected behavior
Reading Parquet data given as binary content works.
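To illustrate the expected dispatch, here is a minimal sketch of how a reader could separate path-like sources from binary ones, so that path normalization is never applied to binary input and unexpected objects get a clear error. This is not Polars' actual implementation: prepare_source is a hypothetical helper, and os.path.expanduser merely stands in for whatever normalize_filepath does.

```python
import io
import os
from pathlib import Path


def prepare_source(source):
    """Hypothetical helper, not part of Polars.

    Returns a normalized path string for path-like input, or a binary
    file object for in-memory input.
    """
    # Only path-like input is normalized (stand-in for normalize_filepath).
    if isinstance(source, (str, Path)):
        return os.path.expanduser(str(source))
    # Raw bytes are wrapped so downstream code can treat them as a file.
    if isinstance(source, (bytes, bytearray)):
        return io.BytesIO(source)
    # File-like objects (anything with .read) are passed through untouched.
    if hasattr(source, "read"):
        return source
    # Anything else gets an informative error instead of a confusing failure.
    raise TypeError(f"unsupported source type: {type(source).__name__}")
```

With a check of this shape, a path string would still go through normalization, while a BytesIO buffer holding downloaded Parquet bytes would reach the reader unchanged.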
Installed versions