What happens?
Hi,
I am running into this error message and am unsure what to do.
_duckdb.NotImplementedException: Not implemented Error: Pushdown Filter Type BLOOM_FILTER is not currently supported in PyArrow Scans
It seems I can either disable filter pushdown through joins, or materialize the query before doing my join. I would rather not do either, since I am processing a lot of data.
Reading the source files with a non-pyarrow reader is unfortunately not an option at this time.
To Reproduce
Unfortunately, I don't have a reproducer, but I think the maintainers will be able to identify the issue from the error message anyway. It seems to me that PyArrow does support bloom filters, but the pushdown is just not implemented?
OS:
Ubuntu 22
DuckDB Package Version:
1.5.2
Python Version:
3.11
Full Name:
Jakob Stigenberg
Affiliation:
Qubos Systematic
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a stable release
Did you include all relevant data sets for reproducing the issue?
No - I cannot share the data sets because they are confidential
Did you include all code required to reproduce the issue?
Did you include all relevant configuration to reproduce the issue?