bug: InternalException raised when read_parquet #13285

Closed
2 tasks done
ncclementi opened this issue Aug 2, 2024 · 3 comments

@ncclementi

What happens?

Running the same code reported in #13120, I now encounter a new error. I'm using the latest nightly from 07/31, '1.0.1-dev3542', which I assume includes the fix from #13182.
cc: @Mytherin

To Reproduce

import duckdb 
duckdb.sql("load spatial;")

sql = """SELECT
      id,
      names.primary as primary_name,
      height,
      geometry
    FROM read_parquet('s3://overturemaps-us-west-2/release/2024-06-13-beta.1/theme=buildings/type=*/*', filename=true, hive_partitioning=1)
    WHERE primary_name IS NOT NULL
    AND bbox.xmin > -84.36
    AND bbox.xmax < -82.42
    AND bbox.ymin > 41.71
    AND bbox.ymax < 43.33;
"""

tdb = duckdb.sql(sql)
duckdb.sql("SELECT * FROM tdb LIMIT 10;")
---------------------------------------------------------------------------
InternalException                         Traceback (most recent call last)
File ~/mambaforge/envs/ibis-geo/lib/python3.11/site-packages/IPython/core/formatters.py:711, in PlainTextFormatter.__call__(self, obj)
    704 stream = StringIO()
    705 printer = pretty.RepresentationPrinter(stream, self.verbose,
    706     self.max_width, self.newline,
    707     max_seq_length=self.max_seq_length,
    708     singleton_pprinters=self.singleton_printers,
    709     type_pprinters=self.type_printers,
    710     deferred_pprinters=self.deferred_printers)
--> 711 printer.pretty(obj)
    712 printer.flush()
    713 return stream.getvalue()

File ~/mambaforge/envs/ibis-geo/lib/python3.11/site-packages/IPython/lib/pretty.py:419, in RepresentationPrinter.pretty(self, obj)
    408                         return meth(obj, self, cycle)
    409                 if (
    410                     cls is not object
    411                     # check if cls defines __repr__
   (...)
    417                     and callable(_safe_getattr(cls, "__repr__", None))
    418                 ):
--> 419                     return _repr_pprint(obj, self, cycle)
    421     return _default_pprint(obj, self, cycle)
    422 finally:

File ~/mambaforge/envs/ibis-geo/lib/python3.11/site-packages/IPython/lib/pretty.py:787, in _repr_pprint(obj, p, cycle)
    785 """A pprint that just redirects to the normal repr function."""
    786 # Find newlines and replace them with p.break_()
--> 787 output = repr(obj)
    788 lines = output.splitlines()
    789 with p.group():

InternalException: INTERNAL Error: Information loss on integer cast: value 18.943993 outside of target range [18.000000, 18.000000]
This error signals an assertion failure within DuckDB. This usually occurs due to unexpected conditions or errors in the program's logic.
For more information, see https://duckdb.org/docs/dev/internal_errors

OS:

MacOS - Sonoma 14.5

DuckDB Version:

1.0.1-dev3542

DuckDB Client:

Python

Full Name:

Naty Clementi

Affiliation:

Voltron Data

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a nightly build

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
@carlopi
Contributor

carlopi commented Aug 2, 2024

Thanks for raising this. I have seen a couple of similar problems that were fixed in the last 24 hours, so you may need to wait another 12 hours for the next set of nightlies.

@ncclementi
Author

Thanks, sounds good. I'll wait for the next nightly and report back; if it is fixed, I'll close this.
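
For anyone retesting, here is a minimal sketch of one way to check that an installed build is newer than the one reported above. It assumes version strings follow the '1.0.1-dev3542' shape quoted in this issue (a '.dev' separator is also accepted), and that a release with no dev suffix counts as newer than a dev build of the same release. The helper names parse_build and is_newer_build are hypothetical, not part of DuckDB.

```python
import re
from typing import Optional, Tuple


def parse_build(version: str) -> Tuple[Tuple[int, ...], Optional[int]]:
    """Split a string such as '1.0.1-dev3542' into ((1, 0, 1), 3542)."""
    # Assumed format: dotted release numbers, then an optional '-devN' or '.devN'.
    match = re.fullmatch(r"v?(\d+(?:\.\d+)*)(?:[-.]dev(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    dev = int(match.group(2)) if match.group(2) else None
    return release, dev


def is_newer_build(installed: str, reported: str) -> bool:
    """Return True if `installed` is strictly newer than `reported`."""
    rel_i, dev_i = parse_build(installed)
    rel_r, dev_r = parse_build(reported)
    if rel_i != rel_r:
        return rel_i > rel_r
    if dev_r is None:
        # Same release and the reported one has no dev suffix: not newer.
        return False
    if dev_i is None:
        # Assumption: no dev suffix sorts after any dev build of that release.
        return True
    return dev_i > dev_r


REPORTED = "1.0.1-dev3542"  # the build this issue was filed against
# With DuckDB installed, compare against the running build:
#   import duckdb
#   print(is_newer_build(duckdb.__version__, REPORTED))
```

If this returns False after upgrading, the environment is probably still on the old build and the reproduction above would be expected to fail the same way.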

@ncclementi
Author

Closing; I can confirm it's fixed in the later nightlies.
