Adding parquet files error: Failed to create checkpoint because of error: Could not read enough bytes from file #129

@alicia-peters-dev

Description

What happens?

Hi,

I have been experiencing some issues inserting data from parquet files into duckdb. In most cases this works well, but I have a few cases where I get consistent errors that look like this:

TransactionContext Error: Failed to commit: Failed to create checkpoint because of error: Could not read enough bytes from file "/tmp/.db": attempted to read 262144 bytes from location 4869745866884263936

In this particular case I am exporting a large table from BigQuery as parquet files (5,000 files, ~125 GB in total). Somewhere around loading the 2,000th file into duckdb, an error like the one above is thrown.

In order to add the files, I have tried both of the following statements:

COPY db.{table} 
FROM '{file}' (FORMAT parquet)

and

INSERT INTO db.{table}
SELECT {select_clause} FROM read_parquet('{file}')

I've tried both v1.3.2 and v1.4.1 of the Python package, but the error still happens.

To Reproduce

N/A

OS:

linux

DuckDB Package Version:

1.3.2

Python Version:

3.9

Full Name:

Alicia Peters

Affiliation:

Benchling

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have not tested with any build

Did you include all relevant data sets for reproducing the issue?

No - Other reason (please specify in the issue body)

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration to reproduce the issue?

  • Yes, I have
