DuckDB erroneously pushes incomplete results to S3 following query failure #12038

Closed
2 tasks done
onderkalaci opened this issue May 14, 2024 · 0 comments · Fixed by #12031

Contributor

onderkalaci commented May 14, 2024

What happens?

Even though a query fails, the httpfs extension pushes the incomplete results to S3.

To Reproduce

This is a simplified example where sqrt fails on the negative values produced by generate_series, but because the destructor pushes results to S3, some incomplete data is still written there.
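The failure mode described above can be pictured with a small Python analogy. This is only a sketch under stated assumptions, not DuckDB's httpfs code: `FakeUploader`, `copy_sqrt`, and the `flush_on_error` flag are hypothetical names, a dict stands in for the bucket, and a context manager's `__exit__` stands in for a C++ destructor.

```python
import math


class FakeUploader:
    """Hypothetical stand-in for a remote writer; not DuckDB's httpfs code."""

    def __init__(self, store, key, flush_on_error):
        self.store = store            # dict standing in for the bucket
        self.key = key
        self.flush_on_error = flush_on_error
        self.buffer = []

    def write(self, row):
        self.buffer.append(row)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Cleanup runs on success and on failure, much like a destructor.
        # Uploading only when exc_type is None avoids publishing partial data.
        if exc_type is None or self.flush_on_error:
            self.store[self.key] = list(self.buffer)
        return False  # never swallow the query error


def copy_sqrt(store, key, values, flush_on_error):
    with FakeUploader(store, key, flush_on_error) as out:
        for v in values:
            out.write(math.sqrt(v))  # raises ValueError for negative input


store = {}
for flag in (True, False):
    try:
        copy_sqrt(store, f"flush_on_error={flag}", range(3, -3, -1), flag)
    except ValueError:
        pass  # the "query" failed in both runs
```

After both runs, `store` holds an entry only for the run that flushed during cleanup, and that entry contains just the rows written before the error.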

For parquet:

COPY (
    SELECT sqrt(generate_series.generate_series)
    FROM generate_series(1000000, -12, -1)
) TO 's3://mybucket/tmp/ft5/ft202.parquet';
Out of Range Error: cannot take square root of a negative number

SELECT count(*) FROM
read_parquet('s3://mybucket/tmp/ft5/ft202.parquet');
Invalid Input Error: No magic bytes found at end of file 's3://mybucket/tmp/ft5/ft202.parquet'
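For context on the error above: an unencrypted Parquet file starts and ends with the 4-byte marker `PAR1`, and the trailing marker is written together with the footer, so a file cut off mid-write lacks it. A minimal sketch of that check in Python, using in-memory byte strings as hypothetical stand-ins for a finished and an interrupted upload (`has_parquet_footer` is an illustrative helper, not a DuckDB function):

```python
import io

PARQUET_MAGIC = b"PAR1"  # marker at both the start and the end of a Parquet file


def has_parquet_footer(stream) -> bool:
    """Return True if the stream ends with the Parquet magic bytes."""
    stream.seek(0, io.SEEK_END)
    if stream.tell() < len(PARQUET_MAGIC):
        return False
    stream.seek(-len(PARQUET_MAGIC), io.SEEK_END)
    return stream.read(len(PARQUET_MAGIC)) == PARQUET_MAGIC


# Placeholder payloads; real files hold row groups and a Thrift-encoded footer.
complete = io.BytesIO(b"PAR1" + b"<row groups and footer>" + b"PAR1")
truncated = io.BytesIO(b"PAR1" + b"<row groups, no footer>")

print(has_parquet_footer(complete), has_parquet_footer(truncated))
```

This only tests the trailing marker; it does not validate the footer metadata itself.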

For CSV/JSON, the partial file is readable, so the results are wrong:

D COPY (SELECT sqrt(generate_series.generate_series) FROM generate_series(20000,-20000, -1)) TO 's3://mybucket/tmp/ft5/ft17.json';
Out of Range Error: cannot take square root of a negative number
D select count(*) FROM read_json('s3://mybucket/tmp/ft5/ft17.json');
┌──────────────┐
│ count_star() │
│    int64     │
├──────────────┤
│        18432 │
└──────────────┘
D 
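One possible reading of the 18432 figure, offered as an assumption rather than something stated in this report: if rows are processed in chunks of 2048 (DuckDB's default vector size) and the whole chunk containing the first negative value is rejected, then only the full chunks of non-negative values reach the file. The arithmetic can be sketched in Python:

```python
VECTOR_SIZE = 2048  # assumption: DuckDB's default vector size

start = 20000  # first value of generate_series(20000, -20000, -1)

# Values 20000, 19999, ..., 0 are valid for sqrt; -1 is the first to fail.
non_negative_rows = start + 1                     # 20001

# Only chunks made up entirely of non-negative values succeed.
full_chunks = non_negative_rows // VECTOR_SIZE    # 9
rows_before_failure = full_chunks * VECTOR_SIZE   # 18432

print(full_chunks, rows_before_failure)
```

Under these assumptions the result matches the count returned by read_json, which is consistent with already-buffered chunks being uploaded after the error.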

OS:

Both macOS and Ubuntu 22.04.4 LTS on WSL2

DuckDB Version:

0.10.2, main

DuckDB Client:

CLI

Full Name:

Onder KALACI

Affiliation:

CrunchyData

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a source build

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have