Segmentation fault (or various memory errors) on WAL application #5759

Closed
2 tasks done
paulewog opened this issue Dec 21, 2022 · 7 comments · Fixed by #5824

@paulewog

paulewog commented Dec 21, 2022

What happens?

When applying a WAL file from an INSERT INTO ... SELECT FROM ... GROUP BY, a variety of errors occur, including:

  • corrupted double-linked list (Linux / AMD64)
  • Segmentation fault (OSX)
  • Also the below set of logs (OSX):
duckdb(31324,0x119e43600) malloc: *** error for object 0x6: pointer being freed was not allocated
duckdb(31324,0x119e43600) malloc: *** set a breakpoint in malloc_error_break to debug
Abort trap: 6

If the group-by is removed, it does not occur.

If the inserts are broken up using LIMIT/OFFSET, with breaks in between, the crash does not occur either.

Will upload the CSV.

To Reproduce

CREATE TABLE temp AS SELECT column0, last(column1) FROM 'anon-data.csv.gz' GROUP BY column0;

OS:

OSX, Linux AMD64

DuckDB Version:

  • 0.6.1
  • Also tried latest completed build off master - a3ea165

DuckDB Client:

CLI

Full Name:

Paul Ellsworth

Affiliation:

Tenable, Inc.

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree
@paulewog
Author

Data used to reproduce:
anon-data.csv.gz

@paulewog
Author

paulewog commented Dec 21, 2022

The group-by clause seems to make the problem show up very reliably; however, it sometimes shows up even without it. For example:

$ rm -f test.db ; rm -f test.db.wal ; duckdb test.db "create table test as select * FROM 'anon-data.csv.gz';"
duckdb(40732,0x111832600) malloc: *** error for object 0x7fc4cb809404: pointer being freed was not allocated
duckdb(40732,0x111832600) malloc: *** set a breakpoint in malloc_error_break to debug
Abort trap: 6

@Mytherin
Collaborator

Thanks for the report! I can verify that this seems to be causing a crash in the FSST compression algorithm. @samansmink could you perhaps pick this up after the holidays?

@paulewog
Author

Thanks, @Mytherin :) Is there a way to bypass FSST?

@samansmink
Contributor

@paulewog you can use this pragma intended for testing:

pragma force_compression="dictionary";

This disables every compression scheme except dictionary encoding, so your storage size will increase quite a bit.

Currently you can only force a single compression scheme, or disable compression entirely using:

pragma force_compression="uncompressed";

@paulewog
Author

Ok, thank you! I'll test it out, if only to validate :)

@paulewog
Author

(I can confirm that pragma force_compression="dictionary"; does make things pass. :) )
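
For readers who want to try the same workaround, here is a minimal sketch of how the testing pragma might be combined with the reproduction query from the original report. It assumes a session opened on a persistent database file (for example `duckdb test.db`) and that the pragma is set before the table is created; the trailing CHECKPOINT is an addition for illustration, not something taken from the thread.

```sql
-- Assumption: run inside a persistent database, e.g. started with `duckdb test.db`.
-- Testing pragma from the comment above: force dictionary encoding so that
-- FSST is not selected for the string columns.
PRAGMA force_compression='dictionary';

-- Reproduction query from the original report.
CREATE TABLE temp AS
SELECT column0, last(column1)
FROM 'anon-data.csv.gz'
GROUP BY column0;

-- Added for illustration: flush the WAL into the database file so the
-- forced compression scheme is exercised immediately.
CHECKPOINT;
```

As noted above, forcing a single scheme is meant for testing and is likely to increase storage size, so this is a stopgap until the fix in #5824 is available rather than a recommended setting.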
