
Segfault after large INSERT #9584

Closed · 1 task done
tiwo opened this issue Nov 6, 2023 · 1 comment · Fixed by #9640

tiwo commented Nov 6, 2023

What happens?

Inserting rows from a collection of Parquet files with INSERT INTO ... (SELECT ... FROM read_parquet(...)) fails.

After the progress bar reads 100%, the statement crashes with a segmentation fault in RowVersionManager::CommitAppend.

Expected behaviour is that the statement succeeds, or fails with an error message.


DuckDB v0.9.1, the cldebug build, running under gdb:
(gdb) run occ.db
Starting program: /ddsf/duckdb occ.db
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
v0.9.1 401c8061c6
Enter ".help" for usage hints.
D
D SELECT COUNT(1) from read_parquet('data/*');
[New Thread 0x7ffff73ff640 (LWP 102197)]
[New Thread 0x7ffff6bfe640 (LWP 102198)]
[New Thread 0x7ffff63fd640 (LWP 102199)]
[New Thread 0x7ffff5bfc640 (LWP 102200)]
[New Thread 0x7ffff53fb640 (LWP 102201)]
[New Thread 0x7ffff4bfa640 (LWP 102202)]
[New Thread 0x7ffff43f9640 (LWP 102203)]
100% ▕████████████████████████████████████████████████████████████▏
┌────────────┐
│  count(1)  │
│   int64    │
├────────────┤
│ 2613427383 │
└────────────┘
D
D create table "occ"("i" int8, "tax" int8, "lon" double, "lat" double);
D insert into "occ" by name (select "gbifid" "i", "specieskey" "tax", "decimallatitude" "lat", "decimallongitude" "lon" from read_parquet('data/*') where decimallatitude is not null and decimallongitude is not null and specieskey>0);
100% ▕████████████████████████████████████████████████████████████▏

Thread 1 "duckdb" received signal SIGSEGV, Segmentation fault.
0x0000555557f8b79a in duckdb::RowVersionManager::CommitAppend (this=0x55555b33c810, commit_id=72, row_group_start=0, count=0) at /home/gbsim/duckdb/src/storage/table/row_version_manager.cpp:124
124                     info->CommitAppend(commit_id, vstart, vend);
(gdb) bt
#0  0x0000555557f8b79a in duckdb::RowVersionManager::CommitAppend (this=0x55555b33c810, commit_id=72,
    row_group_start=0, count=0) at /home/gbsim/duckdb/src/storage/table/row_version_manager.cpp:124
#1  0x0000555557f83e2a in duckdb::RowGroup::CommitAppend (this=0x7fffd741c300, commit_id=72, row_group_start=0,
    count=0) at /home/gbsim/duckdb/src/storage/table/row_group.cpp:641
#2  0x0000555557f87f98 in duckdb::RowGroupCollection::CommitAppend (this=0x55555af15780, commit_id=72, row_start=0,
    count=2366250032) at /home/gbsim/duckdb/src/storage/table/row_group_collection.cpp:419
#3  0x0000555557ffa048 in duckdb::DataTable::CommitAppend (this=0x55555aee46b0, commit_id=72, row_start=0,
    count=2366250032) at /home/gbsim/duckdb/src/storage/data_table.cpp:819
#4  0x0000555558050a6a in duckdb::CommitState::CommitEntry<true> (this=0x7fffffff7d50,
    type=duckdb::UndoFlags::INSERT_TUPLE, data=0x7ffff7fb5008 "\260F\356ZUU")
    at /home/gbsim/duckdb/src/transaction/commit_state.cpp:274
#5  0x000055555804c4b1 in operator() (__closure=0x7fffffff7d48, type=duckdb::UndoFlags::INSERT_TUPLE,
    data=0x7ffff7fb5008 "\260F\356ZUU") at /home/gbsim/duckdb/src/transaction/undo_buffer.cpp:142
#6  0x000055555804f65e in duckdb::UndoBuffer::IterateEntries<duckdb::UndoBuffer::Commit(duckdb::UndoBuffer::IteratorState&, duckdb::optional_ptr<duckdb::WriteAheadLog>, duckdb::transaction_t)::<lambda(duckdb::UndoFlags, duckdb::data_ptr_t)> >(duckdb::UndoBuffer::IteratorState &, struct {...} &&) (this=0x55555aeda0e0, state=..., callback=...)
    at /home/gbsim/duckdb/src/transaction/undo_buffer.cpp:45
#7  0x000055555804c594 in duckdb::UndoBuffer::Commit (this=0x55555aeda0e0, iterator_state=..., log=..., commit_id=72)
    at /home/gbsim/duckdb/src/transaction/undo_buffer.cpp:142
#8  0x000055555804b485 in duckdb::DuckTransaction::Commit[abi:cxx11](duckdb::AttachedDatabase&, unsigned long, bool) (
    this=0x55555aeda060, db=..., commit_id=72, checkpoint=true)
    at /home/gbsim/duckdb/src/transaction/duck_transaction.cpp:133
#9  0x000055555804a2c4 in duckdb::DuckTransactionManager::CommitTransaction[abi:cxx11](duckdb::ClientContext&, duckdb::Transaction*) (this=0x55555ae095d0, context=..., transaction_p=0x55555aeda060)
    at /home/gbsim/duckdb/src/transaction/duck_transaction_manager.cpp:215
#10 0x000055555804bbf6 in duckdb::MetaTransaction::Commit[abi:cxx11]() (this=0x55555aef0800)
    at /home/gbsim/duckdb/src/transaction/meta_transaction.cpp:60
#11 0x000055555804cc19 in duckdb::TransactionContext::Commit (this=0x55555ae09880)
    at /home/gbsim/duckdb/src/transaction/transaction_context.cpp:48
#12 0x0000555557ce2a1e in duckdb::ClientContext::EndQueryInternal (this=0x55555ae09690, lock=..., success=true,
    invalidate_transaction=false) at /home/gbsim/duckdb/src/main/client_context.cpp:182
#13 0x0000555557ce2fe0 in duckdb::ClientContext::CleanupInternal (this=0x55555ae09690, lock=...,
    result=0x55555afa19a0, invalidate_transaction=false) at /home/gbsim/duckdb/src/main/client_context.cpp:216
#14 0x0000555557ce1e6e in duckdb::ClientContext::FetchInternal (this=0x55555ae09690, lock=..., executor=...,
    result=...) at /home/gbsim/duckdb/src/main/client_context.cpp:101
#15 0x0000555557ce36d6 in duckdb::ClientContext::FetchResultInternal (this=0x55555ae09690, lock=..., pending=...)
    at /home/gbsim/duckdb/src/main/client_context.cpp:272
#16 0x0000555557cf9011 in duckdb::PendingQueryResult::ExecuteInternal (this=0x55555ae7f280, lock=...)
    at /home/gbsim/duckdb/src/main/pending_query_result.cpp:64
#17 0x0000555557cf90d3 in duckdb::PendingQueryResult::Execute (this=0x55555ae7f280)
    at /home/gbsim/duckdb/src/main/pending_query_result.cpp:71
#18 0x0000555557cf9a1a in duckdb::PreparedStatement::Execute (this=0x55555af166e0, values=...,
    allow_stream_result=false) at /home/gbsim/duckdb/src/main/prepared_statement.cpp:85
#19 0x00005555576a8c8d in duckdb_shell_sqlite3_print_duckbox (pStmt=0x55555ae80f20, max_rows=40, max_width=0,
    null_value=0x7fffffffce54 "", columnar=0)
    at /home/gbsim/duckdb/tools/sqlite3_api_wrapper/sqlite3_api_wrapper.cpp:244
#20 0x000055555768b2fa in exec_prepared_stmt (pArg=0x7fffffffcd40, pStmt=0x55555ae80f20)
    at /home/gbsim/duckdb/tools/shell/shell.c:12901
#21 0x000055555768c146 in shell_exec (pArg=0x7fffffffcd40,
    zSql=0x55555ae9f120 "insert into \"occ\" by name (select \"gbifid\" \"i\", \"specieskey\" \"tax\", \"decimallatitude\" \"lat\", \"decimallongitude\" \"lon\" from read_parquet('data/*') where decimallatitude is not null and decimallongitude"..., pzErrMsg=0x7fffffffcbb8) at /home/gbsim/duckdb/tools/shell/shell.c:13236
#22 0x0000555557699af3 in runOneSqlLine (p=0x7fffffffcd40,
    zSql=0x55555ae9f120 "insert into \"occ\" by name (select \"gbifid\" \"i\", \"specieskey\" \"tax\", \"decimallatitude\" \"lat\", \"decimallongitude\" \"lon\" from read_parquet('data/*') where decimallatitude is not null and decimallongitude"..., in=0x0, startline=4) at /home/gbsim/duckdb/tools/shell/shell.c:19654
#23 0x000055555769a09f in process_input (p=0x7fffffffcd40) at /home/gbsim/duckdb/tools/shell/shell.c:19772
#24 0x000055555769bf4d in main (argc=2, argv=0x7fffffffe078) at /home/gbsim/duckdb/tools/shell/shell.c:20585

To Reproduce

I have not succeeded in reproducing this with synthetic parquet data, or by selecting just one of the columns, but I can reproduce the issue as follows:

I'm using DuckDB 0.9.1 built from the git repository (git checkout v0.9.1), running on an x86_64 virtual server with 32 GB RAM and 8 virtual cores.

  • wget https://gbif-open-data-eu-central-1.s3.amazonaws.com/occurrence/2023-11-01/occurrence.parquet/{000000..002459} to download the dataset to a local directory; this presumes bash-like brace expansion to download 2460 parquet files totaling 2613427383 ≈ 1.2 · 2^31 rows, a 210 GB download. The resulting DuckDB database, after discarding most columns, would be some 55 GB.

  • Create a table: CREATE TABLE "occ"("i" INT8, "tax" INT8, "lon" DOUBLE, "lat" DOUBLE);

  • (Now, SELECT COUNT(1) from read_parquet('data/*'); should count 2613427383 rows.)

  • The following statement triggers the segmentation fault, after reading to 100%:

INSERT INTO "occ" BY NAME (
    SELECT "gbifid" "i", "specieskey" "tax", "decimallatitude" "lat", "decimallongitude" "lon" from read_parquet('data/*')
    WHERE "decimallatitude" IS NOT NULL AND "decimallongitude" IS NOT NULL AND "specieskey">0
);

Things I've experimented with (a sketch of the two successful workarounds follows this list):

  • SET memory_limit='250MB': the same segmentation fault occurs.
  • ✔ The same INSERT statement, repeated (from a bash loop) for each of the parquet files, succeeds.
  • ✔ SET threads TO 1: the statement succeeds.
  • ✔ Extracting only one column (the ID) to a one-column table succeeds.
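
For reference, a minimal sketch of the two successful workarounds (assumptions: 'data/000000' stands in for any one of the downloaded parquet files, and the per-file variant is driven by a shell loop over all 2460 files):

-- Workaround 1: force single-threaded execution, then run the original INSERT.
SET threads TO 1;
INSERT INTO "occ" BY NAME (
    SELECT "gbifid" "i", "specieskey" "tax", "decimallatitude" "lat", "decimallongitude" "lon" FROM read_parquet('data/*')
    WHERE "decimallatitude" IS NOT NULL AND "decimallongitude" IS NOT NULL AND "specieskey">0
);

-- Workaround 2: insert one parquet file at a time.
INSERT INTO "occ" BY NAME (
    SELECT "gbifid" "i", "specieskey" "tax", "decimallatitude" "lat", "decimallongitude" "lon" FROM read_parquet('data/000000')
    WHERE "decimallatitude" IS NOT NULL AND "decimallongitude" IS NOT NULL AND "specieskey">0
);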

OS:

Ubuntu 22.04 on an x86_64 virtual server.

DuckDB Version:

git checkout v0.9.1 and v0.9.2-dev256 2e6a379

DuckDB Client:

included DuckDB shell

Full Name:

Thomas Wissen

Affiliation:

none

Have you tried this on the latest main branch?

I have tested with a main build

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • Yes, I have
Mytherin (Collaborator) commented:

Thanks for the very detailed report! The issue happens because of the varying row group sizes in the many parquet files combined with the order-preserving parallel insertion triggering an edge case. I've pushed a fix for the issue in #9640.
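
In the meantime, a possible way to sidestep the problem (untested on this dataset, and assuming the edge case indeed requires the order-preserving parallel insertion path) is to disable insertion-order preservation for the session before running the INSERT:

-- Trade away insertion order to avoid the order-preserving parallel append path.
SET preserve_insertion_order = false;
-- ...then re-run the INSERT ... BY NAME statement from the report.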

Mytherin added a commit that referenced this issue Nov 11, 2023
Fix #9584: Correctly bail-out on committing appends of 0 rows, which can happen in certain edge cases