RuntimeError: INTERNAL Error: INTERNAL Error: Could not find node in column segment tree #3789

Closed
2 tasks done
koenvo opened this issue Jun 7, 2022 · 19 comments · Fixed by #8618
@koenvo

koenvo commented Jun 7, 2022

What happens?

Inserting and deleting the same rows in a transaction gives an 'INTERNAL Error' (seems related to #1091). This error does not occur when I only execute "Step 2".

To Reproduce

import duckdb

con = duckdb.connect("test.duckdb")
con.execute("""CREATE TABLE table1 (
          column1 varchar(36) NOT NULL,
          column2 varchar(250) NOT NULL PRIMARY KEY,
        );""")

# Step 1: insert data
con.execute(
    "INSERT INTO table1(column1, column2) values(?, ?)",
    [
        "bla",
        "1",
    ],
)
con.close()

con = duckdb.connect("test.duckdb")
con.begin()
# Step 2: insert and remove data in transaction
con.execute(
    "INSERT INTO table1(column1, column2) values(?, ?)",
    [
        "bla",
        "2",
    ],
)
con.execute(
    "DELETE FROM table1 WHERE column1 = ?",
    ["bla"],
)

RuntimeError: INTERNAL Error: INTERNAL Error: Could not find node in column segment tree!

Environment (please complete the following information):

  • OS: macOS Monterey M1
  • DuckDB Version: 0.3.3 / 0.3.4 / 0.3.5
  • DuckDB Client: Python

Before Submitting

  • Have you tried this on the latest master branch?
  • Python: pip install duckdb --upgrade --pre
  • R: install.packages("https://github.com/duckdb/duckdb/releases/download/master-builds/duckdb_r_src.tar.gz", repos = NULL)
  • Other Platforms: You can find binaries here or compile from source.
  • Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
@koenvo koenvo changed the title RuntimeError: INTERNAL Error: INTERNAL Error: Could not find node in column segment tree! RuntimeError: INTERNAL Error: INTERNAL Error: Could not find node in column segment tree Jun 7, 2022
@hannes
Member

hannes commented Jun 8, 2022

I can confirm that this issue appears on my installation, too. It's clearly a bug.

@Mytherin Mytherin self-assigned this Jun 10, 2022
@koenvo
Author

koenvo commented Jul 27, 2022

There also seem to be problems with UPDATE queries, but there the UPDATE query causes corrupt data rather than an error. Let me try to come up with a script to reproduce.

@koenvo
Author

koenvo commented Aug 30, 2022

Not entirely sure in which version it was fixed, but I just checked duckdb-0.4.1.dev2356 and it's working correctly for the INSERT/DELETE use case. Still getting corrupt data on UPDATE within a transaction. Will try to reproduce it in an isolated script.

@jwills
Contributor

jwills commented Apr 19, 2023

Hey all, I had a dbt-duckdb user who ran into this one on the 0.7.1 release and I created a simple script that can reproduce it locally for me:

import duckdb

con = duckdb.connect("test.duckdb")
con.execute("""CREATE TABLE table1 (
          column1 varchar(36) NOT NULL,
          column2 varchar(250) NOT NULL,
        );""")

# Step 1: insert data
con.execute(
    "INSERT INTO table1(column1, column2) values(?, ?)",
    [
        "bla",
        "1",
    ],
)
con.close()

con = duckdb.connect("test.duckdb")
con.begin()
# Step 2: insert and update data in transaction
con.execute(
    "INSERT INTO table1(column1, column2) values(?, ?)",
    [
        "bla",
        "2",
    ],
)
con.execute(
    "UPDATE table1 SET column2 = ? FROM table1 s WHERE s.column1 = ?",
    ["3", "bla"],
)
con.commit()
con.close()

yields the following:

jwills@Joshs-MBP ~ % python3 repro.py
Traceback (most recent call last):
  File "/Users/jwills/repro.py", line 29, in <module>
    con.execute(
duckdb.InternalException: INTERNAL Error: Could not find node in column segment tree!
Attempting to find row number "4611686018427388000" in 1 nodes
Node 0: Start 0, Count 1

Note that this is specific to using the UPDATE ... FROM syntax in this way (if you just do a regular UPDATE ... SET, everything works fine.) I'm going to sync up my fork of DuckDB and see if I can figure out where/how this is happening myself, but if anyone more competent than me wants to take a look, I would be grateful! 🙇

@jwills
Contributor

jwills commented Apr 19, 2023

So I think whatever is going wrong is going wrong around here: https://github.com/duckdb/duckdb/blob/master/src/storage/data_table.cpp#L1112

like, we should be trying to do the update against the data in the transaction local storage based on the value of the row_number, but we're falling back to the underlying storage index instead. 🤔

@jwills
Contributor

jwills commented Apr 19, 2023

K so I'm thinking that fixing this will require updating the Update method above to work more like the Delete method here: https://github.com/duckdb/duckdb/blob/master/src/storage/data_table.cpp#L966

We will need to iterate through the row identifiers and route the operation to either the local storage for ids[pos] > MAX_ROW_ID or the segment for the regular row ids, using blocks to hopefully minimize the amount of context switching involved.

@Mytherin does that sound right? I think I can take a crack at it, but I don't know if there are any unseen performance/transaction dragons down in here that I should be aware of. 🙇
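
The routing idea described above can be illustrated with a minimal Python sketch. This is not DuckDB's C++ code. The function name `route_row_ids` and the constant `ASSUMED_LOCAL_ROW_ID_START` are hypothetical, and the threshold value is simply copied from the row number printed in a later error report in this thread. The sketch uses `>=` so that a row id equal to the threshold counts as transaction-local; that boundary choice is an assumption.

```python
from itertools import groupby

# Assumption: value copied from an error message in this thread,
# not read from DuckDB's source.
ASSUMED_LOCAL_ROW_ID_START = 36028797018960000


def route_row_ids(ids, local_start=ASSUMED_LOCAL_ROW_ID_START):
    """Split row ids into consecutive runs tagged 'local' or 'persistent'.

    Keeping consecutive ids of the same kind together means each run can be
    handed to one storage layer in a single call.
    """
    runs = []
    for is_local, group in groupby(ids, key=lambda rid: rid >= local_start):
        runs.append(("local" if is_local else "persistent", list(group)))
    return runs


if __name__ == "__main__":
    s = ASSUMED_LOCAL_ROW_ID_START
    for target, block in route_row_ids([0, 1, s, s + 1, 2]):
        print(target, block)
```

Grouping by consecutive runs, rather than sorting all ids into two buckets, preserves the original order of the ids while still batching the work per storage layer.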

@Mytherin
Collaborator

Yes that sounds right, see this commit that made the same change for the DELETE for the same reason/bug.

@jwills
Contributor

jwills commented Apr 21, 2023

Alright perfect, thank you so much! As it so happens I found a good workaround for this use case (instead of doing the INSERT and then the UPDATE, do the UPDATE and then the INSERT!) but I will come back around to it when I have time!
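
A rough sketch of that reordering, reusing the statements from the reproduction script earlier in this thread. The helper name `update_then_insert` is hypothetical, and `con` is assumed to be a connection object that exposes `begin()`, `execute()` and `commit()`, as the Python connection in the script does. Note that rows inserted after the UPDATE are not touched by it, so this only helps when the new row already holds its final values.

```python
def update_then_insert(con, new_value="3", match_value="bla", new_row=("bla", "2")):
    """Run the UPDATE before the INSERT inside a single transaction."""
    con.begin()
    # The UPDATE runs first, so it only sees rows that were already committed.
    con.execute(
        "UPDATE table1 SET column2 = ? FROM table1 s WHERE s.column1 = ?",
        [new_value, match_value],
    )
    # The INSERT comes last; the new row is written with its final values.
    con.execute(
        "INSERT INTO table1(column1, column2) values(?, ?)",
        list(new_row),
    )
    con.commit()
```

With the script above, this would be called as `update_then_insert(duckdb.connect("test.duckdb"))` after the first row has been inserted and the original connection closed.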

@raiderrobert

raiderrobert commented Jun 21, 2023

I've run into this same issue now, I believe.

Attempting to find row number "276" in 0 nodes

Is there any fix for this on an existing database, or do I need to rebuild from source data?

@begelundmuller

I'm not sure if it's related, but I have seen the same error message in connection with a different issue: #8420 (comment)

INTERNAL Error: Could not find node in column segment tree!
Attempting to find row number "0" in 0 nodes

@nicku33
Contributor

nicku33 commented Aug 17, 2023

I believe I just had this issue even with different duckdb Connections.

@hannes
Member

hannes commented Aug 18, 2023

Managed to reproduce this with @jwills' script above, thanks!

@marhar

marhar commented Sep 12, 2023

Here is a pure sql test case.
Note that three things seem to be required to cause the error:

  • one row inserted outside of transaction
  • one row inserted inside of transaction
  • the UPDATE must have a WHERE clause

.echo on

-- works: two rows inserted outside transaction
CREATE OR REPLACE TABLE t(a int);
INSERT INTO t(a) values(1);
INSERT INTO t(a) values(2);
UPDATE t SET a = 3 where a > 0;

-- works: two rows inserted inside transaction
CREATE OR REPLACE TABLE t (a text);
BEGIN;
INSERT INTO t(a) values(4);
INSERT INTO t(a) values(5);
UPDATE t SET a = 6 where a > 0;
COMMIT;

-- fails: two rows, one inside, one outside, with where clause
CREATE OR REPLACE TABLE t (a text);
INSERT INTO t(a) values(7);
BEGIN;
INSERT INTO t(a) values(8);
UPDATE t SET a = 9;
UPDATE t SET a = 10 where a > 0;
COMMIT;

Error:

UPDATE t SET a = 10 where a > 0;
Error: near line 23: INTERNAL Error: Could not find node in column segment tree!
Attempting to find row number "36028797018960000" in 1 nodes
Node 0: Start 0, Count 1

@hannes
Member

hannes commented Sep 12, 2023

@marhar I can't reproduce this on the latest main, what version did you try? Cheers

@marhar

marhar commented Sep 13, 2023

@hannes I must have made a build mistake... I did "git pull; make" which gave me this version which exhibited the error.

v0.8.2-dev2842 6421a36e94

But when I deleted and redownloaded the repo, I can confirm everything works, and the version reports

v0.8.2-dev4376 312b995450

I will download the equivalent python and try to reproduce the original code, but from my perspective it seems the problem is fixed.

Conclusion: https://imgs.xkcd.com/comics/git.png

Cheers

@Stongtong

I ran into the same problem: "Attempting to find row number "0" in 0 node"
DuckDB version: 0.7.1
Running SQL: insert into ${table_name} select * from read_parquet(./parquet_file)
Client: Java 11
I have not been able to reproduce it yet.

@Mytherin
Collaborator

Mytherin commented Nov 1, 2023

Can you try this in the latest DuckDB version?

@Stongtong

@Mytherin Upgrading the version is hard in a production environment. However, I will still try the latest DuckDB, thanks.

@NickCrews

NickCrews commented May 3, 2024

EDIT: I filed #11924 as a separate issue, @Stongtong perhaps you want to see that issue

I just ran into this on 0.10.1. Error:

Attempting to find row number "36028797019081271" in 3 nodes
Node 0: Start 36028797018960000, Count 0
Node 1: Start 36028797019082880, Count 0
Node 2: Start 36028797019205760, Count 0

Sorry, I'm not gonna be able to reduce my flow into a simpler reproducer. But perhaps some generalities will be helpful still: The offending SQL was INSERT OR REPLACE INTO addresses SELECT <big complex thing generated from ibis>, and the relevant DDL for the destination table is

CREATE TABLE addresses(
    address__id UUID PRIMARY KEY NOT NULL,
    person__id UUID NOT NULL,
    street1 VARCHAR,
    street2 VARCHAR,
    city VARCHAR,
    state VARCHAR,
    zipcode VARCHAR,
    country VARCHAR,
    mailing_status VARCHAR,
    is_mailing BOOLEAN,
    is_voting BOOLEAN,
    latitude DOUBLE,
    longitude DOUBLE,
    last_updated DATE,
    source VARCHAR,
);

This is consistent: if I delete the whole db and run everything again, it happens 2 out of 2 times.
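
For anyone comparing these reports, here is a small Python sketch that pulls the row number and node list out of the error text and checks whether any listed node would contain that row. The helper names are hypothetical, and reading each node as the half-open range [Start, Start + Count) is an assumption about what the message means, not something confirmed in this thread.

```python
import re


def parse_segment_tree_error(message):
    """Extract the requested row number and the (start, count) pairs of the listed nodes."""
    row_match = re.search(r'row number "(\d+)"', message)
    if row_match is None:
        raise ValueError("no row number found in message")
    nodes = [
        (int(start), int(count))
        for start, count in re.findall(r"Start (\d+), Count (\d+)", message)
    ]
    return int(row_match.group(1)), nodes


def covering_nodes(row, nodes):
    """Return the indexes of nodes whose assumed range [start, start + count) contains row."""
    return [i for i, (start, count) in enumerate(nodes) if start <= row < start + count]


if __name__ == "__main__":
    # Error text as reported above for 0.10.1.
    message = (
        'Attempting to find row number "36028797019081271" in 3 nodes\n'
        "Node 0: Start 36028797018960000, Count 0\n"
        "Node 1: Start 36028797019082880, Count 0\n"
        "Node 2: Start 36028797019205760, Count 0\n"
    )
    row, nodes = parse_segment_tree_error(message)
    print(covering_nodes(row, nodes))  # prints []
```

Under that reading, every node in the 0.10.1 report has a count of 0, so no node can contain the requested row, which matches the lookup failure.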
