INTERNAL Error: Attempted to access index N within vector of size N #10950
Comments
Can you edit the repro to:
SELECT t1.Transaction_ID
FROM transactions t1
WHERE t1.Transaction_ID IN
    (SELECT t2.Referred_Transaction_ID
     FROM transactions t2
     WHERE t2.Transaction_ID IN (123606, 123602, 131522, 123604, 131470)
       AND t2.Transaction_ID NOT IN (SELECT t2_filter.Transaction_ID FROM transactions t2_filter))
Thanks for the report |
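For readers wondering what the query above should return when no bug is involved, here is a minimal sketch that runs the same SQL against Python's built-in sqlite3 module as a stand-in engine. The table layout and the sample rows are assumptions made up for illustration; they are not taken from bug.db. Because the innermost NOT IN subquery covers every Transaction_ID in the table, the filter rejects every row and the expected result is empty.

```python
import sqlite3

# The query from the thread, unchanged apart from indentation.
QUERY = """
SELECT t1.Transaction_ID
FROM transactions t1
WHERE t1.Transaction_ID IN
    (SELECT t2.Referred_Transaction_ID
     FROM transactions t2
     WHERE t2.Transaction_ID IN (123606, 123602, 131522, 123604, 131470)
       AND t2.Transaction_ID NOT IN (SELECT t2_filter.Transaction_ID FROM transactions t2_filter))
"""


def run_repro_shape():
    """Run the query on a tiny, hypothetical table and return the rows."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: only the two columns the query references.
    conn.execute(
        "CREATE TABLE transactions ("
        "Transaction_ID INTEGER, Referred_Transaction_ID INTEGER)"
    )
    # Made-up sample data, not the contents of the attached database.
    conn.executemany(
        "INSERT INTO transactions VALUES (?, ?)",
        [
            (123606, 123602),
            (123602, None),
            (131522, 123604),
            (123604, 131470),
            (131470, None),
        ],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


rows = run_repro_shape()
print(rows)
```

A correct engine returns no rows here, so an INTERNAL Error on this query shape points at the engine rather than at the data.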
I get this error randomly when querying CSV files (using
The pattern matches two files. Although I can load the files without issue:
|
We rewrote our query to use an |
I ran into |
UPDATE: I have a reproducer for Python! Hopefully this helps to identify a root cause! This reproducer demonstrates one flavor of this issue on both v0.9.2 and v0.10.1 as well as the latest pre-release version (v0.10.2.dev213).
import duckdb

conn = duckdb.connect()
rel: duckdb.DuckDBPyRelation = conn.read_csv(
    "/tmp/youtube_videos_sm.csv",
    header=True,
    null_padding=True,
)
rel.show()
Result:
Shrinking the file further (by removing rows) seems to cause a different CSV dialect to be sniffed (i.e different delimiter or quote character or escape character), and the |
Hi @tboddyspargo, I've managed to fix the issue the author had, but could not reproduce your issue. I also looked at your CSV file and there seem to be 2 types of separators, both |
You're right! I didn't notice it before, but there's inconsistency with quoting and escaping quotes in the
It seems I may have too aggressively truncated the file without re-testing on other versions. Here are the results of my repro with youtube_videos_sm.csv:
With the issue (presumably in the sniffer) addressed after 0.9.2, I think the |
Hey @tboddyspargo, I just encountered a similar case. Original error: "Attempted to access index 1 within vector of size 1", possibly due to long strings and formatting. I seem to be blocked on this. Do you know what I should do to the file to resolve it? Any help would be really appreciated, thank you. |
Based on this issue, you may just want to try with the latest pre-release version to start with. If that doesn't address the issue, then sharing a reproducer that fails on the latest pre-release would help the maintainers to effectively triage and investigate. |
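Since the thread traces one flavor of this error to inconsistent separators and quoting in a CSV file, a rough pre-check of the file can help when preparing a reproducer. The sketch below uses only Python's standard csv module. It is not DuckDB's sniffer, and the helper name `field_count_outliers` and the sample text are hypothetical. It reports records whose field count differs from the most common one under an assumed delimiter and quote character.

```python
import csv
import io
from collections import Counter


def field_count_outliers(text, delimiter=",", quotechar='"'):
    """Return (most common field count, [(record_no, field_count), ...]).

    Records are numbered from 1 in the order the csv reader yields them;
    blank lines are counted but never reported as outliers.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
    counts = [(no, len(row)) for no, row in enumerate(reader, start=1) if row]
    if not counts:
        return None, []
    expected = Counter(n for _, n in counts).most_common(1)[0][0]
    outliers = [(no, n) for no, n in counts if n != expected]
    return expected, outliers


# Made-up sample: the third record uses ';' instead of ','.
sample = (
    "id,title,tags\n"
    '1,"Hello, world",a\n'
    "2;Other title;b\n"
    '3,"Third",c\n'
)
expected, outliers = field_count_outliers(sample)
print(expected, outliers)
```

Records flagged this way are a reasonable place to look for mixed delimiters or unbalanced quotes before sharing the file with the maintainers.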
What happens?
Running the given query causes
INTERNAL Error: Attempted to access index 3 within vector of size 3
after which the database produces the error 'Error: FATAL Error: Failed: database has been invalidated because of a previous fatal error. The database must be restarted prior to being used again.'
To Reproduce
Run the following query.
That query is a very stripped-down version of the query which caused the error for me. So although it looks odd and pointless, it still produces the INTERNAL ERROR.
Use the following database:
bug.db.zip
Just to note something odd: If you export as CSV from that database, then reimport into a new database, the query will succeed.
OS:
x64, aarch64
DuckDB Version:
0.9.2, 0.10.0, nightly on 2024-03-01
DuckDB Client:
Java, CLI
Full Name:
David Corcoran
Affiliation:
Topaz (https://topaz.technology/)
We use DuckDB as an in-memory pivoting database. We migrated to it from MonetDB, which we originally used for the same job.
Have you tried this on the latest nightly build?
I have tested with a nightly build
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?