
Failed to load metadata pointer when connecting #11814

Open
jovanglig opened this issue Apr 24, 2024 · 1 comment
jovanglig commented Apr 24, 2024

What happens?

I get the following message when trying to connect to a persisted duckdb file through Python:

InternalException: INTERNAL Error: Failed to load metadata pointer (id 189, idx 48, ptr 3458764513820541117)

I believe the file is corrupted. I had a Docker container running Mage.ai that was importing data from an API into DuckDB, and Mage.ai cancelled the job as it approached the container's maximum allocated memory, mid-import.
Perhaps I should have used transactions?

Has anyone had this issue? I had a similar issue before where some data was corrupted after ingestion, but I could still connect regardless.

To Reproduce

con = duckdb.connect(database='../mage-arkham/arkham_database.duckdb', read_only=False)

OS:

macOS Sonoma 14.4.1

DuckDB Version:

0.10.2

DuckDB Client:

Python

Full Name:

Jovan Gligorevic

Affiliation:

Bitpanda

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

No - I cannot share the data sets because they are confidential

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
@kannenberg

Hi guys,

We also encountered the same error in a process mining application that handles a large volume of events.

In our case, the application ingests data online. As the data is updated at the source (ERPs and other applications), it is sent to the database via a pipeline.

This means read and write operations happen concurrently (though not at exactly the same moment, since DuckDB restricts a database file to either one read-write connection or multiple read-only connections).

After several days of normal operation, we received this same error.

There was nothing we could do; it seems the DuckDB file was indeed corrupted. We had to recreate the database.

We are also using version 0.10.2.

It is a much more stable and complete version than the previous ones, but this problem is new to us.

Is there anything we can do to mitigate this problem?

Thanks for your attention.
