
lmdb database corrupted after mdb_dump / mdb_load #8873

Closed
peterthomassen opened this issue Feb 27, 2020 · 7 comments


  • Program: Authoritative
  • Issue type: Bug report

Short description

When dumping and restoring the lmdb database with the commands below, the database ends up at least partly corrupted: I still get correct DNS responses, but adding a zone via the API fails.

Environment

  • Operating system: Ubuntu bionic
  • Software version: 4.2.1
  • Software source: PowerDNS repository

Steps to reproduce

  1. Provision some zones (a few thousand here; did not investigate the exact precondition)
  2. stop pdns
  3. In storage directory: shopt -s extglob; for file in pdns.lmdb pdns.lmdb-+([0-9]); do echo $file; mdb_dump -f /backup/$file.dump -n -a $file; done
  4. Purge storage, then: for file in /backup/*.dump; do echo $file; mdb_load -f $file -n $(basename $file .dump); done
  5. start pdns
  6. provision a zone using API (tested slave zone here)
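For reference, steps 3 and 4 can be collected into a single sketch. The storage and backup paths are assumptions, and the script sticks to POSIX sh, so the extglob pattern from step 3 is approximated with a plain glob:

```shell
#!/bin/sh
# Sketch only: paths are assumptions; adjust to your deployment.
set -eu

storage="${PDNS_LMDB_DIR:-/var/lib/powerdns}"   # assumed pdns storage dir
backup="${PDNS_LMDB_BACKUP:-/backup}"           # assumed backup dir

dump_all() {
    cd "$storage"
    # main data file plus the numbered shard files (pdns.lmdb-0, pdns.lmdb-1, ...)
    for file in pdns.lmdb pdns.lmdb-[0-9]*; do
        [ -e "$file" ] || continue
        echo "dumping $file"
        mdb_dump -n -a -f "$backup/$file.dump" "$file"
    done
}

restore_all() {
    # run this only after purging the old files from $storage
    cd "$storage"
    for dump in "$backup"/*.dump; do
        [ -e "$dump" ] || continue
        file=$(basename "$dump" .dump)
        echo "loading $file"
        mdb_load -n -f "$dump" "$file"
    done
}

case "${1:-}" in
    dump)    dump_all ;;
    restore) restore_all ;;
    *)       echo "usage: $0 dump|restore" ;;
esac
```

Step 4's "purge storage" is deliberately left manual here; deleting the live database files is not something a sketch should do unattended.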

Expected behaviour

regular operation

Actual behaviour

Feb 27 14:56:17 ubuntu [webserver] 36858d61-afa4-44a9-83e7-ba9ab22b29f2 HTTP ISE for "/api/v1/servers/localhost/zones": STL Exception: putting data: MDB_KEYEXIST: Key/data pair already exists
Feb 27 14:56:17 ubuntu [webserver] 36858d61-afa4-44a9-83e7-ba9ab22b29f2 10.16.3.4:52338 "POST /api/v1/servers/localhost/zones HTTP/1.1" 500 167

Other information

This post by lmdb developer @hyc says that files should be binary compatible as long as you stay on Intel/AMD (little-endian). Word size does not appear to matter anymore. So, directly using the backup should work.


hyc commented Feb 27, 2020

The post you referenced is from a Monero-specific forum and is specific to the way Monero builds LMDB. It does not apply to LMDB in general. By default, 32bit and 64bit LMDB files are not compatible.

There was a bug in mdb_dump/load dealing with backslash-escaped characters, fixed in the recent (January 30) release, perhaps you've encountered that. Also, if your DBs use custom sort functions, mdb_load may be changing the sort order on you. The fix for this is in mdb.master but not yet in a public 0.9 release.


Habbie commented Feb 27, 2020

Hi @hyc, thanks for dropping by!

A quick grep for mdb_set_compare shows nothing, so my other suspect was DUPSORT, but as you don't mention that, my money is on the backslashes.

The post you referenced is from a Monero-specific forum and is specific to [..] Monero [..]

Ah, I thought it would be something like that, I posted something along those lines at desec-io/desec-ns#27 (comment)


hyc commented Feb 28, 2020

The obvious thing to do is take a DB, dump/load it, dump/load that, and do a binary diff across all 3 results. The first two will probably be different if the original DB has been subject to interleaved adds and deletes, but the second two ought to be identical.
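A sketch of that check, assuming a single-file database (the -n layout used above) named pdns.lmdb and mdb_dump/mdb_load on PATH; it skips gracefully when either is missing:

```shell
#!/bin/sh
# Round-trip stability check: dump src, load into db2, dump db2,
# load into db3, then binary-compare db2 and db3. src vs db2 may
# legitimately differ if src saw interleaved adds and deletes.
set -eu
src="pdns.lmdb"            # assumed source database file
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

if command -v mdb_dump >/dev/null && command -v mdb_load >/dev/null \
        && [ -f "$src" ]; then
    mdb_dump -n -a -f "$work/d1" "$src"
    mdb_load -n -f "$work/d1" "$work/db2"      # first round trip
    mdb_dump -n -a -f "$work/d2" "$work/db2"
    mdb_load -n -f "$work/d2" "$work/db3"      # second round trip
    if cmp -s "$work/db2" "$work/db3"; then
        result=stable
    else
        result=mismatch
    fi
else
    result=skipped         # tools or source DB not available
fi
echo "round trip: $result"
```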

peterthomassen commented

The post you referenced is from a Monero-specific forum and is specific to the way Monero builds LMDB. It does not apply to LMDB in general. By default, 32bit and 64bit LMDB files are not compatible.

Ah, thank you for clarifying!

There was a bug in mdb_dump/load dealing with backslash-escaped characters, fixed in the recent (January 30) release, perhaps you've encountered that.

Yes! I built mdb_load from your sources, and things work now. Is it expected that the new version of mdb_load is much faster than before (comparable to mdb_dump, ~1 second in my case)? The buggy version needed 27 seconds to load the exported files.

The file size of the restored database has changed as well (62 MB with old version, 77 MB with new version).

Just trying to make sure that nothing else is off.

peterthomassen commented

not a pdns bug


hyc commented Mar 2, 2020

I'm surprised there is such a difference in performance, but I suppose it's plausible. If the keys are being misinterpreted due to the bug, then they may no longer appear to be in sorted order, which will cause inserts to have a lot more seek overhead than otherwise. The difference in size could be due to some keys being seen as duplicates and overwriting each other, or simply due to inserting into previously split pages (whereas an in-order load would never revisit already used pages).

peterthomassen commented

Ok, thanks for the heads-up. While hard to quantify, it sounds like the effects are in the right direction, and thus no reason for concern. Yay! :)
