(Berkeley) DB is inconsistent #917

Closed
ifel opened this issue Oct 28, 2019 · 8 comments


@ifel
Contributor

ifel commented Oct 28, 2019

Sometimes the RPM db gets corrupted in such a way that a package looks installed to some rpm commands, like rpm -qa, but not to others, like rpm -e. It also affects yum. I'm under the impression that the package's data lives in different tables, and some commands use one table while others use another, so if the data is inconsistent (there is an entry in one place and not in the other) it fails this way.
An interesting observation: only the bdb and ndb backends are affected, not lmdb. I don't know the internals: either the DB structure is different, or lmdb writes data to disk only if all the operations succeeded, so it's essentially a single transaction, and if the rpm process gets killed after updating one place and before updating another, nothing is written. But it's a fact that I can see this only on ndb and bdb.

Example:
The package is installed on the host:

rpm -qa | grep fb-smc-observer-bootstrap-file

fb-smc-observer-bootstrap-file-20191027-040347.x86_64

RPM itself cannot remove it as "it's not installed"

rpm -e fb-smc-observer-bootstrap-file

error: package fb-smc-observer-bootstrap-file is not installed

But it is installed:

rpm -qa | grep fb-smc-observer-bootstrap-file

fb-smc-observer-bootstrap-file-20191027-040347.x86_64

Rebuild DB fixes the DB

rpm --rebuilddb

Yum can find the package now

yum remove fb-smc-observer-bootstrap-file

Resolving Dependencies
There are unfinished transactions remaining. You might consider running yum-complete-transaction, or "yum-complete-transaction --cleanup-only" and "yum history redo last", first to finish them. If those don't work you'll have to try removing/installing packages by hand (maybe package-cleanup can help).
--> Running transaction check
---> Package fb-smc-observer-bootstrap-file.x86_64 0:20191027-040347 will be erased
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package                          Arch     Version           Repository    Size

Removing:
 fb-smc-observer-bootstrap-file   x86_64   20191027-040347   installed    110 k

Transaction Summary

Remove 1 Package

Installed size: 110 k
Is this ok [y/N]: n
Exiting on user command
Your transaction was saved, rerun it with:
yum load-transaction /tmp/yum_save_tx.2019-10-28.09-13.GN2dE6.yumtx

And RPM can remove it now

rpm -e fb-smc-observer-bootstrap-file

warning: file /var/facebook/smc/SmcObserver.json: remove failed: No such file or directory

rpm -qa | grep fb-smc-observer-bootstrap-file
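
The symptom above (listed by rpm -qa, rejected by rpm -e) can be checked for across all packages. The following is only a sketch: it assumes that `rpm -q NAME` exits non-zero when the name lookup fails, and `find_index_mismatches` and the `Runner` type are hypothetical helper names, not part of rpm.

```python
import subprocess
from typing import Callable, List, Tuple

# A runner takes an argv list and returns (exit_code, stdout).
Runner = Callable[[List[str]], Tuple[int, str]]


def run_rpm(argv: List[str]) -> Tuple[int, str]:
    """Default runner: execute the real command and capture its output."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout


def find_index_mismatches(run: Runner = run_rpm) -> List[str]:
    """Return names that `rpm -qa` lists but `rpm -q NAME` cannot find."""
    code, out = run(["rpm", "-qa", "--qf", "%{NAME}\n"])
    if code != 0:
        raise RuntimeError("rpm -qa failed")
    names = sorted({line.strip() for line in out.splitlines() if line.strip()})
    missing = []
    for name in names:
        code, _ = run(["rpm", "-q", name])
        if code != 0:  # assumption: non-zero means "not installed"
            missing.append(name)
    return missing
```

A non-empty result suggests the same kind of inconsistency as in this report, in which case `rpm --rebuilddb` is the fix that worked here.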

@pmatilai
Member

Yes, rpmdb updates are not transactional in any of the backends, and if killed forcefully in mid-transaction this can and will happen. Hysterical as it is. LMDB isn't really any safer against this (in the current implementation), my guess is that it just completes faster so there's less chance of killing it mid-flight...

The plan is to finally address this in the current development cycle.

@mlschroe
Contributor

Note that from a high-level perspective, rpm does not need transactions. It only has two database operations, "insert header into database" and "remove header from database", and only does one operation at a time.

It should be up to the database to keep the index table in sync. Now, rpm's index tables are somewhat complex, so rpm manages its indexes by itself. That's where database transactions come into play. But it's also sufficient to not do any transaction rollbacks but instead just detect if the indexes are out of sync and then rebuild the indexes. That's the strategy used by the ndb backend.
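
The detect-and-rebuild strategy can be illustrated with a toy model. This is not ndb's on-disk format or its actual mechanism: the generation counter, the class `ToyRpmDb`, and all method names are assumptions made up for the sketch. The idea is only that the index records which state of the primary store it reflects, and a reader that sees a mismatch rebuilds the index instead of rolling anything back.

```python
from typing import Dict, List, Set


class ToyRpmDb:
    """Toy model: a primary header store plus a self-managed name index."""

    def __init__(self) -> None:
        self.headers: Dict[int, str] = {}          # header number -> package name
        self.generation = 0                        # bumped on every primary write
        self.name_index: Dict[str, Set[int]] = {}  # package name -> header numbers
        self.index_generation = 0                  # primary generation the index reflects

    def index_is_stale(self) -> bool:
        return self.index_generation != self.generation

    def rebuild_index(self) -> None:
        """Regenerate the index from the primary store, which is the source of truth."""
        self.name_index = {}
        for hnum, name in self.headers.items():
            self.name_index.setdefault(name, set()).add(hnum)
        self.index_generation = self.generation

    def insert(self, name: str, die_before_index: bool = False) -> int:
        if self.index_is_stale():
            self.rebuild_index()  # never stack new writes on a stale index
        hnum = len(self.headers) + 1  # fine here: the toy never removes headers
        self.headers[hnum] = name
        self.generation += 1
        if die_before_index:
            return hnum  # simulate the process being killed at this point
        self.name_index.setdefault(name, set()).add(hnum)
        self.index_generation = self.generation
        return hnum

    def list_all(self) -> List[str]:
        """Walk the primary store, as a full listing would."""
        return sorted(self.headers.values())

    def lookup_unchecked(self, name: str) -> Set[int]:
        """Trust the index blindly; this is where the inconsistency shows up."""
        return set(self.name_index.get(name, set()))

    def lookup(self, name: str) -> Set[int]:
        """Check the generation first and rebuild if the index is out of date."""
        if self.index_is_stale():
            self.rebuild_index()
        return self.lookup_unchecked(name)
```

After an interrupted insert, `list_all()` shows the package while `lookup_unchecked()` does not, which mirrors the rpm -qa versus rpm -e mismatch in the report; `lookup()` repairs the index on first use.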

@pmatilai
Member

Yes, database transactions in rpm are only interesting for ensuring the indexes are in sync. The current BDB and LMDB backends have nothing to guarantee that. I know NDB is supposed to, but it seems it's not all the way there (based on this ticket).

@ifel
Contributor Author

ifel commented Oct 29, 2019

Thanks for the comments.

I know NDB is supposed to, but it seems it's not all the way there

Is this something ndb has supported from the very beginning, or was it added recently? The version of RPM we use is a bit old (Fri Apr 27 13:27:02 2018 +0300). Any chance it's been fixed since then?

@mlschroe
Contributor

NDB has supported detection of outdated indices since the beginning, but currently rpm doesn't make use of that option.

So no, it's not yet fixed. (But it's not hard to do so, unlike with bdb)

@mlschroe
Contributor

You see this with ndb but not lmdb because ndb fsyncs the main database after each operation. So after a system crash you'll have a correct main database but outdated indices. With lmdb there's no fsync at all, so you get an outdated main database with matching indices.

Ndb's behavior is way better if it automatically rebuilds the index databases.
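
The two crash outcomes described here can be sketched with a deliberately simplified durability model. It assumes that anything not fsynced is lost in a crash, which a real kernel does not guarantee (it may have flushed some pages on its own); `Store` and `install_then_crash` are hypothetical names for illustration only.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class Store:
    durable: Set[str] = field(default_factory=set)  # survives a crash
    pending: Set[str] = field(default_factory=set)  # written but not yet fsynced

    def write(self, item: str) -> None:
        self.pending.add(item)

    def fsync(self) -> None:
        self.durable |= self.pending
        self.pending.clear()

    def after_crash(self) -> Set[str]:
        """Simplifying assumption: only fsynced data survives."""
        return set(self.durable)


def install_then_crash(name: str, fsync_main: bool) -> Tuple[Set[str], Set[str]]:
    """Write a package to the main store and the index, then crash.

    fsync_main=True models the ndb-like behavior described above (main
    database fsynced after each operation); False models no fsync at all.
    Returns what is left in (main, index) after the crash.
    """
    main, index = Store(), Store()
    main.write(name)
    if fsync_main:
        main.fsync()
    index.write(name)
    return main.after_crash(), index.after_crash()
```

With `fsync_main=True` the main store keeps the package while the index has lost it, so the two disagree. With `fsync_main=False` both lose it, so they agree but are outdated.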

@pmatilai
Member

But it's not hard to do so, unlike with bdb

Ndb-style detection of out-of-sync indices with bdb might be hard, but going the database transaction route is not, now that commit 0508e9a paved the way. Enabling transactions in bdb now that we finally can is actually tempting for multiple reasons (including automatic recovery). The downside is that it makes it somewhat incompatible with all the older versions, so do we really want to change something that's on its slow way out...

@ffesti ffesti added this to Needs triage in Ticket Review (Outdated) Mar 3, 2020
@pmatilai pmatilai moved this from Needs triage to No in Ticket Review (Outdated) Mar 4, 2020
@pmatilai pmatilai changed the title DB is inconsistent (Berkeley) DB is inconsistent Mar 4, 2020
@pmatilai
Member

pmatilai commented Mar 4, 2020

Commit 40269d4 added automatic index regeneration for ndb if it gets out of sync, and sqlite does transactions so it's supposedly not possible to go out of sync.
That leaves BDB, and the sad situation is that the last thing we want to do at this point, when we're basically just about to deprecate BDB, is to change it in an incompatible manner. Which means we cannot do anything about this on the Berkeley DB backend, unfortunately.
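
Why transactions keep a header and its index entry in sync can be shown with Python's standard sqlite3 module. This is a sketch only: the table names `packages` and `name_index` and the helpers below are invented for illustration and are not the schema or code of rpm's sqlite backend. A raised exception stands in for the failure here; for a killed process, SQLite relies on its journal to roll back the incomplete transaction on the next open.

```python
import sqlite3
from typing import Tuple


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one primary table and one index-like table.
    conn.execute("CREATE TABLE packages (hnum INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE name_index (name TEXT, hnum INTEGER)")
    return conn


def install(conn: sqlite3.Connection, name: str, fail_before_index: bool = False) -> bool:
    """Insert the header and its index row in one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            cur = conn.execute("INSERT INTO packages (name) VALUES (?)", (name,))
            if fail_before_index:
                raise RuntimeError("simulated failure between the two writes")
            conn.execute(
                "INSERT INTO name_index (name, hnum) VALUES (?, ?)",
                (name, cur.lastrowid),
            )
        return True
    except RuntimeError:
        return False


def counts(conn: sqlite3.Connection) -> Tuple[int, int]:
    """Return (rows in packages, rows in name_index)."""
    n_pkg = conn.execute("SELECT COUNT(*) FROM packages").fetchone()[0]
    n_idx = conn.execute("SELECT COUNT(*) FROM name_index").fetchone()[0]
    return n_pkg, n_idx
```

A failed `install` leaves both tables exactly as they were, so a half-written state with a header but no index entry cannot be observed.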

@pmatilai pmatilai closed this as completed Mar 4, 2020
Ticket Review (Outdated) automation moved this from No to Closed Mar 4, 2020