pkg: sqlite error while executing PRAGMA user_version; in file pkgdb.c:2357: database disk image is malformed #2072

Closed
grahamperrin opened this issue Sep 10, 2022 · 12 comments

@grahamperrin
Contributor

A VirtualBox guest with FreeBSD 12.3-RELEASE-p7 stopped responding whilst pkg-install(8) awaited a y/n response:

VirtualBox, UFS

A forced stop of the guest was required. The UFS file system is reportedly clean, but I'm left with a malformed database image:

image

In a situation such as this, is recovery possible?

Output from ls -ahlrt /var/backups:

total 10144
-rw-r--r--   1 root  wheel   1.7K Oct 23  2020 aliases.bak2
-rw-r--r--   1 root  wheel   779B Oct 23  2021 group.bak2
-rw-------   1 root  wheel   3.0K Oct 23  2021 master.passwd.bak2
-rw-r--r--   1 root  wheel   779B Dec 27  2021 group.bak
-rw-r--r--   1 root  wheel   1.7K Dec 27  2021 aliases.bak
-rw-------   1 root  wheel   3.0K Dec 27  2021 master.passwd.bak
-rw-r--r--   1 root  wheel   747B Sep  5 03:01 kern.geom.conftxt.bak
-rw-r--r--   1 root  wheel   172B Sep  5 03:01 gpart.ada0.bak
-rw-r--r--   1 root  wheel    56K Sep  5 03:01 boot.ada0p1.bak
drwxr-x---   2 root  wheel   512B Sep  5 04:39 .
-rw-r--r--   1 root  wheel   9.8M Sep  5 04:40 pkg.sql.xz
drwxr-xr-x  26 root  wheel   512B Sep 10 10:07 ..

TIA

@grahamperrin
Contributor Author

-rw-r--r-- 1 root wheel 9.8M Sep 5 04:40 pkg.sql.xz

That's a few days old.

I'm almost certain that this morning, I completed an upgrade of various packages.

@bapt
Member

bapt commented Sep 12, 2022

nothing we can do about it, your database is corrupted, the last backup is 2 days old

@grahamperrin
Contributor Author

grahamperrin commented Sep 12, 2022

Thanks, that's unlucky. I did my best to tune UFS for data to be at minimal risk (sync and so on); I normally prefer ZFS, it just happened to be UFS in this case. I'm glad that it's not production data.

If I were to restore from the backup, to have a database that is (a) non-corrupt but (b) not a true reflection of what's installed, I guess that would be a terrible idea. Terrible, true?


Afterthought: restore, pkg prime-origins | sort, save, then use the list for a forced (re)installation of everything that was present at the time of the backup. Still a terrible idea?
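
A rough sketch of that afterthought, assuming the database has already been restored from the backup. The function names and the list path are hypothetical; `pkg prime-origins` is the command named above, and `pkg install -f -y` (force reinstallation, assume yes) is one way the forced step might be done.

```sh
#!/bin/sh
# Hypothetical location for the saved list; override by setting LIST.
LIST="${LIST:-/root/prime-origins.txt}"

# Save a sorted, de-duplicated list of origins from the restored database.
save_origins() {
    pkg prime-origins | sort -u > "$LIST"
}

# Force (re)installation of everything on the saved list.
# xargs turns the lines of the list into arguments for one pkg invocation;
# -y is an assumption here, drop it to be prompted instead.
reinstall_from_list() {
    xargs pkg install -f -y < "$LIST"
}
```

Usage would be `save_origins`, a look through the list, then `reinstall_from_list`. Keeping the list as a file means it can be edited before anything is reinstalled.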

@grahamperrin
Contributor Author

… forced stop …

… database is corrupted …

Should the rollback journal guard against corruption in this situation? Forced stop analogous to loss of power.

https://www.sqlite.org/tempfiles.html#rollbackjrnl

(Do I misunderstand what's there?)
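
For anyone wanting to look at this on their own system, a minimal sketch follows. It assumes the database lives at `/var/db/pkg/local.sqlite` and that pkg is using SQLite's rollback-journal mode, in which case a leftover journal would be named `local.sqlite-journal`; whether pkg actually runs in that mode is not confirmed in this thread. The function names are hypothetical. Since SQLite may roll back a leftover journal as soon as the database is next opened, the journal check is best done before the integrity check.

```sh
#!/bin/sh
# Assumed default location of pkg's local database; override by setting DB.
DB="${DB:-/var/db/pkg/local.sqlite}"

# Succeeds if a rollback journal was left beside the database.
journal_present() {
    [ -e "${DB}-journal" ]
}

# Ask SQLite for an integrity check through pkg's built-in shell,
# feeding the statement on standard input.
integrity_check() {
    echo 'PRAGMA integrity_check;' | pkg shell
}
```

For example: `journal_present && echo "journal left behind"`, then `integrity_check`. A healthy database should answer `ok`; a malformed one reports errors instead.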

@bapt
Member

bapt commented Sep 14, 2022

just restore your 1 day old database and rerun pkg upgrade, you will be in a sane situation again.

I don't know how you ended up with a corrupted database. This is very rare, and yes, the rollback should have happened. Without being on your box or able to reproduce it, this is hard to diagnose.

I have been trying to reproduce your case and I can't find a way to actually corrupt, by killing abruptly VMs or physical machines.

@bapt bapt closed this as completed Sep 14, 2022
@grahamperrin
Contributor Author

just restore your 1 day old database and rerun pkg upgrade, you will be in a sane situation again. ...

Why did I not think of that? :-) thank you.

... can't find a way to actually corrupt, by killing abruptly VMs or physical machines.

As you say, extremely rare. However, you might increase the risk of corruption by doing a couple of things that I did not:

  • disable soft updates
  • mount the volume normally, without sync.

Happy to discuss elsewhere, if ever the mood takes you. Somewhere file system-related.

@grahamperrin
Contributor Author

I restored once, then encountered an error following pkg upgrade.

Restored again, what's pictured below is (I think) the same error:

image

@bapt
Member

bapt commented Sep 19, 2022 via email

@grahamperrin

This comment was marked as outdated.

@grahamperrin

This comment was marked as outdated.

@grahamperrin
Contributor Author

xzcat /var/backups/pkg.sql.xz | pkg shell

After a few difficulties, I did some weird shit | voodoo | methodical secret sauce that seems to bring reliability to this particular VM (touch wood, it now runs for extended periods without inexplicably dying).

Then this, as root:

  1. sh
  2. cd /var/db/pkg/
  3. rm local.sqlite
  4. xzcat /var/backups/pkg.sql.xz | pkg shell
  5. wait for a while … no visible output, unlike the three runs in the hidden comment above
  6. pkg upgrade ⋯

Are sh, and removal of local.sqlite, prerequisite to a successful run of (4) the given command?
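
The numbered steps above could be sketched as a small function. This version moves the damaged database aside rather than removing it, which is a deliberate departure from step 3; the `.malformed` suffix and the function name are hypothetical. Clearing the old file first seems sensible because replaying a dump into an existing database would presumably collide with tables already there, but that is an assumption and not something confirmed in this thread.

```sh
#!/bin/sh
# Paths taken from the thread; override by setting DB or BACKUP.
DB="${DB:-/var/db/pkg/local.sqlite}"
BACKUP="${BACKUP:-/var/backups/pkg.sql.xz}"

restore_pkg_db() {
    # Refuse to continue without a readable backup.
    [ -r "$BACKUP" ] || { echo "no backup at $BACKUP" >&2; return 1; }

    # Keep the damaged database for later inspection instead of deleting it.
    if [ -e "$DB" ]; then
        mv "$DB" "${DB}.malformed" || return 1
    fi

    # Replay the SQL dump into a fresh database through pkg's shell.
    # pkg itself decides where the new database is created.
    xzcat "$BACKUP" | pkg shell
}
```

Run as root, `restore_pkg_db && pkg upgrade` would then correspond to steps 2 to 6.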


#2072 (comment)

@grahamperrin
Contributor Author

#2009
