sqlite: database is locked #17859

Closed
edsantiago opened this issue Mar 20, 2023 · 7 comments · Fixed by #17953 or #18339
Labels
bugweek flakes Flakes from Continuous Integration kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. sqlite Bugs pertaining to the sqlite database backend

Comments

@edsantiago
Collaborator

Almost certainly related to #17858 but I will let someone else decide and possibly merge.

  Podman start after signal kill
  ...
  podman [options] --db-backend sqlite --storage-driver vfs ps -q
  Error: switching journal to WAL mode: database is locked

Unfortunately, this is also happening a lot in cleanup, where, for long and horribly complicated reasons, my flake logger cannot detect it:

$ podman [options] rm -fa -t 0
Error: adding container id to database: database is locked

https://api.cirrus-ci.com/v1/artifact/task/4709781731016704/html/int-podman-fedora-37-rootless-host-sqlite.log.html#t--netns--1

https://api.cirrus-ci.com/v1/artifact/task/6409078293921792/html/int-podman-fedora-37-rootless-host.log.html#t--Podman-start-after-signal-kill--1

@edsantiago edsantiago added flakes Flakes from Continuous Integration sqlite Bugs pertaining to the sqlite database backend labels Mar 20, 2023
@edsantiago
Collaborator Author

Here's one in podman run --rm (f37 root):

# podman run --rm -d --name JRhbDObiOp quay.io/libpod/testimage:20221018 top
Error: adding container id to database: database is locked

(there's another one in that same log, with restart)

Here's one in podman-remote

@vrothberg
Member

@mheon @baude FYI

vrothberg added a commit to vrothberg/libpod that referenced this issue Mar 21, 2023
The symptoms in containers#17859 indicate that setting the PRAGMAs in individual
EXECs outside of a transaction can lead to concurrency issues and
failures when the DB is locked.  Hence set all PRAGMAs when opening
the connection.  Move them into individual constants to improve
documentation and readability.

Further make transactions exclusive as containers#17859 also mentions an error
that the DB is locked during a transaction.

Fixes: containers#17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
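
As a rough illustration of the approach this commit message describes, the sketch below configures a connection as it is opened and wraps writes in an exclusive transaction. It uses Python's standard `sqlite3` module only to show the SQLite behavior; it is not Podman's Go code. The names `open_configured`, `exclusive_tx`, `add_container_id`, the `ids` table, and the 5-second timeout are all hypothetical.

```python
import contextlib
import os
import sqlite3
import tempfile

# Hypothetical constant, named here only for illustration.
PRAGMA_JOURNAL_WAL = "PRAGMA journal_mode = WAL"


def open_configured(path, busy_timeout_s=5.0):
    """Open a connection with its busy timeout and journal mode already set."""
    # `timeout` installs SQLite's busy handler as part of connect(), so the
    # journal-mode switch below waits for a lock instead of failing at once.
    # isolation_level=None leaves BEGIN/COMMIT under our control.
    conn = sqlite3.connect(path, timeout=busy_timeout_s, isolation_level=None)
    conn.execute(PRAGMA_JOURNAL_WAL).fetchone()
    return conn


@contextlib.contextmanager
def exclusive_tx(conn):
    """Run the body inside BEGIN EXCLUSIVE, rolling back on any error."""
    conn.execute("BEGIN EXCLUSIVE")
    try:
        yield conn
    except BaseException:
        conn.execute("ROLLBACK")
        raise
    else:
        conn.execute("COMMIT")


def add_container_id(conn, container_id):
    """Hypothetical write path: every statement runs under one write lock."""
    with exclusive_tx(conn):
        conn.execute("CREATE TABLE IF NOT EXISTS ids (id TEXT PRIMARY KEY)")
        conn.execute("INSERT INTO ids VALUES (?)", (container_id,))
```

Because the write lock is taken at `BEGIN`, a competing writer is refused (or waits on its busy timeout) before it runs any statement, rather than failing partway through a transaction.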
@vrothberg
Member

Opened #17867 as an anticipated fix.

vrothberg added a commit to vrothberg/libpod that referenced this issue Mar 21, 2023
The symptoms in containers#17859 indicate that setting the PRAGMAs in individual
EXECs outside of a transaction can lead to concurrency issues and
failures when the DB is locked.  Hence set all PRAGMAs when opening
the connection.  Move them into individual constants to improve
documentation and readability.

Further make transactions exclusive as containers#17859 also mentions an error
that the DB is locked during a transaction.

[NO NEW TESTS NEEDED] - existing tests cover the code.

Fixes: containers#17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this issue Mar 21, 2023
The symptoms in containers#17859 indicate that setting the PRAGMAs in individual
EXECs outside of a transaction can lead to concurrency issues and
failures when the DB is locked.  Hence set all PRAGMAs when opening
the connection.  Move them into individual constants to improve
documentation and readability.

Further make transactions exclusive as containers#17859 also mentions an error
that the DB is locked during a transaction.

[NO NEW TESTS NEEDED] - existing tests cover the code.

Fixes: containers#17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>

<MH: Cherry-picked on top of my branch>

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
@edsantiago
Collaborator Author

Sorry:

Podman start after signal kill
...
# podman [options] run --http-proxy=false --name test1 -d quay.io/libpod/alpine:latest top
# podman [options] ps -q
Error: cannot connect to database: database is locked

This is in an f36 root container, in my #17831 PR, which includes all sqlite fixes as of Monday.

@edsantiago edsantiago reopened this Mar 27, 2023
@edsantiago
Collaborator Author

Another one, f37 rootless host (not container).

vrothberg added a commit to vrothberg/libpod that referenced this issue Mar 28, 2023
`Ping()` requires the DB lock, so we had to move it into a transaction
to fix containers#17859. Since we try to access the DB directly afterwards, I
prefer to let that fail instead of paying the cost of a transaction
which would lock the DB for _all_ processes.

[NO NEW TESTS NEEDED] as it's a hard-to-reproduce race.

Fixes: containers#17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this issue Mar 28, 2023
`Ping()` requires the DB lock, so we had to move it into a transaction
to fix containers#17859. Since we try to access the DB directly afterwards, I
prefer to let that fail instead of paying the cost of a transaction
which would lock the DB for _all_ processes.

[NO NEW TESTS NEEDED] as it's a hard-to-reproduce race.

Fixes: containers#17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
@edsantiago edsantiago added the kind/bug Categorizes issue or PR as related to a bug. label Mar 28, 2023
@edsantiago
Collaborator Author

Sorry. Seen in my no-flake-retries PR, f37 rootless, the usual symptom (in podman ps):

$ podman [options] ps -q
Error: checking if DB config table exists: database is locked

This version of the CI run DOES NOT include the WAL diffs; it's just hammering at sqlite. Reopening, sorry.

@edsantiago edsantiago reopened this Apr 19, 2023
@vrothberg
Member

Cc: 🚑 @mheon

vrothberg added a commit to vrothberg/libpod that referenced this issue Apr 25, 2023
According to an old upstream issue [1]: "If the first statement after
BEGIN DEFERRED is a SELECT, then a read transaction is started.
Subsequent write statements will upgrade the transaction to a write
transaction if possible, or return SQLITE_BUSY."

So let's move the first SELECT under the same transaction as the table
initialization.

[NO NEW TESTS NEEDED] as it's a hard-to-trigger race.

[1] mattn/go-sqlite3#274 (comment)

Fixes: containers#17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
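
The behavior quoted in this commit message can be reproduced with a small sketch. It uses Python's standard `sqlite3` module rather than Podman's Go code, with the default rollback journal and a busy timeout of zero so that the failure is immediate. The helper names and the table `t` are hypothetical, and taking the write lock up front with `BEGIN IMMEDIATE` is shown as one way to avoid the upgrade; it is not necessarily how the actual fix is written.

```python
import os
import sqlite3
import tempfile


def open_db(path):
    # timeout=0 disables busy retries so a lock conflict fails immediately;
    # isolation_level=None lets us issue BEGIN/COMMIT ourselves.
    return sqlite3.connect(path, timeout=0, isolation_level=None)


def deferred_upgrade_error(path):
    """Return the error text raised when a read transaction tries to write."""
    a, b = open_db(path), open_db(path)
    try:
        a.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
        a.execute("BEGIN DEFERRED")
        a.execute("SELECT count(*) FROM t").fetchall()  # a is now a reader
        b.execute("BEGIN IMMEDIATE")  # b takes the write lock
        try:
            a.execute("INSERT INTO t VALUES (1)")  # a attempts the upgrade
        except sqlite3.OperationalError as err:
            return str(err)
        return None
    finally:
        # Closing a connection rolls back its open transaction.
        a.close()
        b.close()


def select_then_write(path):
    """SELECT and INSERT under one write transaction; return the row count."""
    conn = open_db(path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
        conn.execute("BEGIN IMMEDIATE")  # write lock held before the SELECT
        conn.execute("SELECT count(*) FROM t").fetchall()
        conn.execute("INSERT INTO t VALUES (1)")
        conn.execute("COMMIT")
        return conn.execute("SELECT count(*) FROM t").fetchone()[0]
    finally:
        conn.close()
```

In the first function, connection `a` holds only a read lock when it tries to write, and the write lock already belongs to `b`, so SQLite reports the database as locked. In the second, the lock is acquired before the first `SELECT`, so there is nothing left to upgrade.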
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Aug 26, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 26, 2023