nix-build yields SQLiteError due to failed constraint when inserting into NARs #3898

Closed
terlar opened this issue Aug 5, 2020 · 15 comments · Fixed by #7616

@terlar

terlar commented Aug 5, 2020

Describe the bug

Running two parallel nix-build invocations in a CI pipeline sporadically fails with an SQLiteError when using nix-2.4pre7805_984e5213.
(/nix/store/i188a271sni33gqxs0nwrkgnpljfhwwf-nix-2.4pre7805_984e5213.drv)

I see errors such as this:

$ nix-build -A eu-west-1.platform --argstr env dev
error (ignored): error: --- SQLiteError --- nix-build
executing SQLite statement 'insert or replace into NARs(cache, hashPart, namePart, url, compression, fileHash, fileSize, narHash, narSize, refs, deriver, sigs, ca, timestamp, present) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 1)': constraint failed (in '/home/jenkins/.cache/nix/binary-cache-v6.sqlite')
while evaluating the attribute 'paths' of the derivation 'cbpf-platform-env' at /nix/store/sp6h92qaz2rdppy43g0k36w1chp857ss-nixpkgs-0cfa467f8771ca585b3e122806c8337f5025bd92/pkgs/build-support/trivial-builders.nix:7:14:
while evaluating the attribute 'paths' of the derivation 'cbpf' at /nix/store/sp6h92qaz2rdppy43g0k36w1chp857ss-nixpkgs-0cfa467f8771ca585b3e122806c8337f5025bd92/pkgs/build-support/trivial-builders.nix:7:14:
while evaluating the attribute 'buildCommand' of the derivation 'cbpf-deps' at /nix/store/sp6h92qaz2rdppy43g0k36w1chp857ss-nixpkgs-0cfa467f8771ca585b3e122806c8337f5025bd92/pkgs/build-support/trivial-builders.nix:7:14:
while evaluating 'concatMapStrings' at /nix/store/sp6h92qaz2rdppy43g0k36w1chp857ss-nixpkgs-0cfa467f8771ca585b3e122806c8337f5025bd92/lib/strings.nix:31:25, called from /nix/store/sp6h92qaz2rdppy43g0k36w1chp857ss-nixpkgs-0cfa467f8771ca585b3e122806c8337f5025bd92/pkgs/build-support/trivial-builders.nix:329:9:
while evaluating anonymous function at /nix/store/sp6h92qaz2rdppy43g0k36w1chp857ss-nixpkgs-0cfa467f8771ca585b3e122806c8337f5025bd92/pkgs/build-support/trivial-builders.nix:329:31, called from undefined position:
while evaluating 'escapeShellArg' at /nix/store/sp6h92qaz2rdppy43g0k36w1chp857ss-nixpkgs-0cfa467f8771ca585b3e122806c8337f5025bd92/lib/strings.nix:296:20, called from /nix/store/sp6h92qaz2rdppy43g0k36w1chp857ss-nixpkgs-0cfa467f8771ca585b3e122806c8337f5025bd92/pkgs/build-support/trivial-builders.nix:331:19:
while evaluating the attribute 'buildInputs' of the derivation 'redoc' at /home/jenkins/agent/workspace/platform_master/tools/redoc-cli/api.nix:8:3:
while evaluating the attribute 'buildInputs' of the derivation 'platform-docs' at /home/jenkins/agent/workspace/platform_master/platform/docs.nix:34:3:
while evaluating the attribute 'buildInputs' of the derivation 'platform-get-api-docs' at /home/jenkins/agent/workspace/platform_master/platform/get-api-docs.nix:14:3:
while evaluating the attribute 'installPhase' of the derivation 'banking-errands-docs' at /home/jenkins/agent/workspace/platform_master/meta/drv/docs.nix:11:3:
while evaluating the attribute 'iam.remoteStateEnv' at /home/jenkins/agent/workspace/platform_master/meta/drv/component.nix:72:3:
while evaluating 'optionalTraceIf' at /home/jenkins/agent/workspace/platform_master/meta/fns.nix:150:38, called from /home/jenkins/agent/workspace/platform_master/meta/drv/component.nix:54:6:
while evaluating 'optionalTraceIf' at /home/jenkins/agent/workspace/platform_master/meta/fns.nix:150:38, called from /home/jenkins/agent/workspace/platform_master/meta/drv/component.nix:38:8:
while evaluating the attribute 'grid.components.iam.env' at /home/jenkins/agent/workspace/platform_master/meta/forkPoint.nix:53:3:
while evaluating 'optionalTrace' at /home/jenkins/agent/workspace/platform_master/meta/fns.nix:144:30, called from /home/jenkins/agent/workspace/platform_master/meta/forkPoint.nix:50:6:
while evaluating 'fileContents' at /nix/store/sp6h92qaz2rdppy43g0k36w1chp857ss-nixpkgs-0cfa467f8771ca585b3e122806c8337f5025bd92/lib/strings.nix:680:18, called from /home/jenkins/agent/workspace/platform_master/meta/forkPoint.nix:48:11:
while evaluating 'removeSuffix' at /nix/store/sp6h92qaz2rdppy43g0k36w1chp857ss-nixpkgs-0cfa467f8771ca585b3e122806c8337f5025bd92/lib/strings.nix:443:5, called from /nix/store/sp6h92qaz2rdppy43g0k36w1chp857ss-nixpkgs-0cfa467f8771ca585b3e122806c8337f5025bd92/lib/strings.nix:680:24:

error: --- SQLiteError --- nix-build
executing SQLite statement 'insert or replace into NARs(cache, hashPart, timestamp, present) values (?, ?, ?, 0)': constraint failed (in '/home/jenkins/.cache/nix/binary-cache-v6.sqlite')

I have not seen this before switching to nixUnstable.

Steps To Reproduce

Unfortunately I cannot reproduce it reliably; it happens a few times every day.

Expected behavior

No SQLite insert statement errors.

nix-env --version output
nix-env (Nix) 2.4pre7805_984e5213

@terlar terlar added the bug label Aug 5, 2020
@stale

stale bot commented Feb 12, 2021

I marked this as stale due to inactivity.

@stale stale bot added the stale label Feb 12, 2021
@terlar
Author

terlar commented Apr 5, 2022

I still see this at times on Nix 2.7

@stale stale bot removed the stale label Apr 5, 2022
@cmm
Member

cmm commented Jul 23, 2022

just got this with Nix 2.10.3, wat do?

@terlar
Author

terlar commented Jul 24, 2022

  • Do you run stuff in parallel as well?
  • Do you run multi-user?

I haven't seen this recently, since switching to multi-user. When I did see it, my workaround was to set a different XDG_CACHE_HOME for each parallel execution, so that the parallel runs never touched the same binary-cache-v6.sqlite file.
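
A minimal sketch of that workaround, assuming the parallel jobs are launched from a Python wrapper. The helper names `isolated_cache_env` and `run_isolated` and the job names are hypothetical; only the idea of giving each run its own XDG_CACHE_HOME comes from the comment above.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def isolated_cache_env(job_name: str, cache_root: Path) -> dict[str, str]:
    """Return a copy of the environment with a job-specific XDG_CACHE_HOME."""
    cache_dir = cache_root / job_name
    cache_dir.mkdir(parents=True, exist_ok=True)
    env = dict(os.environ)
    env["XDG_CACHE_HOME"] = str(cache_dir)
    return env


def run_isolated(command: list[str], job_name: str, cache_root: Path) -> int:
    """Run one command with its own cache directory and return its exit code."""
    completed = subprocess.run(command, env=isolated_cache_env(job_name, cache_root))
    return completed.returncode


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp(prefix="nix-cache-"))
    # Hypothetical job names; pass the real nix-build command to run_isolated.
    for job in ("job-a", "job-b"):
        print(job, isolated_cache_env(job, root)["XDG_CACHE_HOME"])
```

The trade-off is that each job starts with an empty narinfo cache, so this avoids the shared SQLite file rather than fixing the underlying problem.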

@cmm
Member

cmm commented Jul 24, 2022

in parallel (3 concurrent jobs on one builder and 10 on the other) and, I guess, multi-user? (as in, the client and the builders all run NixOS).
I've seen the problem just once yesterday and it hasn't reappeared so far 🤷

how do you change the execution environment?

@pmiddend

pmiddend commented Dec 1, 2022

I also experienced this error just now. Restarting my CI pipeline worked.

@avdv
Member

avdv commented Dec 13, 2022

Just hit this too, nix (Nix) 2.11.0:

direnv allow
direnv: loading ~/source/da/projection/.envrc                                                                                                                                 
direnv: using nix
direnv: ([/nix/store/p2jkp6yr9i6dv6154l366x4kp8l3fryi-direnv-2.32.1/bin/direnv export zsh]) is taking a while to execute. Use CTRL-C to give up.
error: executing SQLite statement 'insert or replace into NARs(cache, hashPart, timestamp, present) values (8, '28cjd25sgsrfm9iz71r493n02yif9cjx', 1670927875, 0)': constraint failed (in '/root/.cache/nix/binary-cache-v6.sqlite')

@roberth
Member

roberth commented Dec 17, 2022

I have a suspicion that this is about the cache foreign key. Nix has no delete statements on the referenced BinaryCaches table, yet the id values aren't near 0. This suggests that the insert or replace into BinaryCaches may not work as expected. We might replace this "upsert" into BinaryCaches by two statements if this turns out to be the cause of the problem. Performance will not be affected, because inserts are very rare.

To hopefully confirm this theory, I'm improving (#7473) the sqlite error message to include the detailed message. This should tell us which constraint is violated. If my theory holds up, a PR to replace insert or replace should be sufficient. If not, the error message will tell us why.
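
To illustrate the suspicion, here is a small sketch using Python's sqlite3 module. The table is a simplified stand-in that I am assuming for the demo, not the real Nix schema, and the cache URL is made up. It compares an `insert or replace` upsert with a two-statement variant (`insert or ignore` followed by `update`), which is only one possible shape for the "two statements" idea.

```python
import sqlite3

# Simplified, assumed stand-in for the BinaryCaches table; not the real Nix schema.
SCHEMA = """
create table BinaryCaches (
    id        integer primary key autoincrement not null,
    url       text unique not null,
    timestamp integer not null
);
"""


def cache_id(db: sqlite3.Connection, url: str) -> int:
    return db.execute("select id from BinaryCaches where url = ?", (url,)).fetchone()[0]


def upsert_with_replace(db: sqlite3.Connection, url: str, timestamp: int) -> int:
    # A conflict on the unique url deletes the old row and inserts a new one.
    db.execute(
        "insert or replace into BinaryCaches(url, timestamp) values (?, ?)",
        (url, timestamp),
    )
    return cache_id(db, url)


def upsert_with_two_statements(db: sqlite3.Connection, url: str, timestamp: int) -> int:
    # Insert only when the row is missing, then update in place, keeping the id.
    db.execute(
        "insert or ignore into BinaryCaches(url, timestamp) values (?, ?)",
        (url, timestamp),
    )
    db.execute("update BinaryCaches set timestamp = ? where url = ?", (timestamp, url))
    return cache_id(db, url)


def demo() -> tuple[list[int], list[int]]:
    url = "https://cache.example.org"  # hypothetical cache URL
    results = []
    for upsert in (upsert_with_replace, upsert_with_two_statements):
        db = sqlite3.connect(":memory:")
        db.executescript(SCHEMA)
        results.append([upsert(db, url, t) for t in (100, 200)])
        db.close()
    return results[0], results[1]


if __name__ == "__main__":
    replaced, kept = demo()
    print("insert or replace ids:", replaced)
    print("two-statement ids:   ", kept)
```

In this sketch the `insert or replace` variant hands out a new id on the second call, while the two-statement variant keeps the original one.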

@cransom

cransom commented Dec 19, 2022

I've been hitting this as well and reproduced it with your much improved sqlite error message.

error: Trying to register a realisation of 'sha256:1b9c6991d0fe9a63637677f165669adbf3374894240e44f26ca2547e33aee596!out', but we already have another one locally.
       Local:  /nix/store/34whqq7d6s0l1j2nf6spdgq1fcsjlis1-holdings-web-ci-env
       Remote: /nix/store/34whqq7d6s0l1j2nf6spdgq1fcsjlis1-holdings-web-ci-env

@elaforge

As another data point, we see this very frequently with Nix 2.10, a multi-user nix-daemon install, and up to two nix-store -r invocations running simultaneously.

@roberth
Member

roberth commented Jan 9, 2023

Caught it with the improved error message. No surprises here, so I'll make a PR to work around the apparently broken BinaryCaches upsert. There is no valid reason for the binary cache id to update.

new info:

FOREIGN KEY constraint failed

Full message:

error: executing SQLite statement 'insert or replace into NARs(cache, hashPart, namePart, url, compression, fileHash, fileSize, narHash, narSize, refs, deriver, sigs, ca, timestamp, present) values (30, '1ffxxni0z0zp5qg69h3fapilggmvflgs', 'options.json.drv', 'nar/efa0dbeda9fda8e13de5d489ea5a0afb36764f5b7969ee74809d688b9512dd33.nar.xz', 'xz', 'sha256:0cyx2aaqns4xh1sfwsbrbd7pcdpv19dfm2flwlyy3a7xm7nxp87g', 1248, 'sha256:1nrzgwphqksirshsjpv4l4rn3lqizhm3ragzyxxyha2m06gbvn7v', 2792, '01w2kavfs1lmciscd0d8fqv22887411y-options.json 9krlzvny65gdc8s7kpb6lkx8cd02c25b-default-builder.sh bfp7hjhlq91pp5falqz1bynz3l3lgj1f-brotli-1.0.9.drv iyc092w3nzbfjhpgs9pgpgg8vjjyzpxk-base.json kv0w4lpyy8674rn5zyl763z4wj3x3swp-python3-3.10.9-env.drv pn1aif4piypy12ydns6m26cs7sbggzv5-stdenv-darwin.drv vqb0s6db0cqm8k5z70nz97sap21zdf8i-mergeJSON.py y590cp3rmadv696cjny32wck41cjyn5h-bash-5.1-p16.drv', NULL, 'hercules-ci.cachix.org-1:LdQ97nQdTgr9o+0bTn2rXKK4BfczrrpzsUe+6HZh3wtB7zmkYGetsm3A1jnEQdsEeHdR+HvJqzZSlVYB0Hh0Cw==', '', 1673254211, 1)': constraint failed, FOREIGN KEY constraint failed (in '/private/var/root/.cache/nix/binary-cache-v6.sqlite')

@CodeWithOz

@roberth nice work! 👏🏾 any idea or estimate for when a fix will be released? I just encountered this right now.

@CodeWithOz

And is there a workaround that can be easily applied temporarily?

@CodeWithOz

Strangely enough I ran the command that uses nix (direnv reload in my case) one more time and the error went away. 🤷🏾‍♂️

figsoda added a commit to nix-community/nurl that referenced this issue Jan 12, 2023
roberth added a commit to hercules-ci/nix that referenced this issue Jan 17, 2023
Fixes NixOS#3898

The entire `BinaryCaches` row used to get replaced after it became
stale according to the `timestamp` column. In a concurrent scenario,
this leads to foreign key conflicts as different instances of the
in-process `state.caches` cache now differ, with the consequence that
the older process still tries to use the `id` number of the old record.

Furthermore, this phenomenon appears to have caused the cache for
actual narinfos to be erased about every week, while the default
ttl for narinfos was supposed to be 30 days.
@roberth
Member

roberth commented Jan 17, 2023

This should be fixed by

Based on the code I've changed, the problem should have occurred at most once per (week, machine or user, cache) combination.

edolstra added a commit that referenced this issue Feb 13, 2023
Fix foreign key error inserting into NARs #3898
github-actions bot pushed a commit that referenced this issue Feb 13, 2023
Fixes #3898

(cherry picked from commit fb94d5c)
github-actions bot pushed a commit that referenced this issue Feb 17, 2023
(cherry picked from commit 2ceece3)
(cherry picked from commit cabf1d9)
github-actions bot pushed a commit that referenced this issue Feb 17, 2023
Fixes #3898

(cherry picked from commit fb94d5c)
(cherry picked from commit 8449b3c)