Skip to content

Commit

Permalink
Add option to verify build determinism
Browse files Browse the repository at this point in the history
Passing "--option build-repeat <N>" will cause every build to be
repeated N times. If the build output differs between any round, the
build is rejected, and the output paths are not registered as
valid. This is primarily useful to verify build determinism. (We
already had a --check option to repeat a previously succeeded
build. However, with --check, non-deterministic builds are registered
in the DB. Preventing that is useful for Hydra to ensure that
non-deterministic builds don't end up getting published at all.)
  • Loading branch information
edolstra committed Nov 9, 2015
1 parent 96c2ebf commit 8fdd156
Show file tree
Hide file tree
Showing 3 changed files with 76 additions and 12 deletions.
12 changes: 12 additions & 0 deletions doc/manual/command-ref/conf-file.xml
Expand Up @@ -616,6 +616,18 @@ flag, e.g. <literal>--option gc-keep-outputs false</literal>.</para>
</varlistentry>


<varlistentry xml:id="conf-build-repeat"><term><literal>build-repeat</literal></term>

<listitem><para>How many times to repeat builds to check whether
they are deterministic. The default value is 0. If the value is
non-zero, every build is repeated the specified number of
times. If the contents of any of the runs differs from the
previous ones, the build is rejected and the resulting store paths
are not registered as “valid” in Nix’s database.</para></listitem>

</varlistentry>


</variablelist>

</para>
Expand Down
62 changes: 53 additions & 9 deletions src/libstore/build.cc
Expand Up @@ -791,13 +791,19 @@ class DerivationGoal : public Goal
temporary paths. */
PathSet redirectedBadOutputs;

/* Set of inodes seen during calls to canonicalisePathMetaData()
for this build's outputs. This needs to be shared between
outputs to allow hard links between outputs. */
InodesSeen inodesSeen;

BuildResult result;

/* The current round, if we're building multiple times. */
unsigned int curRound = 1;

unsigned int nrRounds;

/* Path registration info from the previous round, if we're
building multiple times. Since this contains the hash, it
allows us to compare whether two rounds produced the same
result. */
ValidPathInfos prevInfos;

public:
DerivationGoal(const Path & drvPath, const StringSet & wantedOutputs,
Worker & worker, BuildMode buildMode = bmNormal);
Expand Down Expand Up @@ -1238,6 +1244,10 @@ void DerivationGoal::inputsRealised()
for (auto & i : drv->outputs)
if (i.second.hash == "") fixedOutput = false;

/* Don't repeat fixed-output derivations since they're already
verified by their output hash.*/
nrRounds = fixedOutput ? 1 : settings.get("build-repeat", 0) + 1;

/* Okay, try to build. Note that here we don't wait for a build
slot to become available, since we don't need one if there is a
build hook. */
Expand Down Expand Up @@ -1420,6 +1430,9 @@ void replaceValidPath(const Path & storePath, const Path tmpPath)
}


MakeError(NotDeterministic, BuildError)


void DerivationGoal::buildDone()
{
trace("build done");
Expand Down Expand Up @@ -1519,6 +1532,15 @@ void DerivationGoal::buildDone()

deleteTmpDir(true);

/* Repeat the build if necessary. */
if (curRound++ < nrRounds) {
outputLocks.unlock();
buildUser.release();
state = &DerivationGoal::tryToBuild;
worker.wakeUp(shared_from_this());
return;
}

/* It is now safe to delete the lock files, since all future
lockers will see that the output paths are valid; they will
not create new lock files with the same names as the old
Expand Down Expand Up @@ -1552,6 +1574,7 @@ void DerivationGoal::buildDone()
% drvPath % 1 % e.msg());

st =
dynamic_cast<NotDeterministic*>(&e) ? BuildResult::NotDeterministic :
statusOk(status) ? BuildResult::OutputRejected :
fixedOutput || diskFull ? BuildResult::TransientFailure :
BuildResult::PermanentFailure;
Expand Down Expand Up @@ -1678,10 +1701,13 @@ int childEntry(void * arg)

void DerivationGoal::startBuilder()
{
startNest(nest, lvlInfo, format(
buildMode == bmRepair ? "repairing path(s) %1%" :
buildMode == bmCheck ? "checking path(s) %1%" :
"building path(s) %1%") % showPaths(missingPaths));
auto f = format(
buildMode == bmRepair ? "repairing path(s) %1%" :
buildMode == bmCheck ? "checking path(s) %1%" :
nrRounds > 1 ? "building path(s) %1% (round %2%/%3%)" :
"building path(s) %1%");
f.exceptions(boost::io::all_error_bits ^ boost::io::too_many_args_bit);
startNest(nest, lvlInfo, f % showPaths(missingPaths) % curRound % nrRounds);

/* Right platform? */
if (!canBuildLocally(*drv)) {
Expand All @@ -1693,6 +1719,7 @@ void DerivationGoal::startBuilder()
}

/* Construct the environment passed to the builder. */
env.clear();

/* Most shells initialise PATH to some default (/bin:/usr/bin:...) when
PATH is not set. We don't want this, so we fill it in with some dummy
Expand Down Expand Up @@ -1873,6 +1900,8 @@ void DerivationGoal::startBuilder()
PathSet dirs2 = tokenizeString<StringSet>(settings.get("build-extra-chroot-dirs", string("")));
dirs.insert(dirs2.begin(), dirs2.end());

dirsInChroot.clear();

for (auto & i : dirs) {
size_t p = i.find('=');
if (p == string::npos)
Expand Down Expand Up @@ -2582,6 +2611,11 @@ void DerivationGoal::registerOutputs()

ValidPathInfos infos;

/* Set of inodes seen during calls to canonicalisePathMetaData()
for this build's outputs. This needs to be shared between
outputs to allow hard links between outputs. */
InodesSeen inodesSeen;

/* Check whether the output paths were created, and grep each
output path to determine what other paths it references. Also make all
output paths read-only. */
Expand Down Expand Up @@ -2753,6 +2787,16 @@ void DerivationGoal::registerOutputs()

if (buildMode == bmCheck) return;

if (curRound > 1 && prevInfos != infos)
throw NotDeterministic(
format("result of ‘%1%’ differs from previous round; rejecting as non-deterministic")
% drvPath);

if (curRound < nrRounds) {
prevInfos = infos;
return;
}

/* Register each output path as valid, and register the sets of
paths referenced by each of them. If there are cycles in the
outputs, this will fail. */
Expand Down
14 changes: 11 additions & 3 deletions src/libstore/store-api.hh
Expand Up @@ -87,10 +87,17 @@ struct ValidPathInfo
Path deriver;
Hash hash;
PathSet references;
time_t registrationTime;
unsigned long long narSize; // 0 = unknown
time_t registrationTime = 0;
unsigned long long narSize = 0; // 0 = unknown
unsigned long long id; // internal use only
ValidPathInfo() : registrationTime(0), narSize(0) { }

bool operator == (const ValidPathInfo & i) const
{
return
path == i.path
&& hash == i.hash
&& references == i.references;
}
};

typedef list<ValidPathInfo> ValidPathInfos;
Expand All @@ -114,6 +121,7 @@ struct BuildResult
MiscFailure,
DependencyFailed,
LogLimitExceeded,
NotDeterministic,
} status = MiscFailure;
std::string errorMsg;
//time_t startTime = 0, stopTime = 0;
Expand Down

7 comments on commit 8fdd156

@copumpkin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Have you seen NixOS/nixpkgs#9731? My original idea was to have hydra run (at least some) packages meta.reproducible with --check, but this thing sounds better.

Was also thinking that it might be nice to try to maximize variation in builders across a fleet in Hydra. So if I have build-repeat 3 and a fleet of different build machines, hydra would run the same build on many of them. That might take more work than it's worth, though.

@edolstra
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought about having Hydra do a build in parallel on different machines, but that also multiplies the amount of I/O / network traffic, which is currently a limiting factor. So it's cheaper to run a build multiple times on the same machine.

@Ericson2314
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edolstra Maybe with you plan for the build machines to use cache.nixos.org directly (as described in mailing list), that would that be less of a problem? Would be nice to try on different machines long term.

@vcunat
Copy link
Member

@vcunat vcunat commented on 8fdd156 Nov 11, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I had tried, finding loads of non-deterministic packages is rather easy, especially if you disable chrooting, but fixing much of what you find is very time-consuming.

@domenkozar
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copumpkin
Copy link
Member

@copumpkin copumpkin commented on 8fdd156 Nov 11, 2015 via email

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vcunat
Copy link
Member

@vcunat vcunat commented on 8fdd156 Nov 11, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So do I. Such things always make me sad, and I never really understood why (the split).

Please sign in to comment.