Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly store the outputs of CA derivations − take 2 #4330

Merged
merged 3 commits into from Dec 14, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
18 changes: 18 additions & 0 deletions src/libstore/binary-cache-store.cc
Expand Up @@ -443,6 +443,24 @@ StorePath BinaryCacheStore::addTextToStore(const string & name, const string & s
})->path;
}

std::optional<const Realisation> BinaryCacheStore::queryRealisation(const DrvOutput & id)
{
auto outputInfoFilePath = realisationsPrefix + "/" + id.to_string() + ".doi";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does doi mean?

BTW this URL format means that there can be only one mapping per realisation, which is probably fine for a binary cache, but it does mean you can't take the union of two binary caches (e.g. a mirror of multiple upstream binary caches) - or at least not of their /realisations directory.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doi stands for “derivation output info” (it's indeed quite obscure).

Yes, I agree on the uniqueness constraint that might be somehow too strong. But I think we'll have time to rework this schema − it's experimental anyways, so nothing is set in any way, and it's not really clear to me atm how we could have a sensible implementation if we drop it.

auto rawOutputInfo = getFile(outputInfoFilePath);

if (rawOutputInfo) {
return {Realisation::fromJSON(
nlohmann::json::parse(*rawOutputInfo), outputInfoFilePath)};
} else {
return std::nullopt;
}
}

void BinaryCacheStore::registerDrvOutput(const Realisation& info) {
auto filePath = realisationsPrefix + "/" + info.id.to_string() + ".doi";
upsertFile(filePath, info.toJSON(), "application/json");
}

ref<FSAccessor> BinaryCacheStore::getFSAccessor()
{
return make_ref<RemoteFSAccessor>(ref<Store>(shared_from_this()), localNarCache);
Expand Down
7 changes: 7 additions & 0 deletions src/libstore/binary-cache-store.hh
Expand Up @@ -33,6 +33,9 @@ private:

protected:

// The prefix under which realisation infos will be stored
const std::string realisationsPrefix = "/realisations";

BinaryCacheStore(const Params & params);

public:
Expand Down Expand Up @@ -99,6 +102,10 @@ public:
StorePath addTextToStore(const string & name, const string & s,
const StorePathSet & references, RepairFlag repair) override;

void registerDrvOutput(const Realisation & info) override;

std::optional<const Realisation> queryRealisation(const DrvOutput &) override;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a std::vector<Realisation> since there could be multiple realisations for a particular derivation+output-name. Even if we don't really support multiple realisations at the moment, it's probably good to express the possibility in the interface. When we need a particular realisation for a dependency (when building) we can just pick the first one for now. But in the future we could have an algorithm that tries to pick realisations that minimise the amount of closure rewriting needed to get a consistent input closure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're okay with it, I'll rather just change the daemon protocol to allow passing a list as it's the only place in which we can't break backwards compat for free, and it makes more sense to me to make it clear that we can only return one realisation for the time being as long as there's nothing preventing us from changing that in the future


void narFromPath(const StorePath & path, Sink & sink) override;

BuildResult buildDerivation(const StorePath & drvPath, const BasicDerivation & drv,
Expand Down
27 changes: 22 additions & 5 deletions src/libstore/build/derivation-goal.cc
Expand Up @@ -493,8 +493,9 @@ void DerivationGoal::inputsRealised()
if (useDerivation) {
auto & fullDrv = *dynamic_cast<Derivation *>(drv.get());

if ((!fullDrv.inputDrvs.empty() && derivationIsCA(fullDrv.type()))
|| fullDrv.type() == DerivationType::DeferredInputAddressed) {
if (settings.isExperimentalFeatureEnabled("ca-derivations") &&
((!fullDrv.inputDrvs.empty() && derivationIsCA(fullDrv.type()))
|| fullDrv.type() == DerivationType::DeferredInputAddressed)) {
/* We are be able to resolve this derivation based on the
now-known results of dependencies. If so, we become a stub goal
aliasing that resolved derivation goal */
Expand Down Expand Up @@ -2094,6 +2095,20 @@ struct RestrictedStore : public LocalFSStore, public virtual RestrictedStoreConf
/* Nothing to be done; 'path' must already be valid. */
}

void registerDrvOutput(const Realisation & info) override
{
if (!goal.isAllowed(info.id.drvPath))
throw InvalidPath("cannot register unknown drv output '%s' in recursive Nix", printStorePath(info.id.drvPath));
next->registerDrvOutput(info);
}

std::optional<const Realisation> queryRealisation(const DrvOutput & id) override
{
if (!goal.isAllowed(id.drvPath))
throw InvalidPath("cannot query the output info for unknown derivation '%s' in recursive Nix", printStorePath(id.drvPath));
return next->queryRealisation(id);
}

void buildPaths(const std::vector<StorePathWithOutputs> & paths, BuildMode buildMode) override
{
if (buildMode != bmNormal) throw Error("unsupported build mode");
Expand Down Expand Up @@ -3391,9 +3406,11 @@ void DerivationGoal::registerOutputs()
drvPathResolved = writeDerivation(worker.store, drv2);
}

if (useDerivation || isCaFloating)
for (auto & [outputName, newInfo] : infos)
worker.store.linkDeriverToPath(drvPathResolved, outputName, newInfo.path);
if (settings.isExperimentalFeatureEnabled("ca-derivations"))
for (auto& [outputName, newInfo] : infos)
worker.store.registerDrvOutput(Realisation{
.id = DrvOutput{drvPathResolved, outputName},
.outPath = newInfo.path});
}


Expand Down
11 changes: 11 additions & 0 deletions src/libstore/ca-specific-schema.sql
@@ -0,0 +1,11 @@
-- Extension of the sql schema for content-addressed derivations.
-- Won't be loaded unless the experimental feature `ca-derivations`
-- is enabled

create table if not exists Realisations (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word "realisations" seems a bit confusing. What does it realise? In Nix, "realise" is already a vague and overloaded term for "substituting or building a derivation". There is also a terminology mismatch between registerDrvOutput and queryRealisation - shouldn't that be registerRealisation or queryDrvOutput?

How about CADerivers or CADerivationOutputs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's indeed a terminology mismatch and we need to settle on one thing.

I really like Realisation tbh because these are precisely witnesses that a derivation has been realised (for the definition you give above). But it's indeed a quite overloaded term so if you don't like it we can find something else.

I tend to use DerivationOutput only for the “symbolic” foo.drv!out because it's also a very overloaded term and it's easy to get lost if we abuse of it.

(also, they aren't really Derivers (though they could be used as such) because the intent is to associate derivation outputs (foo.drv!out) to their output paths and not the other way around)

TL;DR, I'm fine changing the naming, but for my own sanity (and other's too) I think we need to find clear and distinct names for

  1. foo.drv!out
  2. The relation (foo.drv!out, /nix/store/123-foo-out)

drvPath text not null,
outputName text not null, -- symbolic output id, usually "out"
outputPath integer not null,
primary key (drvPath, outputName),
foreign key (outputPath) references ValidPaths(id) on delete cascade
);
22 changes: 22 additions & 0 deletions src/libstore/daemon.cc
Expand Up @@ -868,6 +868,28 @@ static void performOp(TunnelLogger * logger, ref<Store> store,
break;
}

case wopRegisterDrvOutput: {
logger->startWork();
auto outputId = DrvOutput::parse(readString(from));
auto outputPath = StorePath(readString(from));
auto resolvedDrv = StorePath(readString(from));
store->registerDrvOutput(Realisation{
.id = outputId, .outPath = outputPath});
logger->stopWork();
break;
}

case wopQueryRealisation: {
logger->startWork();
auto outputId = DrvOutput::parse(readString(from));
auto info = store->queryRealisation(outputId);
logger->stopWork();
std::set<StorePath> outPaths;
if (info) outPaths.insert(info->outPath);
worker_proto::write(*store, to, outPaths);
break;
}

default:
throw Error("invalid operation %1%", op);
}
Expand Down
3 changes: 3 additions & 0 deletions src/libstore/dummy-store.cc
Expand Up @@ -60,6 +60,9 @@ struct DummyStore : public Store, public virtual DummyStoreConfig
BuildResult buildDerivation(const StorePath & drvPath, const BasicDerivation & drv,
BuildMode buildMode) override
{ unsupported("buildDerivation"); }

std::optional<const Realisation> queryRealisation(const DrvOutput&) override
{ unsupported("queryRealisation"); }
};

static RegisterStoreImplementation<DummyStore, DummyStoreConfig> regDummyStore;
Expand Down
4 changes: 4 additions & 0 deletions src/libstore/legacy-ssh-store.cc
Expand Up @@ -333,6 +333,10 @@ struct LegacySSHStore : public Store, public virtual LegacySSHStoreConfig
auto conn(connections->get());
return conn->remoteVersion;
}

std::optional<const Realisation> queryRealisation(const DrvOutput&) override
// TODO: Implement
{ unsupported("queryRealisation"); }
};

static RegisterStoreImplementation<LegacySSHStore, LegacySSHStoreConfig> regLegacySSHStore;
Expand Down
1 change: 1 addition & 0 deletions src/libstore/local-binary-cache-store.cc
Expand Up @@ -87,6 +87,7 @@ class LocalBinaryCacheStore : public BinaryCacheStore, public virtual LocalBinar
void LocalBinaryCacheStore::init()
{
createDirs(binaryCacheDir + "/nar");
createDirs(binaryCacheDir + realisationsPrefix);
if (writeDebugInfo)
createDirs(binaryCacheDir + "/debuginfo");
BinaryCacheStore::init();
Expand Down