Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC 0005] Nix encryption #5

Closed
wants to merge 6 commits into from
Closed

Conversation

@edolstra
Copy link
Member

@edolstra edolstra commented Mar 28, 2017

A proposal for supporting secrets in the Nix store.

Rendered: https://github.com/edolstra/rfcs/blob/nix-encryption/rfcs/0005-nix-encryption.md

edolstra added 4 commits Mar 28, 2017
Copy link
Member

@zimbatm zimbatm left a comment

Well written and thought out first draft 👍


* A command `nix decrypt` to decrypt files containing encrypted
strings produced by `encryptString`. This command searches for
`<{|nixcrypt:...|}>` fragments and decrypts them using a decryption

This comment has been minimized.

@zimbatm

zimbatm Mar 28, 2017
Member

Has <{|nixcrypt: been selected to make it easier to do find-and-replace at the decryption phase or are there other considerations? I'm wondering if nixcrypt::<base64-salt> could be enough. It's a bit of a detail though.

passing something like `--config /nix/store/.../foo.conf` to the
daemon. Instead, we have to decrypt the configuration file to some
suitably secure temporary location in a pre-start script, and then
pass that location (e.g. `--config /tmp/.../foo.conf`).

This comment has been minimized.

@zimbatm

zimbatm Mar 28, 2017
Member

The biggest drawback here is if another file depends on that file.

{
nginxConf = writeFile "nginx.conf" "include ${vserverConf};";
vserverConfg = writeFile "vserver.conf" " my secret is here ";
}

Would it be possible to taint the derivation output with an "encrypted" attribute and then have a mechanism that would prevent referencing it? (unless if it's to decrypt it) The tainting could also be used as a signal to not push it to public stores.

This comment has been minimized.

@teh

teh Mar 31, 2017

I agree. We often have chains of config files where changing a leaf bubbles up correctly all the way to the top thanks to nix. I'm not sure how many of our files contain secrets though. So this might actually be an OK price to pay in practice.

# Decrypt the configuration file.
${config.nix.package}/bin/nix-store \
--decrypt ${toString <nixos-store-key>} \
${configFile} > /run/secret/foo.conf

This comment has been minimized.

@zimbatm

zimbatm Mar 28, 2017
Member

Since the configuration has to be generated at runtime, what difference does it make to get the secret from the nix store or another source like hashicorp vault, /run/keys/* or an environment variable?

This comment has been minimized.

@arianvp

arianvp Mar 29, 2017
Member

Exactly my thought. Currently, nixops supports delivering keys out of band a well and saves them in /run/keys . I think the actual hard part is not solved by this RFC as it gives the same level of flexibility as what nixops currently already delivers! The real hard part is that we need to rewrite dozens of NixOS modules that directly store secrets such that they can retrieve their secrets from disk. And we should discourage or maybe totally deprecate any kind of other usage ...

Perhaps a larger part of the proposal isn't actually the encryption builtin, but actually rewriting modules in such way that they support this paradigm (and the nixops / vault paradigm).

Actually, I am in favor to make the nixops feature more easy to use instead of introducing an encryption primitive. I think it clashes too much with reproducibility to store nonced encrypted blobs in the nix store.

This comment has been minimized.

@zimbatm

zimbatm Mar 29, 2017
Member

There are two things here I think: (1) an interface to declare that the secret should exist (or other kind of states actually). Potentially it could also have an initialization script if the secret can be dynamically generated. (2) a way to compose existing nix derivations with the state.

For (2) it would be cool if it integrated with the language. Potentially we could have a subset of nix that can be evaluated at runtime.

This comment has been minimized.

@danbst

danbst Apr 1, 2017

Will FUSE wrapper on encrypted files solve this problem of requiring rewrite of modules?

This comment has been minimized.

@Gabriel439

Gabriel439 Jun 6, 2017

@zimbatm @arianvp: My understanding is that the main benefit of this approach is that some programs might not be equipped to read secrets from a file (for example, a configuration file that only accepts a secret as an inline string field). I don't know any such programs off the top of my head that are like this, though, but I guess it could happen. Are there any concrete examples where this is the case?

However, I that most programs can accept secrets from protected files outside the /nix/store and we could easily fix most NixOS services to use this approach today without any changes to Nix.


The assumption is that `<nixos-store-key>` resolves to something like
`/var/lib/nixos/store-key`, which should contain a key generated by
`nix generate-key` and should obviously be readable by root only.

This comment has been minimized.

@zimbatm

zimbatm Mar 28, 2017
Member

Let's say I have 3 different servers with different configs and a Hydra that pushes their config to a binary cache. Wouldn't one compromised server be able to read all the server's config from the binary cache?

This comment has been minimized.

@Gabriel439

Gabriel439 Jun 6, 2017

@zimbatm: I believe this is safe in the presence of both binary caches and distributed builds.

For binary caches, the actual cached configuration only stores encrypted secrets, so even if you mirror the configuration to other /nix/stores they won't be able to decrypt these secrets unless they have access the same key.

For distributed builds, files on the NIX_PATH are not copied to distributed build slaves, so if the <nixos-store-key> path is missing on the build slave machine(s) then the build will just fail.

[summary]: #summary

We currently lack a way to store secret information in the Nix
store. The proposal is to add add a builtin function to Nix to allow

This comment has been minimized.

@arianvp

arianvp Mar 29, 2017
Member

s/add add/add

This comment has been minimized.

@zimbatm

zimbatm Aug 31, 2017
Member

thanks, fixed in e014c40

@domenkozar
Copy link
Member

@domenkozar domenkozar commented Apr 3, 2017

I really like the proposal, it doesn't fight Nix design but rather embraces it.

My biggest concern is usability. I'd like for us to have more conventions over configurations where possible.

So encryptString first argument would accept the name of the key to be used, for example:

builtins.encryptString "nixpkgs" cfg.password

Then in nixpkgs we would have encryptString = builtins.encryptString "nixpkgs" to use one key for all encrypted strings.

At runtime we could do:

nix generate-key <key-name> where key-name is required
and nix decrypt <key-name> where `key-name* is optional and if not given it would decrypt all encrypted keys in the store.

The keys would get stored in /nix/var/nix/keys/<key-name>.

I agree with @teh as with impure derivations, nothing would be able to reference the encrypted store path.


We could replace the nonce with a deterministic value, but it's not
entirely clear what the cryptographic implications are. At the very
least, it allows attackers to obverse that a secret has changed, or

This comment has been minimized.

@taktoa

taktoa Apr 3, 2017

typo: s/obverse/observe/g

@zimbatm zimbatm changed the title Nix encryption [RFC 005] Nix encryption Apr 4, 2017
@edolstra edolstra changed the title [RFC 005] Nix encryption [RFC 0005] Nix encryption Apr 10, 2017
configFile = pkgs.writeText "foo.conf"
''
# Store the password of the foo service in encrypted form.
password=${builtins.encryptString <nixos-store-key> cfg.password}

This comment has been minimized.

@steveeJ

steveeJ Apr 12, 2017

My understanding is that cfg.password is cleartext, but it's not clear to me where it is defined and/or how the user will provide this.

This comment has been minimized.

@Gabriel439

Gabriel439 Jun 6, 2017

I'm aware of two solutions to providing the password:

  • Approach 1: The user does not supply a secret at all and the derivation auto-generates the secret at startup. The secret is generated so that only root can access it and the user has to ssh into the machine as root to obtain the credential if they need it to log in.
  • Approach 2: The user logs into the machine after deploy and deposits the secret in a preconfigured location by the user after deploy and the program/service waits or fails until the secret is available.
@taktoa
Copy link

@taktoa taktoa commented Apr 12, 2017

I think it might be a good idea to have an optional libsecret backend for secret storage, since it seems to be the current state of the art library for secret storage across various desktop environments. Of course, since glib is in the dependencies of libsecret, we would not want Nix to necessarily depend on it, but there are a few ways to avoid this issue:

  1. Nix could delegate secret retrieval to a subprocess, the same way git credential currently works, so you could plug in any backend you want. One point in favor of this idea is that this is probably also just good hygiene since processes have isolated memory spaces (and there could be bugs in Nix).
  2. We could use conditional compilation to avoid the dependency on libsecret whenever we are bootstrapping nix, but enable the libsecret backend in all other cases, so that e.g.: there would be a nixBootstrap attribute and a nixFull attribute.

Having libsecret support would make it impossible to successfully evaluate a Nix expression that uses it until the keyring is unlocked. This may or may not be desirable.

@domenkozar
Copy link
Member

@domenkozar domenkozar commented Apr 14, 2017

How would NixOps integration look like? Run nix decrypt before activation, using supplied password from local machine?

''
# Decrypt the configuration file.
${config.nix.package}/bin/nix-store \
--decrypt ${toString <nixos-store-key>} \

This comment has been minimized.

@michalrus

michalrus May 2, 2017
Member

If somebody has access to the file system (before this solution), they also have access to /proc and can read this argv, stealing the key…

This comment has been minimized.

@edolstra

edolstra May 2, 2017
Author Member

No, because this passes the path to the key, not the key itself.

carefully audit that nothing in the daemon can leak the contents of
store paths.

* Users should be allowed to export closures that they built, even

This comment has been minimized.

@FRidh

FRidh May 6, 2017
Member

Perhaps instead of checking the ACL's of the store paths, use the evaluator to determine whether the user can copy the closure?

This comment has been minimized.

@FRidh

FRidh May 6, 2017
Member

Which would require tracking the original expressions or .drv's which we do not do.

@eternaleye
Copy link

@eternaleye eternaleye commented May 6, 2017

So, this was brought up in the #rust-crypto IRC channel (on moznet), and I figured I'd reproduce what I said here.

<eternaleye> If [you] want to do reproducible encryption, then you need a way of either making it so the nonce being repeated doesn't matter (MRAE), or you need a way to ensure that the nonce, while deterministic, is never the same for distinct AD/plaintext tuples (which one does by way of an SIV-like construct, which... is an MRAE.)
<eternaleye> So I'd suggest looking into AES-SIV and HS1-SIV
<eternaleye> Neither of which, sadly, is provided by libsodium
<eternaleye> You may be able to build HS1-SIV using what's in libsodium? But I'd be wary of doing so.

The issue here is that crypto_secretbox (like most other AEADs) fails very very badly if a nonce is ever repeated with a different AD and/or message. AEADs that do not fail in this manner (and instead meet the optimal security of deterministic encryption[1]) are called "Misuse-Resistant", or MRAE.

This definition can be found in (and is expanded upon by) several papers, most involving Rogaway. These include the initial paper on SIV, the paper that introduced the AEZ algorithm, a paper describing strong online-encryption constructions built upon MRAE, and a paper devoted entirely to the definition and its properties.

By comparison, crypto_secretbox is based on a stream cipher using the nonce directly as an IV; as a result, there is a loss of confidentiality if a nonce is reused - specifically, the XOR of the plaintexts of the two messages leaks.

Similarly, it uses Poly1305 as its authenticator, which relies on nonce uniqueness. As a result, integrity also suffers if nonces are reused.

HS1-SIV is an AEAD that meets the MRAE security definition, and is built from ChaCha20 and Poly1305, so if you trust DJB's cryptographic primitives, it's a reasonable option. It was submitted to the CAESAR competition; a detailed description can be found here.

Alternately, AES-SIV relies on nothing except the AES block cipher itself. As a result, rather than relying on two primitives, one can rely on only a single primitive. In some cases, this might be a better tradeoff. A detailed description can be found in RFC 5297; it uses CMAC internally, which is specified in RFC 4493.

In short, by using an MRAE, Nix can achieve both the goal of secure, encrypted secret data in the store and bitwise-deterministic outputs.

[1] Optimally, there is no loss of integrity regardless of any properties of the (nonce, AD, message) tuple, there is no loss of confidentiality so long as the same (nonce, AD, message) tuple is not encrypted multiple times, and the only loss of confidentiality in that case is the one-bit information leak "these two messages were identical."

@Ekleog
Copy link
Member

@Ekleog Ekleog commented May 7, 2017

Ok, so... First, disclaimer: I'm not a cryptographer.

That said, after looking a bit further into SIV via http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/siv/siv.pdf , it looks like it's possible to implement (a variant of) it in a simple way by using the fact that we need no associated cleartext data along with the encrypted data: in pseudocode,

encrypt(key1, key2, value) =
    let nonce = MAC(key1, value) in
    nonce || CIPHER(key2, value, nonce)
decrypt(key1, key2, value) =
    let nonce || cipher = value in
    let clear = DECIPHER(key2, cipher, nonce) in
    if MAC(key1, value) != nonce then
        FAIL()
    else
        clear

(note that using this is not optimal from a performance point of view, as CIPHER will likely store the nonce someplace we could reuse it; real SIV uses it only as an IV that can be checked then but this would require internal knowledge of the encryption algorithms, thing that libsodium doesn't provide us with)

The thing to note is that there is no real violation of (DE)CIPHER's requirement for a nonce is not reused twice: it will be reused multiple times, but always with the same keys and values, which means that it will necessarily return the same result, leaking no information about the cleartext.

Now, the issue remaining is what the paper I linked uses as MAC, CIPHER and DECIPHER operations: they use MAC(k, v) = AES-CMAC(k, C_k ^ v) (with v padded with 10* if need be and C_k a constant depending on k), and (DE)CIPHER that are AES-CTR.

Here is the leap of faith: I think implementing a SIV mode of operation using MAC(k, v) = Truncate-192(HMAC-SHA256(k, v)) and (DE)CIPHER that are XSalsa20 (that is, the primitives libsodium offers) should not be an issue, as the overall idea of the algorithm is preserved, only the MAC and (DE)CIPHER primitives are changed.
The reason why I am saying this is a leap of faith is that its security is based on my understanding of https://www.iacr.org/archive/eurocrypt2006/40040377/40040377.pdf , which defines the requirements of MAC and (DE)CIPHER for SIV mode.
From what I understand, the requirements are that MAC be a PRF from vectors of bitstrings (OK for Truncate-192(HMAC-SHA256(k, v)) as we know all our vectors are going to be exactly 1 bitstring, based on the fact we have no header), and (DE)CIPHER be an IV-based encryption scheme.
This second point is not really verified: technically, the nonce in XSalsa20 is not an IV. Now, I can't see in the proof of correctness where this assumption is used, then I may be wrong, especially since I haven't had time to read and understand the whole paper and every argument in it.

Now, to what attacks does such a scheme open way?

First one and most obvious: as nix provides an encryption oracle (by building a derivation containing to-be-encrypted data as a user), to attacks bruteforcing the content of the secret and comparing returned values. However, this is actually required by any deterministic encryption scheme we could put in place. Solutions to mitigate this could be either to use one (bunch of) symmetric key(s) per user (the preferred in my opinion, but most likely way harder to implement) or to rate-limit encryption to a point where bruteforcing is not an issue.

Second one is that it's possible to know when the same thing is encrypted twice. It is possible to mitigate this by adding some additional data in the encryption process, like the filename wherein the encrypted data will reside and the index of the encrypted data inside the file. Or just encrypt entire files at once, then the leaked information would only be whether two files are exactly equal.

Third, as the encryption process becomes fully deterministic, side-channel attacks may become easier to accomplish. I don't see any way of mitigating this, as it is an issue naturally associated with side-channel attacks.

Fourth, I may have completely misunderstood https://www.iacr.org/archive/eurocrypt2006/40040377/40040377.pdf and told complete nonsense.

As this is likely to be way easier to implement than a full-featured HS1-SIV (which, as I learned recently, was rejected after round2 of CAESAR, despite the judges not giving any compelling argument against its use) using only libsodium's primitives, I hope this may help!

@paragonie-scott
Copy link

@paragonie-scott paragonie-scott commented May 7, 2017

I think I'm missing something: Why do you want reproducible encryption?

Is this for unit tests? If so, that can be handled totally separately. Your nonce can be a counter (i.e. start with \x00\x00\x00... then \x01\x00\x00... then \x02\x00\x00 using sodium_increment(). If you have the same nonce-key pair, encryption becomes deterministic.

If you're concerned about weak RNGs, here's a simpler idea:

  1. Calculate crypto_generichash() of the message.
  2. Feed the hash into the kernel's entropy pool as an additional seed. (Write it to /dev/random.)
  3. Just use a random nonce as you normally would.

If your RNG is weak, you'll get NMR by virtue of the RNG state being seeded by the hash of the message. If the RNG is strong, feeding the hash of the message into your RNG doesn't harm anything. If you're concerned about the RNG leaking data about your message, you're only feeding a hash of the message and not the message itself.

A lot of these SIV schemes add a lot of additional complexity (and virtually lock you into violating the cryptographic doom principle) for a small win only in corner cases.

@grahamc
Copy link
Member

@grahamc grahamc commented May 7, 2017

For context I've solicited cryptography experts to review this proposal, and specifically asked my friend @paragonie-scott to take a look. @paragonie-scott, elsewhere in Nix is a strong desire and effort to have bitwise reproducibility. If that is at odds with having our mechanism remain secure, then it would be good to know that early.

Would it be a bad idea to use a hash of the secret input to create & cache the store path?

```

The above can also be extended to support access by non-root users,
e.g.

This comment has been minimized.

@grahamc

grahamc May 7, 2017
Member

This seems to be a critically important component, so individual services don't need (poorly) re-implement this behavior.

This comment has been minimized.

@nbp

nbp May 26, 2017
Member

I agree, and ideally this would need some way to get a path for each configuration file. For security reasons systemd services can switch users, and we would want these files to be read only by these users.

I would prefer to avoid FUSE as the first iteration, as this would imply trusting more code than necessary. (we are talking about the manipulation of secrets)

I think we could have something like:

{ security.encryptedFiles.file = {
    user = "foo"; /* uses users.users.<name>.uid or assert if missing */
    file = ...;
  };
  systemd.services.wpa_supplicant.serviceConfig.Exec =
     "wpa_supplicant -c ${security.encryptedFiles.file}";
}

The problem of this scheme is that the author of the module would have to think ahead that the file has to be secure.

For example, simple configuration can provide a default configuration based on pam, but leave complex option which could be provided via extraConfig to set password in the configuration files. In such case we might want as a user to retrofit security in an expression which did not expected it first.

To handle such cases, I think we should make a submodule which is global to NixOS to let people register secure files anywhere, by adding an option type which would be a submodule for declaring files. This submodule would then be in charge of finding the proper renaming of the files path, and these entries could be listed in the security.encryptedFiles option to let the activation script decipher these files as root, before changing the ownership. Thus a service would look like:

{ lib, config, ... }:
{ options.services.foo = {
    enable = ...;
    extraConfig = ...;
    configFile = lib.mkOption { type = lib.types.file; };
  };
  config = with config.services.foo; lib.mkIf enable {
    security.secureFiles = [ configFile ]; # Can be added by another module
    systemd.services.foo.serviceConfig.Exec =
       "command --conf ${configFile}";
    services.foo.configFile.content = extraConfig;
  };
}

Doing it that way, ensure that the security.maybeSecureFiles can be registered by anybody who needs to set the services.foo.configFile.secure = true;. Thus, the submodule lib.types.file will contain the logic to translate any file from a store path to a /var or /tmp path if the file has the secure flag. The decoding would be handled by the security.secureFiles option, which would decipher the files as root, and change the ownership as part of the activation script of NixOS.

Thus, instead of forcing something special for secure files only, we would force something special for files in general, which would make it less special, and easier to extend in a modular fashion.

```
security.encryptedFiles = [
{ file = httpdConf;
owner = "httpd";

This comment has been minimized.

@grahamc

grahamc May 7, 2017
Member

maybe group (string, default null) and groupReadable (boolean, default false) as options too.

@eternaleye
Copy link

@eternaleye eternaleye commented May 7, 2017

@Ekleog

First things first, "implementing a variant of SIV" is a rather fraught exercise. Google has been attempting this, with AES-GCM-SIV, and the result actually has significantly weaker security bounds than the AES-SIV construction they have based it on.

Second, the assumption that you "need no authenticated cleartext data" may not be true. The section of the RFC talking about reproducible encryption raises the question of leaking information about rollbacks; this can be mitigated by including more information in the authenticated data - because if anything in the AD or plaintext changes, the no-leaks property kicks in.

Third, your framing of what is done by SIV is not entirely correct. It omits the dbl operation, which while simple, is cryptographically important. I suspect your proposed variant has cryptographic weaknesses as a result.

I'm curious as to what side-channel attacks you feel might become relevant, as any reasonable implementation of AES-SIV or HS1-SIV will be constant-time.

Fourth, HS1-SIV was dropped from CAESAR round 3, yes. Given that asking the author if he had been told why resulted in being informed that he had not been told of any actual issues with the cipher mode, but that it had rather not been innovative enough (and was instead too well-trodden ground), this has caused me to change my view of CAESAR (to an "exploratory" or "experimental" competition rather than "select a future standard") somewhat more than it has changed my view of HS1-SIV.

@paragonie-scott

Reproducible encryption is valuable because it fits into the broader reproducibility story of Nix. In addition, MRAE algorithms are the best-in-breed option security-wise, which provably satisfy the maximal security properties possible for deterministic encryption, given the security of the underlying primitives. As a result, using them is prudent anywhere they can be used (i.e., where streaming/online encryption/decryption are not necessary. For that, Rogaway's related paper provides optimal security.)

In addition, the cryptographic doom principle is a heuristic. Moreover, it is one that applies when stronger results are not available. The security reduction from AES-SIV to AES does not contain assumptions that are subject to being invalidated by the cryptographic doom principle.

Finally, the mitigation you propose is suboptimal - in particular, it still uses a cipher mode that does not meet the optimal security guarantees, and if you don't want deterministic encryption with an MRAE, you can simply... include random data, sourced however you want (such as from /dev/[u]random) as AD. Because of the optimality properties of MRAE, this will effectively randomize the entire ciphertext.

And no, your assertion about getting NMR from seeding with the hash even if the RNG is weak does not hold. There are many forms of weakness in RNGs, including some that converge to a small set of output values. The Debian weak-keys bug was a case of this. In such a case, the problem isn't with the seeding of the RNG, it's that it can never output the full range of values no matter what the seed,.

@Ekleog
Copy link
Member

@Ekleog Ekleog commented May 7, 2017

@eternaleye

About the dbl operation, I guess you are taking it from http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/siv/siv.pdf . This operation is included in the C_k I talked about (because t = 0 so m = 1), and is thus a no-op when there is no AD (xor-ing with a k-dependent constant should have no importance when the result immediately goes into a PRF). What's more, https://www.iacr.org/archive/eurocrypt2006/40040377/40040377.pdf (the actual SIV paper) makes mention of it only in section 4 "Enriching a PRF to take Vectors of Strings as Input" if I read correctly, which is not needed thanks to the absence of AD in the use nix would make of SIV.

As for your proposal of actually using AD for including random data, then it's just as good to use a non-MRAE non-deterministic scheme, as nix is not lacking any entropy.

In my opinion there are two possibilities.

  1. Either use deterministic encryption, which gives bitwise reproducibility but loses the property that identical cleartexts have identical ciphertexts (by design).

  2. Or do not, and forget about all that stuff of SIV, just use crypto_secretbox_easy.

The question of how to implement deterministic encryption is another issue, important only if choice 1 is picked, and is limited mostly by the primitives libsodium offers, if there is no will to use another library. At best a library implementing AES-SIV (or HS1-SIV or whatever) would be used, but even openssl doesn't seem to offer an implementation of it. So the solution I proposed is a stop-gap solution, that achieves far less than SIV (by not proposing associated data that could be used as a nonce, basically not proposing MRAE but only deterministic encryption), but should be far easier to implement. It can also avoid the "cryptographic doom principle" by simply using AE like XSalsa20-Poly1305 instead of just XSalsa20, by simply changing the encryption algorithm, and thus checking the MAC twice (because why not? performance is not really an issue here afaict). That said, I'm not sure it's OK, especially due to the fact XSalsa20 is not an IV-based encryption but a nonce-based encryption, which means all the requirements of https://www.iacr.org/archive/eurocrypt2006/40040377/40040377.pdf are not strictly-speaking met, but libsodium doesn't have an implementation of AES...).

@eternaleye
Copy link

@eternaleye eternaleye commented May 7, 2017

@Ekleog

I'm not necessarily proposing that AD be used to include random data. Rather, by including more data besides just the secret, it acts as a confounder against rewind leakage. If any of the other data has changed in the meantime, the AD would change, changing the ciphertext - preventing the rewind from being visible. This preserves determinism (it just increases how many inputs the ciphertext depends on), but reduces the risk.

@Ekleog
Copy link
Member

@Ekleog Ekleog commented May 7, 2017

@eternaleye Point made, indeed! I was thinking of encrypting entire files at once to mitigate that threat by just including all relevant data in the encrypted data. Using AD has the advantage of spending less time encrypting stuff, at the drawback of requiring a significantly more complex implementation... if it's possible to find an enterprise-grade implementation of SIV that's clearly better :)

@Mic92 Mic92 mentioned this pull request Jan 16, 2018
2 of 8 tasks complete
@dasJ
Copy link
Member

@dasJ dasJ commented Apr 23, 2018

Bumping this RFC up again because I also really need it. Why is this PR stalled? What is the process to continue? It's still tagged as "status: draft" so what are the changes @edolstra still needs to make? Or is there already a mailing-list discussion that's not linked here? Is there any place more active where people actually talk about this PR?

@Scriptkiddi
Copy link

@Scriptkiddi Scriptkiddi commented Apr 27, 2018

I was also wondering what needs to be done here? Can I do anything to help?

@Ekleog
Copy link
Member

@Ekleog Ekleog commented Apr 27, 2018

So. I've just re-read through the whole discussion, and here are, I think, all the points that have been raised and not answered yet (and that are relevant to the RFC in itself, as discussion has strayed a bit at times). Sorry if I missed something, please up it.

Summary

  1. @zimbatm (here) proposed that encrypted files are tagged so they cannot be used except for being decrypted, so that an encrypted file could not be referenced in an unencrypted file, in order to reduce the likelihood of misconfiguration.
  2. A point I just noticed: if <nixos-store-key> is a path, it means "${<nixos-store-key>}" would put the key into the store. Would it be possible to make it a string, in order to reduce the likelihood of this happening?
  3. @domenkozar (here) proposed that keys are referred to by name -- which would also solve the issue just above.
  4. @taktoa (here) noticed a typo: s/obverse/observe/ line 245
  5. The question of reproducible encryption ended up with the answers that there are cryptographic solutions to do it, but they are not easy-to-use in libsodium. The question of whether we would want it or not anyway stays open.
  6. @0xABAB and @Nadrieril (here) raised the questions of the possibility to extend the cryptographic primitive, as primitives come and go, and having an easy way to replace them is often good engineering.

Additionally, @nbp (here) started thinking about how to integrate this into nixpkgs. I don't think it's necessarily an important point to debate right now, but it will sure become after this RFC has been completed, so putting it here so as not to forget it.

So could we just try to get to an agreement on these 6 points, merge the RFC, maybe adapt a bit the current implementation, and finally have a way to remove some passwords for the nix-store? :)

@Ekleog
Copy link
Member

@Ekleog Ekleog commented Apr 27, 2018

So now my opinions about these 6 points:

  1. I think this is not required for a first iteration. At worst, what will happen is that modules badly written will try to start with encrypted data, and will fail. That's not a security issue, and won't affect modules that were working before. So I don't think that's an urgent issue, contrarily to finally getting an encryption builtin.
  2. I think the idea of referring to keys by name would solve this issue.
  3. Well, same as 2.
  4. Simple typo, I don't think there's much space left for debate.
  5. I don't think we need deterministic encryption for the first iteration. It'd sure be a nice-to-have, but there is no need like there is a need for at least encryption. And with extensible cryptographic primitives, it's easy to add it afterwards, so I'd say let's postpone this debate for later.
  6. For extensibility of the cryptographic primitive, I think this RFC as defined implicitly caters for it. Indeed, changing the cryptographic primitive will require changing the key. So the choice of cryptographic primitive could just be embedded into the key file, in a format maybe like (for the first libsodium-using iteration): chacha20-poly1305:[base64-encoded key] (note that this is only for illustration purposes, as I don't think the RFC needs to describe this format). This way the implementation would be easily extensible, with exactly the same syntax for the user.

So, to me, the required changes in the current RFC are:

  • s/obverse/observe/ line 245
  • Change the encryption primitive to take, instead of the path, a name (that will be looked for under $NIX_STATE_DIR/keys/[key_name])
  • Write anywhere that extensibility of the cryptographic primitive will be handled by changing the format of the key.
@lschuermann
Copy link
Member

@lschuermann lschuermann commented Sep 30, 2018

I'm not quite sure whether this PR already implies this:

I think it's quite important to not only encrypt secrets using builtins.encryptString, but to also provide a utility for encrypting before evaluating / building the config. This could then allow using the ciphertext in the configuration itself. In addition to having them encrypted in the nix-store, this would allow me to push my config to a public version control, without exposing all of my secrets. (As far as I know a pretty common issue)

This could then go along with the ability to copy a key to another machine, and being able to use multiple keys (addressable by name, etc.). For example, multiple machines could share the WiFi configuration + WiFi secrets encryption key.

I'm already doing something similar with openssl (still world-readable in the nix store), but it would be much better if this were somehow integrated.

@Ekleog
Copy link
Member

@Ekleog Ekleog commented Oct 1, 2018

@lschuermann This would be solved by doing nix repl then builtins.encryptString <nixos-key> "mysecret", and using the output of this command instead of the value.

However, your message is important in that it reminds us that in nixpkgs, we must not only accept eg. a password argument that will be transparently encrypted, but we must also support a encryptedPassword that'd assume encryption has already been performed.

tl;dr: I don't think there's anything to change in the RFC for this, but it'll be good to keep in mind when adapting modules to the new system :)

@Ekleog
Copy link
Member

@Ekleog Ekleog commented Oct 5, 2018

@edolstra Could you handle the 3 concerns pointed out in #5 (comment) and merge this RFC? I think no disagreement and a few +1's after ~5/6 months is enough to say that everyone agree. Or do you want me to do a PR to your RFC branch with the changes?

@Ekleog
Copy link
Member

@Ekleog Ekleog commented Oct 5, 2018

(also, I can co-author if there is a need for it)

@edolstra
Copy link
Member Author

@edolstra edolstra commented Oct 7, 2018

@Ekleog The main issue with this RFC is the lack of deterministic encryption, which is incompatible with the goal of binary reproducibility. I'll need to think a bit more about what to do about that.

@Ekleog
Copy link
Member

@Ekleog Ekleog commented Dec 29, 2018

@edolstra

So, after having consulted a friend who actually does cryptography for a living and who proofread and validated this message, here are the conclusions we came up with.

We assume here that the key is read as the user who requested the build, in a manner similar to builtins.fetchGit, as this avoids providing a local attacker with an encryption oracle.

  1. If we want fully deterministic encryption, encryptString call per encryptString call (ignoring implementation details currently), then this, by design, means that an attacker can:

    • Know when a secret changes
    • Know when a secret comes back to a previous value
  2. If we want fully-random encryption (which is currently implemented), then this means that each file that contains a secret as well as all their reverse-dependencies must be rebuilt at each configuration rebuild (as each nonce would be randomly generated from a random seed changing at every build, without any kind of reproducibility)

  3. If we want deterministic encryption that takes the context in account (see below for the implementation details), then:

    • An attacker cannot see when a secret changes or comes back to a previous value, as all encrypted values in the configuration change at each configuration change (except when all values come back to a previously-built configuration, like a rollback)
    • Files containing secrets need to be rebuilt only if the configuration changed, though they still need to be rebuilt even if their specific content didn't change

In my opinion, the third option is by far the best of those. However, it may also be the hardest to implement, as it would require to have a way to hash the configuration before expanding builtins.encryptString, which means quite a lot of special-casing.

Possible implementations for 3 are:

  • Use AES-SIV (or any other cipher that provides nonce-misuse resistance), pass 0 as the nonce and the configuration as associated data

  • What I'll call cached random encryption, which can be implemented in two ways:

    • Either generate a random nonce for each configuration hash, and use this nonce to seed the CSPRNG used to generate encryption nonces
    • Or cache the nonce used by encryptString based on (hash of the configuration, hash of the encrypted item), which would avoid having to add another CSPRNG in the story

    By re-using the same nonces for the same (data, context) pair (either by caching the nonces themselves or by caching a nonce used to seed the nonce-generating CSPRNG), we have reproducible builds when the two configurations are exactly identical.
    Note that for this scheme to avoid the security pitfalls of point 1, the cache must be readable only by root, as hashed secrets will be inside. Ideally, it could even be encrypted in a non-reproducible way with the key provided by the user, to protect against an attack that would allow to read files as root (and only that)

Between these options, cached random encryption:

  • Is more complex to implement
  • Uses only standard well-defined cryptographic primitives already present in libsodium

While AES-SIV:

  • Adds another dependency that will potentially have seen less peer review than libsodium
  • Is stateless

Finally, note that there is no need to actually choose one of these options to move forward. The key file could just start with random: to indicate that it will generate completely random nonces at each rebuild, and later extensions could add key files starting with siv: or deterministic:, leaving total control to the users over how they want things to be encrypted. This is also good for cryptographic agility anyway, for when the picked cipher will be proven insecure.

Do you have any preference among these choices?

@edolstra
Copy link
Member Author

@edolstra edolstra commented Dec 31, 2018

@Ekleog Thanks for the analysis!

Note that for this scheme to avoid the security pitfalls of point 1, the cache must be readable only by root, as hashed secrets will be inside.

It doesn't avoid the issues of point 1, right? Since attackers can still obverse when secret values change or revert back to a previous value.

Regarding AES-SIV: assuming pure evaluation mode (where you build a NixOS configuration from a Git repository), the configuration could be the Git revision of the repo. This provides a cheap value that changes if any evaluation input changes. It does have the problem of issue 2 (since any change to the Git revision triggers a rebuild of all encrypted files and their reverse dependencies), but at least it's deterministic.

@7c6f434c
Copy link
Member

@7c6f434c 7c6f434c commented Dec 31, 2018

Do I understand correctly that option 3 is supposed to have a non-locality in evaluation, i.e. evaluation of a service expression will depend on how that expression will be used?

@Ekleog
Copy link
Member

@Ekleog Ekleog commented Jan 6, 2019

@edolstra

It doesn't avoid the issues of point 1, right? Since attackers can still obverse when secret values change or revert back to a previous value.

AES-SIV and cached-random-nonce encryption are two implementations of scheme 3, which uses the non-locality @7c6f434c mentions to avoid part of the issue of scheme 1 by re-doing all encryption from random nonces as soon as any part of the configuration changes by randomizing the nonces.

If I dare simplify SIV after having not read the paper for a few months, it is more or less similar to doing ENCRYPT(key1, data, nonce = MAC(key2, some-concat-like(data, authenticated data))) (disclaimer: I haven't re-checked it for a while, don't use this for serious stuff, yadda yadda… and especially, this simplification is only valid for our case where the nonce given to SIV is always 0, it doesn't hold for cases where the nonce actually tries to be random). Where cached-random-nonce encryption saves a random nonce in a database owned by root, SIV uses a nonce from a “virtual database” that is defined by x -> MAC(key2, x).

Regarding AES-SIV: assuming pure evaluation mode (where you build a NixOS configuration from a Git repository), the configuration could be the Git revision of the repo. This provides a cheap value that changes if any evaluation input changes. It does have the problem of issue 2 (since any change to the Git revision triggers a rebuild of all encrypted files and their reverse dependencies), but at least it's deterministic.

This would be scheme 3 with non-locality solved by taking the git revision instead of something built in the evaluator that would somehow magically be able to evaluate until the builtins.encryptString and hash the expression before evaluating these calls. The issues it raises in exchange are:

  • It cannot handle building a revision with un-committed changes (could be solved by making a NAR of the git repository and taking its hash as the ever-changing data)
  • It requires that imports are all restricted to said git repository, as otherwise changes in those imports would not trigger changes in the encrypted passwords, which would weaken resilience against eg. an attacker knowing when a secret changes
  • In particular, it doesn't trigger a rebuild of files with encrypted contents upon channel upgrade (which are imports not restricted to the git repository)

It could be an intermediate solution between schemes 1 (deterministic) and 3 (deterministic-with-context), that would be deterministic with some context but not the full context, likely much more easy to implement than the deterministic-with-context scheme.


@7c6f434c Indeed that is why it sounds pretty hard to pull out, and is why I personally think we should provide fully-random encryption as a stopgap and later on add more deterministic schemes with a different key prefix (aes-random-nonce: vs. aes-siv: prefixing the private key data) when we actually find a way to do it properly.

@shlevy
Copy link
Member

@shlevy shlevy commented Jan 17, 2019

Closing all draft RFCs. Please feel free to reopen when this is ready for general community review and the RFC shepherd process.

@shlevy shlevy closed this Jan 17, 2019
@Ekleog
Copy link
Member

@Ekleog Ekleog commented Jan 23, 2019

For information: AES-GCM-SIV is in BoringSSL, so there's an enterprise-grade SIV implementation nowadays.

@nixos-discourse
Copy link

@nixos-discourse nixos-discourse commented Feb 10, 2019

This pull request has been mentioned on Nix community. There might be relevant details there:

https://discourse.nixos.org/t/hashing-plain-text-password-directly-in-configuration-nix/2093/3

@anka-213
Copy link

@anka-213 anka-213 commented Sep 13, 2019

@Ekleog How about if we would instead write the encrypted value into the configuration and encrypt the secret with an external tool. This way the building of the configuration file can be completely deterministic and work just like it does today, while the encryption can use all the randomness it needs. The decryption process would work the same as in this RFC.

The two main downsides I can see with this are:

  1. It is still possible for an attacker to see when the secret changes. However, we can still make as many "fake" changes as we want by simply re-encrypting the same secret.
  2. It can be less convenient, since you need to use an external tool and copy-paste the result into your configuration each time you want to add or change (or re-encrypt) a secret.

Problem 1 is made somewhat worse by problem 2, since this significantly lowers the chance that the user would regularly re-encrypt secrets to obfuscate when the secrets are actually changed. However, I still think that this may be a decent compromise since it allows both full randomness for encryption and full determinism for building and is still very simple, requiring little change to current systems.

What do you think?

@7c6f434c
Copy link
Member

@7c6f434c 7c6f434c commented Sep 13, 2019

@anka-213 well, if we want secrets pre-encrypted in a uniform way, given that a single quoted string is a valid Nix file that can be imported, one could have a FUSE tool so that the plaintext secrets can be managed in one virtual directory, and the other one is automatically populated with the encrypted versions under the same names. The configuration just imports the encrypted entries. I guess as the services would need to access the secrets automatically, the security is still limited by normal system access control measures, so the tool can store the data in plaintext and reencrypt as often as desired.

@Mic92
Copy link
Contributor

@Mic92 Mic92 commented Sep 20, 2019

@anka-213 I also thought that we could put encrypted values in environment.etc. files and then have a decryption action hook that would create a plain-text file in a directory in /run/, that has encryption placeholder replaced. Sounds a lot simpler and more sound then esoteric AES modes.

environment.etc."foobar.conf".source = ''
  password = @password@
'';
environment.etc."foobar.conf".secrets = {
  password = "nixencrypt::Zm9vYmFyIGFkZmFkZmFkZmE=";
};

Then the pseudo code for decryption would be:

decrypt() {
 # ...
}

sed -i \
  -e "s!@password@!$(decrypt nixencrypt::Zm9vYmFyIGFkZmFkZmFkZmE=)!" \
  $out/foobar.conf > /run/etc-static/foobar.conf
@pinpox
Copy link

@pinpox pinpox commented Oct 21, 2020

Any progress on this? Some way of storing sensitive information is badly needed!

@ryantm
Copy link
Member

@ryantm ryantm commented Oct 21, 2020

@pinpox
Copy link

@pinpox pinpox commented Oct 22, 2020

@pinpox have a look at https://github.com/Mic92/sops-nix

Thanks, that looks like a good tool, I think that's all I need for now! Would be nice to have a "native" way to handle secrets in nix/nixos though

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.