New Groups support #42

jagerman · 2023-08-03T00:52:19Z

Adds a group info config type, which tracks distributed info for v2 groups.

This requires introducing/using a few new concepts not currently used for user config messages:

Multiple decryption keys. User config doesn't do this at all (rather it generates a single decryption key from the private key for each namespace).

This doesn't yet add support for sharing and distributing those keys, just for having being able to load a config with a list of multiple possible keys.

Config signing and verification. For user configs this isn't done, since only the owner can actually encrypt/decrypt a config message, just being able to decrypt it is authentication enough.

This required various modifications to make the config library properly prevent modifications when we can't modify, and to properly follow that through in terms of merging, updates, etc.

TODO:

storage server subaccount auth signature generation
storage server subaccount authentication helper
C API for user groups group_info
convo info volatile entry for group
support for wrapping/unwrapping a config in a protobuf wrapper, for existing config types (moved to PR Protobuf handling #51 instead of here)
add members/keys size() functions for group types

include/session/config/groups/info.h

include/session/config/groups/info.hpp

include/session/config/groups/members.hpp

src/config/groups/members.cpp

frtget · 2023-08-16T11:26:56Z

should have done everything in a single review, sorry about that :)

src/config/groups/info.cpp

Adds a group info config type, which tracks distributed info for v2 groups. This requires introducing/using a few new concepts not currently used for user config messages: - Multiple decryption keys. User config doesn't do this at all (rather it generates a single decryption key from the private key for each namespace). This doesn't yet add support for sharing and distributing those keys, just for having being able to load a config with a list of multiple possible keys. - Config signing and verification. For user configs this isn't done, since only the owner can actually encrypt/decrypt a config message, just being able to decrypt it is authentication enough. This required various modifications to make the config library properly prevent modifications when we can't modify, and to properly follow that through in terms of merging, updates, etc.

Fixes a bug where construction-provided keys were loaded in reversed priority.

Tests to follow.

The first part of this assert shouldn't fail if we are starting from a fresh, blank config (in which case we don't actually serialize our own message anymore).

It doesn't really make sense to have this as we either want signatures (for shared messages) or don't (for personally encrypted messages). The only place we were passing it as `true` was in a place that also didn't pass a verifier or signer (during dumping), and so already wasn't adding/checking signatures.

This makes Keys construction, loading, and rekeying take the Info and Members object to update their keys, thus removing the requirement for application code to worry about key management at all (they just feed the keys in, and they propagate to info/members). Also adds a state dump (similar to base config `dump()`) to Keys.

Messages to a new group can't go into 0 (since that is publicly depositable), so reserve a namespace (11) for messages.

Adds method for encrypting/decrypting a message. This supports both compressed+encrypted and just plain encrypted. Abstracts the zstd compression implementation from base.cpp into internal.cpp, and uses it inside the new group message encryption Compression is only used if beneficial (that is, only if compression actually reduces the message size).

This adds methods to `Keys` that generates subaccount tokens and signatures as needed to do storage server subaccount authentication (which currently requires testnet as the subaccount code is not yet active on mainnet), along with test code to test it. Also adds a tests/swarm-auth-test binary that spits out storage requests for store/retrieve testing.

It adds a nlohmann::json dependency, and isn't really part of the normal test suite.

- Adds C wrappers for all the swarm authentication methods - Adds C wrapper for querying if the keys object has admin permission - Adds C wrapper for key_supplement to generate supplemental key messages - Changes the swarm `subaccount` value to be base64 instead of hex (since it isn't really a pubkey) - Makes the vector passed into key_supplement const&

- Add keys.size() return the number of keys in the object. - Add C API for retrieving keys & key size - Fix copy-and-paste error in groups_members_size C API name

Also adds tests to verify that auth_data and secretkey data propagate as expected. Many thanks to Harris for identifying and tracking this down.

- Added config_groups_keys to config/base.h - Looped in test call into config user groups C api unit test

Adds current hash tracking to groups::Keys; this adds a `current_hashes` method to group::Keys that works similarly to the base config current_hashes (though returns a set instead of vector). (The C version is `groups_keys_current_hashes`) Testing this also triggered a bug in that we weren't probably re-loading the verified-signature state on a dump->load cycle, which caused an assertion failure on merge (because the current state couldn't be successfully serialized-then-deserialized), also fixed here. Also renames the recently added C config_group_keys to config_get_keys to better reflect its purpose.

This adds an `.invited` flag for all group types (legacy, new, and communities) that can be used to track a invited-but-not-yet-joined room by session clients. Also adds `.name` to new groups data so that the name from an invitation can be stored before accepting (after accepting the name will come from the shared group config message). Add `setKicked()` and `kicked()` methods which clear both auth_data and secretkey and report whether both are empty, respectively. (For the C API these are `ugroups_group_set_kicked` and `ugroups_group_is_kicked` and take a pointer to the `ugroups_group_info`).

The key lists weren't being set properly when loading a Keys object from a dump; these sets the key lists properly.

src/config/groups/keys.cpp

frtget · 2023-10-05T15:21:36Z

include/session/config/groups/keys.hpp

+    ///
+    /// Generates a supplemental key message for one or more session IDs.  This is used to
+    /// distribute existing active keys to a new member so that that member can access existing
+    /// keys, configs, and messages.  Only admins can call this.


I'm not sure what you meant by "access[ing] existing keys" here. Removed in #64 because it seems redundant.

frtget · 2023-10-05T15:22:18Z

include/session/config/groups/keys.hpp

+    /// including a proper signature by an admin of the group.  The signing value must have read
+    /// permission, but parameters can be given to also require write or delete permissions.  A
+    /// subaccount signing value should always be checked for validity using this before creating a
+    /// group that would depend on it.


"creating a group" as in adding a group to user configs ?

frtget · 2023-10-05T15:23:17Z

include/session/config/groups/keys.hpp

+    ///
+    /// Inputs:
+    /// - `groupid` -- the group id/pubkey, in hex, beginning with "03".
+    /// - `session_ed25519_secretkey` -- the user's Session ID secret key.


Why don't we just take the session ID instead of the secret key ? We only use it to extract the pk and then remove the sign to get the xpk and use that.

frtget · 2023-10-05T15:25:25Z

src/config/base.cpp

+            _sign_sk.data(),
+            32,  // Just the seed part of the value, not the last half (which is just the pubkey)
+            reinterpret_cast<const unsigned char*>(key.data()),
+            std::min<size_t>(key.size(), 64));


I think we should assert that key is smaller than 64 instead in order to avoid using the same key in different contexts by mistake by using several keys that start with the same 64 characters. I can add that to #64

frtget · 2023-10-05T15:30:56Z

src/config/groups/keys.cpp

+    crypto_sign_ed25519_sk_to_pk(T.data(), user_ed25519_sk.data());
+    bool neg = T[31] & 0x80;
+    T[31] &= 0x7f;
+    if (0 != crypto_scalarmult_ed25519_noclamp(to_unsigned(token.data() + 4), k.data(), T.data()))


Is there a reason why we don't want to clamp here ?
Tor and Lokinet both clamp blinding factors but I'm not sure whether it really makes the DH key exchange more secure.
I don't think it protects against small subgroup attacks because a is a multiple of the cofactor so ±akB should either be the identity element or be in the main subgroup anyway.
Do you think we should clamp just in case we decide to reuse this blinded keypair for X25519 key exchange in the future for some reason ?

It seems the person who implemented the key blinding in tor wasn't sure either.
@jagerman I see you're the one who added the code that clamps the blinding factor in lokinet so I'm curious to know the reason why it's done if you know it.

The short answer is that it "doesn't work" with clamping half the time, because when we have to use -a instead of a, clamping it changes the value and breaks the -A = -aG relationship that we're depending on. There's a fair bit to unpack behind this though, so let me elaborate.

I wish we didn't need that relationship at all: but unfortunately it's an artifact of a suboptimal choice (which unfortunately followed Signal) of having Session IDs be the X25519 pubkey conversions of Ed25519, rather than direct Ed25519 keys. Unfortunately that Ed->X conversion loses the sign bit from the Ed25519 pubkey, which is what causes all this trouble: when reversing the Ed->X conversion we get ourselves to Ed=±sqrt(...) and there's no way to know which one was the correct Ed pubkey. So what we're doing here is effectively always signing as if the real pubkey that yielded the Session ID was in fact always the positive of the two possible Ed25519 pubkeys, when in reality half the time it is negative.

In order to do that when we actually have a negative pubkey, however, we have to effectively produce a signature from -a rather than from a -- but that won't work if we clamp -a.

To illustrate:

#!/usr/bin/python3 from nacl.signing import SigningKey import nacl.bindings as sodium for seed31 in (b'\0', b'\4'): seed = SigningKey(b"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0" + seed31) pubkey = seed.verify_key # This gives the private scalar, i.e. "a" for keypair a/A that yields aG = A. (That's not its primary usage in # libsodium, and thus the function name here is understandably confusing). a = seed.to_curve25519_private_key().encode() print(f"\nPubkey: {pubkey.encode().hex()}") Ac = sodium.crypto_scalarmult_ed25519_base(a) An = sodium.crypto_scalarmult_ed25519_base_noclamp(a) print(f"A noclamp: {An.hex()}") print(f"A clamped: {Ac.hex()}") if Ac[31] & 0x80: a_neg = sodium.crypto_core_ed25519_scalar_negate(a) Ac_neg = sodium.crypto_scalarmult_ed25519_base(a_neg) An_neg = sodium.crypto_scalarmult_ed25519_base_noclamp(a_neg) print(f"-A noclamp: {An_neg.hex()}") print(f"-A clamped: {Ac_neg.hex()}")

produces:

Pubkey: 3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29 A noclamp: 3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29 A clamped: 3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29 Pubkey: fd50b8e3b144ea244fbf7737f550bc8dd0c2650bbc1aada833ca17ff8dbf329b A noclamp: fd50b8e3b144ea244fbf7737f550bc8dd0c2650bbc1aada833ca17ff8dbf329b A clamped: fd50b8e3b144ea244fbf7737f550bc8dd0c2650bbc1aada833ca17ff8dbf329b -A noclamp: fd50b8e3b144ea244fbf7737f550bc8dd0c2650bbc1aada833ca17ff8dbf321b -A clamped: d1cd8406d8f93617b10a91a20c4126c08bb4f0fb23807fcfd25e939b7a9924ea

(Keep in mind these are encoded in little-endian, and so the sign bit is the most significant bit of the last byte, i.e. the second-last hex digit).

So in short, without clamping we can keep up the illusion that we were actually the positive alternative of the two pubkeys that could have given the X25519, even when we aren't.

Do you think we should clamp just in case we decide to reuse this blinded keypair for X25519 key exchange in the future for some reason ?

I really wish we could, because you're right, we're okay for Ed25519 signatures, but definitely don't want to go back to X again with this derived key.

We are talking about different things. I was talking about clamping the blinding factor k, not clamping t.

After some more digging it turns out that the security provided by clamping scalars is not preserved by multiplication operations on scalars because they are done modulo l (the order of the main subgroup).
For a point P with a torsion component (i.e. not in the main subgroup):
(-a mod l).P ≠ -(a.P)
and (k*t mod l).P ≠ k.(t.P)
Which means that even though a was already clamped during key generation, -a mod l and ±ak mod l as private key scalars are vulnerable to small subgroup attacks and clamping k is useless.

I think the reason why it was done in tor is that the developers weren't sure about the impact of clamping on the security of their key blinding scheme at the time and it seems the reason why it was added to lokinet was to make the crypto identical to tor's.

My understanding is that if we really wanted to make sure that this derived scalar is safe against such attacks we would have to use its torsion safe representative (which is an element of GF(8l)) or do arithmetic operations modulo the order of the entire elliptic curve group 8l (which is effectively the same thing when a is a multiple of the cofactor) so that:
kt.P = ±k.(a.P)
P can be written as Q + T where Q is a point in the main subgroup and T is a point of small order.
±k.(a.(Q + T)) = ±k.(a.Q + a.T) = ±k.(a.Q + ∞) = ±k.(a.Q) because a is a multiple of the order of T.
The torsion component is absorbed and the resulting point does not leak any information about our private scalars.

But that's not supported by libsodium and, in my opinion, not necessary because if we ever decide to reuse this keypair for DH we would have to use either the "noclamp" scalar multiplication function which fails when given a point that is not in the main subgroup or the ristretto functions which, by design, can't decode input bytes to a point outside of the main subgroup.
(Everything else clamps the input scalar and breaks the kt.B = b.(k.T) relationship we need where b/B is the other party's keypair.)
So, as far as I know, it would still be safe to use this derived scalar as is without caring about any of the above. Unless we switch to another crypto library.

In conclusion, clamping k is useless and there is not much we can do for now except maybe adding some comments to make it clear that these derived keys should never be reused for DH without extra precautions because they are vulnerable to small subgroup attacks.

frtget · 2023-10-05T16:38:51Z

include/session/config/groups/keys.hpp

+    /// - `rekey()` is called, returning the new keys config.
+    /// - `info.push()` is called to get the new info config (re-encrypted with the new key)
+    /// - `members.push()` is called to get the new members config (using the new key)
+    /// - all three new configs are pushed (ideally all at once, in a single batch request).


Shouldn't this be a sequence request instead of a batch request ?

venezuela01 · 2023-11-21T11:48:24Z

We need more experts to review the design and implementation. I hope @KeeJef will publish more internal design drafts before an actual software release. We have seen examples in the past where in-house developed code was vulnerable to algebraic attacks due to subtle mistakes by the core team. This situation further evidences the need for more external volunteers like @frtget to provide valuable opinions.