
Mixnet: Periodical topology update + Test refactoring #51

Closed
wants to merge 10 commits into master from mixnet-topology-entropy

Conversation

youngjoon-lee
Contributor

This PR is the last piece of the Mixnet specification. It covers the deterministic periodic topology reconstruction.

For simplicity, this Python spec doesn't describe how network connections are established and maintained. Instead, it focuses on showing how entropy is provided and used to build topologies. Managing network connections is an implementation detail.

Comment on lines 5 to 11
 class FisherYates:
     @staticmethod
-    def shuffle(elements: List, entropy: bytes) -> List:
+    def set_seed(seed: bytes) -> None:
+        random.seed(a=seed, version=2)
+
+    @staticmethod
+    def shuffle(elements: List) -> List:
Collaborator

I think the old style is better. IIRC from the specs, the entropy changes over time. In this case we would be setting the whole system's seed when calling set_seed, which is not what we want; it should be reset on every call to maintain integrity with the other random calls. (This is Python-specific, but it also makes explicit that the entropy is just for the shuffling.)
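To illustrate the Python-specific point, a small standalone snippet (illustrative values only, not code from this PR):

import random

# random.seed reseeds the module-level RNG, so every other user of the
# random module in the process becomes tied to the mixnet entropy:
random.seed(b"entropy-A", version=2)
unrelated_value = random.random()  # deterministic now, as a side effect

# Scoping the entropy to a dedicated RNG keeps it "just for the shuffling":
rng = random.Random(b"entropy-A")
deck = list(range(10))
rng.shuffle(deck)  # only this shuffle depends on the entropy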

Contributor Author

Previously, yes, the entropy was designed to change over time. But from the recent discussion, the entropy is going to be injected less frequently, e.g. every 5 days, while the topology update is going to be triggered more frequently, e.g. every 30 minutes. So the seed, which is set when the entropy is injected, should be maintained across subsequent topology updates until a new entropy is injected. According to the discussion, I've updated the spec: https://www.notion.so/Mixnet-Specification-807b624444a54a4b88afa1cc80e100c2?pvs=4#be6b9cafb4b94334b9f154d3a072204c
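As a usage sketch of that timing with the new-style API from the diff above (new_entropy and node_candidates are hypothetical names; the 5-day / 30-minute figures are just the examples given here):

# Entropy injected rarely (e.g. once every ~5 days):
FisherYates.set_seed(new_entropy)

# Topology rebuilt much more often (e.g. every 30 minutes). Each call keeps
# consuming the same seeded RNG state until the next set_seed:
layer_nodes = FisherYates.shuffle(node_candidates)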

Contributor Author

I think this is the topic we need to sync on in #51 (comment)

Collaborator

Yes. I think there is a misunderstanding here. set_seed should instead be set_entropy: we store the entropy and do not set the seed immediately. Then we use the old code. Otherwise it's going to be impossible to run simulations, for example.
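One way to read that suggestion, as a minimal self-contained sketch (not the code in this PR):

import random
from typing import List

class FisherYates:
    entropy: bytes = b""

    @classmethod
    def set_entropy(cls, entropy: bytes) -> None:
        # Only store the entropy; no RNG state is touched here.
        cls.entropy = entropy

    @classmethod
    def shuffle(cls, elements: List) -> List:
        # Seed a dedicated RNG from the stored entropy for this call only,
        # so simulations and other users of `random` are unaffected.
        rng = random.Random(cls.entropy)
        shuffled = list(elements)
        rng.shuffle(shuffled)
        return shuffled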

Contributor Author

Yes. If we remove the mixnet_epoch concept, we can use the old code. Also, not setting the seed immediately looks reasonable (#51 (comment)).

mixnet/mixnet.py (outdated; conversation resolved)
mixnet/mixnet.py Outdated
next_epoch_ts = datetime.now() + timedelta(seconds=self.mixnet_epoch_sec)

while True:
    time.sleep(1 / 1000)
Collaborator

Why is there a sleep here? Nitpick: I think the schedule is already defined when checking it below, so this could probably be abstracted into some scheduler.
Not necessary to change now, but we could have a black-boxed structure that awaits until the next epoch.
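A rough sketch of such a black-boxed structure (a synchronous version to match the current spec; the names are made up):

import time
from datetime import datetime, timedelta

class EpochTimer:
    """Blocks until the next epoch boundary instead of polling every millisecond."""

    def __init__(self, epoch_sec: float):
        self.epoch_sec = epoch_sec
        self.next_epoch_ts = datetime.now() + timedelta(seconds=epoch_sec)

    def wait_for_next_epoch(self) -> datetime:
        # Sleep exactly as long as needed, then advance the schedule.
        remaining = (self.next_epoch_ts - datetime.now()).total_seconds()
        if remaining > 0:
            time.sleep(remaining)
        started = self.next_epoch_ts
        self.next_epoch_ts += timedelta(seconds=self.epoch_sec)
        return started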

mixnet/mixnet.py (outdated; conversation resolved)
mixnet/mixnet.py (conversation resolved)
mixnet/mixnet.py Outdated
    self.next_topology = None

def run(self) -> None:
    next_epoch_ts = datetime.now() + timedelta(seconds=self.mixnet_epoch_sec)

Thinking out loud here.

I don't know if this is the best way to count epochs. It may work here when all node instances have their clocks synchronized. However, when a node connects later (or disconnects for some time) it might lose track of time here. It might be better to have something like self.genesis_timestamp from which all nodes calculate the current epoch:
epoch_number = ceiling((datetime.now() - params.genesis_timestamp) / params.epoch_len_in_sec)
and the slot number (relative to the epoch) as:
slot_number_in_epoch = ceiling((datetime.now() - params.genesis_timestamp - epoch_number * params.epoch_len_in_sec) / params.slot_len_in_sec)
Or something similar.

This way the calculation should be independent of the connectivity conditions of nodes and can only be influenced by clock drift.
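For illustration, a runnable version of that calculation (floor instead of ceiling so numbering starts at 0; the constants are placeholder values, not agreed parameters):

import math
from datetime import datetime, timezone
from typing import Tuple

GENESIS_TIMESTAMP = datetime(2024, 1, 1, tzinfo=timezone.utc)  # placeholder
EPOCH_LEN_IN_SEC = 30 * 60  # placeholder: 30-minute epoch
SLOT_LEN_IN_SEC = 20        # placeholder: 20-second slot

def current_epoch_and_slot(now: datetime) -> Tuple[int, int]:
    elapsed = (now - GENESIS_TIMESTAMP).total_seconds()
    epoch_number = math.floor(elapsed / EPOCH_LEN_IN_SEC)
    # Offset inside the current epoch, then the slot that offset falls into.
    offset_in_epoch = elapsed - epoch_number * EPOCH_LEN_IN_SEC
    slot_number_in_epoch = math.floor(offset_in_epoch / SLOT_LEN_IN_SEC)
    return epoch_number, slot_number_in_epoch

epoch, slot = current_epoch_and_slot(datetime.now(timezone.utc))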

Another thing is that the epoch and slot information should be confirmed by reading them from the consensus. But that might be out of scope for this specification effort.

Collaborator

Yes, related to this comment. It would be enough for this structure to have a channel (a Python coroutine that receives values could be used) that schedules the epochs, something like (note the syntax may be different):

while next_epoch := (yield):

or

while next_epoch := await epoch_notifier:

or

async for epoch in epoch_stream:
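For example, a runnable sketch of the last variant using an asyncio queue as the epoch notifier (all names here are hypothetical):

import asyncio

async def epoch_stream(notifier: asyncio.Queue):
    # Async generator over epoch numbers pushed by whatever drives the schedule.
    while True:
        yield await notifier.get()

async def run(notifier: asyncio.Queue) -> None:
    async for epoch in epoch_stream(notifier):
        # React to each epoch: rebuild and switch the topology here.
        print(f"epoch {epoch}: rebuilding topology")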

Contributor Author

Yes. I missed this point. I'll fix it.

Another thing is that the epoch and slot information should be confirmed by reading them from the consensus. But that might be out of scope for this specification effort.

I'm also thinking about this. Here, we only use mixnet_epoch_sec, which defines the interval of topology updates and is going to be shorter than consensus_epoch_sec. So, if we design the entropy to be injected from the consensus layer into the mixnet layer, I think the mixnet layer doesn't need to read anything (consensus epoch or slot info) from the consensus layer. In other words, we can design the mixnet independently of the upper layers. Did I understand the point you mentioned correctly?


Yes, I think we are thinking along the same lines here. I've left a more in-depth comment below in the next conversation (#51 (comment)).

mixnet/mixnet.py Outdated
    continue

if self.next_topology is None:
    self.next_topology = self.build_topology()

If I understand it correctly, in this version we are refreshing the topology only once per epoch?

Contributor Author

I designed the topology to be updated once per mixnet_epoch (e.g. 30 min), which is going to be shorter than a consensus_epoch (e.g. 5 days). Also, the mixnet_epoch is going to be independent of the consensus_slot (20 sec).

Contributor Author

I think we need to sync on this topic. It depends on how long the consensus_epoch is going to be. It would be much simpler to just remove mixnet_epoch and make the mixnet follow the consensus_epoch, if the consensus_epoch is not too long. Also, it may be okay even if the topology is updated less frequently. In Nym, the topology is updated once chain validators publish a new random beacon on the Nym blockchain (once a day, IIRC).

Although there will be dynamic optimization of the epoch duration, it is expected epochs will be in the order of hours or days

(from Nym whitepaper: 4.7 Mixnet epochs)

What do you think about these two approaches (having a shorter mixnet_epoch, or removing it)?

Contributor Author

To minimize misunderstanding, I just made a new commit assuming that we're going to remove the mixnet_epoch concept: 9a64242. Let's discuss which one would be reasonable for v1.

This is much simpler and doesn't rely tightly on the terms consensus epoch and consensus slot, thus loosening the architectural relationship between the mixnet and the consensus.

@danielSanchezQ In this commit, I didn't use async, to keep the diff prettier ;) In fact, the code after this change is very simple even in sync mode, and doesn't have the time-calculation issue that @madxor mentioned. Once we agree on a single approach, I'll make a separate PR for async.


I think it might be easier to think about mixnet functions as a tool used by the consensus. In my head, the consensus should be responsible for triggering mixnet functions, as the consensus should be treated as the source of truth. That is, what if we focus only on providing an API that could be used by the consensus, which would be responsible for synchronization of time and data, while the mixnet API would only respond (by setting the connectivity) according to the data provided by the consensus? This way we could greatly simplify the internal logic of the mixnet and relieve it from knowing anything about time or consensus.

The API could be very simple, something like:
topology = get_mixnet_topology(nodes_list, entropy) <-- returns topology configuration based on the entropy and the list of nodes.
set_mixnet_topology(topology) <-- activates the topology.

I think the mixnet should only care about the two calls above, and all synchronization/configuration issues should be abstracted away from the mixnet and handled on the consensus side. The consensus would therefore be responsible for calculating and providing synchronized data to the mixnet in a timely manner.

Technically, I think there should be another layer on top of the consensus that would do the above in conjunction with parsing governance transactions (a coordination layer?), through which some of the global parameters would be set. But we don't have it yet, so I'm using consensus here.
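A rough, self-contained sketch of that two-call API (the layer sizes and the deterministic-shuffle internals are assumptions for illustration, not something agreed in this thread):

import random
from typing import List

def get_mixnet_topology(nodes_list: List[str], entropy: bytes,
                        n_layers: int = 3, nodes_per_layer: int = 3) -> List[List[str]]:
    # Deterministic shuffle driven only by the provided entropy,
    # then slice the result into mix layers.
    rng = random.Random(entropy)
    shuffled = list(nodes_list)
    rng.shuffle(shuffled)
    picked = shuffled[: n_layers * nodes_per_layer]
    return [picked[i * nodes_per_layer:(i + 1) * nodes_per_layer]
            for i in range(n_layers)]

def set_mixnet_topology(topology: List[List[str]]) -> None:
    # Activate the topology: (re)establish connections to the listed nodes.
    # Connection management stays an implementation detail, as noted in the PR.
    pass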

Contributor Author

@madxor @danielSanchezQ I think we're almost on the same page and we can now come to a conclusion.

I made a commit 9a64242 for clarity and another commit 1a97064 to rename a variable. These commits have an API different from what Marcin suggested in #51 (comment), but I'll explain why at the end of this comment.

In these commits, the consensus layer just injects a new entropy when it considers the new entropy safe enough to use. Potentially, the consensus layer can add some delay before injecting the entropy, instead of injecting it as soon as the random beacon is updated, because of #51 (comment). But in any case, that's the responsibility of the consensus layer. The mixnet layer doesn't need to know about this detail.

From the perspective of the mixnet, it just uses a new entropy whenever one is injected. The mixnet doesn't need to care about the frequency of entropy injection; that is up to the consensus layer. Even if a new entropy is injected every 20 seconds, it's okay: the mixnet will just use it to build a new topology as soon as it arrives.

The one thing the mixnet needs to care about is adding some delay before switching to the new topology. As mentioned in #51 (comment), it takes some time for each mix node to establish new connections according to the new topology. Of course, we can ignore this by allowing some connections to be established on demand when packets based on the new topology arrive, even if that may increase the total latency of packet delivery. But I think we can minimize the impact of this transition period by adding some delay (e.g. 1 min) before switching to the new topology. Then each mix node gains 1 min to calculate the new topology and establish new connections in advance. This is what 1a97064 tries to describe. Still, this delay is not perfect, and it doesn't mitigate the time difference in the new entropy being injected into each mix node, because the network is not fully synchronized. But at least this delay can minimize the impact caused by connection re-establishment.

As I mentioned above, I designed an API different from what Marcin suggested, because I didn't want to expose the concept of "topology" to the consensus layer by providing a get/set_topology API. In that case, the consensus layer might be confused about how long it should hold the topology returned from get_topology(entropy) before calling set_topology(topology). Instead, I thought it's simpler to have a single API, inject_entropy(entropy), so that the consensus layer just injects a new entropy whenever it wants and then forgets about it.
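A minimal sketch of that single-call shape (class and helper names here are illustrative, not the PR's exact code):

import random
from typing import List, Optional

class Mixnet:
    def __init__(self, nodes: List[str]):
        self.nodes = nodes
        self.topology: Optional[List[List[str]]] = None

    def inject_entropy(self, entropy: bytes) -> None:
        # The consensus layer calls this whenever it decides a new entropy is
        # safe to use; the mixnet rebuilds and switches its topology itself.
        self.topology = self._build_topology(entropy)

    def _build_topology(self, entropy: bytes) -> List[List[str]]:
        rng = random.Random(entropy)
        shuffled = list(self.nodes)
        rng.shuffle(shuffled)
        # Three layers of three nodes each (arbitrary illustration values).
        return [shuffled[i * 3:(i + 1) * 3] for i in range(3)]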

What do you think about this? We have had many valuable discussions, and I think we are near the end of this one.

Contributor Author

I'm just attaching a diagram to help our discussion. Option 2 is what my code suggests, but let's discuss today.
[diagram: mixnet-topology-update.drawio (2)]
Even if we choose option 2, the first topology generation (when the process is started) should probably work like option 1.


I think we are on the right path. However, I think we are still missing another layer to make it complete. I fully agree that there should be a logical separation between the mixnet (and its time-awareness) and the consensus (and its network awareness). Therefore, it might be necessary to create another layer on top of the mixnet and consensus; we could call it a robustness (or privacy) agent/layer (or something). The robustness layer would be responsible for reading the consensus and setting the mixnet, and we could configure it separately from the mixnet and consensus. The flow would then always be from a consensus event to a mixnet action, with independently configured logic.

For example, if we change the length of the consensus epoch or slot time, we should not have to adjust the mixnet delay; that should be taken care of by the robustness abstraction, which would translate consensus timings into mixnet ones (in practice, I imagine the mixnet will not be aware of any specific timings but will react to robustness calls).

Collaborator

I agree as well. We need something in the middle to coordinate efforts. I want to understand more about the delays, as I am getting a bit lost on that; I feel the responsibility is kind of mixed. This system should be totally reactive IMO.

Contributor Author

This is a summary of our discussion today.

[diagram: mixnet-topology-update.drawio (3)]

  • We're going to have a separate "robustness" layer on top of the consensus layer and the mixnet layer, for a clear separation of responsibilities.
  • The robustness layer is going to know the parameters for building mixnet topologies.
    • After v1, the robustness layer will have more responsibilities and parameters (e.g. changing mix delays).
  • Whenever a new entropy is stabilized in the consensus layer, the consensus layer is going to emit the new entropy to the robustness layer, along with the updated list of nodes in the network. In general, this is going to happen once per consensus epoch, but strictly speaking it depends on the design of the consensus layer.
  • As soon as the new information arrives, the robustness layer "immediately" builds a new topology configuration based on the current parameters it knows. The topology configuration means the list of mix nodes for each layer (for now, in v1).
  • As soon as the new topology configuration is generated, the robustness layer injects it into the mixnet layer. There is going to be no time delay between these two events (see the sketch after this list).
  • The mixnet layer switches to the new topology configuration immediately, starts establishing new QUIC connections in the background, and starts serving packets generated based on the new topology.
    • Note that we're not going to have an explicit time delay before activating the new topology. The new topology is activated automatically as soon as it is injected by the robustness layer.
    • Packets based on the new topology can arrive at a mix node even before all new connections are established, but that is totally okay. A mix node always establishes a new connection on demand to forward the packet on a best-effort basis. Even if packets based on an unexpected topology arrive, a mix node never rejects them. This design is reasonable because we assume that all nodes in the network are ready to act as mix nodes (at least for v1).
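A sketch of that flow, reusing the get_mixnet_topology / set_mixnet_topology sketch from earlier in the thread (RobustnessLayer and its parameters are illustrative names, not an agreed interface):

from typing import List

class RobustnessLayer:
    def __init__(self, n_layers: int, nodes_per_layer: int):
        # Topology-building parameters live here, not in the mixnet.
        self.n_layers = n_layers
        self.nodes_per_layer = nodes_per_layer

    def on_new_entropy(self, entropy: bytes, node_list: List[str]) -> None:
        # Called by the consensus layer, roughly once per consensus epoch.
        topology = get_mixnet_topology(node_list, entropy,
                                       self.n_layers, self.nodes_per_layer)
        # Inject immediately: no delay between building and activating.
        set_mixnet_topology(topology)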

mixnet/mixnet.py Outdated
time.sleep(1 / 1000)

now = datetime.now()
if now < next_epoch_ts - timedelta(seconds=self.proactive_sec):

I hope I understand this correctly: the intention here is to update the topology before the next epoch, so that when the new epoch starts it can already use the new topology?

If so, it might be problematic due to the construction of the random beacon. The beacon's stability becomes more probable with time, i.e. it is better to wait longer for the beacon. In other words, can we consider that the new topology must become usable only after some delay? For example: the beacon is "stable" when it is calculated in slot 1 of every epoch, then the topology is updated with some delay, say it is calculated in slot 2 and usable from slot 3 onward of each epoch. This way, we are giving more time for network synchronization.

How does that sound?

Also, for me it is easier to reason when we switch from thinking in seconds to thinking in slots and epochs. I see fewer synchronization problems ;)
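A tiny sketch of that slot-based gating (the slot numbers follow the example above; everything else is a placeholder):

TOPOLOGY_ACTIVE_FROM_SLOT = 3  # beacon stable in slot 1, topology built in slot 2

def topology_for_slot(slot_number_in_epoch: int, old_topology, new_topology):
    # Keep serving the previous topology until the activation slot is reached.
    if slot_number_in_epoch >= TOPOLOGY_ACTIVE_FROM_SLOT:
        return new_topology
    return old_topology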

Contributor Author

I hope I understand this correctly: the intention here is to update the topology before the next epoch, so that when the new epoch starts it can already use the new topology?

Exactly

In other words, can we consider that the new topology must become usable only after some delay?

The strategy you suggested is reasonable if we design the topology to be updated once per consensus_epoch. But this PR assumed that the topology is updated once per mixnet_epoch, which is shorter than a consensus_epoch, because I thought the consensus_epoch is going to be very long (e.g. 5 days). I thought this is what we had discussed in Notion, but I probably didn't understand it correctly (#51 (comment)). Let me think about which of these two approaches is more reasonable.

Contributor Author

Ah, even if we have a mixnet_epoch shorter than the consensus_epoch, we still need to use the newly injected entropy with some delay, as you explained.

@youngjoon-lee youngjoon-lee changed the base branch from mixnet-client to master January 23, 2024 01:51
@youngjoon-lee
Contributor Author

Closing this, and I'm going to open another PR according to #51 (comment).

@youngjoon-lee youngjoon-lee deleted the mixnet-topology-entropy branch February 5, 2024 12:41