nwaku simulation requirements #108

Open
alrevuelta opened this issue Apr 19, 2023 · 2 comments

Comments

@alrevuelta

alrevuelta commented Apr 19, 2023

As discussed in a meeting, we agreed to use the features currently available in wakurtosis to run some simulations and i) learn more about nwaku behaviour with a significant number of nodes and ii) showcase all the developed features and start using them in practice.

Here I list a nice-to-have set of requirements, more or less what we discussed previously, with this first test in mind and some more specific details added. Thinking about wakurtosis as a black box, I think we can divide these "requirements" into:

  • Inputs: number of nodes, configuration, connectivity, traffic, etc.
  • Outputs: set of metrics collected during the simulation, coming from different sources and of different types (time series vs. distribution)

Simulation 1

Inputs:

  • Only nwaku nodes
  • Only relay protocol
  • Only one pubsub topic
  • Number of nodes: 300
  • Using discv5 with peers "randomly" forming a mesh. Meaning no hardcoded connections.
  • Simulation time 6 hours.
  • Traffic injected via existing RPC method is fine. waku-publisher as an alternative, but not required.
  • Traffic (both at the same time)
    • a) 50 messages per second of 5 kBytes each. Fixed or Gaussian distributed is fine.
    • b) 5 messages per second of 200 kBytes each. Fixed or Gaussian distributed is fine.
  • In order to see the gossiping in the network, each node must be connected to a maximum of 25 nodes.
  • Release v0.16.0
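
As a rough sanity check on the traffic inputs above, here is a minimal Python sketch of the offered load per traffic profile and of drawing fixed or Gaussian message sizes. The function names, the 10% relative standard deviation, and reading 1 kByte as 1000 bytes are assumptions for illustration; none of this is wakurtosis code.

```python
import random


def offered_load_bytes_per_s(msgs_per_s: float, msg_size_bytes: float) -> float:
    """Payload bytes injected per second, before any gossip amplification."""
    return msgs_per_s * msg_size_bytes


def message_size(mean_bytes: int, gaussian: bool, rel_std: float = 0.1, rng=random) -> int:
    """Return the fixed size, or a Gaussian-distributed size clamped to >= 1 byte.

    rel_std (standard deviation as a fraction of the mean) is an assumed value.
    """
    if not gaussian:
        return mean_bytes
    return max(1, round(rng.gauss(mean_bytes, rel_std * mean_bytes)))


# Assumption: 1 kByte = 1000 bytes.
profile_a = offered_load_bytes_per_s(50, 5_000)    # traffic a)
profile_b = offered_load_bytes_per_s(5, 200_000)   # traffic b)
print(profile_a, profile_b)
```

Under these assumptions profile a) injects 250 kBytes/s and profile b) 1000 kBytes/s of payload. Over the 6-hour run (21,600 s) that is roughly 5.4 GB and 21.6 GB respectively, before gossip amplification.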

Output:
The existing PDF report shared in the past would be a perfect way to share the results. I would suggest adding more information such as release version, timestamp, and some time series information coming from Prometheus (waku and nim-libp2p). So I would suggest keeping the existing report data from the PDFs we shared:

  • Propagation time (distribution): nice to have, not a requirement
  • Peak CPU usage (distribution): nice to have, not a requirement
  • Peak mem usage (distribution): nice to have, not a requirement
  • Total network IO (distribution): nice to have, not a requirement

I would also suggest plotting a time series representation of the above metrics for a small number of randomly selected nodes (let's say 5). With 300 nodes, displaying them all would be too much, but time series from a handful of nodes can help validate the simulation.

And add on top:

  • Message loss (distribution), e.g. if the network has 300 nodes and a message only reaches 298 nodes, track that. Unsure if this feature is ready.
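
To make the message loss definition concrete, here is a minimal Python sketch that computes, for each message, the fraction of nodes that never received it. The input shape (a mapping from message id to the set of nodes that logged it) and the function name are assumptions; how wakurtosis actually computes this may differ, for example in whether the publishing node counts as a receiver.

```python
def message_loss(received_by: dict[str, set[str]], all_nodes: set[str]) -> dict[str, float]:
    """Per message: fraction of nodes in the network that did not receive it."""
    total = len(all_nodes)
    return {
        msg_id: 1 - len(receivers & all_nodes) / total
        for msg_id, receivers in received_by.items()
    }


# Hypothetical 300-node network with two messages.
nodes = {f"node_{i}" for i in range(300)}
received = {
    "msg_1": set(nodes),                       # reached every node
    "msg_2": nodes - {"node_7", "node_42"},    # reached 298 of 300 nodes
}
loss = message_loss(received, nodes)
print(loss)
```

The per-message values can then be summarized as a distribution in the same way as the other report metrics.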

And the following Prometheus metrics. Same as the others: time series for a handful of random nodes, and the probability density function for the rest (or a similar "summarized" statistical representation).

  • libp2p_gossipsub_peers_per_topic_mesh: important to check that it stays between D_low and D_high, which bound the healthy number of peers for a topic.
  • libp2p_gossipsub_received_total: used to validate the message rate. (increase(libp2p_gossipsub_received_total[1m]))/60 will display the number of messages per second.
  • libp2p_peers: number of connected peers.
  • (added): bandwidth over time (from cadvisor is ok)
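
The two checks described above could be sketched offline in Python, assuming the counter samples and mesh sizes have already been exported from Prometheus. The function names, the sample data, and the D_low / D_high values used here are illustrative assumptions (the real bounds come from the nwaku / nim-libp2p configuration). Unlike Prometheus' increase(), this sketch does not extrapolate to the window edges or handle counter resets.

```python
def messages_per_second(samples: list[tuple[float, float]]) -> float:
    """Average per-second rate of a counter between its first and last sample.

    samples: (unix_timestamp_s, counter_value) pairs, ordered by time.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing timestamps")
    return (v1 - v0) / (t1 - t0)


def mesh_out_of_bounds(mesh_sizes: dict[str, int], d_low: int, d_high: int) -> list[str]:
    """Nodes whose mesh peer count for the topic is outside [d_low, d_high]."""
    return sorted(n for n, size in mesh_sizes.items() if not d_low <= size <= d_high)


# Hypothetical libp2p_gossipsub_received_total samples, one scrape every 15 s.
counter = [(0.0, 1000.0), (15.0, 1825.0), (30.0, 2650.0), (45.0, 3475.0), (60.0, 4300.0)]
rate = messages_per_second(counter)

# Hypothetical libp2p_gossipsub_peers_per_topic_mesh values; bounds are assumed.
unhealthy = mesh_out_of_bounds({"node_0": 6, "node_1": 3, "node_2": 13}, d_low=4, d_high=12)
print(rate, unhealthy)
```

With the sample data above the counter grows by 3300 messages in 60 s, so the computed rate can be compared directly against the injected message rate.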

Can you confirm if:

  • it is possible to calculate the message loss rate.
  • it is possible to calculate the probability density function of the Prometheus metrics. If not, time series of a handful of random nodes are fine for this simulation.
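
For reference, one possible way to get both representations, assuming one value per node has already been exported from Prometheus (e.g. each node's libp2p_peers at the end of the run): pick a few nodes at random for the time series plots and build a normalised histogram as an empirical density for the whole set. Function names, the seed, and the bin count are assumptions for illustration.

```python
import random


def pick_nodes(node_ids: list[str], k: int = 5, seed: int = 0) -> list[str]:
    """Reproducibly choose k nodes whose time series will be plotted."""
    return random.Random(seed).sample(sorted(node_ids), k)


def empirical_density(values: list[float], bins: int) -> list[tuple[float, float, float]]:
    """Histogram normalised so the bar areas sum to 1.

    Returns (bin_start, bin_end, density) tuples.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width when all values are equal
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # put the maximum in the last bin
        counts[idx] += 1
    n = len(values)
    return [(lo + i * width, lo + (i + 1) * width, c / (n * width)) for i, c in enumerate(counts)]


# Hypothetical per-node values.
density = empirical_density([0, 1, 2, 3, 4, 5, 6, 7, 8, 10], bins=5)
selected = pick_nodes([f"node_{i}" for i in range(300)])
print(density, selected)
```
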
@Daimakaimura
Contributor

Thank you for putting this together, @alrevuelta. I do have some questions / comments:

  • Regarding traffic injection, you mention "both at the same time". At the moment we only support a single source of traffic.
  • Regarding the NWaku Prometheus metrics, would you like us to calculate the distributions and add them to the final PDF figure?
  • Yes, it is possible to calculate the message loss rate (it is already calculated).
  • Yes, we can calculate the PDFs for those 3 Prometheus metrics and add them to the final PDF. However, this is not implemented at the moment, and I am wondering whether you would like to wait for this to be added or would rather get some results without those metrics ASAP.

@alrevuelta
Author

> Regarding traffic injection you mention "both at the same time" At the moment we only support a single source of traffic.

No problem, I will edit the requirements to use just one source of traffic.

> Regarding the NWaku Prometheus metrics, would you like us to calculate the distributions and add them to the final PDF figure

Since I assume this is not implemented, I'm fine with having the raw Prometheus time series (without the distribution) for now.

> Yes, it is possible to calculate the message loss rate (it is already calculated)

Great feature!

> Yes, we can calculate the PDFs those 3 Prometheus metrics and add this to the final PDF. However this is not implemented at the moment and I am wondering if you would like to wait for this to be added or you rather get some results without those metrics ASAP.

No problem, let's stick to just the time series metrics to have something ASAP. PDFs are nice, but in a short simulation (e.g. a few hours) perhaps the time series is enough. So let's leave that for now :)

@kaiserd kaiserd mentioned this issue Jun 13, 2023
19 tasks