Merge pull request #649 from dedis/simul_docu

Updating documentation for simulations and .ns-file
dedis · Jun 24, 2020 · f15728b
2 parents 64b58cc + 539f333

Showing 3 changed files with 133 additions and 20 deletions.
diff --git a/simul/README.md b/simul/README.md
@@ -2,24 +2,20 @@
 
 The onet library allows for multiple levels of simulations:
 
--   localhost:
-
+-   [Localhost](./platform/LOCALHOST.md):
     -   up to 100 nodes
-
--   mininet:
-
+-   [Mininet](./platform/MININET.md):
     -   up to 300 nodes on a 48-core machine, multiplied by the number of machines
         available
-
     -   define max. bandwidth and delay for your network
-
--   deterlab:
-
+-   [Deterlab](./platform/DETERLAB.md):
     -   up to 1000 nodes on a strong machine, multiplied by the number of machines
         available
 
-Refer to the simulation-examples in simul/manage/simulation and
-<https://github.com/dedis/cothority_template>
+Refer to the simulation-examples in one of the following places:
+- [./manage/simulation](./manage/simulation)
+- [./test_simul](./test_simul)
+- https://github.com/dedis/cothority_template
 
 ## Runfile for simulations
 
@@ -35,16 +31,62 @@ experiment, where each experiment makes up one line.
 ### Necessary variables
 
 -   `Simulation` - what simulation to run
--   `Hosts` - how many hosts to instantiate
--   `Servers` - how many servers to use
+-   `Hosts` - how many hosts to instantiate - this corresponds to the nodes
+    that will be running and available in the main `Roster`
+-   `Servers` - the maximum number of servers to use - if fewer servers than
+    this are available, a warning is printed, but the simulation is still
+    run
+
+The `Servers` variable mostly influences how the simulation is run.
+Each platform handles it differently:
+- `localhost` - `Servers` is ignored
+- `Deterlab` - the system distributes the `Hosts` nodes over the
+  available servers, but over no more than `Servers` of them.
+  This allows running simulations that are smaller than your whole
+  DETERLab experiment without having to modify and restart the experiment.
+- `Mininet` - as in `Deterlab`, the `Hosts` nodes are distributed over
+  a maximum of `Servers` servers.
 
 ### onet.SimulationBFTree
 
-If you use the `onet.SimulationBFTree`, the following variables are also available:
+The standard simulation (and the only one implemented) is the
+`SimulationBFTree`, which prepares the `Roster` and the `Tree` for the
+simulation.
+Even if you use the `SimulationBFTree`, you are not restricted to using only
+the prepared `Tree`.
+However, no more nodes are available than the ones in the prepared
+`Roster`.
+Some restrictions apply when you're using the `Deterlab` simulation:
+- all nodes on one server (`Hosts` / min(available servers, `Servers`)) are
+  run in one binary, which means
+  - bandwidth measurements cover all the nodes
+  - time measurements need to make sure no other calculations are taking place
+- the bandwidth and delay restrictions only apply between two physical servers, so
+  - the simulation makes sure that all connected nodes in the `Tree` are always
+    on different servers. If you communicate along other paths than the ones
+    in the `Tree`, the system cannot guarantee that the communication is
+    restricted
+  - the bandwidth restrictions apply to the sum of all communications between
+    two servers, and thus to a group of hosts rather than to a single host
+
+If you want a bandwidth restriction between all nodes and `Hosts > Servers`,
+you have to use the `Mininet` platform, which doesn't have this restriction.
+
+The following variables define how the original `Tree` is calculated - only
+ one of the two should be given:
 
 -   `BF` - branching factor: how many children each node has
 -   `Depth` - the depth of the tree in levels below the root-node
--   `Rounds` - for how many rounds the simulation should run
+
+If there are 13 `Hosts` with a `BF` of 3, the system will create a complete
+ tree with the root-node having 3 children, and each of the children having 3
+ more children.
+The same setup can be achieved with 13 `Hosts` and a `Depth` of 2.
+
+If the tree to be created is not complete, it will be filled breadth-first and
+the children of the last row will be distributed as evenly as possible.
+
+In addition, `Rounds` defines how many rounds the simulation will run.
 
 ### Statistics for subset of hosts
 
@@ -164,3 +206,67 @@ Alternatively, it can be set for each individual experiment:
     7,100
     15,200
     31,400
+
+## test_data format
+
+The results of every simulation are written to the `test_data` directory,
+using the name of the simulation file as base, with `.csv` appended.
+The configuration of the simulation file is written to the table in the
+following columns, which are copied as-is from the simulation file:
+
+- hosts, bf, delay, depth, other, prescript, ratio, rounds, servers, suite
+
+For all the other measurements, the following statistics are available:
+
+- `_avg` - the average
+- `_std` - standard-deviation
+- `_min` - minimum
+- `_max` - maximum
+- `_sum` - sum of all calls
+
+### measure.NewTimeMeasure
+
+The following measurements will be taken for `measure.NewTimeMeasure`:
+- `_user` - user-space time, crypto and other calculations
+- `_system` - system-space time, disk I/O and network I/O
+- `_wall` - wall-clock time, as described below
+
+The measurements are given in seconds.
+There is an important difference between the `_wall` and the `_user`/`_system`
+measurements: the `_wall` measurements indicate how much time an external
+observer would have measured.
+So if the system waits for a reply from the network, this waiting time is
+included in the measurement.
+In contrast, the `_user`/`_system` measurements indicate how much work has
+been done by the CPU during the measurement.
+When measuring parallel execution of code, the `_user`/`_system`
+measurements can be bigger than the `_wall` measurements, because more than
+one CPU participated in the calculation.
+The difference between `_user` and `_system` is explained for example here:
+https://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1
+The `_wall` measurement corresponds to `real` in that explanation.
+
+There are some standard time measurements done by the simulation:
+- `ChildrenWait` - how long the system had to wait for all children to be
+ available - might show problems in setting up the servers
+- `SimulSyncWait` - how long the system had to wait at the end of the
+ simulation - might indicate problems in the wrap-up of the simulation
+
+### measure.NewCounterIOMeasure
+
+If you want to measure bandwidth, you can use `measure.NewCounterIOMeasure`.
+Make sure that the measurement does not include traffic outside of your
+scope by putting the `.Record()` call as close as possible to the
+`NewCounterIOMeasure` call.
+Every `CounterIOMeasure` has the following statistics:
+
+- `_tx` - bytes transmitted
+- `_rx` - bytes received
+- `_msg_tx` - packets transmitted
+- `_msg_rx` - packets received
+
+Plus the standard modifiers (`_avg`, `_std`, ...).
+
+There are two standard measurements done by every simulation:
+- `bandwidth` (empty) - bandwidth of all nodes
+- `bandwidth_root` - bandwidth of the first node of the roster
diff --git a/simul/platform/DETERLAB.md b/simul/platform/DETERLAB.md
@@ -18,7 +18,9 @@ Before a successful Deterlab simulation, you need to
 1. be signed up at Deterlab. If you're working with DEDIS, ask your
 responsible for the _Project Name_ and the _Group Name_.
 2. create a simulation defined by an NS-file. You can find a simple
-NS-file here: [cothority.ns](./deterlab_users/cothority.ns)
+NS-file here: [cothority.ns](./deterlab_users/cothority.ns) - you'll need to
+adjust the number of servers, the type of servers, and the bandwidth and
+delay restrictions.
 3. swap the simulation in
 
 For point 3. it is important of course that Deterlab has enough machines

diff --git a/simul/platform/deterlab_users/cothority.ns b/simul/platform/deterlab_users/cothority.ns
@@ -1,6 +1,9 @@
 set ns [new Simulator]
 source tb_compat.tcl
 
+# Set your number of servers here, as well as the delay.
+# The bandwidth can be set below.
+
 set server_count 30
 set server_delay 100ms
 set lanstr ""
@@ -9,14 +12,16 @@ set lanstr ""
 
 for {set i 0} {$i < $server_count} {incr i} {
         set server($i) [$ns node]
-           #tb-set-hardware $server($i) dl380g3
-           #tb-set-hardware $server($i) MicroCloud
-           tb-set-hardware $server($i) pc2133
+            # Uncomment the server hardware you're using here, or add your own
+            #tb-set-hardware $server($i) dl380g3
+            #tb-set-hardware $server($i) MicroCloud
+            tb-set-hardware $server($i) pc2133
         tb-set-node-os $server($i) Ubuntu1404-64-STD
         append server_lanstr "$server($i) "
 }
 
-
+# Here you can set your bandwidth restrictions by replacing the 144Mb with the
+# expected bandwidth
 tb-use-endnodeshaping 1
 set serverlan [$ns make-lan "$server_lanstr" 144Mb $server_delay]
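
As a side note to the `BF` and `Depth` documentation in `simul/README.md`, the relationship between the number of hosts, the branching factor and the depth of a complete tree can be sketched in a few lines of Go. This is a minimal sketch, not code from onet: `completeTreeSize` and `minDepth` are hypothetical helper names, and the sketch assumes that depth counts the levels below the root-node, as the `Depth` bullet defines it.

```go
package main

import "fmt"

// completeTreeSize returns the number of nodes in a complete tree where
// every inner node has bf children and there are depth levels below the
// root. Hypothetical helper for illustration, not part of onet.
func completeTreeSize(bf, depth int) int {
	total, level := 1, 1 // start with the root alone
	for i := 0; i < depth; i++ {
		level *= bf    // number of nodes on the next level
		total += level // add that level to the running total
	}
	return total
}

// minDepth returns the smallest depth whose complete tree with branching
// factor bf can hold at least hosts nodes. It returns -1 for bf < 1.
// Hypothetical helper for illustration, not part of onet.
func minDepth(bf, hosts int) int {
	if bf < 1 {
		return -1
	}
	depth := 0
	for completeTreeSize(bf, depth) < hosts {
		depth++
	}
	return depth
}

func main() {
	// Root + 3 children + 9 grandchildren.
	fmt.Println(completeTreeSize(3, 2))
	// 13 hosts exactly fill two levels below the root.
	fmt.Println(minDepth(3, 13))
	// One more host needs a third, incomplete level.
	fmt.Println(minDepth(3, 14))
}
```

Under these assumptions, 13 hosts with a branching factor of 3 form a complete tree, while any host count between 14 and 40 would leave the last row only partially filled.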