Merge pull request #649 from dedis/simul_docu
Updating documentation for simulations and .ns-file
Jeff R. Allen committed Jun 24, 2020
2 parents 64b58cc + 539f333 commit f15728b
Showing 3 changed files with 133 additions and 20 deletions.
136 changes: 121 additions & 15 deletions simul/README.md
@@ -2,24 +2,20 @@

The onet library allows for multiple levels of simulations:

- localhost:

- [Localhost](./platform/LOCALHOST.md):
- up to 100 nodes

- mininet:

- [Mininet](./platform/MININET.md):
- up to 300 nodes on a 48-core machine, multiplied by the number of machines
available

- define max. bandwidth and delay for your network

- deterlab:

- [Deterlab](./platform/DETERLAB.md):
- up to 1000 nodes on a strong machine, multiplied by the number of machines
available

Refer to the simulation-examples in simul/manage/simulation and
<https://github.com/dedis/cothority_template>
Refer to the simulation-examples in one of the following places:
- [./manage/simulation](./manage/simulation)
- [./test_simul](./test_simul)
- https://github.com/dedis/cothority_template

## Runfile for simulations

@@ -35,16 +31,62 @@ experiment, where each experiment makes up one line.
### Necessary variables

- `Simulation` - what simulation to run
- `Hosts` - how many hosts to instantiate
- `Servers` - how many servers to use
- `Hosts` - how many hosts to instantiate - this corresponds to the nodes
that will be running and available in the main `Roster`
- `Servers` - the maximum number of servers to use - if fewer than this number
of servers are available, a warning will be printed, but the simulation will
still be run

The `Servers` value mainly influences how the simulation will be run.
Depending on the platform, this will be handled differently:
- `localhost` - `Servers` is ignored here
- `Deterlab` - the system will distribute the `Hosts` nodes over the
available servers, but not over more than `Servers`.
This allows for running simulations that are smaller than your whole DETERLab experiment without having to modify and restart the
experiment.
- `Mininet` - as in `Deterlab`, the `Hosts` nodes will be distributed over
a maximum of `Servers`.
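
The effect of `Servers` on `Deterlab` and `Mininet` can be sketched as follows. This is an illustration in Python, not onet code: the function name `plan_distribution` is hypothetical, and spreading the nodes as evenly as possible over the servers is an assumption.

```python
import math


def plan_distribution(hosts, servers, available):
    """Return (servers_used, max_nodes_per_server, warning).

    hosts     -- value of `Hosts` in the runfile
    servers   -- value of `Servers` in the runfile (an upper bound)
    available -- number of servers actually present in the experiment
    """
    # Never use more servers than `Servers`, nor more than exist.
    used = min(servers, available)
    if used < 1:
        raise ValueError("at least one server is needed")
    # Fewer servers than requested only triggers a warning;
    # the simulation still runs.
    warning = available < servers
    # Assumption: nodes are spread as evenly as possible.
    per_server = math.ceil(hosts / used)
    return used, per_server, warning


print(plan_distribution(hosts=100, servers=16, available=30))
print(plan_distribution(hosts=100, servers=16, available=8))
```

The first call models a simulation that is smaller than the experiment (only 16 of the 30 servers are used), the second one the case where a warning would be printed.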

### onet.SimulationBFTree

If you use the `onet.SimulationBFTree`, the following variables are also available:
The standard simulation (and the only one implemented) is the
`SimulationBFTree`, which will prepare the `Roster` and the `Tree` for the
simulation.
Even if you use the `SimulationBFTree`, you're not restricted to using only the
prepared `Tree`.
However, there will not be more nodes available than the ones in the prepared
`Roster`.
Some restrictions apply when you're using the `Deterlab` simulation:
- all nodes on one server (`Hosts` / min(available servers, `Servers`)) are
run in one binary, which means
- bandwidth measurements cover all the nodes
- time measurements need to make sure no other calculations are taking place
- the bandwidth- and delay-restrictions only apply between two physical servers, so
- the simulation makes sure that all connected nodes in the `Tree` are always
on different servers. If nodes communicate over links other than the ones in
the `Tree`, the system cannot guarantee that this communication is
restricted
- the bandwidth restrictions apply to the sum of all communications between
two servers, so they are shared by all the hosts running on these servers
If you want to have a bandwidth restriction that is between all nodes, and
`Hosts > Servers`, you have to use the `Mininet` platform, which doesn't
have this restriction.

The following variables define how the original `Tree` is calculated - only
one of the two should be given:

- `BF` - branching factor: how many children each node has
- `Depth` - the depth of the tree in levels below the root-node
- `Rounds` - for how many rounds the simulation should run

If there are 13 `Hosts` with a `BF` of 3, the system will create a complete
tree with the root-node having 3 children, and each of the children having 3
more children.
The same setup can be achieved with 13 `Hosts` and a `Depth` of 3.
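
The arithmetic behind this example can be checked with a short sketch. It is written in Python for illustration only; `complete_tree_size` and `smallest_bf` are hypothetical helpers, not part of onet, and `levels` counts the root as one level, which is an assumption about how the example above counts.

```python
def complete_tree_size(bf, levels):
    """Number of nodes in a complete tree with `levels` levels, root included."""
    return sum(bf ** level for level in range(levels))


def smallest_bf(hosts, levels):
    """Smallest branching factor whose complete tree holds `hosts` nodes.

    One possible way to derive a branching factor from a number of levels;
    how onet does this internally is not shown here.
    """
    if levels < 2 and hosts > 1:
        raise ValueError("more than one host needs at least two levels")
    bf = 1
    while complete_tree_size(bf, levels) < hosts:
        bf += 1
    return bf


print(complete_tree_size(bf=3, levels=3))  # 1 + 3 + 9 = 13
print(smallest_bf(hosts=13, levels=3))     # 3
```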

If the tree to be created is not complete, it will be filled breadth-first and
the children of the last row will be distributed as evenly as possible.

In addition, `Rounds` defines how many rounds the simulation will run.

### Statistics for subset of hosts

@@ -164,3 +206,67 @@ Alternatively, it can be set for each individual experiment:
7,100
15,200
31,400

## test_data format

The results of every simulation are written to the `test_data` directory, in a
file named after the simulation file with `.csv` appended.
The configuration of the simulation file is written to the table in the
following columns, which are copied as-is from the simulation file:

- hosts, bf, delay, depth, other, prescript, ratio, rounds, servers, suite

For all the other measurements, the following statistics are available:

- `_avg` - the average
- `_std` - standard-deviation
- `_min` - minimum
- `_max` - maximum
- `_sum` - sum of all calls
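
As an illustration of what these suffixes mean, the following Python sketch computes them for a list of samples. The helper `summarize` is hypothetical, and the use of the population standard deviation is an assumption; this section does not say which variant onet uses.

```python
import statistics


def summarize(name, samples):
    """Map column names such as `<name>_avg` to the statistic of `samples`."""
    return {
        f"{name}_avg": statistics.mean(samples),
        # Assumption: population standard deviation.
        f"{name}_std": statistics.pstdev(samples),
        f"{name}_min": min(samples),
        f"{name}_max": max(samples),
        f"{name}_sum": sum(samples),
    }


for column, value in summarize("round_wall", [1.0, 2.0, 3.0, 4.0]).items():
    print(column, value)
```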

### measure.NewTimeMeasure

The following measurements will be taken for `measure.NewTimeMeasure`:
- `_user` - user-space time, crypto and other calculations
- `_system` - system-space time - disk I/O, network I/O
- `_wall` - wall-clock time, as described below

The measurements are given in seconds.
There is an important difference between the `_wall` and the `_user`/`_system`
measurements: the `_wall` measurements indicate how much time an external
observer would have measured.
So if the system waits for a reply from the network, this waiting time is
included in the measurement.
In contrast, the `_user`/`_system` measurements indicate how much work has been
done by the CPU during the measurement.
When measuring parallel execution of code, it is possible that the
`_user`/`_system` measurements are bigger than the `_wall` measurements,
because more than one CPU participated in the calculation.
The difference between `_user` and `_system` is explained for example here:
https://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1
The `_wall` corresponds to the `real` in that explanation.
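
The difference between the three clocks can be reproduced outside of onet. The following Python sketch uses only the standard library; `measure` is a hypothetical helper and does not reflect how `measure.NewTimeMeasure` is implemented. Sleeping stands in for waiting on the network: it adds to the wall-clock time but hardly to the CPU time.

```python
import os
import time


def measure(work):
    """Run `work()` and return the elapsed (wall, user, system) seconds."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    work()
    cpu_end = os.times()
    wall = time.perf_counter() - wall_start
    return (wall,
            cpu_end.user - cpu_start.user,
            cpu_end.system - cpu_start.system)


# Waiting is visible to an external observer, but costs almost no CPU time.
wall, user, system = measure(lambda: time.sleep(0.2))
print(f"wall={wall:.2f}s user={user:.2f}s system={system:.2f}s")
```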

There are some standard time measurements done by the simulation:
- `ChildrenWait` - how long the system had to wait for all children to be
available - might show problems in setting up the servers
- `SimulSyncWait` - how long the system had to wait at the end of the
simulation - might indicate problems in the wrap-up of the simulation
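
Reading such a file for further analysis could look like the sketch below. It assumes that column names are built by joining the measurement name, the type of measurement and the statistic (for example `round_wall_avg` for a hypothetical measurement called `round`). The sample rows are made up for illustration and are not real results.

```python
import csv
import io

# Made-up excerpt in the shape of a `test_data` file; a real file has more
# columns, and the measurement name `round` is hypothetical.
SAMPLE = """hosts,bf,rounds,round_wall_avg,round_wall_max
7,2,10,0.12,0.20
15,2,10,0.25,0.41
"""


def column(csv_text, name):
    """Return the values of column `name` as floats, one per experiment."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row[name]) for row in reader]


print(column(SAMPLE, "hosts"))
print(column(SAMPLE, "round_wall_avg"))
```

For a file on disk, the same `csv.DictReader` can be given the object returned by `open(path, newline="")` instead of the `io.StringIO` wrapper.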

### measure.NewCounterIOMeasure

If you want to measure bandwidth, you can use `measure.NewCounterIOMeasure`.
But you have to be careful to make sure that the system will not include
traffic that is outside of your scope by putting the `.Record()` as close as
possible to the `NewCounterIOMeasure`.
Every `CounterIOMeasure` has the following statistics:

- `_tx` - bytes transmitted
- `_rx` - bytes received
- `_msg_tx` - packets transmitted
- `_msg_rx` - packets received

Plus the standard modifiers (`_avg`, `_std`, ...).

There are two standard measurements done by every simulation:
- `bandwidth` (empty) - bandwidth of all nodes
- `bandwidth_root` - bandwidth of the first node of the roster
4 changes: 3 additions & 1 deletion simul/platform/DETERLAB.md
@@ -18,7 +18,9 @@ Before a successful Deterlab simulation, you need to
1. be signed up at Deterlab. If you're working with DEDIS, ask the person
responsible for you for the _Project Name_ and the _Group Name_.
2. create a simulation defined by an NS-file. You can find a simple
NS-file here: [cothority.ns](./deterlab_users/cothority.ns)
NS-file here: [cothority.ns](./deterlab_users/cothority.ns) - you'll need to
adjust the number of servers, the type of servers, and the bandwidth and delay
restrictions.
3. swap the simulation in

For point 3. it is important of course that Deterlab has enough machines
13 changes: 9 additions & 4 deletions simul/platform/deterlab_users/cothority.ns
@@ -1,6 +1,9 @@
set ns [new Simulator]
source tb_compat.tcl

# Set your number of servers here, as well as the delay.
# The bandwidth can be set below.

set server_count 30
set server_delay 100ms
set lanstr ""
@@ -9,14 +12,16 @@ set lanstr ""

for {set i 0} {$i < $server_count} {incr i} {
set server($i) [$ns node]
#tb-set-hardware $server($i) dl380g3
#tb-set-hardware $server($i) MicroCloud
tb-set-hardware $server($i) pc2133
# Uncomment the server hardware you're using here, or add your own
#tb-set-hardware $server($i) dl380g3
#tb-set-hardware $server($i) MicroCloud
tb-set-hardware $server($i) pc2133
tb-set-node-os $server($i) Ubuntu1404-64-STD
append server_lanstr "$server($i) "
}


# Here you can set your bandwidth restrictions by replacing the 144Mb with the
# expected bandwidth
tb-use-endnodeshaping 1
set serverlan [$ns make-lan "$server_lanstr" 144Mb $server_delay]
