In [None]:
%matplotlib inline

import glob
import os
import subprocess

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

import ccs

sns.set_theme(style="whitegrid")

# Evaluation Playground

Start building the evaluation for the reception study here.
Move more mature stuff to specific notebooks and scripts later.

## Simulation and Result Pre-Processing

Run the simulations and convert their results outside of Python:

In [None]:
CONFIGS = ["Default", "DoubleNoise"]

In [None]:
# run simulations
for omnetpp_config in CONFIGS:
    subprocess.run(
        ["../lib/veins/bin/veins_run", "-u", "Cmdenv", "-c", omnetpp_config],
        # extend PATH but keep the rest of the environment for the child process
        env={**os.environ, "PATH": f"/scratch/buse/sumo-1_6_0/bin:{os.environ['PATH']}"},
        cwd="scenario",
    )
# convert to csv
for ext in ["sca", "vec"]:
    for opp_result_file in glob.glob(f"scenario/results/*.{ext}"):
        print(f"{opp_result_file} -> {opp_result_file}.csv.gz")
        !{"lib/veins_scripts/eval/opp_" + ext + "2longcsv.sh"} {opp_result_file} | gzip > {opp_result_file + ".csv.gz"}
    
!ls -hl scenario/results/*.csv.gz

## Data Reading

Read the results with the new regex-based reader from the `ccs` package.

The aim is to read the results of multiple simulation runs (and thus CSV files) in one go
and to extract the simulation parameters at the same time.
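
As a rough illustration of the idea (this is not the `ccs` implementation), named groups in a regular expression can pull the parameters out of each file name. The pattern is the one passed to `ccs.read_csvs` in the next cell; the file names and the helper `parse_result_name` are made up for this sketch.

```python
import re

# Same pattern as the one passed to ccs.read_csvs in this notebook
PATTERN = re.compile(r"(?P<config>[^-]+)-#(?P<runnr>\d+)\.vec\.csv\.gz")


def parse_result_name(filename):
    """Return the parameters encoded in a file name, or None if it does not match."""
    match = PATTERN.fullmatch(filename)
    return match.groupdict() if match else None


# Hypothetical file names, for illustration only
print(parse_result_name("Default-#0.vec.csv.gz"))
print(parse_result_name("DoubleNoise-#12.vec.csv.gz"))
print(parse_result_name("Default-#0.sca.csv.gz"))  # scalar file, does not match
```

Note that the extracted values are strings, which is why the reading cell casts `runnr` to `int` afterwards.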

In [None]:
# configuration columns used in various places
CONF_COLS = ["config", "runnr"]

In [None]:
RESULTDIR = "scenario/results"
dfs = (
    ccs.read_csvs(
        RESULTDIR,
        r"(?P<config>[^-]+)-#(?P<runnr>\d+)\.vec\.csv\.gz",
        read_csv_args={"sep": " ", "names": ["vecid", "module", "signal", "event", "time", "value"]},
    )
    # set index types
    .reset_index(CONF_COLS)
    .astype({"config": "category", "runnr": int})
    .reset_index(drop=True)
    # extract submodule and host number
    .pipe(lambda df: df.merge(df.module.str.extract(r"[^.]+\.node\[(?P<hostnr>\d+)\]\.(?P<submodule>.*)"), left_index=True, right_index=True))
    .drop(columns=['module'])
    .assign(signal=lambda df: df.signal.str.replace(":vector", ""))
    .astype({"submodule": "category", "signal": "category", "hostnr": int})
)
dfs.info()
dfs.head(3)

In [None]:
dfs.groupby(CONF_COLS + ["hostnr", "signal"]).event.count().unstack()

### Check Mobility

See how the position of the receiver (host 0) changes over time to ensure there is smooth movement and a stable relationship between time and distance.

In [None]:
positions = dfs.query("hostnr == 0 and signal in ('posx', 'posy')").pivot(index=CONF_COLS + ["time"], columns=["signal"], values="value").reset_index()
assert (positions.groupby(CONF_COLS).posy.diff().dropna() == 0).all()  # The Y coordinate should not change
positions.head(3)

Both time and position (mostly) advance at 1 s / 1 m per recorded item, so that's fine (Veins' `updateInterval` is set to 1 s and the vehicle moves at 1 m/s).

There are small rounding errors in the position, but that is fine for our purposes.
We just round to full meters and convert to `int` later.
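
The order of rounding and conversion matters: converting to `int` directly truncates, so a position just below a full meter would end up one meter short. A small sketch with made-up position values:

```python
# Made-up positions with small rounding errors (not taken from the simulation)
positions_m = [1.0000001, 4.9999, 10.00002]

truncated = [int(p) for p in positions_m]       # drops the fractional part
rounded = [int(round(p)) for p in positions_m]  # rounds to the nearest meter first

print(truncated)
print(rounded)
```

Only the second variant maps 4.9999 to 5, which is why the `distance` cell calls `.round(0)` before `.astype(int)`.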

In [None]:
positions.groupby(CONF_COLS)[['time', 'posx']].apply(lambda df: df.diff().describe()).unstack(CONF_COLS).T

In [None]:
distance = (
    dfs.query("signal == 'posx'")
    .assign(second=lambda df: df.time.astype(int))
    .pivot(index=CONF_COLS + ["second"], columns="hostnr", values="value")
    .bfill()  # same as fillna(method="backfill"), which newer pandas versions deprecate
    .pipe(lambda df: pd.Series(df[0] - df[1], name="distance"))
    .round(0)
    .astype(int)
)
distance

## First Insights

Explore the RSS and SNR over time and (later) distance.

I have added recording points to the `veins::Decider80211p::processSignalEnd` method.
There I record some signal properties, regardless of whether the signal could be decoded or even detected.
That would not have been possible in the MAC layer, as that only knows about successfully decoded frames.
However, I'll only get valid SNR values for signals that were at least detected; for others the Decider stops early and does not even compute it.

In [None]:
receptions = (
    dfs.query("hostnr == 0 and signal in ('RSSIdBm', 'SNR', 'Correct', 'Detected')")
    [CONF_COLS + ['time', 'signal', 'value']]  # TODO: drop?
    .pivot(index=CONF_COLS + ["time"], columns=["signal"], values="value")
    .assign(SNRdB=lambda df: 10 * np.log10(df.SNR))
    .astype({"Correct": bool, "Detected": bool})
    .reset_index()
    .assign(second=lambda df: df.time.astype(int))
    .set_index(CONF_COLS + ["second"]).assign(distance=distance).reset_index()
    [CONF_COLS + ['time', 'distance', 'Detected', 'Correct', 'RSSIdBm', 'SNRdB']]
)
receptions.head()

### Detection Threshold

Message detection stops at around 1120 meters of distance (vehicles start with 1 m between them and move apart at 1 m/s).

In [None]:
detection_cutoff_distances = receptions[receptions.Detected].groupby(CONF_COLS).distance.max()
assert (detection_cutoff_distances.diff().dropna() == 0).all()
detection_cutoff_distance = detection_cutoff_distances.iloc[0]
detection_cutoff_distance

In [None]:
ax = sns.violinplot(data=receptions, y="distance", hue="Detected", x="config")
ax.hlines(y=detection_cutoff_distance, xmin=ax.get_xlim()[0], xmax=ax.get_xlim()[1], color="grey", linestyles="dashed")
receptions.groupby(CONF_COLS + ["Detected"]).distance.describe()

There actually is a hard cut-off with no stochastic process in between.

**Note**: There is no influence of noise or interference here; this boundary is purely based on the RSS of the incoming signal itself and the receiver config.
However, a previous message that the receiver is trained on will affect detection (only one signal can be received at a time, no frame capturing).

In [None]:
detection = receptions.groupby(CONF_COLS + ["distance"]).Detected.sum().reset_index()
fig, ax = plt.subplots()
sns.scatterplot(data=detection, x="distance", y="Detected", hue="config", ax=ax)
ax.set_xlim(left=detection.query("Detected == 100").distance.iloc[-1] - 20, right=detection.query("Detected == 0").distance.iloc[0] + 20)
ax.vlines(x=detection_cutoff_distance, ymin=0, ymax=100, color="grey", linestyle="dashed")

### SNR and RSS

Signals will only be detected if they are above the `minPowerLevel` setting (of -98 dBm, indicated by the dotted horizontal line).
Signals below that will still be processed by the Decider, but not even considered for decoding.
Thus, there are no SNR values for those signals.

Also note that with increased noise, even signals with an SNR below 0 dB are considered for detection.
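
To see why negative SNR values can occur: in the dB domain the SNR is simply the difference between RSS and noise power. A minimal sketch, assuming the `DoubleNoise` config raises the noise floor from -98 dBm to roughly -95 dBm (twice the power; the exact configured value is an assumption here). The helper `snr_db` is hypothetical.

```python
MIN_POWER_LEVEL_DBM = -98.0  # detection threshold named in the text


def snr_db(rss_dbm, noise_dbm):
    """SNR in dB as the difference of two power levels given in dBm."""
    return rss_dbm - noise_dbm


# A signal arriving exactly at the detection threshold
print(snr_db(MIN_POWER_LEVEL_DBM, -98.0))  # default noise floor
print(snr_db(MIN_POWER_LEVEL_DBM, -95.0))  # assumed doubled noise floor
```

With the default noise floor, a signal at the threshold has an SNR of 0 dB. With the assumed doubled noise, the same signal is still above `minPowerLevel` but ends up at about -3 dB.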

In [None]:
fig, ax = plt.subplots(figsize=(18, 6))
sns.lineplot(
    data=receptions.melt(id_vars=CONF_COLS + ["time", "distance", "Detected", "Correct"], var_name="signal", value_name="value"),
    x="distance",
    y="value",
    hue="signal",
    style="config",
    estimator="mean",
    ci=None,
    ax=ax,
)
ax.hlines(y=-98, xmin=receptions.distance.min(), xmax=receptions.distance.max(), color="grey", linestyle="dotted")
ax.vlines(x=detection_cutoff_distance, ymin=receptions.RSSIdBm.min(), ymax=receptions.SNRdB.max(), color="grey", linestyle="dotted")
ax.set_ylabel("RSS [dBm] / SNR [dB]")

### Decodability

Signals start to become undecodable (i.e., not `Correct`) at around 450 m for the `Default` configuration.
At first the losses are only sporadic, though: most messages still come through.
Only at around 600 m are there no more decodable messages.

When looking at the relation between decodability and RSS/SNR, a similar pattern is visible.
Just note that the x axis appears flipped now, as distance increases over time while RSS (and thus also SNR) decreases.

These patterns appear to match the plots in `bloessl2019case` and the original NIST error model paper (for QPSK 1/2 and 500 Byte frames, as configured here).

With the increased noise (which simulates a single max-interferer), the decodability is visibly worse.
In the RSSI plot, the shift of 3 dB is directly visible.
Distance-wise, the `DoubleNoise` config starts losing packets at around 350 m and stops receiving entirely at around 410 m.
So not only is the reliable range much shorter, the additional range in which at least some messages get through is also much smaller (60 m compared to 150 m).

In [None]:
pdr = receptions.groupby(CONF_COLS + ["distance"]).agg({"Correct": "sum", "RSSIdBm": "mean", "SNRdB": "mean"}).reset_index()
pdr_change_boundaries = pdr.query("Correct < 100 and Correct > 0")

fig, (left, mid, right) = plt.subplots(1, 3, figsize=(18, 5), sharey=True, constrained_layout=True)
sns.scatterplot(data=pdr, x="distance", y="Correct", hue="config", ax=left)
sns.scatterplot(data=pdr, y="Correct", x="RSSIdBm", hue="config", ax=mid)
sns.scatterplot(data=pdr, y="Correct", x="SNRdB", hue="config", ax=right)
left.set_xlim(left=pdr_change_boundaries.distance.min() - 10, right=pdr_change_boundaries.distance.max() + 10)
mid.set_xlim(left=pdr_change_boundaries.RSSIdBm.min() - 0.5, right=pdr_change_boundaries.RSSIdBm.max() + 0.5)
right.set_xlim(left=pdr_change_boundaries.SNRdB.min() - 0.5, right=pdr_change_boundaries.SNRdB.max() + 0.5)

## Excursus: How dB changes when the base values change

What happens to a number expressed in dB when the underlying absolute number is doubled or halved?
This should give me some intuition on how much interference or noise has to increase to have significant effects on decodability.

In [None]:
base_number = pd.Series(2 ** np.arange(0, 8, 0.5))
base_vs_dB = pd.DataFrame({"base": base_number, "dB": 10 * np.log10(base_number)})

fig, (left, right) = plt.subplots(1, 2, figsize=(24, 4))
base_vs_dB.plot(x="base", y="dB", ax=left)
right.axis('off')
right.table(cellText=base_vs_dB.round(2).values, colLabels=base_vs_dB.columns, loc='center')
# base_vs_dB.head(8)

Result: Follow the rule of thumb **"doubling the base number adds 3 dB"**.

Thus, halving the base number subtracts 3 dB.

So, if there is *interference equal to the noise floor* (currently -98 dBm), then the SINR in dB is *3 dB lower* than the SNR without any interference.
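
As a worked example of this rule of thumb: powers given in dBm cannot be added directly, they have to be converted to linear units (mW) first. The sketch below adds interference of -98 dBm to the -98 dBm noise floor. The helper names are chosen for illustration only.

```python
import math


def dbm_to_mw(dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (dbm / 10)


def mw_to_dbm(mw):
    """Convert a power from milliwatts to dBm."""
    return 10 * math.log10(mw)


noise_dbm = -98.0
interference_dbm = -98.0

# Sum in the linear domain, then convert back to dBm
total_dbm = mw_to_dbm(dbm_to_mw(noise_dbm) + dbm_to_mw(interference_dbm))

print(round(total_dbm, 2))
print(round(total_dbm - noise_dbm, 2))
```

The combined level comes out at about -95 dBm, i.e. roughly 3 dB above the noise floor alone, so the SINR drops by the same amount.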

## TODO: Influence of Interference

Now that I know about the basic behavior, I want to find out how increased interference could change the results.
Note that Veins treats interference and noise mostly the same (except for some reporting), so I could also just adapt the noise to get an impression of what would change.

Main Questions:

- How much interference/noise is needed to significantly shift the reception behavior?
- How much interference can there be (assuming CSMA/CA works)?
    * ANSWERED: for a single source of interference (that is not decodable).
- And finally: at what distance will a signal be so weak that it can not interfere with the reception of another signal anymore?

#### First thoughts and insights:

- at around 1220 m, the message can no longer be detected
    * because the RSS is below the `minPowerLevel` of -98 dBm
    * the noise floor is configured to the same value of -98 dBm, but does not have an influence on the *detectability*, only on the *decodability* (through SNR)
- decoding stops much earlier, at around 600 m (without any interference)
- so, an interfering signal sent from around 1225 m will also have an RSS of around -98 dBm
    * in that case, the SINR would be half as big as the pure SNR, or 3 dB lower
    * we need a SINR of around 7 dB for reliable decoding, present at around 500 m
    * to compensate for said interference, we need around 10 dB SNR, present only at around 350 m
    * this matches the observations from above with the `DoubleNoise` approach.

Verification approach: just increase the noise level by 3 dB and compare outcomes; the reliable decodability range should be around the numbers above.
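
The arithmetic behind these thoughts can be sketched as follows, under the assumption that interference simply adds to the noise power in the linear domain (which is how the text above treats it). The function `sinr_db` is a hypothetical helper, not part of Veins.

```python
import math


def dbm_to_mw(dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (dbm / 10)


def sinr_db(rss_dbm, noise_dbm, interference_dbm=None):
    """SINR in dB; without interference this is the plain SNR."""
    disturbance_mw = dbm_to_mw(noise_dbm)
    if interference_dbm is not None:
        disturbance_mw += dbm_to_mw(interference_dbm)
    return rss_dbm - 10 * math.log10(disturbance_mw)


NOISE_DBM = -98.0
rss_dbm = NOISE_DBM + 10.0  # a signal with 10 dB SNR

print(round(sinr_db(rss_dbm, NOISE_DBM), 2))
print(round(sinr_db(rss_dbm, NOISE_DBM, interference_dbm=-98.0), 2))
```

A signal with 10 dB SNR ends up at about 7 dB SINR once an interferer at the noise-floor level is added, which is the threshold for reliable decoding mentioned in the list above.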