# Grouping Signals Based on Behavior

To enable simulation-assisted optimization, the signal behavior extracted from VCD waveforms is analyzed.
Signals can be grouped if they toggle together, i.e. if they always have the same value.
Such signals can be combined, which may allow the circuit design to be simplified.
However, if only a single VCD waveform is analyzed, some signals may have the same values purely by coincidence.
To achieve meaningful signal grouping, it is therefore advisable to perform several runs (for example, with different testbenches).
This notebook shows how to group signals based on their behavior.
A circuit object is not necessarily required, as this is plain VCD waveform analysis.


<div class="admonition warning alert alert-warning" style="color: darkred;">
  <strong>Warning:</strong> This notebook is currently not executable as it relies on a list of VCD files to analyze.
  To keep the repository at a reasonable size, almost no VCD files are included.
  Accordingly, the notebook only demonstrates the code by example and shows how it can be adapted for use with your own generated VCD waveforms.
</div>

A very simple approach is to group the signals of a VCD waveform that toggle at the same timestamps.
This does not necessarily mean they are interchangeable, but it might indicate a causal connection.
The function `equal_toggles` from the `io.vcd` package groups all signals that toggle at the same timestamps.
The function `filter_signals` filters signals based on additional properties, such as the minimum number of times a signal must change or the minimum number of signals that must share the same toggling timestamps.

In [None]:
from netlist_carpentry.io.vcd import VCDWaveform, equal_toggles, filter_signals

wf = VCDWaveform("path/to/vcd/file.vcd")
# Returns a dictionary of timestamps and a list of associated VCD vars that toggle on the given timestamps.
equal_toggles(wf)

# Filters signals from the given VCD waveform based on how often they toggle (at least 10 times)
# or how many signals toggle together in total (at least 2).
filter_signals(wf, min_occurences=2, min_changes=10)
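To make the idea of grouping by toggle timestamps concrete, the following is a minimal, library-independent sketch. It does not use or reproduce `equal_toggles`; the signal names, the timestamps and the helper `group_by_toggle_times` are hypothetical and only serve as illustration.

```python
from collections import defaultdict

# Hypothetical toy data: signal name -> timestamps at which the signal changes.
toggle_times = {
    "tb.clk": (0, 5, 10, 15),
    "tb.dut.clk_buf": (0, 5, 10, 15),
    "tb.dut.en": (0, 10),
    "tb.dut.valid": (0, 10),
    "tb.dut.rst": (0,),
}


def group_by_toggle_times(toggle_times):
    """Group signal names whose toggle timestamps are identical."""
    groups = defaultdict(list)
    for name, times in toggle_times.items():
        groups[tuple(times)].append(name)
    return dict(groups)


groups = group_by_toggle_times(toggle_times)
# Keep only timestamp patterns that are shared by at least two signals.
shared = {times: names for times, names in groups.items() if len(names) >= 2}
print(shared)
```

In this toy data, `tb.dut.rst` has no partner and drops out, while the two clock signals and the two control signals each form a group. Note that the values of the signals are not compared at all, only the points in time at which they change.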

The approach from the previous cell can be useful for inspecting waveforms, e.g. with regard to their activity.
However, it does not take the actual value traces of the signals into account.
The following example shows how signals can be compared and grouped based on their overall behavior across a set of given VCD files.

The first step is to collect the VCD files.
It is advisable to keep all VCD files in the same directory so that they can be read in easily.
In this example, the VCD files are located in `../path/to/vcd/files` (replace this with your own VCD file directory).
The next step is to sort the VCD files by size.
Although it might seem counterintuitive, it makes sense to read the largest VCD file first (and sort the file list accordingly).
The heuristic is that a larger VCD file contains more activity, so signal groups are more likely to be identified early in the grouping process.
If no other signal toggles together with a given signal, that signal can be removed from consideration immediately, as it is evidently unique.
By analyzing the largest file first, a large share of the signals is eliminated in the first iteration, which reduces the overall workload and runtime.

In [None]:
from pathlib import Path

directory = Path("../path/to/vcd/files")
files = [p for p in directory.iterdir() if p.is_file()]
files_sorted = sorted(files, key=lambda p: p.stat().st_size, reverse=True)

For the signal grouping (or matching), a predefined function can be used.
The function `find_matching_signals` from the `io.vcd` package takes a list of files (here, the sorted VCD files from the previous cell).
The function returns a list of signal groups.
Signals that always had the same values throughout all files are in the same group.
Signals that are present in the VCD files but not in any group either never toggled (they stayed constant and were excluded for simplicity) or are unique.
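The effect of analyzing several runs can be illustrated with a small, library-independent sketch. It is not the implementation of `find_matching_signals`; the helper `refine_groups`, the signal names and the traces are hypothetical, and the sketch assumes that every signal appears in every run.

```python
from collections import defaultdict


def refine_groups(groups, traces):
    """Split each group so that only signals with identical traces in this run stay together."""
    refined = []
    for group in groups:
        by_trace = defaultdict(list)
        for name in group:
            by_trace[traces[name]].append(name)
        # A group that shrinks to a single signal is unique and is dropped.
        refined.extend(g for g in by_trace.values() if len(g) >= 2)
    return refined


# Hypothetical runs: signal name -> tuple of (timestamp, value) changes.
run_1 = {
    "tb.a": ((0, "0"), (5, "1")),
    "tb.b": ((0, "0"), (5, "1")),
    "tb.c": ((0, "0"), (5, "1")),
    "tb.d": ((0, "1"), (5, "0")),
}
run_2 = {
    "tb.a": ((0, "0"), (3, "1")),
    "tb.b": ((0, "0"), (3, "1")),
    "tb.c": ((0, "0"), (7, "1")),
    "tb.d": ((0, "1"), (3, "0")),
}

# Start with all signals in one candidate group and refine it run by run.
groups = [sorted(run_1)]
for run in (run_1, run_2):
    groups = refine_groups(groups, run)
print(groups)
```

After the first run, `tb.a`, `tb.b` and `tb.c` look identical, but only by coincidence: the second run separates `tb.c` from the other two, so a single group containing `tb.a` and `tb.b` remains.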


<div class="admonition warning alert alert-warning" style="color: darkred;">
  <strong>Warning:</strong> The names of the top-level scopes (i.e. the testbench names) must match in all VCD files, otherwise grouping and matching will not work, as the root names would differ!
  If a testbench is named <b>tb_1</b> and another <b>tb_2</b>, then the signal <b>A</b> of the DUT instance <b>I_DUT</b> will be <b>tb_1.I_DUT.A</b> in one VCD file and <b>tb_2.I_DUT.A</b> in the other file.
  Since these hierarchical paths are different, the matching algorithm will treat them as different signals.
  The grouping is likely to fail in such a case!
</div>
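The mismatch described in the warning can be reproduced with plain string comparison. The helper `strip_top_scope` below is hypothetical and not part of `netlist_carpentry`; it only shows why the paths differ and assumes simple dot-separated paths without escaped identifiers.

```python
def strip_top_scope(path: str) -> str:
    """Drop the first component of a dot-separated hierarchical path."""
    _, _, rest = path.partition(".")
    # Paths without a dot have no top-level scope to remove.
    return rest if rest else path


path_1 = "tb_1.I_DUT.A"
path_2 = "tb_2.I_DUT.A"

# The full paths differ because of the testbench name.
print(path_1 == path_2)
# Below the top-level scope, both paths refer to the same DUT signal.
print(strip_top_scope(path_1) == strip_top_scope(path_2))
```

The most robust remedy is still to use the same top-level scope name in all testbenches, so that the paths match without any post-processing.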

In [None]:
from netlist_carpentry.io.vcd import find_matching_signals

coupled_sigs = find_matching_signals(files_sorted)

The retrieved list contains all groups of signals that toggled together throughout all VCD files.
Each group is a list of hierarchical paths (i.e. strings) to signals that always had the same value.
Execute the cell below to write all signal groups to a file and print them.

In [None]:
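# Build a human-readable listing with one block per signal group.
sigs_str = ""
for idx, group in enumerate(coupled_sigs):
    sigs_str += f"Group {idx}:\n"
    sigs_str += "".join(f"\t{sig}\n" for sig in group)
# Write the listing to a file and print it.
with open("signal_groups.txt", "w") as f:
    f.write(sigs_str)
print(sigs_str)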