# Setting Up HW2VEC 

HW2VEC is a Graph Neural Network (GNN)-based hardware trojan detection tool. Its source code is available at https://github.com/AICPS/hw2vec. **The package versions and setup commands listed in that repo are quite old and cause a lot of errors, so please don't follow them.** After much trial and error, the following commands (run in order) worked for setting up HW2VEC on my system.

`git clone https://github.com/AICPS/hw2vec`

`conda create --name hw2vec_python37_pyg_pygraphviz`

`conda activate hw2vec_python37_pyg_pygraphviz`

`conda install python=3.7`

`conda install pyg -c pyg`

`python3 -m pip install pygraphviz`

`python3 -m pip install pydot pathlib patool pyunpack pyverilog`

`python3 -m pip install pytorch-nlp`

`python3 -m pip install scikit-image`

`python3 -m pip install pandas networkx matplotlib`

`conda list`
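Besides reading the `conda list` output, the installation can be checked from Python. The snippet below is a minimal sketch, not part of HW2VEC: `find_missing` and `REQUIRED_MODULES` are names made up for this example, and the list assumes the usual import names of the packages installed above (for example `torch_geometric` for `pyg`, `torchnlp` for `pytorch-nlp`, `skimage` for `scikit-image`, and `patoolib` for `patool`).

```python
import importlib.util

# Import names, which differ from the pip/conda package names in a few cases.
REQUIRED_MODULES = [
    "torch",            # expected to come in as a dependency of pyg
    "torch_geometric",  # the `pyg` package
    "pygraphviz",
    "pydot",
    "patoolib",         # the `patool` package
    "pyunpack",
    "pyverilog",
    "torchnlp",         # the `pytorch-nlp` package
    "skimage",          # the `scikit-image` package
    "pandas",
    "networkx",
    "matplotlib",
]


def find_missing(modules):
    """Return the names from `modules` that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]
```

Running `print(find_missing(REQUIRED_MODULES))` inside the activated environment should print an empty list if every package was installed.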

My system has:
1. **Ubuntu 20.04.4 LTS** (execute `lsb_release -d` in your Ubuntu terminal to see the version of your Ubuntu)
2. **Anaconda**
3. **CUDA 12** (execute `nvcc --version` in your Ubuntu terminal to check whether CUDA is available and which version it is; nvcc = NVIDIA CUDA compiler)

Having set up HW2VEC, let's execute the example script `hw2vec/examples/use_case_1.py`. This script takes a digital design (a Verilog file) as input, converts it into a graph in which each node carries a feature vector (embedding), and feeds the graph to a Graph Neural Network (GNN). The GNN transforms the node features into a new set of node features. The transformed graph (with the new node features) can then be used to train a Multi-Layer Perceptron (MLP).
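To make "the GNN transforms the node features into a new set of node features" concrete, here is a toy sketch of one round of neighbor averaging in plain Python. It is not HW2VEC's model: real GNN layers also apply learned weights and a non-linearity, and the function name `mean_aggregate` is made up for this illustration.

```python
def mean_aggregate(features, edges):
    """One round of neighbor averaging on an undirected graph.

    features: dict mapping node id -> list of floats (the node's feature vector)
    edges:    list of (node, node) pairs
    Returns a dict with the same keys holding the updated feature vectors.
    """
    # Each node averages over itself and its direct neighbors.
    neighbors = {node: [node] for node in features}
    for src, dst in edges:
        neighbors[src].append(dst)
        neighbors[dst].append(src)

    updated = {}
    for node, group in neighbors.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[n][i] for n in group) / len(group) for i in range(dim)
        ]
    return updated
```

The graph keeps its structure; only the per-node vectors change, which is the property the later MLP stage relies on.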

I found some issues in the tool, so **steps 1-3** below need to be performed before the script can be executed in **step 4**.

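**Step 1:** Modify **line 105** of `hw2vec/graph2vec/models.py` as follows, so that the weights are mapped to the CPU when loaded and mismatched keys in the checkpoint are tolerated: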

`self.load_state_dict(torch.load(model_weight_path, map_location=torch.device('cpu')),strict=False)`

**Step 2:** Modify **line 28** of `hw2vec/examples/use_case_1.py` as:

`hw_design_dir_path = Path("../assets/datasets/TJ-RTL-toy/TjFree/RS232/")`

Here, `RS232` is the directory containing the circuit implementation of **RS232**. None of the files inside this directory can be named `topModule.v`; otherwise, the `pre-process` function of the `HW2GRAPH` class in `hw2vec/hw2graph.py` can't run on them.
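A design directory can be checked for the reserved file name before running the script. This is a small sketch using only the standard library; `find_conflicting_files` is a hypothetical helper, not something HW2VEC provides.

```python
from pathlib import Path


def find_conflicting_files(design_dir, reserved_name="topModule.v"):
    """Return every file under `design_dir` whose name equals `reserved_name`."""
    return sorted(
        path
        for path in Path(design_dir).rglob("*")
        if path.is_file() and path.name == reserved_name
    )
```

For example, `find_conflicting_files("../assets/datasets/TJ-RTL-toy/TjFree/RS232/")` should return an empty list for a directory that is safe to use.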

**Step 3:** Unzip `assets/datasets.zip`.
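The archive can also be unpacked from Python with the standard `zipfile` module. This is only a sketch: `extract_datasets` is a made-up helper, and the destination directory (for instance `assets`) is an assumption that should be adjusted so the extracted tree matches the path used in step 2.

```python
import zipfile
from pathlib import Path


def extract_datasets(zip_path, dest_dir):
    """Extract `zip_path` into `dest_dir`; return the names now at the top level of `dest_dir`."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(entry.name for entry in dest.iterdir())
```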

**Step 4:** Execute the following command in the terminal:

`python use_case_1.py`

## Pyverilog

I set up an environment with Pyverilog installed on my system, in order to check whether Pyverilog can process SystemVerilog files.

For the installation steps, I referred to the [Pyverilog repo](https://github.com/PyHDI/Pyverilog) on GitHub.

I took an example file, *test.v* (available in the above repo), and changed its extension to *.sv*. Then, I ran the `VerilogDataflowAnalyzer()` (provided by Pyverilog) on this file using the following command from the Pyverilog GitHub repo:

`python3 pyverilog/examples/example_dataflow_analyzer.py -t top test.sv`

The included files can have either ***.sv*** or ***.v*** extension as long as the extension is mentioned in the `include` statement. 

I ran the following command to run the `VerilogDataflowAnalyzer()` on the *.sv* files of Ariane's `axi_to_apb` module. 

`python3 ./examples/example_dataflow_analyzer.py ../hw2vec/assets/datasets/TJ-RTL-toy/TjIn/ariane_axi_to_apb/cva6_axi2apb_64_32_T3P4.sv ../hw2vec/assets/datasets/TJ-RTL-toy/TjIn/ariane_axi_to_apb/fifo_v3.sv ../hw2vec/assets/datasets/TJ-RTL-toy/TjIn/ariane_axi_to_apb/fifo_v2.sv ../hw2vec/assets/datasets/TJ-RTL-toy/TjIn/ariane_axi_to_apb/fifo_v1.sv ../hw2vec/assets/datasets/TJ-RTL-toy/TjIn/ariane_axi_to_apb/axi_w_buffer.sv ../hw2vec/assets/datasets/TJ-RTL-toy/TjIn/ariane_axi_to_apb/axi_single_slice.sv ../hw2vec/assets/datasets/TJ-RTL-toy/TjIn/ariane_axi_to_apb/axi_r_buffer.sv ../hw2vec/assets/datasets/TJ-RTL-toy/TjIn/ariane_axi_to_apb/axi_b_buffer.sv ../hw2vec/assets/datasets/TJ-RTL-toy/TjIn/ariane_axi_to_apb/axi_aw_buffer.sv ../hw2vec/assets/datasets/TJ-RTL-toy/TjIn/ariane_axi_to_apb/axi_ar_buffer.sv`
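Typing out every file path by hand is error-prone. The sketch below builds the same kind of command from a directory listing; `build_analyzer_command` is a hypothetical helper, and it passes the files in alphabetical order, which differs from the order in the command above (whether the order matters to Pyverilog is not something I verified).

```python
import sys
from pathlib import Path


def build_analyzer_command(script, design_dir, pattern="*.sv"):
    """Return an argument list that runs `script` on every file in `design_dir` matching `pattern`."""
    files = sorted(str(path) for path in Path(design_dir).glob(pattern))
    # sys.executable is the interpreter of the currently active environment.
    return [sys.executable, str(script), *files]
```

The returned list can be handed to `subprocess.run(cmd, check=True)`, with `script` set to `./examples/example_dataflow_analyzer.py` and `design_dir` set to the `ariane_axi_to_apb` directory.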

However, the Pyverilog parser fails at a number of lines in the *.sv* files, specifically at SystemVerilog constructs. The following lines from the Ariane code are some examples at which Pyverilog reports an error.

`input  logic [AXI_NUMBYTES-1:0][7:0]   WDATA_i    ,`

`logic [1:0][31:0]              WDATA; `

 `enum logic [3:0] { IDLE, SINGLE_RD, SINGLE_RD_64, BURST_RD_1, BURST_RD, BURST_RD_64, BURST_WR, BURST_WR_64,
                    SINGLE_WR,SINGLE_WR_64, WAIT_R_PREADY, WAIT_W_PREADY} CS, NS;`
                    
The errors go away when these lines are commented out. However, these lines are part of the actual Ariane implementation, so removing them is not a real fix.
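To get a rough idea of how many such lines a file contains before handing it to Pyverilog, the source can be scanned with regular expressions. This is a heuristic sketch that only covers the two kinds of constructs shown in the examples above (packed multi-dimensional `logic` arrays and `enum` declarations); the names `PATTERNS` and `flag_constructs` are made up, the scan does not skip comments, and it is not a list of everything Pyverilog rejects.

```python
import re

# Heuristic patterns modeled on the failing example lines only.
PATTERNS = {
    # `logic` followed by two or more bracketed ranges, e.g. logic [1:0][31:0]
    "packed multi-dimensional array": re.compile(r"\blogic\s*(?:\[[^\]]*\]\s*){2,}"),
    "enum declaration": re.compile(r"\benum\b"),
}


def flag_constructs(source):
    """Return (line_number, label) pairs for lines of `source` matching any pattern."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits
```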
 
## HW2VEC

**Just to be sure, I also ran the *.sv* files of Ariane's `axi_to_apb` module through the HW2VEC toolflow. It fails with the same errors as above.**

**However, both HW2VEC and Pyverilog successfully parsed simpler *.v* and *.sv* code without any error.**

Like this hardware trojan detection method, there are likely many other detection methods that have not been tested on large-scale SoC-level designs and that cannot compile designs using advanced features of the HDL.

The `examples/use_case_2.py` script performs four main tasks: converting hardware circuits to graphs, configuring the GNN model, training the GNN model, and evaluating the model.

Below are the modifications I had to make to get the `examples/use_case_2.py` script working:

1. `hw2vec/config.py`: Replace `default="./dfg_tj_rtl.pkl"` on **line 20** with `default="./DFG-TJ-RTL.pkl"`, because the *.pkl* files `dfg_tj_rtl.pkl` and `ast_tj_rtl.pkl` in `examples` are empty. Alternatively, comment out **lines 10** and **23-26** of `examples/use_case_2.py` and fix the indentation of **lines 11-21** accordingly. Either change is needed because loading the graph data from the cache (*i.e.,* the *.pkl* files in `examples`) doesn't work.
2. `hw2vec/hw2graph.py`: Modify **line 148** so that the arguments of `compute_class_weight` are passed by keyword (`class_weight=`, `classes=`, `y=`), which newer scikit-learn releases require:

`class_weights = torch.from_numpy(compute_class_weight(class_weight='balanced', classes=np.unique(training_labels), y=training_labels))`
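For reference, the `'balanced'` setting in the line above weights each class by `n_samples / (n_classes * count_of_that_class)`, so the rarer class gets the larger weight. The sketch below re-implements that formula in plain Python purely for illustration; `balanced_class_weights` is a made-up name and is not a scikit-learn function.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} using n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in sorted(counts.items())
    }
```

With three trojan-free samples (label `0`) and one trojan-inserted sample (label `1`), the weights come out as 4 / (2 * 3) for class `0` and 4 / (2 * 1) for class `1`.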

The script can then be run as usual, without the `--data_pkl_path` option and with just the `--graph_type` option (`DFG` or `AST`).

<span style="color:red">**Next, I need to understand how to view the prediction results of the pre-trained GNN model for trojan detection.**</span>

# Evaluating My Own Trojan-Inserted Designs Using HW2VEC

In order to evaluate my own trojan-inserted hardware designs (*.v*), I modified `use_case_2.py` in `examples`. The modified `use_case_2.py` can be found in this repository. The most important changes are the additions of **lines 55-56**, **74**, and **112-114**. The rest of the changes are `print()` statements in **lines 53**, **57-59**, **62-71**, and **110-111**.

To evaluate a trojan-free design, the Verilog file must be placed in `assets/datasets/TJ-RTL-toy/TjFree/pyVerilog`. To evaluate a trojan-inserted design, the Verilog file must be placed in `assets/datasets/TJ-RTL-toy/TjIn/pyVerilog`. Lines 55-56 of the modified script assume that only one design, either trojan-free or trojan-inserted, is given for evaluation at a time. This design is excluded from the training and validation sets. Lines 55-56 can easily be modified to include more designs for evaluation, but at the moment one design serves my purpose.
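Placing the file in the right directory can be scripted. The helper below is a hypothetical sketch (`stage_design` is not part of HW2VEC or of the modified script): it copies one Verilog file into the `TjFree/pyVerilog` or `TjIn/pyVerilog` directory under a dataset root such as `assets/datasets/TJ-RTL-toy`, and refuses to do so if either directory already holds a `.v` file, mirroring the one-design-at-a-time assumption described above.

```python
import shutil
from pathlib import Path


def stage_design(verilog_file, dataset_root, has_trojan):
    """Copy `verilog_file` into the evaluation directory matching its label.

    Raises ValueError if a design is already staged in either directory.
    """
    root = Path(dataset_root)
    free_dir = root / "TjFree" / "pyVerilog"
    trojan_dir = root / "TjIn" / "pyVerilog"

    for folder in (free_dir, trojan_dir):
        folder.mkdir(parents=True, exist_ok=True)
        if any(folder.glob("*.v")):
            raise ValueError(f"{folder} already holds a design")

    target = (trojan_dir if has_trojan else free_dir) / Path(verilog_file).name
    shutil.copy(verilog_file, target)
    return target
```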