# **ELEC 466/568 SystemC Project**

Daler N. Rakhmatov

Including material (with modifications) from:

- http://cares.icsl.ucla.edu/NetBench
- F. Vahid, UC-Irvine

## Goal

- Start from a purely software implementation of an application example
  - Diffie-Hellman key exchange (NetBench v1.1.0)
- Convert the computationally intensive function NN\_DigitMult into a hardware module
  - Design the datapath
  - Design the controller
- Connect hardware and software using a simple handshaking protocol
  - Use enable and done signals
- Simulate the mixed hardware-software implementation to verify functional correctness

## Diffie-Hellman key exchange I

- The Diffie-Hellman key exchange protocol allows two users to exchange a secret key over an insecure medium without any prior secrets
- The protocol has two public system parameters: prime p and generator g
  - Parameter p is a prime number
  - Parameter g is an integer less than p, with the following property:
    - For every number n = 1, 2, ..., p-1, there is a power k of g such that n ≡ g<sup>k</sup> mod p
- So, suppose Alice and Bob want to agree on a shared secret key using the Diffie-Hellman protocol...
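
The generator property above can be checked by brute force for toy parameters. The following is a minimal sketch in plain C++ (not part of the NetBench code); the function name `is_generator` and the values p = 23, g = 5 are illustrative assumptions, and the approach is only practical for very small primes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Brute-force generator test (hypothetical helper, toy sizes only).
// Assumes p is a small prime and 1 <= g < p.
// g is a generator if g^1, g^2, ..., g^(p-1) mod p are all distinct,
// which means they cover every value 1 .. p-1.
bool is_generator(std::uint32_t g, std::uint32_t p) {
    assert(p >= 3 && g >= 1 && g < p);
    std::vector<bool> seen(p, false);   // seen[n] == true once n = g^k mod p has appeared
    std::uint64_t x = 1;                // running power g^k mod p
    for (std::uint32_t k = 1; k <= p - 1; ++k) {
        x = (x * g) % p;
        if (seen[x]) return false;      // a repeated value means some n is never reached
        seen[x] = true;
    }
    return true;
}
```

For p = 23, the value g = 5 passes this test, while g = 2 does not, because the powers of 2 start repeating after 11 steps.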

## Diffie-Hellman key exchange II

- First, Alice generates a random private value a, and Bob generates a random private value b
  - Both a and b are integers
- Second, they derive their public values using parameters p and g and their private values:
  - Alice's public value is A ≡ g<sup>a</sup> mod p
  - Bob's public value is B ≡ g<sup>b</sup> mod p
- Third, they exchange their public values
- Fourth, Alice computes B<sup>a</sup> mod p ≡ (g<sup>b</sup>)<sup>a</sup> mod p, and Bob computes A<sup>b</sup> mod p ≡ (g<sup>a</sup>)<sup>b</sup> mod p
  - Since (g<sup>b</sup>)<sup>a</sup> mod p ≡ (g<sup>a</sup>)<sup>b</sup> mod p ≡ k, Alice and Bob now have a shared secret key k
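
The four steps above can be traced with small numbers. This is a minimal sketch in plain C++ using 64-bit arithmetic rather than the multi-precision NN routines of the project; the names `pow_mod`, `DhResult`, and `dh_exchange`, and the toy values p = 23, g = 5, a = 6, b = 15, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Modular exponentiation by repeated squaring: returns (base^exp) mod m.
// Assumes m fits in 32 bits so every intermediate product fits in 64 bits.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t m) {
    assert(m > 0 && m <= 0xffffffffULL);
    std::uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

// Values produced by one run of the exchange (hypothetical struct).
struct DhResult {
    std::uint64_t A;          // Alice's public value, g^a mod p
    std::uint64_t B;          // Bob's public value,   g^b mod p
    std::uint64_t key_alice;  // B^a mod p, computed by Alice
    std::uint64_t key_bob;    // A^b mod p, computed by Bob
};

DhResult dh_exchange(std::uint64_t p, std::uint64_t g,
                     std::uint64_t a, std::uint64_t b) {
    DhResult r;
    r.A = pow_mod(g, a, p);            // step 2: derive public values
    r.B = pow_mod(g, b, p);
    r.key_alice = pow_mod(r.B, a, p);  // step 4: each side uses the other's public value
    r.key_bob   = pow_mod(r.A, b, p);
    return r;
}
```

With p = 23, g = 5, a = 6, and b = 15, this gives A = 8, B = 19, and both sides arrive at the shared key k = 2.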

## Some definitions and macros

```
typedef unsigned short int UINT2;
typedef unsigned int UINT4;
typedef UINT4 NN_DIGIT;
typedef UINT2 NN_HALF_DIGIT;
. . .
#define NN_DIGIT_BITS 32
#define NN_HALF_DIGIT_BITS 16
#define MAX_NN_DIGIT 0xffffffff
#define MAX_NN_HALF_DIGIT 0xffff
#define LOW_HALF(x) ((x) & MAX_NN_HALF_DIGIT)
#define HIGH_HALF(x) (((x) >> NN_HALF_DIGIT_BITS) & MAX_NN_HALF_DIGIT)
#define TO_HIGH_HALF(x) (((NN_DIGIT)(x)) << NN_HALF_DIGIT_BITS)
```

## NN\_DigitMult

```
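// Computes the 64-bit product b * c: a[0] = low digit, a[1] = high digit
void NN_DigitMult (NN_DIGIT a[2], NN_DIGIT b, NN_DIGIT c) {
  NN_DIGIT t, u;
  NN_HALF_DIGIT bHigh, bLow, cHigh, cLow;
  bHigh = (NN_HALF_DIGIT)HIGH_HALF (b);
  bLow = (NN_HALF_DIGIT)LOW_HALF (b);
  cHigh = (NN_HALF_DIGIT)HIGH_HALF (c);
  cLow = (NN_HALF_DIGIT)LOW_HALF (c);
  // Four 16x16-bit partial products
  a[0] = (NN_DIGIT)bLow * (NN_DIGIT)cLow;
  t = (NN_DIGIT)bLow * (NN_DIGIT)cHigh;
  u = (NN_DIGIT)bHigh * (NN_DIGIT)cLow;
  a[1] = (NN_DIGIT)bHigh * (NN_DIGIT)cHigh;
  // Sum the cross terms; a carry out of t belongs to the upper half of a[1]
  if ((t += u) < u) a[1] += TO_HIGH_HALF (1);
  // Add the low half of t into the upper half of a[0], with carry into a[1]
  u = TO_HIGH_HALF (t);
  if ((a[0] += u) < u) a[1]++;
  // Add the high half of t into a[1]
  a[1] += HIGH_HALF (t);
}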
```
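
One way to gain confidence in the half-digit arithmetic before moving it into hardware is to compare it against a native 64-bit multiplication. The sketch below is plain C++ and self-contained: it repeats the definitions from above with fixed-width types (`std::uint32_t` and `std::uint16_t` stand in for `UINT4` and `UINT2`, which is an assumption about the target platform), and `matches_reference` is a hypothetical helper, not part of NetBench.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint32_t NN_DIGIT;       // assumed equivalent of UINT4
typedef std::uint16_t NN_HALF_DIGIT;  // assumed equivalent of UINT2

#define NN_HALF_DIGIT_BITS 16
#define MAX_NN_HALF_DIGIT 0xffff
#define LOW_HALF(x) ((x) & MAX_NN_HALF_DIGIT)
#define HIGH_HALF(x) (((x) >> NN_HALF_DIGIT_BITS) & MAX_NN_HALF_DIGIT)
#define TO_HIGH_HALF(x) (((NN_DIGIT)(x)) << NN_HALF_DIGIT_BITS)

// Same algorithm as the listing above: a[0] = low digit, a[1] = high digit.
void NN_DigitMult(NN_DIGIT a[2], NN_DIGIT b, NN_DIGIT c) {
    NN_DIGIT t, u;
    NN_HALF_DIGIT bHigh = (NN_HALF_DIGIT)HIGH_HALF(b);
    NN_HALF_DIGIT bLow  = (NN_HALF_DIGIT)LOW_HALF(b);
    NN_HALF_DIGIT cHigh = (NN_HALF_DIGIT)HIGH_HALF(c);
    NN_HALF_DIGIT cLow  = (NN_HALF_DIGIT)LOW_HALF(c);
    a[0] = (NN_DIGIT)bLow * (NN_DIGIT)cLow;
    t    = (NN_DIGIT)bLow * (NN_DIGIT)cHigh;
    u    = (NN_DIGIT)bHigh * (NN_DIGIT)cLow;
    a[1] = (NN_DIGIT)bHigh * (NN_DIGIT)cHigh;
    if ((t += u) < u) a[1] += TO_HIGH_HALF(1);
    u = TO_HIGH_HALF(t);
    if ((a[0] += u) < u) a[1]++;
    a[1] += HIGH_HALF(t);
}

// Hypothetical check: does the two-digit result equal the native 64-bit product?
bool matches_reference(NN_DIGIT b, NN_DIGIT c) {
    NN_DIGIT a[2];
    NN_DigitMult(a, b, c);
    std::uint64_t expected = (std::uint64_t)b * (std::uint64_t)c;
    return a[0] == (NN_DIGIT)(expected & 0xffffffffULL) &&
           a[1] == (NN_DIGIT)(expected >> 32);
}
```

The worst case for carries is b = c = 0xffffffff, where the product is 0xfffffffe00000001, so a[1] = 0xfffffffe and a[0] = 0x00000001.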

## Hardware multiplier: dh\_hw\_mult.h

```
#include "systemc.h"
// NN_DIGIT comes from the project's definitions shown earlier

SC_MODULE (dh_hw_mult) {
  sc_in<bool> hw_mult_enable;
  sc_in<NN_DIGIT> in_data_1, in_data_2;
  sc_out<NN_DIGIT> out_data_low, out_data_high;
  sc_out<bool> hw_mult_done;
  void process_hw_mult();
  SC_CTOR (dh_hw_mult) {
    SC_THREAD (process_hw_mult);
    sensitive << hw_mult_enable;
  }
};
```

## dh\_hw\_mult.cpp

```
#include "dh_hw_mult.h"

void dh_hw_mult::process_hw_mult() {
  NN_DIGIT a[2], b, c, t, u;
  NN_HALF_DIGIT bHigh, bLow, cHigh, cLow;
  for (;;) {
    if (hw_mult_enable.read() == true) {
      // Read inputs
      b = in_data_1.read(); c = in_data_2.read();
      // Original code from NN_DigitMult()
      . . .
      // Hardware multiplication delay = 100 ns
      wait(100, SC_NS);
      // Write outputs
      out_data_low.write(a[0]); out_data_high.write(a[1]);
    }
    wait(); // wait for a change of hw_mult_enable
  }
}
```

## Software module: dh\_sw.h

```
#include "systemc.h"
// NN_DIGIT comes from the project's definitions shown earlier

SC_MODULE (dh_sw) {
  sc_in<bool> hw_mult_done;
  sc_in<NN_DIGIT> in_data_low, in_data_high;
  sc_out<NN_DIGIT> out_data_1, out_data_2;
  sc_out<bool> hw_mult_enable;
  void process_sw();
  SC_CTOR (dh_sw) {
    SC_THREAD (process_sw);
    sensitive << hw_mult_done;
  }
  void NN_DigitMult (NN_DIGIT [2], NN_DIGIT, NN_DIGIT);
};
```

## dh\_sw.cpp

```
#include "dh_sw.h"

void dh_sw::NN_DigitMult(NN_DIGIT a[2], NN_DIGIT b, NN_DIGIT c)
{
  // Send operands to hardware and assert enable
  out_data_1.write(b); out_data_2.write(c);
  hw_mult_enable.write(true);
  wait(10, SC_NS); // communication delay (10 ns)
  // Multiplication is now performed in hardware...
  wait(100, SC_NS); // hardware multiplication delay (100 ns)
  wait(10, SC_NS); // communication delay (10 ns)
  // Read results from hardware and deassert enable
  a[0] = in_data_low.read(); a[1] = in_data_high.read();
  hw_mult_enable.write(false);
  wait(10, SC_NS); // communication delay (10 ns)
}
```

## Main program: dhdemo.cpp

```
#include "systemc.h"
#include "dh_sw.h"
#include "dh_hw_mult.h"

int sc_main (int argc, char *argv[]) {
  sc_signal<bool> enable, done;
  sc_signal<NN_DIGIT> operand1, operand2, result1, result2;
  enable.write(false); done.write(false);

  dh_sw DH_SW("DH_Software");
  DH_SW.out_data_1 (operand1);            // operand1 to hardware
  DH_SW.out_data_2 (operand2);            // operand2 to hardware
  DH_SW.in_data_low (result1);            // result1 from hardware
  DH_SW.in_data_high (result2);           // result2 from hardware
  DH_SW.hw_mult_enable (enable);          // enable hardware
  DH_SW.hw_mult_done (done);              // hardware done

  dh_hw_mult DH_HW_MULT("DH_Hardware_Multiplier");
  DH_HW_MULT.in_data_1 (operand1);        // operand1 from software
  DH_HW_MULT.in_data_2 (operand2);        // operand2 from software
  DH_HW_MULT.out_data_low (result1);      // result1 to software
  DH_HW_MULT.out_data_high (result2);     // result2 to software
  DH_HW_MULT.hw_mult_enable (enable);     // enable hardware
  DH_HW_MULT.hw_mult_done (done);         // hardware done

  sc_start(); return(0);
}
```

## Correct output

```
*** Agreed Key: 09 2a f1 41 e2 93 61 d5

*** Agreed Key: 64 30 94 c5 da d2 f6 da 49 6d

67 f1 16 55 b3 ea ee a2 c0 30 2b b5 4f 05 9e a4

58 ac 97 3b b9 a0 25 b7 56 fe 82 73 bb 22 d4 31

36 60 7f 41 e9 47 97 b9 5e 27 99 3e 73 f0 28 da

b5 25 da e4 61 84
```

## Things to do I

- Replace the timed waits with an enable-done handshaking protocol in both HW (dh\_hw\_mult) and SW (dh\_sw)
- Example: handshaking in HW
  - HW should wait for enable signal to be asserted by SW
  - Once enable has been asserted, HW should perform multiplication
  - Then, HW should output the result and assert done
  - HW should deassert done, but only if enable has been deasserted by SW
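
The HW rules above, combined with the matching SW behaviour, form a four-phase handshake on the (enable, done) pair. The sketch below is a plain C++ trace checker, not SystemC code; the SW-side rules it encodes (enable stays asserted until done is seen, and is not reasserted until done is low again) are an assumption about the intended protocol, and the names `Sample` and `is_legal_handshake` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One sample of the two handshake wires:
// first = enable (driven by SW), second = done (driven by HW).
using Sample = std::pair<bool, bool>;

// Returns true if the trace follows the cycle
//   (0,0) idle -> (1,0) request -> (1,1) result ready -> (0,1) release -> (0,0) idle
// where each phase may be held for any number of samples.
bool is_legal_handshake(const std::vector<Sample>& trace) {
    auto phase_of = [](const Sample& s) {
        // 0: idle, 1: request, 2: result ready, 3: release
        return s.first ? (s.second ? 2 : 1) : (s.second ? 3 : 0);
    };
    if (trace.empty()) return true;
    int phase = phase_of(trace[0]);
    if (phase != 0) return false;                  // must start idle
    for (std::size_t i = 1; i < trace.size(); ++i) {
        int next = phase_of(trace[i]);
        // Only "stay in the same phase" or "advance by one phase" is allowed.
        if (next != phase && next != (phase + 1) % 4) return false;
        phase = next;
    }
    return true;
}
```

A trace in which done drops while enable is still asserted, or in which enable drops before done has been asserted, is rejected. A check of this kind is useful because correct program output alone does not show that the protocol was followed.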

## Things to do II

- To implement handshaking in HW, you need to:
  - Add a clock input to HW and make it a CTHREAD
  - Code a simple FSM with 4 states:
    - WAIT: wait for the enable signal to be asserted
    - EXECUTE: multiply the two inputs (use the multiplication code as is)
    - OUTPUT: write to the output ports of the module and assert the done signal
    - FINISH: check whether enable is deasserted; if so, deassert done
- To implement handshaking in SW, you need to modify NN\_DigitMult (dh\_sw.cpp)
  - Do NOT feed any clocks to SW!
  - Correct program output does not necessarily mean correct protocol implementation
- There must be NO timed waits in the final code!
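
The four-state FSM described above can be prototyped as a cycle-stepped model before it is written as a clocked SystemC process. This is a minimal sketch in plain C++, not the required SystemC solution: `HwMultModel` and `clock_edge` are hypothetical names, each call to `clock_edge` stands for one rising clock edge, and a native 64-bit multiplication stands in for the NN\_DigitMult code.

```cpp
#include <cassert>
#include <cstdint>

enum class State { WAIT, EXECUTE, OUTPUT, FINISH };

// Cycle-stepped model of the handshaking FSM (illustrative only).
struct HwMultModel {
    // Inputs, driven by the SW side
    bool enable = false;
    std::uint32_t in_data_1 = 0, in_data_2 = 0;
    // Outputs, driven by this model
    bool done = false;
    std::uint32_t out_data_low = 0, out_data_high = 0;
    // Internal state
    State state = State::WAIT;
    std::uint64_t product = 0;

    // Advance the FSM by one clock edge.
    void clock_edge() {
        switch (state) {
        case State::WAIT:       // wait for enable to be asserted
            if (enable) state = State::EXECUTE;
            break;
        case State::EXECUTE:    // multiply the two inputs
            product = (std::uint64_t)in_data_1 * (std::uint64_t)in_data_2;
            state = State::OUTPUT;
            break;
        case State::OUTPUT:     // write the outputs and assert done
            out_data_low  = (std::uint32_t)(product & 0xffffffffULL);
            out_data_high = (std::uint32_t)(product >> 32);
            done = true;
            state = State::FINISH;
            break;
        case State::FINISH:     // deassert done only after enable is deasserted
            if (!enable) {
                done = false;
                state = State::WAIT;
            }
            break;
        }
    }
};
```

In this model, done is asserted three clock edges after enable is first sampled high, and it stays asserted for as long as the SW side holds enable high.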

## Things to do III

- Design HW datapath and controller
  - Extract the multiplication code inside the EXECUTE state and convert it to a structural description using registers, multiplexers, shifters, adders, multipliers, etc.
  - Split the EXECUTE state into as many states as needed to control your datapath
    - Your controller becomes "embedded" into the handshaking FSM
    - Alternatively, you can separate the controller and the handshaking FSM into two communicating state machines
- Submit electronic and hard copies before the deadline
  - Email your design files to daler@ece.uvic.ca
  - Leave your report in ELEC 466 drop-box
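
As an illustration of splitting the EXECUTE state into several controller states, the sketch below steps the NN\_DigitMult arithmetic through one register transfer per clock edge, sharing a single 16x16-bit multiplier. It is a plain C++ behavioural model under assumed design choices (seven steps, one shared multiplier, registers named after the software variables); `Step`, `DatapathModel`, `mul16`, and `clock_edge` are hypothetical names, and other schedules are equally valid.

```cpp
#include <cassert>
#include <cstdint>

// Controller states that replace the single EXECUTE state (one possible schedule).
enum class Step { MUL_LL, MUL_LH, MUL_HL, MUL_HH, ADD_CROSS, ADD_LOW, ADD_HIGH, DONE };

struct DatapathModel {
    std::uint32_t b = 0, c = 0;                   // operand registers
    std::uint32_t a0 = 0, a1 = 0, t = 0, u = 0;   // result and temporary registers
    Step step = Step::MUL_LL;

    // The single shared 16x16 -> 32-bit multiplier.
    static std::uint32_t mul16(std::uint16_t x, std::uint16_t y) {
        return (std::uint32_t)x * (std::uint32_t)y;
    }

    // One register transfer per clock edge; the controller selects the operation.
    void clock_edge() {
        const std::uint16_t bLow  = static_cast<std::uint16_t>(b & 0xffff);
        const std::uint16_t bHigh = static_cast<std::uint16_t>(b >> 16);
        const std::uint16_t cLow  = static_cast<std::uint16_t>(c & 0xffff);
        const std::uint16_t cHigh = static_cast<std::uint16_t>(c >> 16);
        switch (step) {
        case Step::MUL_LL: a0 = mul16(bLow, cLow);   step = Step::MUL_LH; break;
        case Step::MUL_LH: t  = mul16(bLow, cHigh);  step = Step::MUL_HL; break;
        case Step::MUL_HL: u  = mul16(bHigh, cLow);  step = Step::MUL_HH; break;
        case Step::MUL_HH: a1 = mul16(bHigh, cHigh); step = Step::ADD_CROSS; break;
        case Step::ADD_CROSS:            // t = t + u, carry goes to bit 16 of a1
            t += u;
            if (t < u) a1 += (1u << 16);
            step = Step::ADD_LOW;
            break;
        case Step::ADD_LOW:              // add low half of t into upper half of a0
            u = t << 16;
            a0 += u;
            if (a0 < u) a1++;
            step = Step::ADD_HIGH;
            break;
        case Step::ADD_HIGH:             // add high half of t into a1
            a1 += t >> 16;
            step = Step::DONE;
            break;
        case Step::DONE:
            break;
        }
    }
};
```

After seven clock edges the registers a1 and a0 hold the high and low digits of the product. In a structural design, each case corresponds to a set of multiplexer selects and register enables driven by the controller.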

## Marking

- Project marking scheme
  - Correct output = 25%
  - Correct SystemC code = 50%
  - Project report = 25%
    - See ELEC 466/568 website for report guidelines
- Extra credit: 5% of the overall course mark
  - Once your hardware multiplier works, apply the same design steps to create a hardware divider (NN\_DigitDiv)
  - Email your new design files as a separate submission