

## CAPI SNAP Education Series: User Guide

# CAPI SNAP Education hls\_nvme\_memcopy : howto?

V1.0







#### **Generalities**



- 1. **NVMe** stands for **Non-Volatile Memory Express**. It is an open logical device interface specification for accessing non-volatile storage media attached via a <u>PCI Express</u> (PCIe) bus.
- 2. NVMe is supported on Nallatech N250S (with POWER8) and N250S+ (with POWER9) cards.
- 3. The OpenPOWER CAPI SNAP NVMe hardware handles data transfers through SDRAM: the DDR4 on the FPGA board is used as a buffer.
- 4. A hardware bridge moves data between the NVMe-attached SSD devices and the SDRAM.
- 5. Building on this, the proposed application (*snap\_nvme\_memcopy*) demonstrates different kinds of transfers to and from:
  - Host memory (server memory)
  - SDRAM (on-board DDR4)
  - NVMe devices
- 6. When Host memory is involved, a two-step transfer is performed:
  - step 1 from Host to SDRAM
  - step 2 from SDRAM to NVMe (the same process applies in the other direction)
- 7. A transfer between the 2 NVMe devices requires calling **snap\_nvme\_memcopy** twice:
  - first to transfer from device #1 to SDRAM,
  - second to transfer from SDRAM to device #2.
- 8. The NVMe-attached devices must be initialised before use.
- 9. For details, see <a href="https://github.com/open-power/snap/blob/master/hardware/doc/NVMe.md">https://github.com/open-power/snap/blob/master/hardware/doc/NVMe.md</a>
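
The two-call sequence for an NVMe-to-NVMe copy can be sketched as a small shell helper. This is a dry run under stated assumptions: the function name `nvme_to_nvme_cmds` and the SDRAM staging address passed as the last argument are illustrative and not part of SNAP. Only options listed in the `snap_nvme_memcopy` usage text are used, and the commands are printed rather than executed.

```sh
# Hypothetical helper (not part of SNAP): print the two snap_nvme_memcopy
# calls needed to copy SIZE bytes from one NVMe drive to the other through
# the on-card SDRAM.
# usage: nvme_to_nvme_cmds SRC_DRIVE SRC_ADDR DST_DRIVE DST_ADDR SIZE DRAM_ADDR
nvme_to_nvme_cmds() {
    src_drv=$1; src_addr=$2; dst_drv=$3; dst_addr=$4; size=$5; dram_addr=$6
    # Call 1: source drive -> SDRAM
    echo "snap_nvme_memcopy -A NVME_SSD -D CARD_DRAM -n $src_drv -a $src_addr -d $dram_addr -s $size"
    # Call 2: SDRAM -> destination drive
    echo "snap_nvme_memcopy -A CARD_DRAM -D NVME_SSD -n $dst_drv -a $dram_addr -d $dst_addr -s $size"
}

# Example: copy 4 kB from address 0x0 of drive 0 to address 0x0 of drive 1,
# staging the data at SDRAM address 0xD000 (an arbitrary choice).
nvme_to_nvme_cmds 0 0x0 1 0x0 0x1000 0xD000
```

Printing the commands first makes it easy to review the addresses before running them on a card.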







#### **Action overview**

**<u>Purpose:</u>** Transferring data between different resources :

- host memory,
- DDR,
- NVMe

#### When to use it:

- Understanding basic access to the different interfaces
- Memcopy benchmarking

#### **Memory management:**

- The application manages the Host memory and DDR addresses
- The action checks whether the transfer size is greater than the DRAM size (see constants)
- The size of the buffer (BRAM) used to copy data can be configured (see constants)

#### **Known limitations:**

- HLS requires transfers to be 64-byte aligned, with a size that is a multiple of 64 bytes
- In simulation, DDR reads return wrong values when the words read are not 64 bytes wide or have not been initialized (this is due to the simulation model only)
- If Source or Destination is NVME\_SSD, the size must be a multiple of 512 (0x200)
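
The size rules above can be checked before launching a transfer. The following is a minimal sketch, not SNAP code: the function name `check_size` and its `nvme` argument are hypothetical, and only the size is checked (address alignment is left out).

```sh
# Hypothetical pre-flight check (not part of SNAP): verify the size rules
# listed above. Accepts decimal or 0x-prefixed sizes.
# usage: check_size SIZE [nvme]   -> prints "ok" or a reason, returns 0 or 1
check_size() {
    size=$(( $1 ))
    if [ $(( size % 64 )) -ne 0 ]; then
        echo "size must be a multiple of 64 bytes"
        return 1
    fi
    if [ "$2" = "nvme" ] && [ $(( size % 512 )) -ne 0 ]; then
        echo "size must be a multiple of 512 bytes when NVME_SSD is involved"
        return 1
    fi
    echo "ok"
}

check_size 0x1000 nvme          # 4 kB: valid for NVMe transfers
check_size 0x40                 # 64 bytes: valid for Host/DDR transfers
check_size 0x40 nvme || true    # rejected: not a multiple of 512
```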




## Action usage (1/2)



```
Usage: ./snap_nvme_memcopy [-h] [-v, --verbose] [-V, --version]
  -C, --card <cardno>       can be (0...3)
  -i, --input <file.bin>    input file (HOST).
  -o, --output <file.bin>   output file (HOST).
  -A, --type-in <NVME_SSD, HOST_DRAM, CARD_DRAM>.
  -a, --addr-in <addr>      byte address in CARD_DRAM or NVME_SSD.
  -D, --type-out <NVME_SSD, HOST_DRAM, CARD_DRAM>.
  -d, --addr-out <addr>     byte address in CARD_DRAM or NVME_SSD.
  -n, --drv-id <0/1>        drive id if NVME_SSD is used (default: 0)
  -s, --size <size>         size of data (in bytes).
  -m, --mode <mode>         mode flags.
  -t, --timeout             Timeout in sec to wait for done. (10 sec default)
  -X, --verify              verify result if possible
  -N, --no_irq              Disable Interrupts

Options:
  SNAP_TRACE = 0x0   → no debug trace
  SNAP_TRACE = 0xF   → full debug trace
  SNAP_CONFIG = FPGA → hardware execution
  SNAP_CONFIG = CPU  → software execution

Example:
    export SNAP_TRACE=0x0
    snap_maint -vv -C0
    snap_nvme_init -vv -C0
    ...
    echo move 4kB from Host to DDR@0x0 and back from DDR@0x0 to Host
    rm t2; dd if=/dev/urandom of=in4k.bin bs=1K count=4
    ./snap_nvme_memcopy -A HOST_DRAM -D NVME_SSD -i in4k.bin -d 0x0
    echo out4k.bin collected from address 0x0 of SSD1 in 8 blocks of 512 (size 0x1000)
    ./snap_nvme_memcopy -A NVME_SSD -D HOST_DRAM -a 0x0 -o out4k.bin -s 0x1000

    diff in4k.bin out4k.bin
    if diff in4k.bin out4k.bin >/dev/null; then echo "RC=$rc file_diff ok"; else
      echo -e "$t RC=$rc file_diff is wrong\n$del"; exit 1;
    fi
```

## Action usage (2/2)



#### Different cases that can be run

```
WARNING: All data transfers to and from NVME SSDs are buffered in CARD DRAM:
Check #define DRAM_ADDR_TO_SSD 0x00000000 and #define DRAM_ADDR_FROM_SSD 0x80000000
in $ACTION_ROOT/hw/hw_action_nvme_memcopy.H
Usage Examples:
Before using NVME following command must be run :
${SNAP_ROOT}/software/tools/snap_maint -Cn #n is card number to attach your action !
${SNAP_ROOT}/software/tools/snap_nvme_init prior to use NVME memory driver !
 echo create a 128kB file with random data ...wait...
 dd if=/dev/urandom of=in.bin bs=1k count=128
 echo create a 512MB file with random data ...wait...
 dd if=/dev/urandom of=in.bin bs=1M count=512
 snap_nvme_memcopy -A HOST_DRAM -D HOST_DRAM -i in.bin -o out.bin ...
 snap_nvme_memcopy -A HOST_DRAM -D CARD_DRAM -i in.bin -d 0xD000 ...
 snap_nvme_memcopy -A HOST_DRAM -D NVME_SSD -i in.bin -d 0xE000 ...
 snap_nvme_memcopy -A CARD_DRAM -D HOST_DRAM -a 0xD000 -o out.bin -s 0x200 ...
 snap_nvme_memcopy -A CARD_DRAM -D NVME_SSD -a 0xD000 -d 0xE000 -s 0x200 ...
 snap_nvme_memcopy -A CARD_DRAM -D CARD_DRAM -a 0xD000 -d 0xD200 -s 0x200 ...
 snap_nvme_memcopy -A NVME_SSD -D CARD_DRAM -a 0xE000 -d 0xD000 -s 0x200 ...
 snap_nvme_memcopy -A NVME_SSD -D HOST_DRAM -a 0xE000 -o out.bin -s 0x200 ...
1) In Above examples, all addresses are byte address.
   CARD_DRAM address limit is 0x1 0000 0000 ( 4294967296 Bytes = 4GB)
   NVME_SSD address limit is 0xDF 9035 6000 (960197124096 Bytes = 960GB) for one drive.
   If Source or Destination is NVME_SSD, size must be multiples of 512 (0x200)
2) NVME to NVME is not directly supported,
   but can be done by calling snap_nvme_memcopy twice.
3) HOST to and from NVME is actually performed using 2 hardware steps with a SDRAM buffer in the middle,
```
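
The address limits quoted in note 1 can be turned into a quick range check. This is a sketch under stated assumptions: the helper `fits_in_target` is hypothetical and not part of SNAP, and it only checks that `address + size` stays within the quoted limit for the chosen target.

```sh
# Address limits quoted in the usage notes above (byte addresses).
CARD_DRAM_LIMIT=$(( 0x100000000 ))    # 4294967296 bytes
NVME_SSD_LIMIT=$(( 0xDF90356000 ))    # 960197124096 bytes, for one drive

# Hypothetical helper (not part of SNAP).
# usage: fits_in_target CARD_DRAM|NVME_SSD ADDR SIZE
# returns 0 if [ADDR, ADDR+SIZE) stays below the limit, 1 otherwise
fits_in_target() {
    case $1 in
        CARD_DRAM) limit=$CARD_DRAM_LIMIT ;;
        NVME_SSD)  limit=$NVME_SSD_LIMIT ;;
        *) echo "unknown target: $1" >&2; return 2 ;;
    esac
    end=$(( $2 + $3 ))
    [ "$end" -le "$limit" ]
}

fits_in_target CARD_DRAM 0xD000 0x200 && echo "CARD_DRAM range fits"
fits_in_target NVME_SSD 0xE000 0x200 && echo "NVME_SSD range fits"
```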

Keep in mind that running on a simulator is much slower than executing on an FPGA:

→ moving 512MB with a simulator is a HUGE challenge. Trying just 4K should be sufficient!

!! See WARNING ABOVE !!

Default buffer locations, see:

\$ACTION\_ROOT/hw/hw\_action\_nvme\_memcopy.H

## Simple transfer tests



**Purpose:** Transferring 4kB of data from a host file to NVMe and getting it back for comparison:

- File creation : dd if=/dev/urandom of=in4k.bin bs=1k count=4
- in4k.bin file copied into address 0x0 of SSD 1
- ./snap\_nvme\_memcopy -A HOST\_DRAM -D NVME\_SSD -i in4k.bin -d 0x0
- out4k.bin collected from address 0x0 of SSD1 in 8 blocks of 512 (size 0x1000)
- ./snap\_nvme\_memcopy -A NVME\_SSD -D HOST\_DRAM -a 0x0 -o out4k.bin -s 0x1000
- diff in4k.bin out4k.bin => no difference as expected
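
The `-s` value and the number of 512-byte blocks follow directly from the file size. A minimal sketch of that arithmetic, using a hypothetical helper `nvme_size_info` (not part of SNAP), applied to the three test sizes used in this guide:

```sh
# Hypothetical helper (not part of SNAP): derive the hexadecimal -s argument
# and the number of 512-byte NVMe blocks from a size in bytes.
# usage: nvme_size_info BYTES  -> prints "<hex size> <block count>"
nvme_size_info() {
    bytes=$(( $1 ))
    printf '0x%X %d\n' "$bytes" $(( bytes / 512 ))
}

nvme_size_info 4096          # 4 kB test
nvme_size_info 65536         # 64 kB test
nvme_size_info 1073741824    # 1 GB test
```

On a real file, the size could be passed as `nvme_size_info "$(wc -c < in4k.bin)"`.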

#### Check SDRAM (used as buffer) content:

- ./snap\_nvme\_memcopy -A CARD\_DRAM -D HOST\_DRAM -a 0x00000000 -o SDRAM2SSD\_4k.bin -s 0x1000
- ./snap\_nvme\_memcopy -A CARD\_DRAM -D HOST\_DRAM -a 0x80000000 -o SSD2SDRAM\_4k.bin -s 0x1000
- diff SDRAM2SSD\_4k.bin SSD2SDRAM\_4k.bin => no difference as expected
- diff SDRAM2SSD\_4k.bin in4k.bin => no difference as expected





## Simple transfer tests



**<u>Purpose:</u>** Transferring 64kB of data from a host file to NVMe and getting it back for comparison:

- File creation : **dd if=/dev/urandom of=in64k.bin bs=1k count=64**
- in64k.bin file copied into address 0x0 of SSD 1
- ./snap\_nvme\_memcopy -A HOST\_DRAM -D NVME\_SSD -i in64k.bin -d 0x0
- out64k.bin collected from address 0x0 of SSD1 in 128 blocks of 512 (size 0x10000)
- ./snap\_nvme\_memcopy -A NVME\_SSD -D HOST\_DRAM -a 0x0 -o out64k.bin -s 0x10000
- *diff in64k.bin out64k.bin* => no difference as expected

#### Check SDRAM (used as buffer) content:

- ./snap\_nvme\_memcopy -A CARD\_DRAM -D HOST\_DRAM -a 0x00000000 -o SDRAM2SSD\_64k.bin -s 0x10000
- ./snap\_nvme\_memcopy -A CARD\_DRAM -D HOST\_DRAM -a 0x80000000 -o SSD2SDRAM\_64k.bin -s 0x10000
- diff SDRAM2SSD\_64k.bin SSD2SDRAM\_64k.bin => no difference as expected
- diff SDRAM2SSD\_64k.bin in64k.bin => no difference as expected






## Simple transfer tests




**Purpose:** Transferring 1GB of data from a host file to NVMe and getting it back for comparison:

- File creation : **dd if=/dev/urandom of=in1G.bin bs=1M count=1024**
- in1G.bin file copied into address 0x0 of SSD 1
- ./snap\_nvme\_memcopy -A HOST\_DRAM -D NVME\_SSD -i in1G.bin -d 0x0
- out1G.bin collected from address 0x0 of SSD1 in 2097152 blocks of 512 (size 0x4000\_0000)
- ./snap\_nvme\_memcopy -A NVME\_SSD -D HOST\_DRAM -a 0x0 -o out1G.bin -s 0x40000000
- *diff in1G.bin out1G.bin* => no difference as expected

#### Check SDRAM (used as buffer) content:

- ./snap\_nvme\_memcopy -A CARD\_DRAM -D HOST\_DRAM -a 0x00000000 -o SDRAM2SSD\_1G.bin -s 0x40000000
- ./snap\_nvme\_memcopy -A CARD\_DRAM -D HOST\_DRAM -a 0x80000000 -o SSD2SDRAM\_1G.bin -s 0x40000000
- diff SDRAM2SSD\_1G.bin SSD2SDRAM\_1G.bin => no difference as expected
- diff SDRAM2SSD\_1G.bin in1G.bin => no difference as expected
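
The comparison steps of these tests can be grouped in one helper. This is a sketch only: `verify_copies` is a hypothetical function (not part of SNAP) built on the standard `cmp -s` command, and the demonstration runs on throw-away files so that no card is needed.

```sh
# Hypothetical helper (not part of SNAP): compare a reference file with each
# copy read back, and stop at the first mismatch.
# usage: verify_copies REFERENCE COPY...
verify_copies() {
    ref=$1; shift
    for f in "$@"; do
        if cmp -s "$ref" "$f"; then
            echo "$f: identical to $ref"
        else
            echo "$f: DIFFERS from $ref"
            return 1
        fi
    done
}

# Demonstration on temporary files.
tmp=$(mktemp -d)
printf 'abc' > "$tmp/in.bin"
cp "$tmp/in.bin" "$tmp/out.bin"
verify_copies "$tmp/in.bin" "$tmp/out.bin"
rm -rf "$tmp"
```

For the 1GB test this would be called as `verify_copies in1G.bin out1G.bin SDRAM2SSD_1G.bin SSD2SDRAM_1G.bin`.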




#### nvme\_memcopy registers





## Application Code + software action code : what's in it?





#### Hardware action Code: what's in it?





#### **Constants - Ports**



#### <u>Constants:</u> \$ACTION\_ROOT = snap/actions/hls\_nvme\_memcopy

| Constant name        | Value                         | Type      | Definition location                              | Usage                                                                              |
|----------------------|-------------------------------|-----------|--------------------------------------------------|------------------------------------------------------------------------------------|
| MEMCOPY_ACTION_TYPE  | 0x10141000                    | Fixed     | \$ACTION_ROOT/include/action_nvme_memcopy.h      | memcopy ID - list is in snap/ActionTypes.md                                        |
| RELEASE_LEVEL        | 0x00000001                    | Variable  | \$ACTION_ROOT/hw/hw_action_nvme_memcopy.<b>H</b> | release level - user defined                                                       |
| MAX_NB_OF_BYTES_READ | (256 * 1024)                  | Variable  | \$ACTION_ROOT/hw/hw_action_nvme_memcopy.<b>H</b> | Max size in Bytes of the buffer for read/write access                              |
| MAX_NB_OF_WORDS_READ | (MAX_NB_OF_BYTES_READ/BPERDW) | Operation | \$ACTION_ROOT/hw/hw_action_nvme_memcopy.<b>H</b> | Max size in 64B words of the buffer for read/write access                          |
| CARD_DRAM_SIZE       | (4 * 1024 * 1024 * 1024)      | Variable  | \$ACTION_ROOT/hw/hw_action_nvme_memcopy.<b>H</b> | Max size of the DDR - prevents moving data with a size larger than this value      |
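
The arithmetic behind these constants can be worked through in a few lines. This is a sketch under assumptions: `BPERDW = 64` is inferred from the "64B words" wording in the table, and the idea that a transfer is handled in buffer-sized chunks is an illustration, not a statement about the action's actual loop. The helper `chunks_needed` is hypothetical.

```sh
# Values from the constants table. BPERDW = 64 is an assumption based on the
# "64B words" wording.
MAX_NB_OF_BYTES_READ=$(( 256 * 1024 ))
BPERDW=64
MAX_NB_OF_WORDS_READ=$(( $MAX_NB_OF_BYTES_READ / $BPERDW ))
CARD_DRAM_SIZE=$(( 4 * 1024 * 1024 * 1024 ))

# Hypothetical helper: number of buffer-sized chunks needed to cover SIZE
# bytes, rounded up.
chunks_needed() {
    echo $(( ( $1 + $MAX_NB_OF_BYTES_READ - 1 ) / $MAX_NB_OF_BYTES_READ ))
}

echo "words per buffer : $MAX_NB_OF_WORDS_READ"
echo "chunks for 1 GB  : $(chunks_needed 0x40000000)"
echo "DDR size in bytes: $CARD_DRAM_SIZE"
```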

#### **Ports used:**

| Port name  | Description                                                   | Enabled |
|------------|---------------------------------------------------------------|---------|
|            | Host memory data bus input<br>Addr : 64bits - Data : 512bits  | Yes     |
|            | Host memory data bus output<br>Addr : 64bits - Data : 512bits | Yes     |
|            | DDR3 - DDR4 data bus in/out<br>Addr : 33bits - Data : 512bits | Yes     |
| nvme       | NVMe data bus in/out<br>Addr : 32bits - Data : 32bits         | Yes     |

## **MMIO** Registers



|        | Write@ | Read@ | Register content (bytes 3 to 0) | Typical Write value | Typical Read value |
|--------|--------|-------|---------------------------------|---------------------|--------------------|
| 0x3C42 | 0x108  | 0x188 | Private Data                    | c0febabe            |                    |
| 0x3C43 | 0x10C  | 0x18C | Private Data                    | deadbeef            |                    |

**action\_reg.Data**: action specific, user defined, needs to stay within 108 Bytes.

**memcopy\_job\_t**: this is the way for the application and the action to exchange information through this set of registers.

|        | Write@ | Read@ | Register content (bytes 3 to 0)                                                              | Typical Write value | Typical Read value |
|--------|--------|-------|----------------------------------------------------------------------------------------------|---------------------|--------------------|
| 0x3C44 | 0x110  | 0x190 | snap_addr.<b>addr_in</b> (LSB)                                                               |                     |                    |
| 0x3C45 | 0x114  | 0x194 | snap_addr.<b>addr_in</b> (MSB)                                                               |                     |                    |
| 0x3C46 | 0x118  | 0x198 | snap_addr_in.<b>size</b>                                                                     |                     |                    |
| 0x3C47 | 0x11C  | 0x19C | snap.addr_in.<b>flags</b> (SRC, DST, ...) snap.addr_in.<b>type</b> (HOST, DRAM, NVME, ...)   |                     |                    |
| 0x3C48 | 0x120  | 0x1A0 | snap_addr.<b>addr_out</b> (LSB)                                                              |                     |                    |
| 0x3C49 | 0x124  | 0x1A4 | snap_addr.<b>addr_out</b> (MSB)                                                              |                     |                    |
| 0x3C4A | 0x128  | 0x1A8 | snap.addr_out.<b>size</b>                                                                    |                     |                    |
| 0x3C4B | 0x12C  | 0x1AC | snap.addr_out.<b>flags</b> (SRC, DST, ...) snap.addr_out.<b>type</b> (HOST, DRAM, NVME, ...) |                     |                    |
|        | 0x130  | 0x1B0 |                                                                                              |                     |                    |
|        | 0x134  | 0x1B4 |                                                                                              |                     |                    |
|        | 0x138  | 0x1B8 |                                                                                              |                     |                    |
|        | 0x13C  | 0x1BC |                                                                                              |                     |                    |
|        | 0x140  | 0x1C0 |                                                                                              |                     |                    |
|        | 0x144  | 0x1C4 |                                                                                              |                     |                    |
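
The Write@ and Read@ columns follow a regular pattern that can be reproduced with simple arithmetic. The sketch below assumes, from the table only, that the job data starts at write address 0x110 and that each read address equals the write address plus 0x80. The helper `job_reg_addrs` is hypothetical.

```sh
# Offsets read from the MMIO table (assumptions derived from the table, not
# from the SNAP sources).
JOB_WRITE_BASE=$(( 0x110 ))
READ_DELTA=$(( 0x80 ))

# Hypothetical helper: print "<write address> <read address>" for a byte
# offset inside the job data.
job_reg_addrs() {
    w=$(( $JOB_WRITE_BASE + $1 ))
    printf '0x%X 0x%X\n' "$w" $(( w + $READ_DELTA ))
}

job_reg_addrs 0x00   # input address (LSB)
job_reg_addrs 0x08   # input size
job_reg_addrs 0x0C   # input flags and type
job_reg_addrs 0x10   # output address (LSB)
job_reg_addrs 0x18   # output size
```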

```
/* $ACTION_ROOT/hw/hw_action_nvme_memcopy.H */
typedef struct {
    CONTROL Control;       /*  16 bytes */
    memcopy_job_t Data;    /* 108 bytes */
    uint8_t padding[SNAP_HLS_JOBSIZE - sizeof(memcopy_job_t)];
} action_reg;

/* $SNAP_ROOT/actions/include/hls_snap.H */
typedef struct {
    snapu8_t sat;          // short action type
    snapu8_t flags;
    snapu16_t seq;
    snapu32_t Retc;
    snapu64_t Reserved;    // Priv data
} CONTROL;

/* $ACTION_ROOT/include/action_memcopy.h */
typedef struct memcopy_job {
    struct snap_addr in;   /* input data */
    struct snap_addr out;  /* output data */
} memcopy_job_t;

/* $SNAP_ROOT/software/include/snap_types.h */
typedef struct snap_addr {
    uint64_t addr;
    uint32_t size;
    snap_addrtype_t type;  /* DRAM, NVME, ... */
    snap_addrflag_t flags; /* SRC, DST, EXT, ... */
} snap_addr_t;
```
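
The register dumps printed by `snap_nvme_memcopy` (see the measurement logs in this guide) show these fields byte by byte, least significant byte first. As a sketch, the hypothetical helper below (not part of SNAP) reassembles such a byte sequence into the value it encodes. The sample bytes are the first eight bytes of the Host to NVMe dump, which correspond to the 64-bit `addr` field of the input.

```sh
# Hypothetical helper: turn little-endian bytes copied from a register dump
# into the hexadecimal value they encode.
# usage: le_bytes_to_hex BYTE...
le_bytes_to_hex() {
    hex=""
    for b in "$@"; do
        hex="$b$hex"    # prepend, so the last byte ends up most significant
    done
    echo "0x$hex"
}

# First eight bytes of the dump: the input address.
le_bytes_to_hex 00 00 b7 73 ff 3f 00 00
```

The result can be compared with the address printed in the PARAMETERS section of the same log.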



#### Measurements on N250S card

| hls_nvme_memcopy / N250S board (256KBytes buffer - 64 accesses/burst)    | Read from Host | Write to Host | Read from DDR4 | Write to DDR4 |
|--------------------------------------------------------------------------|----------------|---------------|----------------|---------------|
| Bytes transferred                                                        | BW (MBps)      | BW (MBps)     | BW (MBps)      | BW (MBps)     |
| 1GB memory area transfer (1-direction access, data going from or to SSD) | 498            | 705           | 624            | 973           |

Latency to access DDR4 memory:

- Read: from HLS\_action request to data in HLS: 184ns
- Write: from HLS\_action request to data in DDR: 105ns





To reproduce these measurements, run the following:

```
snap_find_card -v -AN250S
A N250S card has been detected in card position 0
PSL Revision is : 0x3007
Device ID : 0x0632
Sub device is : 0x060a
Image loaded is self defined as : user
Next image to be loaded at next reset (load image on perst) is : user

snap_maint -vv
[main] Enter
[snap_version] Enter
SNAP on N250S Card, NVME enabled, 4096 MB DRAM available.
SNAP FPGA Release: v1.3.5 Distance: 43 GIT: 0xe7036da5
SNAP FPGA Build (Y/M/D): 2018/03/21 Time (H:M): 17:04
SNAP FPGA CIR Master: 1 My ID: 0
SNAP FPGA Up Time: 226 sec
[snap_version] Exit
[snap_m_init] Enter
SNAP FPGA Exploration already done (MSAT: 1 MAID: 1)
Short | Action Type | Level
[snap_m_init] Exit rc: 0
[main] Exit rc: 0

snap_nvme_memcopy -A HOST_DRAM -D NVME_SSD -i in1G.bin -d 0x0
reading input data 1073741824 bytes from in1G.bin
PARAMETERS:
 input: in1G.bin
 output: unknown
 type_in: 0 HOST_DRAM
 addr_in: 00003fff73b70000
 type_out: 2 NVME_SSD
 addr_out: 0000000000000000
 drive_id: 0
 size_in/out: 40000000
 mode: 00000000
 prepare nvme_memcopy job of 40 bytes size
 This is the register information exchanged between host and fpga
 00000000: 00 00 b7 73 ff 3f 00 00 00 00 40 00 00 12 00 | ...s.......
 00000010: 00 00 00 00 00 00 00 00 00 00 40 02 00 23 00 | .........
 00000020: 00 00 00 00 00 00 00
     get starting time
Action is running .... got end of exec. time
SUCCESS
memcopy of 1073741824 bytes took 2157638 usec @ 497.647 MiB/sec
This represents the register transfer time + memcopy action time
```



```
snap_nvme_memcopy -A NVME_SSD -D HOST_DRAM -a 0xE000 -o out1G.bin -s 0x40000000
PARAMETERS:
 input: unknown
 output: out1G.bin
 type_in: 2 NVME_SSD
 type_out: 0 HOST_DRAM
 addr_out: 00003fff58120000
 drive_id: 0
 size_in/out: 40000000
 mode: 00000000
 prepare nvme_memcopy job of 40 bytes size
 This is the register information exchanged between host and fpga
 00000000: 00 e0 00 00 00 00 00 00 00 00 00 12 00 | .........
 00000010: 00 00 12 58 ff 3f 00 00 00 00 40 00 00 23 00 | ...X........
 00000020: 00 00 00 00 00 00 00
     get starting time
Action is running .... got end of exec. time
writing output data 0x3fff58120000 1073741824 bytes to out1G.bin
SUCCESS
memcopy of 1073741824 bytes took 1522240 usec @ 705.370 MiB/sec
This represents the register transfer time + memcopy action time
```



```
snap_nvme_memcopy -A CARD_DRAM -D NVME_SSD -a 0x000 -d 0x000 -s 0x40000000
PARAMETERS:
 input: unknown
 output: unknown
 type_in: 1 CARD_DRAM
 type_out: 2 NVME_SSD
 drive_id: 0
 size_in/out: 40000000
 mode: 0000000
 prepare nvme_memcopy job of 40 bytes size
 This is the register information exchanged between host and fpga
 00000010: 00 00 00 00 00 00 00 00 00 00 40 02 00 23 00 | ..........
 00000020: 00 00 00 00 00 00 00
     get starting time
Action is running .... got end of exec. time
SUCCESS
memcopy of 1073741824 bytes took 1721294 usec @ 623.799 MiB/sec
This represents the register transfer time + memcopy action time
```



```
snap_nvme_memcopy -A NVME_SSD -D CARD_DRAM -a 0x0 -d 0x0 -s 0x40000000
PARAMETERS:
 input: unknown
 output: unknown
 type_in: 2 NVME_SSD
 type_out: 1 CARD_DRAM
 drive_id: 0
 size_in/out: 40000000
 mode: 0000000
 prepare nvme_memcopy job of 40 bytes size
 This is the register information exchanged between host and fpga
 00000010: 00 00 00 00 00 00 00 00 00 00 40 01 00 23 00 | ..........
 00000020: 00 00 00 00 00 00 00
     get starting time
Action is running .... got end of exec. time
SUCCESS
memcopy of 1073741824 bytes took 1104054 usec @ 972.545 MiB/sec
This represents the register transfer time + memcopy action time
```
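
The bandwidth figures in the four logs above can be recomputed from the byte count and the elapsed time. The sketch below uses a hypothetical helper `bandwidth` (not part of SNAP) that divides bytes by microseconds with `awk`. This ratio, which is a rate in units of 10^6 bytes per second, reproduces the values printed in the logs.

```sh
# Hypothetical helper: bytes per microsecond, i.e. 10^6 bytes per second,
# printed with three decimals.
# usage: bandwidth BYTES USEC
bandwidth() {
    LC_ALL=C awk -v b="$1" -v u="$2" 'BEGIN { printf "%.3f\n", b / u }'
}

bandwidth 1073741824 2157638   # Host -> NVMe
bandwidth 1073741824 1522240   # NVMe -> Host
bandwidth 1073741824 1721294   # DDR4 -> NVMe
bandwidth 1073741824 1104054   # NVMe -> DDR4
```

As the logs state, the elapsed time includes the register transfer time as well as the memcopy action time.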

## Path of improvements



1. The HLS memcpy function waits for the end of each request before starting a new one. Since the read and write ports are independent and the DMA is able to pipeline requests, running reads in parallel with writes would increase performance.

## History of this document and of the action release level



V1.0: initial document