/** @page mselSoC The msel ORP System-on-a-Chip
@section introduction Introduction
The mselSoC is a modification of the open source OpenRISC 1200 platform. It targets the
Xilinx Artix-7 100T FPGA based Geophyte board and includes support for three different
cryptographic cores and for the msel ORP faux filesystem interface. It also contains an
open-source FTL for writing to NAND flash. The main processor is clocked at 50 MHz, but each of the
cryptographic cores is clocked at 100 MHz.
@section buslayout Bus Layout
The system ties together peripheral cores using a Wishbone B.3 compliant bus. The interconnects are
auto-generated by `wb_intercon_gen`. To regenerate them, run the following command from the
`hardware/mselSoC/src/systems/geophyte` directory:

    wb_intercon_gen ./data/wb_intercon.conf ./rtl/verilog/wb_intercon.v

The `wb_intercon.conf` for the ORP defines three masters. The first is `or1k_i`, the
OpenRISC 1200 instruction bus, through which the CPU fetches instructions. In the production
system this bus reaches only read-only memory, so the CPU cannot execute arbitrarily injected
instructions. The instruction bus has the following memory map:
Address    | Size (bytes) | Description
-----------|--------------|-------------
0x100000 | 65536 | ROM
0xF0000100 | 128 | Debug Boot ROM
The second master defined in `wb_intercon.conf` is `or1k_d`, or the OpenRISC 1200 data bus. The data
bus has the following memory map:
Address    | Size (bytes) | Description
-----------|--------------|-------------
0x100000 | 65536 | ROM
0x200000 | 131072 | RAM
0x90000000 | 32 | UART
0x91000000 | 2 | GPIO
0x92000000 | 1 | TRNG
0x93000000 | 128 | AES
0x94000000 | 256 | SHA-256
0x98000000 | 8192 | FauxFS
0xA0000000 | 8192 | NAND Controller
0xF0000100 | 128 | Debug Boot ROM
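
The data bus map above can be captured as a small lookup table, which is handy when checking
whether an address used by software lands in a mapped region. This is only a sketch: the names
`bus_region`, `data_bus_map`, and `data_bus_region` are hypothetical and do not come from the
mselSoC sources; only the addresses and sizes are taken from the table.

```c
#include <stddef.h>
#include <stdint.h>

// One entry of the data bus memory map (base address and size in bytes).
struct bus_region {
    uint32_t base;
    uint32_t size;
    const char *name;
};

// Values copied from the data bus table above.
static const struct bus_region data_bus_map[] = {
    {0x00100000u,  65536u, "ROM"},
    {0x00200000u, 131072u, "RAM"},
    {0x90000000u,     32u, "UART"},
    {0x91000000u,      2u, "GPIO"},
    {0x92000000u,      1u, "TRNG"},
    {0x93000000u,    128u, "AES"},
    {0x94000000u,    256u, "SHA-256"},
    {0x98000000u,   8192u, "FauxFS"},
    {0xA0000000u,   8192u, "NAND Controller"},
    {0xF0000100u,    128u, "Debug Boot ROM"},
};

// Returns the name of the region containing addr, or NULL if addr is unmapped.
static const char *data_bus_region(uint32_t addr)
{
    for (size_t i = 0; i < sizeof data_bus_map / sizeof data_bus_map[0]; i++) {
        const struct bus_region *r = &data_bus_map[i];
        if (addr - r->base < r->size)   // unsigned wrap-around makes this a range check
            return r->name;
    }
    return NULL;
}
```

For example, `data_bus_region(0x93000010)` yields `"AES"`, while an address one byte past the end
of RAM yields `NULL`.
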
The last master defined in `wb_intercon.conf` is the debug master. In a production silicon version of
an ORP platform this master should be removed; alternatively, the `adv_debug_sys` instance in
`orpsoc_top.v` can simply be left unconnected. Currently both debug and production builds of the FPGA
bitstream provide JTAG access to the debug bus through `adv_debug_sys`.
Developer Note: The sizes of ROM and RAM can be adjusted; consult the synthesis logs to
determine the available resources. Resizing requires changes to `wb_intercon.conf` as well as
to the `rom0` and `ram0` instances in `orpsoc_top.v`.
Developer Note: If the size of ROM is to exceed 1MB, RAM must be moved upwards. This requires
modifications to the mselOS linker script so that the compiler is aware of the change.
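
The 1MB limit follows directly from the memory map: ROM starts at 0x100000 and RAM at 0x200000,
leaving 0x100000 bytes between the two bases. A minimal check, using hypothetical names
(`ROM_BASE`, `RAM_BASE`, `rom_fits_below_ram`) that are not part of the mselSoC sources:

```c
#include <stdbool.h>
#include <stdint.h>

#define ROM_BASE 0x00100000u   // from the memory map
#define RAM_BASE 0x00200000u   // from the memory map

// True if a ROM of rom_size bytes ends at or below the current RAM base.
static bool rom_fits_below_ram(uint32_t rom_size)
{
    return rom_size <= RAM_BASE - ROM_BASE;
}
```

The default 65536-byte ROM fits, as does a ROM of exactly 1MB; anything larger overlaps RAM unless
RAM is relocated.
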
@section dbgprod Debug vs. Production Builds
The system can be built in production or debug mode. A debug build is selected by uncommenting

    `define ORP_DEBUG 1

near the top of `orpsoc_top.v`; production builds leave this preprocessor identifier undefined. The
only difference is that the debug build does not have a prepopulated ROM and allows both reads and
writes to the ROM region, so software builds can be debugged without re-synthesizing the FPGA
bitstream. For a production build the contents of the ROM region are read from `rom.dat`, which can
be generated using the included `romgen` command.
@section srclayout Source Layout
The SoC builds through FuseSoC. FuseSoC assembles the different cores from the system definition,
which in mselSoC is `geophyte.core`, and the process technology definition, which in mselSoC is
`geophyte.system`. The cryptographic cores, SDHC core, FauxFS, and NAND controller are found within
appropriately named subdirectories of the geophyte `rtl/verilog` directory. The source is laid
out so that, once FuseSoC supports multiple core repositories, these cores can be migrated to an
ORP-specific cores directory.
@section crypto Cryptographic Cores
@subsection crypto_trng True Random Number Generator
The TRNG follows the design laid out in "High Speed True Random Number Generators in
Xilinx FPGAs" by Catalin Baetoniu at Xilinx. The design has been altered so that the primary
XOR-based oscillator has an enable gate, which avoids errors when synthesizing with XST.
Developer Note: Because of the enable logic around the oscillator, the first value read following a
reset should be thrown away.
To read values from the TRNG, simply read one byte from location @ref TRNG_ADDR.
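
A minimal sketch of reading the TRNG, including the discarded first read after reset. The function
name `trng_fill` and its `after_reset` flag are hypothetical; on the target the pointer would be
something like `(volatile const uint8_t *)TRNG_ADDR`, which is an assumption about how the address
constant is used rather than code taken from mselOS.

```c
#include <stddef.h>
#include <stdint.h>

// Fills buf with len bytes read from the TRNG register at trng.
// If after_reset is nonzero, one byte is read and discarded first.
static void trng_fill(volatile const uint8_t *trng, int after_reset,
                      uint8_t *buf, size_t len)
{
    if (after_reset)
        (void)*trng;            // throw away the first value after a reset
    for (size_t i = 0; i < len; i++)
        buf[i] = *trng;         // each read returns one byte
}
```

Taking the register address as a parameter keeps the routine testable against an ordinary variable
on a development host.
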
@subsection crypto_aes AES
The AES core can encrypt and decrypt blocks of data with three key sizes: AES-128, AES-192,
and AES-256. The design is based upon a purely gate logic implementation of the forward and reverse
S-boxes due to the work of Boyar and Peralta. This avoids the differential power attacks present in
purely lookup table or SRAM based S-box implementations. The AES core is located at @ref AES_ADDR,
and has the following register layout:
Offset | Size (bytes) | Description
-------|--------------|-------------
0x00 | 32 | AES key
0x20 | 16 | Data input
0x30 | 16 | Data output
0x40 | 4 | Control and Status
The control register layout is as follows:
Position | R/W | Description
---------|-----|------------
0 | W | Starts operation
1 | W | 0 for encrypt, 1 for decrypt
2-3 | W | Key size: 0, 1, and 2 for 128, 192, and 256 bits respectively
4-7 | - | reserved
8 | W | Clears the key and data
9-15 | - | reserved
16 | R | 0 for idle, 1 for busy
17-31 | - | reserved
Applications writing to the core should follow this protocol:
1. Write the encryption key
2. Write the input data
3. Write 1 to position 0 of the control register, with positions 1-3 set to select the direction and key size
4. Poll position 16 of the control register until idle
5. Read output
6. If there is more data to process with the same key, go to step 2
7. Write 1 to position 8 of the control register
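
The protocol above can be sketched for a single block as follows. This is an illustration under
several assumptions that the text does not confirm: registers are accessed as 32-bit words, a key
shorter than 32 bytes is written starting at offset 0x00, and the start, direction, and key size
fields are set in one write. All names (`aes_ctrl_word`, `aes_block`, the `AES_*` macros) are
hypothetical.

```c
#include <stdint.h>

// Register offsets in 32-bit words, from the register layout table.
#define AES_KEY   (0x00 / 4)
#define AES_IN    (0x20 / 4)
#define AES_OUT   (0x30 / 4)
#define AES_CTRL  (0x40 / 4)

// Control register bits, from the control register table.
#define AES_START    (1u << 0)
#define AES_DECRYPT  (1u << 1)
#define AES_CLEAR    (1u << 8)
#define AES_BUSY     (1u << 16)

// Builds the control word for step 3. key_bits must be 128, 192, or 256.
static uint32_t aes_ctrl_word(int decrypt, int key_bits)
{
    uint32_t size_sel = (uint32_t)(key_bits - 128) / 64;   // 0, 1, or 2
    return AES_START | (decrypt ? AES_DECRYPT : 0u) | (size_sel << 2);
}

// Processes one 16-byte block. key holds key_bits / 32 words.
static void aes_block(volatile uint32_t *aes, const uint32_t *key, int key_bits,
                      int decrypt, const uint32_t in[4], uint32_t out[4])
{
    for (int i = 0; i < key_bits / 32; i++)             // step 1: write the key
        aes[AES_KEY + i] = key[i];
    for (int i = 0; i < 4; i++)                         // step 2: write the input
        aes[AES_IN + i] = in[i];
    aes[AES_CTRL] = aes_ctrl_word(decrypt, key_bits);   // step 3: start
    while (aes[AES_CTRL] & AES_BUSY)                    // step 4: poll until idle
        ;
    for (int i = 0; i < 4; i++)                         // step 5: read the output
        out[i] = aes[AES_OUT + i];
    aes[AES_CTRL] = AES_CLEAR;                          // step 7: clear key and data
}
```

Because this sketch clears the key after every block, code that processes many blocks under one
key would instead loop over steps 2 to 5 and clear once at the end, as step 6 describes.
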
@subsection crypto_sha256 SHA-256
The SHA-256 core implements the transform function for the SHA-256 hashing algorithm and utilizes a
pipelined pre-calculation of half of the round function in the previous cycle, specifically
\f$H + K + W\f$. This reduces the size and area of the adder needed to perform the round function. The
design further minimizes size by iterating through the same round calculation unit for the 64
necessary SHA-256 rounds. The SHA-256 core is located at @ref SHA_ADDR, and has the following
register layout:
Offset | Size (bytes) | Description
-------|--------------|------------
0x00 | 32 | IV/hash values
0x20 | 64 | Data input
0x60 | 4 | Control and Status
The control register layout is as follows:
Position | R/W | Description
---------|-----|------------
0 | W | Starts operation
1-7 | - | reserved
8 | W | Clears the IV/hash values and data
9-15 | - | reserved
16 | R | 0 for idle, 1 for busy
17-31 | - | reserved
Applications writing to the core should follow this protocol:
1. Write the context IV
2. Write the input data
3. Write 1 to position 0 of the control register
4. Poll position 16 of the control register until idle
5. If there is more data to process, go to step 2
6. Read the final hash values
7. Write 1 to position 8 of the control register
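
A sketch of the steps above, assuming 32-bit word accesses and that the hash state is read back
from the same offset the IV was written to. The names (`sha256_blocks`, the `SHA_*` macros) are
hypothetical. Since the core implements only the transform function, message padding and length
encoding are assumed to be done by the caller before the blocks are passed in.

```c
#include <stddef.h>
#include <stdint.h>

// Register offsets in 32-bit words, from the register layout table.
#define SHA_STATE  (0x00 / 4)
#define SHA_DATA   (0x20 / 4)
#define SHA_CTRL   (0x60 / 4)

// Control register bits, from the control register table.
#define SHA_START  (1u << 0)
#define SHA_CLEAR  (1u << 8)
#define SHA_BUSY   (1u << 16)

// Runs the transform over nblocks consecutive 64-byte blocks (16 words each),
// replacing state with the resulting hash values.
static void sha256_blocks(volatile uint32_t *sha, uint32_t state[8],
                          const uint32_t *blocks, size_t nblocks)
{
    for (int i = 0; i < 8; i++)                    // step 1: write the context IV
        sha[SHA_STATE + i] = state[i];
    for (size_t b = 0; b < nblocks; b++) {         // step 5: repeat per block
        for (int i = 0; i < 16; i++)               // step 2: write the input data
            sha[SHA_DATA + i] = blocks[b * 16 + i];
        sha[SHA_CTRL] = SHA_START;                 // step 3: start
        while (sha[SHA_CTRL] & SHA_BUSY)           // step 4: poll until idle
            ;
    }
    for (int i = 0; i < 8; i++)                    // step 6: read the final values
        state[i] = sha[SHA_STATE + i];
    sha[SHA_CTRL] = SHA_CLEAR;                     // step 7: clear
}
```
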
@section nandc NAND CPU Interface
The NAND interface provides the CPU access to an exclusive area of the NAND flash for the purposes
of storing the Device Master Key and the Monotonic Counter. The interface core is located at @ref
FLASH_CTRL_ADDR, and has the following register layout:
Offset | Size (bytes) | Description
-------|--------------|------------
0x0000 | 4096 | Page data buffer
0x1000 | 16 | Reserved
0x1010 | 8 | Spare space write buffer
0x1018 | 8 | Spare space read buffer
0x1020 | 8 | Erase bank
0x1028 | 8 | Status register
0x1030 | 8 | Write page
0x1038 | 8 | Read page
The status register layout is as follows:
Position | R/W | Description
---------|-----|------------
0 | R | 0 for no error, 1 for error
1 | R | 0 for ready, 1 for busy
2-63 | - | reserved
Applications erasing a bank of pages should follow this protocol:
1. Check that bit 1 of the status register is zero
2. Write the index of the bank to be erased to the erase bank register
3. Poll bits 0 and 1 of the status register to determine whether the operation has completed
successfully. The controller returns bit 1 to 0 when the operation completes.
Applications reading a page should follow this protocol:
1. Check that bit 1 of the status register is zero
2. Write the index of the page to be read to the read page register
3. Poll bits 0 and 1 of the status register to determine whether the operation has completed
successfully. The controller returns bit 1 to 0 when the operation completes.
4. The contents of the page are now in the page data buffer.
Applications writing a page should follow this protocol:
1. Check that bit 1 of the status register is zero
2. Write the desired contents to the page data buffer.
3. Write the index of the page to be written to the write page register
4. Poll bits 0 and 1 of the status register to determine whether the operation has completed
successfully. The controller returns bit 1 to 0 when the operation completes.
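
The three protocols share the same wait-then-command-then-wait shape, sketched below. This is not
the code in flash.c; the names are hypothetical, and two assumptions are made that the text does
not settle: each 8-byte register is driven through the single 32-bit word at its listed offset,
and step 1 is implemented by waiting for the busy bit to clear rather than failing immediately.

```c
#include <stdint.h>

// Register offsets in 32-bit words, from the register layout table.
#define NAND_PAGE_BUF    (0x0000 / 4)
#define NAND_ERASE       (0x1020 / 4)
#define NAND_STATUS      (0x1028 / 4)
#define NAND_WRITE       (0x1030 / 4)
#define NAND_READ        (0x1038 / 4)
#define NAND_PAGE_WORDS  (4096 / 4)

// Status register bits, from the status register table.
#define NAND_ERROR  (1u << 0)
#define NAND_BUSY   (1u << 1)

// Spins until the busy bit clears. Returns 0 on success, -1 if the error bit is set.
static int nand_wait(volatile uint32_t *nand)
{
    uint32_t status;
    do {
        status = nand[NAND_STATUS];
    } while (status & NAND_BUSY);
    return (status & NAND_ERROR) ? -1 : 0;
}

static int nand_erase_bank(volatile uint32_t *nand, uint32_t bank)
{
    nand_wait(nand);                  // controller must be idle first
    nand[NAND_ERASE] = bank;          // writing the index starts the erase
    return nand_wait(nand);
}

static int nand_read_page(volatile uint32_t *nand, uint32_t page, uint32_t *dst)
{
    nand_wait(nand);
    nand[NAND_READ] = page;           // writing the index starts the read
    if (nand_wait(nand) != 0)
        return -1;
    for (int i = 0; i < NAND_PAGE_WORDS; i++)
        dst[i] = nand[NAND_PAGE_BUF + i];
    return 0;
}

static int nand_write_page(volatile uint32_t *nand, uint32_t page, const uint32_t *src)
{
    nand_wait(nand);
    for (int i = 0; i < NAND_PAGE_WORDS; i++)
        nand[NAND_PAGE_BUF + i] = src[i];
    nand[NAND_WRITE] = page;          // writing the index starts the program operation
    return nand_wait(nand);
}
```

Production code would normally bound the polling loop with a timeout so that a stuck controller
cannot hang the CPU.
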
For additional reference see @ref flash.c and @ref flash.h. Additional low level behavioral
information can be found in `nandc_ecc_inline_cpu.README` bundled with the SDHC core.
@section ffs Faux Filesystem Interface
The mselSoC presents itself as a 2GB device on the microSD communications bus. On this device there
exist two regions, a ROM region and a NAND region with sizes of 2MB and 64MB respectively. The 2MB
region contains 1MB for the partition table and 1MB for a FAT filesystem. However, it is not a real
filesystem; files and directories cannot be created or destroyed. Instead, there are two "files" in
the root directory of the filesystem, named RFILE and WFILE. The Android host can communicate with
the mselSoC via these two files: data is written to mselSoC via WFILE, and responses are read from
RFILE.
Both files are 2048 bytes in size, which is the maximum amount of data that can be transmitted to
the peripheral device at a time. Applications are expected to establish a communication protocol to
stream larger amounts of data; the @ref tidl serialization language is provided to aid in this
process.
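
As an illustration of the 2048-byte limit, the sketch below slices a larger message into
file-sized transfers. It shows only the arithmetic; it is not the @ref tidl serialization, and a
real protocol would also need framing such as a length or sequence field so the receiver can tell
payload from padding. The names `ffs_chunk_count` and `ffs_fill_chunk` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FFS_FILE_SIZE 2048u   // size of WFILE and RFILE, from the text above

// Number of 2048-byte transfers needed for len bytes of payload.
static size_t ffs_chunk_count(size_t len)
{
    return (len + FFS_FILE_SIZE - 1) / FFS_FILE_SIZE;
}

// Copies chunk number index of msg into out, zero-padding the unused tail.
// Returns the number of payload bytes placed in out.
static size_t ffs_fill_chunk(const uint8_t *msg, size_t len, size_t index,
                             uint8_t out[FFS_FILE_SIZE])
{
    size_t off = index * FFS_FILE_SIZE;
    size_t n = 0;
    if (off < len) {
        n = len - off;
        if (n > FFS_FILE_SIZE)
            n = FFS_FILE_SIZE;
        memcpy(out, msg + off, n);
    }
    memset(out + n, 0, FFS_FILE_SIZE - n);
    return n;
}
```
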
In addition, reads from WFILE and writes to RFILE query and set status registers on mselSoC for the
communication. Status values indicate the success or failure of the last read or write; a list of
valid status values can be found at @ref ffs_status "Faux Filesystem Return Values".
The faux filesystem's base address on the peripheral device is at @ref FFS_ADDR, and the register
layout is as follows:
Offset | Size (bytes) | Description
-------|--------------|------------
0x0000 | 2048 | WFILE data buffer
0x0800 | 2048 | RFILE data buffer
0x1000 | 16 | WFILE acknowledgment buffer
0x1010 | 16 | RFILE acknowledgment buffer
0x1030 | 4 | Control and Status
When a write to WFILE is performed, interrupt 0x10 is triggered, and when the host device sends a
read acknowledgment, interrupt 0x11 is triggered. The control register is used to acknowledge that
interrupts have been handled. The first position acknowledges that WFILE interrupts have been handled,
and the second position acknowledges that RFILE status interrupts have been handled.
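
A sketch of what the two interrupt handlers might do on the peripheral side. It assumes that the
"first" and "second" positions are bits 0 and 1 of the control register, that acknowledging means
writing a 1 to that bit, and that registers are accessed as 32-bit words. None of this is confirmed
by the text, and the function and macro names are hypothetical.

```c
#include <stdint.h>

// Register offsets in 32-bit words, from the register layout table.
#define FFS_WFILE_BUF   (0x0000 / 4)
#define FFS_CTRL        (0x1030 / 4)
#define FFS_FILE_WORDS  (2048 / 4)

#define FFS_ACK_WFILE  (1u << 0)   // assumed: "first position" is bit 0
#define FFS_ACK_RFILE  (1u << 1)   // assumed: "second position" is bit 1

// Body of a handler for interrupt 0x10: copy out the host's data, then acknowledge.
static void ffs_on_wfile_write(volatile uint32_t *ffs, uint32_t dst[FFS_FILE_WORDS])
{
    for (int i = 0; i < FFS_FILE_WORDS; i++)
        dst[i] = ffs[FFS_WFILE_BUF + i];
    ffs[FFS_CTRL] = FFS_ACK_WFILE;
}

// Body of a handler for interrupt 0x11: the host has acknowledged a read.
static void ffs_on_rfile_ack(volatile uint32_t *ffs)
{
    ffs[FFS_CTRL] = FFS_ACK_RFILE;
}
```

Copying the buffer before acknowledging avoids racing with a subsequent host write, under the
assumption that the host waits for the acknowledgment before writing again.
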
RFILE and WFILE act as normal files in a FAT filesystem, and so can be read and written to by nearly
any device that can accept microSD cards. However, because they are not real files, and the
contents of the files can change without the host operating system being aware of the change,
filesystem caching on the host can interfere with the communication between the host and mselSoC.
To prevent this interference, caching on the open files must be disabled. On a Linux- or
Android-based system, the files can be opened with the `O_DIRECT` flag as follows:

    int fd = open(path, O_RDWR | O_SYNC | O_DIRECT);

Alternatively, if root permissions are available, the filesystem cache can be flushed after every
write to WFILE or RFILE by writing `1` to `/proc/sys/vm/drop_caches`:

    system("echo 1 > /proc/sys/vm/drop_caches");

Either of these methods will prevent caching issues from interfering with the communication between
the host device and the mselSoC peripheral.
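
One practical detail when using `O_DIRECT` on Linux: the buffer address, transfer length, and file
offset generally have to be aligned to the logical block size of the device, otherwise the read or
write fails with `EINVAL`. The host-side sketch below allocates an aligned 2048-byte buffer and
writes it in one call. The helper names and the 4096-byte alignment are assumptions chosen to be
conservative, not values taken from the mselSoC host software.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        // exposes O_DIRECT in glibc; must precede all includes
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0         // fallback so the sketch compiles; caching is then NOT bypassed
#endif

#define FFS_FILE_SIZE 2048
#define FFS_ALIGN     4096   // assumed to satisfy the device's alignment requirement

// Allocates one zeroed, aligned 2048-byte transfer buffer; release it with free().
static void *ffs_alloc_block(void)
{
    void *p = NULL;
    if (posix_memalign(&p, FFS_ALIGN, FFS_FILE_SIZE) != 0)
        return NULL;
    memset(p, 0, FFS_FILE_SIZE);
    return p;
}

// Writes one full block to the file at path (for example WFILE), bypassing the
// page cache. Returns 0 on success, -1 on any error.
static int ffs_write_block(const char *path, const void *block)
{
    int fd = open(path, O_WRONLY | O_SYNC | O_DIRECT);
    if (fd < 0)
        return -1;
    ssize_t n = pwrite(fd, block, FFS_FILE_SIZE, 0);
    close(fd);
    return n == FFS_FILE_SIZE ? 0 : -1;
}
```

A caller would fill the buffer returned by `ffs_alloc_block()` with its request and pass it to
`ffs_write_block()` together with the path of WFILE on the mounted card.
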
In benchmark tests, a maximum round-trip communication data rate of about 75 KB/sec can be achieved
using the mselSoC faux filesystem. In contrast, writing to a real SDHC card in 2048-byte chunks can
achieve a maximum data rate of about 160 KB/sec.
*/