This repository is a central location for functional FPGA source code, organized into a concise library by type of function. It is my primary FPGA development repository from now on, and all other repos will soon be merged into it.

SRIO AXIS Packet Generator

This module generates deterministic pseudo-random data payloads for the Xilinx SRIO Endpoint. It uses the HELLO header format specified in the Xilinx product guide (PG007):

http://www.xilinx.com/support/documentation/ip_documentation/srio_gen2/v2_0/pg007_srio_gen2.pdf

This module utilizes SystemVerilog constructs and was simulated with QuestaSim 10.2.

Merge Sort

This design block sorts up to 16 binary numbers. It outputs the original numbers sorted from lowest (position 0) to highest (position n-1).

The latency, in clock cycles, depends on the number of values being sorted: take the next highest power of two, 2^n, and subtract 1. For example, sorting 3 values takes 2^2 - 1 = 4 - 1 = 3 clock cycles, and sorting 5 values takes 2^3 - 1 = 8 - 1 = 7 clock cycles.
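
The latency rule can be checked with a small software model. This is only a sketch: the function name `merge_sort_latency` is hypothetical, and it assumes that a count which is already a power of two (for example 4 or 16) is not rounded up any further, which the text above does not state explicitly.

```python
def merge_sort_latency(num_values: int) -> int:
    """Clock-cycle latency: power of two at or above num_values, minus 1.

    Assumption: an exact power of two is used as-is (4 values -> 3 cycles).
    """
    if num_values < 1:
        raise ValueError("num_values must be at least 1")
    # Smallest power of two that is >= num_values
    next_pow2 = 1 << (num_values - 1).bit_length()
    return next_pow2 - 1


# The two cases worked through in the text
print(merge_sort_latency(3))  # 3
print(merge_sort_latency(5))  # 7
```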

The block also outputs the original input position of each sorted value. For example, given the following inputs (input position - value):

0 - 16
1 - 27
2 - 7
3 - 19

The output bus "sorted_positions" would be:

2 - in sorted_positions[0]
0 - in sorted_positions[1]
3 - in sorted_positions[2]
1 - in sorted_positions[3]

This bus is useful for indexing into memories that hold the original numbers, or for building a priority queue.
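
A software reference model of the position bus, useful as a testbench check. This is a sketch of the behavior described above, not the RTL itself; the function name `sort_with_positions` and the `memory` contents are hypothetical, and the handling of equal values (input order preserved) is an assumption.

```python
def sort_with_positions(values):
    """Return (sorted_values, sorted_positions), lowest value first."""
    # sorted() is stable, so equal values keep their input order
    # (an assumption about how the hardware breaks ties).
    positions = sorted(range(len(values)), key=lambda i: values[i])
    return [values[i] for i in positions], positions


inputs = [16, 27, 7, 19]
sorted_values, sorted_positions = sort_with_positions(inputs)
print(sorted_values)     # [7, 16, 19, 27]
print(sorted_positions)  # [2, 0, 3, 1]

# Using the position bus to dereference a memory of stored records
memory = ["rec_a", "rec_b", "rec_c", "rec_d"]
print([memory[p] for p in sorted_positions])
```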

Unused inputs should be loaded with all 1s (the maximum unsigned value, so they sort to the highest positions), and the used inputs should occupy the lowest-numbered input ports for logical consistency.
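
The padding rule can be illustrated the same way. In this sketch the helper `pad_inputs`, the port count, and the 8-bit width are all hypothetical choices for illustration, and the values are assumed to be compared as unsigned numbers.

```python
def pad_inputs(values, num_ports=16, width=8):
    """Place used values on the lowest ports and fill the rest with all 1s."""
    if len(values) > num_ports:
        raise ValueError("more values than input ports")
    all_ones = (1 << width) - 1  # e.g. 0xFF for an 8-bit port
    return list(values) + [all_ones] * (num_ports - len(values))


padded = pad_inputs([16, 27, 7, 19], num_ports=8, width=8)
print(padded)          # [16, 27, 7, 19, 255, 255, 255, 255]
print(sorted(padded))  # real values first, all-1s padding last
```

Note that a real input equal to all 1s cannot be told apart from padding by value alone.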

Synplify Pro Timing and Utilization Results, Targeting a Virtex 7

Compare 2 values

Clock              Constraint f   Estimated f   Slack (ns)
merge_sort|clk     200.0 MHz      393.0 MHz     2.456
System             200.0 MHz      476.9 MHz     2.903

Mapping to part: xc7vx690tffg1927-2

Cell usage:

CARRY4 4 uses
FD 5 uses
FDR 1 use
FDRE 64 uses
GND 1 use
VCC 1 use
LUT3 66 uses
LUT4 32 uses
LUT5 4 uses
I/O ports: 136
I/O primitives: 136
IBUF 66 uses
IBUFG 1 use
OBUF 69 uses
BUFG 1 use

Compare 4 values

Clock              Constraint f   Estimated f   Slack (ns)
merge_sort|clk     200.0 MHz      339.2 MHz     2.052
System             200.0 MHz      384.6 MHz     2.400

Mapping to part: xc7vx690tffg1927-2

Cell usage:

CARRY4 20 uses
FD 9 uses
FDR 7 uses
FDRE 274 uses
GND 1 use
VCC 1 use
LUT2 5 uses
LUT3 4 uses
LUT4 167 uses
LUT5 193 uses
LUT6 150 uses
I/O ports: 272
I/O primitives: 272
IBUF 130 uses
IBUFG 1 use
OBUF 141 uses
BUFG 1 use

Compare 8 values

Clock              Constraint f   Estimated f   Slack (ns)
merge_sort|clk     200.0 MHz      282.2 MHz     1.456
System             200.0 MHz      349.4 MHz     2.138

Mapping to part: xc7vx690tffg1927-2

Cell usage:

CARRY4 80 uses
FD 15 uses
FDR 2 uses
FDRE 576 uses
GND 1 use
VCC 1 use
LUT2 12 uses
LUT3 20 uses
LUT4 787 uses
LUT5 221 uses
LUT6 727 uses
I/O ports: 548
I/O primitives: 548
IBUF 258 uses
IBUFG 1 use
OBUF 289 uses
BUFG 1 use

Compare 16 values

Clock              Constraint f   Estimated f   Slack (ns)
merge_sort|clk     200.0 MHz      274.4 MHz     1.355
System             200.0 MHz      329.1 MHz     1.962

Mapping to part: xc7vx690tffg1927-2

Cell usage:

CARRY4 304 uses
FD 18 uses
FDR 11 uses
FDRE 1184 uses
GND 1 use
MUXF7 1 use
VCC 1 use
LUT2 25 uses
LUT3 59 uses
LUT4 2546 uses
LUT5 513 uses
LUT6 2666 uses
I/O ports: 1108
I/O primitives: 1108
IBUF 514 uses
IBUFG 1 use
OBUF 593 uses
BUFG 1 use
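
The slack figures in the timing tables above appear consistent with the constraint period minus the estimated period, in nanoseconds. The sketch below recomputes them from the reported frequencies; the function name `slack_ns` is hypothetical, and small differences in the last digit are expected because the reported frequencies are rounded.

```python
def slack_ns(constraint_mhz: float, estimated_mhz: float) -> float:
    """Slack in ns: constraint clock period minus estimated clock period."""
    return 1000.0 / constraint_mhz - 1000.0 / estimated_mhz


# (constraint MHz, estimated MHz, reported slack) copied from the tables above
REPORTED = [
    (200.0, 393.0, 2.456), (200.0, 476.9, 2.903),  # compare 2 values
    (200.0, 339.2, 2.052), (200.0, 384.6, 2.400),  # compare 4 values
    (200.0, 282.2, 1.456), (200.0, 349.4, 2.138),  # compare 8 values
    (200.0, 274.4, 1.355), (200.0, 329.1, 1.962),  # compare 16 values
]

for constraint, estimated, reported in REPORTED:
    computed = slack_ns(constraint, estimated)
    print(f"{estimated:6.1f} MHz  computed {computed:.3f}  reported {reported:.3f}")
```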

Multiplication

While addition and subtraction are fairly straightforward to implement in an FPGA, multiplication is considerably more complicated and resource intensive. This document discusses several multiplication techniques that will be used in the multiplication block.

Unsigned Binary Multiplication

This method multiplies two unsigned n-bit integers (a, b) with a latency of n-1 clock cycles. For two 64-bit values the latency is therefore 63 clock cycles, and the result is a 128-bit value. If the combinatorial assign statements below make timing hard to meet, they could be registered as well, adding 1-2 clock cycles of latency that a higher clock speed could make up for. The algorithm is not very complex, but it is quite slow for a hardware design.

// Combinatorial partial products: AND b with one bit of a, zero-extend to 2n bits, shift into place
assign stage_0 = ( { {n{1'b0}}, (b[(n-1):0] & {n{a[0]}}) } ) << 0;
assign stage_1 = ( { {n{1'b0}}, (b[(n-1):0] & {n{a[1]}}) } ) << 1;
assign stage_2 = ( { {n{1'b0}}, (b[(n-1):0] & {n{a[2]}}) } ) << 2;
// ...
// Only requires n/2 operations
assign stage_n-1 = ( { {n{1'b0}}, (b[(n-1):0] & {n{a[n-1]}}) } ) << (n-1);

// Now sequentially add the stages, one addition per clock (n-1 additions in total)
always@(posedge clk) begin
  case(state)
    IDLE : begin
      if(go) begin
        state <= add0;
      end
    end
    add0 : begin
      out0 <= stage_0 + stage_1;
      state <= add1;
    end
    add1 : begin
      out1 <= stage_2 + out0;
      state <= add2;
    end
    add2 : begin
      out2 <= stage_3 + out1;
      state <= add3;
    end
    // ...
    addn-2 : begin
      // Final addition: outn-2 holds the 2n-bit product
      outn-2 <= stage_n-1 + outn-3;
      state <= IDLE;
    end
  endcase
end
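
A software reference model of this shift-and-add scheme can help when checking the hardware in simulation. The sketch below is an illustration only: the function name `shift_add_multiply` is hypothetical, and it assumes one partial product per bit of `a`, accumulated with one addition per clock so that n stages need n-1 additions.

```python
def shift_add_multiply(a: int, b: int, n: int):
    """Multiply two unsigned n-bit integers by summing shifted partial products.

    Returns (product, additions); additions models the n-1 clock latency.
    """
    mask = (1 << n) - 1
    if not (0 <= a <= mask and 0 <= b <= mask):
        raise ValueError("a and b must be unsigned n-bit values")
    # stage_i = (b AND replicated a[i]) << i, held in a 2n-bit field
    stages = [(b if (a >> i) & 1 else 0) << i for i in range(n)]
    acc = stages[0]
    additions = 0
    for stage in stages[1:]:
        acc += stage  # one accumulation per clock cycle in the hardware
        additions += 1
    return acc, additions


product, additions = shift_add_multiply(13, 11, 4)
print(product, additions)  # 143 3
```

For n = 64 the model performs 63 additions, matching the 63-cycle latency quoted above.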
