
AWS F1 DMA Example


This example uses Amazon's FPGA Developer AMI. This is the fastest way to run both the PipelineC tool and the AWS F1 build process. However, the PipelineC tool can also be run locally and its outputs copied to the AWS instance.

Instructions for starting an F1 instance are provided by Amazon.

This FPGA design is based on AWS's own DMA example. Please refer to their documentation for non-PipelineC questions.

The files mentioned below are all in the example directory.

How does this example work?

The original AWS DMA example allows you to write and read the FPGA as if it were an address space in memory. This example works by using a narrow portion of that functionality:

  • Write a small input buffer 'message' of fixed size N bytes to address 0
    • This acts as the input to the FPGA hardware
  • Read those same N bytes 'message' from address 0
    • This is the output from the FPGA hardware
  • No other addresses or buffer sizes are supported.

The code describing the conversion between AWS DMA interfaces and this simple 'message' abstraction is in the files dma_msg.h, dma_msg_hw.c, and dma_msg_sw.c.
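As a rough illustration, the message type is just a fixed-size byte array. The sketch below is only a guess at the shape of the struct in dma_msg.h; the real size constant is defined there (4096 is a placeholder):

#include <stdint.h>

// Hypothetical sketch of the fixed-size DMA message type from dma_msg.h.
// DMA_MSG_SIZE is a placeholder; the real constant is defined in dma_msg.h.
#define DMA_MSG_SIZE 4096

typedef struct dma_msg_t
{
  uint8_t data[DMA_MSG_SIZE];
} dma_msg_t;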

Software

Amazon provides a simple read+write interface to the FPGA through user space file IO and a kernel driver. This example writes and reads 'message' byte arrays to/from that file - relatively simple code.
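For a sense of what that looks like, here is a hedged C sketch of the write+read round trip, assuming the XDMA kernel driver's usual host-to-card and card-to-host character devices (the device paths and helper names in dma_msg_sw.c may differ):

#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>

// Write the fixed-size message to FPGA address 0 via the host-to-card device
int write_msg(int h2c_fd, const uint8_t* msg, size_t n)
{
  return pwrite(h2c_fd, msg, n, 0) == (ssize_t)n ? 0 : -1;
}

// Read the processed message back from address 0 via the card-to-host device
int read_msg(int c2h_fd, uint8_t* msg, size_t n)
{
  return pread(c2h_fd, msg, n, 0) == (ssize_t)n ? 0 : -1;
}

// Usage (device paths are assumptions, e.g. from the XDMA driver):
//   int h2c = open("/dev/xdma0_h2c_0", O_WRONLY);
//   int c2h = open("/dev/xdma0_c2h_0", O_RDONLY);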

Hardware

Amazon uses an AXI4 bus in their DMA example. This example hardware serializes and deserializes bursts of AXI4 data to form 'message' byte arrays that can be passed to and from other logic such as the POSIX Experiment.
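Conceptually, the deserializer accumulates AXI4 write beats until a full message has arrived (serialization is the reverse). The plain-C model below is not the actual PipelineC source; the beat width and message size are assumptions:

#include <stdint.h>
#include <string.h>

#define AXI_BEAT_BYTES 64   // assumed 512-bit AXI4 data width
#define DMA_MSG_SIZE 4096   // placeholder message size

typedef struct deserializer_t
{
  uint8_t msg[DMA_MSG_SIZE];
  uint32_t pos; // bytes accumulated so far
} deserializer_t;

// Accept one AXI4 write beat; returns 1 when a complete message is assembled
int axi_beat_in(deserializer_t* d, const uint8_t beat[AXI_BEAT_BYTES])
{
  memcpy(&d->msg[d->pos], beat, AXI_BEAT_BYTES);
  d->pos += AXI_BEAT_BYTES;
  if (d->pos == DMA_MSG_SIZE)
  {
    d->pos = 0;
    return 1; // message complete, ready for user logic
  }
  return 0;
}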

Input and output byte representation

DMA data is just bytes; interpreting those bytes further is specific to your application.

The files work_sw.c and work_hw.c describe the conversion of the DMA message struct to/from work() input/output types used in this example.
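The conversion is essentially a reinterpretation of the raw message bytes as typed structs. This sketch uses hypothetical names and sizes; the real definitions live in work.h and the work_sw.c/work_hw.c files:

#include <stdint.h>
#include <string.h>

#define N_FLOATS 16 // placeholder for the example's actual input count

typedef struct work_inputs_t { float values[N_FLOATS]; } work_inputs_t;
typedef struct work_outputs_t { float result; } work_outputs_t;

// Interpret the leading message bytes as the work() input struct
work_inputs_t bytes_to_inputs(const uint8_t* msg_data)
{
  work_inputs_t inputs;
  memcpy(&inputs, msg_data, sizeof(inputs));
  return inputs;
}

// Pack the work() output struct back into the leading message bytes
void outputs_to_bytes(const work_outputs_t* outputs, uint8_t* msg_data)
{
  memcpy(msg_data, outputs, sizeof(*outputs));
}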

The FPGA 'output = work(input)' function

This example does a matrix multiplication. The work.h file contains the definition of output = work(input): the function, its inputs (N floating point values), and outputs (a single floating point value).
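Illustratively, a function with N float inputs and a single float output can compute one element of a matrix product, i.e. a dot product. The sketch below is only a guess at the shape of work(); the real definition is in work.h:

#define N_FLOATS 16 // placeholder input count

typedef struct work_inputs_t { float values[N_FLOATS]; } work_inputs_t;
typedef struct work_outputs_t { float result; } work_outputs_t;

// Dot product of two half-length vectors packed into the input array
work_outputs_t work(work_inputs_t inputs)
{
  work_outputs_t outputs;
  outputs.result = 0.0f;
  for (int i = 0; i < N_FLOATS / 2; i += 1)
  {
    outputs.result += inputs.values[i] * inputs.values[(N_FLOATS / 2) + i];
  }
  return outputs;
}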

Software driver/tester

test.c implements the standard test: do the work on the CPU, do the work on the FPGA, and see if there was a speedup. It includes helper functions to easily swap out what the input values are and how the output values are compared. In this example the CPU and FPGA both use the same work() function source code, so this isn't the best possible CPU implementation to compare against.
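The overall flow of such a test looks roughly like the sketch below. fpga_work() is a hypothetical stand-in for the DMA write+read round trip; test.c's actual helper names may differ:

#include <stdio.h>
#include <time.h>

float work(float x) { return x * x; }      // stand-in for the shared work()
float fpga_work(float x) { return x * x; } // stand-in for the DMA round trip

static double seconds_since(struct timespec start)
{
  struct timespec now;
  clock_gettime(CLOCK_MONOTONIC, &now);
  return (now.tv_sec - start.tv_sec) + (now.tv_nsec - start.tv_nsec) * 1e-9;
}

int main(void)
{
  struct timespec t0;
  clock_gettime(CLOCK_MONOTONIC, &t0);
  float cpu_out = work(2.0f);              // do work on the CPU
  double cpu_time = seconds_since(t0);

  clock_gettime(CLOCK_MONOTONIC, &t0);
  float fpga_out = fpga_work(2.0f);        // do work on the FPGA
  double fpga_time = seconds_since(t0);

  printf("Outputs match: %d\n", cpu_out == fpga_out);
  printf("Speedup: %fx\n", cpu_time / fpga_time);
  return 0;
}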

Run the example

In your AWS FPGA Developer AMI instance (doesn't need to be an F1 FPGA instance yet), use these steps to run the example:

These steps require 16+ GB of RAM for your instance:

  1. Install or update the latest PipelineC repo
cd ~/src/project_data/
git clone https://github.com/JulianKemmerer/PipelineC.git # Fine to fail if exists
cd PipelineC
git pull # In case already exists
cd examples/aws-fpga-dma
chmod +x install.sh
./install.sh
  2. Run the AWS environment setup scripts
AWS_FPGA_REPO_DIR=/home/centos/src/project_data/aws-fpga
cd $AWS_FPGA_REPO_DIR
source hdk_setup.sh
source sdk_setup.sh
cd $HDK_DIR/cl/examples/cl_dram_dma
export CL_DIR=$(pwd)
  3. Run the PipelineC tool (~ minutes to several hours)
cd ~/src/project_data/PipelineC/;
rm -r /home/centos/pipelinec_syn_output; 
python -u ./src/pipelinec 2>&1 | tee out.log

These steps require 32+ GB of RAM for your instance:

  4. Build the Vivado checkpoint that will be turned into an Amazon FPGA Image (AFI) (~ several hours)
cd $CL_DIR/build/scripts
./aws_build_dcp_from_cl.sh
  5. Wait for Vivado to finish and put the checkpoint file in $CL_DIR/build/checkpoints/to_aws/
ls -lt $CL_DIR/build/checkpoints/to_aws/ | grep .tar | head -n 1
# Set these environment variables based on your output
export TARTIMESTAMP=20_03_20-103330
export TARFILENAME=$TARTIMESTAMP.Developer_CL.tar

These steps require very little RAM:

  6. Copy the checkpoint to Amazon S3 for Amazon to do their magic (requires AWS credentials to be set up)
# Set environment vars needed 
export REGION=us-east-1
export S3BUCKET=pipelinec
export S3DCPDIRNAME=dcps
export S3LOGSDIRNAME=logs
aws s3 mb s3://$S3BUCKET --region $REGION  # Create an S3 bucket (choose a unique bucket name)
aws s3 mb s3://$S3BUCKET/$S3DCPDIRNAME/   # Create folder for your tarball files
aws s3 cp $CL_DIR/build/checkpoints/to_aws/$TARFILENAME s3://$S3BUCKET/$S3DCPDIRNAME/    # Upload the file to S3
# Make room for Amazon's log file on S3
aws s3 mb s3://$S3BUCKET/$S3LOGSDIRNAME/  # Create a folder to keep your logs
touch LOGS_FILES_GO_HERE.txt                     # Create a temp file
aws s3 cp LOGS_FILES_GO_HERE.txt s3://$S3BUCKET/$S3LOGSDIRNAME/   # Which creates the folder on S3
  7. Tell Amazon to generate an AFI using those S3 files
export AFI_NAME=pipelinec
export AFI_DESC=aws_example
aws ec2 create-fpga-image --region $REGION --name $AFI_NAME --description $AFI_DESC --input-storage-location Bucket=$S3BUCKET,Key=$S3DCPDIRNAME/$TARFILENAME --logs-storage-location Bucket=$S3BUCKET,Key=$S3LOGSDIRNAME
# Set these environment variables based on your output
export AFIID=afi-0a418ee223c9a814c
export AGFIID=agfi-0148fb8a218d50b49
  8. Wait for Amazon to say your AFI is 'available' (~ a few hours)
aws ec2 describe-fpga-images --fpga-image-ids $AFIID | grep "Code" 

These steps require an F1 FPGA instance:

  9. Start working with real FPGA hardware (must be on an F1 instance now)
# Clear FPGA (30s)
sudo fpga-clear-local-image  -S 0
# Load FPGA (30s)
sudo fpga-load-local-image -S 0 -I $AGFIID
# Reset (pcie reset)
sudo fpga-describe-local-image -S 0 -R -H
  10. Run the test (rebuild, reset the FPGA again, run ./test)
cd /home/centos/src/project_data/PipelineC/examples/aws-fpga-dma
reset; make clean; make && sudo fpga-describe-local-image -S 0 -R -H && sudo ./test