labs_commit (#305)
* labs_commit

* Updated Readme files

* updated readme
hatchuta-xilinx authored and heeran-xilinx committed Sep 19, 2017
1 parent 65ba578 commit 8baf196
Showing 29 changed files with 2,306 additions and 1 deletion.
13 changes: 13 additions & 0 deletions getting_started/README.md
@@ -17,6 +17,8 @@ S.No. | Category | Description
6 | [debug][] |Debugging and Profiling of Kernel.
7 | [rtl_kernel][] |RTL Kernel Based Examples
8 | [misc][] |OpenCL miscellaneous Examples
9 | [cpu_to_fpga][] |CPU to FPGA conversion Examples with Kernel Optimizations.


__Examples Table__

@@ -86,6 +88,11 @@ Example | Description | Key Concepts / Keywords
[misc/sum_scan/][]|Example of parallel prefix sum|
[misc/vadd/][]|Simple example of vector addition.|
[misc/vdotprod/][]|Simple example of vector dot-product.|
[00_cpu/][]|This is a simple example of matrix multiplication (Row x Col).|
[01_ocl/][]|This is a simple example of OpenCL matrix multiplication (Row x Col).|__Key__ __Concepts__<br> - OpenCL APIs
[02_lmem_ocl/][]|This is a simple example of matrix multiplication (Row x Col) to demonstrate how to reduce the number of memory accesses using local memory.|__Key__ __Concepts__<br> - Kernel Optimization<br> - Local Memory
[03_burst_rw_ocl/][]|This is a simple example of matrix multiplication (Row x Col) to demonstrate how to achieve better pipelining with burst reads from DDR into local memory and burst writes from local memory back to DDR.|__Key__ __Concepts__<br> - Kernel Optimization<br> - Burst Read/Write
[04_partition_ocl/][]|This is a simple example of matrix multiplication (Row x Col) to demonstrate how to achieve better performance by array partitioning and loop unrolling.|__Key__ __Concepts__<br> - Array Partition<br> - Loop Unroll<br>__Keywords__<br> - xcl_pipeline_loop<br> - xcl_array_partition(complete, dim)<br> - opencl_unroll_hint

[host]:host
[host/concurrent_kernel_execution_ocl/]:host/concurrent_kernel_execution_ocl/
@@ -159,3 +166,9 @@ Example | Description | Key Concepts / Keywords
[misc/sum_scan/]:misc/sum_scan/
[misc/vadd/]:misc/vadd/
[misc/vdotprod/]:misc/vdotprod/
[cpu_to_fpga]:cpu_to_fpga
[00_cpu/]:cpu_to_fpga/00_cpu/
[01_ocl/]:cpu_to_fpga/01_ocl/
[02_lmem_ocl/]:cpu_to_fpga/02_lmem_ocl/
[03_burst_rw_ocl/]:cpu_to_fpga/03_burst_rw_ocl/
[04_partition_ocl/]:cpu_to_fpga/04_partition_ocl/
24 changes: 24 additions & 0 deletions getting_started/cpu_to_fpga/00_cpu/Makefile
@@ -0,0 +1,24 @@
COMMON_REPO := ../../../

# Common Includes
include $(COMMON_REPO)/utility/boards.mk

# Host Application
host_SRCS=./src/main.cpp
host_CXXFLAGS=-I./src/

host_NTARGETS=hw_emu hw
EXES=host

# check
check_EXE=host
check_NTARGETS=$(host_NTARGETS)

CHECKS=check

# Report a warning when the build is not targeting sw_emu
ifneq (sw_emu,$(findstring sw_emu,$(TARGETS)))
$(warning WARNING: Application supports only the sw_emu target. Please use TARGETS=sw_emu to run the application)
endif

include $(COMMON_REPO)/utility/rules.mk
147 changes: 147 additions & 0 deletions getting_started/cpu_to_fpga/00_cpu/README.md
@@ -0,0 +1,147 @@
Matrix Multiplication
======================

This README file contains the following sections:

1. OVERVIEW
2. HOW TO DOWNLOAD THE REPOSITORY
3. SOFTWARE AND SYSTEM REQUIREMENTS
4. DESIGN FILE HIERARCHY
5. COMPILATION AND EXECUTION
6. EXECUTION IN CLOUD ENVIRONMENTS
7. SUPPORT
8. LICENSE AND CONTRIBUTING TO THE REPOSITORY
9. ACKNOWLEDGEMENTS
10. REVISION HISTORY


## 1. OVERVIEW
This is a simple example of matrix multiplication (Row x Col).

## 2. HOW TO DOWNLOAD THE REPOSITORY
To get a local copy of the SDAccel example repository, clone this repository to the local system with the following command:
```
git clone https://github.com/Xilinx/SDAccel_Examples examples
```
where `examples` is the name of the directory in which the repository will be stored on the local system. This command needs to be executed only once to retrieve the latest version of all SDAccel examples. The only required software is a local installation of git.

## 3. SOFTWARE AND SYSTEM REQUIREMENTS
Board | Device Name | Software Version
------|-------------|-----------------
Alpha Data ADM-PCIE-7V3|xilinx:adm-pcie-7v3:1ddr|SDAccel 2017.1
Xilinx VU9P|xilinx:xil-accel-rd-vu9p:4ddr-xpr|SDAccel 2017.1
AWS VU9P F1|xilinx:aws-vu9p-f1:4ddr-xpr-2pr|SDAccel 2017.1
Xilinx KU115|xilinx:xil-accel-rd-ku115:4ddr-xpr|SDAccel 2017.1
Alpha Data ADM-PCIE-KU3|xilinx:adm-pcie-ku3:2ddr-xpr|SDAccel 2017.1


*NOTE:* The board/device used for compilation can be changed by adding the DEVICES variable to the make command, as shown below:
```
make DEVICES=<device name>
```
where the *DEVICES* variable accepts either a single device name from the table above or a comma-separated list of device names.

## 4. DESIGN FILE HIERARCHY
Application code is located in the src directory. Accelerator binary files will be compiled to the xclbin directory. The xclbin directory is required by the Makefile and its contents will be filled during compilation. A listing of all the files in this example is shown below

```
Makefile
README.md
description.json
src/main.cpp
```

## 5. COMPILATION AND EXECUTION
### Compiling for Application Emulation
As part of the capabilities available to an application developer, SDAccel includes environments to test the correctness of an application at both a software functional level and a hardware emulated level.
These modes, which are named sw_emu and hw_emu, allow the developer to profile and evaluate the performance of a design before compiling for board execution.
It is recommended that all applications are executed in at least the sw_emu mode before being compiled and executed on an FPGA board.
```
make TARGETS=<sw_emu|hw_emu> all
```
where
```
sw_emu = software emulation
hw_emu = hardware emulation
```
*NOTE:* The software emulation flow is a functional correctness check only. It does not estimate the performance of the application in hardware.
The hardware emulation flow is a cycle accurate simulation of the hardware generated for the application. As such, it is expected for this simulation to take a long time.
It is recommended that for this example the user skips running hardware emulation or modifies the example to work on a reduced data set.
### Executing Emulated Application
***Recommended Execution Flow for Example Applications in Emulation***

The makefile for the application can directly execute the application with the following command:
```
make TARGETS=<sw_emu|hw_emu> check
```
where
```
sw_emu = software emulation
hw_emu = hardware emulation
```
If the application has not been previously compiled, the check makefile rule will compile and execute the application in the emulation mode selected by the user.

***Alternative Execution Flow for Example Applications in Emulation***

An emulated application can also be executed directly from the command line without using the check makefile rule as long as the user environment has been properly configured.
To manually configure the environment to run the application, set the following:
```
export LD_LIBRARY_PATH=$XILINX_SDX/runtime/lib/x86_64/:$LD_LIBRARY_PATH
export XCL_EMULATION_MODE=<sw_emu|hw_emu>
emconfigutil --xdevice 'xilinx:xil-accel-rd-ku115:4ddr-xpr' --nd 1
```
Once the environment has been configured, the application can be executed with:
```
./host
```
This is the same command executed by the check makefile rule.
### Compiling for Application Execution in the FPGA Accelerator Card
The command to compile the application for execution on the FPGA acceleration board is:
```
make all
```
The default target for the makefile is to compile for hardware. Therefore, setting the TARGETS option is not required.
*NOTE:* Compilation for application execution in hardware generates custom logic to implement the functionality of the kernels in an application.
It is typical for hardware compile times to range from 30 minutes to a couple of hours.

## 6. EXECUTION IN CLOUD ENVIRONMENTS
FPGA acceleration boards have been deployed to the cloud. For information on how to execute the example within a specific cloud, take a look at the following guides.
* [AWS F1 Application Execution on Xilinx Virtex UltraScale Devices]
* [Nimbix Application Execution on Xilinx Kintex UltraScale Devices]
* [IBM SuperVessel Research Cloud on Xilinx Virtex Devices]


## 7. SUPPORT
For more information about SDAccel, see the [SDAccel User Guides][].

For questions and to get help on this project or your own projects, visit the [SDAccel Forums][].

To execute this example using the SDAccel GUI, follow the setup instructions in the [SDAccel GUI README][].


## 8. LICENSE AND CONTRIBUTING TO THE REPOSITORY
The source for this project is licensed under the [3-Clause BSD License][]

To contribute to this project, follow the guidelines in the [Repository Contribution README][]

## 9. ACKNOWLEDGEMENTS
This example is written by developers at
- [Xilinx](http://www.xilinx.com)

## 10. REVISION HISTORY
Date | README Version | Description
-----|----------------|------------
SEPT2017|1.0|Initial Xilinx Release

[3-Clause BSD License]: ../../../LICENSE.txt
[SDAccel Forums]: https://forums.xilinx.com/t5/SDAccel/bd-p/SDx
[SDAccel User Guides]: http://www.xilinx.com/support/documentation-navigation/development-tools/software-development/sdaccel.html?resultsTablePreSelect=documenttype:SeeAll#documentation
[Nimbix Getting Started Guide]: http://www.xilinx.com/support/documentation/sw_manuals/xilinx2016_2/ug1240-sdaccel-nimbix-getting-started.pdf
[Walkthrough Video]: http://bcove.me/6pp0o482
[Nimbix Application Submission README]: ../../../utility/nimbix/README.md
[Repository Contribution README]: ../../../CONTRIBUTING.md
[SDAccel GUI README]: ../../../GUIREADME.md
[AWS F1 Application Execution on Xilinx Virtex UltraScale Devices]: https://github.com/aws/aws-fpga/blob/master/SDAccel/README.md
[Nimbix Application Execution on Xilinx Kintex UltraScale Devices]: ../../../utility/nimbix/README.md
[IBM SuperVessel Research Cloud on Xilinx Virtex Devices]: http://bcove.me/6pp0o482
25 changes: 25 additions & 0 deletions getting_started/cpu_to_fpga/00_cpu/description.json
@@ -0,0 +1,25 @@
{
"runtime": ["OpenCL"],
"example": "Matrix Multiplication",
"overview": [
"This is a simple example of matrix multiplication (Row x Col)."
],
"os": [
"Linux"
],
"targets": ["sw_emu"],
"em_cmd": "./host",
"contributors" : [
{
"group": "Xilinx",
"url" : "http://www.xilinx.com"
}
],
"revision" : [
{
"date" : "SEPT2017",
"version": "1.0",
"description": "Initial Xilinx Release"
}
]
}
80 changes: 80 additions & 0 deletions getting_started/cpu_to_fpga/00_cpu/src/main.cpp
@@ -0,0 +1,80 @@
/**********
Copyright (c) 2017, Xilinx, Inc.
All rights reserved.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
**********/

/*
This is a simple example of Matrix Multiplication.
*/

#include<iostream>
#include<stdlib.h>

//Array size to access
#define DATA_SIZE 32

void mmult_cpu( int *in1,   // Input matrix 1
                int *in2,   // Input matrix 2
                int *out,   // Output matrix (out = in1 x in2); must be zero-initialized by the caller
                int dim     // Matrix size of one dimension
              )
{
    // Performs matrix multiplication out = in1 x in2
    for (int i = 0; i < dim; i++){
        for (int j = 0; j < dim; j++){
            for (int k = 0; k < dim; k++){
                out[i * dim + j] += in1[i * dim + k] * in2[k * dim + j];
            }
        }
    }
}

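int main(int argc, char** argv)
{
    int size = DATA_SIZE;
    size_t matrix_size_bytes = sizeof(int) * size * size;

    // Allocate memory
    int *source_in1         = (int *) malloc(matrix_size_bytes);
    int *source_in2         = (int *) malloc(matrix_size_bytes);
    int *source_cpu_results = (int *) malloc(matrix_size_bytes);

    // Stop early if any allocation failed (free(NULL) is a no-op)
    if (!source_in1 || !source_in2 || !source_cpu_results) {
        std::cerr << "Failed to allocate host memory." << std::endl;
        free(source_in1);
        free(source_in2);
        free(source_cpu_results);
        return EXIT_FAILURE;
    }

    // Create the input data and clear the output matrix
    for(int index = 0; index < size * size; index++){
        source_in1[index] = index;
        source_in2[index] = index * index;
        source_cpu_results[index] = 0;
    }

    // Function call to perform matrix multiplication
    mmult_cpu(source_in1, source_in2, source_cpu_results, size);

    std::cout << "Matrix Multiplication completed." << std::endl;

    // Release host memory
    free(source_in1);
    free(source_in2);
    free(source_cpu_results);

    return 0;
}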
28 changes: 28 additions & 0 deletions getting_started/cpu_to_fpga/01_ocl/Makefile
@@ -0,0 +1,28 @@
COMMON_REPO := ../../../
#Common Includes
include $(COMMON_REPO)/utility/boards.mk
include $(COMMON_REPO)/libs/xcl2/xcl2.mk
include $(COMMON_REPO)/libs/opencl/opencl.mk

# Host Application
host_SRCS=./src/host.cpp $(xcl2_SRCS)
host_HDRS=$(xcl2_HDRS)
host_CXXFLAGS=-I./src/ $(xcl2_CXXFLAGS) $(opencl_CXXFLAGS)
host_LDFLAGS=$(opencl_LDFLAGS)

# Kernel
mmult_SRCS=./src/mmult.cl
mmult_CLFLAGS=-k mmult

XOS=mmult
EXES=host
XCLBINS=mmult

mmult_XOS=mmult
# check
check_EXE=host
check_XCLBINS=mmult

CHECKS=check

include $(COMMON_REPO)/utility/rules.mk