New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial support for Intel FPGA SDK for OpenCL (AOCL) #1474

Merged
merged 27 commits into from Jul 31, 2018

Conversation

Projects
None yet
7 participants
@ktabata
Copy link
Contributor

ktabata commented Jul 23, 2018

This PR adds initial support for Intel FPGA SDK for OpenCL (AOCL).
Currently it only works in CPU emulation mode.
To try this patch, you can use AOCL 17.1 without paid license.

@liangfu

This comment has been minimized.

Copy link
Member

liangfu commented Jul 23, 2018

Nice effort.
Can we support targeting opencl for both gpu and accelerator at the same time?

@ktabata

This comment has been minimized.

Copy link
Contributor

ktabata commented Jul 23, 2018

Hi @liangfu , as far as this PR goes, no, we can't.
If we support targeting OpenCL for both GPU and FPGA at the same time,
what is the best way to choose device types (GPU and FPGA) from python script?

@liangfu

This comment has been minimized.

Copy link
Member

liangfu commented Jul 23, 2018

i'm not sure. perhaps by defining a different context? usually, ctx=tvm.context('opencl',0)

#
# Possible values:
# - OFF: disbale AOCL
# - board_name: use specific board name for offline compilation

This comment has been minimized.

@kazum

kazum Jul 24, 2018

Member

I think the board name should be passed with target options (e.g. tvm.context("opencl -device=[board name]")).

This comment has been minimized.

@ktabata

ktabata Jul 25, 2018

Contributor

Changed to use target name option like tgt="aocl -device=de5net_a7".

Init("opencl", "gpu");
#else
Init("opencl", "accelerator");
#endif

This comment has been minimized.

@kazum

kazum Jul 24, 2018

Member

Implement AOCLWorkspace as a subclass of OpenCLWorkspace so that it can coexist with other OpenCL platforms.

This comment has been minimized.

@ktabata

ktabata Jul 25, 2018

Contributor

Introduced AOCLWorkspace.

if (device_type == "accelerator") dtype = CL_DEVICE_TYPE_ACCELERATOR;
#else
if (device_type == "accelerator") dtype = CL_DEVICE_TYPE_DEFAULT;

This comment has been minimized.

@kazum

kazum Jul 24, 2018

Member

Why do you want to use CL_DEVICE_TYPE_DEFAULT here?

This comment has been minimized.

@ktabata

ktabata Jul 25, 2018

Contributor

Deleted this code and now works with device_type == "accelerator".

const char* s = data_.c_str();
size_t len = data_.length();
cl_int err;
program_ = clCreateProgramWithSource(w->context, 1, &s, &len, &err);
OPENCL_CHECK_ERROR(err);
#else
OfflineCompile(w, t);

This comment has been minimized.

@kazum

kazum Jul 24, 2018

Member

Should be moved to codegen since compiling OpenCL codes for FPGAs takes very long time.

This comment has been minimized.

@ktabata

ktabata Jul 25, 2018

Contributor

Offline compilation was moved to codegen. See also BuildAOCL().

@ktabata

This comment has been minimized.

Copy link
Contributor

ktabata commented Jul 24, 2018

Hi @kazum thank you for reviewing. I'll fix them.

TABATA Keiichi added some commits Jul 25, 2018

@ktabata

This comment has been minimized.

Copy link
Contributor

ktabata commented Jul 25, 2018

@kazum I implemented AOCLWorkspace and moved offline compile process to codegen. Please review.

TABATA Keiichi added some commits Jul 25, 2018

TABATA Keiichi
TABATA Keiichi
TABATA Keiichi
TABATA Keiichi
TABATA Keiichi
- Install AOCL 17.1 on Ubuntu 16.04.4 LTS.
- Install FPGA device driver.
- Make ICD file. (/etc/OpenCL/vendors/Altera.icd)
- Make FCD file. (/opt/Intel/OpenCL/Boards/de5net.fcd)

This comment has been minimized.

@kazum

kazum Jul 25, 2018

Member

Add more explanation about what kinds of files we should make.

This comment has been minimized.

@ktabata

ktabata Jul 25, 2018

Contributor

Fixed document.

This comment has been minimized.

@962711982

962711982 Jan 4, 2019

But I can't install FPGA PCIe driver on Ubuntu 16.04 LTS,

@@ -91,6 +91,9 @@ Target CreateTarget(const std::string& target_name,
} else if (target_name == "sdaccel") {
t->device_type = kDLOpenCL;
t->keys_array.push_back(ir::StringImm::make("sdaccel"));
} else if (target_name == "aocl") {
t->device_type = kDLOpenCL;

This comment has been minimized.

@kazum

kazum Jul 25, 2018

Member

I guess this should be kDLAOCL.

This comment has been minimized.

@ktabata

ktabata Jul 25, 2018

Contributor

Fixed.

// Compile the .cl file.
Target target = Target::create(target_str);
std::string cmd = "aoc aocl.cl -march=emulator -board=";
cmd += target->device_name;

This comment has been minimized.

@kazum

kazum Jul 25, 2018

Member

I think this doesn't work if we don't specify the '-device' option.

This comment has been minimized.

@ktabata

ktabata Jul 25, 2018

Contributor

Added logic to check device name.

}

void AOCLWorkspace::Init() {
OpenCLWorkspace::Init("aocl", "accelerator", "Intel");

This comment has been minimized.

@kazum

kazum Jul 25, 2018

Member

I think "Intel" would match the Intel OpenCL platform for CPU/GPU. Should be "Intel(R) FPGA"?

This comment has been minimized.

@ktabata

ktabata Jul 25, 2018

Contributor

Changed "Intel" to the exact platform name.

TABATA Keiichi added some commits Jul 25, 2018

TABATA Keiichi added some commits Jul 25, 2018

TABATA Keiichi
TABATA Keiichi
@ktabata

This comment has been minimized.

Copy link
Contributor

ktabata commented Jul 25, 2018

@kazum I made some changes to meet your comments. Will you review again?

TABATA Keiichi added some commits Jul 25, 2018

TABATA Keiichi
TABATA Keiichi
TABATA Keiichi
TABATA Keiichi
import tvm
tgt_host="llvm"
tgt="aocl -device=de5net_a7 -mattr=emulator"

This comment has been minimized.

@kazum

kazum Jul 26, 2018

Member

I'd suggest "-device=s5_ref" for the tutorial. It is the default device of aoc and available without installing additional BSP.

This comment has been minimized.

@ktabata

ktabata Jul 27, 2018

Contributor

Changed de5net_a7 to s5_ref.

std::string cmd = "aoc aocl.cl";
if (target_str.find("-mattr=emulator") != std::string::npos) {
cmd += " -march=emulator";
}

This comment has been minimized.

@kazum

kazum Jul 26, 2018

Member

Use target->options() to get target parameters.

  for (std::string option : target->options()) {
    if (option == "-mattr=emulator") {
      cmd += " -march=emulator";
    }
  }

This comment has been minimized.

@ktabata

ktabata Jul 27, 2018

Contributor

Fixed.

@kazum

This comment has been minimized.

Copy link
Member

kazum commented Jul 26, 2018

@ktabata In addition, please add a testcase to test the aocl backend.

TABATA Keiichi added some commits Jul 27, 2018

TABATA Keiichi
TABATA Keiichi
@ktabata

This comment has been minimized.

Copy link
Contributor

ktabata commented Jul 28, 2018

@kazum I added testcases.

@kazum

kazum approved these changes Jul 30, 2018

Copy link
Member

kazum left a comment

Looks good to me, thanks.

@ktabata

This comment has been minimized.

Copy link
Contributor

ktabata commented Jul 30, 2018

@kazum Thank you for reviewing. I tested this code on not only emulator but also physical FPGA device. It works.

@tqchen

This comment has been minimized.

Copy link
Member

tqchen commented Jul 30, 2018

cc @vegaluisjose @tmoreau89 @comaniac can you also please take a quick look?

@comaniac

This comment has been minimized.

Copy link
Contributor

comaniac commented Jul 30, 2018

As I saw from the PR, this feature leverages the existing OpenCL code generator for Intel FPGA kernel. It seems to me that this may become annoying for the future improvement because the high-performance OpenCL kernel for Intel FPGA and NVidia GPU is very different. From my personal perspective, separating intel_opencl and nvidia_opencl is necessary, but I'm open if you have a better plan.

@tmoreau89

This comment has been minimized.

Copy link
Contributor

tmoreau89 commented Jul 30, 2018

@ktabata @kazum since we are adding support to both SDAccel and AOCL OpenCL backends, how much common infrastructure should we be using for FPGA-specific code generation? I understand that the OpenCL specs are similar between the two vendors, and that there exist analogous code pragmas between the two specs which can be generated depending on the target we're in.

My opinion is to merge common infrastructure as much as possible early on rather than later. I recommend if you haven't already to coordinate together to reuse common code generation infrastructure as much as possible.

Otherwise, this seems like a good start at supporting Intel FPGAs, I look forward to seeing more ways in which to use TVM to leverage FPGAs.

@kazum

This comment has been minimized.

Copy link
Member

kazum commented Jul 30, 2018

@comaniac +1 for separating AOCL code generation from the existing codes. Adding CodeGenAlteraOpenCL which inherits from CodeGenOpenCL looks good to me.

@tmoreau89 I think it's a good idea to add a key like hls to both SDAccel and AOCL targets and have common schedulers for high-level synthesis backends. Then both backends can share the same logic for code generation, and vendor-specific code generators would only be in src/codegen/. I'll give it a try in a few days.

At the current stage, this PR only adds support for compiling with AOCL and, IMHO, it looks good to be merged before implementing common code generation infrastructure.

@tmoreau89

This comment has been minimized.

Copy link
Contributor

tmoreau89 commented Jul 30, 2018

@kazum - great! As long as we plan on converging the back-ends.

It would indeed be good to have an hls key if we plan on encompassing both OpenCL and C-like HLS languages. If we want a common target to only AOCL and SDAccel, then we should make the key more specific, like fpga_opencl to make it clear that it would differ from HLS like languages.
I think it makes more sense to have an all-encompassing key (hls) since scheduling could most likely be reused across HLS-C and OpenCL for FPGAs.

Feel free to approve!

@comaniac

This comment has been minimized.

Copy link
Contributor

comaniac commented Jul 30, 2018

Minor comment: My experience was the Xilinx HLS C and Intel OpenCL have very different programming model for generating the same architecture. For example, Intel OpenCL uses global variables to represent FIFOs between modules and a module is generated from a kernel function with __kernel OpenCL identifier. On the other hand, Xilinx HLS C uses their built-in data structure to explicitly specify FIFOs. As a result, even both code generators use the same lowered IR, the implementation of each generator may still be very different.

@tmoreau89

This comment has been minimized.

Copy link
Contributor

tmoreau89 commented Jul 30, 2018

@comaniac that's a good point regarding hardware constructs such as FIFOs.

As a result the conclusion of this discussion is to re-use IR passes for schedule lowering purposes between the different vendors, and have specialized code generators for each vendor to convert lowered TVM IR into OpenCL code.

Let me know if you agree with this approach.

@tmoreau89 tmoreau89 merged commit f711853 into dmlc:master Jul 31, 2018

1 check passed

continuous-integration/jenkins/pr-merge This commit looks good
Details

tqchen pushed a commit to tqchen/tvm that referenced this pull request Aug 4, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment