
Support FPGA Xilinx #26691

Open
Belkharym opened this issue Sep 23, 2019 · 11 comments
Labels
module: backend non-standard backend support triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@Belkharym

Hello, World!

We are a group intending to accelerate some PyTorch operations on Xilinx UltraScale FPGAs. However, we are a little lost as to where to begin porting the functions.

From what we could see, we think we can start from the CUDA implementation, modify it to use the OpenCL API, and add an FPGA device type and its components (Streams, Storage, Tensors, ...).

We would like some guidance on the right way to proceed. Would you be so kind as to help us?
Thank you.

@VitalyFedyunin VitalyFedyunin added triage review triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module labels Sep 24, 2019
@jspisak
Contributor

jspisak commented Sep 26, 2019

@Belkharym - glad to connect with you to chat through this before jumping into the technical details. Can you send me an email at jspisak@fb.com so we can find a time?

@vincentqb vincentqb added needs research We need to decide whether or not this merits inclusion, based on research world module: cuda Related to torch.cuda, and CUDA support in general module: backend non-standard backend support and removed triage review module: cuda Related to torch.cuda, and CUDA support in general labels Sep 30, 2019
@dylanbespalko
Contributor

Hello,

Xilinx recently released Vitis High Level Synthesis which supports FPGA programming using C++ and Verilog. I have added a proof-of-concept with relevant links here.

I am hoping to work with both:

  1. PyTorch, to enable OpenCL and FPGA support.
  2. Xilinx Vitis Libraries, to enable complex number support.

Please contact me if you need more information.

@jspisak
Contributor

jspisak commented Feb 2, 2020

thanks!! (and apologies for the delayed response). Is this something you would be willing to develop out of tree but promote as part of the PyTorch ecosystem? These are really cool projects, but we try hard to keep the core lean and as modular as possible.

btw, can you say more about the plans for complex number support? Are you, for example, planning to support quaternions?

@dylanbespalko
Contributor

Hi @jspisak,

Out-of-Tree Project

  • Yes this should absolutely remain out-of-tree.
  • I am using the DeviceType::FPGA device type, however there are two other FPGA solutions from Xilinx alone. Vitis is supposed to replace the other two, but it takes time.
  • Unlike the CPU/GPU, I had to use a config file to reduce the build time by:
    • Specifying which kernels to build (BinaryOps, SpectralOps, ReduceOps, etc).
    • Specifying which data types to build (int, float, double, std::complex).
  • However, FPGA development has a software emulation mode where you might be able to build every kernel simultaneously and run all of the unit tests.

In-Tree Changes (minimal)

  • Sometimes there are assertions that block execution on the FPGA.
  • I will submit a few PRs to fix these as they block me.

Support for Quaternions (Higher-Order Spaces).

  • Yes, Vitis can support array data types as long as the combined data does not exceed the memory bit width (typically 512 bits on servers).
  • E.g. 512-bit width = 32-bit float × 16 dimensions (the maximum number of dimensions at 32-bit precision).
  • E.g. 512-bit width = 128-bit float × 4 dimensions (the maximum precision for quaternions).
  • I have implemented a generalization of Vec256 (from PyTorch) called Vec which allows for this flexibility using C++ class templates.
  • I will write a blog about this next week.

Promotion in the PyTorch Ecosystem.

  • The software license is the same as PyTorch's.
  • I'm not looking to do this for profit.
  • I am always looking for jobs in the San Francisco Bay Area related to Radio Frequency or Optical communications.

@gchanan
Contributor

gchanan commented Feb 18, 2020

out-of-tree sounds right and we are happy to accept fixes for the assertions that break you.

@gchanan gchanan removed the needs research We need to decide whether or not this merits inclusion, based on research world label Feb 18, 2020
@tataetae

Hi @jspisak @dylanbespalko,
I am a graduate researcher in a group focusing on accelerating machine learning on hardware. We have built an int16 inference convolution kernel on a Xilinx FPGA (Alveo U250) and have some ideas about integrating it into PyTorch. We think it would be really cool to have a PyTorch convolution inference layer that runs on an FPGA, where you can just call .fpga() on the layer to have it run on the FPGA. Since PyTorch already has support for quantized data structures, we could have retraining and quantization done entirely in PyTorch.

Would you mind if we discussed implementing convolution inference on FPGAs in PyTorch in more depth? Thanks!

@dylanbespalko
Contributor

Would you mind if we discussed implementing convolution inference on FPGAs in PyTorch in more depth? Thanks!

@tataetae,

I have implemented the very basic math kernels here. Development is ongoing, but I think I have all binary functions (e.g. a + b) and unary functions (e.g. sin(a)) covered. I have been testing on the Alveo U200 card, but I'm just using the sw_emu and hw_emu modes for now.

As for my future development:

  • I mostly work on embedded FPGAs (Xilinx Zynq, Versal)
  • I implement non-neural network functions.
  • I currently just use autograd.
  • I am using floating-point precision until I can decide what to do about fixed precision.

If you would like to develop in my repo, send me your GitLab ID and tell me when I need to clean up my act.

@tataetae

@dylanbespalko sure, I am more than happy to talk about this. Is there a way to DM or email you?

@dylanbespalko
Contributor

@dylanbespalko sure, I am more than happy to talk about this. Is there a way to DM or email you?

You can register for pytorch.slack.com and find me as Dylan Bespalko. Or you can email me here.

I am writing a blog right now that outlines the project status and how to contribute.

@dylanbespalko
Contributor

@tataetae,

I have posted a tutorial on my work integrating PyTorch with Xilinx Vitis/Vivado:
pytorch-for-fpga-part-1-heterogeneous-processing
pytorch-for-fpga-part-2-basic-fpga-optimizations
pytorch-for-fpga-part-3-advanced-fpga-optimizations
pytorch-for-fpga-part-4-deploying-pytorch-kernels

  • I still need to make some in-tree changes to PyTorch (See PyTorch WIP: Add FPGATensorId for Xilinx Vitis Devices #32920)
  • I anticipate that I will be developing math kernels for the FPGA for another two months.
  • There is additional work for calling multiple math operations. This can be done in two ways:
    1. Registering a new top-level math kernel that calls sub-kernels.
    2. Exporting the PyTorch graph to a Xilinx .cfg file (may require changes to the PyTorch JIT).

I have no stress in my life, so I'm going to develop a bunch of math kernels and then export the PyTorch graph.

@dylanbespalko
Contributor

dylanbespalko commented Feb 25, 2020

@jspisak, @anjali411, @ezyang

btw, can you say more about the plans for complex number support? Are you, for example, planning to support quaternions?

Here is an update:

  • I have blogged about how to deploy math kernels for real numbers.
  • I have generalized the code to work with complex and quaternions.

Here are some issues:

The last issue is very scary for me. Please vote up the issue.


7 participants