This repository has been archived by the owner on Oct 27, 2022. It is now read-only.
/ 1st-CLaaS Public archive
forked from yeshuj/1st-CLaaS

Developing Smith Waterman accelerators on F1 instances using 1st CLaaS



1st CLaaS

The 1st CLaaS Framework

CLaaS: Custom Logic as a Service

1st-CLaaS header

- Unleashing the 3rd wave of cloud computing!

  • Wave 1: CPUs
  • Wave 2: GPUs
  • Wave 3: FPGAs

Having FPGAs (Field-Programmable Gate Arrays) available in the data center presents enormous potential for new and exciting compute models. But for this emerging ecosystem to thrive, we need infrastructure to develop custom hardware accelerators for these platforms and integrate them with web applications and cloud infrastructure. The 1st CLaaS Framework brings cloud FPGAs within reach of the open-source community, startups, and everyone.


This README provides an overview of the project. The following documents are also available. After this README, get hands-on with Getting Started.

For local development:

For optimization and deployment of your custom kernel using AWS F1 with Xilinx tools.

FPGA-Webserver Project Overview


With 1st CLaaS, you can stream bits directly to and from your custom FPGA kernel using standard web protocols (WebSockets or REST). In the simplest use case, all software is client-side in the web browser, and all server logic and data storage is implemented in the FPGA. Your kernel uses a very simple interface to stream the data, and it can be developed in Verilog (or any language compilable to Verilog).

1st CLaaS is ideal for implementing functions in hardware that are compute-limited but tolerant of internet latency and bandwidth. Applications requiring a more sophisticated partitioning of responsibilities can extend the host C++ code or Python web server to process data between the web application and FPGA.

Possible application domains might include:

  • voice/image processing/filtering
  • bioinformatics
  • simulation
  • pattern matching
  • machine learning
  • etc.

Your application might be:

  • a web application that includes a hardware-accelerated function
  • a hardware project controlled via a web interface
  • a healthy mix of custom hardware and software

1st CLaaS supports hardware kernel development using free and open-source tools on Linux (currently Ubuntu and CentOS). Deployment is currently targeted to Amazon's F1 FPGA instances. We welcome contributions to extend 1st CLaaS to other platforms and operating systems.

Project Components

A hardware accelerated web application utilizing this framework consists of:

  • Web Client Application: a web application, or any other agent communicating with the accelerated server via web protocols.
  • Web Server: a web server written in Python using the Tornado library.
  • Host Application: the host application written in C/C++/OpenCL which interfaces with the hardware.
  • FPGA Shell: FPGA logic provided by this framework that builds upon and simplifies the shell logic provided by Xilinx.
  • Custom Kernel: The application-specific FPGA logic.


Data is transmitted from the Web Client Application in chunks of 512 bits (currently). JavaScript calls a send method and receives data from the Custom Kernel in a callback. The Custom Kernel has a simple streaming interface with a 512-bit bus for input data and a 512-bit bus for output data. Data travels from JavaScript:

  • as JSON (currently) via WebSocket to Web Server,
  • as JSON (currently) via a UNIX socket to Host Application,
  • as chunks via OpenCL memory buffers to FPGA Shell,
  • as streamed chunks to Custom Kernel,
  • and back, in similar fashion.
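
The chunking step in this path can be illustrated with a minimal Python sketch. It only shows how a byte payload might be split into 512-bit (64-byte) chunks and carried as JSON. The function names and the `{"chunks": [...]}` message layout are hypothetical and are not the framework's actual wire format.

```python
import json
from typing import List

CHUNK_BITS = 512                # chunk width stated in this README
CHUNK_BYTES = CHUNK_BITS // 8   # 64 bytes per chunk


def to_chunks(payload: bytes) -> List[bytes]:
    """Split a payload into 64-byte chunks, zero-padding the last one."""
    chunks = []
    for start in range(0, len(payload), CHUNK_BYTES):
        chunk = payload[start:start + CHUNK_BYTES]
        chunks.append(chunk.ljust(CHUNK_BYTES, b"\x00"))
    return chunks


def encode_message(chunks: List[bytes]) -> str:
    """Encode chunks as JSON (hypothetical layout: a list of hex strings)."""
    return json.dumps({"chunks": [c.hex() for c in chunks]})


def decode_message(text: str) -> List[bytes]:
    """Inverse of encode_message."""
    return [bytes.fromhex(h) for h in json.loads(text)["chunks"]]


if __name__ == "__main__":
    sent = to_chunks(b"x" * 100)            # 100 bytes need two chunks
    received = decode_message(encode_message(sent))
    print(len(received), [len(c) for c in received])
```

Hex strings are used here only because JSON has no native binary type. Any text-safe encoding would serve the same purpose in a sketch like this.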

Communication performance is not currently the focus. Applications that are well suited to this architecture are inherently compute-limited, so optimizing communication is often unimportant, but the implementation can be optimized as the need arises.

In the simple case, you provide only the green components in the diagram above, and all custom processing of data is performed by the Custom Kernel. But the C++/OpenCL Host Application and/or Python Web Server can be extended as desired.
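
As a rough illustration of that second option, the sketch below wraps a software stand-in for the Custom Kernel with a pre-processing and a post-processing step, the way an extended server stage might. Everything here is an assumption made for illustration: `kernel_stand_in`, `with_server_stage`, and the byte-reversal steps are hypothetical and are not part of the framework's API.

```python
from typing import Callable, List

Chunk = bytes  # one 512-bit (64-byte) chunk


def kernel_stand_in(chunk: Chunk) -> Chunk:
    """Hypothetical software stand-in for a Custom Kernel: inverts every bit."""
    return bytes(b ^ 0xFF for b in chunk)


def with_server_stage(
    kernel: Callable[[Chunk], Chunk],
    pre: Callable[[Chunk], Chunk],
    post: Callable[[Chunk], Chunk],
) -> Callable[[List[Chunk]], List[Chunk]]:
    """Build a pipeline that applies pre, then the kernel, then post, per chunk."""
    def pipeline(chunks: List[Chunk]) -> List[Chunk]:
        return [post(kernel(pre(c))) for c in chunks]
    return pipeline


def reverse_bytes(chunk: Chunk) -> Chunk:
    """Example server-side step: reverse the byte order of a chunk."""
    return chunk[::-1]


if __name__ == "__main__":
    run = with_server_stage(kernel_stand_in, reverse_bytes, reverse_bytes)
    out = run([bytes(range(64))])
    print(out[0][:4].hex())
```

Passing identity functions for `pre` and `post` corresponds to the simple case, where only the kernel touches the data.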

Streamlining F1

Prior to this project, integrating FPGA hardware acceleration with web and cloud applications was a daunting undertaking requiring:

  • a full-stack developer
  • a software engineer
  • a domain expert
  • an IaaS expert
  • a hardware designer

By providing the web server, host application code, and kernel shell logic to stream the data between web application and FPGA kernel as well as automating cloud instance creation and configuration, 1st CLaaS reduces your work to:

  • web development, and
  • logic design

1st-CLaaS header [CC BY-SA 2.0, LuMaxArt, modified]

Infrastructure development overhead is reduced from several person-months down to hours.

Looking specifically at the Amazon F1 platform, F1 provides powerful Xilinx FPGAs and Xilinx development tools on a pay-per-use basis, which is quite compelling. But the platform is bleeding edge and requires significant expertise to utilize. Our experience with this platform has been a rather painful (and somewhat expensive) one for several reasons:

  • Documentation is often misleading as APIs and infrastructure are evolving.
  • External dependencies are poorly managed, so tutorials break at random.
  • Xilinx tools, Vivado and SDAccel, while powerful, are difficult to learn and use, slow, and arcane.
  • OpenCL is a whole other beast, built for folks who want to design hardware like it's software... which it obviously isn't.
  • Developers must understand AXI protocols and manage AXI controllers.
  • The AWS platform can be intimidating to a newcomer.

We had to go through this pain, but we bundled our work so you wouldn't have to.

To further streamline development, reduce cost, and avoid any dependency on the F1 platform and Xilinx tool stack, we support development on your local machine where the kernel is emulated with RTL simulation using the Verilator open-source RTL simulator. AWS and Xilinx tools are only required for kernel optimization and deployment. As an added bonus, Verilator simulation runs significantly (~100x?!) faster than simulation using the Xilinx "hardware emulation flow," partly because Verilator is fast and partly because we simulate only the Custom Kernel, not including the shell logic surrounding the kernel.

Reducing the problem to web and RTL development is not the finish line for us. 1st CLaaS is part of a broader effort to redefine the silicon industry and bring silicon to the masses. Getting past the complexities of RTL modeling is part of that. 1st CLaaS is driven by avid supporters of TL-Verilog, in association with Redwood EDA. TL-Verilog introduces a much-needed digital circuit design methodology shift with simpler and more powerful modeling constructs. 1st CLaaS is in no way tied to TL-Verilog. You can use Verilog/SystemVerilog or any hardware description language that can be turned into Verilog. But TL-Verilog language extensions are supported out of the box, and we strongly encourage you to take advantage of them and help us drive this innovation forward. Redwood EDA provides a free, online IDE for TL-Verilog development. You can find training materials in the IDE. Read the more-complete story from Redwood EDA founder, Steve Hoover.

Use Cases

Commercially, 1st CLaaS is used by ThroughPuter, Inc. to provide dynamic classification of streaming data.

1st CLaaS renders realtime fractals for


This repository is generally working, and the initial development push is winding down.

Anything and everything is subject to change at this point, especially with respect to the interface provided by the framework for applications to build upon. So, you should build on a specific version of the framework and expect to do some debugging if you choose to upgrade by pulling from master.

Main Contributors

Related Technologies

  • We are considering a unification with Fletcher.

Acclaim and Further Information


Hmmmm... We haven't given that much thought yet. Just say something nice, and we'll be happy.


All trademarks cited within this repository are the property of their respective owners.

