Projects

FastClick has kept evolving since the original ANCS paper. For example, the research package contains ongoing work on multi-threading elements, while the GitHub projects page contains ideas for future improvements.

Academic work using FastClick as its dataplane or as a prototype/demonstrator

2023 - ShRing shares each Rx ring among several cores when networking memory bandwidth consumption is high. Consequently, ShRing increases the throughput of NFV workloads by up to 1.27x.
2023 - LemonNFV loads NFs into a single process down to the binary level, schedules them using intercepted I/O, and isolates them with the help of a restricted memory allocator. Experiments show that LemonNFV can consolidate 5 complex NFs without modifying the native code.
2023 - Ribosome uses a P4 switch to store payloads in the unused memory of servers through RDMA while only headers are processed, enabling a terabit-worth of data analysis on a single server.
2023 - Impact of IOTLB: "Overcoming the IOTLB wall for multi-100-Gbps Linux-based networking" studies the impact of the IOTLB using FastClick.
2022 - The Benefits of General-Purpose On-NIC Memory is a paper published at ASPLOS'22 that uses FastClick as a prototype to show how much using NIC memory can improve performance.
2022 - Gallium analyses a Click pipeline to run some parts on a P4 switch. Gallium saves 21-79% of processing cycles and reduces latency by about 31% across various software middleboxes.
2022 - Packet Order Matters uses FastClick to reorder packets within a small microsecond-scale window, which increases temporal and spatial locality and improves server efficiency.
2022 - Morpheus applies run-time optimizations to FastClick to avoid trampolines and streamline the binary for heavy hitters.
2021 - Inference of virtual network functions’ state via analysis of the CPU behavior fingerprints FastClick's memory accesses (among others) to infer NF state.
2021 - IAT dynamically chooses the tenants that share LLC resources with DDIO to minimize the performance interference caused by both the tenants and the I/O. It is demonstrated using FastClick and Redis.
2021 - PacketMill was presented at ASPLOS'21. It boosts the performance of network packet processing frameworks by melting the driver and the pipeline together.
2021 - ConnTrack uses FastClick as a prototype to compare many hash-table implementations in a connection tracking environment. The results of that paper are still to be integrated into FastClick.
2020 - CrossRSS uses FastClick for its simulations.
2020 - ddio-bench uses FastClick and Metron to measure the performance of Data Direct I/O (DDIO) technology in different scenarios. The paper.
2020 - Cheetah is a load balancer presented at NSDI'20 that solves the challenge of remembering which connection was sent to which server without the traditional trade-off between uniform load balancing and efficiency. Cheetah is up to 5 times faster than stateful load balancers and can support advanced balancing mechanisms that reduce the flow completion time by a factor of 2 to 3 without breaking connections, even while adding and removing servers.
2020 - P4SFC offloads portions of NFV chains to a P4 switch. Some parts may still need to run in FastClick, so we list it here.
2019 - RSS++ performs load- and state-aware intra-server load balancing. RSS++ works by tweaking the NICs' RSS indirection tables. To do so, RSS++ monitors the load of each RSS bucket and solves an optimization problem to re-assign RSS buckets to different CPU cores. Moreover, RSS++ proposes a state migration algorithm to avoid synchronization problems when rebalancing. RSS++ can load-balance either FastClick applications or any socket application, by attaching BPF code to XDP and using ethtool to change the indirection table. This is NOT Click in kernel mode; it only uses standard APIs to communicate with the kernel. In our ACM CoNEXT 2019 paper we demonstrated how RSS++ evenly balances 100 Gbps of iperf traffic across 12 CPU cores of an 18-core server, while a regular RSS-based iperf server requires the entire processing capacity (i.e., 18 cores). RSS++ not only saves 6 CPU cores (a third of the server's capacity) but also reduces the tail latency of the iperf server by 4.5x. More results are available in our paper.
2018 - MiddleClick is now merged into FastClick and focuses on providing efficient flow-based service chaining. It was published in ToN in 2021, after an invited paper in 2018.
2018 - Beamer uses FastClick for its high-speed stateless load-balancer dataplane implementation. The paper.
2018 - Metron is another FastClick branch that combines a DPDK-based FastClick agent with a controller, based on ONOS and the ONOS server device drivers. Metron (i) offloads service chains' classification operations into programmable hardware (i.e., OpenFlow switches and/or server network cards) and (ii) performs accurate packet dispatching (using tags) to eliminate inter-core communication. Metron achieves ultra efficient packet processing at the speed of the underlying hardware. In our USENIX NSDI 2018 paper we demonstrated how Metron realizes deep packet inspection at 40 Gbps (on top of an OpenFlow switch and 4x10 GbE Intel NICs) and stateful service chains at the speed of a 100 GbE Mellanox ConnectX-4 NIC. A follow-up paper was published in ACM ToCS in 2021 including support for scaling blackboxes.
2016 - SplitBox uses FastClick for its privacy-preserving firewall dataplane.
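
To make the RSS++ entry above (2019) more concrete, here is a minimal sketch of the general idea of re-assigning RSS buckets to cores based on measured per-bucket load. It is only an illustration under stated assumptions: the function name `rebalance`, its parameters, and the greedy heuristic are hypothetical and are not taken from the RSS++ paper or code, which solves an optimization problem and also handles state migration. The sketch only computes a new bucket-to-core table; applying it to a NIC is out of scope.

```python
from typing import List


def rebalance(table: List[int], bucket_load: List[float], n_cores: int,
              max_moves: int = 1000) -> List[int]:
    """Return a new indirection table (bucket index -> core id).

    Hypothetical greedy heuristic, not the RSS++ optimizer: repeatedly move
    one bucket from the most loaded core to the least loaded core, as long
    as the move narrows the load gap between those two cores.
    """
    new_table = list(table)  # work on a copy, leave the input untouched
    core_load = [0.0] * n_cores
    for bucket, core in enumerate(new_table):
        core_load[core] += bucket_load[bucket]

    for _ in range(max_moves):  # safety cap on the number of migrations
        hot = max(range(n_cores), key=core_load.__getitem__)
        cold = min(range(n_cores), key=core_load.__getitem__)
        gap = core_load[hot] - core_load[cold]
        # Moving a bucket of load w from hot to cold turns the gap into
        # |gap - 2w|, so only buckets with 0 < w < gap improve the balance.
        candidates = [b for b, c in enumerate(new_table)
                      if c == hot and 0 < bucket_load[b] < gap]
        if not candidates:
            break
        # Prefer the bucket whose load is closest to half the gap.
        best = min(candidates, key=lambda b: abs(bucket_load[b] - gap / 2))
        new_table[best] = cold
        core_load[hot] -= bucket_load[best]
        core_load[cold] += bucket_load[best]
    return new_table
```

For example, with eight buckets where all the load initially lands on core 0, `rebalance([0, 0, 0, 0, 1, 1, 1, 1], [40, 30, 20, 10, 0, 0, 0, 0], 2)` moves two buckets to core 1 so that both cores end up with the same total load. A real deployment would also want to limit how many buckets move per round, since every move implies migrating flow state.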

Academic work using FastClick only as a point of comparison

Contrary to the previous section, these papers only run FastClick as a point of comparison.

2021 - Comparing the performance of state-of-the-art software switches for NFV compares FastClick with other frameworks such as VPP and BESS. VPP generally shows lower performance. BESS is generally faster, but also has fewer features. This difference is now compensated for by PacketMill's improvements.
OpenPATH is a service chaining framework that analyses dependencies between NFs to enable parallelism. It compares with VPP, BESS and FastClick and performs better as it uses a different I/O virtualization mechanism, likely orthogonal to the performance of the NFs themselves.
mmb is an arXiv paper that compares to FastClick in several places as part of the evaluation of its technique to build a stateless and stateful matching algorithm. Forwarding performance is similar. The firewall performs worse in Click because the algorithm used in Click is an LPM, as stated in 4.3.2, whereas the authors actually use a 5-tuple match, which is a very different, much more limited kind of matching. Recent work on FlowIPManager+FlowIPNAT enables "thread-safe" and much more efficient multi-threading than IPRewriter. However, per-thread NAT is still a better approach; recent NICs allow it because they can ensure that returning packets go to the right core (symmetric 2-tuple RSS or port range classification).

As a packet generator

Using FastClick as a packet generator deserves its own category. The wiki contains information about advanced use cases for building a fit-for-purpose packet generator.

2020 - Letting off STEAM: Distributed Runtime Traffic Scheduling for Service Function Chaining

Complete citation list

The complete list of publications mentioning FastClick can be found in Google Scholar's FastClick citations.

Companies using (Fast)Click

At least three companies use Click internally, sometimes with parts of FastClick, but none states so officially. More companies very likely use it.
