High-Performance Data Pipelines
Our goal is to optimize .NET much more for scenarios in which inefficiencies are directly tied to your monthly billing. With ASP.NET Core, we've already improved significantly and are now in the top 10 for the plain text benchmark. But we believe there is still a lot more potential that we could tap into.
Our primary area of focus for CoreFxLab in 2018 continues to be improving performance for data-pipeline-oriented apps. That covers pretty much any cloud app, because the request-response pattern is fundamentally a data pipeline.
Consider an example of a typical web request:
Most cloud apps are parallelized by running multiple requests at the same time while each request is often executed as a single chain. This results in the picture above where all components are daisy-chained. That means slowing down one component will slow down the entire request.
An important metric for a cloud app is how many requests per second (RPS) it can handle. That's important because the load is (usually) outside the control of the app author. So the fewer requests your app can handle, the more instances of your app you need in order to satisfy the demand, which basically means the more machines you need to pay for.
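The relationship between RPS and machine count can be made concrete with a small calculation. This is only a sketch: the `CapacityEstimate` class and its method name are hypothetical, and it assumes load is spread evenly across identical instances.

```csharp
using System;

// Hypothetical helper, not part of any .NET library.
public static class CapacityEstimate
{
    // Smallest number of instances whose combined throughput covers the peak load,
    // assuming every instance sustains the same requests per second.
    public static int InstancesNeeded(double peakRequestsPerSecond, double rpsPerInstance)
    {
        if (rpsPerInstance <= 0)
            throw new ArgumentOutOfRangeException(nameof(rpsPerInstance));

        return (int)Math.Ceiling(peakRequestsPerSecond / rpsPerInstance);
    }
}
```

For a peak of 10,000 RPS, an app that sustains 1,200 RPS per instance needs 9 instances, while one that sustains 12,000 RPS needs a single instance.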
Also, consider the role of the framework. For the most part, your app code is represented by the green box above, while the blue and red parts are usually components provided by the framework. While you can optimize your app code, your ability to reduce the overhead in the framework-provided pieces is limited.
That's why many people rely on benchmarks to assess the potential of a given web framework. It's important to keep in mind that benchmarks are by definition gross simplifications of real-world workloads, but they are often considered a good indicator of the theoretical best a given framework can do for you once virtually all overhead specific to your app is removed.
Application performance can be directly mapped to hosting dollars, and for companies both large and small, hosting costs can be a pain point.
What if building an application on one framework meant that at the very best your hardware is suitable for one tenth as much load as it would be had you chosen a different framework?
What does high-performance mean?
It might not be the best term, but for the context of cloud apps the property we're seeking to improve is scalability:
Scalability is the capability of a system, network, or process to handle a growing amount of work.
Many areas affect scale, but an efficient request/response pipeline is key as it's the backbone for all cloud solutions.
Other investments (faster GC, better JIT, AOT) aren't a replacement, but will provide additional benefits.
Current areas of concern
If we look at the .NET Stack, in particular the BCL, there are a few areas where we could do much better:
- String is UTF-16 but networking is UTF-8, forcing translations
- Buffers are often defensively copied, slowing down operations and increasing allocations
- Buffers are often not pooled, causing fragmentation and GC pressure
- Interop with native code often creates additional buffers to avoid passing around raw pointers
- Async streaming forces pre-allocation of buffers, causing excessive memory usage
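Two of the concerns listed above, transcoding and unpooled buffers, can be illustrated with APIs that already exist in the BCL. The following is a minimal sketch rather than a recommended pattern; the `PooledUtf8` class and its `Encode` method are hypothetical names introduced for illustration.

```csharp
using System;
using System.Buffers;
using System.Text;

// Hypothetical helper, shown only to illustrate pooling plus transcoding.
public static class PooledUtf8
{
    // Transcodes a UTF-16 string to UTF-8 into a rented buffer, hands the bytes
    // to the caller, and returns the buffer to the pool afterwards.
    // Returns the number of UTF-8 bytes written.
    public static int Encode(string text, Action<ArraySegment<byte>> consume)
    {
        // Worst-case size; Rent may hand back a larger array than requested.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(Encoding.UTF8.GetMaxByteCount(text.Length));
        try
        {
            int written = Encoding.UTF8.GetBytes(text, 0, text.Length, buffer, 0);
            consume(new ArraySegment<byte>(buffer, 0, written));
            return written;
        }
        finally
        {
            // The segment must not be used after this point.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```

Renting avoids a fresh allocation per call, but the transcoding step itself remains, which is the cost that a UTF-8-native string type would aim to remove.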
Our goal is to reduce the number of allocations for basic operations such as parsing and encoding, to provide more efficient buffer management that handles managed and native memory uniformly, and to offer a programming model that makes the result easy to use without losing efficiency.
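As an illustration of allocation-free parsing and encoding, the sketch below uses `Utf8Parser` and `Utf8Formatter` from `System.Buffers.Text` to read and write an integer directly as UTF-8 bytes, without creating an intermediate string. The wrapper class `Utf8Numbers` is a hypothetical name used only for this example.

```csharp
using System;
using System.Buffers.Text;

// Hypothetical wrapper around System.Buffers.Text, for illustration only.
public static class Utf8Numbers
{
    // Parses a leading Int32 from UTF-8 bytes; bytesConsumed reports how many
    // bytes made up the number, so the caller can continue after it.
    public static bool TryParseInt32(ReadOnlySpan<byte> utf8, out int value, out int bytesConsumed)
        => Utf8Parser.TryParse(utf8, out value, out bytesConsumed);

    // Formats an Int32 as UTF-8 digits into a caller-provided buffer.
    // Returns the number of bytes written, or -1 if the buffer is too small.
    public static int FormatInt32(int value, Span<byte> destination)
        => Utf8Formatter.TryFormat(value, destination, out int written) ? written : -1;
}
```

Because both methods work on spans, the same code can run over a managed array, a stack-allocated buffer, or native memory.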
Other components of the .NET stack (such as MVC, Razor Pages and Serialization) will rewire their implementation in order to take advantage of the efficiency gains provided by these new APIs.
CoreFxLab Roadmap Updates (last updated December 12, 2018)
As part of .NET Core 2.1, Span<T>, Memory<T>, and Pipelines have shipped. As such all future development on these primitive types will continue in the corefx repo and will no longer be a part of the experimental CoreFxLab repository.
With the release of .NET Core 3.0 Preview 1, System.Device.Gpio feature development has moved from corefxlab to the new open source repo https://github.com/dotnet/iot. This new repo contains device drivers for Linux and Windows 10 IoT Core RS5, as well as new device bindings for important sensors, motors, and displays.
Buffer Reader has also graduated from the corefxlab prototype-phase and now ships in .NET Core 3.0 as System.Buffers.SequenceReader from the https://github.com/dotnet/corefx repo. Prototype development of BufferWriter currently continues in corefxlab.
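To give a feel for the graduated reader, here is a small sketch that splits a byte sequence into lines with `SequenceReader<byte>`. It assumes the API shape that ships in .NET Core 3.0; the `LineSplitter` class and its method are hypothetical names.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper built on System.Buffers.SequenceReader<T>.
public static class LineSplitter
{
    // Returns every '\n'-terminated line in the input, decoded as UTF-8.
    // 'consumed' reports how many bytes were read, so a caller can keep the
    // unterminated remainder for the next read.
    public static List<string> ReadCompleteLines(ReadOnlySequence<byte> input, out long consumed)
    {
        var lines = new List<string>();
        var reader = new SequenceReader<byte>(input);

        // TryReadTo leaves the reader where it was if no delimiter is found.
        while (reader.TryReadTo(out ReadOnlySpan<byte> line, (byte)'\n'))
        {
            lines.Add(Encoding.UTF8.GetString(line));
        }

        consumed = reader.Consumed;
        return lines;
    }
}
```

The strings are created here only to keep the example easy to inspect; a real pipeline would typically keep working on the spans.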
The high-performance System.Text.Json.Utf8JsonReader has moved to the https://github.com/dotnet/corefx repo as part of .NET Core 3.0 Preview 1. Please see the JSON Announcement, the JSON Roadmap, and the Future of JSON Discussion for more details on the rest of the System.Text.Json features planned for .NET Core.
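A minimal sketch of reading JSON directly from UTF-8 bytes with `Utf8JsonReader` follows. It assumes the constructor and method shapes from the final .NET Core 3.0 release, which may differ from Preview 1; the `JsonScan` class is a hypothetical name.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper, for illustration only.
public static class JsonScan
{
    // Walks the JSON tokens once and adds up every number that fits in an Int32,
    // without building a document model or decoding the payload to a string.
    public static int SumInt32Values(ReadOnlySpan<byte> utf8Json)
    {
        var reader = new Utf8JsonReader(utf8Json);
        int sum = 0;

        while (reader.Read())
        {
            if (reader.TokenType == JsonTokenType.Number && reader.TryGetInt32(out int n))
            {
                sum += n;
            }
        }

        return sum;
    }
}
```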
Next Wave of Experimental CoreFxLab APIs
The following areas of prototype experimentation are our primary focus in the near term.
- Utf8String: The cloud and the web are heavily based on UTF-8. .NET's String and character types are ill-suited to this world because they require transcoding back and forth between UTF-8 and UTF-16 and can use up to 2x as much memory for ASCII-heavy text.
- BufferWriter: analogous to TextWriter but optimized for UTF-8 and byte buffer scenarios instead of UTF-16 text.
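The memory cost of UTF-16 mentioned for Utf8String can be checked with the existing `Encoding` APIs. This is a small sketch; the `EncodingSizes` class is a hypothetical name, and it measures encoded payload size only, not object headers.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class EncodingSizes
{
    // Returns the number of bytes the same text occupies as UTF-8 and as UTF-16.
    public static (int Utf8Bytes, int Utf16Bytes) Measure(string text)
        => (Encoding.UTF8.GetByteCount(text), Encoding.Unicode.GetByteCount(text));
}
```

For pure ASCII text such as "hello", UTF-8 needs 5 bytes and UTF-16 needs 10. The gap narrows for text with many non-ASCII characters, which take two or more bytes in UTF-8.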