
SAS ESP integration with In-memory Databases for State and Data Persistence

Contents

  1. Overview
  2. Understanding SAS Event Stream Processing in Kubernetes
  3. Challenges
  4. Introduction to ESP StateDB Windows
  5. When to use external in-memory databases with ESP
  6. Reference Architectures
  7. Benefits of Using ESP StateDB Windows
  8. Performance Evaluation
  9. Sample use-cases to understand migration to StateDB windows
  10. Demos
  11. Additional References

Overview

Have you ever been in a situation where the internal memory of SAS Event Stream Processing (ESP) was not sufficient to store all the data? Have you faced a time-consuming recovery of ESP from a failure due to the recreation of the internal state? Have you thought of having an external, virtually unlimited in-memory database to work around the limits of internal memory? If you answered yes to any of these questions, you are in the right place.

In ESP, indexes are used to store the state of the windows in the ESP XML model. Indexes are saved in the internal memory of the ESP server pod, which is assigned at deployment time. The following windows maintain state in the internal memory:

  1. An ESP Aggregate Window is used for performing aggregation over a defined retention period, which can be long or short.

  2. An ESP Join Window is used to perform a lookup against whitelist/reference data.

These windows store the aggregated data and the result of the join operation in the internal memory. This works well when the join is performed over small reference data and the aggregation retention period is short.

In situations where we need to compute aggregated data over a long retention period, or find a record among millions of records (reference data), we cannot continue to use the internal memory. In addition, each ESP server pod has to maintain its own copy of the data in memory, and that state is lost upon server failure. Restoring the state can be a time-consuming process when a huge volume of data is involved. Furthermore, the reference data, which could be millions of records (user data), must be uploaded to the internal memory of every ESP server pod. When pods scale, considerable time is spent uploading the reference data into memory, which delays the processing of events by the new ESP server pod.

In this project, we present two new ESP StateDB windows, StateDB Reader and StateDB Writer, that integrate with the Singlestore and Redis in-memory databases. We show how they solve the issues mentioned above. You will also learn about the various benefits of this new architecture for persisting state and data using in-memory databases.

NOTE: We provide integration with Singlestore and Redis in-memory databases only.

Understanding SAS Event Stream Processing in Kubernetes

Before jumping into the introduction and architecture of StateDB windows, it is important to understand how ESP runs in Kubernetes.

According to the new ESP paradigm:

  1. Each ESP server pod runs one and only one ESP XML Project.
  2. Each ESP server has its dedicated resources, i.e., CPU and Memory. These resources are provided in the deployment settings when deploying the project from SAS Event Stream Manager.
  3. ESP server pods scale based on the CPU and memory metrics to handle the incoming events. The auto-scaling parameters, i.e., max and min replicas are also provided in the deployment settings.
  4. ESP XML Projects can be stateless or stateful.

Challenges

In ESP, join, aggregation, and rolling aggregation operations are performed with very high throughput and very low latency. The data, also known as the state, maintained in join or aggregation windows is stored in the internal RAM of the ESP server pods. In some use cases, if the volume of state is huge, relying on internal memory can be very expensive.

Moreover, ESP works on a distributed, scalable architecture where multiple interconnected nodes host the ESP server pods. At any point, a node can crash, which in turn crashes all the pods running on that node. It is important to ensure the reliability, fault tolerance, and failure recovery of the ESP server pods. With stateful ESP XML projects, it becomes even more challenging to handle and manage the state maintained in the internal memory of the ESP server pods in the event of a node or pod failure.

The following limitations apply when using ESP internal memory for memory-intensive operations such as joins/lookups over millions of records (reference data), or simple, multiple, or rolling aggregations with a long retention period and a high volume of data.

  1. Limited RAM: The RAM assigned to ESP is limited and cannot be scaled vertically while the ESP server pod is running. As a result, we can hold only as much data as the internal memory allows.
  2. Slow and time-consuming pre-load of data: For a high volume of reference data, which can run to millions of records, pre-loading all these records into the ESP server pods can be very slow and expensive.
  3. Reprocessing of events in case of failure: After an ESP failure, we have to reprocess a large number of events to get back to the same state.
  4. Slow auto-scaling due to pre-loading of RAM: For auto-scaling stateful models, whenever a new ESP server pod joins, it must be pre-loaded with the reference data. This pre-load takes time before the ESP server can start processing incoming events.

These limitations are not acceptable in many situations, as reprocessing, pre-loading, and limited RAM add latency and reduce throughput.

Introduction to ESP StateDB Windows

To overcome the above challenges and limitations, we integrate ESP with external in-memory databases to ensure the reliability and failure recovery of XML projects along with state and data persistence.

Here we introduce the ESP StateDB windows, which are capable of communicating with the Singlestore and Redis in-memory databases. These windows can read and write events, reference/whitelist data, and the results of aggregation operations, and can also perform lookups. Additionally, ESP XML models can now be designed to be stateless, as all the state and data are persisted in the in-memory databases.

Let’s look into the details of the ESP StateDB windows.

Deep Dive into the Architecture of StateDB Windows

In this section, we will learn about the new ESP StateDB Windows. We have introduced two different ESP windows that integrate with the high-performance, low-latency in-memory databases Singlestore and Redis, as shown in Figure 1.

The two windows are:

  1. ESP StateDB Writer Window
  2. ESP StateDB Reader Window


Figure 1: ESP Integration with Singlestore and Redis via ESP StateDB Windows

Let’s take a deep dive into the architectural design of these windows and learn how they communicate with Singlestore and Redis.
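To make the division of labor concrete, here is a minimal Python sketch of the equivalent logic outside ESP, using the redis-py client. The key layout (one hash per entity, keyed by a field value), host name, and event schema are illustrative assumptions, not the windows' actual internals.

```python
# pip install redis
# A minimal stand-in for what the two windows do conceptually:
# the writer persists each event under a key derived from a field,
# and the reader looks that state up to enrich a new event.
# Host, key naming, and schema are illustrative assumptions.
import redis

r = redis.Redis(host="redis.example.local", port=6379, decode_responses=True)

def statedb_write(event: dict) -> None:
    """Persist an event keyed by its 'device_id' field (writer role)."""
    r.hset(f"state:{event['device_id']}", mapping=event)

def statedb_read(device_id: str) -> dict:
    """Fetch persisted state to enrich an incoming event (reader role)."""
    return r.hgetall(f"state:{device_id}")

statedb_write({"device_id": "d42", "temp": "21.5", "status": "ok"})
print(statedb_read("d42"))
```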

When to use external in-memory databases with ESP

  1. High Volume of Data: When the reference/historic/dimension/whitelist data (whatever you call it) runs to many thousands or millions of records and we need to perform lookup operations, whether a single-record search or many, it makes sense to offload this high volume to external in-memory databases rather than keeping it in the ESP server pod's internal memory.

  2. Large Retention Periods: When we perform aggregation operations over a large volume of data with retention periods that can span days or months, consider using external in-memory databases. High retention periods mean the aggregated results must be kept in memory for a long time, and at such volumes we would likely run out of internal memory. External in-memory databases make the most sense when both high volume and high retention periods are present (see the retention sketch after this list).

  3. Failover: The best scenario for failover is when the ESP models are stateless. Failover is complex and time-consuming when the ESP server pod’s internal memory is used to maintain the state and store the data; in case of failure, both the state and the data in internal memory are lost. For quick recovery from failure, without any need to reprocess events to recreate the state, we must use external in-memory databases for persisting state and data. Then, whenever an ESP server pod crashes or fails, the pod that replaces it can fetch both the state and the data from the external in-memory database and resume.

  4. Scalability: We often encounter fluctuating workloads. To handle a fluctuating number of incoming events, ESP must auto-scale to sustain the same throughput. This is only possible when ESP is deployed in Kubernetes. Additionally, StateDB windows are essential for persisting the state and data. Both Singlestore and Redis allow concurrent access and can scale both horizontally and vertically if required.

  5. Join and Aggregation Operations: As previously mentioned, it makes sense to maintain the states of these operations in external in-memory databases. With this, scalable models can share states and data and perform operations in parallel without interfering with each other.
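To illustrate points 1 and 2, the following is a minimal Python sketch (redis-py assumed) of a rolling aggregation kept outside ESP: readings land in a Redis sorted set scored by event time, anything older than the retention period is trimmed, and the aggregate is computed over what remains. The key names and the 30-day retention are assumptions for the example.

```python
# pip install redis
# Rolling aggregation with a long retention period kept in Redis
# instead of ESP internal memory. Key names and the 30-day window
# are illustrative assumptions.
import time
import redis

r = redis.Redis(host="redis.example.local", port=6379, decode_responses=True)
RETENTION_SECONDS = 30 * 24 * 3600  # assumed 30-day retention window

def record_reading(sensor_id: str, value: float) -> float:
    """Store a reading, drop expired ones, and return the rolling average."""
    ts = time.time()
    key = f"readings:{sensor_id}"
    # Encode the timestamp with the value so identical values don't collapse.
    r.zadd(key, {f"{ts}:{value}": ts})
    # Retention: drop members older than the window.
    r.zremrangebyscore(key, "-inf", ts - RETENTION_SECONDS)
    values = [float(m.split(":", 1)[1]) for m in r.zrange(key, 0, -1)]
    return sum(values) / len(values)

print(record_reading("s7", 21.5))
```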

Limitations

There are two limitations that must be considered when using the architecture with the ESP StateDB windows.

  1. Increased Latency: Communication with the external in-memory databases happens over the network, which adds some latency, in most cases within an acceptable range. The speed of the databases can also limit throughput on the ESP server pods; this can be addressed with proper configuration of Singlestore and Redis. If very low latency is critical to your application, consider using ESP internal memory instead of StateDB windows. A quick round-trip probe is sketched after this list.

  2. No state/data management for Pattern, Geofence, MAS: In case of a failure, state and data for ESP Pattern, Geofence, and MAS windows will be lost. Currently, these windows have no integration with external in-memory databases, so they continue to use the ESP internal memory allocated through the deployment settings.
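One practical way to check whether the added network hop fits your latency budget is a round-trip probe such as the following Python sketch (redis-py assumed; the host is a placeholder).

```python
# pip install redis
# Quick round-trip latency probe against the external store; the host
# is a placeholder. Use the median of many samples, not a single call.
import time
import statistics
import redis

r = redis.Redis(host="redis.example.local", port=6379)
samples = []
for _ in range(1000):
    t0 = time.perf_counter()
    r.ping()  # one network round trip
    samples.append((time.perf_counter() - t0) * 1e6)
print(f"median round trip: {statistics.median(samples):.0f} µs")
```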

Reference Architectures

We have reference architecture designs that integrate with the ESP StateDB Windows. In this section, you will learn about these architectures, their characteristics, benefits, and limitations. These architectures are deployed in a Kubernetes environment running on a public/private cloud or on-premises platform and use the Lightweight SAS Event Stream Processing (ESP) Kubernetes Operator Framework.

Following are the three reference architectures with ESP StateDB Windows:

  1. Stateless ESP Projects Using In-memory DB for Persisting State and Data: Architecture where individual ESP server pods integrate with Singlestore and Redis for persisting state and data.

  2. Scalability of ESP Server Pods with Kafka using In-Memory Database for State and Data Persistence: Architecture where ESP server pods autoscale with Kafka and concurrently access Singlestore and Redis (a minimal consumer sketch follows this list).

  3. Multiple Cascading Projects with Message Buses and DB Integration: Architecture with cascading ESP projects (projects interconnected using message buses) where any of these projects can autoscale and connect to the in-memory databases.
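As a rough sketch of the second architecture, the following Python snippet (kafka-python and redis-py assumed) shows one member of a scaled consumer set: Kafka balances partitions across the consumer group, and every member persists its state to the shared in-memory database. Topic, group id, hosts, and the flat event schema are assumptions.

```python
# pip install kafka-python redis
# One member of a scaled set of consumers: Kafka balances partitions
# across the group, and every member shares state through Redis.
# Topic, group id, hosts, and flat JSON events are assumptions.
import json
import redis
from kafka import KafkaConsumer

r = redis.Redis(host="redis.example.local", port=6379)
consumer = KafkaConsumer(
    "esp-events",
    bootstrap_servers="kafka.example.local:9092",
    group_id="esp-statedb-demo",  # all scaled replicas share this group
    value_deserializer=lambda b: json.loads(b),
)

for msg in consumer:
    event = msg.value  # assumed flat dict of scalar fields
    # Shared state: any replica can read what any other replica wrote.
    r.hset(f"state:{event['device_id']}", mapping=event)
```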

Benefits of Using ESP StateDB Windows

Figure 2 illustrates the key benefits of using ESP StateDB Windows for state and data persistence rather than relying on the ESP server pods' internal memory.


Figure 2: Benefits of Using ESP StateDB Windows

  1. Achieve stateless ESP models. Integration with StateDB Windows allows ESP models to be stateless for join and aggregation operations. However, this is not yet possible for other windows such as pattern, geofence, and MAS.

  2. Reference data loaded directly or via ESP to in-memory DB. All the reference/whitelist/dimension data can be preloaded either via the ESP model or directly. Any mechanism can be used to upload the data into the in-memory database.

  3. No pre-load of data. There is no need to pre-load data into each pod, as the internal memory is no longer used for storing the reference data. The data is loaded into the DB once and shared across all ESP server pods.

  4. Concurrent access. All the data and state are maintained and persisted at the in-memory DB which is concurrently accessible by all the ESP server pods. Configuration for managing concurrent access must be done at the database level during its deployment.

  5. Scalability. ESP server pods can autoscale and all of them can access the configured in-memory database allowing high throughput.

  6. Failure Recovery and Reliability. No need to spend hours on failure recovery, because no state is lost during a failure.

  7. Distributed processing. All ESP servers can process incoming events in parallel. This is only possible because of the shared state and data in the in-memory DB.

  8. Aggregation performed at Singlestore but not at Redis. It is important to note that aggregation is performed within Singlestore itself, which is not the case with Redis: with Redis, all matching records are fetched and brought into the window, where the aggregation is then performed (see the sketch after this list).
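To illustrate point 8, here is a hedged Python comparison. Singlestore speaks the MySQL wire protocol, so a standard MySQL client can push the GROUP BY down to the database, while with Redis the matching records are fetched and aggregated client-side. Connection details, table, and key layout are assumptions.

```python
# pip install pymysql redis
# Contrast: Singlestore aggregates server-side via SQL, while with
# Redis the members are fetched and aggregated on the client.
# Connection details, table, and key layout are illustrative assumptions.
import pymysql
import redis

# Singlestore: the SUM/GROUP BY runs inside the database.
conn = pymysql.connect(host="singlestore.example.local", user="esp",
                       password="secret", database="espstate")
with conn.cursor() as cur:
    cur.execute("SELECT device_id, SUM(amount) FROM events "
                "WHERE device_id = %s GROUP BY device_id", ("d42",))
    print(cur.fetchall())

# Redis: fetch all matching members, then aggregate in the client,
# mirroring what the window has to do.
r = redis.Redis(host="redis.example.local", port=6379, decode_responses=True)
amounts = [float(v) for v in r.lrange("events:d42", 0, -1)]
print(("d42", sum(amounts)))
```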

Performance Evaluation

Figure 3 demonstrates the performance evaluation of Singlestore vs. Redis for write, lookup, and aggregation operations. It clearly shows when we should use Singlestore and when we should use Redis.

Setup: Both Singlestore and Redis were run as 3-node clusters in Kubernetes, using the simple, basic configuration recommended in the documentation of each in-memory database. Please note that these evaluations should not be treated as benchmark results.


Figure 3. Performance Evaluation of ESP StateDB Windows for Write, Lookup and Aggregation Operations

  1. Write Performance. When the total number of events written to both databases was up to 1 million, Redis surpassed Singlestore. Therefore, if the rate of incoming events to ESP is high, Redis should be configured.

  2. Lookup Performance. When the total number of records in the in-memory databases was 1 million and a one-to-one search/lookup was performed, Redis again outshone Singlestore, showing almost double its performance.

  3. Aggregation Performance. The test was conducted with three sizes of aggregation groups: 20, 200, and 2000 records per group. Redis excelled when the aggregation group size was small. As the group size increased, meaning more records to aggregate, Singlestore started to perform far better than Redis. At a group size of 2000, Redis was extremely slow.

So, based on these evaluations, whenever the use case demands write and lookup operations with small aggregation groups, Redis is the right choice; if the aggregation group size is large, opt for Singlestore. A sketch for running a rough comparison of your own follows.
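If you want a rough, non-benchmark comparison on your own cluster, a probe along the following lines times the same batch of writes against each store. Batch size, connection details, and schema are assumptions.

```python
# pip install pymysql redis
# Rough write-timing probe (not a benchmark): batch-insert the same
# records into each store and time it. All names are assumptions.
import time
import pymysql
import redis

ROWS = [(i, f"value-{i}") for i in range(100_000)]

r = redis.Redis(host="redis.example.local", port=6379)
t0 = time.perf_counter()
pipe = r.pipeline(transaction=False)  # batch commands in one round trip
for k, v in ROWS:
    pipe.set(f"probe:{k}", v)
pipe.execute()
print(f"redis:       {time.perf_counter() - t0:.2f}s")

conn = pymysql.connect(host="singlestore.example.local", user="esp",
                       password="secret", database="espstate")
t0 = time.perf_counter()
with conn.cursor() as cur:
    cur.executemany("INSERT INTO probe (k, v) VALUES (%s, %s)", ROWS)
conn.commit()
print(f"singlestore: {time.perf_counter() - t0:.2f}s")
```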

Sample use-cases to understand migration to StateDB windows

In this section we present use cases that can act as a reference for migrating your existing approach to a StateDB-based approach. Go through the section to understand the potential issues with your current lookup, aggregation, and multiple-retention implementations in ESP, and how StateDB windows can help solve them.

Demos

In this section, you will have the opportunity to run some examples.

License

This project is licensed under the SAS License Agreement for Corrective Code or Additional Functionality.

Additional References
