Sessionizer is the stage that sessionizes events. It receives events from the collector through a consistent hashing scheduler, maintains session state, and generates session marker events (Begin and End). It uses si as the affinity key, so events with the same si are routed to the same node. Each sessionizer has a local in-memory off-heap store for the session state and a pluggable RemoteSessionStore for recovery.
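To illustrate how an affinity key keeps all events of one session on one node, here is a minimal consistent hashing sketch. It is not the scheduler Pulsar ships: the `HashRing` class, the node names, the MD5-based hash, and the replica count are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, nodes, replicas=64):
        # Each node is placed on the ring `replicas` times to even out the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, si: str) -> str:
        """Return the node that owns the first ring point at or after hash(si)."""
        idx = bisect.bisect_left(self._points, _hash(si)) % len(self._ring)
        return self._ring[idx][1]


# Node names and the si value are made up for the example.
ring = HashRing(["sessionizer-1", "sessionizer-2", "sessionizer-3"])
owner = ring.node_for("si-abc123")
```

Every call with the same si returns the same node, and removing a node that does not own a given si leaves that si on its current node, which is why a session's state can stay local to one sessionizer.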
The session state includes the session metadata and variables. The session metadata is data that does not change frequently, such as ip and ua. The session variables hold state or counters for the session. The sessionizer logic is declared in EPL with a few annotations.
The sessionizer supports sessionization for multiple tenants. Each tenant has its own configuration, called a SessionProfile. A session profile can contain a few sub session profiles.
The SessionizerConfig is the configuration for the whole sessionizer, and it includes multiple session profiles. The sessionizer utilizes an off-heap cache, and the off-heap configuration is in the SessionizerConfig. There are a few rules when configuring off-heap:
- The user needs to set the maximum direct memory size to give the sessionizer enough off-heap memory. For the Sun JDK or OpenJDK, use -XX:MaxDirectMemorySize.
- There is no limit on the off-heap size, but the settings must follow the rules below.
- BlockSize is the minimum memory allocated per entry; a session may use several blocks.
- The MemoryPageSize should be less than or equal to blockSize * 65536, because a memory page can hold at most 64k blocks.
- The sessionizer off-heap cache is a time slot cache in which each second is one time slot. maxTimeSlots is the maximum number of time slots it supports, so the maximum session idle time in seconds should be less than maxTimeSlots.
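The sizing rules above can be checked before deployment. The sketch below is a standalone illustration, not part of the sessionizer: the `OffHeapSettings` class, its field names, and the helper functions are hypothetical, and only the two inequalities and the 65536 blocks-per-page limit come from the rules listed here.

```python
from dataclasses import dataclass

MAX_BLOCKS_PER_PAGE = 65536  # a memory page can hold at most 64k blocks


@dataclass
class OffHeapSettings:
    """Hypothetical container for the values discussed above."""
    block_size: int        # bytes, minimum memory per entry
    memory_page_size: int  # bytes
    max_time_slots: int    # one time slot per second
    max_idle_seconds: int  # longest session idle time configured


def check_off_heap(settings: OffHeapSettings) -> list:
    """Return a list of rule violations; an empty list means the settings pass."""
    problems = []
    if settings.memory_page_size > settings.block_size * MAX_BLOCKS_PER_PAGE:
        problems.append("MemoryPageSize exceeds blockSize * 65536")
    if settings.max_idle_seconds >= settings.max_time_slots:
        problems.append("max session idle time is not less than maxTimeSlots")
    return problems


def blocks_needed(session_bytes: int, block_size: int) -> int:
    """Number of whole blocks a session of the given size occupies (at least one)."""
    return max(1, -(-session_bytes // block_size))  # ceiling division


# Example values are made up: 128-byte blocks, the largest page allowed for
# that block size, one hour of time slots, and a 30-minute idle timeout.
settings = OffHeapSettings(
    block_size=128,
    memory_page_size=128 * MAX_BLOCKS_PER_PAGE,
    max_time_slots=3600,
    max_idle_seconds=1800,
)
problems = check_off_heap(settings)
```

With these example values `problems` is empty. Raising the idle timeout to 3600 seconds would violate the last rule, because the idle time must be strictly less than maxTimeSlots.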
The sessionizer utilizes Esper to do sessionization in two stages:
- The first stage uses EPL to get hints for how to do the sessionization, then passes them to the sessionizer engine to create or load the session. This is controlled by the epl on the SessionizerConfig.
- The second stage uses EPL to manipulate the session state. This is controlled by the epl on each SessionProfile or SubSessionProfile.
For both stages, the sessionizer introduces a few annotations on top of Esper. When an Esper statement matches its condition, it triggers the annotation attached to that statement, and the listener registered for that annotation is invoked.
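The annotation-to-listener flow described above can be pictured with a small dispatcher. This is a conceptual sketch only: it does not use the Esper API, and the `AnnotationDispatcher` class, its method names, and the event shape are assumptions for illustration. Only the annotation name Session is taken from this page.

```python
class AnnotationDispatcher:
    """Hypothetical registry that maps annotation names to listeners."""

    def __init__(self):
        self._listeners = {}

    def register(self, annotation: str, listener) -> None:
        """Register a callable to run when a statement with this annotation matches."""
        self._listeners.setdefault(annotation, []).append(listener)

    def on_statement_matched(self, annotations, event) -> None:
        """Invoke the listeners of every annotation attached to the matched statement."""
        for annotation in annotations:
            for listener in self._listeners.get(annotation, []):
                listener(event)


seen = []
dispatcher = AnnotationDispatcher()
dispatcher.register("Session", lambda event: seen.append(("Session", event["si"])))

# Simulate a statement annotated with Session matching one event.
dispatcher.on_statement_matched(["Session"], {"si": "si-abc123"})
```

Annotations without a registered listener are simply ignored in this sketch. How the real engine treats them is not stated on this page.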
The first stage input is the raw events sent to the sessionizer, and the stage determines which session profile is applicable to each raw event. It uses the @Session annotation (see source) to provide the hint.
The second stage input can be raw events and session marker events. This stage is used to manipulate the session state or enrich the input event.
It has a few annotations (see source).
Since this stage needs to access the session state, the session state is encapsulated in a few Esper variables:
- session - type: com.ebay.pulsar.sessionizer.esper.impl.SessionVariable
- metadata - type: com.ebay.pulsar.sessionizer.esper.impl.AttributeMapVariable
- parentSession - type: com.ebay.pulsar.sessionizer.esper.impl.SessionVariable
- parentMetadata - type: com.ebay.pulsar.sessionizer.esper.impl.AttributeMapVariable
The parentSession and parentMetadata variables are only available when the session is a sub session.
The sessionizer has a pluggable remote store, so the user can plug in a remote store for the session state.
An instance of the RemoteStoreProvider can be injected into com.ebay.pulsar.sessionizer.impl.SessionizerProcessor via Spring XML.
Besides the built-in extensions, the sessionizer allows the user to register new annotations.
This is a Jetstream app that can run on Docker. It exposes the following ports:
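As a rough model of which variables a statement can see, the sketch below builds the variable set for a top-level session and for a sub session. The `SessionState` class and the `visible_variables` function are hypothetical, and mapping session to the counters and metadata to the attribute map is an assumption. The real SessionVariable and AttributeMapVariable types are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionState:
    """Hypothetical stand-in for the state of one session."""
    metadata: dict = field(default_factory=dict)   # rarely changing data, e.g. ip, ua
    variables: dict = field(default_factory=dict)  # state or counters of the session
    parent: Optional["SessionState"] = None        # set only for a sub session


def visible_variables(state: SessionState) -> dict:
    """Return the variable names exposed to the second stage for this session."""
    names = {"session": state.variables, "metadata": state.metadata}
    if state.parent is not None:
        # Parent variables exist only when the session is a sub session.
        names["parentSession"] = state.parent.variables
        names["parentMetadata"] = state.parent.metadata
    return names


# Example values are made up.
main = SessionState(metadata={"ip": "10.0.0.1", "ua": "example-agent"},
                    variables={"eventCount": 3})
sub = SessionState(variables={"eventCount": 1}, parent=main)

main_vars = visible_variables(main)
sub_vars = visible_variables(sub)
```

Here `main_vars` contains only session and metadata, while `sub_vars` also carries parentSession and parentMetadata taken from the enclosing session.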
- 9999 for monitoring
- 15590 for receiving events from the collector and internal loopback events.