@viktorbeck98
Collaborator

First version of the persistency and the configuration engine. The core of the configuration engine is integrated into the persistency. So far I have implemented:

  • A generic persistency class for event-based use cases that supports interchangeable data structures, which are chosen when the persistency object is created.
  • Multiple data structures that can be injected into the persistency class based on the use case, such as:
    • EventDataFrame: best for developers to get an event-type-separated view of the data (based on pandas)
    • ChunkedEventDataFrame: similar to the above but suited to online processing (based on polars). Technically production-ready, but since it stores all incoming data it is still RAM-heavy. However, in the future, for a storage component, we could connect this to a database that handles the storage.
    • EventVariableTracker: also a data structure like the two above, but it only stores the series of persistency changes, compactly via run-length encoding (RLE), and the set of unique values per variable. Persistency changes can thereby be seen as a feature extracted from the data. Detectors like the NewValueDetector, ComboDetector, EventSequenceDetector and possibly more can use this as the underlying structure.
  • An LRU set implementation for handling outdated values (not yet integrated into the persistency class)
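
To make the injection idea and the RLE storage concrete for reviewers, here is a minimal sketch. It is not the code in this PR: `EventStore`, `RleVariableTracker`, `Persistency.ingest` and the definition of a "change" (an observation that adds a new value to a variable's set of unique values) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


class EventStore(Protocol):
    """Hypothetical interface that every injectable data structure satisfies."""

    def add(self, event_id: int, variables: dict[str, Any]) -> None: ...


@dataclass
class RleVariableTracker:
    """Illustrative stand-in for EventVariableTracker (assumed behaviour).

    Per (event_id, variable) it keeps the set of unique values and a
    run-length-encoded series of change flags: True when an observation
    added a new value to the set, False when the set stayed the same.
    """

    unique: dict[tuple[int, str], set] = field(default_factory=dict)
    runs: dict[tuple[int, str], list[list]] = field(default_factory=dict)

    def add(self, event_id: int, variables: dict[str, Any]) -> None:
        for name, value in variables.items():
            key = (event_id, name)
            seen = self.unique.setdefault(key, set())
            changed = value not in seen
            seen.add(value)
            runs = self.runs.setdefault(key, [])
            if runs and runs[-1][0] == changed:
                runs[-1][1] += 1           # extend the current run
            else:
                runs.append([changed, 1])  # start a new run


class Persistency:
    """Hypothetical wrapper: the data structure is injected at construction."""

    def __init__(self, store: EventStore) -> None:
        self.store = store

    def ingest(self, event_id: int, variables: dict[str, Any]) -> None:
        self.store.add(event_id, variables)


tracker = RleVariableTracker()
persistency = Persistency(tracker)
for user in ["alice", "alice", "bob", "bob", "bob", "carol"]:
    persistency.ingest(1, {"user": user})

print(tracker.runs[(1, "user")])
# [[True, 1], [False, 1], [True, 1], [False, 2], [True, 1]]
```

Swapping in a different store only requires another class with the same `add` method, which is the point of defining the structure when the persistency object is created.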

Some todos:

  • Integrate the persistency class into the existing detectors (I am on it already)
  • Implement the functionality to remove outdated values from the persistency
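
For the second todo, one possible shape of the eviction logic is sketched below. It assumes that "outdated" means least recently seen (LRU) and that the set has a fixed capacity. The class name `LRUSet` and its methods are hypothetical and may differ from the implementation in this PR.

```python
from collections import OrderedDict


class LRUSet:
    """Set with a fixed capacity that evicts the least recently added or
    refreshed value first (hypothetical sketch)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items = OrderedDict()  # keys are the values, order is recency

    def add(self, value) -> list:
        """Insert or refresh a value and return the values evicted by this call."""
        if value in self._items:
            self._items.move_to_end(value)  # mark as most recently seen
            return []
        self._items[value] = None
        evicted = []
        while len(self._items) > self._capacity:
            oldest, _ = self._items.popitem(last=False)
            evicted.append(oldest)
        return evicted

    def __contains__(self, value) -> bool:
        return value in self._items

    def __len__(self) -> int:
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


seen = LRUSet(capacity=2)
seen.add("a")
seen.add("b")
seen.add("a")            # refreshes "a", so "b" is now the oldest
dropped = seen.add("c")  # exceeds the capacity and evicts "b"
print(dropped, list(seen))
# ['b'] ['a', 'c']
```

Returning the evicted values from `add` would let the persistency drop the matching entries from its unique-value sets in the same step.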

viktorbeck98 requested a review from ipmach on January 7, 2026
viktorbeck98 self-assigned this on January 7, 2026
viktorbeck98 added the enhancement and New feature labels on January 7, 2026
@ipmach
Contributor

ipmach commented Jan 22, 2026

Looking good so far!

But we need to discuss the notebooks, because I don't know if we want them in the main branch. They present multiple issues:

  • Code traceability is pretty bad, as the files do not contain human-readable code (it is stored in the notebook's JSON format).
  • They are not tested in the main pipeline, so they have a higher risk of breaking in future versions.
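
One possible mitigation for both points, sketched with the standard library only: an `.ipynb` file is JSON, so its code cells can be extracted into plain Python for readable diffs and syntax-checked in the pipeline. The function names are hypothetical, the check does not execute the cells, and skipping lines that start with `%` or `!` is a crude assumption about how magics and shell escapes appear.

```python
import json


def notebook_to_source(nb_json: str) -> str:
    """Return the code cells of a notebook (given as JSON text) as plain Python."""
    notebook = json.loads(nb_json)
    chunks = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # the format allows a list of lines
            source = "".join(source)
        # Drop IPython magics and shell escapes, which are not valid Python.
        lines = [
            line for line in source.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        chunks.append(f"# --- cell {index} ---\n" + "\n".join(lines))
    return "\n\n".join(chunks) + "\n"


def check_notebook_syntax(nb_json: str, name: str = "<notebook>") -> None:
    """Raise SyntaxError if the extracted code does not compile."""
    compile(notebook_to_source(nb_json), name, "exec")


example = json.dumps({
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": ["# Demo"]},
        {"cell_type": "code", "source": ["%matplotlib inline\n", "x = 1 + 1\n"]},
    ],
})
check_notebook_syntax(example)  # passes silently for valid code
print(notebook_to_source(example))
```

A syntax check like this only catches a subset of breakages. Actually running the notebooks in the pipeline would need a notebook execution tool, which is a separate decision.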

Labels

enhancement, New feature

Development

Successfully merging this pull request may close these issues.

  • Implement persistency class
  • Implement auto-configuration for detectors

3 participants