Introducing DBrex, a lightweight pattern-matching application for databases that uses regular expressions and deterministic finite automata (DFA) to efficiently detect patterns in SQL tables.
Once the container is running, DBrex finds all matches of your pattern in the table and returns them in an output table. It is optimized for frequently updated databases.
- ⚙️ Effortless Setup: Install seamlessly using Docker.
- 🚀 Performance Boost: Intermediate states are stored, so new rows are evaluated quickly without recomputing earlier results.
- 🎯 Pattern Detection: Utilize regular expressions and DFA-based transitions to find complex patterns.
- 🔄 Incremental Processing: Leverage existing results when new data arrives, eliminating full recomputation.
- 🤝 SQL Integration: Generate results through efficient JOIN operations based on DFA transitions.
Note
DBrex will listen to your database table as long as the container is running. Stop the container if not needed.
Warning
- When using Docker, make sure to mount a volume with -v dbrex:/app/dbrex/data in your Docker command. This step is crucial: without it, DBrex restarts all computations after a container restart.
- If you are on Linux, make sure to add --add-host=host.docker.internal:host-gateway to your Docker command.
Example Docker Command:
docker run -d -v dbrex:/app/dbrex/data --name dbrex ghcr.io/janskn/dbrex:latest <args>
To set up DBrex with your own database and to start the container, follow this guide.
DBrex employs a unique approach to pattern matching in databases:
- Pattern Definition: Patterns are defined using regular expressions, making them intuitive and flexible
- DFA Translation: Regular expressions are converted into deterministic finite automata
- SQL Integration: DFA transitions are mapped to SQL JOIN operations
- State Management: Intermediate states are preserved to optimize performance
- Incremental Updates: New data is processed using existing results, avoiding full recomputation
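The steps above can be illustrated with a minimal sketch. It is not DBrex's actual implementation: the `events` table, its `id` and `symbol` columns, the hand-written DFA for the pattern "A (B|C) D", and the use of SQLite instead of Trino are all assumptions made for illustration. Each DFA transition becomes one self-JOIN on consecutive ids.

```python
import sqlite3

# Hypothetical DFA for the pattern "A (B|C) D": (state, symbol) -> next state.
TRANSITIONS = {(0, "A"): 1, (1, "B"): 2, (1, "C"): 2, (2, "D"): 3}
START, ACCEPTING = 0, {3}


def accepting_paths(state=START, prefix=()):
    """Yield every symbol sequence that leads from `state` to an accepting state."""
    if state in ACCEPTING:
        yield prefix
    for (src, symbol), dst in sorted(TRANSITIONS.items()):
        if src == state:
            yield from accepting_paths(dst, prefix + (symbol,))


def path_to_sql(path):
    """Translate one path into a chain of self-JOINs on consecutive ids."""
    joins = "".join(
        f" JOIN events t{i} ON t{i}.id = t{i - 1}.id + 1 AND t{i}.symbol = ?"
        for i in range(1, len(path))
    )
    last = len(path) - 1
    return f"SELECT t0.id, t{last}.id FROM events t0{joins} WHERE t0.symbol = ?"


def find_matches(conn):
    """Return (first_id, last_id) for every match of the pattern."""
    matches = []
    for path in accepting_paths():
        # The JOIN placeholders appear before the WHERE placeholder in the SQL text.
        params = list(path[1:]) + [path[0]]
        matches += conn.execute(path_to_sql(path), params).fetchall()
    return sorted(matches)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, symbol TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", enumerate("ABDACDAD", start=1))
print(find_matches(conn))  # [(1, 3), (4, 6)]
```

Because every transition adds exactly one JOIN, the length of the generated query is fixed by the pattern, which is why the sketch only handles DFAs without cycles.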
DBrex is built on top of the SQL engine Trino.
Supported SQL dialects include:
- MySQL
- PostgreSQL
- MariaDB
And many more...
Check the Trino documentation for a complete list of supported databases and data sources.
While traditional complex event processing (CEP) engines offer support for data streams, DBrex provides:
- Native SQL database integration
- Efficient state management for incremental processing
- No need for a separate processing engine
Compared to SQL's MATCH_RECOGNIZE:
- Simpler syntax and easier maintenance
- Efficient handling of new data through incremental processing
- Data Monitoring: DBrex is ideal for scenarios where new data is continuously added, and patterns need to be detected in close-to-real-time without reprocessing the entire dataset.
- Log Analysis: Efficiently identify patterns in log data stored in SQL tables, such as detecting sequences of errors or specific user activities.
- Financial Transactions: Detect fraudulent patterns or anomalies in transaction data as new records are inserted.
- Trend Analysis: Derive future trends from past data.
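The monitoring and log analysis cases above depend on evaluating only the rows that are new. The following sketch shows the general idea in plain Python. The `IncrementalMatcher` class, the log levels, and the DFA for "WARN ERROR ERROR" are hypothetical and do not reflect how DBrex stores its state internally.

```python
# Hypothetical DFA for the log pattern "WARN ERROR ERROR".
TRANSITIONS = {(0, "WARN"): 1, (1, "ERROR"): 2, (2, "ERROR"): 3}
START, ACCEPTING = 0, {3}


class IncrementalMatcher:
    """Keeps partial matches so that each new row is evaluated only once."""

    def __init__(self):
        self.partial = []  # (start_id, dfa_state) of matches still in progress
        self.matches = []  # (start_id, end_id) of completed matches

    def feed(self, row_id, symbol):
        advanced = []
        # Extend every stored partial match and try to start a new one at this row.
        for start_id, state in self.partial + [(row_id, START)]:
            nxt = TRANSITIONS.get((state, symbol))
            if nxt is None:
                continue  # no transition for this symbol: drop the partial match
            if nxt in ACCEPTING:
                self.matches.append((start_id, row_id))
            advanced.append((start_id, nxt))
        self.partial = advanced


matcher = IncrementalMatcher()
for row_id, level in enumerate(["INFO", "WARN", "ERROR", "ERROR"], start=1):
    matcher.feed(row_id, level)
print(matcher.matches)  # [(2, 4)]

# Rows that arrive later reuse the stored state; rows 1 to 4 are not read again.
for row_id, level in enumerate(["WARN", "ERROR", "ERROR"], start=5):
    matcher.feed(row_id, level)
print(matcher.matches)  # [(2, 4), (5, 7)]
```

The work per new row depends on the number of partial matches in progress, not on the size of the table.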
Your table must contain a column with integers in ascending order with a step size of 1.
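A quick way to check this requirement before starting DBrex is sketched below. The helper `has_contiguous_ids` and the table and column names are hypothetical, and the check interprets the requirement as "distinct values that cover a range without gaps". SQLite is used only to keep the example self-contained.

```python
import sqlite3


def has_contiguous_ids(conn, table, column):
    """Return True if `column` holds distinct values that increase in steps of 1."""
    # Identifiers cannot be bound as parameters, so only pass trusted names here.
    query = (
        f"SELECT COUNT(*), COUNT(DISTINCT {column}), MIN({column}), MAX({column}) "
        f"FROM {table}"
    )
    total, distinct, lowest, highest = conn.execute(query).fetchone()
    if total == 0:
        return True  # an empty table has no gaps
    # Duplicates or NULLs make the counts differ; gaps make the range too wide.
    return total == distinct and highest - lowest + 1 == total


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE good (id INTEGER, symbol TEXT)")
conn.execute("CREATE TABLE gappy (id INTEGER, symbol TEXT)")
conn.executemany("INSERT INTO good VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")])
conn.executemany("INSERT INTO gappy VALUES (?, ?)", [(1, "A"), (2, "B"), (4, "C")])

print(has_contiguous_ids(conn, "good", "id"))   # True
print(has_contiguous_ids(conn, "gappy", "id"))  # False
```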
Regular expression operators * (Kleene star) and + (Plus) are currently not supported in pattern definitions due to JOIN constraints.
Patterns are limited to finite-length sequences.
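To see why unbounded repetition does not fit a JOIN-based approach, consider the sketch below. It assumes that a sequence of n symbols needs n - 1 JOINs, one per transition, so `A+ B` would need a query of unbounded length. The helper `unroll_plus` is hypothetical and only enumerates the fixed-length sequences that a bounded stand-in for `A+ B` would cover. It does not show DBrex pattern syntax.

```python
def unroll_plus(symbol, rest, max_repeats):
    """Enumerate the fixed-length sequences covered by `symbol+ rest`, up to a bound."""
    return [[symbol] * n + list(rest) for n in range(1, max_repeats + 1)]


for seq in unroll_plus("A", ["B"], max_repeats=3):
    # One JOIN per transition between consecutive rows (an assumption of this sketch).
    print(" ".join(seq), "->", len(seq) - 1, "JOIN(s)")
```

Each additional repetition adds another JOIN, so only a bounded number of repetitions can be expressed as a finite query.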
You can see the magic happen at http://127.0.0.1:8080/ui/#. To log in, enter an arbitrary username.
Find performance evaluations here.
The underlying data set is from Kaggle.
This project is licensed under the MIT License.
Created by Jan Skowron

