[FLINK-4520][flink-siddhi] Integrate Siddhi as a light-weight Streaming CEP Library #2487
Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration.
Siddhi CEP is a lightweight and easy-to-use Open Source Complex Event Processing Engine (CEP) released as a Java Library under
It would be very helpful for Flink users (especially streaming application developers) to have a library that runs Siddhi CEP queries directly in a Flink streaming application.
Thank you for that big contribution. Siddhi looks like a cool approach to CEP.
Before digging into the details, I would like to start a discussion about whether we should have this as a part of the core Flink repository, as a subproject, or if it would be best to have it initially as an external project.
The reason is that the Flink repository is becoming a bit big right now. Build times are very long, test stability is hard to manage, and there is quite a bit of "dead" code that was contributed by someone at some point but seems rarely used and is not maintained by the contributors.
To help have a good discussion, it would be great to learn a bit more:
@StephanEwen thanks for the comments. I think it's fine either to keep this in the core or to have it as a separate project, but my concern is that it may be better for community development to centralize qualified libraries together. As an alternative solution to the test stability and dead code problems, might it be possible to create another code repository, say "flink-library", with independent CI?
BTW: here are the answers to your questions one by one:
Siddhi is a feature-rich CEP engine with its own community, and may be almost the only open source CEP solution compatible with the Apache License. And this library
So I think it would be extremely light-weight but useful, and the current implementation should be almost complete.
Sure. First of all, I am personally very willing to keep contributing to the Flink project in any way.
We have also used Siddhi with distributed streaming systems a lot in production, and are currently considering supporting Flink as well, for its better state management and window support. So I would continue to maintain the code if it is merged; if not, I would maintain it as a separate project to make sure it is open source and workable as well.
We use Siddhi in streaming environments in production a lot, currently supporting Storm and Spark Streaming, and are also considering extending to Flink.
That all sounds very good.
There have been thoughts and discussions once in a while about creating a dedicated sub-project for libraries/extensions like this, or at least a dedicated repository under Flink. I think this would be a great opportunity to revive those discussions.
Let me start a thread on the mailing list.
@haoch I hope you are okay with waiting for a few days for that discussion to come to a conclusion.
Hi @haoch, thanks a lot for this contribution. I'm sorry for the late response.
I recently started moving some of the streaming connectors of Flink to Apache Bahir, a community for extensions to Spark, Flink (and maybe others).
I think Bahir is addressing this issue nicely. So far we have added only streaming connectors to Bahir, but I would like to see libraries and other things built on top of Flink there as well.
By the way, the tests you've added are failing on our CI system. Can you look into it? https://s3.amazonaws.com/archive.travis-ci.org/jobs/166483919/log.txt