SAMOA-16: Add an adapter for Apache Flink-Streaming #11
Conversation
Thanks. There seem to be some errors in the tests (Kryo serialization, from what I can see).
Sure, we are looking into it! Kryo version incompatibility seems to be trickier than we thought.
If we need to update Kryo on our side, we'll be happy to do so.
It seems that the earliest Kryo version that Flink is compatible with is v2.23.0. However, the Storm tests seem to fail to initialise the serialiser for versions greater than 2.17. Judging from the changelog, it looks like the current Apache Storm version (v0.10.0) supports newer Kryo versions, so a first approach might be to try upgrading Storm first. What do you think?
Makes sense, as we want to update the Storm dependency anyway. We have a single kryo.version variable to ensure we don't use different Kryo versions around the codebase, but in this case an override seems necessary.
Good point. I just overrode the Kryo version in the Flink build and tested it locally. All module tests also seem to be passing on Travis.
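For readers following along: overriding a parent Maven property in a single module can be done by redefining it in that module's pom.xml. This is only a hypothetical sketch; the property name kryo.version comes from the discussion above, but the version number and placement are illustrative, not taken from the actual SAMOA poms.

```xml
<!-- Hypothetical sketch: in the Flink module's pom.xml, redefine the
     parent's kryo.version property so only this module pulls in the
     newer Kryo. The version number here is illustrative. -->
<properties>
  <kryo.version>2.24.0</kryo.version>
</properties>
```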
I can't build this PR successfully on my local machine. Here's the error message:
Apparently there are API changes (apache/flink@8436e9c) due to FLINK-1625. Perhaps we can use a stable version of Flink for this PR. What do you guys think?
Hey Arinto! Thanks for looking into it. I agree, let's stick to the last stable release (0.8.1) for now. We will commit the appropriate changes for it today. Cheers!
We reviewed the changes after 0.8.1, and a lot has changed since then; many constructs we currently rely on, such as the type extractors, will not work in 0.8.1. We therefore decided to make a final patch against the first RC of 0.9.0, which is coming very soon, most probably by the end of next week. It looks like the best way to go :)
+1
Hi, yes, we are planning a 0.9.0-milestone release, so we can use that as the first supported stable version for the PR.
(force-pushed from d472085 to d2570cb)
- [flink] Added the missing license
- [flink] Changes to Flink integration to SAMOA
- [flink] Changes in order to debug
- [flink] Change Utils class
- [flink] Minor refactorings
- [flink] StreamExecutionEnvironment passed through the factory
- Removed .iml files from git
- Add algorithm for detecting circles in the topology
- Added more checks in the circleCanBeInitialised function
- Added setters/getters
- Change Kryo version in the pom file
- Added debug printouts in the FlinkProcessingItem class
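The "detecting circles in the topology" commit presumably refers to cycle detection over the processing graph. As a generic illustration of that technique only, not the actual SAMOA code, a DFS-based check with hypothetical names might look like:

```java
import java.util.*;

// Hypothetical sketch of cycle detection over a topology graph using
// DFS with three node states (unvisited / in progress / done).
// This is NOT the actual SAMOA implementation, just an illustration.
public class CycleDetector {
    public static boolean hasCycle(Map<Integer, List<Integer>> adj) {
        Map<Integer, Integer> state = new HashMap<>(); // 0=unvisited, 1=in progress, 2=done
        for (Integer start : adj.keySet()) {
            if (state.getOrDefault(start, 0) == 0 && dfs(start, adj, state)) {
                return true;
            }
        }
        return false;
    }

    private static boolean dfs(int node, Map<Integer, List<Integer>> adj,
                               Map<Integer, Integer> state) {
        state.put(node, 1); // mark as "in progress" (on the current DFS path)
        for (int next : adj.getOrDefault(node, Collections.emptyList())) {
            int s = state.getOrDefault(next, 0);
            if (s == 1) return true;                  // back edge: cycle found
            if (s == 0 && dfs(next, adj, state)) return true;
        }
        state.put(node, 2); // mark as fully explored
        return false;
    }
}
```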
Hi @senorcarbone, this adapter for Apache Flink looks amazing, thanks so much! I tested and reviewed it, and found the same issues as @gdfm. The accuracy of the VHT was quite good, but it was very slow on my computer. Any thoughts about this? Do we need to tune any parameters in the conf file?
We have an open JIRA issue in Flink for a "streaming only" mode, which starts up Flink in a streaming-optimized configuration. For now, you can set the following configuration values in conf/flink-conf.yaml to achieve a similar effect:
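(The commenter's actual configuration snippet was not preserved in this thread. Purely as an illustration of the kind of setting meant here, assuming Flink 0.8/0.9-era options, it may have resembled something like:)

```yaml
# Illustrative only: the values originally posted were lost.
# Shrinking the managed-memory fraction leaves more heap for
# streaming operators in Flink 0.8/0.9.
taskmanager.memory.fraction: 0.1
```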
Hello again @gdfm and @abifet. It looks like the algorithm's performance (and accuracy) depends heavily on the ingestion speed of the local statistics processors. The irony is that the higher the speed, the slower the whole computation becomes over time: attribute events are sent to the local statistics processors at a higher rate, so the model aggregator gets more updates back. The average processing delay (the number of flattened instances processed by the aggregator between sending a process event and receiving the corresponding local statistics) is ~2k instances for Flink and around 400k instances for Storm. Also, in Storm the aggregator continuously broadcasts ~100-200 attribute messages to local processors on average, while Flink broadcasts ~2100 attribute messages, presumably due to the rate at which it gets results back. These numbers were collected locally on each component, and there was no message duplication.
I also tried adding a 2-second sleep on the local Flink processors to delay their results, and the algorithm finished in 12 seconds with lower accuracy (57%). Perhaps the model aggregator could be enhanced with some flow-control logic to trade off model update rate against accuracy.
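As a rough illustration of the flow-control idea floated above, and nothing more: a minimal sketch (hypothetical class and method names, not from SAMOA) that rate-limits how often the aggregator may trigger a new round of split attempts.

```java
// Hypothetical sketch of flow control for a model aggregator:
// suppress split attempts until a minimum number of instances has
// been processed since the last allowed attempt. Names and the
// counting policy are illustrative, not the actual SAMOA design.
public class SplitThrottle {
    private final long minInstancesBetweenSplits;
    private long instancesSinceLastSplit = 0;

    public SplitThrottle(long minInstancesBetweenSplits) {
        this.minInstancesBetweenSplits = minInstancesBetweenSplits;
    }

    /** Called once per processed instance; returns true when a split attempt is allowed. */
    public boolean onInstance() {
        instancesSinceLastSplit++;
        if (instancesSinceLastSplit >= minInstancesBetweenSplits) {
            instancesSinceLastSplit = 0; // reset the window after allowing a split
            return true;
        }
        return false;
    }
}
```

Tuning the threshold would let the aggregator behave similarly on fast (Flink) and slow (Storm) feedback loops, at the cost of delaying model updates.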
Hi @senorcarbone,
I think it has to do with the number of splits. If the model aggregator receives more local statistics while processing the same number of instances, it has to split (exponentially?) more often and send more attributes in the same period; that is what I got from the experiments, at least. Maybe there should be a separate issue for VHT, but please keep me and @fobeligi in sync regarding it. @fobeligi is working on an experimental native implementation of VHT on Flink, which we can share soon; she had to deal with similar issues. Also, feel free to let me know if you want me to run more experiments and share results to speed up the process. Regarding the PR, is there anything else you think we should look into?
Indeed, I see what you mean. Given that the feedback loop in Flink is faster, the number of split attempts increases. We already have some flow control to regulate the ingestion rate in PrequentialEvaluation; I'll play with it a bit to see what happens.
public void setInstanceInformation(InstancesHeader instanceInformation) {
    this.instanceInformation = instanceInformation;
}
Why do we need these changes for Flink?
These were test leftovers, apparently; we do need them, according to @fobeligi.
Hi, I think we should address the VHT issue in a separate PR. @senorcarbone, there are just a few outstanding issues; once these are fixed, I'm +1.
(force-pushed from 18d6d27 to 8c54b73)
(force-pushed from 8c54b73 to 493c4c9)
Thanks @gdfm. I hope I addressed everything in the last commit and the inline comments. Let me know if you want me to look into anything else!
+1
Merged. |
Great to see this merged. Is the SAMOA website located in the gh-pages branch?
Yes, the sources are in gh-pages, and we use Jekyll to build the website.
Thank you. |
Awesome, thanks @gdfm! Will you merge from your local branch? Note that it is missing some of the last commits; I will close the PR once everything is merged. On a related note, we ran some experiments today on the new stable streaming API that will be released soon, which contains many good fixes and additions. The VHT experiment mentioned above now takes around 80 seconds for 1M instances, since filters are properly chained, making input ingestion fast enough to keep up with the local processors. You can find the branch with the patch here. Special thanks to @gyfora for fixing chaining in iterations! I will create a new PR once there is a stable release.
Weird, I have already merged the PR and pushed it back to Apache. Something in the mirroring is not working; the PR should have been closed automatically. Thanks for the info; improving performance would be a great contribution!
This PR adds support for Apache Flink as an adapter. We haven't updated the documentation yet, but we could perhaps do that in a follow-up PR, since it seems you are already working on a docs redesign.