Akka Streams: Potential onComplete deadlock #17507
Comments
This is the expected behavior, but it is not the desired behavior :) It is tricky to work around.
Original mailing-list topic: https://groups.google.com/forum/#!topic/akka-user/_ENOeHyeEF0
In a naive scraping approach you would keep a queue of work items containing the pages you still need to scrape. The termination condition would then be that 1) no work is outstanding and 2) the queue is empty. With the stream setup you effectively distribute that work queue into the buffers of the participating stream elements, so you cannot observe both conditions at any single point. If there is a solution, it would probably be to make that information locally accessible somehow, by having some kind of "conductor" at one point in the graph that has enough information to terminate processing.
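For contrast, the naive queue-based approach with its observable termination condition can be sketched in plain Scala (this is a hypothetical helper, not code from the reporter's project; `scrape` stands in for the real page-fetching logic):

```scala
import scala.collection.mutable

// Naive work-queue scraper: the loop terminates exactly when the queue
// is empty and no work is outstanding -- the condition that the
// distributed stream buffers make impossible to observe in one place.
def scrapeAll(seed: String, scrape: String => List[String]): Set[String] = {
  val queue   = mutable.Queue(seed)
  val visited = mutable.Set.empty[String]
  while (queue.nonEmpty) {       // termination condition is checked here, centrally
    val page = queue.dequeue()
    if (visited.add(page))       // skip pages we already scraped
      queue ++= scrape(page)     // enqueue newly discovered links
  }
  visited.toSet
}
```

Because a single loop owns the queue, there is no completion-signal problem; the stream version trades this central vantage point for backpressured, distributed buffering.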
Another option would be to add a "poison pill" element that is sent as the last element of the original stream; once it is observed at the feedback point, it should close the stream. We don't have a built-in takeUntil stage right now, but it is not hard to construct one.
How do I close the stream manually when the poison pill element is received?
You only need to close the feedback loop, since the other port of the merge has been closed. You can do this by adding a custom stage as the last stage of the feedback arc (before the preferred port) that just finishes the stream when it sees the poison pill.
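A minimal illustration of that poison-pill idea, modeled with a plain Scala `Iterator` rather than an actual Akka Streams stage (the names `Item`, `Page`, and `PoisonPill` are hypothetical): the feedback arc ends in a stage that completes as soon as the sentinel is seen, which `takeWhile` captures here.

```scala
// Hypothetical model of the feedback arc as a sequence of work items
// terminated by a sentinel. A real solution would put the same
// predicate into a custom Akka Streams stage placed just before the
// merge's preferred port.
sealed trait Item
final case class Page(url: String) extends Item
case object PoisonPill extends Item

def closeOnPoisonPill(feedback: Iterator[Item]): List[Item] =
  feedback.takeWhile(_ != PoisonPill).toList // "completes the stream" at the marker
```

In a real graph the stage would complete its stage (and thereby the cycle) instead of merely truncating a collection, but the predicate is the same.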
@drewhk in the simplest case the original stream may only contain one element (the "anchor" to start scraping). If the poison pill were the next element, the stream would be closed instantly.
Ah, ok, it is adding elements recursively. Btw, if the depth limit of the recursion is static, then the best approach would be to unroll the loop. Iteration is hard to express with the current stages we have; I had some ideas around one year ago, but they were never implemented.
Well, to be more precise: (nested) iteration is easy to express (it doesn't need cycles in the graph); unbounded-depth iteration is the hard one.
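The "unroll the loop" suggestion for a static recursion depth can be sketched in plain Scala (a hypothetical helper; with Akka Streams this would correspond to chaining the same processing stage a fixed number of times instead of wiring a cycle):

```scala
// Hypothetical unrolling helper: instead of feeding results back
// through a cycle, apply the same step a fixed number of times.
// In a stream graph this would be N repeated .via(step) stages.
def unroll[A](step: A => A, depth: Int): A => A =
  Function.chain(List.fill(depth)(step))
```

Because the unrolled pipeline is acyclic, completion propagates through it normally and the onComplete deadlock described in this issue cannot occur.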
This is not a bug per se, closing.
Hello Akka-Team,
As proposed by Endre on the Akka mailing list, I'm creating this issue.
I'm currently building a scraper system on top of Akka Streams. I have written a Flow that is able to follow paginated sites and scrape them in a loop.
For this I use a feedback loop with Merge/Unzip. As Endre stated, the stream does not deadlock on elements/backpressure, but it deadlocks on the completion signal.
When all URLs are processed the stream never completes; onComplete never gets invoked.
Is this the expected behaviour?
You can find a sample project with a Spec that demonstrates the problem here