PartitionTable backoff on errors #247

Closed
R053NR07 opened this issue Apr 15, 2020 · 4 comments

@R053NR07

As noted in #239 (comment) by @mattburman, the CatchupForever function just sleeps for some time and then retries if an error occurs. An enhancement would be to use a backoff instead and make it configurable by the user.

I propose a solution for that in #246.
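
A minimal sketch of what a user-configurable backoff could look like, for readers unfamiliar with the idea. The `Backoff` interface below is an assumption modeled on the `Duration` and `Reset` calls mentioned later in this thread; it is not copied from #246, and `expBackoff` is a hypothetical implementation.

```go
package main

import (
	"fmt"
	"time"
)

// Backoff is a hypothetical interface: Duration returns the next wait time,
// Reset starts the sequence over after a successful attempt.
type Backoff interface {
	Duration() time.Duration
	Reset()
}

// expBackoff is an illustrative exponential backoff capped at max.
type expBackoff struct {
	min, max time.Duration
	factor   float64
	attempt  int
}

// Duration returns min * factor^attempt, capped at max, and advances the attempt counter.
func (b *expBackoff) Duration() time.Duration {
	d := float64(b.min)
	for i := 0; i < b.attempt; i++ {
		d *= b.factor
		if d >= float64(b.max) {
			d = float64(b.max)
			break
		}
	}
	b.attempt++
	return time.Duration(d)
}

// Reset restarts the sequence at min.
func (b *expBackoff) Reset() { b.attempt = 0 }

func main() {
	var b Backoff = &expBackoff{min: 100 * time.Millisecond, max: time.Second, factor: 2}
	for i := 0; i < 5; i++ {
		fmt.Println(b.Duration()) // waits grow until they reach the cap
	}
	b.Reset()
	fmt.Println(b.Duration()) // back to the minimum after Reset
}
```

Any type with these two methods could be swapped in, which is what makes the retry delay configurable rather than a hard-coded sleep.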

@frairon
Contributor

frairon commented Apr 15, 2020

@mattburman one question regarding the expected and intended behavior you described in #239:
Before (in the current master), the view.Run method would return when there was an error. If you specified the option WithViewRestartable, it would retry until you called Terminate.
The new implementation is similar: it either returns on error or, when the option is set, retries. The difference is that to stop the view from retrying, the context needs to be cancelled.
It sounds like the View in your project is reconnecting even though you did not specify the restartable option. If that is the case, it might be a bug we should look into further.
If not, then Jan's solution might be just what you need, since it gives you an option for the backoff.

@mattburman

So, just to clarify:

We have a service on v0.1.3 (one version before master?) that uses WithViewRestartable and implements custom logic around view.Run. We found that view.Run would return an error if, for example, the Kafka cluster was stopped. We assumed WithViewRestartable just meant you could call view.Run again. We basically had a goroutine that looped, retrying view.Run with backoff whenever it returned an error.
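
The external retry loop described above could be sketched roughly as follows. This is not our actual service code and does not import goka: `run` is a hypothetical stand-in for view.Run, and `retryRun`, `doubling`, and `demo` are illustrative names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retryRun calls run until it succeeds or ctx is cancelled, waiting next()
// between failed attempts. It returns the number of attempts made.
func retryRun(ctx context.Context, run func(context.Context) error, next func() time.Duration) (int, error) {
	attempts := 0
	for {
		attempts++
		if err := run(ctx); err == nil {
			return attempts, nil
		}
		select {
		case <-ctx.Done():
			// The caller gave up; stop retrying.
			return attempts, ctx.Err()
		case <-time.After(next()):
			// Waited out the backoff; try again.
		}
	}
}

// doubling returns a delay function that starts at start and doubles up to max.
func doubling(start, max time.Duration) func() time.Duration {
	d := start
	return func() time.Duration {
		cur := d
		if d *= 2; d > max {
			d = max
		}
		return cur
	}
}

// demo simulates a run function (standing in for view.Run) that fails twice
// before succeeding.
func demo() (int, error) {
	calls := 0
	flaky := func(ctx context.Context) error {
		calls++
		if calls < 3 {
			return errors.New("kafka unavailable")
		}
		return nil
	}
	return retryRun(context.Background(), flaky, doubling(time.Millisecond, 10*time.Millisecond))
}

func main() {
	attempts, err := demo()
	fmt.Println("attempts:", attempts, "err:", err)
}
```

In a real service this loop would run in its own goroutine, so request handling continues while the view is down.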

We are now writing a wrapper around the goka view/emitter to encapsulate some of these things (retries, Prometheus metrics), this time using the latest beta version. In the latest beta, view.Run no longer returns an error and instead retries internally. And now #246 adds the backoff interface, so we can just provide our previous http://github.com/jpillora/backoff instead of calling Reset and Duration ourselves.

@frairon
Contributor

frairon commented Apr 16, 2020

Oh, now I realize how the old behavior worked; I wasn't really aware of that. The reasons for the old behavior were historical, I guess. Initially we had problems with the Kafka connection and needed the services to continue serving requests even if the views were down. So the easy solution was to keep the storage open and retry externally.

Sorry for the behavior change, that surely needs some documentation improvement :-/
Still I think the new behavior makes more sense.
Thanks for clarifying that!

@R053NR07
Author

Implemented in #246
