Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Http-service to check bookie sanity state #3630

Merged
merged 3 commits into from Nov 25, 2022

Conversation

rdhabalia
Copy link
Contributor

Motivation

Right now, if Bookie startup or process fails due to various reasons then bookie getState() rest API still gives a successful response even though bookie writes are not going through. Therefore. Bookie should have a REST API to check bookie sanity test and that can be used to check bookie liveliness in k8/helm-chart or to plot bookie health graph.

Changes

Add Bookie REST API to check bookie health to validate bookie liveliness

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we need this feature.

I have only one comment:
Could we apply some throttling or, better, try to run only one check at a time: if there is an ongoing sanity check, we wait for the current check and return the result ?

@rdhabalia
Copy link
Contributor Author

good point. made a change to run max (1) concurrent requests.

@rdhabalia
Copy link
Contributor Author

@eolivelli can you PTAL.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have left one minor issue please check my comment.
Then the patch is good to go from my POV.
And we should release this with 4.16.

@hangc0276 FYI

try {
// allow max concurrent request as sanity-test check relatively
// longer time to complete
lock.acquire();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please wait at most for X seconds (5?) and if the semaphore cannot be acquired fail the request. We don't want many requests piling up.

Please also fix formatting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@hangc0276
Copy link
Contributor

I have left one minor issue please check my comment. Then the patch is good to go from my POV. And we should release this with 4.16.

@hangc0276 FYI

@eolivelli Sure

Copy link
Contributor

@hangc0276 hangc0276 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM.

@hangc0276 hangc0276 added this to the 4.16.0 milestone Nov 14, 2022
@hangc0276 hangc0276 merged commit c2e59e1 into apache:master Nov 25, 2022
yaalsn pushed a commit to yaalsn/bookkeeper that referenced this pull request Jan 30, 2023
### Motivation

Right now, if Bookie startup or process fails due to various reasons then bookie getState() rest API still gives a successful response even though bookie writes are not going through. Therefore. Bookie should have a REST API to check bookie sanity test and that can be used to check bookie liveliness in k8/helm-chart or to plot bookie health graph.

### Changes

Add Bookie REST API to check bookie health to validate bookie liveliness
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants