[broker] Handle BadVersionException thrown by updateSchemaLocator() #6683
/pulsarbot run-failure-checks |
I am seeing this exact issue while testing the 2.5.1 candidate load:
It happens every time the check is run and the topic doesn't already exist. So, the first time it fails like this and then it will work until the topic is deleted, which will happen automatically after a while with default broker settings. I think it would be a good idea to cherry-pick this fix to 2.5.1 since it makes it look like the broker is broken when the user requests a heartbeat in Pulsar Manager. This is also a regression compared to 2.5.0 which doesn't have this issue. |
This problem is worse with Docker images built using the 2.5.1 candidate. The health check fails every time, not just the first time. It's going to look really bad in Pulsar Manager. @tuteng @sijie I think this fix should be added to the 2.5.1 release or the PR that caused the behavior to change from 2.5.0 should be pulled out. |
@cdbartholomew thank you for reporting this. Since rc2 is canceled to include AVRO related changes, we will include some other issues that were labeled for 2.5.2 (including this one) in the new RC. |
Just to clarify - I don't think there are any regressions between 2.5.0 and 2.5.1. The problem exists in 2.5.0. #5563 only addressed some problems. |
The health check problem does not exist in 2.5.0, so there is definitely a regression here. |
@cdbartholomew I have just reproduced the issue in 2.5.0. thanks
|
@sijie Let me clarify. The fact that the health check never works when running with Docker images built from the 2.5.1 candidate is a regression. I assumed that it was related to this issue, but it is not. I just built new Docker images with this fix (v2.5.1-candidate-2 + 6683), and it doesn't resolve that issue. So, there is a different issue here. I will test with master to see if the health check works there. |
I don't understand "never" here. The current buggy behavior is - it works when the topic exists, and it will fail the first time when the topic was deleted due to inactivity. I have just verified that branch-2.5 + this fix has fixed the buggy behavior described above.
If you are seeing health check failures using Docker images, I don't think it is a problem with the code. It sounds more like a problem with your Docker image, because the broker code shouldn't behave any differently running locally than running in Docker. |
This is the error:
This is when calling the endpoint from within the broker container. |
|
I am building the docker images using:
Has anyone else tested images built from this tag? |
Add label release-2.5.1 |
…tor() (apache#6683) Co-authored-by: Sijie Guo <sijie@apache.org>
…tor() (apache#6683) Co-authored-by: Sijie Guo <sijie@apache.org>(cherry picked from commit cf045e4)
putSchema() still throws KeeperException.BadVersionException when the health check API is requested, even after #5563 was merged. After GC, the health check API returns ok, but putSchema() throws KeeperException.BadVersionException. (With #6577, the client reconnects when it gets the exception, so the health check API returns ok.)
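For readers unfamiliar with this kind of failure, here is a minimal sketch of a version-checked write and one common way to handle a stale version: re-read and retry. It is not the code from this PR and not the Pulsar or ZooKeeper API. VersionedStore, its BadVersionException, and updateWithRetry are hypothetical names invented for illustration, and retrying is an assumed handling strategy, not necessarily what this fix does.

```java
import java.util.function.UnaryOperator;

/** Hypothetical in-memory stand-in for a versioned metadata node; not Pulsar or ZooKeeper code. */
final class VersionedStore {

    /** Hypothetical unchecked analogue of a "bad version" error raised on a stale write. */
    static final class BadVersionException extends RuntimeException {
        BadVersionException(int expected, int actual) {
            super("expected version " + expected + " but current version is " + actual);
        }
    }

    /** Immutable snapshot of a value together with the version it was read at. */
    static final class Versioned {
        final String value;
        final int version;

        Versioned(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private Versioned current = new Versioned("", 0);

    synchronized Versioned read() {
        return current;
    }

    /** Writes only if the caller's expected version matches; bumps the version on success. */
    synchronized Versioned write(String value, int expectedVersion) {
        if (expectedVersion != current.version) {
            throw new BadVersionException(expectedVersion, current.version);
        }
        current = new Versioned(value, current.version + 1);
        return current;
    }

    /** Re-reads and retries the update when the version check fails, up to maxAttempts times. */
    static Versioned updateWithRetry(VersionedStore store, UnaryOperator<String> update, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        BadVersionException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Versioned seen = store.read(); // always start from the latest version
            try {
                return store.write(update.apply(seen.value), seen.version);
            } catch (BadVersionException e) {
                last = e; // another writer got in first; loop to re-read and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        Versioned stale = store.read();            // snapshot taken at version 0
        store.write("schema-v1", stale.version);   // a first write succeeds and bumps the version
        try {
            store.write("schema-v2", stale.version); // reusing the old snapshot's version is rejected
        } catch (BadVersionException e) {
            System.out.println("stale write rejected: " + e.getMessage());
        }
        Versioned result = updateWithRetry(store, old -> old + "+v2", 3);
        System.out.println(result.value + " @ version " + result.version);
    }
}
```

The point of the sketch is only that a writer holding an outdated version gets rejected instead of silently overwriting newer data, so the caller has to decide whether to retry, reconnect, or surface the error.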