
[RHPAM-263] fix for Liveness probe for Business Central #40

Merged 1 commit into jboss-container-images:rhpam70-dev on Apr 20, 2018

Conversation

calvinzhuca
Contributor

Signed-off-by: calvin zhu <calvinzhu@hotmail.com>

Thanks for submitting your Pull Request!

Please make sure your PR meets the following requirements:

  • Pull Request title is properly formatted: [CLOUD-XYA] Subject
  • Pull Request contains link to the JIRA issue
  • Pull Request contains description of the issue
  • Pull Request does not include fixes for issues other than the main ticket
  • Attached commits represent units of work and are properly formatted
  • You have read and agreed to the Developer Certificate of Origin (DCO) (see CONTRIBUTING.md)
  • Every commit contains Signed-off-by: Your Name <yourname@example.com> - use git commit -s

@calvinzhuca
Contributor Author

https://issues.jboss.org/browse/RHPAM-263

Replace the probe script with an HTTP request to Business Central's home page.
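
A minimal sketch of what such an httpGet liveness probe looks like in an OpenShift template; the path, port, and delay below are illustrative, not necessarily the exact values merged in this PR:

```yaml
livenessProbe:
  httpGet:
    path: /          # Business Central home page (illustrative path)
    port: 8080       # default HTTP port of the Workbench container (assumed)
  initialDelaySeconds: 180
  timeoutSeconds: 2
```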

@errantepiphany self-requested a review on April 19, 2018 18:17
Collaborator

@errantepiphany left a comment


@calvinzhuca, can you please also apply a similar change to the new templates/rhpam70-prod.yaml template that was recently added? Thanks!

@errantepiphany
Collaborator

@calvinzhuca When you're done, can you please squash your commits into one? Thanks again!

Signed-off-by: calvin zhu <calvinzhu@hotmail.com>
@errantepiphany merged commit 34b76c9 into jboss-container-images:rhpam70-dev on Apr 20, 2018
@sutaakar
Contributor

@calvinzhuca @errantepiphany Just a question, isn't a 180-second initial delay too much? From my own experience the Workbench usually starts in around 60-70 seconds. Wouldn't it make more sense to set the delay to 120 seconds, to start the Workbench pod faster while still keeping some time buffer?

@errantepiphany
Collaborator

@sutaakar I would be okay with that if you want to submit a PR which adjusts the time. @calvinzhuca , what do you think?

@calvinzhuca
Contributor Author

I think 180 seconds is a reasonable time; I discussed this value with Babak.
We want to give enough slack to cover situations where a client might have a slow box or lots of things to load during boot.
For example, EAP 7 has a setting, jboss.as.management.blocking.timeout, whose default value is 300 seconds, meaning it times out server startup after 5 minutes, even though EAP usually takes only 1 or 2 minutes to come up.

Leaving the time a little longer also benefits the client if anything goes wrong: they have enough time to check the log before the pod is restarted automatically.
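
For reference, a sketch of how that EAP timeout could be raised for a slow box from the template, assuming the image honors the JAVA_OPTS_APPEND environment variable (common in EAP-based images, but an assumption here):

```yaml
env:
  - name: JAVA_OPTS_APPEND    # assumed to be appended to the server's JVM options
    value: "-Djboss.as.management.blocking.timeout=600"  # default is 300 seconds
```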

@sutaakar
Contributor

@calvinzhuca Ok. In that case, wouldn't it make sense to create a custom probe to check readiness and liveness, as mentioned in the JIRA issue? The current implementation forces the user to wait for some time even when the Workbench has actually started.
The probe approach seems more flexible to me, allowing the user to start working with the Workbench as soon as it is ready, while still providing enough time to check what is wrong in case the Workbench doesn't start, rather than being bound to some specific time.
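
Sketched out, the suggested split would look roughly like this (illustrative values; the readiness probe gates traffic as soon as the Workbench responds, while the liveness probe only restarts a pod that never comes up):

```yaml
readinessProbe:
  httpGet:
    path: /          # illustrative path, as above
    port: 8080
  initialDelaySeconds: 30   # start checking early; traffic flows once it passes
  periodSeconds: 5
livenessProbe:
  httpGet:
    path: /
    port: 8080
  initialDelaySeconds: 180  # generous budget before a restart is considered
  timeoutSeconds: 2
```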

@bmozaffa
Contributor

@calvinzhuca @errantepiphany @sutaakar I actually had concerns about this and negotiated Calvin down from 5 mins to 180 seconds :-). Now that I've looked further into this and heard Karel's concern, I agree. We don't want to kill the pod if it hasn't started in 120 seconds, but we also shouldn't have to wait an extra minute or more to use it when it's already ready. Instead of a high initialDelaySeconds, we can use higher values for failureThreshold and timeoutSeconds to avoid unnecessary pod restarts, without preventing access to pods that are ready to use.
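
A sketch of that tuning with illustrative numbers: a pod that is up at 60 seconds passes the first probe immediately, while a pod that never starts is restarted only after roughly initialDelaySeconds + failureThreshold * periodSeconds = 60 + 8 * 15 = 180 seconds:

```yaml
livenessProbe:
  httpGet:
    path: /          # illustrative path, as above
    port: 8080
  initialDelaySeconds: 60   # low enough that a fast start is not penalized
  periodSeconds: 15
  failureThreshold: 8       # 8 consecutive failures before a restart
  timeoutSeconds: 5         # per-probe timeout, raised for slow responses
```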

@calvinzhuca deleted the RHPAM-263 branch on April 24, 2018 17:50