With too many apb's bootstrap causes the route to timeout. #876

Closed
jmontleon opened this issue Apr 4, 2018 · 14 comments
Labels
3.12 | release-1.4 Kubernetes 1.12 | Openshift 3.12 | Broker release-1.4

Comments

@jmontleon
Contributor

jmontleon commented Apr 4, 2018

Bug:

Pretty low priority at this point.

With too many APBs, bootstrap causes the route to time out.

What happened:

apb bootstrap
Contacting the ansible-service-broker at: https://asb-1338-ansible-service-broker.172.17.0.1.nip.io/ansible-service-broker/v2/bootstrap
Error: Attempt to bootstrap Broker returned status: 504
Unable to bootstrap Ansible Service Broker.

What you expected to happen:
Bootstrap completes.

How to reproduce it:
Populate a large number of APBs. In my case I created 150 service templates in CFME while running a simple test of the adapter. Those were in addition to the ones available at docker.io/ansibleplaybookbundle, which were also being loaded.

Bootstrap actually completes, but the route times out. I'd guess the default is ~10-30 seconds. My bootstrap is probably taking closer to a minute.

oc annotate route -n ansible-service-broker asb-1338 --overwrite haproxy.router.openshift.io/timeout=300s stops it from happening.

We could probably add the annotation to the route we create, with a reasonable setting higher than the default. Otherwise, as users accumulate more than a handful of APBs, we may start seeing reports of this.

@rthallisey rthallisey added bug 3.10 | release-1.2 Kubernetes 1.10 | Openshift 3.10 | Broker release-1.2 labels Apr 10, 2018
@SaravanaStorageNetwork
Member

SaravanaStorageNetwork commented Apr 13, 2018

@rthallisey @jmontleon Thanks! It helped: I was able to run bootstrap only after extending the timeout as mentioned.

@jwmatthews
Member

I'm assuming the correct fix here is to make this an async call. Thoughts?
If we want to make this async, I assume we would align it to 3.11 to limit churn in 3.10.

@jmrodri
Contributor

jmrodri commented Apr 16, 2018

@jwmatthews yes if the bootstrap is taking too long, we should spawn it off as we start.

@jmontleon
Contributor Author

jmontleon commented Apr 16, 2018

@jwmatthews @jmrodri @dymurray when running apb bootstrap, it waits for the bootstrap to complete and then also does a relist so the catalog sees the updates as well.

If it were changed to async, would that behavior have to change in the client tool?

@jmrodri
Contributor

jmrodri commented Apr 16, 2018

@jmontleon I misread the original complaint about apb bootstrap. I thought it was during broker startup. Hrm, that makes it a bit more complicated. Making it async would mean the bootstrap call returns immediately, but we'd need a way to get back the same information.

@jmontleon
Contributor Author

We could probably annotate the route we create with a 1 or 2 minute timeout for now. I think it would cover most cases for the time being. If we ever see someone with hundreds of APBs we might have to readdress it, but even close to 200 was returning for me in under a minute.

@jwmatthews
Member

Example @jmontleon shared:

oc annotate route -n ansible-service-broker asb-1338 --overwrite haproxy.router.openshift.io/timeout=300s

@jmrodri jmrodri self-assigned this May 17, 2018
@jmrodri jmrodri added 3.11 | release-1.3 Kubernetes 1.11 | Openshift 3.11 | Broker release-1.3 and removed 3.10 | release-1.2 Kubernetes 1.10 | Openshift 3.10 | Broker release-1.2 labels May 17, 2018
@jmrodri
Contributor

jmrodri commented Jul 24, 2018

Fixed by PR #1008

@jmrodri jmrodri closed this as completed Jul 24, 2018
@wenchma

wenchma commented Aug 8, 2018

@jwmatthews unfortunately, the 504 error did not go away after increasing the timeout as you mentioned.
😭

# svcat get brokers
           NAME                                                        URL                                              STATUS  
+-------------------------+-------------------------------------------------------------------------------------------+--------+
  ansible-service-broker    https://automation-broker.automation-broker.svc:1338/automation-broker/                     Ready   
  template-service-broker   https://apiserver.openshift-template-service-broker.svc:443/brokers/template.openshift.io   Ready   

# apb push --broker https://automation-broker.automation-broker.svc:1338/automation-broker
version: 1.0
name: my-new-apb
description: This is a sample application generated by apb init
bindable: False
async: optional
metadata:
  displayName: my-new
  dependencies: []
plans:
  - name: default
    description: This default plan deploys my-new-apb
    free: True
    metadata: {}
    parameters: []
Found registry IP at: 172.30.138.73:5000
Finished writing dockerfile.
Building APB using tag: [172.30.138.73:5000/openshift/my-new-apb]
Successfully built APB image: 172.30.138.73:5000/openshift/my-new-apb
Found image: docker-registry.default.svc:5000/openshift/my-new-apb
Warning: Tagged image registry prefix doesn't match. Deleting anyway. Given: 172.30.138.73:5000; Found: docker-registry.default.svc:5000
Successfully deleted sha256:906da7a2ddb5cbb925f2cc27b9199d396c370c81322a36c5cfe774985b70718f
Pushing the image, this could take a minute...
Successfully pushed image: 172.30.138.73:5000/openshift/my-new-apb
Contacting the ansible-service-broker at: https://automation-broker.automation-broker.svc:1338/automation-broker/v2/bootstrap
Error: Attempt to bootstrap Broker returned status: 504
Unable to bootstrap Ansible Service Broker.

@jmontleon
Contributor Author

There are a couple of PRs pending to fix this. Trying to get them merged:
automationbroker/automation-broker-apb#27
#1033

@wenchma can you describe the asb route and verify the annotation is there? Is it actually taking 5 minutes (300s) to return now if you set that value? If that doesn't seem right maybe we're hitting something else.

@jmontleon jmontleon reopened this Aug 8, 2018
@wenchma

wenchma commented Aug 9, 2018

@jmontleon thanks for your reply, I guess I need a longer timeout.

@jmontleon
Contributor Author

jmontleon commented Aug 9, 2018

As mentioned, there might also be something else happening that we haven't run into before. Five minutes seems like an awfully long time to bootstrap unless you are loading a huge number of APBs. I suppose it is possible, but I only managed to get to around 60s by creating a couple hundred APBs.

@wenchma

wenchma commented Aug 15, 2018

@jmontleon it looks like the 5-minute timeout did not take effect. It clearly returned the 504 error in less than 5 minutes.

@jmrodri jmrodri added 3.12 | release-1.4 Kubernetes 1.12 | Openshift 3.12 | Broker release-1.4 and removed 3.11 | release-1.3 Kubernetes 1.11 | Openshift 3.11 | Broker release-1.3 labels Sep 14, 2018
@jmrodri
Contributor

jmrodri commented Nov 12, 2018

Closing as this has been fixed.

@jmrodri jmrodri closed this as completed Nov 12, 2018
6 participants