This repository has been archived by the owner on Jul 23, 2020. It is now read-only.

Error in OpenShift Online deployment not reported to user in OpenShift.io if quota exceeded #2629

Closed
ldimaggi opened this issue Mar 15, 2018 · 29 comments · Fixed by fabric8-ui/fabric8-ui#3469

Comments

@ldimaggi
Collaborator

Steps to recreate:

  • Reset user environment
  • Create a new quickstart, observe deployment to stage and run, verify endpoints are reachable on both stage and run
  • Create a 2nd new quickstart, observe deployment to stage and run, verify endpoints are reachable on both stage and run
  • Create a 3rd new quickstart, observe deployment to stage and run are reported in the pipeline view:

screenshot 20

  • Verify that the endpoints are not reachable for the 3rd quickstart - on both stage and run
  • Navigate to the deployment display - observe that the endpoints' pods are not started

screenshot 19

Can we add some notification to the user at the point in time when the attempt to deploy to stage or run fails? The error is available in the events for the endpoint's pod in OSO:

screenshot 21 1

@joshuawilson
Member

This happens because the first 2 apps use up all the quota and are never shut down. We might need to tell users that their quota is full and that they need to turn off the apps they are not using.

This gets to be very bad if they "remove" the app from OSIO but the children are not removed from OSO. They continue to run but now the user can't see them in OSIO.

@joshuawilson
Member

Workaround is to reset env on profile page.

@joshuawilson
Member

@catrobson once this is confirmed by SD team, we will need some direction in what and how to tell the user.

@catrobson
Collaborator

@joshuawilson we have a design to warn the user and not allow them to add more deployments (applications) here: https://redhat.invisionapp.com/share/PMDCE3G94

This was not carried into the launch wizard design since the functionality wasn't prioritized for implementation, but we could easily do the same thing in the new visual design.

@catrobson
Collaborator

I think this also relates to deleting a space, since we leave stuff in OSO but the user might not realize that and then question why they are hitting limits.

@maxandersen
Collaborator

Is there a way in the launcher flow to check for available resources and warn the user if it's getting "crowded"?

@catrobson
Collaborator

@maxandersen We had a solution (linked above), but we removed it from the launcher design since there was no backend support for this capability. If we can support it, we can add that element back into the launcher design. Please let me know and I'll update the designs accordingly.

@joshuawilson
Member

I think we need to prioritize this. If it weren't for the Env Reset this might be a SEV1.

@catrobson
Collaborator

After speaking with @joshuawilson we decided this was bigger than just the launcher - this is about informing users throughout the system about quota limitations. I have created the following UXD story to capture the work needed around this: fabric8-ui/fabric8-ux#932

@joshuawilson
Member

The Deployment API could or should provide quota information to inform any component that needs to know if the quota has been maxed out.

@jiekang
Collaborator

jiekang commented Mar 19, 2018

The Deployments API can be queried for quotas of:

  • per environment (run, stage)
  • per deployment (instances of an application in stage/run)

@andrewazores
Collaborator

andrewazores commented Mar 19, 2018

^ for ease of reference:

https://github.com/fabric8-ui/fabric8-ui/blob/af70017347c0da12f185a9920bf047e90e57e32b/src/app/space/create/deployments/services/deployments.service.ts#L384

This links to the function that returns the used vs. available (quota) memory for an environment. The other relevant functions are adjacent, within that same class.

Worth mentioning that this service was designed with a poll cycle and a push model for subscribers, so Observables returned by those functions do not necessarily issue an immediate HTTP request. If this is going to be reused for other parts of the UI then we may need to make some modifications or expose another function to get access to the same data with an on-demand pulling sort of model, depending on what kind of new UX is going to consume this outside of the Deployments page.
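To make the distinction between the two access models concrete, here is a minimal sketch. It is not the actual `deployments.service.ts` code: the `QuotaFeed` class, the `MemoryStat` shape, and the synchronous `Fetcher` stand-in for the HTTP call are all hypothetical, and the real service uses Observables rather than plain callbacks.

```typescript
// Hypothetical shape; the real service's types are not shown in this thread.
interface MemoryStat {
  used: number;
  quota: number;
}

// Stand-in for the HTTP request. Synchronous only to keep the sketch short.
type Fetcher = () => MemoryStat;

class QuotaFeed {
  private readonly fetcher: Fetcher;
  private listeners: Array<(stat: MemoryStat) => void> = [];

  constructor(fetcher: Fetcher) {
    this.fetcher = fetcher;
  }

  // Push model: subscribing issues no request; data arrives on the next poll tick.
  subscribe(listener: (stat: MemoryStat) => void): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // One poll cycle: a single request fans out to every subscriber.
  // A real service would drive this from a timer.
  tick(): void {
    const stat = this.fetcher();
    this.listeners.forEach((l) => l(stat));
  }

  // Pull model: request on demand, independent of the poll cycle.
  fetchNow(): MemoryStat {
    return this.fetcher();
  }
}
```

A consumer outside the Deployments page, such as a launcher pre-check, would probably want something like `fetchNow()`, since it needs an answer at a specific moment rather than at the next poll tick.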

@catrobson
Collaborator

catrobson commented Mar 19, 2018

From March 19 platform meeting, we agreed that:

  • Will add Delete API which actually deletes all components of a space in OSIO. @xcoulon to communicate this to UX team when complete.
  • Deployments team already has API to use for checking quota utilization (https://github.com/fabric8-ui/fabric8-ui/blob/master/src/app/space/create/deployments/services/deployments.service.ts#L384) that the UX team can use
  • Still need to identify if launcher knows the size of boosters and can provide that through an API. For now, we set up a standard preset limit in OSIO for all boosters, and this will suffice for the first implementation of user feedback on quota limits.
  • There is no way to delete an application (or view your application in a sensible way beyond what we have on the deployments page). This is an outstanding issue that we won’t address right now.
  • Pipeline page does not show failure to deploy when quota is hit. This also needs to be fixed (there is an outstanding bug), and we’ll update any design artifacts around how this should be reflected as part of fabric8-ui/fabric8-ux#932
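As an illustration of how a single preset booster size could drive the first round of user feedback, here is a small sketch. The 256 MiB value, the `ResourceUsage` shape, and the `hasRoomFor` helper are assumptions made for this example; the thread does not state the actual preset.

```typescript
// Hypothetical shape for one resource's usage within an environment.
interface ResourceUsage {
  used: number; // bytes currently in use
  quota: number; // bytes allowed
}

// Assumed placeholder: the actual preset limit is not given in this thread.
const PRESET_BOOSTER_MEMORY_BYTES = 256 * 1024 * 1024;

// Returns true when one more booster of the preset size still fits in the quota.
function hasRoomFor(
  usage: ResourceUsage,
  requestedBytes: number = PRESET_BOOSTER_MEMORY_BYTES
): boolean {
  return usage.used + requestedBytes <= usage.quota;
}

// 250 MiB used out of 1 GiB leaves room for another 256 MiB booster.
console.log(hasRoomFor({ used: 262144000, quota: 1073741824 })); // prints true
```

If launcher later exposes real booster sizes through an API, only the `requestedBytes` argument would need to change.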

@ebaron
Collaborator

ebaron commented Mar 19, 2018

If you want to use the backend API directly to get environment quota/usage, here's an example:

$ curl -H "Authorization: Bearer $MY_TOKEN" \
https://api.openshift.io/api/deployments/environments/run
{"data":{"attributes":{"name":"run","quota":{"cpucores":{"quota":2,"used":0.488},
"memory":{"quota":1073741824,"units":"bytes","used":262144000}}},
"id":"run","type":"environment"}}
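For anyone consuming this response from the UI, a sketch of how it could be turned into utilization ratios follows. The interface names and the `summarize` helper are hypothetical; only the field names are taken from the response above, and the sample string is that response joined onto one line.

```typescript
// Field names mirror the response above; the interface names are made up for this sketch.
interface QuotaStat {
  quota: number;
  used: number;
  units?: string;
}

interface EnvironmentResponse {
  data: {
    attributes: {
      name: string;
      quota: { cpucores: QuotaStat; memory: QuotaStat };
    };
    id: string;
    type: string;
  };
}

// Fraction of the quota in use; a zero quota is treated as fully used.
function utilization(stat: QuotaStat): number {
  return stat.quota > 0 ? stat.used / stat.quota : 1;
}

function summarize(body: string): { name: string; cpu: number; memory: number } {
  const env = JSON.parse(body) as EnvironmentResponse;
  const { name, quota } = env.data.attributes;
  return {
    name,
    cpu: utilization(quota.cpucores),
    memory: utilization(quota.memory),
  };
}

const sample =
  '{"data":{"attributes":{"name":"run","quota":{"cpucores":{"quota":2,"used":0.488},' +
  '"memory":{"quota":1073741824,"units":"bytes","used":262144000}}},' +
  '"id":"run","type":"environment"}}';

const summary = summarize(sample);
console.log(summary.name, summary.cpu.toFixed(3), summary.memory.toFixed(3));
// prints: run 0.244 0.244
```

In this sample both CPU and memory sit at roughly 24% of quota, so a warning threshold (for example 80%, an arbitrary choice) would not fire.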

@ldimaggi
Collaborator Author

I missed the meeting due to a conflict on this point:
'...will add Delete API which actually deletes all components of a space in OSIO...'

Will this delete API remove the corresponding resources in OSO (build configs, deployment configs, etc)? Thx!

@catrobson
Collaborator

@ldimaggi

Will this delete API remove the corresponding resources in OSO (build configs, deployment configs, etc)?

Yes, this is the intent.

@ldimaggi
Collaborator Author

Perfect! We will want to make use of that in automated tests.

@ldimaggi
Collaborator Author

ldimaggi commented Mar 20, 2018

I wanted to add a couple of additional data points to this issue as the patterns in which quota-related errors are seen are consistent:

  • A user is able to create/deploy/stage/run a maximum of (2) quickstarts before quota-related errors occur.
  • When the user creates (3-to-5) quickstarts, the errors described earlier in this issue are seen. The quickstarts endpoints are not available, and no error is returned to the user in the OSIO UI.
  • If the user attempts to create a 6th quickstart, the service count quota for Jenkins will be exceeded. This error is reported to the user in the OSIO UI here:

builderror

The error is also reported in the Jenkins build log and the Jenkins pod:

Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at:
 https://kubernetes.default/api/v1/namespaces/ldimaggi-stage/services. Message:
 Forbidden!Configured service account doesn't have access. Service account may have been revoked.
 services "march20f" is forbidden: exceeded quota: object-counts, requested: services=1, used:
 services=5, limited: services=5.
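Since the quota details are embedded in the exception message, a UI that wants to show something friendlier could extract them. The sketch below assumes the message keeps the shape shown above and names a single resource; `parseQuotaViolation` and `QuotaViolation` are hypothetical names, and messages that list several resources are deliberately not handled (the function returns `null`).

```typescript
interface QuotaViolation {
  quotaName: string; // e.g. "object-counts"
  resource: string; // e.g. "services"
  requested: string;
  used: string;
  limited: string;
}

// Matches the single-resource form of the message seen in the log above.
const QUOTA_PATTERN =
  /exceeded quota: ([^,]+), requested: ([^=,]+)=([^,]+), used: [^=,]+=([^,]+), limited: [^=,]+=([^,\s]+?)\.?(?:\s|$)/;

function parseQuotaViolation(message: string): QuotaViolation | null {
  // Log output may wrap the message across lines, so collapse whitespace first.
  const flat = message.replace(/\s+/g, " ");
  const m = QUOTA_PATTERN.exec(flat);
  if (m === null) {
    return null;
  }
  return {
    quotaName: m[1],
    resource: m[2],
    requested: m[3],
    used: m[4],
    limited: m[5],
  };
}
```

For the message above this yields quota `object-counts`, resource `services`, requested `1`, used `5`, limited `5`, which is enough to tell the user that the service count limit, rather than memory or CPU, was hit.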

@mceledonia
Collaborator

mceledonia commented Mar 22, 2018

@ldimaggi @maxandersen @joshuawilson @aslakknutsen @dgutride @jiekang

I'm looking for some thoughts/guidance on my requirements doc for the resource quota limits story (link). I have the beginning of a design plan mapped out and want to make sure I'm not missing anything. Please check it out when you get a chance and let me know how it looks!

https://docs.google.com/document/d/1zbLVYu5D2_v8jr46_NHW47ILM7e1E9GTulLYlcW2f5c/edit?usp=sharing

@joshuawilson
Member

I'm hopeful that between the Launch and Platform teams we can fix this.
cc @animuk

@andrewazores
Collaborator

andrewazores commented Apr 27, 2018

@joshua @mceledonia has the designs for the new starting points we discussed for this issue:

https://redhat.invisionapp.com/share/CTGJ6F7V4K9

Can we split this out into a story with two tasks (Launcher and Deployments), and mark that new story as P0 and target it for before Summit? Then the remainder of the work in the original design, with the slide-out resource usage panel and notifications system, can become future work items for after Summit with reduced severity. Those future work things will not be implementable in the next week because they require a fair amount of UI work as well as significant backend support that does not yet exist.

@joshuawilson
Member

@andrewazores yes, do it. Please.

@andrewazores
Collaborator

I have created a new user story at #3344 to track the work targeted for before Summit and marked that as P0, removing the P0 on this issue.

@joshuawilson joshuawilson added this to the Platform Backlog milestone May 29, 2018
andrewazores added a commit to andrewazores/fabric8-ui that referenced this issue Jun 19, 2018
@GeorgeActon GeorgeActon added the priority/P4 Normal label Jul 31, 2018
@stevengutz stevengutz added priority/P2 High and removed priority/P4 Normal labels Sep 24, 2018
@alexeykazakov
Member

Not a bug. Rather missing functionality.
