[Bug]: Launcher only reports Pod errors/log whereas others can happen #793
Comments
We currently look up the pod by UID. We could also look back at the Notebook and a collection of other resources, but we would need to limit what we fetch by time; otherwise past instances, or other instances that touch the same k8s items, will start flooding the log. Perhaps some UX work could handle specific use cases instead of just adding to the log... we will also need to see what cost reporting these logs will have.
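A minimal sketch of the time-limited lookup described above, assuming the events have already been fetched as plain dictionaries shaped like Kubernetes core/v1 Event objects. The helper names (`parse_ts`, `relevant_events`) and the idea of passing in a launch start time are hypothetical, not part of the dashboard code, and events that carry only `eventTime` rather than `lastTimestamp` are not handled here.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    # Assumes whole-second UTC timestamps such as "2023-01-10T09:30:00Z".
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def relevant_events(events, uids, started_at):
    """Keep events that involve one of `uids` and happened at or after `started_at`.

    `uids` would hold the UIDs of the Notebook, StatefulSet and Pod for the
    current launch, so events from past or unrelated instances are dropped.
    """
    kept = []
    for ev in events:
        if ev["involvedObject"]["uid"] not in uids:
            continue  # belongs to some other resource
        if parse_ts(ev["lastTimestamp"]) < started_at:
            continue  # left over from an earlier launch
        kept.append(ev)
    # Oldest first, so the log reads chronologically.
    return sorted(kept, key=lambda ev: parse_ts(ev["lastTimestamp"]))
```

Filtering on both the UID set and the start time is what keeps a reused Notebook name from pulling in events from a previous instance.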
I'm bumping this up -- we probably don't have time this sprint to work on it, but there are several issues from RHODS as well as this one in ODH that note the same thing. We need to be able to detect state beyond the pod logs, as the pod is not always created.
I think this issue is the same one as #1849, which was still present in 1.33 RC1. I'll retest in RC2, but there's a possibility this hasn't been fixed. @lucferbux @andrewballantyne @christianvogt
Apparently we did not catch the full scenario. We improved it somewhat and will need to revisit your feedback in #1849. Thanks for logging that issue.
@andrewballantyne I retested in RC2 and it appears to work? |
We didn't do anything -- I imagine this is a flake or something. Keep an eye out for it, but I am glad you have it working again.
Is there an existing issue for this?
Current Behavior
The notebook launcher reports errors and can display the event log from the Pod being launched.
However, other errors can prevent the notebook from starting, with the Pod itself not even being launched.
For example, a LimitRange set on the namespace can prevent the Pod from being created. In this case, the request from the Notebook Controller is simply rejected and nothing visible happens. The launcher stays on the modal window, showing "Waiting for server request to start..." indefinitely.
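To make the LimitRange case concrete, here is a small sketch of the comparison that blocks the Pod, using the 8Gi limit and 6Gi maximum mentioned in the reproduction steps. The function names are hypothetical, and the parser is an assumption that only covers whole-number quantities with binary suffixes (Ki, Mi, Gi, Ti), not the full Kubernetes quantity syntax.

```python
# Binary suffixes used by Kubernetes memory quantities.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def memory_bytes(quantity: str) -> int:
    """Convert a quantity such as "8Gi" to bytes (integer values only)."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already a byte count


def exceeds_limit_range(container_limit: str, limit_range_max: str) -> bool:
    """True when the container's memory limit is above the LimitRange maximum."""
    return memory_bytes(container_limit) > memory_bytes(limit_range_max)


# The "Small" environment asks for 8Gi while the namespace allows 6Gi.
print(exceeds_limit_range("8Gi", "6Gi"))
```

A check like this could run before the launch request is sent, so the user sees the conflict immediately instead of waiting on the modal.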
Expected Behavior
The launcher should detect and display other errors or events that happen all along the chain: Pod, StatefulSet Controller, Notebook Controller, Dashboard backend.
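One possible way to surface errors along that chain is sketched below, assuming Warning events for the current launch are available as dictionaries shaped like Kubernetes Event objects. The `CHAIN` ordering, the `launcher_status` name, and the sample event text are illustrative assumptions; the Dashboard backend is left out because it would report errors through its own channel rather than through events.

```python
# Resources to inspect, closest to the running notebook first (assumed order).
CHAIN = ["Pod", "StatefulSet", "Notebook"]

WAITING = "Waiting for server request to start..."


def launcher_status(events):
    """Return the first Warning found along the chain, or the waiting message."""
    warnings = [ev for ev in events if ev.get("type") == "Warning"]
    for kind in CHAIN:
        for ev in warnings:
            if ev["involvedObject"]["kind"] == kind:
                return f'{kind}: {ev["reason"]}: {ev["message"]}'
    return WAITING


# Illustrative event only; the reason and message text are made up for this sketch.
sample = [
    {
        "type": "Warning",
        "reason": "FailedCreate",
        "message": "pod creation rejected by LimitRange",
        "involvedObject": {"kind": "StatefulSet"},
    }
]
print(launcher_status(sample))
```

With no Pod in existence, the StatefulSet warning is what reaches the user, which is the situation this issue describes.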
Steps To Reproduce
Because the default "Small" environment sets a limit of 8Gi, which is more than the 6Gi allowed, the Pod request will be blocked.
2. Try to launch a notebook using the "Small" environment
3. You stay stuck on "Waiting for server request to start..." indefinitely.
Workaround (if any)
Look at the events and logs from the different controllers to figure out what's happening. This requires cluster permissions that regular users don't have.
What browsers are you seeing the problem on?
No response
Open Data Hub Version
Any version using the KF Notebook Controller
Anything else
No response