Hi,
I deployed CDK on AWS using the bundle, and that went well. I then deployed Kubeflow on top of it, ran into some issues with the script (see my other bug), and worked around them manually. Now, when I try to expose the ambassador app to access the Kubeflow dashboard, the first ambassador-auth unit goes into error with this message: "The node was low on resource: ephemeral-storage. Container ambassador was using 93222, which exceeds its request of 0." All the other ambassador-auth units go into error with the message "Pod The node had condition: [DiskPressure]."
It also looks like, because of this error, juju tried to scale out and spawned 8 more units of ambassador-auth, which all ended up in the same state. The machines are used only for this test; nothing else runs on them.
How can I resolve this issue? I am unable to fully deploy Kubeflow at the moment.
My CDK status
juju kubeflow model status
kubectl status
Something else is odd, and I do not know whether it is due to Kubeflow or juju (probably juju): I tried to scale down the number of ambassador-auth units and it did the opposite.
$ juju scale-application ambassador-auth 2
ambassador-auth scaled to 2 units
That was the output, but I now have 17 units in error o.o
@camille-rodriguez: Can you try adding a CDK worker node with a decent amount (100G+) of disk space, then retrying the deploy? It looks like none of the worker nodes has enough disk space.
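To confirm which workers are actually reporting disk pressure before adding a node, something like the following might help. It is only a rough sketch, not part of CDK or Kubeflow: the function names `nodes_under_disk_pressure` and `fetch_nodes` are made up for illustration, and it assumes `kubectl` is on the PATH and already pointed at the CDK cluster. It reads the standard `status.conditions` list that `kubectl get nodes -o json` returns for each node.

```python
import json
import subprocess


def nodes_under_disk_pressure(node_list: dict) -> list:
    """Return the names of nodes whose DiskPressure condition is "True".

    `node_list` is the parsed JSON from `kubectl get nodes -o json`.
    """
    flagged = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        for cond in conditions:
            # Node conditions report their status as the strings
            # "True", "False" or "Unknown", not as booleans.
            if cond.get("type") == "DiskPressure" and cond.get("status") == "True":
                flagged.append(node["metadata"]["name"])
                break
    return flagged


def fetch_nodes() -> dict:
    """Hypothetical helper: fetch the node list through kubectl."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `nodes_under_disk_pressure(fetch_nodes())` should list the affected workers. An empty list would suggest the pressure has cleared or that the evictions have another cause.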
I followed the workaround in the other issue I opened, and everything deployed fine this time. This bug was probably a side effect of the other issues I faced. Thank you!