Issues with KFServing as part of a Kubeflow install #629
Comments
I've also had issues getting the TensorFlow KFServing example up on a vanilla GCP Kubeflow deployment (v0.7.1 and v0.7.0). As @amygdala also mentioned above, I can't get the example working. I tried getting KFServing into the cluster as suggested here. Relevant lines from the kfserving TF guide:
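(The quoted lines from the guide did not survive the copy. For context, the TensorFlow sample manifest around that release looked roughly like the following; the `apiVersion` and `storageUri` are recalled from the v1alpha2-era sample and may differ from the guide:)

```yaml
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: flowers-sample
spec:
  default:
    predictor:
      tensorflow:
        storageUri: "gs://kfserving-samples/models/tensorflow/flowers"
```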
@wronk did you get the same error Amy is getting, “waiting for virtual service to be ready”?
Here's my output after re-running these lines on the Kubeflow deployment. I don't see that error about the virtual service now, but I'm not sure it didn't appear before. I'm also not sure which namespace is the correct one to use (in case it's supposed to differ from the docs).
So you already installed Kubeflow with KFServing, and then applied the KFServing 0.2.2 release yaml again? With the Kubeflow installation, KFServing gets deployed to the kubeflow namespace, while the standalone installation deploys to the kfserving-system namespace.
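A quick way to check which flavor you actually have is to look for the controller StatefulSet in both candidate namespaces (names as described above; adjust if your install differs):

```shell
# Kubeflow-bundled install puts the controller here:
kubectl get statefulset kfserving-controller-manager -n kubeflow
# Standalone install puts it here instead:
kubectl get statefulset kfserving-controller-manager -n kfserving-system
```

Only one of the two should return a result; having applied the standalone release yaml on top of a Kubeflow install may explain the confusion.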
(Edited) I'm facing a similar issue while trying to set up Kubeflow. The revision 'flowers-sample-predictor-default' is stuck in status 'Deploying':
Outputs
The deployment seems to have scaled down to 0 replicas from 1 initially.
@amygdala and others who faced the last issue you mentioned, related to Istio authorization: the temporary solution will be either:
And as a reminder, before adding X-Auth-Token to the curl command you have to edit the EnvoyFilter to allow X-Auth-Token in the header (reference to this PR).
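A hedged sketch of what the resulting request might look like once the EnvoyFilter allows the header; `SERVICE_HOSTNAME`, `MODEL_NAME`, `TOKEN`, and `INGRESS_IP` are assumed to be set already, and the header name follows the comment above:

```shell
# Hypothetical example: pass the auth token explicitly with the prediction request
curl -v \
  -H "Host: ${SERVICE_HOSTNAME}" \
  -H "X-Auth-Token: ${TOKEN}" \
  "http://${INGRESS_IP}/v1/models/${MODEL_NAME}:predict" \
  -d @./input.json
```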
One option here is to configure Knative to use the Kubeflow gateway (https://github.com/knative/serving/blob/master/config/istio-ingress/config.yaml#L49). Based on my testing this only works with Knative 0.11; previous versions do not support configuring the gateway in a namespace other than knative-serving. We also need to change KFServing to point to the Kubeflow gateway (https://github.com/kubeflow/kfserving/blob/master/config/default/configmap/inferenceservice.yaml#L93). After that, all the virtual services created by Knative or KFServing will start to use Kubeflow's gateway, avoiding the conflicts.
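Concretely, the two changes described above would be edits to two ConfigMaps. This is an illustrative sketch only: the exact data keys (the `gateway.<namespace>.<name>` key in Knative's `config-istio`, and the `ingress` JSON in KFServing's `inferenceservice-config`) vary between Knative and KFServing versions, so check the keys your installed version actually uses:

```yaml
# Knative's config-istio: route Knative traffic through the Kubeflow gateway
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-istio
  namespace: knative-serving
data:
  # key format is gateway.<gateway-namespace>.<gateway-name> (Knative 0.11+)
  gateway.kubeflow.kubeflow-gateway: "istio-ingressgateway.istio-system.svc.cluster.local"
---
# KFServing's inferenceservice-config: point at the same gateway
apiVersion: v1
kind: ConfigMap
metadata:
  name: inferenceservice-config
  namespace: kubeflow
data:
  ingress: |
    {
      "ingressGateway": "kubeflow/kubeflow-gateway",
      "ingressService": "istio-ingressgateway.istio-system.svc.cluster.local"
    }
```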
Just a note that to do the above, it seems Kubeflow's version of Knative needs to be upgraded (from 0.8 to 0.11).
@yuzisun What values should those configuration variables be set to?
In the Kubeflow installation we have two Istio gateways installed; they both serve on HTTP port 80 and route to the same Istio ingress gateway, so one way to solve the issue is to configure Knative to use the Kubeflow gateway instead.
I'd love for someone to try this out and confirm it fixes the issue.
@yuzisun there's a correction here: the namespace should be 'kubeflow'.
Unfortunately, this does not yet solve the issue.
I have a similar issue: the output of the following is blank. I believe this is because my environment doesn't have an external load balancer (I'm using the default Kubeflow installation).
My istio-ingressgateway's type is NodePort. If I use the CLUSTER-IP (10.98.65.35), the following works fine for me. Question: is an external load balancer required for KFServing?
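For a NodePort setup with no load balancer, the usual Istio-style pattern is to derive the host and port from the cluster instead of using an external IP. A sketch, assuming the standard istio-system names and an `http2` port on the gateway service:

```shell
# IP of a node hosting the ingress gateway pod
INGRESS_HOST=$(kubectl get po -l istio=ingressgateway -n istio-system \
  -o jsonpath='{.items[0].status.hostIP}')
# NodePort mapped to the gateway's HTTP listener
INGRESS_PORT=$(kubectl get svc istio-ingressgateway -n istio-system \
  -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')
# Prediction request via node IP + NodePort (SERVICE_HOSTNAME/MODEL_NAME set as in the docs)
curl -v -H "Host: ${SERVICE_HOSTNAME}" \
  "http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict" \
  -d @./input.json
```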
@janeman98 a load balancer is not required; there is a PR to fix the doc: #618
@janeman98, I'm still having problems getting the TF flowers example to work from a KF deployment. Did you have to do anything special to make the example work after the manual setup you described? Perhaps I'm not setting something correctly. cc @yuzisun

Steps to reproduce:
1. Set up KF with GCP IAP
2. Applied the example
3. Set the necessary env variables
4. Curl times out
I'm experiencing issues with this as well. I first used the https://github.com/kubeflow/manifests/blob/master/kfdef/kfctl_istio_dex.v1.0.0.yaml config and tried this, and it failed; it seems no configuration was made that sets up the ingress for it. So instead I tried this one: https://github.com/kubeflow/manifests/blob/master/kfdef/kfctl_aws.v1.0.1.yaml, and it doesn't even create an inferenceservice.

[ec2-user@ip-10-0-0-170 tfserving-test]$ kubectl apply -f tensorflow.yaml
@wronk
I'm facing this issue as well (even after upgrading to 1.0.1), on GCP with IAP. I used pip install in a TF2-CPU notebook to upgrade to the latest kfserving package. This seems to be the culprit:

Error detected in taxi-sample-predictor-default version taxi-sample-predictor-default-s5kvs: google.api_core.exceptions.Forbidden: 403 GET https://storage.googleapis.com/storage/v1/b/kf-poc-edi/o?projection=noAcl&prefix=tfx_pipeline_output%2Fmy_tfx_on_kf_pipeline%2Fserving_model%2F1584479473%2F: Primary: /namespaces/saas-ml-dev.svc.id.goog with additional claims does not have storage.objects.list access to kf-poc-edi.

Tried the following to no avail: KFServing = KFServingClient()
Thanks @janeman98, those last two bullets helped out, as the sample instructions did not work as-is.
Hi all, I need your help fixing this KFServing issue.
Appreciate all your help with this.
@janeman98, I am facing exactly the same issue as mentioned by @wronk. I tried your solution, but it still says the connection timed out. Do I need to change the Kubeflow version? (Currently I am using the kfctl_k8s_istio.v1.0.2 configuration.) Also, the SERVICE_HOSTNAME is blank.
Hi folks, this issue was closed on January 29. If you are still having trouble, I would suggest opening new issues. It also seems like multiple issues and platforms are being discussed; I would suggest creating one issue for each specific problem.
/kind bug
I'm seeing some issues with the KFServing install that's part of the 'out-of-the-box' Kubeflow install (0.7.1). As documented, this 'should' work without the need to install anything additional: the KF 0.7.1 install includes Istio and knative-serving, and installs the kfserving-controller-manager statefulset.
It's not clear whether the following problems are all related, so this bug might need to be factored out into several.
It's possible that some of these issues relate to the knative-serving and kubeflow gateways conflicting, as apparently there can be problems if traffic (HTTP or HTTPS) could go via either gateway; e.g. istio/istio#11509.
First, it looks like an `inferenceservice` can only be deployed into the automatically-created `kubeflow-<user>` namespace. Is this intended? Otherwise, there's this error:

Once it is deployed, the inferenceservice shows Ready==False, giving 'Failed to reconcile predictor' errors:

Then, in this section of the instructions: https://github.com/kubeflow/kfserving/tree/master/docs/samples/tensorflow#run-a-prediction

..this command does not return a value:

SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)

It's not finding the `status.url`. Here's what the json looks like. What should it be returning? I'm guessing the SERVICE_HOSTNAME should be set to `flowers-sample-predictor-default.kubeflow-amyu.svc.cluster.local` or similar, right? But perhaps due to the above issue I don't see that string in the json below.

Finally, even if I deploy the KF install so that the istio-ingressgateway is set up with an external IP, I can't successfully make an inference request by following the instructions. I get an origin auth failure.
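As a sanity check on the SERVICE_HOSTNAME extraction command itself (using a hypothetical URL value, since `status.url` is empty here), the `cut` pipeline does pull out the host once the field is populated:

```shell
# Hypothetical status.url value, to show what the pipeline extracts
URL="http://flowers-sample.kubeflow-amyu.example.com/v1/models/flowers-sample"
echo "${URL}" | cut -d "/" -f 3
# prints: flowers-sample.kubeflow-amyu.example.com
```

So the command is fine; the real problem is that `.status.url` never gets set while the predictor fails to reconcile.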
(cc @jlewi as fyi)