Using a configMapRef inside of a seldon deployment manifest causes a NullPointerException in the SeldonDeploymentWatcher #450

Closed
sroj opened this issue Feb 15, 2019 · 5 comments

sroj commented Feb 15, 2019

I'm trying to use a ConfigMap to feed env variables into a SeldonDeployment. However, when I use a configMapRef inside the manifest as per the official docs, the Seldon cluster manager gets stuck in a NullPointerException loop and doesn't deploy anything.
Steps to reproduce:

  1. Everything is in the default namespace.
  2. Create a ConfigMap named ttng-classifier-config (the name doesn't really matter).
  3. Deploy this (using kubectl apply -f seldon-dep.yaml):
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
    - componentSpecs:
        - spec:
            containers:
              - name: classifier
                image: seldonio/mock_classifier:1.0
                envFrom:
                  - configMapRef:
                      name: ttng-classifier-config
      graph:
        children: []
        endpoint:
          type: REST
        name: classifier
        type: MODEL
      name: example
      replicas: 1

Note that if

envFrom:
  - configMapRef:
      name: ttng-classifier-config

is removed from the above manifest, the deployment succeeds.

Error as seen in the logs for the SeldonClusterManager:

2019-02-15 21:04:00.361 ERROR 1 --- [pool-1-thread-1] o.s.s.s.TaskUtils$LoggingErrorHandler    : Unexpected error occurred in scheduled task.
 java.lang.NullPointerException: null
	at io.seldon.clustermanager.k8s.SeldonDeploymentWatcher.getNamespace(SeldonDeploymentWatcher.java:130) ~[classes!/:0.2.5]
	at io.seldon.clustermanager.k8s.SeldonDeploymentWatcher.watchSeldonMLDeployments(SeldonDeploymentWatcher.java:200) ~[classes!/:0.2.5]
	at io.seldon.clustermanager.k8s.SeldonDeploymentWatcher.watch(SeldonDeploymentWatcher.java:226) ~[classes!/:0.2.5]
	at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source) ~[na:na]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_181]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_181]
	at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65) ~[spring-context-4.3.20.RELEASE.jar!/:4.3.20.RELEASE]
	at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) ~[spring-context-4.3.20.RELEASE.jar!/:4.3.20.RELEASE]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_181]
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_181]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_181]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_181]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
@sroj sroj changed the title from "Using a configMapRef inside of a seldon deployment manifest causes a NullPointerExceptin in the SeldonDeploymentWatcher" to "Using a configMapRef inside of a seldon deployment manifest causes a NullPointerException in the SeldonDeploymentWatcher" Feb 15, 2019
@ukclivecox ukclivecox added the bug label Feb 15, 2019
@ukclivecox ukclivecox added this to Needs triage in Bugs via automation Feb 15, 2019
@ukclivecox ukclivecox self-assigned this Feb 15, 2019
@ukclivecox ukclivecox added this to the 0.2.x milestone Feb 15, 2019
@ukclivecox
Contributor

It looks like you are using the 0.2.5 release. I think this bug is fixed in the 0.2.6-SNAPSHOT release, so it will be fixed in 0.2.6.

if (actualObj.has("metadata") && actualObj.get("meta").has("namespace"))

I'm not sure why the configMap would make a difference here.

Can you explicitly add a namespace to the metadata in your yaml and see if this fixes things?
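
For readers following along: read literally, the condition quoted above checks for "metadata" but then dereferences "meta". With a tree API that returns null for a missing key, that pattern would throw a NullPointerException whenever "metadata" is present. The sketch below reproduces the pattern with plain java.util.Map as a stand-in. The class name, method names and the "default" fallback are all hypothetical, and this is not the actual SeldonDeploymentWatcher code.

```java
import java.util.Map;

// Hypothetical stand-in for a namespace lookup; not Seldon's actual code.
class NamespaceLookup {

    // Mirrors the quoted condition literally: guards on "metadata" but reads "meta".
    @SuppressWarnings("unchecked")
    static String mismatchedKeys(Map<String, Object> obj) {
        if (obj.containsKey("metadata")
                && ((Map<String, Object>) obj.get("meta")).containsKey("namespace")) {
            return (String) ((Map<String, Object>) obj.get("meta")).get("namespace");
        }
        return "default"; // assumed fallback, for illustration only
    }

    // Guard and lookup use the same key; a missing namespace falls back to the assumed default.
    @SuppressWarnings("unchecked")
    static String consistentKeys(Map<String, Object> obj) {
        Object metadata = obj.get("metadata");
        if (metadata instanceof Map) {
            Object ns = ((Map<String, Object>) metadata).get("namespace");
            if (ns instanceof String) {
                return (String) ns;
            }
        }
        return "default"; // assumed fallback, for illustration only
    }

    public static void main(String[] args) {
        Map<String, Object> deployment = Map.of("metadata", Map.of("name", "seldon-model"));
        System.out.println(consistentKeys(deployment));
        try {
            mismatchedKeys(deployment);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from mismatched keys");
        }
    }
}
```

In this sketch the mismatched version fails even when a namespace is set, because the "meta" lookup is null regardless of what "metadata" contains.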

@ukclivecox ukclivecox added this to To do in 0.2.6 via automation Feb 15, 2019
@sroj
Author

sroj commented Feb 16, 2019

Added the namespace as suggested. It did not fix the issue.

@ukclivecox
Contributor

As we use the k8s protocol buffers for parsing, the issue may be that you need to be more verbose in your definition. Can you try the manifest below, which adds localObjectReference:

apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
    - componentSpecs:
        - spec:
            containers:
              - name: classifier
                image: seldonio/mock_classifier:1.0
                envFrom:
                  - configMapRef:
                      localObjectReference:
                        name: ttng-classifier-config
      graph:
        children: []
        endpoint:
          type: REST
        name: classifier
        type: MODEL
      name: example
      replicas: 1

This worked on the 0.2.6-SNAPSHOT release. The namespace issue may still be present in 0.2.5. We will be doing a release in the next few days.

@ukclivecox ukclivecox moved this from Needs triage to High priority in Bugs Feb 17, 2019
@ukclivecox ukclivecox moved this from To do to In progress in 0.2.6 Feb 17, 2019
@ukclivecox ukclivecox removed this from In progress in 0.2.6 Feb 18, 2019
@sroj
Author

sroj commented Feb 19, 2019

Adding the localObjectReference key to the definition did work. I'll use that as a temporary workaround. And good to know a fix is coming up soon. Thanks!
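
In case it helps anyone with many manifests, here is a rough sketch of applying the workaround programmatically. It assumes the manifest has already been parsed into nested Map/List objects (the YAML parsing step is not shown), and it moves the name of every configMapRef under a localObjectReference key. ConfigMapRefRewriter is a hypothetical helper, not part of Seldon, and whether other reference types need the same treatment is not something this thread establishes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Seldon: rewrites the short configMapRef form
// into the verbose form that nests the name under localObjectReference.
class ConfigMapRefRewriter {

    static Object rewrite(Object node) {
        if (node instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) node) {
                out.add(rewrite(item));
            }
            return out;
        }
        if (node instanceof Map) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                String key = String.valueOf(e.getKey());
                Object value = e.getValue();
                if (key.equals("configMapRef") && value instanceof Map
                        && !((Map<?, ?>) value).containsKey("localObjectReference")) {
                    out.put(key, wrapName((Map<?, ?>) value));
                } else {
                    out.put(key, rewrite(value));
                }
            }
            return out;
        }
        return node; // scalars are returned unchanged
    }

    // Moves only "name" under localObjectReference; other fields stay where they are.
    private static Map<String, Object> wrapName(Map<?, ?> shortForm) {
        Map<String, Object> ref = new LinkedHashMap<>();
        for (Map.Entry<?, ?> e : shortForm.entrySet()) {
            String key = String.valueOf(e.getKey());
            if (key.equals("name")) {
                Map<String, Object> local = new LinkedHashMap<>();
                local.put("name", e.getValue());
                ref.put("localObjectReference", local);
            } else {
                ref.put(key, e.getValue());
            }
        }
        return ref;
    }

    public static void main(String[] args) {
        Object container = Map.of("envFrom",
                List.of(Map.of("configMapRef", Map.of("name", "ttng-classifier-config"))));
        System.out.println(rewrite(container));
    }
}
```

Entries that already contain localObjectReference are left alone, so running the helper twice should not nest the name any further.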

@ukclivecox ukclivecox added this to To do in 0.2.7 Feb 21, 2019
@ukclivecox
Contributor

This issue discusses two related problems. The NullPointerException was fixed in an earlier release, and the more verbose YAML/JSON that is currently required is a limitation of how we parse protocol buffers. Created a new issue for this: #489

Bugs automation moved this from High priority to Closed Apr 4, 2019
@ukclivecox ukclivecox moved this from To do to Done in 0.2.7 Apr 4, 2019