
Bug in elasticsearch index of metrics server #2971

Closed
fg91 opened this issue Feb 16, 2021 · 0 comments · Fixed by #2972
Labels: bug, triage (Needs to be triaged and prioritised accordingly)

Comments


fg91 commented Feb 16, 2021

Describe the bug

Here, the elasticsearch_index appears to be constructed incorrectly:

This:

# Currently only supports SELDON inference type (not kfserving)
elasticsearch_index = f"inference-log-{seldon_namespace}-seldon-{SELDON_DEPLOYMENT_ID}-{SELDON_PREDICTOR_ID}"

should rather be:

# Currently only supports SELDON inference type (not kfserving)
elasticsearch_index = f"inference-log-seldon-{seldon_namespace}-{SELDON_DEPLOYMENT_ID}-{SELDON_PREDICTOR_ID}"

Reason:
When a model is deployed to the namespace production, the elasticsearch index is actually named:
'inference-log-seldon-production-multiclass-model-default'
so the metrics server's lookup of 'inference-log-production-seldon-multiclass-model-default' fails with a 404 (see the logs below).
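The mismatch can be seen by evaluating both orderings with the values from the logs below (a minimal sketch; the variable names mirror the snippet above):

```python
# Illustrative values, taken from the 404 in the metrics-server logs below.
seldon_namespace = "production"
SELDON_DEPLOYMENT_ID = "multiclass-model"
SELDON_PREDICTOR_ID = "default"

# Buggy ordering: namespace comes before "seldon".
buggy = f"inference-log-{seldon_namespace}-seldon-{SELDON_DEPLOYMENT_ID}-{SELDON_PREDICTOR_ID}"

# Fixed ordering: "seldon" comes before the namespace, matching the index
# that actually exists in elasticsearch.
fixed = f"inference-log-seldon-{seldon_namespace}-{SELDON_DEPLOYMENT_ID}-{SELDON_PREDICTOR_ID}"

print(buggy)  # inference-log-production-seldon-multiclass-model-default
print(fixed)  # inference-log-seldon-production-multiclass-model-default
```

Note that the two expressions coincide only when seldon_namespace == "seldon", which is why the bug goes unnoticed in the default example namespace.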

To reproduce

The easiest way to reproduce this bug is to follow this notebook from the seldon examples but, instead of the namespace seldon, choose a different one so that the error can manifest:

kubectl create namespace seldon-debug || echo "namespace already created"
kubectl config set-context $(kubectl config current-context) --namespace=seldon-debug

Expected behaviour

When deploying a metrics server that gets triggered when feedback is sent, it is supposed to look up the respective document in elasticsearch. This, however, only works by chance when the namespace is seldon, because then both orderings produce the identical index name and the bug does not manifest.
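The lookup the metrics server performs can be sketched as follows (the helper name and structure are hypothetical, not the actual seldon-core code; the host and request id are taken from the logs below), assuming the corrected index ordering:

```python
def elasticsearch_doc_url(es_host, namespace, deployment_id, predictor_id, request_id):
    """Build the URL of the logged inference document for a feedback event.

    Hypothetical helper for illustration: with the corrected ordering,
    "seldon" comes before the namespace in the index name.
    """
    index = f"inference-log-seldon-{namespace}-{deployment_id}-{predictor_id}"
    return f"{es_host}/{index}/_doc/{request_id}"

url = elasticsearch_doc_url(
    "http://elasticsearch-master.seldon-logs.svc.cluster.local:9200",
    "production",
    "multiclass-model",
    "default",
    "abc71989-f625-4ff4-bc9b-6b02967968d0",  # Ce-Requestid from the feedback event
)
```

With the buggy ordering the same GET targets a non-existent index and returns the 404 shown in the logs.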

Environment

Kubernetes version:

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"17+", GitVersion:"v1.17.14-gke.1600", GitCommit:"7c407f5cc8632f9af5a2657f220963aa7f1c46e7", GitTreeState:"clean", BuildDate:"2020-12-07T09:22:27Z", GoVersion:"go1.13.15b4", Compiler:"gc", Platform:"linux/amd64"}

Seldon Images:

      value: docker.io/seldonio/engine:1.6.0
      value: docker.io/seldonio/seldon-core-executor:1.6.0
    image: docker.io/seldonio/seldon-core-operator:1.6.0

Model Details

Logs of the metrics server:

[I 210216 16:33:11 __main__:111] Extra args: {}
[I 210216 16:33:11 server:122] Registering model:multiclassserver
[I 210216 16:33:11 server:113] Listening on port 8080
[I 210216 16:33:17 web:2243] 200 GET /v1/metrics (10.212.1.18) 3.39ms
[I 210216 16:33:32 web:2243] 200 GET /v1/metrics (10.212.1.18) 1.37ms
[I 210216 16:33:42 cm_model:103] PROCESSING Feedback Event.
[I 210216 16:33:42 cm_model:104] {'Host': 'seldon-multiclass-model-metrics.production.svc.cluster.local:80', 'User-Agent': 'Go-http-client/1.1', 'Content-Length': '156', 'Ce-Endpoint': 'default', 'Ce-Id': 'e3f56826-ad3a-4638-9059-0d550c6186e9', 'Ce-Inferenceservicename': 'multiclass-model', 'Ce-Knativearrivaltime': '2021-02-16T16:33:41.195465611Z', 'Ce-Modelid': 'classifier', 'Ce-Namespace': 'production', 'Ce-Requestid': 'abc71989-f625-4ff4-bc9b-6b02967968d0', 'Ce-Source': 'http://:8000/', 'Ce-Specversion': '1.0', 'Ce-Time': '2021-02-16T16:33:40.82106025Z', 'Ce-Traceparent': '00-11bf7dd2281de401dfdb00f41b9c6cc7-89240c365d3cefb2-00', 'Ce-Type': 'io.seldon.serving.feedback', 'Content-Type': 'application/json', 'Traceparent': '00-11bf7dd2281de401dfdb00f41b9c6cc7-b017a64a85547fde-00', 'Accept-Encoding': 'gzip'}
[I 210216 16:33:42 cm_model:105] ----
[W 210216 16:33:42 base:269] GET http://elasticsearch-master.seldon-logs.svc.cluster.local:9200/inference-log-production-seldon-multiclass-model-default/_doc/abc71989-f625-4ff4-bc9b-6b02967968d0 [status:404 request:0.331s]
[E 210216 16:33:42 web:1793] Uncaught exception POST / (10.212.1.24)
    HTTPServerRequest(protocol='http', host='seldon-multiclass-model-metrics.production.svc.cluster.local:80', method='POST', uri='/', version='HTTP/1.1', remote_ip='10.212.1.24')
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/tornado/web.py", line 1702, in _execute
        result = method(*self.path_args, **self.path_kwargs)
      File "/microservice/adserver/server.py", line 242, in post
        response = self.model.process_event(request, headers)
      File "/microservice/adserver/cm_model.py", line 159, in process_event
        error, status_code=400, reason="METRICS_SERVER_ERROR"
    seldon_core.flask_utils.SeldonMicroserviceException
fg91 added the bug and triage labels on Feb 16, 2021