
Environmental variable error #215

Closed
sathiez opened this issue Sep 6, 2018 · 36 comments
@sathiez commented Sep 6, 2018

Hi,
I built a Seldon image using "s2i build 'src-folder' seldonio/seldon-core-s2i-python3:0.1 'imageName'" and I get the following error:
"ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory"

.s2i/environment:
MODEL_NAME=MyModel
API_TYPE=REST
SERVICE_TYPE=MODEL
PERSISTENCE=0
LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64

requirements.txt:
tensorflow-gpu==1.10.1

Please help me resolve this error. :)

@ukclivecox

I think you would need to install CUDA in the Docker image.
We may need to think how best to solve this with s2i, e.g. https://github.com/openshift/source-to-image/blob/master/docs/user_guide.md#invoking-scripts-embedded-in-an-image

Using the above, you could extend the assemble part of s2i with your own assemble logic to add CUDA and other dependencies.

@sathiez commented Sep 10, 2018

Hi,
I successfully installed nvidia/cuda using the assemble script, but now I get the following error:

Traceback (most recent call last):
  File "microservice.py", line 154, in <module>
    interface_file = importlib.import_module(args.interface_name)
  File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'MyModel'

@ukclivecox

Do you have a MyModel.py file with a MyModel class as discussed here when you did the wrapping?
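
For reference, the wrapper expects something roughly like the following in MyModel.py, with the class name matching MODEL_NAME. This is only an illustrative sketch (the predict body here simply echoes the input):

class MyModel(object):
    def __init__(self):
        # load or initialise your trained model here,
        # e.g. restore a TensorFlow graph/session
        pass

    def predict(self, X, features_names):
        # X is the numpy array built from the request payload;
        # return an array-like of predictions
        return X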

@sathiez
Copy link
Author

sathiez commented Sep 10, 2018

@cliveseldon

Yes, I have.

@ukclivecox

OK. I suggest you run the docker image with bash to investigate, e.g.,

docker run -it --rm <my-image> bash

@sathiez commented Sep 10, 2018

I ran the image as you suggested:
root@XXX-XX:~# docker run -it --rm img bash
root@e764be690a3a:/microservice#

@ukclivecox

OK - can you show what is in that folder to check MyModel.py exists there?
And maybe call microservice.py as the s2i run.sh does?

@sathiez commented Sep 10, 2018

root@e764be690a3a:/microservice# ls -lart

cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb
cuda-repo-ubuntu1604-9-0-local-cublas-performance-update_1.0-1_amd64-deb
fbs
seldon_flatbuffers.py
transformer_microservice.py
seldon_requirements.txt
router_microservice.py
persistence.py
outlier_detector_microservice.py
model_microservice.py
microservice.py
__init__.py
proto
libcudnn7_7.0.5.15-1+cuda9.0_amd64.deb

MyModel.py is not available here, but when I built my image I had MyModel.py in the source folder as mentioned in that tutorial.

@ukclivecox

You do need a runtime inference file that is going to do the prediction. In the tutorial one is shown:

https://github.com/SeldonIO/seldon-core/blob/master/docs/wrappers/python.md#python-file

@sathiez commented Sep 10, 2018

My build path contains these files:
Tensorflow_model
MyModel.py
__pycache__
wrkspc.py
requirements.txt
.s2i

But while building the image, Tensorflow_model and MyModel.py are not picked up.
I edited .s2i/bin/assemble. Could that cause this error?

@sathiez commented Sep 10, 2018

My assemble file contains:
CUDA_REPO_PKG="cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb"
wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/${CUDA_REPO_PKG}
dpkg -i ${CUDA_REPO_PKG}
apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x$
apt-get update
CUDA_REPO_PKG="cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb"
wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/${CUDA_REPO_PKG}
dpkg -i ${CUDA_REPO_PKG}
apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x$
apt-get update
wget https://github.com/ashokpant/cudnn_archive/raw/master/v7.0/${CUDNN_PKG}
apt-get install -y cuda-command-line-tools-9-0

@ukclivecox

Does your custom assemble script invoke the default assemble script as discussed, e.g. /usr/libexec/s2i/assemble?

@sathiez commented Sep 10, 2018

No, my custom assemble script is not invoking the default script.

@sathiez commented Sep 11, 2018

I am unable to find the default assemble script at the path /usr/libexec/s2i/assemble.

@ukclivecox

Not sure what to suggest; have you followed the instructions here on finding the location of the scripts?
Otherwise, if possible, can you open-source your work on GitHub so we can have a look?

@ukclivecox

I think for the Python builder images the default assemble script will be at /s2i/bin.

@ukclivecox

I think your custom assemble script should take the form

#!/bin/bash
echo "Before assembling"

# run the default Python-wrapper assemble first
/s2i/bin/assemble
rc=$?

if [ $rc -eq 0 ]; then
    echo "After successful assembling"
else
    echo "After failed assembling"
fi

exit $rc

@sathiez commented Sep 11, 2018

Can you suggest where to place the custom assemble script and how to invoke it while building the image?

@ukclivecox

You just need to put the above code in .s2i/bin/assemble and then add your custom GPU installation code. I think you almost got this far already?

@sathiez commented Sep 18, 2018

Hi,
I successfully deployed a TensorFlow model in Kubernetes using Kubeflow/Seldon, but is it necessary to define a "runtime inference graph" (https://github.com/SeldonIO/seldon-core/blob/master/docs/crd/readme.md) for a single model, and how do I access the predict(X) function in Kubernetes?

@ukclivecox

gRPC and REST endpoints will have been exposed via the API gateway or Ambassador. I suggest you run one of the notebook examples, which have example Python code to make predictions over these APIs.

@sathiez commented Sep 18, 2018

Can you provide a link to those notebook examples?

@ukclivecox

@sathiez commented Sep 18, 2018

Hi,
I am able to get the deployment status:
XXXX@XX:~$ kubectl get seldondeployments model -o jsonpath='{.status}'
map[predictorStatus:[map[name:model-model replicas:2 replicasAvailable:1]]]

I port-forwarded Ambassador:
XXXX@XX:~$ kubectl port-forward $(kubectl get pods -n ${NAMESPACE} -l service=ambassador -o jsonpath='{.items[0].metadata.name}') -n ${NAMESPACE} 8080:80
Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80
Handling connection for 8080

XXXX@XX:~$ curl -d 'json={"data":{"tensor":{"shape":[1,37],"values":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37]}}}' http://localhost:8080/predict
404 page not found
I get "404 page not found". Can you help me?

@ukclivecox

If you look in https://github.com/SeldonIO/seldon-core/blob/master/notebooks/seldon_utils.py
You will see:

def rest_request_ambassador(deploymentName,endpoint="localhost:8003",data_size=5,rows=1):
        shape, arr = create_random_data(data_size,rows)
        payload = {"data":{"names":["a","b"],"tensor":{"shape":shape,"values":arr.tolist()}}}
        response = requests.post(
            "http://"+endpoint+"/seldon/"+deploymentName+"/api/v0.1/predictions",
            json=payload)
        print(response.status_code)
        print(response.text)

So the endpoint under Ambassador is http://<ambassador_endpoint>/seldon/<deploymentName>/api/v0.1/predictions

@sathiez commented Sep 18, 2018

Hi,

I get the following error:

{"timestamp":1537279713335,"status":415,"error":"Unsupported Media Type","exception":"org.springframework.web.HttpMediaTypeNotSupportedException","message":"Content type 'application/x-www-form-urlencoded' not supported","path":"/api/v0.1/predictions"}

@ukclivecox

Try adding the following to curl:

-H "Content-Type: application/json"

@sathiez commented Sep 18, 2018

I have tried that and I get this error:
{
"code": 201,
"info": "json={"data":{"ndarray":[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37]]}}",
"reason": "Invalid JSON",
"status": "FAILURE"
}

@ukclivecox

You are sending "json=X" but you should be sending just the JSON.

@sathiez commented Sep 18, 2018

@cliveseldon
Can you please elaborate?

@ukclivecox

Suggest we connect on our Slack channel.

@sathiez commented Sep 18, 2018

Yes, we can.

My Slack URL is sathiez.slack.com

@sathiez commented Sep 18, 2018

I tried as you said, but I still get the following error:
{
"code": 201,
"info": "{data:[1,2]}",
"reason": "Invalid JSON",
"status": "FAILURE"
}

@ukclivecox

Your JSON is invalid. Try your previous call with just {"data":{"ndarray":[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37]]}} as the request body.
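
For example, from Python (a minimal sketch; localhost:8080 comes from your port-forward and "model" from your kubectl output, so adjust these if your setup differs):

import requests

payload = {"data": {"tensor": {"shape": [1, 37],
                               "values": list(range(1, 38))}}}

# requests sends the body as application/json when json= is used,
# so the Content-Type header is set for you
response = requests.post(
    "http://localhost:8080/seldon/model/api/v0.1/predictions",
    json=payload)
print(response.status_code)
print(response.text)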

@sathiez commented Sep 19, 2018

Hi,
Thanks a lot for your great support. I am now able to get a response, but the response is an error :p

File "/microservice/model_microservice.py", line 55, in Predict
    class_names = get_class_names(user_model, predictions.shape[1])

Actually, I am just testing that my predict method works without any issue.

@ukclivecox

Can you provide the full stack trace and error?
How are you calling the microservice?
How did you build the microservice? Via s2i? If so, can you provide the command line you used?

@sathiez sathiez closed this as completed Oct 8, 2018