
Environmental variable error #215

Closed
sathiez opened this issue Sep 6, 2018 · 36 comments
@sathiez commented Sep 6, 2018

Hi,
I built a Seldon image using "s2i build 'src-folder' seldonio/seldon-core-s2i-python3:0.1 'imageName'" and I get the following error:
"ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory"

.s2i/environment:
MODEL_NAME=MyModel
API_TYPE=REST
SERVICE_TYPE=MODEL
PERSISTENCE=0
LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64

requirements.txt:
tensorflow-gpu==1.10.1

Please help me resolve this error. :)

@ukclivecox

I think you would need to install CUDA in the Docker image.
We may need to think how best to solve this with s2i, e.g. https://github.com/openshift/source-to-image/blob/master/docs/user_guide.md#invoking-scripts-embedded-in-an-image

Using the above, you could extend the assemble part of s2i with your own assemble logic to add CUDA and other dependencies.

@sathiez commented Sep 10, 2018

Hi,
I successfully installed nvidia/cuda using the assemble script, but now I get the following error:

Traceback (most recent call last):
  File "microservice.py", line 154, in <module>
    interface_file = importlib.import_module(args.interface_name)
  File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'MyModel'

@ukclivecox

Do you have a MyModel.py file with a MyModel class as discussed here when you did the wrapping?
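
For reference, the wrapper expects something roughly like the following in MyModel.py, with the class name matching MODEL_NAME. This is only an illustrative sketch (the predict body here simply echoes the input):

class MyModel(object):
    def __init__(self):
        # load or initialise your trained model here,
        # e.g. restore a TensorFlow graph/session
        pass

    def predict(self, X, features_names):
        # X is the numpy array built from the request payload;
        # return an array-like of predictions
        return X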

@sathiez
Copy link
Author

sathiez commented Sep 10, 2018

@cliveseldon

Yes, I have.

@ukclivecox

OK. I suggest you run the docker image with bash to investigate, e.g.,

docker run -it --rm <my-image> bash

@sathiez commented Sep 10, 2018

I ran the image as you suggested:
root@XXX-XX:~# docker run -it --rm img bash
root@e764be690a3a:/microservice#

@ukclivecox

OK - can you show what is in that folder to check MyModel.py exists there?
And maybe call microservice.py as the s2i run.sh does?

@sathiez commented Sep 10, 2018

root@e764be690a3a:/microservice# ls -lart

cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb
cuda-repo-ubuntu1604-9-0-local-cublas-performance-update_1.0-1_amd64-deb
fbs
seldon_flatbuffers.py
transformer_microservice.py
seldon_requirements.txt
router_microservice.py
persistence.py
outlier_detector_microservice.py
model_microservice.py
microservice.py
__init__.py
proto
libcudnn7_7.0.5.15-1+cuda9.0_amd64.deb

MyModel.py is not available here, but when I built my image I had MyModel.py in the source folder as mentioned in that tutorial.

@ukclivecox

You do need a runtime inference file that is going to do the prediction. In the tutorial one is shown:

https://github.com/SeldonIO/seldon-core/blob/master/docs/wrappers/python.md#python-file

@sathiez commented Sep 10, 2018

My build path contains these files:
Tensorflow_model
MyModel.py
__pycache__
wrkspc.py
requirements.txt
.s2i

But while building the image, Tensorflow_model and MyModel.py are not picked up.
I edited .s2i/bin/assemble. Could that cause this error?

@sathiez commented Sep 10, 2018

My assemble file contains:
CUDA_REPO_PKG="cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb"
wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/${CUDA_REPO_PKG}
dpkg -i ${CUDA_REPO_PKG}
apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x$
apt-get update
CUDA_REPO_PKG="cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb"
wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/${CUDA_REPO_PKG}
dpkg -i ${CUDA_REPO_PKG}
apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x$
apt-get update
wget https://github.com/ashokpant/cudnn_archive/raw/master/v7.0/${CUDNN_PKG}
apt-get install -y cuda-command-line-tools-9-0

@ukclivecox

Does your custom assemble script invoke the default assemble script as discussed, e.g. /usr/libexec/s2i/assemble?

@sathiez commented Sep 10, 2018

No, my custom assemble script is not invoking the default script.

@sathiez commented Sep 11, 2018

I am unable to find the default assemble script at the path /usr/libexec/s2i/assemble.

@ukclivecox

Not sure what to suggest; have you followed the instructions here on finding the location of the scripts?
Otherwise, if possible, can you open-source your work on GitHub so we can have a look?

@ukclivecox

I think for the Python builder images the default assemble script will be at /s2i/bin.

@ukclivecox

I think your custom assemble script should take the form

#!/bin/bash
echo "Before assembling"

# run the default Python-wrapper assemble first
/s2i/bin/assemble
rc=$?

if [ $rc -eq 0 ]; then
    echo "After successful assembling"
else
    echo "After failed assembling"
fi

exit $rc

@sathiez commented Sep 11, 2018

Can you suggest where to place the custom assemble script and how to invoke it while building the image?

@ukclivecox

You just need to put the above code in .s2i/bin/assemble and then add your custom GPU installation code. I think you almost got this far already?

@sathiez commented Sep 18, 2018

Hi,
I successfully deployed a TensorFlow model in Kubernetes using Kubeflow/Seldon, but is it necessary to define a "runtime inference graph" (https://github.com/SeldonIO/seldon-core/blob/master/docs/crd/readme.md) for a single model, and how do I access the predict(X) function in Kubernetes?

@ukclivecox

gRPC and REST endpoints will have been exposed via the API gateway or Ambassador. I suggest you run one of the notebook examples, which have example Python code to make predictions over these APIs.

@sathiez commented Sep 18, 2018

Can you provide a link to those notebook examples?

@ukclivecox

@sathiez commented Sep 18, 2018

Hi,
I am able to get the deployment status:
XXXX@XX:~$ kubectl get seldondeployments model -o jsonpath='{.status}'
map[predictorStatus:[map[name:model-model replicas:2 replicasAvailable:1]]]

I port-forwarded Ambassador:
XXXX@XX:~$ kubectl port-forward $(kubectl get pods -n ${NAMESPACE} -l service=ambassador -o jsonpath='{.items[0].metadata.name}') -n ${NAMESPACE} 8080:80
Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80
Handling connection for 8080

XXXX@XX:~$ curl -d 'json={"data":{"tensor":{"shape":[1,37],"values":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37]}}}' http://localhost:8080/predict
404 page not found
I get "404 page not found". Can you help me?

@ukclivecox

If you look in https://github.com/SeldonIO/seldon-core/blob/master/notebooks/seldon_utils.py
You will see:

def rest_request_ambassador(deploymentName,endpoint="localhost:8003",data_size=5,rows=1):
        shape, arr = create_random_data(data_size,rows)
        payload = {"data":{"names":["a","b"],"tensor":{"shape":shape,"values":arr.tolist()}}}
        response = requests.post(
            "http://"+endpoint+"/seldon/"+deploymentName+"/api/v0.1/predictions",
            json=payload)
        print(response.status_code)
        print(response.text)

So the endpoint under Ambassador is http://<ambassador_endpoint>/seldon/<deploymentName>/api/v0.1/predictions

@sathiez commented Sep 18, 2018

Hi,

I get the following error:

{"timestamp":1537279713335,"status":415,"error":"Unsupported Media Type","exception":"org.springframework.web.HttpMediaTypeNotSupportedException","message":"Content type 'application/x-www-form-urlencoded' not supported","path":"/api/v0.1/predictions"}

@ukclivecox

Try adding the following to curl:

-H "Content-Type: application/json"

@sathiez commented Sep 18, 2018

I have tried that and I get this error:
{
"code": 201,
"info": "json={"data":{"ndarray":[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37]]}}",
"reason": "Invalid JSON",
"status": "FAILURE"
}

@ukclivecox

You are sending "json=X" but you should be sending just the JSON.

@sathiez commented Sep 18, 2018

@cliveseldon
Can you please elaborate?

@ukclivecox

Suggest we connect on our Slack channel.

@sathiez commented Sep 18, 2018

Yes, we can.

My Slack URL is sathiez.slack.com

@sathiez commented Sep 18, 2018

I tried as you said, but I still get the following error:
{
"code": 201,
"info": "{data:[1,2]}",
"reason": "Invalid JSON",
"status": "FAILURE"
}

@ukclivecox

Your JSON is invalid. Try your previous call with just {"data":{"ndarray":[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37]]}} as the request body.
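
For example, from Python (a minimal sketch; localhost:8080 comes from your port-forward and "model" from your kubectl output, so adjust these if your setup differs):

import requests

payload = {"data": {"tensor": {"shape": [1, 37],
                               "values": list(range(1, 38))}}}

# requests sends the body as application/json when json= is used,
# so the Content-Type header is set for you
response = requests.post(
    "http://localhost:8080/seldon/model/api/v0.1/predictions",
    json=payload)
print(response.status_code)
print(response.text)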

@sathiez commented Sep 19, 2018

Hi,
Thanks a lot for your great support. I am now able to get a response, but the response is an error :p

File "/microservice/model_microservice.py", line 55, in Predict
    class_names = get_class_names(user_model, predictions.shape[1])

Actually, I am just testing that my predict method works without any issue.

@ukclivecox

Can you provide the full stack trace and error?
How are you calling the microservice?
How did you build the microservice? Via s2i? If so, can you provide the command line you used?

@sathiez sathiez closed this as completed Oct 8, 2018