No result for Movielens 100K Worked Example #23

Open
ghost opened this issue Jun 30, 2016 · 8 comments

Comments

@ghost

ghost commented Jun 30, 2016

I am trying to run the ml100k example. Everything seems to work, but when I type seldon-cli api --client-name ml100k --endpoint /js/recommendations --item 50 --limit 4 I get this result:

connecting to zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 [SUCCEEDED]
response code 200
{"size":0,"requested":4,"list":[]}

which is different from what is mentioned in the tutorial.

Could you help me please?
Nadia

@ghost
Author

ghost commented Jun 30, 2016

I also have another question, if possible: why does Seldon use both zookeeper and etcd when it could rely on etcd alone? Thank you in advance for your answer.

@ukclivecox
Contributor

Can you rerun the ml100k job:

cd kubernetes/conf/examples/ml100k
kubectl create -f ml100k-import.json

And provide the logs?

Regarding zookeeper and etcd: we only use zookeeper in Seldon. Etcd is part of Kubernetes.

@Moonba

Moonba commented Aug 2, 2016

I'm not getting the official tutorial output either. I'm running Kubernetes locally via Minikube.
`$ kubectl create -f ml100k-import.json

job "ml100k-import" created`

`$kubectl get jobs -l name=ml100k-import

NAME DESIRED SUCCESSFUL AGE
ml100k-import 1 1 4m`

Then when I run
seldon-cli api --client-name ml100k --endpoint /js/recommendations --item 50 --limit 4 I get this error:

raise ConnectionError(e, request=request) requests.exceptions.ConnectionError: HTTPConnectionPool(host='seldon-server', port=80): Max retries exceeded with url: /js/recommendations?type=1&item=50&limit=4&user=1&consumer_key=3BOJZ988Y840JTK5JE6C&jsonpCallback=j (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fbdec89bf90>: Failed to establish a new connection: [Errno 110] Connection timed out',)) error: error executing remote command: error executing command in container: Error executing in Docker Container: 1

Here is the full output:

`$ bin/seldon-cli api --client-name ml100k --endpoint /js/recommendations --item 50 --limit 4
Traceback (most recent call last):
File "/opt/conda/bin/seldon-cli", line 4, in <module>
connecting to zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 [SUCCEEDED]
__import__('pkg_resources').run_script('seldon==2.0.2', 'seldon-cli')

File "/opt/conda/lib/python2.7/site-packages/setuptools-18.5-py2.7.egg/pkg_resources/__init__.py", line 742, in run_script

File "/opt/conda/lib/python2.7/site-packages/setuptools-18.5-py2.7.egg/pkg_resources/__init__.py", line 1667, in run_script
File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/EGG-INFO/scripts/seldon-cli", line 5, in <module>
seldon.cli.start_seldoncli()
File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/__init__.py", line 3, in start_seldoncli
cli_main.main()
File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cli_main.py", line 351, in main
cmds[cmd](opts,command_data, command_args)
File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cmd_api.py", line 206, in cmd_api
actions["default"](gopts,command_data, opts)
File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cmd_api.py", line 191, in action_call
call_js(gopts,command_data,opts,auth)
File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cmd_api.py", line 127, in call_js
r = requests.get(url,params=params)
File "/opt/conda/lib/python2.7/site-packages/requests/api.py", line 69, in get
return request('get', url, params=params, **kwargs)
File "/opt/conda/lib/python2.7/site-packages/requests/api.py", line 50, in request
response = session.request(method=method, url=url, **kwargs)
File "/opt/conda/lib/python2.7/site-packages/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/opt/conda/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/opt/conda/lib/python2.7/site-packages/requests/adapters.py", line 423, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='seldon-server', port=80): Max retries exceeded with url: /js/recommendations?type=1&item=50&limit=4&user=1&consumer_key=3BOJZ988Y840JTK5JE6C&jsonpCallback=j (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fbdec89bf90>: Failed to establish a new connection: [Errno 110] Connection timed out',))
error: error executing remote command: error executing command in container: Error executing in Docker Container: 1
`

When I run kubectl get pods, I notice that seldon-server-553220162-23o53 is stuck at 0/2 Pending.

$kubectl describe pod seldon-server-553220162-23o53
`FirstSeen LastSeen Count From SubobjectPath Type Reason Message


12m 6s 48 {default-scheduler } Warning FailedScheduling pod (seldon-server-553220162-23o53) failed to fit in any node
fit failure on node (minikubevm): Insufficient Memory`

I started Minikube with minikube start --memory=6000. When I describe the nodes I get:

`Name: minikubevm
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=minikubevm
Taints:
CreationTimestamp: Tue, 02 Aug 2016 15:43:49 +0900
Phase:
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message


OutOfDisk False Tue, 02 Aug 2016 18:13:33 +0900 Tue, 02 Aug 2016 15:43:49 +0900 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Tue, 02 Aug 2016 18:13:33 +0900 Tue, 02 Aug 2016 15:43:49 +0900 KubeletHasSufficientMemory kubelet has sufficient memory available
Ready True Tue, 02 Aug 2016 18:13:33 +0900 Tue, 02 Aug 2016 15:43:49 +0900 KubeletReady kubelet is posting ready status
Addresses: 10.0.2.15,10.0.2.15
Capacity:
alpha.kubernetes.io/nvidia-gpu: 0
cpu: 1
memory: 5958432Ki
pods: 110
Allocatable:
alpha.kubernetes.io/nvidia-gpu: 0
cpu: 1
memory: 5958432Ki
pods: 110
System Info:
Machine ID:
System UUID: 3A273620-856B-4504-80F4-1792529E648D
Boot ID: 22e7f8e6-3524-47ff-a47d-411a7a829c4a
Kernel Version: 4.4.14-boot2docker
OS Image: Boot2Docker 1.11.1 (TCL 7.1); master : 901340f - Fri Jul 1 22:52:19 UTC 2016
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://1.11.1
Kubelet Version: v1.3.3
Kube-Proxy Version: v1.3.3
ExternalID: minikubevm
Non-terminated Pods: (17 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits


default influxdb-grafana-xl5ec 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default kafka-controller-lxjw3 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default kafka-stream-impressions 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default kafka-stream-predictions 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default memcached1-9wl8j 0 (0%) 0 (0%) 260Mi (4%) 0 (0%)
default memcached2-bck0z 0 (0%) 0 (0%) 260Mi (4%) 0 (0%)
default mysql 0 (0%) 0 (0%) 3Gi (52%) 0 (0%)
default seldon-control 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default spark-master-controller-x6cqg 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default spark-worker-controller-87px0 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default spark-worker-controller-e5ba6 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default td-agent-server 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default zookeeper-1 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default zookeeper-2 0 (0%) 0 (0%) 0 (0%) 0 (0%)
default zookeeper-3 0 (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system kube-addon-manager-minikubevm 5m (0%) 0 (0%) 50Mi (0%) 0 (0%)
kube-system kubernetes-dashboard-lnr8r 0 (0%) 0 (0%) 0 (0%) 0 (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted. More info: http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md)
CPU Requests CPU Limits Memory Requests Memory Limits


5m (0%) 0 (0%) 3642Mi (62%) 0 (0%)
No events.`

So does that mean I can't run this demo on Minikube?
Thank you.

@ukclivecox
Contributor

It does look like a memory issue. Can you try increasing the allocation, e.g.:
minikube start --memory=10000
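
The figures in the kubectl describe nodes output above show how little room was left. This is a rough check using only numbers from that output; the variable names are made up for illustration, and the memory request of the seldon-server pod itself is not shown in this thread.

```shell
# Values copied from the `kubectl describe nodes` output earlier in this thread
allocatable_ki=5958432   # Allocatable memory on minikubevm, in Ki
requested_mi=3642        # total Memory Requests of the pods already scheduled, in Mi

# Convert Ki to Mi (integer division) and subtract what is already requested
allocatable_mi=$((allocatable_ki / 1024))
free_mi=$((allocatable_mi - requested_mi))

echo "allocatable: ${allocatable_mi}Mi, unrequested: ${free_mi}Mi"
```

The scheduler places pods according to their memory requests rather than actual usage, so a pod requesting more than roughly 2176Mi cannot fit on this node even though the kubelet reports sufficient memory.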

@Moonba

Moonba commented Aug 3, 2016

Yes, it worked with a 10GB memory allocation. I got the desired output after executing seldon-cli api --client-name ml100k --endpoint /js/recommendations --item 50 --limit 4

But when I went through the example step by step and reached the step that sets up the schema using JSON:
seldon-cli attr --action apply --client-name ml100k --json attr.json
I got this error:

IOError: [Errno 2] No such file or directory: 'attr.json' error: error executing remote command: error executing command in container: Error executing in Docker Container: 1

I get the same error even though the file exists in the seldon-server/docker/examples/ml100k directory. I retried with the full path in the command and still got the same "No such file or directory" message.

The following import steps fail in the same way, except that the message is "Invalid file[items.csv]":
connecting to zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 [SUCCEEDED] Invalid file[items.csv] error: error executing remote command: error executing command in container: Error executing in Docker Container: 1

My machine is a Mac running OS X El Capitan with 16GB of memory. To "ensure we have a nameserver for external DNS (seems to be required for local Docker running of Kubernetes)",
I can't use sudo echo "nameserver 8.8.8.8" >> /etc/resolv.conf (permission denied),
so I manually added the 8.8.8.8 nameserver to /etc/hosts and executed
sudo networksetup -setdnsservers Wi-Fi 8.8.8.8, which added nameserver 8.8.8.8 to the /etc/resolv.conf file automatically.

I also tried sudo echo "nameserver 8.8.8.8" >> /etc/resolv.conf on the Minikube machine.
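
The "permission denied" from sudo echo ... >> /etc/resolv.conf is expected: the >> redirection is performed by the calling, unprivileged shell, so sudo only elevates echo. A common workaround is to pipe into tee -a, which opens the file itself. The sketch below appends to a temporary stand-in file so that it is safe to run; the variable name target is made up for illustration.

```shell
# Stand-in for /etc/resolv.conf so the sketch can run without root
target=$(mktemp)

# tee -a opens and appends to the file itself, so tee is the process that needs the privileges.
# Against the real file this would be:
#   echo "nameserver 8.8.8.8" | sudo tee -a /etc/resolv.conf
echo "nameserver 8.8.8.8" | tee -a "$target" > /dev/null

cat "$target"
```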

Here's the whole log for the JSON schema:

`connecting to zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 [SUCCEEDED]
Traceback (most recent call last):

File "/opt/conda/bin/seldon-cli", line 4, in <module>
__import__('pkg_resources').run_script('seldon==2.0.2', 'seldon-cli')
File "/opt/conda/lib/python2.7/site-packages/setuptools-18.5-py2.7.egg/pkg_resources/__init__.py", line 742, in run_script
File "/opt/conda/lib/python2.7/site-packages/setuptools-18.5-py2.7.egg/pkg_resources/__init__.py", line 1667, in run_script
File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/EGG-INFO/scripts/seldon-cli", line 5, in <module>
seldon.cli.start_seldoncli()
File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/__init__.py", line 3, in start_seldoncli
cli_main.main()
File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cli_main.py", line 351, in main
cmds[cmd](opts,command_data, command_args)
File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cmd_attr.py", line 234, in cmd_attr
actions[action](command_data, opts)
File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cmd_attr.py", line 172, in action_apply
store_json(command_data,opts)
File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cmd_attr.py", line 106, in store_json
f = open(opts.json)
IOError: [Errno 2] No such file or directory: './attr.json'
error: error executing remote command: error executing command in container: Error executing in Docker Container: 1`

I can't spot a problem anywhere else:

`$kubectl get services --all-namespaces
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE

default kafka-service 10.0.0.87 9092/TCP 6h
default kubernetes 10.0.0.1 443/TCP 1d
default memcached1 10.0.0.20 11211/TCP 6h
default memcached2 10.0.0.35 11211/TCP 6h
default monitoring-grafana 10.0.0.31 80/TCP 6h
default monitoring-influxdb 10.0.0.151 8083/TCP,8086/TCP 6h
default mysql 10.0.0.161 3306/TCP 6h
default seldon-server 10.0.0.84 80/TCP 6h
default spark-master 10.0.0.119 7077/TCP 6h
default spark-webui 10.0.0.166 8080/TCP 6h
default td-agent-server 10.0.0.177 24224/TCP,24224/UDP 6h
default zookeeper-1 10.0.0.244 2181/TCP,2888/TCP,3888/TCP 6h
default zookeeper-2 10.0.0.2 2181/TCP,2888/TCP,3888/TCP 6h
default zookeeper-3 10.0.0.86 2181/TCP,2888/TCP,3888/TCP 6h
kube-system kube-dns 10.0.0.10 53/UDP,53/TCP 1d
kube-system kubernetes-dashboard 10.0.0.207 80/TCP 1d
`

I get the same error with the MovieLens 10M demo as well.

Thank you in advance.

@ukclivecox
Contributor

I think this is because the command runs in the seldon-control container, which does not have access to the local files on your system. One way around this is to move the required files to a location that seldon-control can access, such as /seldon-data if you are running with hostPath.

We should make this clearer in the documentation.
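
With Minikube, a hostPath volume refers to the filesystem of the Minikube VM rather than the machine where kubectl runs, so a file has to be moved across that boundary before seldon-cli can open it. One way to do this, sketched here under assumptions, is to stream the file over stdin with kubectl exec -i. The helper name copy_into_pod and the KUBECTL override are hypothetical, and the sketch assumes the container image provides sh, mkdir, dirname and cat.

```shell
# Hypothetical helper: stream a local file into a running pod over stdin.
# Usage: copy_into_pod LOCAL_FILE POD_NAME DESTINATION_PATH
# KUBECTL may be set to another command (for example, for a dry run); it defaults to kubectl.
copy_into_pod() {
  src="$1"
  pod="$2"
  dest="$3"
  # The remote sh receives the destination as $1, creates its parent
  # directory, and writes everything arriving on stdin to that path.
  "${KUBECTL:-kubectl}" exec -i "$pod" -- \
    sh -c 'mkdir -p "$(dirname "$1")" && cat > "$1"' sh "$dest" < "$src"
}

# Example call, using the pod and paths mentioned in this thread:
#   copy_into_pod attr.json seldon-control /seldon-data/ml100k/attr.json
```

After copying, the file can be referenced by its path inside the container, for example --json /seldon-data/ml100k/attr.json.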

@Moonba

Moonba commented Aug 4, 2016

I'm running the commands from my own terminal as an admin, not from a bash shell inside the seldon-control container
(and even when I tried that with $ kubectl exec -ti seldon-control -- /bin/bash, I ended up with the same error logs).
Yes, I'm using HostPath for persistent storage:
DATA_VOLUME="hostPath": {"path": "/seldon-data"}

To create the default HostPath kubernetes conf files set for /seldon-data do the following:
cd kubernetes/conf
make clean conf
Note: HostPath only makes sense for demo/testing where you have a Kubernetes cluster with a
single minion where all containers can share the location on the host.

You will need to create the host path folder on your single kubernetes minion.

I'm sorry, I didn't understand the last line. When I create the /seldon-data directory on my Minikube VM, it tells me "File already exists".

I understand that /seldon-data is a volume shared between the seldon-server container and the seldon-control container:
`seldon-server me$ kubectl exec seldon-server-553220162-63xiz ls /seldon-data

conf
grafana
influxdb
logs
mysql
seldon-models

seldon-server me$ kubectl exec seldon-control ls /seldon-data

conf
grafana
influxdb
logs
mysql
seldon-models `

Also, users.csv is empty when created:
items.csv is fine,
users.csv has no content,
and creating actions.csv fails with an error:
cat <(echo "user_id,item_id,value,time") <(cat ml-100k/ua.base | cut -f1,2,3,4 --output-delimiter=,) > actions.csv
cut: illegal option -- -
usage: cut -b list [-n] [file ...]
       cut -c list [file ...]
       cut -f list [-s] [-d delim] [file ...]

So I edited it to cat <(echo "user_id,item_id,value,time") <(cat ml-100k/ua.base | cut -f1,2,3,4 -d ,) > actions.csv and it ran; the file isn't empty.
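
One caveat on the edited command: cut -d , changes the input delimiter, not the output one. ua.base is tab-separated, so with -d , no delimiter is found and cut passes each line through unchanged, which leaves tab-separated rows under a comma-separated header. A portable sketch that works with both BSD (macOS) and GNU tools is to translate the tabs with tr. The sample rows and file names below are made up for illustration.

```shell
# Made-up sample in the tab-separated layout of ml-100k/ua.base
tmp=$(mktemp -d)
printf '1\t1\t5\t874965758\n1\t2\t3\t876893171\n' > "$tmp/ua.sample"

# cut keeps the first four tab-separated fields; tr turns the tabs into commas
{
  echo "user_id,item_id,value,time"
  cut -f1-4 "$tmp/ua.sample" | tr '\t' ','
} > "$tmp/actions.sample.csv"

cat "$tmp/actions.sample.csv"
```

For the real data, the same pipeline would read ml-100k/ua.base and write actions.csv.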

Thank you in advance .

@ukclivecox
Contributor

The main thing is that you need to create the files in /seldon-data, where seldon-control can see them.
If you are using hostPath with your local folder /seldon-data, you can run commands like the ones below, which I tested:

Create a folder /seldon-data/ml100k.
Create attrs.json in /seldon-data/ml100k with the values described in the docs.
Then, assuming you have downloaded and unzipped the raw data and run iconv on it as described in the docs:

cat <(echo 'id,title,release,url') <(cat ml-100k/u.item.utf8 | awk -F '|' '{printf("%d,\"%s\",\"%s\",\"%s\"\n",$1,$2,$3,$5)}') > /seldon-data/ml100k/items.csv
cat <(echo "id") <(cat ml-100k/u.user | cut -d'|' -f1) > /seldon-data/ml100k/users.csv
cat <(echo "user_id,item_id,value,time") <(cat ml-100k/ua.base | cut -f1,2,3,4 --output-delimiter=,) > /seldon-data/ml100k/actions.csv

and:

seldon-cli attr --action apply --client-name ml100k --json /seldon-data/ml100k/attrs.json
seldon-cli import --action items --client-name ml100k --file-path /seldon-data/ml100k/items.csv
seldon-cli import --action users --client-name ml100k --file-path /seldon-data/ml100k/users.csv
seldon-cli import --action actions --client-name ml100k --file-path /seldon-data/ml100k/actions.csv

You can then run MF and set up the algorithms as described.
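
Since an empty users.csv came up earlier in this thread, it may help to sanity-check each CSV before running the import commands. This is a minimal sketch; the helper name check_csv is hypothetical and it only verifies the header line and that at least one data row follows.

```shell
# Hypothetical helper: succeed only if FILE starts with EXPECTED_HEADER
# and contains at least one data row after it.
# Usage: check_csv FILE EXPECTED_HEADER
check_csv() {
  file="$1"
  expected="$2"
  # -s is true only for a file that exists and is not empty
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  header=$(head -n 1 "$file")
  [ "$header" = "$expected" ] || { echo "unexpected header in $file: $header" >&2; return 1; }
  # Every line after the header counts as a data row
  rows=$(( $(wc -l < "$file") - 1 ))
  [ "$rows" -ge 1 ] || { echo "no data rows in $file" >&2; return 1; }
  echo "$file: $rows data rows"
}

# Example calls, using the headers from the commands in this thread:
#   check_csv /seldon-data/ml100k/users.csv "id"
#   check_csv /seldon-data/ml100k/actions.csv "user_id,item_id,value,time"
```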
