This repository has been archived by the owner on Jan 7, 2021. It is now read-only.

Caffe - ERROR: Unable to create handle to FPGA #41

Closed
computingdolas opened this issue Sep 23, 2018 · 15 comments

Comments

@computingdolas

opendir: Path /sys/bus/pci/devices/0000:00:1d.0/drm does not exist or could not be read: No such file or directory
[0]user:0x1042:0x7:[xdma:2017.1.47:65535]
xclProbe found 1 FPGA slots with xocl driver running
WARNING: AwsXcl - Cannot open userPF: /dev/dri/renderD65535
WARNING: AwsXcl isGood: invalid user handle.
WARNING: xclOpen Handle check failed
[0]user:0xf010:0x1d51:[???:??:0]
device[0].user_instance : 0
WARNING: AwsXcl - Cannot open userPF: /dev/dri/renderD0
WARNING: AwsXcl isGood: invalid user handle.
ERROR: xclOpen Handle check failed
ERROR: Failed to find an OpenCL platform

Is anyone aware of this problem?

Thanks
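The opendir message and the "Cannot open userPF" warnings both point at DRM render nodes, so one thing worth checking is which PCI devices actually expose one. Below is a minimal sketch, assuming the runtime looks for a `drm` directory under each PCI device in sysfs (that is inferred from the log above, not confirmed). The function name `find_render_nodes` is hypothetical.

```python
import os


def find_render_nodes(sysfs_root="/sys/bus/pci/devices"):
    """Map each PCI address to the DRM render nodes registered under it."""
    nodes = {}
    if not os.path.isdir(sysfs_root):
        return nodes
    for address in sorted(os.listdir(sysfs_root)):
        drm_dir = os.path.join(sysfs_root, address, "drm")
        if not os.path.isdir(drm_dir):
            # Same situation the opendir message in the log describes.
            continue
        renders = sorted(
            name for name in os.listdir(drm_dir) if name.startswith("renderD")
        )
        if renders:
            nodes[address] = renders
    return nodes
```

Each name returned should correspond to a file under /dev/dri. An empty result would suggest that no driver has registered a render node for the FPGA, which would be consistent with the renderD65535 placeholder in the log.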

@wilderfield
Contributor

wilderfield commented Sep 23, 2018 via email

@computingdolas
Author

I am running with sudo permissions. Let me try xbsak query and update you.

Thanks

@computingdolas
Author

opendir: Path /sys/bus/pci/devices/0000:00:1d.0/drm does not exist or could not be read: No such file or directory
[0]user:0x1042:0x7:[xdma:2017.1.47:65535]
xclProbe found 1 FPGA slots with xocl driver running
WARNING: AwsXcl - Cannot open userPF: /dev/dri/renderD65535
WARNING: AwsXcl isGood: invalid user handle.
WARNING: xclOpen Handle check failed
[0]user:0xf010:0x1d51:[???:??:0]
device[0].user_instance : 0
WARNING: AwsXcl - Cannot open userPF: /dev/dri/renderD0
WARNING: AwsXcl isGood: invalid user handle.
ERROR: xclOpen Handle check failed
ERROR: Failed to find an OpenCL platform
xbsak query
bash: xbsak: command not found
(ml-suite) [root@ip-172-31-82-205 centos]# xbsak query
bash: xbsak: command not found
(ml-suite) [root@ip-172-31-82-205 centos]#

As you can see, I am running as root. xbsak query does not work either.

What else can I try?
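"command not found" under root often just means the tool is not on root's PATH, so it may help to check that separately from the FPGA problem. This is a minimal Python 3 sketch using only the standard library. The helper name `locate_tools` is hypothetical, and whether xbsak is installed on this image at all is not known from the thread.

```python
import os
import shutil


def locate_tools(names):
    """Return {tool name: absolute path, or None if not found on PATH}."""
    return {name: shutil.which(name) for name in names}


def running_as_root():
    """True when the effective user ID is 0 (POSIX systems only)."""
    return os.geteuid() == 0
```

For example, `locate_tools(["xbsak"])` returning `{"xbsak": None}` in the root shell would show that the binary is not on that shell's PATH. It would not show that the binary is missing from the machine.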

@adu81020799

I was trying to run the new developer notebook that was added recently:
https://github.com/Xilinx/ml-suite/blob/master/notebooks/Xilinx-ML-Developer-Lab/ml-suite-developer-lab.ipynb.
I launched the notebook using the new start_ami.sh script (this was really helpful, kudos to you all), but in the last step I was not able to create the FPGA handle. I know that sudo permission is needed to access the FPGA, so I tried changing the script to mask off the sudo users, but then Jupyter failed to load the notebooks. I see the same error when I launch Jupyter Notebook from the ml-suite environment.
Please let me know where I might be going wrong.

@adu81020799

Does anyone have any pointers on this problem?

@computingdolas
Author

Don't use the ML Suite AMI; use the Amazon FPGA Developer AMI instead. It will work as expected. That is what I did as well.

@adu81020799

Hi, I am not using the ML Suite AMI. I am currently using the FPGA Developer AMI. Did you try to run the notebooks?

@computingdolas
Author

computingdolas commented Oct 1, 2018

Yeah, I can run them. What exactly is the problem you are facing? Is it launching the Jupyter notebooks?

@adu81020799

I was trying to run the developer lab from the notebook below:
https://github.com/Xilinx/ml-suite/blob/master/notebooks/Xilinx-ML-Developer-Lab/ml-suite-developer-lab.ipynb
It fails at the part where it opens the FPGA handle:

    ret = pyxfdnn.createHandle(config['xclbin'], "kernelSxdnn_0", config['xfdnn_library'])
    if ret:
        print("ERROR: Unable to create handle to FPGA")
    else:
        print("INFO: Successfully created handle to FPGA")

The output is: ERROR: Unable to create handle to FPGA
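Before digging into the driver, it may be worth ruling out a bad path in the config that is passed to createHandle. This is a minimal sketch, assuming `config` is a plain dict and that both entries are file paths, as the call above suggests. The helper `missing_handle_inputs` is hypothetical and is not part of pyxfdnn.

```python
import os


def missing_handle_inputs(config, keys=("xclbin", "xfdnn_library")):
    """Return a list of human-readable problems with the configured paths."""
    problems = []
    for key in keys:
        path = config.get(key)
        if not path:
            problems.append("%s: not set in config" % key)
        elif not os.path.isfile(path):
            problems.append("%s: no such file: %s" % (key, path))
        elif not os.access(path, os.R_OK):
            problems.append("%s: not readable: %s" % (key, path))
    return problems
```

An empty list only means the files exist and are readable. The handle can still fail for driver or device reasons like the ones in the logs earlier in this thread.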

To run the Jupyter notebooks I am using the start_ami.sh script.

Can you tell me how you are opening the Jupyter notebook? If I open it with ML Suite, I get the following error:
    sys.exit(main())
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/jupyter_core/application.py", line 266, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/traitlets/config/application.py", line 657, in launch_instance
    app.initialize(argv)
  File "", line 2, in initialize
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/notebook/notebookapp.py", line 1629, in initialize
    self.init_webapp()
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/notebook/notebookapp.py", line 1379, in init_webapp
    self.jinja_environment_options,
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/notebook/notebookapp.py", line 158, in init
    default_url, settings_overrides, jinja_env_options)
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/notebook/notebookapp.py", line 251, in init_settings
    allow_remote_access=jupyter_app.allow_remote_access,
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/traitlets/traitlets.py", line 556, in get
    return self.get(obj, cls)
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/traitlets/traitlets.py", line 535, in get
    value = self._validate(obj, dynamic_default())
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/notebook/notebookapp.py", line 872, in _default_allow_remote
    for info in socket.getaddrinfo(self.ip, self.port, 0, socket.SOCK_STREAM):
socket.gaierror: [Errno -2] Name or service not known
(ml-suite) [root@ip-172-31-62-130 ml-suite]#

@wilderfield
Contributor

wilderfield commented Oct 2, 2018 via email

@adu81020799

Here is the output of the commands I tried, as per your reply.

[centos@ip-172-31-62-130 ~]$ sudo su
bash: conda: command not found...
[root@ip-172-31-62-130 centos]# source ~centos/.bashrc
(base) [root@ip-172-31-62-130 centos]# conda activate ml-suite
(ml-suite) [root@ip-172-31-62-130 centos]# source ml-suite/overlaybins/setup.sh aws
make: Entering directory `/home/centos/ml-suite/apps/yolo/nms'
cd ./nms_20180209 && make
make[1]: Entering directory `/home/centos/ml-suite/apps/yolo/nms/nms_20180209'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/centos/ml-suite/apps/yolo/nms/nms_20180209'
make: Leaving directory `/home/centos/ml-suite/apps/yolo/nms'
(ml-suite) [root@ip-172-31-62-130 centos]# which jupyter
/home/centos/anaconda2/envs/ml-suite/bin/jupyter
(ml-suite) [root@ip-172-31-62-130 centos]# jupyter notebook --no-browser --ip=*
Traceback (most recent call last):
  File "/home/centos/anaconda2/envs/ml-suite/bin/jupyter-notebook", line 11, in
    sys.exit(main())
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/jupyter_core/application.py", line 266, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/traitlets/config/application.py", line 657, in launch_instance
    app.initialize(argv)
  File "", line 2, in initialize
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/notebook/notebookapp.py", line 1629, in initialize
    self.init_webapp()
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/notebook/notebookapp.py", line 1379, in init_webapp
    self.jinja_environment_options,
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/notebook/notebookapp.py", line 158, in init
    default_url, settings_overrides, jinja_env_options)
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/notebook/notebookapp.py", line 251, in init_settings
    allow_remote_access=jupyter_app.allow_remote_access,
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/traitlets/traitlets.py", line 556, in get
    return self.get(obj, cls)
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/traitlets/traitlets.py", line 535, in get
    value = self._validate(obj, dynamic_default())
  File "/home/centos/anaconda2/envs/ml-suite/lib/python2.7/site-packages/notebook/notebookapp.py", line 872, in _default_allow_remote
    for info in socket.getaddrinfo(self.ip, self.port, 0, socket.SOCK_STREAM):
socket.gaierror: [Errno -2] Name or service not known
(ml-suite) [root@ip-172-31-62-130 centos]#
This is the same error I was getting before.
Regards
Adarsh
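The traceback ends in socket.getaddrinfo(self.ip, ...), and the server was started with --ip=*, so the failing lookup is most likely for the literal string "*". The sketch below repeats that same call outside Jupyter to see which values resolve. The helper name `can_resolve` is hypothetical. Starting the server with an explicit address such as 0.0.0.0 is a commonly used alternative, but whether that fixes this particular setup is an assumption, not something confirmed in this thread.

```python
import socket


def can_resolve(host, port=8888):
    """Return True if getaddrinfo accepts the host, mirroring the failing call."""
    try:
        socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        # Same exception type as in the traceback above.
        return False


if __name__ == "__main__":
    for candidate in ("*", "0.0.0.0", "127.0.0.1"):
        print(candidate, can_resolve(candidate))
```

Numeric addresses resolve without any DNS lookup, which is why they are a safe thing to compare against.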

@wilderfield
Contributor

wilderfield commented Oct 2, 2018 via email

@adu81020799

I tried the command, but it did not work. Thank you for the help. I have not been able to find the root cause either.

@wilderfield
Contributor

wilderfield commented Jan 30, 2019

https://aws.amazon.com/marketplace/pp/B077FM2JNS

A new AMI is available. Jupyter will be running at startup.

Go to publicDNS:8888, using your instance's public DNS name.

@aez-lab

aez-lab commented Jun 19, 2019

Same issue in ML Suite v1.4:

I am trying to run the ML Suite notebook object_detection_yolov2.ipynb, but I get this in step 6 (https://github.com/Xilinx/ml-suite/blob/master/notebooks/object_detection_yolov2.ipynb):

"ERROR: Unable to create handle to FPGA"

In the terminal, the error is:

[XBLAS] # kernels: 1
ERROR: No devices found
ERROR: Failed to find an OpenCL platform

In the previous cell, step 5, I get a couple of "Exception 'Layer' object has no attribute 'pooling_param'" messages. I am not sure whether they are related to this.

I am using ML Suite v1.4 on AWS with CentOS Linux release 7.5.1804.

Any idea how to fix it?
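"No devices found" can mean the kernel driver is not loaded at all, so a first check is whether the module shows up in /proc/modules. This is a minimal sketch. The driver name xocl is taken from the xclProbe line in the first log of this thread, and it is an assumption that v1.4 uses the same driver. The function name `loaded_modules` is hypothetical.

```python
def loaded_modules(proc_modules="/proc/modules"):
    """Return the set of kernel module names currently listed by the kernel."""
    names = set()
    try:
        with open(proc_modules) as handle:
            for line in handle:
                fields = line.split()
                if fields:
                    # The first whitespace-separated field is the module name.
                    names.add(fields[0])
    except OSError:
        # File missing or unreadable: report nothing rather than crash.
        pass
    return names
```

On the instance, `"xocl" in loaded_modules()` gives a quick yes or no. A "no" would point at driver installation rather than at the notebook itself.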
