undefined symbol: mem_fence #15

Closed
zyom opened this issue Mar 23, 2018 · 10 comments

Comments

zyom commented Mar 23, 2018

Recently I found your software and wanted to give it a try to model some powder diffraction experiments. I was already able to model my instrument, but now I have trouble with the powder sample.

When I run the XRD example file, I get the following error in the GUI:

/usr/lib/python3/dist-packages/pyopencl/cffi_cl.py:1466: CompilerWarning: Built kernel retrieved from cache. Original from-source build had warnings:
Build on <pyopencl.Device 'pthread-Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz' on 'Portable Computing Language' at 0x55f51b58de30> succeeded, but said:

warning: /home/----/.cache/pocl/kcache/temp_N50SFB.cl:30:5: implicit declaration of function 'mem_fence' is invalid in C99

 warn(text, CompilerWarning)
OpenCL: bulding None ...
OpenCL: found 1 CPU
OpenCL: found none GPU
OpenCL: found none other accelerator
OpenCL for None: Autoselected device 0: pthread-Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz
precisionOpenCL = float64
OpenCL: bulding None ...
OpenCL: found 1 CPU
OpenCL: found none GPU
OpenCL: found none other accelerator
OpenCL for None: Autoselected device 0: pthread-Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz
precisionOpenCL = float64
Rays generation
/usr/bin/python3: symbol lookup error: /home/----/.cache/pocl/kcache/GI/ONBCCOFNMACGDHFMINNJLNIIMDMLHIJIFIAEH/undulator/3128-1-1/undulator.so: undefined symbol: mem_fence'

I'm working on a Debian stretch system.

Do you have any hints on how to solve this problem?

Thanks
Armin

Author

zyom commented Mar 23, 2018

I forgot to mention that I'm working with the most recent sources from the git repository.

Owner

kklmn commented Mar 23, 2018

Hello,
Please give the output of
...\tests\raycing\info_opencl.py
When it worked before, was that on the same system?
Konstantin

Author

zyom commented Mar 23, 2018

Hello Konstantin,

I did not try OpenCL before because it was not required, but the powder sample cannot run without it.

Here is the output of info_opencl.py:

============================================================
OpenCL Platforms and Devices
============================================================
Platform - Name:  Portable Computing Language
Platform - Vendor:  The pocl project
Platform - Version:  OpenCL 2.0 pocl 0.13, LLVM 3.8.1
Platform - Extensions:  cl_khr_icd
Platform - Profile:  FULL_PROFILE
    --------------------------------------------------------
    Device - Name:  pthread-Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz
    Device - Vendor:  GenuineIntel
    Device - Type:  3
    Device - Max Clock Speed:  3192 Mhz
    Device - Compute Units:  4
    Device - Local Memory:  3728352 KB
    Device - Constant Memory:  3728352 KB
    Device - Global Memory: 4 GB
    Device - FP:  6


<pyopencl.Context at 0x55ae66e3a0f0 on <pyopencl.Device 'pthread-Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz' on 'Portable Computing Language' at 0x55ae66e39170>>
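
For readers who want a similar listing without the xrt test script, a minimal sketch with pyopencl might look like the following. It is not the source of info_opencl.py: the helper names `format_device` and `list_opencl` are hypothetical, and the layout only approximates the output above. The platform and device indices it prints are the positions in the lists returned by pyopencl.

```python
def format_device(index, info):
    """Return printable lines for one device; `info` is a plain dict."""
    gib = info["global_mem_bytes"] / 1024.0 ** 3
    return [
        "    Device {0} - Name:  {1}".format(index, info["name"]),
        "    Device {0} - Vendor:  {1}".format(index, info["vendor"]),
        "    Device {0} - Compute Units:  {1}".format(index, info["compute_units"]),
        "    Device {0} - Global Memory: {1:.0f} GB".format(index, gib),
    ]


def list_opencl():
    """Collect a text listing of all OpenCL platforms and devices."""
    try:
        import pyopencl as cl
    except ImportError:
        return ["pyopencl is not installed"]
    lines = []
    try:
        for p_index, platform in enumerate(cl.get_platforms()):
            lines.append("Platform {0} - Name:  {1}".format(p_index, platform.name))
            lines.append("Platform {0} - Version:  {1}".format(p_index, platform.version))
            for d_index, device in enumerate(platform.get_devices()):
                lines.extend(format_device(d_index, {
                    "name": device.name,
                    "vendor": device.vendor,
                    "compute_units": device.max_compute_units,
                    "global_mem_bytes": device.global_mem_size,
                }))
    except cl.Error as err:
        # Raised, for example, when no OpenCL driver (ICD) is installed.
        lines.append("OpenCL query failed: {0}".format(err))
    return lines


if __name__ == "__main__":
    print("\n".join(list_opencl()))
```

The platform version string is the part to look at here, since it reveals which OpenCL implementation and release is in use.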

By the way, I'm using the software inside a virtual machine because I was not able to get xrtGlow working on my Windows machine. I could also try another Linux distribution if there is a preferred one.

Thanks for your assistance
Armin

Owner

kklmn commented Mar 23, 2018

I've never worked with OpenCL in a virtual machine. I find it odd to seek hardware acceleration in a virtualized environment. I don't even know whether this should work.

As to the issue, I see your pocl is old. I see in their GitHub that mem_fence was implemented quite recently. You may have reasons to prefer pocl over Intel's CPU-only Runtime Packages but I would try the latter first.
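
When more than one OpenCL implementation is installed side by side, the choice between them can be made by platform name. The sketch below is only an illustration of that idea: `prefer_vendor_platform` is a hypothetical helper, not part of xrt or pyopencl, and it assumes that pocl reports itself as "Portable Computing Language", as in the output posted above.

```python
POCL_NAME = "Portable Computing Language"  # platform name reported by pocl above


def prefer_vendor_platform(platform_names):
    """Return the index of the first non-pocl platform.

    Falls back to index 0 if only pocl is present and returns None
    for an empty list.
    """
    if not platform_names:
        return None
    for index, name in enumerate(platform_names):
        if POCL_NAME not in name:
            return index
    return 0


if __name__ == "__main__":
    try:
        import pyopencl as cl
        names = [platform.name for platform in cl.get_platforms()]
    except Exception:
        # pyopencl is missing or no OpenCL driver is installed.
        names = []
    print("platforms:", names)
    print("preferred platform index:", prefer_vendor_platform(names))
```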

As to the trouble with xrtGlow in Windows, please see one of the latest closed Issues. If in Anaconda, it's better to install pyopengl from Gohlke's page. If in WinPython, you should install it with pip from its standard PyPI source. I don't know why this is so.

Collaborator

yxrmz commented Mar 24, 2018

It is possible to run OpenCL applications in VirtualBox, but you still have to install a proper OpenCL driver. Download and install the Intel OpenCL runtime from https://software.intel.com/en-us/articles/opencl-drivers
The main advantage of OpenCL is low-level access to hardware, so don't expect good performance in a virtual machine. It would be much better to run xrt in a native environment. As Konstantin said, try the Gohlke OpenCL and OpenGL *.whl packages.
By the way, I had to update the XRD example to account for recent changes in the automatic alignment and reflect procedures.

Author

zyom commented Mar 27, 2018

> As to the issue, I see your pocl is old. I see in their GitHub that mem_fence was implemented quite recently. You may have reasons to prefer pocl over Intel's CPU-only Runtime Packages but I would try the latter first.

Yes, the reason is lack of knowledge, and that pocl is the default dependency for pyopencl in Debian ;-). I was not aware that there are different implementations of OpenCL. I have now installed the Intel OpenCL drivers on my Windows host and pyopencl from conda-forge. This way the XRD example works but is rather slow: it takes approximately 90s for one step to calculate. Can this be expected on my ~5-year-old CPU, or is something still misconfigured?

Regarding xrtGlow on Windows I will try to use the Gohlke *.whl packages.

Thanks so far for your assistance

Collaborator

yxrmz commented Mar 27, 2018

> It takes approximately 90s for one step to calculate. Can this be expected on my ~5-year-old CPU, or is something still misconfigured?

It's all about double-precision performance. The code calculates the reflectivities for each combination of HKL and takes the most probable one, so you could try reducing PowderSample.hkl for faster results. (Strictly speaking, we should have used Monte Carlo discrimination here, but that would slow down the process dramatically. It's an easy fix after all; just let me know if you are interested in a somewhat more precise result at the price of overall performance.)
For comparison, my numbers are the following: full tracing time 4.5s on Radeon R9 280x (1 TFLOPS FP64), 1.5s for powder diffraction only, 41s (38s) on i7-2600K, so 90 seconds looks reasonable for an old CPU.
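
The difference between the two selection strategies mentioned above can be illustrated with a small sketch. This is not xrt's kernel code: the function names and the reflectivity values are hypothetical, and "Monte Carlo discrimination" is read here as picking a reflection at random with probability proportional to its reflectivity.

```python
import random


def pick_strongest(reflectivities):
    """Deterministic choice: index of the reflection with the highest reflectivity."""
    return max(range(len(reflectivities)), key=lambda i: reflectivities[i])


def pick_monte_carlo(reflectivities, rng=random):
    """Random choice: index drawn with probability proportional to reflectivity."""
    if sum(reflectivities) <= 0:
        raise ValueError("at least one reflectivity must be positive")
    indices = range(len(reflectivities))
    return rng.choices(indices, weights=reflectivities, k=1)[0]


if __name__ == "__main__":
    # Hypothetical reflectivities for three hkl combinations of one ray.
    reflectivities = [0.10, 0.70, 0.20]
    print("strongest:", pick_strongest(reflectivities))
    rng = random.Random(0)  # fixed seed so repeated runs give the same draws
    draws = [pick_monte_carlo(reflectivities, rng) for _ in range(1000)]
    print("Monte Carlo counts:", [draws.count(i) for i in range(3)])
```

With the deterministic choice every ray takes the strongest reflection, whereas the random choice also populates the weaker ones. In both variants the work per ray grows with the number of hkl combinations, which is why a shorter hkl list is faster.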

kklmn closed this as completed Apr 20, 2018

mmnmjm commented Jun 25, 2018

I have set up OpenCL as suggested in this issue.
Hardware of my PC: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz.
Software: Windows-7, Python 2.7.13, Qt 5.6.2, PyQt5 5.6, pyopencl 2018.1.1, OpenGL 3.1.1a1, xrt-git-23092018
I can run xrtQook and Glow. But if I select targetOpenCL=[1,0] or r"CPU" in my OE, an additional beam is emitted from the source. The screen is shifted to this second beam and is completely off the beamline. Could it be that I have a problem similar to the one mentioned by yxrmz:

> By the way, I had to update the XRD example to account for recent changes in the automatic alignment and reflect procedures.

Collaborator

yxrmz commented Jun 26, 2018

> I can run xrtQook and Glow. But if I select targetOpenCL=[1,0] or r"CPU" in my OE, an additional beam is emitted from the source. The screen is shifted to this second beam and is completely off the beamline.

Is this in the XRD example, or is it something else?


mmnmjm commented Jun 26, 2018

Meanwhile I have downloaded the latest release, xrt 1.3.2, and now it works.
