Missing includes / libs in py2-numpy toolfile #2994
Comments
A new Issue was created by @riga Marcel R.. @davidlange6, @Dr15Jones, @smuzaffar can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
I don't think we ever expected people to build Python packages at a low level within CMSSW. This is why our toolfiles contain only the information needed to use Python packages.
I would probably suggest adding another tool to py2-numpy-toolfile.spec, maybe py2-numpy-devel.xml or py2-numpy-c-api.xml, which would define libraries and include directories. That one could be used for building packages which depend on these low-level APIs.
@davidlange6 I think you were working on the C++ API for tensorflow. |
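A minimal sketch of what such an additional tool definition might contain, written as a small Python helper that renders the XML. The `<environment name="INCLUDE" default="..." />` tag is taken from the issue description in this thread; the surrounding `<tool>`/`<client>` layout, the `_BASE` variable naming, and the helper name `render_toolfile` are assumptions for illustration, not the contents of the actual toolfile.

```python
import xml.etree.ElementTree as ET


def render_toolfile(name, version, base_dir, include_subdir):
    """Return a scram-style toolfile as an XML string.

    The element layout (<tool>/<client>/<environment>) is an assumption;
    only the INCLUDE environment tag is quoted in this thread.
    """
    tool = ET.Element("tool", name=name, version=version)
    client = ET.SubElement(tool, "client")
    # Hypothetical naming convention: tool name in upper case plus _BASE.
    base_var = name.upper().replace("-", "_") + "_BASE"
    ET.SubElement(client, "environment", name=base_var, default=base_dir)
    ET.SubElement(
        client,
        "environment",
        name="INCLUDE",
        default="$" + base_var + "/" + include_subdir,
    )
    return ET.tostring(tool, encoding="unicode")


if __name__ == "__main__":
    # Paths as reported in the issue description; adjust for other releases.
    print(
        render_toolfile(
            "py2-numpy-c-api",
            "1.11.1",
            "/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/py2-numpy/1.11.1-ikhhed2",
            "lib/python2.7/site-packages/numpy-1.11.1-py2.7-linux-x86_64.egg/numpy/core/include",
        )
    )
```

Keeping the base directory in its own variable means only one line has to change when the package version or architecture changes.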
not me, though it's on my to-do list - @mharrend was (and succeeded, I believe). Have a look here: #2824
|
My guess is py2-tensorflow.spec contains tensorflow + Python bindings. Do you know why we didn't build tensorflow.spec and then py2-tensorflow.spec (only Python bindings)? |
tensorflow is coming via a wheel, so indeed, maybe the C++ bindings are there too and I just didn't manage to access them. |
Thanks for your comments!
@davidlt Sounds good. Just tell me if I can provide or test something, I'm glad to help.
@davidlange6 Yep, I'm currently using his bundle for 80X ;) The wheel comes with C++ headers and libraries, however, I haven't yet managed to access them via editing my BuildFile. But this should be possible via a tool file, right? |
I looked at https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/stage and tried to check for headers -- found none to be part of our py2-tensorflow package. Could you double check? Btw, could you provide some information why you need C or C++ API (e.g. some run-time integration for CMSSW) or just because you want to use C++? |
Yeah, apparently model building is not yet supported by the C++ API, which is why there are no headers containing the ops. As far as I can see, graph evaluation should already work in C++, which requires the headers in […]. At the analysis level and for network training we use the Python API, and in general there is no reason not to ;) But for model evaluation on an event-by-event basis right within a CMSSW module (e.g. for electron MVAs, b-tagging, etc.), the C++ API would be very helpful. |
That's exactly what I wanted to hear. If we have reached the point where we want to integrate run-time model evaluation into CMSSW, that needs the C and/or C++ API. For that we would need to build tensorflow as a normal package inside CMSSW. |
That would be awesome. I know many groups within CMS that would really appreciate a working TF interface (including us, of course). In the meantime, in combination with the approach mentioned above, this additional py2-numpy-devel tool file would already do the trick. |
This was requested here: cms-sw#2994 This allows writing C extensions to Numpy within CMSSW. Signed-off-by: David Abdurachmanov <David.Abdurachmanov@cern.ch>
The following should work for you: #2998 IIRC, there are no explicit libraries we need to link to. |
Thanks @davidlt for providing this so fast! I'm not an expert...how can I use that file? Or will it be available at some point in the |
Once merged it will be available in IBs, which are built twice a day. If you want to use it now:
Note, you probably will need to adjust the paths. You can now use |
Works like a charm, thanks! |
I was thinking about tensorflow integration into CMSSW. I think we could do a quick and dirty integration similar to what we do for Oracle Instant Client and CUDA. We can grab the official binary builds and put them into CMSSW.
$ tar xvzf libtensorflow-cpu-linux-x86_64-1.1.0.tar.gz
./
./include/
./include/tensorflow/
./include/tensorflow/c/
./include/tensorflow/c/c_api.h
./lib/
./lib/libtensorflow.so
./include/tensorflow/c/LICENSE
$ ldd ./lib/libtensorflow.so
linux-vdso.so.1 (0x00007fff0f9ca000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f489ea2e000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f489e82a000)
libm.so.6 => /lib64/libm.so.6 (0x00007f489e521000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f489e199000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f489df82000)
libc.so.6 => /lib64/libc.so.6 (0x00007f489dbba000)
/lib64/ld-linux-x86-64.so.2 (0x000055dd55c99000)
This would provide you with the C API, but not C++. The only issue I see is that it's a fat binary (incl. all needed externals at specific versions), thus 90MB. Symbols are not hidden/internal, e.g. from jemalloc: je_arena_cleanup. Only the C (and later C++) API should be globally visible; the rest should be hidden. |
I looked into the tensorflow repository and found 2 issues already created in the last 10 days about symbol visibility. This is actually causing issues in real life. They think it's easy to resolve, but someone needs to do it. I also asked about binary distributions for the C++ API, which they don't provide at this point. The only issue could be the C++11 ABI. |
Hi @davidlt -
This seems OK until we have a tensorflow tool inside a bigger application (which will be one week after having a tensorflow tool in CMSSW). Or is it easy for us to hide these symbols?
As I mentioned to you yesterday, I was looking at how to plug in our own externals, but that is so far non-trivial.
|
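The visibility concern above can be checked on a given library by listing its exported dynamic symbols. The Python sketch below filters the text printed by `nm -D --defined-only`; the function name `non_api_exports`, the `TF_` prefix used to recognize the C API, and the sample input (with made-up addresses) are illustrative assumptions, not output from the real libtensorflow.so.

```python
def non_api_exports(nm_output, api_prefix="TF_"):
    """Return exported, defined symbols that do not start with api_prefix.

    nm_output is the text printed by `nm -D --defined-only <library>`,
    with one "<address> <type> <name>" entry per line.
    """
    leaked = []
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        _address, sym_type, name = fields
        # Uppercase type letters usually mark global (external) symbols.
        if sym_type.isupper() and not name.startswith(api_prefix):
            leaked.append(name)
    return leaked


if __name__ == "__main__":
    # Illustrative sample only; the addresses are invented.
    sample = (
        "0000000000001000 T TF_NewGraph\n"
        "0000000000002000 T je_arena_cleanup\n"
        "0000000000003000 t local_helper\n"
    )
    print(non_api_exports(sample))
```

Any name this returns for a real library is a candidate for clashing with another copy of the same external loaded into the same process.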
It is an easy one and someone is already working on this (seemingly without success for now). There are a couple of ways to do it. The easiest one, which works well for C libraries, is a symbol version script. That's what is being worked on. This also brings the ability to add symbol versions to the tensorflow API.
It works less well in C++ applications, where the compiler can generate gazillions of symbols (e.g. via templates). For that it would be better to use explicit attributes in the code.
I think it's much easier to work with the upstream community / Google to ensure that binary distributions are properly built and just take them.
IIUC, you don't want to use Eigen, jemalloc, etc. from CMSSW while building tensorflow, as it requires particular versions of these packages. It seems to be tightly integrated. Thus inside the SPEC file you would be building the same binary distribution as you could get from the upstream community / Google.
Those binary distributions also work by default for everything (CMS, ATLAS, Ubuntu, etc.).
I found a tensorflow SPEC in a copr repository (third-party, random user): http://copr-dist-git.fedorainfracloud.org/cgit/alonid/tensorflow/tensorflow.git/tree/tensorflow.spec?id=89062de403c2264c08c3559597a2dd878516e456 |
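As a rough illustration of the symbol version script approach mentioned above, this Python sketch renders a GNU ld version script that exports only names matching the given patterns and hides everything else. The helper name `render_version_script`, the `TF_*` pattern, and the optional version node name are assumptions for illustration; this is not the upstream fix.

```python
def render_version_script(exported_patterns, version_node=""):
    """Render a GNU ld version script.

    exported_patterns: glob patterns for symbols that stay globally visible.
    version_node: optional node name (hypothetical, e.g. "TENSORFLOW_1.1");
                  an empty string produces an anonymous node.
    """
    header = (version_node + " {") if version_node else "{"
    lines = [header, "  global:"]
    lines += ["    %s;" % pattern for pattern in exported_patterns]
    # Everything not listed under global: becomes local (hidden).
    lines += ["  local:", "    *;", "};"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_version_script(["TF_*"]))
```

The resulting file would be passed to the linker with `-Wl,--version-script=<file>`. For C++ code, the per-declaration alternative is typically to compile with `-fvisibility=hidden` and mark the public API with `__attribute__((visibility("default")))`.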
Thanks - I have basically this same recipe aside from having to change where tmp files go (defaulting to afs where quota fills quickly) but have not gotten to the end.
So OK, I was going in the wrong direction with regard to the externals inside tensorflow. I'll stop heading in the direction of removing them.
|
Hi @davidlt - does libtensorflow.so end up being the same thing as
/cvmfs/cms.cern.ch/slc7_amd64_gcc630/external/py2-pippkgs_depscipy/3.0-mlhled3/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow.so
? (minus a python api perhaps)
|
>
> I found tensorflow SPEC in copr repository (third-party, random user): http://copr-dist-git.fedorainfracloud.org/cgit/alonid/tensorflow/tensorflow.git/tree/tensorflow.spec?id=89062de403c2264c08c3559597a2dd878516e456
>
Thanks - I have basically this same recipe, aside from having to change where tmp files go (defaulting to afs, where quota fills quickly), but have not gotten to the end.
But to clarify, this is a spec for tensorflow + Python API. We already have the equivalent in CMSSW (just not built with our compiler/python). This doesn't bring the C (or C++) interface, but maybe that amounts to one file.
|
By the way, here is the recipe how I built the C++ Tensorflow API for CMSSW8, just in case you would like to build the C++ API as a CMSSW module.
|
Using |
If, at some point, you want to test a possible solution, I'm glad to help out. |
At this point I am not sure if I want to jump into it. It seems that the community has already identified the same (or similar) issues and will try to resolve them, but it's not a priority for them at this point. I did start compiling tensorflow and playing around, but have stopped for now. There are 3 main tasks:
It is interesting, but would cost some time. |
@riga, can we close this issue? |
@smuzaffar Yep! |
Hi!
I'm currently trying to write a CMSSW module (slc6_amd64_gcc530, CMSSW_8_0_26_patch2) that uses the Python and NumPy C APIs to load and evaluate Tensorflow graphs (https://gitlab.cern.ch/mrieger/CMSSW-DNN). The "Python.h" header is included properly, but it seems that "numpy/arrayobject.h" cannot be found at compile time.
I added the tools "python" and "py2-numpy" to my BuildFile.xml and could verify with "scram b -d" that python is used (-I/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/python/2.7.11-ikhhed2/include/python2.7), but numpy is missing. After digging deeper I saw that py2-numpy-toolfile.spec does not define any <environment name="INCLUDE" default="..." /> tag, unlike (e.g.) the python tool.
The numpy header files are located within the python module directory, e.g. here:
/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/py2-numpy/1.11.1-ikhhed2/lib/python2.7/site-packages/numpy-1.11.1-py2.7-linux-x86_64.egg/numpy/core/include
The libraries seem to be at the same location.
I can try to provide a PR with an updated tool file, but I'm not sure which branch to start from. Is it possible to update the configuration of a tool in the BuildFile.xml of my module for testing purposes?
Thanks for your help! I'm really looking forward to using our DNNs in CMSSW =)
PS: As it would be much simpler to directly use the tensorflow C++ API, would it also be possible to provide the tensorflow include files / libraries?
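For reference, the include directory described above can be located programmatically. The sketch below walks an installation tree and reports every directory that could be passed as `-I` so that `#include "numpy/arrayobject.h"` resolves; the function name `find_header_dirs` is hypothetical, and the approach is a generic file search rather than anything scram provides.

```python
import os


def find_header_dirs(root, header="numpy/arrayobject.h"):
    """Return directories under root usable as -I paths for the given header."""
    parts = header.split("/")
    file_name = parts[-1]
    # Relative directory the header is expected to live in ("numpy" by default).
    rel_dir = os.path.join(*parts[:-1]) if len(parts) > 1 else ""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if file_name not in filenames:
            continue
        if not rel_dir:
            matches.append(dirpath)
        elif dirpath.endswith(os.sep + rel_dir):
            # Strip the trailing "/numpy" to get the include directory itself.
            matches.append(dirpath[: -len(rel_dir) - 1])
    return sorted(matches)


if __name__ == "__main__":
    # Package root as reported in this issue; adjust for other releases.
    root = "/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/py2-numpy/1.11.1-ikhhed2"
    for include_dir in find_header_dirs(root):
        print("-I" + include_dir)
```

Where NumPy is importable, `numpy.get_include()` reports the same directory without a file search.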