Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

library(cytolib); library(mzR) crashes R 4.0.0 alpha on Linux #37

Closed
hpages opened this issue Apr 9, 2020 · 6 comments
Closed

library(cytolib); library(mzR) crashes R 4.0.0 alpha on Linux #37

hpages opened this issue Apr 9, 2020 · 6 comments

Comments

@hpages
Copy link
Contributor

hpages commented Apr 9, 2020

Hi,

I don't know if I should fill an issue here or under mzR so I'm doing both, sorry for that.

So on Linux, with R 4.0.0 alpha and current devel versions of cytolib and mzR, trying to load the 2 packages in this order crashes my session:

> library(cytolib)
> library(mzR)
Loading required package: Rcpp
terminate called after throwing an instance of 'H5::DataSpaceIException'
Aborted (core dumped)

No problem if I load them in the reverse order:

> library(mzR)
Loading required package: Rcpp
> library(cytolib)
> sessionInfo()
R version 4.0.0 alpha (2020-03-31 r78116)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /home/biocbuild/bbs-3.11-bioc/R/lib/libRblas.so
LAPACK: /home/biocbuild/bbs-3.11-bioc/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] cytolib_1.99.24 mzR_2.21.1      Rcpp_1.0.4     

loaded via a namespace (and not attached):
[1] compiler_4.0.0      RProtoBufLib_1.99.7 ProtGenerics_1.19.3
[4] parallel_4.0.0      Biobase_2.47.3      codetools_0.2-16   
[7] ncdf4_1.17          BiocGenerics_0.33.3 RcppParallel_5.0.0 

Thanks,
H.

@hpages
Copy link
Contributor Author

hpages commented Apr 9, 2020

Same issue filled under mzR: sneumann/mzR#219

@mikejiang
Copy link
Member

Here is (gdb) backtrace

...
#5  0x00007ffff30f4d54 in __cxa_throw () ...
#6  0x00007fffeec5b52b in H5::DataSpace::getConstant () at H5DataSpace.cpp:63
#7  0x00007fffe7455a2f in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at H5DataSpace.cpp:81
...
#13 0x00007ffff7de97ca in _dl_open (file=0x7ffffffec670 "../mzR/libs/mzR.so"

And here is the relevant lines in h5 source
H5DataSpace.cpp:81

//--------------------------------------------------------------------------
// Purpose      Constant for default dataspace.
//--------------------------------------------------------------------------
const DataSpace& DataSpace::ALL = *getConstant();

and H5DataSpace.cpp:63

   // If the constant pointer is not allocated, allocate it. Otherwise,
    // throw because it shouldn't be.
    if (ALL_ == 0)
        ALL_ = new DataSpace(H5S_ALL);
    else
        throw DataSpaceIException("DataSpace::getConstant", "DataSpace::getConstant is being invoked on an allocated ALL_");

Looks like the allocation of global constant variable H5::DataSpace::ALL is somehow triggered twice when loading mzR.so

@mikejiang
Copy link
Member

mikejiang commented Apr 9, 2020

It has to do with the way cytolib is used in R by other packages (e.g. flowCore, flowWorkspace, CytoML). To avoid the the duplication of cytolib binary , we build the cytolib as shared library instead of static library. But we had problem in the past with finding the cytolib.so at runtime by the other package. So inspired by https://github.com/RcppCore/RcppParallel/blob/master/R/hooks.R#L13.
we solved this problem by loading the symbols of cytolib.so to the global symbol table of R process, i.e. setting local = FALSE in dyn.load call

         dllInfo <<- dyn.load(cytolib, local = FALSE, now = TRUE)

so that cytolib is automatically visible to other packages during both linking and runtime .

This has been working great since it not only eases the process compiling the other cytolib-dependent packages ( no explicit linking to cytolib is needed ) but also reduce the memory footprint

7.7M CytoML.so
 13M flowWorkspace.so
4.1M flowCore.so

because they all share the same copy of

 38M cytolib.so

But unfortunately, the side effect is libhdf5_cpp.a (required by and statically linked to cytolib) also gets loaded into global symbol table along with cytolib. And h5 uses global variable extensively and it causes clash (for this particular case H5::DataSpace::ALL) with another copy of libhdf5_cpp that is also statically linked to mzR.

@mikejiang
Copy link
Member

After rebuilding cytolib as static lib,

 54M libcytolib.a

which is statically linked to other cyto packages (with size increased as expected, except flowCore since it only uses small portion of cytolib and the majority unused symbols are dropped by linker).

48M flowWorkspace.so
 42M CytoML.so
5.8M flowCore.so

All the package tests seems to running ok, except for the gatingset_to_flowjo tests of CytoML, which randomly crashes or freezes at the second test case,or simply throw the libprotobuf error at save_gs call

CHECK failed: (scc->visit_status.load(std::memory_order_relaxed)) == (SCCInfoBase::kRunning):

rebuilding flowWorkspace by adding --pthread to linker (as suggested by some) doesn't seem to fix the issue.

@hpages I am going to put this change on hold until the bioc 3.11 release is completed, hopefully by loading mzR before cytolib the two packages still can coexist and work in the same R process.

mikejiang pushed a commit to RGLab/CytoML that referenced this issue Jan 21, 2021
@mikejiang
Copy link
Member

addressed by #45

@hpages
Copy link
Contributor Author

hpages commented Jan 23, 2021

Thx Mike.

Note that I get the following warning when loading RProtoBufLib 2.3.2 on Linux and Mac:

> library(RProtoBufLib)
Warning message:
In fun(libname, pkgname) : libGatingSet.pb library  not found.

But the warning is spurious because I do see libGatingSet.pb.so and libGatingSet.pb.dylib at the expected locations. You probably want to replace "GatingSet.pb.o" with pbSupported[[sysname]] in this line, like you did in RProtoBufLib:::pbLibPath(). Also you're almost certainly aware of paste0() as a convenient replacement for paste(..., sep="") ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants