
Stop sharing #4

Closed
fedeago opened this issue Jul 3, 2020 · 12 comments

Comments

@fedeago

fedeago commented Jul 3, 2020

Good morning again, is it possible to stop an object from being shared?
In the package I am developing, the matrices that I share internally (with the option CopyOnWrite=F) are part of the output of the model, but I would like, if possible, to stop sharing them in order to give an output with a "normal" behaviour.

Thank you for your work; it is really valuable to me.

@Jiefei-Wang
Owner

Jiefei-Wang commented Jul 4, 2020

Hi @fedeago. Yes, we have an unshare function to stop sharing. It is currently only available on the devel branch (I highly suggest using the devel version, as we have updated much of the code in the package). Please note that unshare does not unshare the object itself; rather, it returns a normal R object that is equivalent to the old one. The old shared object still exists after you call unshare. Please let me know if you have more concerns.

Best,
Jiefei
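For reference, a minimal sketch of the share/unshare round trip described above, assuming the devel-branch API with share(), unshare(), and is.shared() (argument names here, such as copyOnWrite, are taken from this thread and may differ slightly in your installed version):

```r
library(SharedObject)

x  <- matrix(runif(9), 3, 3)
sx <- share(x, copyOnWrite = FALSE)  # matrix backed by shared memory

is.shared(sx)  # TRUE: sx lives in shared memory

y <- unshare(sx)  # returns an equivalent, ordinary R matrix
is.shared(y)      # FALSE: y is a normal object, while sx itself is still shared
```

The key point is that unshare() is a copy-out operation: it does not convert sx in place, so the shared-memory segment behind sx remains allocated until sx is garbage collected.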

@fedeago
Author

fedeago commented Jul 14, 2020

Hi,
Thank you, that's exactly what I was looking for.

Unfortunately, this new version of SharedObject (1.3.9) gives me some problems. Running devtools::check() or R CMD check, the package tests fail with this error:

error reading from connection
Backtrace:
  1. NewWave::newFit(counts, X = model.matrix(~bio), commondispersion = FALSE) tests/testthat/test_design.R:9:4
  2. NewWave::newFit(counts, X = model.matrix(~bio), commondispersion = FALSE) R/AllGenerics.R:177:21
  3. NewWave:::.local(Y, ...)
  4. NewWave::optimization(...) R/newFit.R:213:4
  5. parallel::clusterApply(...) R/newFit.R:584:8
  6. parallel:::staticClusterApply(cl, fun, length(x), argfun)
  7. base::lapply(cl[1:jobs], recvResult)
  8. parallel:::FUN(X[[i]], ...)
 10. parallel:::recvData.SOCKnode(con)
 11. base::unserialize(node$con)

It is really strange because the test does not fail when run as a normal R script, but it fails during the R check.

Maybe I should open an issue on the devtools repository, but I am telling you because this does not happen with the stable version of SharedObject (1.2.2).

Thank you, best regards

@Jiefei-Wang
Owner

Thanks for letting me know about the issue. Could you please provide more information so I can take a look at it as well? Does the issue persist when running check multiple times? What is your OS? A reproducible example would be even better.

Best,
Jiefei

@fedeago
Author

fedeago commented Jul 14, 2020

this is the output of sessionInfo:

R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=it_IT.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=it_IT.UTF-8        LC_COLLATE=it_IT.UTF-8    
 [5] LC_MONETARY=it_IT.UTF-8    LC_MESSAGES=it_IT.UTF-8   
 [7] LC_PAPER=it_IT.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils    
[7] datasets  methods   base     

other attached packages:
 [1] NewWave_0.1.0               SharedObject_1.3.9         
 [3] SingleCellExperiment_1.10.1 SummarizedExperiment_1.18.2
 [5] DelayedArray_0.14.0         matrixStats_0.56.0         
 [7] Biobase_2.48.0              GenomicRanges_1.40.0       
 [9] GenomeInfoDb_1.24.2         IRanges_2.22.2             
[11] S4Vectors_0.26.1            BiocGenerics_0.34.0        

loaded via a namespace (and not attached):
 [1] httr_1.4.1             pkgload_1.1.0         
 [3] BiocSingular_1.4.0     jsonlite_1.7.0        
 [5] assertthat_0.2.1       BiocManager_1.30.10   
 [7] RBGL_1.64.0            GenomeInfoDbData_1.2.3
 [9] remotes_2.1.1          sessioninfo_1.1.1     
[11] backports_1.1.8        lattice_0.20-41       
[13] glue_1.4.1             RUnit_0.4.32          
[15] digest_0.6.25          XVector_0.28.0        
[17] Matrix_1.2-18          XML_3.99-0.4          
[19] devtools_2.3.0         rcmdcheck_1.3.3       
[21] zlibbioc_1.34.0        purrr_0.3.4           
[23] processx_3.4.3         stringdist_0.9.5.5    
[25] getopt_1.20.3          optparse_1.6.6        
[27] BiocParallel_1.22.0    biocViews_1.56.1      
[29] usethis_1.6.1          ellipsis_0.3.1        
[31] withr_2.2.0            cli_2.0.2             
[33] magrittr_1.5           crayon_1.3.4          
[35] memoise_1.1.0          ps_1.3.3              
[37] fs_1.4.2               fansi_0.4.1           
[39] xml2_1.3.2             pkgbuild_1.1.0        
[41] graph_1.66.0           tools_4.0.2           
[43] prettyunits_1.1.1      stringr_1.4.0         
[45] xopen_1.0.0            irlba_2.3.3           
[47] callr_3.4.3            packrat_0.5.0         
[49] compiler_4.0.2         rsvd_1.0.3            
[51] rlang_0.4.7            grid_4.0.2            
[53] RCurl_1.98-1.2         rstudioapi_0.11       
[55] bitops_1.0-6           testthat_2.3.2        
[57] codetools_0.2-16       curl_4.3              
[59] roxygen2_7.1.1         R6_2.4.1              
[61] knitr_1.29             rprojroot_1.3-2       
[63] desc_1.2.0             stringi_1.4.6         
[65] Rcpp_1.0.5             BiocCheck_1.24.0      
[67] xfun_0.15

I think the only way you can reproduce the error is to clone my GitHub repository called NewWave, uncomment all the code in /tests/test_design.R, and then check the package.

Changing from version 1.2.2 to 1.3.9 makes other tests of the package fail in the same way (failing during check but not as a normal R script), but I solved those by adding

Sys.setenv("R_TESTS" = "")   

in the tests/testthat.R file, as suggested in r-lib/testthat#86,
but it did not solve the problem for one test file.
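For context, this workaround sits at the top of tests/testthat.R, which in a standard testthat setup looks roughly like this (a sketch assuming the usual testthat layout; the Sys.setenv line is the addition):

```r
# tests/testthat.R
# Clear R_TESTS so that child R processes spawned during R CMD check
# (e.g. by parallel::makeCluster) do not try to source a startup file
# that only exists in the parent check directory.
Sys.setenv("R_TESTS" = "")

library(testthat)
library(NewWave)

test_check("NewWave")
```

This matters here because the failing backtrace goes through parallel::clusterApply, i.e. the tests start worker R processes, which is exactly the situation the R_TESTS workaround addresses.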

If you think I should open an issue on the devtools package I will do it; I wrote to you because this was not happening before.

@fedeago
Author

fedeago commented Jul 21, 2020

Good morning,
using the devel version of SharedObject I encountered another problem: sometimes with small datasets, and always with big datasets, I receive this error message:

*** caught segfault ***
address 0x7f2ccc03b000, cause 'memory not mapped'

Traceback:
 1: ll_calc(mu = mu_sh, model = model, Y_sh = Y_sh, z = zeta_sh,     alpha = alpha_sh, beta = beta_sh, gamma = gamma_sh, W = W_sh,     commondispersion = commondispersion)
 2: optimization(Y = Y_sh, cluster = cl, children = children, model = m,     max_iter = maxiter_optimize, stop_epsilon = stop_epsilon,     commondispersion = commondispersion, n_gene_disp = n_gene_disp,     n_cell_par = n_cell_par, n_gene_par = n_gene_par, verbose = verbose,     mode = "matrix", cross_batch = cross_batch)
 3: .local(Y, ...)
 4: newFit(Y = dataY, X = X, V = V, K = K, commondispersion = commondispersion,     verbose = verbose, maxiter_optimize = maxiter_optimize, stop_epsilon = stop_epsilon,     children = children, random_init = random_init, random_start = random_start,     n_gene_disp = n_gene_disp, n_cell_par = n_cell_par, n_gene_par = n_gene_par,     cross_batch = cross_batch)
 5: newFit(Y = dataY, X = X, V = V, K = K, commondispersion = commondispersion,     verbose = verbose, maxiter_optimize = maxiter_optimize, stop_epsilon = stop_epsilon,     children = children, random_init = random_init, random_start = random_start,     n_gene_disp = n_gene_disp, n_cell_par = n_cell_par, n_gene_par = n_gene_par,     cross_batch = cross_batch)
 6: .local(Y, ...)
 7: newFit(dati, K = 10, X = "~batch", children = 10, n_gene_disp = 100,     n_gene_par = 100, n_cell_par = ncol(dati)/10)
 8: newFit(dati, K = 10, X = "~batch", children = 10, n_gene_disp = 100,     n_gene_par = 100, n_cell_par = ncol(dati)/10)
 9: system.time(res_newWave_commo_allminibatch[[i]] <- newFit(dati,     K = 10, X = "~batch", children = 10, n_gene_disp = 100, n_gene_par = 100,     n_cell_par = ncol(dati)/10))
An irrecoverable exception occurred. R is aborting now ...
Segmentation error (core dump created)

I can show you how to reproduce this error, but only using the package I am developing.
This error does not happen using SharedObject 1.2.2.

Best regards

@fedeago
Author

fedeago commented Jul 22, 2020

Good morning,
I found that this error occurs only when I do operations (like multiplications) with a shared matrix that has been modified in this way:

   shared_matrix[ ] <- < new matrix with same dimension >

Unfortunately, I am not able to reproduce that outside the package.
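For illustration, a minimal sketch of the modification pattern described above, assuming SharedObject's share() with copy-on-write disabled (the matrix sizes here are hypothetical):

```r
library(SharedObject)

# A shared matrix with copy-on-write disabled, as used inside the package
shared_matrix <- share(matrix(0, nrow = 100, ncol = 100), copyOnWrite = FALSE)

# Replace every element in place with a new matrix of the same dimensions;
# the object is expected to stay backed by the same shared-memory segment
shared_matrix[] <- matrix(rnorm(100 * 100), nrow = 100, ncol = 100)

# Subsequent arithmetic on the modified matrix is where the segfault appeared
res <- shared_matrix %*% t(shared_matrix)
```

With copyOnWrite = FALSE, the empty-bracket assignment writes directly into the shared memory rather than triggering a copy, which is why an error in releasing that memory would surface in later operations like this.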

@Jiefei-Wang
Owner

Thanks for letting me know about the issue; I appreciate your bug report! Last week was quite busy for me, so I do not have an answer to your previous question yet, but I will try to reproduce and check the problem this week and let you know the result.

Best,
Jiefei

@Jiefei-Wang
Owner

Jiefei-Wang commented Jul 27, 2020

Hi @fedeago, I have tried your package on Ubuntu. I used the commit ab456d41daf413f11cad319e0eedff17038fddbb that you made on July 13, but I am not able to reproduce the check error you have seen. Both R CMD check and devtools::check work fine on my side. This error is possibly very specific to your system; I suspect it is related to your shared memory size. Would you be able to run df --type=tmpfs in a terminal and show me your result?

For your second issue (shared_matrix[ ] <- < new matrix with same dimension >), is there any way to reproduce it? I am using your package, so it is fine to only reproduce it within your package.

Best,
Jiefei

@fedeago
Author

fedeago commented Jul 28, 2020

Hi, please download the package again and re-try the checks; I had wrongly commented out some parts of the tests.
Anyway, this is the result of df --type=tmpfs:

Filesystem     1K-blocks   Used Available Use% Mounted on
tmpfs             1622132   2124   1620008   1% /run
tmpfs             8110640 392868   7717772   5% /dev/shm
tmpfs                5120      4      5116   1% /run/lock
tmpfs             8110640      0   8110640   0% /sys/fs/cgroup
tmpfs             1622128     16   1622112   1% /run/user/121
tmpfs             1622128     40   1622088   1% /run/user/1027

The second issue is somehow related to only one specific dataset, the one I would use for benchmarking. It is really strange, so I will investigate and try to go further by giving you a reproducible example.

Thank you
Best regards

@Jiefei-Wang
Owner

Thanks @fedeago, I just want to let you know that I have located the problem. It is caused by incorrectly releasing the shared memory while it is still in use. It affects all shared matrices, so your second bug might be the same issue. I will fix it in the next couple of days and let you know.

Jiefei-Wang added a commit that referenced this issue Jul 28, 2020
@fedeago
Author

fedeago commented Jul 29, 2020

As you said, it solved both of my issues, thank you! I am very grateful for your work!

When do you think the package will be available on Bioconductor devel?

@Jiefei-Wang
Owner

Glad to hear it. I have pushed the changes to Bioconductor today, and it should be available in the next two days. Please feel free to contact me if you encounter any more problems. :)

I will close this issue after this comment. If you see any similar bugs, you can reopen it or make a new issue. Thanks again for letting me know your concern and giving me the chance to make the package better!
