Running carnival in parallel #61

Closed
ahmedasadik opened this issue May 11, 2021 · 7 comments

Comments

@ahmedasadik

Dear Carnival Developers,
First, thank you for the great tool, it really is amazing.

My issue is that I cannot get CARNIVAL to run in parallel. I made a toy example to reproduce the problem:

library(furrr)
plan(multisession, workers = 4)
library(CARNIVAL)

load(file = system.file("toy_inputs_ex1.RData",
                        package="CARNIVAL"))
load(file = system.file("toy_measurements_ex1.RData",
                        package="CARNIVAL"))
load(file = system.file("toy_network_ex1.RData",
                        package="CARNIVAL"))


input_lists <- rep(list(toy_inputs_ex1),times=20)
meas_lists <- rep(list(toy_measurements_ex1),times=20)
net_lists <- rep(list(toy_network_ex1),times=20)

result = future_map(input_lists, function(lst){
  runCARNIVAL(inputObj = lst, measObj = toy_measurements_ex1, threads =1,
              netObj = toy_network_ex1, solverPath = "/usr/bin/cbc", solver = "cbc")
})

This is the error that I get when running it.

Writing constraints...
Solving LP problem...
Error: 'results_cbc_1_1.txt' does not exist in current working directory ('/home/ahmed/Comp_Bio/Projects/Discovery_Pipeline').
In addition: Warning message:
UNRELIABLE VALUE: Future (‘<none>’) unexpectedly generated random numbers without specifying argument 'seed'. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".
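
The RNG warning is separate from the missing-file error. A minimal sketch of how a seed could be passed through furrr, assuming furrr >= 0.2.0 (where `furrr_options()` is available); the toy function and the seed value 123 are illustrative only and are not taken from CARNIVAL:

```r
library(furrr)  # also attaches the future package, which provides plan()

# Sequential plan so the sketch runs anywhere; swap in
# plan(multisession, workers = 4) for real parallel runs.
plan(sequential)

# Illustrative stand-in for a function that draws random numbers.
draw_one <- function(i) runif(1)

# Passing a seed via furrr_options() requests parallel-safe
# L'Ecuyer-CMRG streams, which is what the warning asks for.
draws <- future_map(1:3, draw_one, .options = furrr_options(seed = 123))
draws_again <- future_map(1:3, draw_one, .options = furrr_options(seed = 123))
```

With a fixed integer seed the two calls should return the same values; `seed = TRUE` would also silence the warning but would not be reproducible between calls.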

I looked at one of the development branches, and it seems this was partly addressed with parallelIdx1. However, it always has a fixed value of 1, and runCARNIVAL has no argument to change 'condition', so every worker passes the same arguments to the cbc solver and looks for the same 'results_cbc_1_1.txt' file, which is already in use by the first node.

Thanks,

@ivanovaos
Collaborator

Hi @ahmedasadik,
Thanks for writing to us. The next version of CARNIVAL (to be submitted this week) won't support multithreading by itself, but some solvers, e.g. cplex, support multithreading natively, and we would rely on that. How big is the problem that you want to solve with CARNIVAL?

@ahmedasadik
Author

I have many single-cell and bulk expression datasets that I need to use carnival for. So doing things in parallel is extremely important. Unfortunately, I don't have access to cplex and they refused an academic license because my institute is not a university.

@ivanovaos
Collaborator

With the implementation you are currently using, the easiest way to handle this is to send each sample to a separate cluster node (through .sh or snakemake scripts). Just be sure to set up a different working directory for each run, so the files won't be accidentally overwritten. We are working on a default pipeline for running CARNIVAL on many samples simultaneously, but it will be public only in a couple of months.
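
A minimal sketch of the "one working directory per run" idea in plain R. It assumes only what the error message shows: that the solver output (e.g. 'results_cbc_1_1.txt') is written to the current working directory. `run_in_own_dir` and `fake_solver_run` are hypothetical helpers written for this illustration; they are not part of CARNIVAL.

```r
# Hypothetical helper (not part of CARNIVAL): run `fn` inside its own
# working directory so files with fixed names do not collide between runs.
run_in_own_dir <- function(idx, fn, base_dir = tempdir()) {
  run_dir <- file.path(base_dir, sprintf("carnival_run_%03d", idx))
  dir.create(run_dir, recursive = TRUE, showWarnings = FALSE)
  old_wd <- setwd(run_dir)
  on.exit(setwd(old_wd), add = TRUE)  # restore the previous directory, even on error
  fn()
}

# Stand-in for a solver call: writes a fixed file name and reads it back,
# mimicking the way every run looks for 'results_cbc_1_1.txt'.
fake_solver_run <- function(value) {
  writeLines(as.character(value), "results_cbc_1_1.txt")
  as.integer(readLines("results_cbc_1_1.txt"))
}

# Each run gets its own directory, so the identical file names never clash.
results <- lapply(1:4, function(i) {
  run_in_own_dir(i, function() fake_solver_run(i))
})
```

In the toy example from the first post, `fake_solver_run(i)` would be replaced by the `runCARNIVAL(...)` call and `lapply` by a parallel map that also supplies an index, for instance `furrr::future_map2` over `seq_along(input_lists)` and `input_lists`. With `plan(multisession)` each worker is a separate R process, so `setwd()` in one worker should not affect the others.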

@ahmedasadik
Author

OK, but would it be possible to pass "threads" and "randomseed" options to CBC by modifying the carnivaloptions sent to the CBC command line? That would be much faster than it currently is, especially since I built my CBC solver with multithreading enabled.
Otherwise, I would appreciate it if you could tell me how to export the LP file sent to the solver, so that I can run it in parallel from bash.
I would appreciate your help very much.

@ivanovaos
Collaborator

If you can wait until the end of next week, we can indeed add this option for cbc. We are currently wrapping up the next Bioconductor release, so adding another solver option won't be an issue. Also, in the new release it will be easy to save and collect the LP files. Can you open a new issue suggesting the options for cbc? I will later add a branch to it, and you will get notified when it is done.

@ahmedasadik
Author

Thank you very much. I really appreciate it.

@gabora
Member

gabora commented Mar 10, 2022

see #62

@gabora closed this as completed Mar 10, 2022