Parallel backend is broken for doParallel >1.0.6 #7

mintyplanet · 2014-03-03T20:49:51Z

With doParallel versions later than 1.0.6, NMF fails to load doParallel as its parallel backend.

> library(NMF)
> data(esGolub)
> nmf(esGolub, 3, nrun=4, .opt="vP")
NMF algorithm: 'brunet'
Multiple runs: 4
Error: Foreach computation aborted: object 'info' not found

The error message refers to this line:
https://github.com/renozao/NMF/blob/master/R/parallel.R#L316

object$info <- doParallel:::info

The internal variable doParallel:::info was removed in version 1.0.7.
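A quick way to confirm this on a given installation is to test whether the name still exists in the package namespace. The sketch below is only an illustration of that check, not the fix that was applied to NMF; `has_internal` is a hypothetical helper name.

```r
# Hypothetical helper: TRUE if `name` is defined in the namespace of `pkg`
# (exported or internal), FALSE if the package is missing or the name is gone.
has_internal <- function(pkg, name) {
  if (!requireNamespace(pkg, quietly = TRUE)) return(FALSE)
  exists(name, envir = asNamespace(pkg), inherits = FALSE)
}

# Check for the internal object referenced in R/parallel.R
if (has_internal("doParallel", "info")) {
  message("doParallel:::info is available")
} else {
  message("doParallel:::info is not available in this installation")
}
```

Guarding the access this way avoids the "object 'info' not found" error at the point of use, at the cost of needing a fallback for newer doParallel versions.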

renozao · 2014-03-04T08:23:46Z

Thanks for reporting this.
Will look into it asap.

brdhungana · 2014-12-05T20:36:18Z

Recently, I encountered the same error while running NMF on a Linux platform, but not on a Windows platform. As you indicated above, the problem has been fixed. Could you please give an explicit example?

Here is my code:

system.time(NMFFit <- nmf(Train2, rank=4, method="ns", theta=0.7, seed = 123456, nrun=8, .opt = "vP8"));
NMF algorithm: 'nsNMF'
Multiple runs: 8
Error: Parallel computation aborted: object 'info' not found
Timing stopped at: 0.26 0.037 0.295

Here is my system information:
$platform: "x86_64-unknown-linux-gnu"
$version.string: "R version 3.0.0 (2013-04-03)"

other attached packages:
[1] doParallel_1.0.8 iterators_1.0.7 foreach_1.4.2
[4] NMF_0.17 bigmemory_4.4.6 BH_1.54.0-4
[7] bigmemory.sri_0.1.3 digest_0.6.4 rngtools_1.2.4
[10] pkgmaker_0.22 registry_0.2

I would appreciate your help in fixing the parallel run issue on Linux. Thanks.

Regards,
BRD

renozao · 2014-12-06T05:37:48Z

Have you tried using the latest version on CRAN (0.20.5)?
On my Ubuntu box:

library(NMF)
x <- rmatrix(100, 20)
res <- nmf(x, rank=4, method="ns", theta=0.7, seed = 123456, nrun=8, .opt = "vP4")

Results:

> library(NMF)
Loading required package: pkgmaker
Loading required package: registry
Loading required package: rngtools
Loading required package: cluster
NMF - BioConductor layer [OK] | Shared memory capabilities [OK] | Cores 3/4
> x <- rmatrix(100, 20)
> res <- nmf(x, rank=4, method="ns", theta=0.7, seed = 123456, nrun=8, .opt = "vP4")
NMF algorithm: 'nsNMF'
Multiple runs: 8
Mode: parallel (4/4 core(s))
Runs: |==================================================| 100%
System time:
   user  system elapsed 
  8.993   0.291   3.648 
> res
<Object of class: NMFfitX1 >
  Method: nsNMF 
  Runs:  8 
  RNG:
   407L, -473780611L, -197192934L, -577462829L, -1713825544L, 1377146521L, 1787321734L 
  Total timing:
   user  system elapsed 
  8.993   0.291   3.648
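Before retrying, it can help to confirm which package versions are actually installed in the library that R is using. This is a minimal sketch; `version_at_least` is a hypothetical helper, and the 0.20.5 threshold is simply the CRAN version mentioned above.

```r
# Hypothetical helper: TRUE/FALSE if `pkg` is installed and its version is
# at least `minimum`; NA if the package is not installed.
version_at_least <- function(pkg, minimum) {
  if (!nzchar(system.file(package = pkg))) return(NA)
  utils::packageVersion(pkg) >= minimum
}

version_at_least("NMF", "0.20.5")
version_at_least("doParallel", "1.0.7")
```

A result of FALSE or NA for NMF suggests updating from CRAN before investigating further.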

brdhungana · 2014-12-19T02:47:54Z

I was finally able to run the model with the recent NMF version 0.20.5 using multiple cores on a Linux system, without the earlier error. Thank you for responding.

brdhungana · 2015-01-16T01:48:56Z

My model ran successfully with nrun=4 but stopped just before finishing with 50 runs on the same data set:

Here is the successful run case:

library(NMF);
system.time(ckmNMF4 <- nmf(Train.ckm_2, rank=4, method="ns", theta=0.7, seed = 123456, nrun=4, .opt = "vp4"));
NMF algorithm: 'nsNMF'
Multiple runs: 4
Mode: parallel (4/16 core(s))
Runs: |==================================================| 100%
System time:
user system elapsed
32823.259 621.439 15113.068
user system elapsed
32834.302 622.468 15125.747

Failed run: My two attempts yielded the following messages:
system.time(ckmNMF4 <- nmf(Train.ckm_2, rank=4, method="ns", theta=0.7, seed = 123456, nrun=100, .opt = "vp16"));
NMF algorithm: 'nsNMF'
Multiple runs: 100
Mode: parallel (16/16 core(s))
Runs: |==================================================| 100%
ERROR
Error: NMF::nmf - Unexpected error: no partial result seem to have been saved.
Timing stopped at: 788331.4 17865.29 94329.14
Timing stopped at: 788344.5 17866.6 94348.33

How do I debug the cause of this failure? Note that I ran this with 45 million rows and 15 variables. It seems to me that memory was not an issue. I would appreciate your suggestions.
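For reference, the size of the data can be estimated with simple arithmetic. The sketch below assumes dense double-precision storage (8 bytes per value); `dense_gib` is a hypothetical helper, and how many copies exist at once during a parallel run is not established in this thread.

```r
# Rough size of a dense double-precision matrix, in GiB.
# Assumption: 8 bytes per value, no sparse or shared-memory storage.
dense_gib <- function(rows, cols, bytes_per_value = 8) {
  rows * cols * bytes_per_value / 1024^3
}

input_gib <- dense_gib(45e6, 15)  # the data set described above
basis_gib <- dense_gib(45e6, 4)   # a 45e6 x 4 basis matrix for rank = 4

round(c(input = input_gib, basis = basis_gib), 2)
```

Under these assumptions the input is roughly 5 GiB and each fitted basis matrix a little over 1 GiB, so the total depends heavily on how many workers and results are held in memory simultaneously.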

Thanks,
Basanta

brdhungana · 2015-01-16T19:09:45Z

The debug option does not produce information that helps identify the cause of the failure:

nmf.options(debug = TRUE)
system.time(ckmNMF4R <- nmf(Train.ckm_2, rank=4, method="ns", theta=0.7, seed = 123456, nrun=16, .opt = "vp16"));

NMF call: .local(x = x, rank = rank, method = method, seed = 123456, nrun = 16,
  .options = "vp16", theta = 0.7)
NMF algorithm: 'nsNMF'
Multiple runs: 16

OPTIONS:

verbose: TRUE | parallel: 16 | garbage.collect: 50 | RNGstream: TRUE

Setting up requested foreach environment: try-parallel [par]

Check available cores ... [16]

Check requested cores ... [16]

Loading backend for specification par ... OK

Check host compatibility ... OK

Registering backend doParallel ... OK

Check allocated cores ... OK [16/16]

Setting up RNG ...

** Original RNG settings:

RNG kind: Mersenne-Twister / Inversion

RNG state: 403L, 7L, ..., -1289165921L [4de1642ab154e963c6ea7ef488e195d8]

Generate RNGStream sequence using seed (403L, 624L, ..., 449848215L [ed7ba52c9c2666ca159b185949fd9d73]) ... OK

Using foreach backend: doParallelMC [version 1.0.8]

Mode: parallel (16/16 core(s))

Check shared memory capability ... NO [Package bigmemory required]

Setup temporary directory: '/home/XXXXXXX/XXXXXXXX/XXXXXX/NMF_1d015bc581a' ... OK

Running on 1 host(s): 'cma4-corp.XXXX.XX.XXX'

Using shared memory ... FALSE

Setting up libpath on workers for package(s) 'NMF' ... OK

libPaths:

/home/XXXXXXX/XXXXXXXX/R/x86_64-redhat-linux-gnu-library/3.1
/usr/lib64/R/library
/usr/share/R/library
numValues: 16, numResults: 0, stopped: TRUE
got results for task 1
numValues: 16, numResults: 1, stopped: TRUE
returning status FALSE
got results for task 2
numValues: 16, numResults: 2, stopped: TRUE
returning status FALSE
got results for task 3
numValues: 16, numResults: 3, stopped: TRUE
returning status FALSE
got results for task 4
numValues: 16, numResults: 4, stopped: TRUE
returning status FALSE
got results for task 5
numValues: 16, numResults: 5, stopped: TRUE
returning status FALSE
got results for task 6
numValues: 16, numResults: 6, stopped: TRUE
returning status FALSE
got results for task 7
numValues: 16, numResults: 7, stopped: TRUE
returning status FALSE
got results for task 8
numValues: 16, numResults: 8, stopped: TRUE
returning status FALSE
got results for task 9
numValues: 16, numResults: 9, stopped: TRUE
returning status FALSE
got results for task 10
numValues: 16, numResults: 10, stopped: TRUE
returning status FALSE
got results for task 11
numValues: 16, numResults: 11, stopped: TRUE
returning status FALSE
got results for task 12
numValues: 16, numResults: 12, stopped: TRUE
returning status FALSE
got results for task 13
numValues: 16, numResults: 13, stopped: TRUE
returning status FALSE
got results for task 14
numValues: 16, numResults: 14, stopped: TRUE
returning status FALSE
got results for task 15
numValues: 16, numResults: 15, stopped: TRUE
returning status FALSE

Processing partial results ... ERROR

Error: NMF::nmf - Unexpected error: no partial result seem to have been saved.
Timing stopped at: 2692.744 345.294 528.506

NMF computation exit status ... ERROR

Running rollback clean up ...

Restoring RNG settings ...

RNG kind: Mersenne-Twister / Inversion

RNG state: 403L, 7L, ..., -1289165921L [4de1642ab154e963c6ea7ef488e195d8]

OK

Restoring NMF options ... OK

Restoring previous foreach backend '' ... OK

Deleting temporary directory '/XXXX/XXXXX/XXXXX/XXXXX/NMF_1d015bc581a' ... OK

Timing stopped at: 2698.415 345.833 549.012
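In the log above, results are reported for tasks 1 through 15 out of 16 before the error. For longer logs, the missing task numbers can be extracted mechanically. This is a sketch that assumes the log lines have exactly the "numValues: N" and "got results for task K" format shown above; `missing_tasks` is a hypothetical helper.

```r
# Hypothetical helper: given foreach debug log lines, return the task
# numbers that never reported a result.
missing_tasks <- function(log_lines) {
  # Total number of tasks, taken from the first "numValues: N" entry
  nv <- regmatches(log_lines,
                   regexpr("(?<=numValues: )[0-9]+", log_lines, perl = TRUE))
  if (length(nv) == 0) stop("no 'numValues' entry found in the log")
  total <- as.integer(nv[1])
  # Task ids taken from the "got results for task K" lines
  got <- regmatches(log_lines,
                    regexpr("(?<=got results for task )[0-9]+", log_lines, perl = TRUE))
  setdiff(seq_len(total), as.integer(got))
}

# Example mirroring the log above: 16 tasks, results for tasks 1 to 15
log_lines <- c(
  "numValues: 16, numResults: 0, stopped: TRUE",
  sprintf("got results for task %d", 1:15)
)
missing_tasks(log_lines)
```

For a real run, the input could be read with `readLines()` from a file where the console output was saved.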

brdhungana · 2015-01-20T20:36:17Z

I recently ran the same model with nrun=50 using 4 cores in batch mode, and it completed successfully. I am now experimenting with 16 cores and will update you as soon as I get results. At this point, I do not think there is a bug in the NMF source code.

Here are a few lines from the log of the run with nrun=50:
numValues: 50, numResults: 50, stopped: TRUE
calling combine function
evaluating call object to combine results:
fun(accum, result.1, result.2, result.3, result.4, result.5,
result.6, result.7, result.8, result.9, result.10, result.11,
result.12, result.13, result.14, result.15, result.16, result.17,
result.18, result.19, result.20, result.21, result.22, result.23,
result.24, result.25, result.26, result.27, result.28, result.29,
result.30, result.31, result.32, result.33, result.34, result.35,
result.36, result.37, result.38, result.39, result.40, result.41,
result.42, result.43, result.44, result.45, result.46, result.47,
result.48, result.49, result.50)
returning status TRUE

Processing partial results ... OK

NMF computation exit status ... OK

Running normal exit clean up ...

Restoring NMF options ... OK

Restoring previous foreach backend '' ... OK

Updating RNG settings ... OK

RNG kind: Mersenne-Twister / Inversion

RNG state: 403L, 1L, ..., 425501564L [c7f400f3798e6384ca89b63934b32173]

Deleting temporary directory '/home/XXXXXX/XXXXXX/XXXXX/NMF_53456e0662dc' ... OK

      user     system    elapsed
446964.134   8829.809 156572.488

Thanks,
BRD

brdhungana · 2015-01-22T20:14:21Z

I ran a model with nrun=100, but it failed to complete the compilation of results after finishing all the runs. Please give me your email address so that I can send you my debug log for analysis. I tried twice with different algorithms, and both attempts ended with the same error.

renozao pushed a commit that referenced this issue Mar 4, 2014

Fixes issue #7: adapt doParallel backend registration to changes in the doParallel package (29ff43b)

renozao closed this as completed Mar 20, 2014

brdhungana mentioned this issue Jun 23, 2015

nmfEstimateRank:: Error in (function (...) : All the runs produced an error #40

Open

kgeyer mentioned this issue Mar 10, 2022

Error: operator is invalid for atomic vectors #169

Open

WangJingwen21 mentioned this issue Sep 11, 2023

Met an error #176

Open
