
Adding parallel_voxel_fit decorator #1418

Closed
wants to merge 21 commits into master from parallel-voxel-fit

Conversation

Member

@skoudoro skoudoro commented Feb 6, 2018

This PR is an updated version of PRs #1135 and #642. I have simplified and refactored the code of @MrBago and @sahmed95.

You just need to add this decorator (@parallel_voxel_fit) to your model to split your data into small chunks and run them through several processes.

To set the number of processes, you just need to call: model.fit(data, nb_processes=6)
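
For illustration, a toy model using the decorator might look like this. This is only a sketch: the MeanModel/MeanFit names are hypothetical, and only the decorator and the nb_processes keyword come from this PR.

import numpy as np
from dipy.reconst.multi_voxel import parallel_voxel_fit  # location per this PR; may move


class MeanFit:
    def __init__(self, mean):
        self.mean = mean


class MeanModel:
    @parallel_voxel_fit
    def fit(self, single_voxel_data):
        # Receives one voxel's 1D signal; the decorator handles chunking
        # the volume and dispatching the chunks to worker processes.
        return MeanFit(single_voxel_data.mean())


if __name__ == "__main__":  # required on Windows, where children re-import this module
    model = MeanModel()
    data = np.random.rand(10, 10, 10, 64)  # (x, y, z, n_directions)
    fit = model.fit(data, nb_processes=6)  # voxels fitted across six processes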

Currently, all the code is in reconst/multi_voxel.py, but I suppose this is not the right place; any ideas are welcome.

Can you have a look @nilgoyette @MrBago @arokem @sahmed95 @Garyfallidis?

Thanks!

Note: for multiprocessing to be beneficial, the fitting task must be significant, meaning its cost must exceed the overhead multiprocessing adds through inter-process communication. If that is not the case, use the multi_voxel_fit decorator instead.

Note: on Linux, multiprocessing starts new child processes with fork(). On Windows, which does not support fork(), multiprocessing uses the win32 API call CreateProcess, which creates an entirely new process from a given executable and is quite slow.
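
For reference, the start method can be inspected and pinned explicitly; this is a sketch using only stock multiprocessing API, not something this PR adds.

import multiprocessing as mp

if __name__ == "__main__":  # the guard is mandatory under spawn
    print(mp.get_start_method())  # 'fork' on Linux, 'spawn' on Windows

    # A context pins the start method without touching global state
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=4) as pool:
        print(pool.map(abs, [-3, -2, -1]))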

Contributor

@nilgoyette nilgoyette left a comment

I didn't really check the code because I'm not too sure what it'll do, but here are some minor things.

return SillyFit(self, data)

def predict(self, S0):
    return np.ones(10) * S0
Contributor

Why not use numpy.full()? Or if S0 is an array, 10 * S0?
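
For comparison, a quick sketch of the two forms (np.full requires NumPy >= 1.8):

import numpy as np

S0 = 100.0
a = np.ones(10) * S0  # current form: allocate ones, then multiply
b = np.full(10, S0)   # suggested form: allocate already filled
assert np.array_equal(a, b)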

def test_parallel_voxel_fit():
    voxel_fit(SillyParallelModel, SillyFit)
    model = SillyParallelModel()
    # Test with a mask and few processors
Contributor

Do you mean 'processes'?


# Get non null index from mask
indexes = np.argwhere(mask)
# convert indexes to tuple
Contributor

You sometimes start with a

# Capitalized
# lowercase

Could you please keep it consistent? Here and elsewhere.


# Get number of processes
nb_processes = int(kwargs['nb_processes']) if 'nb_processes' in kwargs else cpu_count()
nb_processes = cpu_count() if nb_processes < 1 else nb_processes
Contributor

I think it's clearer this way.

nb_processes = int(kwargs.get('nb_processes', '0'))
nb_processes = nb_processes if nb_processes >= 1 else cpu_count()

if nb_processes == 1:
    return single_voxel_fit(model, data, *args, **kwargs)

# Get non null index from mask
Contributor

Not singular, indexes or indices.

"""
Each pool process calls this initializer.
Load the array to be populated into
that process's global namespace
Contributor

Why the \n here? The doc can also reach 79 characters.

numpy ndarray that you want to convert.
lock : boolean
controls the access to the shared array. When you shared
array is a read only access, you do not need lock. Otherwise,
Contributor

When your shared array has a ... ? need a lock?
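
For context, the lock flag maps directly onto the standard-library multiprocessing.Array API; a sketch, not code from this PR:

import ctypes
import multiprocessing as mp

n = 1000
# Read-only sharing: no lock needed, and access is faster
ro = mp.Array(ctypes.c_double, n, lock=False)

# Concurrent writers: keep the lock and hold it while writing
rw = mp.Array(ctypes.c_double, n, lock=True)
with rw.get_lock():
    rw[0] = 1.0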

@skoudoro
Member Author

skoudoro commented Feb 7, 2018

Thank you @nilgoyette for the review, I made the changes.

@skoudoro skoudoro closed this Apr 30, 2018
@skoudoro skoudoro deleted the parallel-voxel-fit branch April 30, 2018 17:07
@skoudoro skoudoro restored the parallel-voxel-fit branch April 30, 2018 17:07
@skoudoro skoudoro reopened this May 8, 2018
@pep8speaks

pep8speaks commented May 8, 2018

Hello @skoudoro, thank you for updating!

Line 23:81: E501 line too long (91 > 80 characters)

Line 99:15: E221 multiple spaces before operator
Line 104:81: E501 line too long (82 > 80 characters)
Line 128:1: W293 blank line contains whitespace

Comment last updated on August 21, 2018 at 16:52 UTC

@codecov-io

codecov-io commented May 8, 2018

Codecov Report

Merging #1418 into master will decrease coverage by 0.1%.
The diff coverage is 88.83%.


@@            Coverage Diff             @@
##           master    #1418      +/-   ##
==========================================
- Coverage   87.32%   87.22%   -0.11%     
==========================================
  Files         246      248       +2     
  Lines       31806    31936     +130     
  Branches     3450     3467      +17     
==========================================
+ Hits        27775    27855      +80     
- Misses       3210     3249      +39     
- Partials      821      832      +11
Impacted Files Coverage Δ
dipy/reconst/tests/test_mapmri.py 98.54% <ø> (-0.01%) ⬇️
dipy/core/parallel.py 100% <100%> (ø)
dipy/reconst/tests/test_cross_validation.py 100% <100%> (ø) ⬆️
dipy/reconst/tests/test_dsi_metrics.py 92% <100%> (+0.33%) ⬆️
dipy/reconst/tests/test_dsi.py 97.67% <100%> (+0.02%) ⬆️
dipy/tracking/life.py 97.8% <100%> (ø) ⬆️
dipy/reconst/forecast.py 91.7% <100%> (-0.52%) ⬇️
dipy/reconst/ivim.py 79.5% <100%> (ø) ⬆️
dipy/reconst/fwdti.py 83.01% <100%> (-11.33%) ⬇️
dipy/reconst/mapmri.py 90.02% <100%> (-0.26%) ⬇️
... and 16 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6e1da86...d7845b5.

@nilgoyette
Contributor

Oh, damn, I wasn't aware that you can't use np.full! I'll stop advising people to use it in DiPy.

@skoudoro
Member Author

skoudoro commented May 8, 2018

Or maybe we should update our minimal NumPy version. I wonder which NumPy version CentOS ships.

Contributor

@arokem arokem left a comment

Generally looks good. I had a couple of comments.

Have you had a chance to profile this? Does it really lead to any improvement in a realistic test-case?

Also - could you please rebase this on master? I think that at least some of the test failures should be resolved now.

global shared_arr


def shm_as_ndarray(mp_array, shape=None):
Contributor

I wonder about the naming here: there's nothing specific to spherical harmonics here, is there? Or does "shm" stand for something else here?

Member Author

Right, bad naming: "shm" here stands for shared multiprocessing array. I will change that.

return np.asarray(result)


def ndarray_to_shm(arr, lock=False):
Contributor

And same here

# shared_arr = np.ctypeslib.as_array(arr_to_populate)
# shared_arr = shared_arr.reshape(shape)
shared_arr = shm_as_ndarray(arr_to_populate, shape)

Contributor

I would love to have some unit tests for the functions above. If I understand correctly, right now they are tested only through the decorator.

Member Author

That's true, I will add them.

BTW, I still wonder whether these functions (shm_as_ndarray and ndarray_to_shm) are in the right place, because they could be used for other purposes like peaks_from_model.
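
A round-trip test for those helpers might look like this; a sketch assuming the names and signatures shown in the diff:

import numpy as np
from numpy.testing import assert_array_equal


def test_shared_array_round_trip():
    data = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
    # ndarray -> shared multiprocessing array -> ndarray
    shared = ndarray_to_shm(data, lock=False)
    result = shm_as_ndarray(shared, shape=data.shape)
    assert_array_equal(result, data)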

Contributor

Maybe in a new dipy.core.parallel module?

return a list of tuple(voxel index, model fitted instance)
"""
model, chunks = arguments
return [(idx, model.fit(shared_arr[idx]))
Contributor

Check list in case of old memory issue.

raise ValueError("mask and data shape do not match")

# Get number of processes
nb_processes = int(kwargs.pop('nb_processes', '0'))
Contributor

Maybe the default should be None, meaning all available CPUs.
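
That is, something along these lines; a sketch of the suggestion, with resolve_nb_processes as a hypothetical helper name:

from multiprocessing import cpu_count


def resolve_nb_processes(kwargs):
    # None (or any value < 1) means "use all available CPUs"
    nb_processes = kwargs.pop('nb_processes', None)
    if nb_processes is None or nb_processes < 1:
        nb_processes = cpu_count()
    return nb_processes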

for i in range(1, len(chunks_spacing))]

# Create shared array
shared_arr_in = ndarray_to_mparray(data)
Contributor

Add a note in the comment that you make a single copy of the data here (if you do make a copy).

copied shared array
"""

array1d = arr.ravel(order='A')
Contributor

This copy is not needed. Use arr directly.
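
For context: ravel only copies when the memory layout forces it, which np.shares_memory can confirm; a quick check, not code from this PR:

import numpy as np

arr = np.arange(12).reshape(3, 4)
flat = arr.ravel(order='A')
print(np.shares_memory(arr, flat))  # True: contiguous input, so ravel returns a view

strided = arr[:, ::2]
print(np.shares_memory(strided, strided.ravel()))  # False: non-contiguity forces a copy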

return a list of tuple(voxel index, model fitted instance)
"""
model, chunks = arguments
return [(idx, model.fit(shared_arr[idx]))
Contributor

Write it in a single line?

Member Author

I do not see the benefit. It will be harder to read and understand if I change that.

2614.87695312, 2316.55371094, 2267.7722168])

noisy_multi = np.zeros((2, 2, 1, len(gtab.bvals)))
noisy_multi[0, 1, 0] = noisy_multi[
Contributor

This line break is not very good for readability...

Member Author

I agree, I will change that

@skoudoro skoudoro removed this from the 1.0 milestone Jul 25, 2019
@skoudoro skoudoro linked an issue Jun 22, 2021 that may be closed by this pull request
@oesteban
Contributor

oesteban commented Dec 2, 2022

Note: on Linux, multiprocessing starts new child processes with fork(). On Windows, which does not support fork(), multiprocessing uses the win32 API call CreateProcess, which creates an entirely new process from a given executable and is quite slow.

I would expect an enormous overhead on either platform. Would it be worth trying to parallelize using threading? (That said, I have tried with threads externally, and it doesn't seem to yield any gains, which is surprising to say the least.)

EDIT: to add more information, it seems to me that some models (I have only tried DKI) may have a time floor: unless some operation that takes time regardless of the amount of data passed in is itself parallelized, I don't see any gains from passing more or less data.

@arokem
Contributor

arokem commented Dec 3, 2022

See also discussion in nipreps/eddymotion#101. Indeed, some models (DTI and DKI, in particular) will not benefit from more parallelization, because we're already using multi-threaded matrix operations at the numpy level. This kind of parallelization would be most beneficial for models that can't utilize that kind of parallelization, or at least can't do that easily.
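
For anyone profiling this: the BLAS-level threading mentioned above can be inspected and capped with threadpoolctl (a third-party package, not part of this PR); a sketch:

import numpy as np
from threadpoolctl import threadpool_info, threadpool_limits

print(threadpool_info())  # shows the BLAS backend and its current thread count

a = np.random.rand(2000, 2000)
with threadpool_limits(limits=1):  # pin BLAS to a single thread
    b = a @ a  # a fairer baseline when benchmarking process-level parallelism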

@arokem
Contributor

arokem commented Dec 3, 2022

Incidentally, is this PR superseded by #2539?

@oesteban
Contributor

oesteban commented Dec 4, 2022

I don't think that forking for every voxel will be of great benefit. An ITK-style approach (chunking long arrays of voxels into sections) may be more effective with subprocesses. I will check the project linked by Ariel and report back.

@skoudoro
Member Author

Incidentally, is this PR superseded by #2539?

Yes, closing in favor of #2539

@skoudoro skoudoro closed this Dec 13, 2022
Development

Successfully merging this pull request may close these issues.

Multprocessing the multivoxel fit
8 participants