Try "processes" scheduler in gallery example
Running plot_haar_extraction_selection_classification.py directly fails
with the error attached below. However, executing the file via
sphinx-gallery, or running its content in an IPython console, seems to
work. So test it in the CI.

RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

        To fix this issue, refer to the "Safe importing of main module"
        section in https://docs.python.org/3/library/multiprocessing.html
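
For reference, the idiom that the error message points to can be sketched as follows. This is a minimal, standard-library-only illustration, not the gallery example itself: `scale` and `main` are hypothetical names, and the real script would call `extract_feature_image` instead.

```python
from functools import partial
from multiprocessing import Pool


def scale(value, factor=1):
    """Hypothetical stand-in for an expensive per-image computation."""
    return value * factor


def main():
    # Bind the keyword argument once so that Pool.map only has to
    # supply the varying positional argument.
    worker = partial(scale, factor=3)
    with Pool(processes=2) as pool:
        return pool.map(worker, range(5))


# Without this guard, a start method that re-imports the main module
# (such as "spawn") would run the pool creation again in each child
# process and raise the RuntimeError quoted above.
if __name__ == "__main__":
    print(main())
```

Keeping the pool creation inside `main()` and calling it only under the guard means that importing the module has no side effects, which is what the "Safe importing of main module" section asks for.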
lagru committed Sep 18, 2023
1 parent d774eeb commit 0079f69
Showing 1 changed file with 16 additions and 14 deletions.
@@ -25,9 +25,8 @@
 """
 from time import time
-from functools import partial
-from multiprocessing.pool import Pool

+import dask
 import numpy as np
 import matplotlib.pyplot as plt

@@ -66,11 +65,13 @@ def extract_feature_image(img, feature_type, feature_coord=None):
 # To speed up the example, extract the two types of features only
 feature_types = ['type-2-x', 'type-2-y']

-# Use multiprocessing pool to compute tasks in parallel
-worker = partial(extract_feature_image, feature_type=feature_types)
+# Build a computation graph using Dask. This allows the use of multiple
+# CPU cores later during the actual computation
+X = [dask.delayed(extract_feature_image)(img, feature_types) for img in images]
+X = dask.delayed(np.stack)(X)
+# Compute the result
 t_start = time()
-X = [x for x in Pool().map(worker, images)]
-X = np.stack(X)
+X = X.compute(scheduler="processes")
 time_full_feature_comp = time() - t_start

 # Label images (100 faces and 100 non-faces)
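
The removed lines bind `feature_type` by keyword through `functools.partial`, while the added lines pass the same value positionally. A minimal sketch suggests that both call styles hand the function identical arguments. The `extract_feature_image` below is a hypothetical stand-in that only mirrors the signature shown in the hunk header, and the image values are placeholders.

```python
from functools import partial


def extract_feature_image(img, feature_type, feature_coord=None):
    """Hypothetical stand-in that only records the arguments it receives."""
    return (img, feature_type, feature_coord)


feature_types = ['type-2-x', 'type-2-y']
images = ['img0', 'img1']  # placeholder values, not real image arrays

# Old style: bind the keyword up front, then call once per image.
worker = partial(extract_feature_image, feature_type=feature_types)
old_calls = [worker(img) for img in images]

# New style: pass the same value positionally for each image.
new_calls = [extract_feature_image(img, feature_types) for img in images]

print(old_calls == new_calls)
```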
@@ -137,15 +138,16 @@ def extract_feature_image(img, feature_type, feature_coord=None):
 # but we would like to emphasize the usage of `feature_coord` and `feature_type`
 # to recompute a subset of desired features.

-# Use multiprocessing pool to compute tasks in parallel
-worker = partial(
-    extract_feature_image,
-    feature_type=feature_type_sel,
-    feature_coord=feature_coord_sel
-)
+# Build a computation graph using Dask. This allows the use of multiple
+# CPU cores later during the actual computation
+X = [
+    dask.delayed(extract_feature_image)(img, feature_type_sel, feature_coord_sel)
+    for img in images
+]
+X = dask.delayed(np.stack)(X)
+# Compute the result
 t_start = time()
-X = [x for x in Pool().map(worker, images)]
-X = np.stack(X)
+X = X.compute(scheduler="processes")
 time_subs_feature_comp = time() - t_start

 y = np.array([1] * 100 + [0] * 100)
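
The pattern introduced by this diff (build a lazy graph with `dask.delayed`, then pick a scheduler at `.compute()` time) can be tried in isolation with a sketch like the one below. It assumes Dask is installed. `square` and `build_graph` are hypothetical names that stand in for `extract_feature_image` and the list-plus-`np.stack` construction, and the built-in `sum` replaces `np.stack` to keep the sketch free of NumPy.

```python
import dask


def square(x):
    """Hypothetical stand-in for the per-image feature extraction."""
    return x * x


def build_graph(values):
    # Wrapping the calls in dask.delayed records them lazily; nothing
    # is computed until .compute() is called on the final node.
    parts = [dask.delayed(square)(v) for v in values]
    return dask.delayed(sum)(parts)


if __name__ == "__main__":
    graph = build_graph(range(4))
    # "synchronous" runs every task in the calling thread, which is
    # convenient for debugging; "processes" hands tasks to worker
    # processes, as in the diff above.
    print(graph.compute(scheduler="synchronous"))
    print(graph.compute(scheduler="processes"))
```

The `__main__` guard is kept here on purpose: the "processes" scheduler starts child processes, so the same bootstrapping error described in the commit message could plausibly appear if the computation ran at import time.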