'task failed to un-serialize' when running cross_val_score #12891

@xubury

Description

I am a Python newbie and am currently taking a course on ANNs. I followed the course but got stuck at the part where I need to run the cross_val_score function.
It raises the error 'BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.' It works fine with n_jobs = 1, but that takes too much time.
So I looked around the repository and found a similar issue, but its solution doesn't work for me.
I tried:
1. importing the function from another file
2. upgrading scikit-learn to 0.20.2
3. downgrading to Python 3.5 (on Python 3.5 the error doesn't appear, but the process hangs)
None of the above works.
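
Since the error message asks that all arguments be picklable, one quick check is to round-trip each argument through pickle in the parent process. This is only a sketch: `can_round_trip` is a hypothetical helper, not part of scikit-learn or joblib, and it cannot catch failures that happen only when a worker process imports a module while unpickling.

```python
import pickle


def can_round_trip(obj):
    """Hypothetical helper: True if obj survives pickle.dumps/pickle.loads."""
    try:
        pickle.loads(pickle.dumps(obj))
    except Exception:
        return False
    return True


# Plain parameters like those passed to the wrapper pickle without trouble.
params_ok = can_round_trip({"batch_size": 10, "epochs": 100})

# A lambda cannot be pickled by the standard pickle module.
lambda_ok = can_round_trip(lambda: "model")

print(params_ok, lambda_ok)
```

If every argument passes this check and the error persists, the failure is more likely to come from the worker side than from the arguments themselves.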

Steps/Code to Reproduce

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv('Churn_Modelling.csv')
X = dataset.iloc[:, 3:13].values
y = dataset.iloc[:, 13].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])
onehotencoder = OneHotEncoder(categorical_features = [1])
X = onehotencoder.fit_transform(X).toarray()
X = X[:,1:]

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from keras.models import Sequential
from keras.layers import Dense

# build_fn is a plain function called without positional arguments, so it takes no `self`
def build_classifier():
    classifier = Sequential()
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 11))
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
    classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier
classifier = KerasClassifier(build_fn = build_classifier, batch_size = 10, epochs = 100)

accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

Expected Results

Get the training result.

Actual Results

exception calling callback for <Future at 0x289f6f3e048 state=finished raised BrokenProcessPool>
sklearn.externals.joblib.externals.loky.process_executor.RemoteTraceback:
'''
Traceback (most recent call last):
File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 391, in process_worker
call_item = call_queue.get(block=True, timeout=timeout)
File "D:\Anaconda\lib\multiprocessing\queues.py", line 113, in get
return ForkingPickler.loads(res)
File "D:\Anaconda\lib\site-packages\keras\__init__.py", line 3, in <module>
from . import utils
File "D:\Anaconda\lib\site-packages\keras\utils\__init__.py", line 6, in <module>
from . import conv_utils
File "D:\Anaconda\lib\site-packages\keras\utils\conv_utils.py", line 9, in <module>
from .. import backend as K
File "D:\Anaconda\lib\site-packages\keras\backend\__init__.py", line 88, in <module>
sys.stderr.write('Using TensorFlow backend.\n')
AttributeError: 'NoneType' object has no attribute 'write'
'''

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\externals\loky\_base.py", line 625, in _invoke_callbacks
callback(self)
File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 309, in __call__
self.parallel.dispatch_next()
File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 731, in dispatch_next
if not self.dispatch_one_batch(self._original_iterator):
File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 759, in dispatch_one_batch
self._dispatch(tasks)
File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 716, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 510, in apply_async
future = self._workers.submit(SafeFunction(func))
File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\externals\loky\reusable_executor.py", line 151, in submit
fn, *args, **kwargs)
File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 1022, in submit
raise self._flags.broken
sklearn.externals.joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
Traceback (most recent call last):

File "", line 3, in
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

File "D:\Anaconda\lib\site-packages\sklearn\model_selection\_validation.py", line 402, in cross_val_score
error_score=error_score)

File "D:\Anaconda\lib\site-packages\sklearn\model_selection\_validation.py", line 240, in cross_validate
for train, test in cv.split(X, y, groups))

File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 930, in __call__
self.retrieve()

File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 833, in retrieve
self._output.extend(job.get(timeout=self.timeout))

File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 521, in wrap_future_result
return future.result(timeout=timeout)

File "D:\Anaconda\lib\concurrent\futures\_base.py", line 432, in result
return self.__get_result()

File "D:\Anaconda\lib\concurrent\futures\_base.py", line 384, in __get_result
raise self._exception

File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\externals\loky\_base.py", line 625, in _invoke_callbacks
callback(self)

File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 309, in __call__
self.parallel.dispatch_next()

File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 731, in dispatch_next
if not self.dispatch_one_batch(self._original_iterator):

File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 759, in dispatch_one_batch
self._dispatch(tasks)

File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 716, in _dispatch
job = self._backend.apply_async(batch, callback=cb)

File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 510, in apply_async
future = self._workers.submit(SafeFunction(func))

File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\externals\loky\reusable_executor.py", line 151, in submit
fn, *args, **kwargs)

File "D:\Anaconda\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 1022, in submit
raise self._flags.broken

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
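
The innermost frame of the remote traceback fails on `sys.stderr.write(...)` with `'NoneType' object has no attribute 'write'`, and the executable listed under Versions is `pythonw.exe`, under which Python's standard streams can be `None`. A minimal sketch of that failure mode, assuming the worker process has no stderr; `announce_backend` is a hypothetical stand-in for the import-time message, not actual Keras code:

```python
def announce_backend(stream):
    # Hypothetical stand-in for the import-time message seen in the traceback.
    stream.write('Using TensorFlow backend.\n')


# Simulate a process whose sys.stderr is None
# (assumption: this is what the worker sees under pythonw.exe).
try:
    announce_backend(None)
    failure = None
except AttributeError as exc:
    failure = str(exc)

print(failure)
```

If that assumption holds, the import of Keras in each worker would fail before any task runs, which would surface as a task that cannot be un-serialized. Running the script with `python.exe` from a console might be worth trying; this has not been verified here.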

Versions

System:
python: 3.7.0 (default, Jun 28 2018, 08:04:48) [MSC v.1912 64 bit (AMD64)]
executable: D:\Anaconda\pythonw.exe
machine: Windows-10-10.0.17134-SP0

BLAS:
macros:
lib_dirs:
cblas_libs: cblas

Python deps:
pip: 18.1
setuptools: 39.1.0
sklearn: 0.20.2
numpy: 1.15.1
scipy: 1.1.0
Cython: 0.28.5
pandas: 0.23.4
