The multi-threading issues on RandomForestClassifier #6023

Closed
@xchmiao

Description

Hi,

I'm using RandomForestClassifier to train a model on Ubuntu 14.04 with Python 2.7.11 through the Anaconda package. Below is the core code:

rf = RandomForestClassifier(n_jobs=-1, random_state=seed)
parameters = {'n_estimators': [2000],
              'criterion': ['entropy'],
              'max_depth': [10],
              'min_samples_leaf': [3],
              #'oob_score': [False, True],
              'max_features': ['auto']}

print "Start parameter grid search..."
start = time()
clf = GridSearchCV(rf, parameters, n_jobs=4, scoring='roc_auc',
                   cv=StratifiedKFold(y_train, n_folds=4, shuffle=True, random_state=128),
                   verbose=2, refit=True)

clf.fit(x_train, y_train)
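For anyone trying to reproduce this, here is a minimal, self-contained sketch of the same setup (Python 3 syntax, synthetic data from make_classification, and far fewer trees so it finishes in seconds rather than hours; note that in current scikit-learn, StratifiedKFold lives in sklearn.model_selection and takes n_splits instead of y/n_folds, and max_features='auto' is no longer accepted, so it is omitted here). Leaving the estimator's own n_jobs unset also avoids nesting the forest's parallelism inside GridSearchCV's:

```python
# Minimal sketch of the reported setup, scaled down to run quickly.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Small synthetic binary-classification problem standing in for the real data.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# n_jobs is deliberately left at its default so only GridSearchCV parallelizes;
# setting n_jobs=-1 here AND n_jobs=4 on the search oversubscribes the cores.
rf = RandomForestClassifier(random_state=0)
parameters = {'n_estimators': [50],       # original report used 2000
              'criterion': ['entropy'],
              'max_depth': [10],
              'min_samples_leaf': [3]}

cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=128)
clf = GridSearchCV(rf, parameters, n_jobs=2, scoring='roc_auc',
                   cv=cv, verbose=2, refit=True)
clf.fit(X, y)
print(clf.best_score_)
```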

I turned on a CPU monitor to watch CPU status on a quad-core system. In the beginning, all CPUs are at ~99% usage. However, after ~1 hr the CPU usage drops to ~0.3%, which does not seem normal.

Below is the output from the terminal:


Fitting 4 folds for each of 1 candidates, totalling 4 fits
[CV] max_features=auto, n_estimators=2000, criterion=entropy, max_depth=10, min_samples_leaf=3
[CV] max_features=auto, n_estimators=2000, criterion=entropy, max_depth=10, min_samples_leaf=3
[CV] max_features=auto, n_estimators=2000, criterion=entropy, max_depth=10, min_samples_leaf=3
[CV] max_features=auto, n_estimators=2000, criterion=entropy, max_depth=10, min_samples_leaf=3
[CV] max_features=auto, n_estimators=2000, criterion=entropy, max_depth=10, min_samples_leaf=3 - 68.4min
[Parallel(n_jobs=4)]: Done 1 jobs | elapsed: 68.4min
[CV] max_features=auto, n_estimators=2000, criterion=entropy, max_depth=10, min_samples_leaf=3 - 68.8min


Below is the status of CPU usage:


top - 01:40:46 up 4:40, 0 users, load average: 0.00, 0.00, 0.95
Tasks: 66 total, 1 running, 65 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.3 us, 0.0 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 0.3 us, 0.3 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 6553600 total, 4579708 used, 1973892 free, 0 buffers
KiB Swap: 6553600 total, 5511600 used, 1042000 free. 36652 cached Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
994 root 20 0 23116 60 60 S 0.0 0.0 0:00.00 ptyserved
997 root 20 0 39372 0 0 S 0.0 0.0 0:00.01 nginx
1000 root 20 0 39876 924 536 S 0.0 0.0 0:01.75 nginx
1002 root 20 0 12736 0 0 S 0.0 0.0 0:00.00 getty
1004 root 20 0 12736 0 0 S 0.0 0.0 0:00.00 getty
1435 root 20 0 18144 24 24 S 0.0 0.0 0:00.00 bash
1455 root 20 0 59568 0 0 S 0.0 0.0 0:00.00 su
1456 root 20 0 18140 0 0 S 0.0 0.0 0:00.00 bash
1467 root 20 0 21916 800 504 R 0.0 0.0 0:12.72 top
1888 root 20 0 61316 4 4 S 0.0 0.0 0:00.00 sshd
1950 postfix 20 0 27408 272 184 S 0.0 0.0 0:00.05 qmgr
2062 root 20 0 59568 92 92 S 0.0 0.0 0:00.00 su
2063 root 20 0 18144 60 60 S 0.0 0.0 0:00.00 bash
2074 root 20 0 5861480 3260 888 S 0.0 0.0 1:26.11 python***
2087 root 20 0 8268200 2.020g 216 S 0.0 32.3 66:39.13 python***
2090 root 20 0 8268200 2.034g 1676 S 0.0 32.5 66:38.78 python***
2184 postfix 20 0 27356 524 240 S 0.0 0.0 0:00.00 pickup
2210 root 20 0 5861480 2836 104 S 0.0 0.0 0:00.00 python ****
2225 root 20 0 5861480 4040 820 S 0.0 0.1 0:00.00 python ****


Although the training data is only about 207 MB with 300 features, the drop in CPU usage doesn't seem normal.

Do you know what is going on?

Thank you very much!
