The 'predict' method of RandomForestRegressor runs too slowly when data is "Big" #4935

Closed
@jdk8

Please excuse my poor English.

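My code looks like this:
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler
clf = RandomForestRegressor(random_state=0, n_estimators=100, n_jobs=4)
traindataX = StandardScaler().fit_transform(traindataX)
clf = clf.fit(traindataX, traindatay)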

traindataX has 16 features and 889054 samples. Each feature's value lies between 0 and 1 before StandardScaler().fit_transform is applied.
My machine has 4 GB of RAM and an Intel Core i3 2.53 GHz CPU, running 64-bit Windows 7.

After about 1 hour of training, I need to predict on test samples. However, predicting a single sample takes about 4 minutes and the memory is full, so the prediction time is far too long for me.
Is there any way to optimize the prediction speed of RandomForestRegressor?

Thanks!
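
The setup described above can be reproduced at a smaller scale to measure where the cost comes from. The following is only a sketch under stated assumptions: the data is synthetic (random values in 0~1), the sample count and n_estimators are reduced so it runs quickly, and the helper name fit_and_measure and the value min_samples_leaf=50 are illustrative choices, not something taken from the report. It counts the total number of tree nodes, which gives a rough idea of how much memory the fitted forest needs, and times a single-sample predict call for a fully grown forest and for one with larger leaves.

```python
import time

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the reported data: 16 features in [0, 1).
# Assumption: 5000 rows instead of 889054 so the sketch runs in seconds.
rng = np.random.RandomState(0)
X = rng.rand(5000, 16)
y = rng.rand(5000)

# Fit the scaler once and keep it, so the same transform can be
# reused on test samples later.
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)


def fit_and_measure(**params):
    """Fit a forest, then return it with its node count and one predict timing."""
    forest = RandomForestRegressor(
        random_state=0, n_estimators=10, n_jobs=1, **params
    )
    forest.fit(X_scaled, y)
    # Total nodes over all trees: a rough proxy for the model's memory use.
    total_nodes = sum(est.tree_.node_count for est in forest.estimators_)
    start = time.perf_counter()
    forest.predict(X_scaled[:1])  # predict one sample, as in the report
    elapsed = time.perf_counter() - start
    return forest, total_nodes, elapsed


# Default settings grow every tree until the leaves are (almost) pure.
full_forest, full_nodes, full_time = fit_and_measure()
# Illustrative value only: larger leaves give much smaller trees.
small_forest, small_nodes, small_time = fit_and_measure(min_samples_leaf=50)

print("default forest:      nodes=%d, single predict=%.4fs" % (full_nodes, full_time))
print("min_samples_leaf=50: nodes=%d, single predict=%.4fs" % (small_nodes, small_time))
```

If the node count of the default forest turns out to be very large on the real data, the fitted model may not fit in 4 GB of RAM, and swapping could then explain a slow predict call. Whether limiting tree size is acceptable depends on how much it changes the prediction error, which this sketch does not measure.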
