Parallelize training #100

Closed
thcrock opened this issue Apr 11, 2017 · 0 comments
thcrock (Contributor) commented Apr 11, 2017

This is the one time-consuming step that isn't parallelized in LocalParallelPipeline yet.

thcrock self-assigned this Apr 11, 2017
thcrock added a commit that referenced this issue Apr 11, 2017
- Restructure ModelTrainer to generate tasks via generate_train_tasks and process them via process_train_task; train_models and generate_trained_models still exist but use this interface internally (a rough sketch of the idea follows this list)
- Have LocalParallelPipeline use the new ModelTrainer interface to parallelize training
- Remove the deprecated matrix_store arg from the ModelTrainer constructor, along with all references and tests
- Add an assertion preventing InMemoryModelStorageEngine from being used with LocalParallelPipeline, avoiding the need for shared-memory management between trainer processes (this could be added in the future if desired), and change the pipeline test to use FSModelStorageEngine
- Have FSStore create its directory structure if it doesn't exist
- Move ModelTrainer#replace to the constructor
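
For reference, a minimal sketch of the task-splitting idea described above. The class and method names (ModelTrainer, generate_train_tasks, process_train_task, train_models) come from this issue; the signatures, the grid_config structure, and the train_in_parallel helper are illustrative assumptions, not the actual catwalk code.

```python
from multiprocessing import Pool


class ModelTrainer:
    def __init__(self, model_storage_engine, replace=True):
        # 'replace' now lives on the constructor instead of the train call
        self.model_storage_engine = model_storage_engine
        self.replace = replace

    def generate_train_tasks(self, grid_config, matrix_store, misc_db_parameters):
        # One independent task per (class_path, parameters) pair in the grid.
        # Assumes grid_config maps a classifier path to an iterable of
        # parameter dicts; the real expansion logic may differ.
        return [
            {
                'class_path': class_path,
                'parameters': parameters,
                'matrix_store': matrix_store,
                'misc_db_parameters': misc_db_parameters,
            }
            for class_path, param_grid in grid_config.items()
            for parameters in param_grid
        ]

    def process_train_task(self, class_path, parameters, matrix_store, misc_db_parameters):
        # Train and persist a single model; body omitted in this sketch.
        ...

    def train_models(self, grid_config, matrix_store, misc_db_parameters):
        # Serial path, kept for compatibility but built on the task API.
        return [
            self.process_train_task(**task)
            for task in self.generate_train_tasks(
                grid_config, matrix_store, misc_db_parameters
            )
        ]


def _run_task(trainer_and_task):
    trainer, task = trainer_and_task
    return trainer.process_train_task(**task)


def train_in_parallel(trainer, grid_config, matrix_store, misc_db_parameters, n_processes=4):
    """Roughly what LocalParallelPipeline can now do: fan tasks out to a pool."""
    tasks = trainer.generate_train_tasks(grid_config, matrix_store, misc_db_parameters)
    with Pool(n_processes) as pool:
        return pool.map(_run_task, [(trainer, task) for task in tasks])
```

Because each task runs in its own worker process, models written to a purely in-memory store would not be visible to the parent process, which is why the commit asserts against combining InMemoryModelStorageEngine with LocalParallelPipeline and switches the pipeline test to FSModelStorageEngine.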
ecsalomon added a commit that referenced this issue Apr 18, 2017