[MRG] Multithreaded HMM training #30

jmschrei · 2015-09-16T19:55:28Z

In progress

jmschrei · 2015-09-21T22:45:07Z

This is almost ready. I just need to add benchmarks and more thorough unit tests, and then we should be good to go.

jmschrei · 2015-09-24T05:23:40Z

Preliminary single core results:

master

HIDDEN MARKOV MODEL BENCHMARKS
gaussian emissions
FORWARD         : time: 11.998, logp: -62.823
BACKWARD        : time: 14.207, logp: -62.823
VITERBI         : time: 9.1404, logp: -64.18
FORWARD-BACKWARD: time: 34.526
BW TRAINING     : time: 4.9141, improvement: 4495.0

multivariate gaussian emissions
FORWARD         : time: 27.994, logp: -922.76
BACKWARD        : time: 28.35, logp: -922.76
VITERBI         : time: 23.949, logp: -922.76
FORWARD-BACKWARD: time: 77.954
BW TRAINING     : time: 2.3619, improvement: 8.7799e+04

branch

HIDDEN MARKOV MODEL BENCHMARKS
gaussian emissions
FORWARD         : time: 4.9733, logp: -62.823
BACKWARD        : time: 5.0479, logp: -62.823
VITERBI         : time: 8.3918, logp: -64.18
FORWARD-BACKWARD: time: 17.787
BW TRAINING     : time: 1.3614, improvement: 4495.0

multivariate gaussian emissions
FORWARD         : time: 5.474, logp: -922.76
BACKWARD        : time: 5.6663, logp: -922.76
VITERBI         : time: 23.631, logp: -922.76
FORWARD-BACKWARD: time: 33.408
BW TRAINING     : time: 0.3183, improvement: 8.7799e+04

Only forward, backward, and training have been optimized, so it makes sense that forward-backward and Viterbi don't see huge improvements (forward-backward calls both forward and backward, so it speeds up somewhat). The master numbers already include the speedups from releasing the GIL in the distributions, so the gains shown here come only from changes to the HMM code.
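To make the comparison easier to read, the ratios can be computed directly from the two tables above. This is only a small sketch: the times are copied from the master and branch outputs, and the `speedups` helper is a name introduced here for illustration, not part of the benchmark script.

```python
# Times (in the benchmark's own units) copied from the master and branch tables.
master = {
    "gaussian": {
        "FORWARD": 11.998, "BACKWARD": 14.207, "VITERBI": 9.1404,
        "FORWARD-BACKWARD": 34.526, "BW TRAINING": 4.9141,
    },
    "multivariate gaussian": {
        "FORWARD": 27.994, "BACKWARD": 28.35, "VITERBI": 23.949,
        "FORWARD-BACKWARD": 77.954, "BW TRAINING": 2.3619,
    },
}
branch = {
    "gaussian": {
        "FORWARD": 4.9733, "BACKWARD": 5.0479, "VITERBI": 8.3918,
        "FORWARD-BACKWARD": 17.787, "BW TRAINING": 1.3614,
    },
    "multivariate gaussian": {
        "FORWARD": 5.474, "BACKWARD": 5.6663, "VITERBI": 23.631,
        "FORWARD-BACKWARD": 33.408, "BW TRAINING": 0.3183,
    },
}


def speedups(before, after):
    """Return before/after time ratios per emission type and benchmark (hypothetical helper)."""
    return {
        emission: {
            name: before[emission][name] / after[emission][name]
            for name in before[emission]
        }
        for emission in before
    }


ratios = speedups(master, branch)
for emission, table in ratios.items():
    print(emission)
    for name, ratio in table.items():
        print(f"  {name:<16}: {ratio:.2f}x")
```

Ratios close to 1 (Viterbi) mark the code paths that were not touched, while forward, backward, and training show the largest gains.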

jmschrei · 2015-09-24T05:53:34Z

Multithreaded training currently helps a lot with big models, but can hurt small ones. This is because I'm acquiring the GIL frequently to ensure that all the data structures are thread-safe. The model below has 151 states.

HIDDEN MARKOV MODEL BENCHMARKS
gaussian emissions
BW TRAINING (1 thread) : time: 4.2026, improvement: 3846.1
BW TRAINING (4 threads): time: 2.2716, improvement: 3846.1

multivariate gaussian emissions
BW TRAINING (1 thread) : time: 8.8531, improvement: 7.3234e+05
BW TRAINING (4 threads): time: 4.6516, improvement: 7.3234e+05

In contrast, the model below has 16 states, and four threads are roughly 8x slower than one.

HIDDEN MARKOV MODEL BENCHMARKS
gaussian emissions
BW TRAINING (1 thread) : time: 0.063518, improvement: 587.19
BW TRAINING (4 threads): time: 0.50892, improvement: 587.19

multivariate gaussian emissions
BW TRAINING (1 thread) : time: 0.15548, improvement: 7.1882e+04
BW TRAINING (4 threads): time: 1.2511, improvement: 7.1882e+04
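One common way to cut down on this kind of locking overhead is to have each thread accumulate its statistics locally and touch the shared structure only once per batch. The sketch below illustrates that pattern under stated assumptions: `GaussianStats`, `summarize`, and `parallel_summarize` are hypothetical names, not this library's API, and this is not necessarily how the branch is implemented. It is pure Python, so it will not run faster with more threads (the interpreter holds the GIL for this loop); it only shows the structure.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class GaussianStats:
    """Hypothetical shared accumulator of sufficient statistics."""

    def __init__(self):
        self.n = 0
        self.total = 0.0
        self.total_sq = 0.0
        self._lock = threading.Lock()

    def merge(self, n, total, total_sq):
        # One lock acquisition per batch rather than one per observation.
        with self._lock:
            self.n += n
            self.total += total
            self.total_sq += total_sq

    def mean(self):
        return self.total / self.n


def summarize(batch, shared):
    # Accumulate into local variables first; no shared state is touched here.
    n, total, total_sq = 0, 0.0, 0.0
    for x in batch:
        n += 1
        total += x
        total_sq += x * x
    # Publish the batch result in a single synchronized step.
    shared.merge(n, total, total_sq)


def parallel_summarize(data, n_threads=4):
    shared = GaussianStats()
    # Split the data into one interleaved batch per thread.
    batches = [data[i::n_threads] for i in range(n_threads)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(lambda batch: summarize(batch, shared), batches))
    return shared


stats = parallel_summarize([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
print(stats.n, stats.mean())
```

With this layout the amount of synchronization scales with the number of batches rather than the number of observations, which matters most when each batch is small.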

jmschrei · 2015-09-25T05:24:41Z

branch

discrete distribution
FORWARD         : time: 8.1359, logp: -30.923
BACKWARD        : time: 8.2184, logp: -30.923
VITERBI         : time: 12.186, logp: -37.664
FORWARD-BACKWARD: time: 28.234
BW TRAINING     : time: 1.4654, improvement: 697.88

master

discrete distribution
FORWARD         : time: 16.173, logp: -30.923
BACKWARD        : time: 18.447, logp: -30.923
VITERBI         : time: 12.245, logp: -37.664
FORWARD-BACKWARD: time: 45.645
BW TRAINING     : time: 4.4667, improvement: 697.88

jmschrei · 2015-09-26T01:16:19Z

The only remaining issue is that one data structure is not thread-safe. I am currently tracking it down.

jmschrei · 2015-09-27T04:14:05Z

Everything works, finally. Merging!


jmschrei changed the title ~~[WIP] Multithreaded HMM training~~ [MRG] Multithreaded HMM training Sep 25, 2015

jmschrei force-pushed the remove_gil_hmm branch 8 times, most recently from 60463f3 to 79ee86e, September 25, 2015 20:31

ENH gil released for forward, backward, bw training

685542a

jmschrei force-pushed the remove_gil_hmm branch from b8aee5c to 685542a, September 25, 2015 22:22

FIX discrete distribution summarize now threadsafe

7bbb980

jmschrei force-pushed the remove_gil_hmm branch from 0d60a88 to 7bbb980, September 27, 2015 04:01

jmschrei pushed a commit that referenced this pull request Sep 27, 2015

Merge pull request #30 from jmschrei/remove_gil_hmm

3f40c06

[MRG] Multithreaded HMM training

jmschrei merged commit 3f40c06 into master Sep 27, 2015

jmschrei deleted the remove_gil_hmm branch October 13, 2015 03:38
