[MRG] Multithreaded HMM training #30

Merged
merged 2 commits into from Sep 27, 2015

jmschrei
Owner

In progress

@jmschrei
Owner Author

This is almost ready. Just need to add benchmarks, add more thorough unit tests, and we should be good to go.

@jmschrei
Owner Author

Preliminary single core results:

master

HIDDEN MARKOV MODEL BENCHMARKS
gaussian emissions
FORWARD         : time: 11.998, logp: -62.823
BACKWARD        : time: 14.207, logp: -62.823
VITERBI         : time: 9.1404, logp: -64.18
FORWARD-BACKWARD: time: 34.526
BW TRAINING     : time: 4.9141, improvement: 4495.0

multivariate gaussian emissions
FORWARD         : time: 27.994, logp: -922.76
BACKWARD        : time: 28.35, logp: -922.76
VITERBI         : time: 23.949, logp: -922.76
FORWARD-BACKWARD: time: 77.954
BW TRAINING     : time: 2.3619, improvement: 8.7799e+04

branch

HIDDEN MARKOV MODEL BENCHMARKS
gaussian emissions
FORWARD         : time: 4.9733, logp: -62.823
BACKWARD        : time: 5.0479, logp: -62.823
VITERBI         : time: 8.3918, logp: -64.18
FORWARD-BACKWARD: time: 17.787
BW TRAINING     : time: 1.3614, improvement: 4495.0

multivariate gaussian emissions
FORWARD         : time: 5.474, logp: -922.76
BACKWARD        : time: 5.6663, logp: -922.76
VITERBI         : time: 23.631, logp: -922.76
FORWARD-BACKWARD: time: 33.408
BW TRAINING     : time: 0.3183, improvement: 8.7799e+04

Only forward, backward, and training have been optimized, so it makes sense that Viterbi is essentially unchanged and forward-backward improves less than its parts (forward-backward calls both forward and backward, so it picks up some of their speedup). The master numbers already include the speed improvements from the GIL-released distribution work, so the gains shown here come only from changes to the HMM code.
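
To illustrate why forward-backward inherits part of the speedup, here is a minimal pure-Python sketch of a log-space forward-backward that simply calls a forward pass and a backward pass and then combines them. This is not the code from this branch; the function names, the list-of-lists layout, and the two-state toy model are assumptions made up for the example.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward(obs, log_start, log_trans, log_emit):
    """f[t][j] = log P(obs[0..t], state at t is j)."""
    n = len(log_start)
    f = [[log_start[j] + log_emit[j][obs[0]] for j in range(n)]]
    for t in range(1, len(obs)):
        prev = f[-1]
        f.append([
            logsumexp([prev[i] + log_trans[i][j] for i in range(n)])
            + log_emit[j][obs[t]]
            for j in range(n)
        ])
    return f

def backward(obs, log_trans, log_emit):
    """b[t][i] = log P(obs[t+1..] | state at t is i)."""
    n = len(log_trans)
    b = [[0.0] * n]  # log(1) at the final time step
    for t in range(len(obs) - 1, 0, -1):
        nxt = b[0]
        b.insert(0, [
            logsumexp([log_trans[i][j] + log_emit[j][obs[t]] + nxt[j]
                       for j in range(n)])
            for i in range(n)
        ])
    return b

def forward_backward(obs, log_start, log_trans, log_emit):
    """Combine both passes into per-step state posteriors."""
    f = forward(obs, log_start, log_trans, log_emit)
    b = backward(obs, log_trans, log_emit)
    logp = logsumexp(f[-1])
    n = len(log_start)
    posteriors = [[math.exp(f[t][j] + b[t][j] - logp) for j in range(n)]
                  for t in range(len(obs))]
    return logp, posteriors

# Hypothetical two-state, two-symbol model (not the benchmark model).
log = math.log
log_start = [log(0.6), log(0.4)]
log_trans = [[log(0.7), log(0.3)], [log(0.4), log(0.6)]]
log_emit = [[log(0.9), log(0.1)], [log(0.2), log(0.8)]]
obs = [0, 1, 0]

logp, posteriors = forward_backward(obs, log_start, log_trans, log_emit)
print(logp, posteriors)
```

In a structure like this, any time saved inside the two passes carries over to forward-backward, while the combination step on top of them stays the same cost.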

@jmschrei
Owner Author

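Multithreaded training currently helps a lot with big models, but can be harmful for small models. This is because I'm acquiring the GIL a lot to ensure all the data structures are thread-safe. With 4 threads, training is a little under 2x faster on the 151-state model and about 8x slower on the 16-state model. This model has 151 states.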

HIDDEN MARKOV MODEL BENCHMARKS
gaussian emissions
BW TRAINING (1 thread) : time: 4.2026, improvement: 3846.1
BW TRAINING (4 threads): time: 2.2716, improvement: 3846.1

multivariate gaussian emissions
BW TRAINING (1 thread) : time: 8.8531, improvement: 7.3234e+05
BW TRAINING (4 threads): time: 4.6516, improvement: 7.3234e+05

In contrast, this model has 16 states.

HIDDEN MARKOV MODEL BENCHMARKS
gaussian emissions
BW TRAINING (1 thread) : time: 0.063518, improvement: 587.19
BW TRAINING (4 threads): time: 0.50892, improvement: 587.19

multivariate gaussian emissions
BW TRAINING (1 thread) : time: 0.15548, improvement: 7.1882e+04
BW TRAINING (4 threads): time: 1.2511, improvement: 7.1882e+04
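
One common way to cut down on synchronization in this kind of training loop is to let each thread accumulate into its own private structure and merge into the shared one only once at the end. The sketch below shows that pattern in plain Python. It is not the implementation in this branch: `sequence_stats` is a hypothetical stand-in for the per-sequence expectation step, and `accumulate` is a made-up name. In standard CPython, pure-Python threads like these do not speed up CPU-bound work because of the GIL, so the example only demonstrates the locking structure, not the speedup.

```python
import threading
from collections import Counter

def sequence_stats(seq):
    """Hypothetical stand-in for the per-sequence E-step.

    Counts adjacent symbol pairs instead of computing expected transitions.
    """
    return Counter(zip(seq, seq[1:]))

def accumulate(sequences, n_threads=4):
    """Sum per-sequence statistics using one short critical section per thread."""
    total = Counter()
    lock = threading.Lock()

    def worker(chunk):
        local = Counter()  # private to this thread, so no locking is needed
        for seq in chunk:
            local.update(sequence_stats(seq))
        with lock:  # merge once, instead of locking for every sequence
            total.update(local)

    # Deal the sequences out round-robin, one chunk per thread.
    chunks = [sequences[i::n_threads] for i in range(n_threads)]
    threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in chunks]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total

sequences = [[0, 1, 1, 0], [1, 1, 0], [0, 0, 1, 0, 1], [1, 0]]
serial = accumulate(sequences, n_threads=1)
threaded = accumulate(sequences, n_threads=4)
print(serial == threaded)
```

The fixed cost of starting threads and merging results is the same whether the per-sequence work is large or tiny, which is consistent with the pattern above where the 151-state model benefits and the 16-state model does not.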

@jmschrei changed the title from "[WIP] Multithreaded HMM training" to "[MRG] Multithreaded HMM training" on Sep 25, 2015
@jmschrei
Owner Author

branch

discrete distribution
FORWARD         : time: 8.1359, logp: -30.923
BACKWARD        : time: 8.2184, logp: -30.923
VITERBI         : time: 12.186, logp: -37.664
FORWARD-BACKWARD: time: 28.234
BW TRAINING     : time: 1.4654, improvement: 697.88

master

discrete distribution
FORWARD         : time: 16.173, logp: -30.923
BACKWARD        : time: 18.447, logp: -30.923
VITERBI         : time: 12.245, logp: -37.664
FORWARD-BACKWARD: time: 45.645
BW TRAINING     : time: 4.4667, improvement: 697.88
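
For a quick read of the discrete-distribution numbers above, the ratio of master time to branch time can be computed per operation. The timings are copied from the two listings; the dictionary and variable names are just for this example.

```python
# Timings copied from the discrete-distribution benchmark listings above.
master = {
    "FORWARD": 16.173,
    "BACKWARD": 18.447,
    "VITERBI": 12.245,
    "FORWARD-BACKWARD": 45.645,
    "BW TRAINING": 4.4667,
}
branch = {
    "FORWARD": 8.1359,
    "BACKWARD": 8.2184,
    "VITERBI": 12.186,
    "FORWARD-BACKWARD": 28.234,
    "BW TRAINING": 1.4654,
}

# A ratio above 1 means the branch is faster than master.
speedup = {name: master[name] / branch[name] for name in master}

for name, ratio in speedup.items():
    print(f"{name:<16}: {ratio:.2f}x")
```

Viterbi comes out essentially unchanged, which matches the earlier note that only forward, backward, and training were optimized.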

@jmschrei force-pushed the remove_gil_hmm branch 8 times, most recently from 60463f3 to 79ee86e, on September 25, 2015 20:31
@jmschrei
Owner Author

The only remaining issue is that one data structure is not thread-safe. I am currently tracking it down.

@jmschrei
Owner Author

Everything works, finally. Merging!

jmschrei pushed a commit that referenced this pull request Sep 27, 2015
[MRG] Multithreaded HMM training
@jmschrei merged commit 3f40c06 into master on Sep 27, 2015
@jmschrei deleted the remove_gil_hmm branch on October 13, 2015 03:38