Skip to content

Conversation

@HilmiYildirim
Copy link

Implemented a parallel version of Hidden Markov Models. It uses a parallel version of the Baum-Welch algorithm to train the model and uses the Viterbi algorithm to predict the sequence labels.

The prediction quality of my examples is low. The reason for this is probably the small size of the model prameters

@chiwanpark
Copy link
Member

Hi @HilmiYildirim, Thanks for opening pull request. I would like to shepherd this PR. Could you rename this PR to include JIRA issue number? I think "[FLINK-3003] Implement a parallel version of the Hidden Markov Model" would be better.

@uce
Copy link
Contributor

uce commented Nov 12, 2015

3003 is wrong, its [FLINK-3007]

…allel version of the Baum-Welch algorithm to train the model and uses the Viterbi algorithm to predict the sequence labels

Implemented a parallel version of Hidden Markov Models. It uses a parallel version of the Baum-Welch algorithm to train the model and uses the Viterbi algorithm to predict the sequence labels
@HilmiYildirim HilmiYildirim changed the title Implemented a parallel version of Hidden Markov Models. It uses a par… [Flink-3007] Implemented a parallel version of Hidden Markov Models. It uses a par… Nov 12, 2015
@chiwanpark
Copy link
Member

@uce Thanks for correction. :-)

@HilmiYildirim
Copy link
Author

I have renamed the pull request and adapted the source code so that all checks have passed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rename this file to HMMITSuite? Otherwise this test will always be run when Unit tests are executed.

@tillrohrmann
Copy link
Contributor

Thanks for your contribution @HilmiYildirim. I had some comments.

I'm not so sure about the correctness of the implementation. Where did you get the testing data from @HilmiYildirim. Is that from some published source?

The current implementation lacks the integration with the FlinkML's pipelining mechanism. Furthermore, it only works on integers. What if the observations are characters for example.

The PR needs more documentation for better understanding the code and maintaining it.

Could we maybe add a smaller test file. One of the files is 2MB large. This should definitely be smaller.

I think this PR needs a little bit of more effort to get in shape.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great to make the formatting consistent with the code base.

@zentol
Copy link
Contributor

zentol commented Feb 28, 2019

Closing since flink-ml is effectively frozen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants