
hmsm fit-ghmm #78

Closed
cxhernandez opened this issue Mar 13, 2014 · 13 comments

Comments

@cxhernandez
Member

Not sure exactly what the issue is, but this latest commit doesn't work on my project. I just get a blank JSON file, and it stops after:

Loading data into memory + vectorization: 2381.359713 s
Fitting with 2177 timeseries from 1287 trajectories with 14722812 total observations
@rmcgibbo
Contributor

Did you check top? It's not just running slowly?

@cxhernandez
Member Author

It was a qsub job that ended without any errors. :-/

@rmcgibbo
Contributor

Did you collect both stdout and stderr? Maybe the error only went to stderr?

@rmcgibbo
Contributor

Maybe you can try fitting on only a subset of the trajectories (like 1 of them), and just run it on the head node? If that works, we can scale up from there. If it doesn't, we'll have more info on what's going on.
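(For reference, a minimal sketch of what that smoke test could look like at the Python level. The feature-file path, the GaussianFusionHMM class name, and its constructor arguments are assumptions here -- the warning later in this thread only confirms that the estimator lives in mixtape/ghmm.py -- so adjust to whatever your install actually exposes.)

    import glob
    import time
    import numpy as np
    # Assumed import: mixtape/ghmm.py is referenced by the warning below, but the
    # public class name and signature may differ in your version.
    from mixtape.ghmm import GaussianFusionHMM

    # Load only the first per-trajectory feature array instead of all 1287.
    # 'features/*.npy' is a placeholder path for whatever fit-ghmm was reading.
    files = sorted(glob.glob('features/*.npy'))[:1]
    sequences = [np.load(f) for f in files]

    # Start with a small model; scale up only if this finishes cleanly.
    model = GaussianFusionHMM(n_states=4, n_features=sequences[0].shape[1])

    start = time.time()
    model.fit(sequences)
    print('fit on %d trajectory(ies) took %.1f s' % (len(sequences), time.time() - start))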

@cxhernandez
Member Author

Yeah, I did, but stderr was empty. I'll try your suggestion.

@rmcgibbo
Contributor

I looked through the code on master again -- nothing jumped out at me.
There are various places where exceptions can be raised, but those print
to stderr. If there's a malloc failure inside the e-step, that exits the
interpreter, but it also prints to stderr first.
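(One low-tech way to rule out a silent Python-level failure, as a sketch: wrap the fit call so any traceback also lands in a regular file, independent of how the scheduler handles stderr. A hard C-level abort -- e.g. a malloc failure inside the e-step -- would still bypass this and only show up on stderr.)

    import traceback

    def run_fit(model, sequences, logfile='fit_error.log'):
        # Run model.fit and copy any Python-level traceback into a file we control.
        # A C-level crash would still kill the interpreter before reaching the
        # except block, so an empty log plus empty stderr points at the scheduler.
        try:
            model.fit(sequences)
        except Exception:
            with open(logfile, 'w') as fh:
                traceback.print_exc(file=fh)
            raise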

@cxhernandez
Member Author

Okay, running the job interactively, it's at the 3-hour mark and still running according to top, which I take to be a good sign (it stops after about 2 hours on qsub). However, I got this warning:

Loading data into memory + vectorization: 3694.696191 s
Fitting with 2177 timeseries from 1287 trajectories with 14722812 total observations
/home/cxh/anaconda/lib/python2.7/site-packages/mixtape/ghmm.py:283: UserWarning: Maximum likelihood reversible transition matrixoptimization failed: ABNORMAL_TERMINATION_IN_LNSRCH
  self.transmat_, self.populations_ = _reversibility.reversible_transmat(counts)

Can I ignore this warning, or will the results be severely affected?
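(One quick sanity check, as a sketch: whatever the optimizer warning says, the fitted matrix should at least still be a valid stochastic matrix. The transmat_ and populations_ attribute names come from the warning above; the tolerances here are arbitrary.)

    import numpy as np

    def check_model(model, tol=1e-4):
        # Basic validity checks on a fitted HMM, independent of how the
        # reversible-transmat optimization terminated.
        T = np.asarray(model.transmat_)
        pi = np.asarray(model.populations_)
        assert np.all(T >= -tol), 'negative transition probabilities'
        assert np.allclose(T.sum(axis=1), 1.0, atol=tol), 'rows do not sum to 1'
        assert np.allclose(pi.sum(), 1.0, atol=tol), 'populations do not sum to 1'
        print('transition matrix and populations look valid')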

@cxhernandez
Member Author

new warning, this time self-explanatory:

/home/cxh/anaconda/lib/python2.7/site-packages/mixtape/ghmm.py:283: UserWarning: Maximum likelihood reversible transition matrixoptimization failed: too many function evaluations or too many iterations
  self.transmat_, self.populations_ = _reversibility.reversible_transmat(counts)

@rmcgibbo
Contributor

The line search failure is fairly typical, and not catastrophic. I haven't seen the 'too many function evaluations' one before, but I don't think it's catastrophic either. It basically means that the transition matrix might not be 100% optimized, but it'll still usually be a pretty good solution (it's near the local minimum, it just didn't quite converge).

How many states are you using?
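(If you want to eyeball how reasonable the not-fully-converged solution is, one standard diagnostic is the implied timescales of the transition matrix -- a sketch below. The lag argument is a placeholder for whichever lag time the model was actually fit at; smooth, well-separated timescales suggest the fit is usable despite the warning.)

    import numpy as np

    def implied_timescales(transmat, lag=1):
        # Eigenvalues of a stochastic matrix: the largest is 1 (stationary);
        # each remaining eigenvalue in (0, 1) maps to a relaxation timescale
        # t_i = -lag / ln(lambda_i), in the same units as the lag time.
        eigvals = np.sort(np.linalg.eigvals(np.asarray(transmat)).real)[::-1]
        lam = eigvals[1:]          # drop the stationary eigenvalue
        lam = lam[lam > 0.0]       # ignore non-positive eigenvalues
        return -lag / np.log(lam)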

@cxhernandez
Member Author

Around 10 with 8 lag times each

@cxhernandez
Member Author

To clarify: ten different numbers of states (4, 5, 7, 10, 12, 15, 20, 25, 30, 25, 40)

@rmcgibbo
Contributor

The cost scales with the cube of the number of states. I haven't actually tried more than ~10 states myself. You have a lot of data, too. I'm reasonably confident that your calculation is just taking a long time -- not dying unexpectedly.
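(Just to put rough numbers on that, assuming the per-iteration cost really does go as the cube of the state count:)

    # Back-of-the-envelope scaling, assuming cost ~ n_states**3
    base = 4
    for n in (4, 10, 20, 40):
        print('%2d states: roughly %4.0fx the cost of %d states'
              % (n, float(n ** 3) / base ** 3, base))
    # -> 1x, 16x, 125x, 1000x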

@cxhernandez
Member Author

Yeah, it seems like it. But it still doesn't explain why qsub shortchanges my walltime.... I'll go bother the proclus admins then. Thanks Robert!
