
hmsm fit-ghmm #78

Closed
cxhernandez opened this issue Mar 13, 2014 · 13 comments

Comments

@cxhernandez
Member

Not sure exactly what the issue is, but this latest commit doesn't work on my project. I just get a blank JSON file, and it stops after:

Loading data into memory + vectorization: 2381.359713 s
Fitting with 2177 timeseries from 1287 trajectories with 14722812 total observations
@rmcgibbo
Contributor

Did you check top? It's not just running slowly?

@cxhernandez
Member Author

It was a qsub job that ended without any errors. :-/

@rmcgibbo
Contributor

Did you collect both stdout and stderr? Maybe the error only went to stderr?

@rmcgibbo
Contributor

Maybe you can try fitting on only a subset of the trajectories (like 1 of them), and just run it on the head node? If that works, we can scale up from there. If it doesn't, we'll have more info on what's going on.
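(For reference, a minimal sketch of what that smoke test could look like at the Python level. The feature-file path, the GaussianFusionHMM class name, and its constructor arguments are assumptions here -- the warning later in this thread only confirms that the estimator lives in mixtape/ghmm.py -- so adjust to whatever your install actually exposes.)

    import glob
    import time
    import numpy as np
    # Assumed import: mixtape/ghmm.py is referenced by the warning below, but the
    # public class name and signature may differ in your version.
    from mixtape.ghmm import GaussianFusionHMM

    # Load only the first per-trajectory feature array instead of all 1287.
    # 'features/*.npy' is a placeholder path for whatever fit-ghmm was reading.
    files = sorted(glob.glob('features/*.npy'))[:1]
    sequences = [np.load(f) for f in files]

    # Start with a small model; scale up only if this finishes cleanly.
    model = GaussianFusionHMM(n_states=4, n_features=sequences[0].shape[1])

    start = time.time()
    model.fit(sequences)
    print('fit on %d trajectory(ies) took %.1f s' % (len(sequences), time.time() - start))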

@cxhernandez
Member Author

Yeah, I did, but stderr was empty. I'll try your suggestion.

@rmcgibbo
Contributor

I looked through the code on master again -- nothing jumped out at me.
There are various places where exceptions can be raised, but those print
to stderr. If there's a malloc failure inside the e-step, that exits the
interpreter, but it also prints to stderr first.
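(One low-tech way to rule out a silent Python-level failure, as a sketch: wrap the fit call so any traceback also lands in a regular file, independent of how the scheduler handles stderr. A hard C-level abort -- e.g. a malloc failure inside the e-step -- would still bypass this and only show up on stderr.)

    import traceback

    def run_fit(model, sequences, logfile='fit_error.log'):
        # Run model.fit and copy any Python-level traceback into a file we control.
        # A C-level crash would still kill the interpreter before reaching the
        # except block, so an empty log plus empty stderr points at the scheduler.
        try:
            model.fit(sequences)
        except Exception:
            with open(logfile, 'w') as fh:
                traceback.print_exc(file=fh)
            raise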

@cxhernandez
Member Author

Okay, running the job interactively, it's at the 3-hour mark and still running according to top, which I take to be a good sign (it stops after about 2 hours on qsub). However, I got this warning:

Loading data into memory + vectorization: 3694.696191 s
Fitting with 2177 timeseries from 1287 trajectories with 14722812 total observations
/home/cxh/anaconda/lib/python2.7/site-packages/mixtape/ghmm.py:283: UserWarning: Maximum likelihood reversible transition matrixoptimization failed: ABNORMAL_TERMINATION_IN_LNSRCH
  self.transmat_, self.populations_ = _reversibility.reversible_transmat(counts)

Can I ignore this warning, or will the results be severely affected?
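(One quick sanity check, as a sketch: whatever the optimizer warning says, the fitted matrix should at least still be a valid stochastic matrix. The transmat_ and populations_ attribute names come from the warning above; the tolerances here are arbitrary.)

    import numpy as np

    def check_model(model, tol=1e-4):
        # Basic validity checks on a fitted HMM, independent of how the
        # reversible-transmat optimization terminated.
        T = np.asarray(model.transmat_)
        pi = np.asarray(model.populations_)
        assert np.all(T >= -tol), 'negative transition probabilities'
        assert np.allclose(T.sum(axis=1), 1.0, atol=tol), 'rows do not sum to 1'
        assert np.allclose(pi.sum(), 1.0, atol=tol), 'populations do not sum to 1'
        print('transition matrix and populations look valid')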

@cxhernandez
Member Author

new warning, this time self-explanatory:

/home/cxh/anaconda/lib/python2.7/site-packages/mixtape/ghmm.py:283: UserWarning: Maximum likelihood reversible transition matrixoptimization failed: too many function evaluations or too many iterations
  self.transmat_, self.populations_ = _reversibility.reversible_transmat(counts)

@rmcgibbo
Contributor

The line search failure is fairly typical, and not catastrophic. I haven't seen the 'too many function evaluations' one before, but I don't think it's catastrophic either. It basically means that the transition matrix might not be 100% optimized, but it'll still usually be a pretty good solution (it's near the local minimum, it just didn't quite converge).

How many states are you using?
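(If you want to eyeball how reasonable the not-fully-converged solution is, one standard diagnostic is the implied timescales of the transition matrix -- a sketch below. The lag argument is a placeholder for whichever lag time the model was actually fit at; smooth, well-separated timescales suggest the fit is usable despite the warning.)

    import numpy as np

    def implied_timescales(transmat, lag=1):
        # Eigenvalues of a stochastic matrix: the largest is 1 (stationary);
        # each remaining eigenvalue in (0, 1) maps to a relaxation timescale
        # t_i = -lag / ln(lambda_i), in the same units as the lag time.
        eigvals = np.sort(np.linalg.eigvals(np.asarray(transmat)).real)[::-1]
        lam = eigvals[1:]          # drop the stationary eigenvalue
        lam = lam[lam > 0.0]       # ignore non-positive eigenvalues
        return -lag / np.log(lam)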

@cxhernandez
Member Author

Around 10 with 8 lag times each

@cxhernandez
Member Author

To clarify: ten different numbers of states (4, 5, 7, 10, 12, 15, 20, 25, 30, 25, 40)

@rmcgibbo
Contributor

The cost scales with the cube of the number of states. I haven't actually tried more than ~10 states myself. You have a lot of data, too. I'm reasonably confident that your calculation is just taking a long time -- not dying unexpectedly.
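(Just to put rough numbers on that, assuming the per-iteration cost really does go as the cube of the state count:)

    # Back-of-the-envelope scaling, assuming cost ~ n_states**3
    base = 4
    for n in (4, 10, 20, 40):
        print('%2d states: roughly %4.0fx the cost of %d states'
              % (n, float(n ** 3) / base ** 3, base))
    # -> 1x, 16x, 125x, 1000x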

@cxhernandez
Member Author

Yeah, it seems like it. But it still doesn't explain why qsub shortchanges my walltime.... I'll go bother the proclus admins then. Thanks Robert!
