
Reflection on the construction of A and B for the calculation of iter_err #14

Open
ericgermaining opened this issue Jun 14, 2019 · 4 comments


@ericgermaining
Collaborator

In traveltimeHMM.R, all the individual objects put into A and B (mu_s, sigma^2_s, small_gamma, big_gamma, E) carry equal weight in the calculation of iter_err, since we simply compute sum(abs(A-B)). This should be reviewed, as the resulting "distance" seems meaningless from an optimization perspective. See comments in code.

    ## A and B stack every parameter estimate into one flat vector, so each
    ## element weighs equally in the error regardless of its scale.
    A = c(mu_speed[-linksLessMinObs,], var_speed[-linksLessMinObs,])
    B = c(mu_speedNew[-linksLessMinObs,], var_speedNew[-linksLessMinObs,])
    if(grepl('HMM', model)){
        A = c(A, init[-init_L,], tmat[-unique(linksLessMinObs,only_init),])
        B = c(B, initNew[-init_L,], tmatNew[-unique(linksLessMinObs,only_init),])
    }
    if(grepl('trip',model)) { A = c(A, E); B = c(B,E_new)}
    ## error.fun is sum(abs(A - B)) over the stacked vector.
    iter_error = error.fun(A, B)
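To make the scale problem concrete, here is a toy illustration (all numbers invented): when one block of parameters lives on a much larger scale than another, sum(abs(A - B)) is dominated by the larger block.

    ## Toy example: means change by 0.5 each, variances by 0.001 each.
    mu_old  <- c(30.0, 45.0);   mu_new  <- c(30.5, 45.5)
    var_old <- c(0.020, 0.030); var_new <- c(0.021, 0.031)
    A <- c(mu_old, var_old)
    B <- c(mu_new, var_new)
    sum(abs(A - B))   # 1.002 -- almost entirely driven by the means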
@melmasri
Owner

@ericgermaining, I completely agree with you. In fact, because of this issue the model never converges below the specified threshold. I do not recall the paper giving any advice on this! Do you have an idea how to do it?

@ericgermaining
Collaborator Author

@melmasri The only cue I see in the paper is "We iterate until there is no change in the parameter estimates up to the third significant figure". That means mu_s, the variances, init and tmat. The trip effect should not be included; it does not even have the same dimension as the others.
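A minimal sketch of that stopping rule, assuming theta_old and theta_new are the stacked vectors of mu_s, variances, init and tmat (hypothetical names, not the package's objects):

    ## Stop when no estimate changes in its first three significant figures.
    converged <- all(signif(theta_old, 3) == signif(theta_new, 3))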

Indeed, the model converges, very slowly, to some value much higher than the default threshold of 10. Please have a look at the three graphs attached. I plotted the evolution of the error function over the iterations, up to iteration 120, with the example supplied in the man page. I did this separately for the full THETA (the one from the code), the means only, and the variances only. The trip effect is not included since I used an HMM model.

Basically, the code as it stands "works" even though the threshold is much too low and the THETAs don't make sense numerically. The curves for the means and the variances are very much alike despite the lower values for the variances. I suggest we keep only the means and adjust the threshold so that it is more realistic. But then it all depends on your goals.
[Graphs 1–3: error-function trajectories over 120 iterations, for the full THETA, the means only, and the variances only]

@melmasri
Owner

Hi @ericgermaining, this is amazing. I didn't have time to do this analysis myself; it is excellent and very insightful. Regarding your points:

  • Yes, it was wrong to add the E variables. However, should we include tau2, since it is the variance of E? From the paper, it seems that she doesn't, so maybe it is better not to.
  • One reason the means and variances converge to a value much higher than 10 may be the line that reorders the states. Woodard wanted to ensure the interpretability of the states: for example, state 1 is congestion versus state 2 non-congestion, hence the mean of state 1 < the mean of state 2. I was not sure how to enforce that numerically, so I made the re-order function. If you have time, try the same plots with that line commented out; things may drop much further (one way to do the reordering is sketched after this list).
  • What would you suggest as a stopping threshold? How about number of parameters * 0.001, i.e. a maximum of 0.001 error per parameter? In our case, with about 40k links, that would be an error threshold of about 40. What do you think?
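For reference, the reordering itself can be written as a simple permutation. This is purely illustrative; all names (mu, v, init, tmat) are hypothetical stand-ins, not the package's objects:

    ## Permute the hidden states of one link so that state means increase,
    ## i.e. state 1 = congestion = lowest mean speed.
    reorder_states <- function(mu, v, init, tmat) {
        ord <- order(mu)            # permutation sorting the state means
        list(mu   = mu[ord],
             v    = v[ord],
             init = init[ord],
             tmat = tmat[ord, ord]) # rows and columns permuted consistently
    }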

Thanks a lot for those plots; they are very helpful.

@ericgermaining
Collaborator Author

Hi @melmasri,

I redid the plots with reordering disabled. Here they are.
[Graphs 1–3 redone with state reordering disabled]

There is an improvement, but not an order of magnitude. That being said, I believe it is a good idea not to perform the reordering, as I don't see any non-intrusive way of doing it. May we assume the ordering is preserved anyway? The difference in error reduction might be attributable to randomness...?

Regarding the ideal threshold: this is a tough question, and I don't think there is a single answer. Currently, just choosing a different model type affects the threshold: with an HMM model, the transition matrix and initial state vector are included in the computation of Theta, while with a no-dependence model they don't contribute. The algorithm's behaviour is also likely to be influenced by factors such as nQ. Therefore I suggest not spending any more time on this. Or maybe we should instead try to detect whether there is ANY improvement over, say, the past 10 iterations (rough sketch below)?
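Something along these lines, assuming err_history holds the iter_error value of every iteration so far (hypothetical name):

    ## Stop when the best error in the last `window` iterations is no better
    ## than the best error seen before them.
    stalled <- function(err_history, window = 10, tol = 0) {
        n <- length(err_history)
        if (n <= window) return(FALSE)
        min(err_history[(n - window + 1):n]) >= min(err_history[1:(n - window)]) - tol
    }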

Because of all this, I think it would be best to include only mu and sigma in the calculation of Theta, for now at least.
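In terms of the excerpt quoted above, that would amount to dropping init, tmat and E from the comparison:

    ## Convergence check restricted to the means and variances only.
    A <- c(mu_speed[-linksLessMinObs,], var_speed[-linksLessMinObs,])
    B <- c(mu_speedNew[-linksLessMinObs,], var_speedNew[-linksLessMinObs,])
    iter_error <- error.fun(A, B)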
