
markovchainfit on a list doesn't identify the absorbing state #155

Open
ChristelSwift opened this issue Dec 12, 2018 · 6 comments

@ChristelSwift

ChristelSwift commented Dec 12, 2018

I'm trying to fit a Markov chain from a large list where each element is a customer journey between states. Each journey ends with an "end" state (so "end" is an absorbing state). However, in the returned estimated transition matrix, all the entries in the "end" row are zeros, whereas I would have expected a 1 in the cell where row = "end" and column = "end". Am I doing something wrong?

Here's a simple reproducible example:

library(markovchain)

c1 <- c("a","b","c","c","e")
c2 <- c("a","b","d","e")
c3 <- c("a","c","b","c","d")
c4 <- c("a","b","b","d","b","c","d","e")
c5 <- c("a","c","c","d","d")
c6 <- c("a","c","d","d","b","b","e")

mylist <- list(c1, c2, c3, c4, c5, c6)
mylistMc <- markovchainFit(data = mylist)
mylistMc$estimate

mylistMc$estimate
MLE Fit
A 5 - dimensional discrete Markov Chain defined by the following states:
a, b, c, d, e
The transition matrix (by rows) is defined as follows:
  a         b     c         d         e
a 0 0.5000000 0.500 0.0000000 0.0000000
b 0 0.2500000 0.375 0.2500000 0.1250000
c 0 0.1250000 0.250 0.5000000 0.1250000
d 0 0.3333333 0.000 0.3333333 0.3333333
e 0 0.0000000 0.000 0.0000000 0.0000000

@spedygiorgio
Owner

spedygiorgio commented Dec 12, 2018

Hi @ChristelSwift, we will check and let you know.

@ncordon
Collaborator

ncordon commented Dec 26, 2018

@ChristelSwift I do think there is an error and the matrix should be:

  a         b     c         d         e
a 0 0.5000000 0.500 0.0000000 0.0000000
b 0 0.2500000 0.375 0.2500000 0.1250000
c 0 0.1250000 0.250 0.5000000 0.1250000
d 0 0.3333333 0.000 0.3333333 0.3333333
e 0 0.0000000 0.000 0.0000000 1.0000000

In fact, the matrix we output is not even row-stochastic. And e should indeed be absorbing, shouldn't it?
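A quick way to see this, with mylistMc from the example above (a small sketch, assuming the fitted estimate exposes its matrix via the transitionMatrix slot of the S4 object):

rowSums(mylistMc$estimate@transitionMatrix)
# a b c d e
# 1 1 1 1 0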

@ncordon ncordon added the bug label Dec 26, 2018
@ChristelSwift
Author

ChristelSwift commented Dec 26, 2018

@ncordon Yes, that's right: the last row of the matrix should be as you describe, but the estimate comes out as a row of zeros (not of ones as I originally wrote).

@spedygiorgio
Copy link
Owner

spedygiorgio commented Dec 29, 2018

@ChristelSwift, the reason is that e is the last element of some sequences, so there is no transition out of it to learn from. If "e" were an absorbing state, we would have expected every occurrence of "e" to be followed by "e". @ncordon, as discussed yesterday, we can add an option to the markovchainFit function (the name of the option is still to be determined, maybe no_transitions_hp; suggestions welcome) with the following choices:

  • "zeros": current behaviour, the row is left as all zeros;
  • "assume absorbing": a state with no outgoing transitions is assumed to be absorbing and a 1 is set on its diagonal; this is what @ChristelSwift suggests (a rough manual version is sketched after this list);
  • "uniform hp": the row gets a uniform estimate of 1/dim(matrix) for every state.

@ChristelSwift
Author

Thanks very much for the explanation. In the meantime, all I have to do is repeat the absorbing state twice at the end of each journey, which is easily done. Thank you!
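For example, something like this (a small sketch on toy journeys rather than the real data):

library(markovchain)

journeys <- list(c("a", "b", "end"),
                 c("a", "c", "end"))

# repeat the final (absorbing) state so the end -> end transition is observed
journeys <- lapply(journeys, function(x) c(x, tail(x, 1)))

fit <- markovchainFit(data = journeys)
fit$estimate   # the "end" row now has a 1 on its diagonal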

@ncordon
Collaborator

ncordon commented Jan 26, 2019

@ChristelSwift, the reason is that e is the last element of some sequences, so there is no transition out of it to learn from. If "e" were an absorbing state, we would have expected every occurrence of "e" to be followed by "e". @ncordon, as discussed yesterday, we can add an option to the markovchainFit function (the name of the option is still to be determined, maybe no_transitions_hp; suggestions welcome) with the following choices:

* "zeros": current behaviour, the row is left as all zeros;

* "assume absorbing": a state with no outgoing transitions is assumed to be absorbing and a 1 is set on its diagonal; this is what @ChristelSwift suggests;

* "uniform hp": the row gets a uniform estimate of 1/dim(matrix) for every state.

I think we should discard the "zeros" option. It is not useful from a mathematical point of view: a transition matrix for a Markov chain should always be row-stochastic (i.e. the probabilities in each row sum to one). In addition, this is something we already check in the constructor of Markov chains (that each row sums to 1). From my perspective, it does not make sense to output something that does not satisfy our own requirements for being a Markov chain. I'm adding the other two possibilities though.
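For comparison, the "uniform hp" option boils down to something like this (a rough sketch on the matrix from the original example, again assuming the transitionMatrix slot):

P <- mylistMc$estimate@transitionMatrix   # fitted matrix from the original example
noOut <- rowSums(P) == 0                  # rows with no observed outgoing transitions
P[noOut, ] <- 1 / ncol(P)                 # uniform estimate: 1/5 for every state
rowSums(P)                                # every row now sums to 1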
