
Better Timeseries Comparison #15

Open
theideasmith opened this issue May 25, 2016 · 5 comments


theideasmith commented May 25, 2016

Use time-delayed mutual information (TDMI) instead of cross-correlation: http://arxiv.org/pdf/1110.4102v1.pdf
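For context, a minimal sketch of the lagged cross-correlation baseline that TDMI would replace (the function below is only illustrative, not anything already in the repo):

import numpy as np

def lagged_cross_correlation(x, y, max_lag):
    # Pearson-style correlation between x(t) and y(t + lag) for each lag.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    lags = np.arange(-max_lag, max_lag + 1)
    corrs = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag < 0:
            corrs[i] = np.mean(x[-lag:] * y[:lag])
        elif lag > 0:
            corrs[i] = np.mean(x[:-lag] * y[lag:])
        else:
            corrs[i] = np.mean(x * y)
    return lags, corrs

The point of TDMI is to replace the product term here with a mutual information estimate at each lag, so it can pick up nonlinear dependencies that this linear measure misses.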


theideasmith commented Jun 2, 2016

see this: http://www.stat.berkeley.edu/~binyu/summer08/L2P2.pdf

and here is some slightly modified mutual information code from Stack Overflow (it works, so why change it?):

import numpy as np

def mutual_information(X, Y, bins):
    # See https://en.wikipedia.org/wiki/Mutual_information
    # Bin the data to approximate the marginal and joint distributions.
    c_XY = np.histogram2d(X, Y, bins)[0]
    c_X = np.histogram(X, bins)[0]
    c_Y = np.histogram(Y, bins)[0]

    H_X = shan_entropy(c_X)
    H_Y = shan_entropy(c_Y)
    H_XY = shan_entropy(c_XY)

    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    MI = H_X + H_Y - H_XY
    return MI

def shan_entropy(c):
    # Shannon entropy (in bits) of a histogram of counts.
    c_normalized = c / float(np.sum(c))
    c_normalized = c_normalized[np.nonzero(c_normalized)]
    H = -np.sum(c_normalized * np.log2(c_normalized))
    return H

There are two issues I know of with mutual information, only one of which I think is relevant. The first is that MI is computationally expensive, because you need to estimate the PDF of each variable as well as their joint PDF. The second is that the results are highly dependent on the chosen binning parameters. I know there has been some work by Liam Paninski on this, but before adding MI to analysis.py we should definitely take some time to think about these issues.
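As an illustrative sketch only (not something already in analysis.py), the mutual_information function above could be turned into a time-delayed MI by sweeping the lag; the bins argument is exactly the parameter whose choice the binning-sensitivity concern is about:

import numpy as np

def time_delayed_mutual_information(x, y, max_lag, bins=16):
    # MI between x(t) and y(t + lag), reusing mutual_information() from above.
    # Results depend strongly on `bins`, which is the caveat discussed in this thread.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lags = np.arange(-max_lag, max_lag + 1)
    tdmi = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag < 0:
            tdmi[i] = mutual_information(x[-lag:], y[:lag], bins)
        elif lag > 0:
            tdmi[i] = mutual_information(x[:-lag], y[lag:], bins)
        else:
            tdmi[i] = mutual_information(x, y, bins)
    return lags, tdmi

Running this with a few different values of bins on the same pair of traces is a quick way to see how strong the binning dependence actually is.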

@theideasmith
Collaborator Author

I realize this is becoming somewhat of a series of my personal random thoughts on this matter, so let it be.

I was thinking we should not only compare topological and functional clusters, but also functional connectivity and static connectivity. Something to keep in mind here (I think) is that functional/statistical connectivity may at first appear different from structural connectivity, but once you take physiological details into account (such as the type of neurotransmitter), it may make sense. We can also check whether the information-flow trajectories predicted by the static connectome match the functional information-flow pipelines (using a delayed cross-correlation / delayed mutual information analysis).
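A rough sketch of what that functional-vs-structural comparison could look like, assuming we already have an (N, N) functional matrix of pairwise delayed-MI or delayed-correlation scores and an (N, N) structural matrix from the static connectome (the helper name and the choice of a Spearman rank correlation are just one possible option):

import numpy as np
from scipy.stats import spearmanr

def compare_connectivity(functional, structural):
    # functional: (N, N) pairwise delayed-MI / delayed-correlation scores.
    # structural: (N, N) adjacency or synapse-count matrix from the static connectome.
    # Compare the two sets of off-diagonal edge weights with a rank correlation.
    functional = np.asarray(functional, dtype=float)
    structural = np.asarray(structural, dtype=float)
    mask = ~np.eye(functional.shape[0], dtype=bool)
    rho, p_value = spearmanr(functional[mask], structural[mask])
    return rho, p_value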

@lukeczapla
Collaborator

Yes, I think taking the neurotransmitter into account, for instance, may give a lot more insight. The analysis may be missing something if it doesn't look at the relationship between neurotransmitters, network connectivity, and function.
I was following what you described regarding mutual information, and calculating joint PDFs can be a difficult task, whether in the context of analyzing known data or (even more so) in analyzing models. The answers to some questions can themselves be found in that kind of analysis: a multidimensional space can often be understood through a single parameter or a handful of parameters rather than through the whole space itself, and multidimensional spaces are complicated enough on their own. By the way, is there a meeting tomorrow on chat?


Uiuran commented May 28, 2019

(quoting @theideasmith's comment from Jun 2, 2016, above)

Is there trustworthy Python TDMI code? Isn't it better to code it directly from the paper, or to translate the MATLAB version? A histogram is too noisy for a measure aimed at sensitivity...


Uiuran commented Jun 3, 2019

It is possible to use a kNN estimator for TDMI, as in

jakobrunge/tigramite#36
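Tigramite ships its own estimators, but as a minimal stand-alone sketch, scikit-learn's mutual_info_regression (a Kraskov-style kNN estimator) could replace the histogram MI at each lag; the function below is only illustrative:

import numpy as np
from sklearn.feature_selection import mutual_info_regression

def knn_tdmi(x, y, max_lag, n_neighbors=3):
    # kNN (Kraskov-style) MI between x(t) and y(t + lag),
    # avoiding the histogram binning choice entirely.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lags = np.arange(-max_lag, max_lag + 1)
    tdmi = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag < 0:
            xs, ys = x[-lag:], y[:lag]
        elif lag > 0:
            xs, ys = x[:-lag], y[lag:]
        else:
            xs, ys = x, y
        tdmi[i] = mutual_info_regression(xs.reshape(-1, 1), ys,
                                         n_neighbors=n_neighbors)[0]
    return lags, tdmi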
