Skip to content

bark-rach/ContinuousMIEstimation

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

======BASIC DESCRIPTION:=====

This software implements the modified Kraskov-Stoegbauer-Grassberger (KSG)
estimator of mutual information between two continuous variables, with
modifications introduced by Holmes and Nemenman in

1. Holmes, CM. & Nemenman, I.  Estimation of mutual information for
real-valued data with error bars and controlled bias. Submitted, 2018.

Please cite the reference [1] above when using this software. The software
is COPYRIGHTED by Holmes and Nemenman, 2018, and is distributed under
GPL 3. Please read the license information in the attached file.

The original KSG estimator was introduced in

2. Kraskov, A., Stoegbauer, H., & Grassberger, P. (2004). Estimating
mutual information. Phys Rev E, 69(6), 066138.

3. Stoegbauer, H., Kraskov, A., Astakhov, S.A., & Grassberger,
P. (2004), Least Dependent Component Analysis Based on Mutual
Information. Physical Review E, 70, 066123.

The code for the KSG estimator was copyrighted and distributed under
GPL 3 in 2009, by Astakhov, Kraskov, Stoegbauer, and Grassberger. The
original KSG code is distributed with no changes together  with this
software for ease of use and completeness. Please cite references [2]
and [3] when using this software as well.

All comments, suggestions, and proposed modifications should be
communicated to the authors:
	     Caroline M. Holmes <cholmes@princeton.edu>
	     Ilya Nemenman <ilya.nemenman@emory.edu>

The main GitHub repository for this software is:
    	     https://github.com/EmoryUniversityTheoreticalBiophysics/ContinuousEntropyEstimation
The software can be downloaded and modified there.

The software requires a C compiler with standard libraries, as well as
a newer version of Matlab (we have run it using Matlab R2016b, but
newer and older versions may also work).


=====CONTENT OF THE PACKAGE=====

The contents of this folder are as follows:

1. A folder titled ‘kraskovStoegbauerGrassberger’. This contains the
files that perform the nearest-neighbor mutual information estimation
using the KSG estimator (see Refs. [2,3]). The files are distributed
with no change and copyrighted by their original authors. You will need
to do a few things in this folder before you can use the estimator:

	a. In MatLab command line, execute the following command:
	      	     setenv('PATH', [getenv('PATH') ':[PACKAGE_PATH]/kraskovStoegbauerGrassberger']);
            where [PACKAGE PATH] is the path to the directory where
	    this package was installed (path to the directory
	    containing this README). This should work on Linux, Mac
	    and other Unix-based system; path in Windows will require
	    different syntaxis.
		     
	    
	b. In terminal, navigate to the kraskovGrassberger folder and
	    compile the code, using something like:
              gcc -c miutils.C -o miutils.o
	      gcc MIxnyn.C -o MIxnyn miutils.o
	
2. The .m files described in the text of Ref. [1]. For examples of how
to run these files on data, see the ExampleScripts below.

	findMI_KSG_subsampling.m - This function calculates the mutual
	   information between X and Y both for the full data set and for
	   a series of nonoverlapping subsets, and outputs both that
	   information for each of the subsets. This also allows the user
	   to check for sample size dependent bias in the mutual
	  information estimate. 
	findMI_KSG_stddev.m - This function calculates the error bars
	   for the mutual information estimate, using the chi-squared
	   method described in [1], which involves extrapolating from
	   the variances at smaller N’s to the variance at the full
	   sample size. This function takes as an input the outputs of
	   findMI_KSG_subsampling.m.  
	findMI_KSG_bias_kN.m - This function calls both
	   findMI_KSG_subsampling.m and findMI_KSG_stddev.m. It
	   performs the information estimates at various values of k,
	   allowing the user to check the k-dependence of the mutual
	   information estimate, and outputs the mutual information
	   and error bars for all requested values of k. 
	reparamaterize_data.m - As discussed in Ref. [1],
	   reparamaterizing data to a gaussian can aid in mutual
	   information estimates. This function reparamaterizes the
	   input variable to a gaussian.  

3. A folder titled ‘ExampleScripts’, which contains scripts to perform
analyses similar to those in the figures of Ref. [1]. In order to run these, all 
contents of the continuousMIEstimation package should be on the path.

	a. gaussianExample.m - this function performs mutual information
	    estimation on correlated gaussian data. It generates
	    figures similar to Figs. 2, 3 in Ref. [1], which show how
	    error bars are calculated, and how mutual information
	    estimates depends on k and on N.
	
	b. logNormalExample.m - this shows how error bars are
	    calculated and how mutual information estimates depend on
	    k and N for log-normal bivariate date. It also
	    demonstrates the effects of reparameterizing the data and
	    generates the equivalent of Fig. 4 in Ref. [1].

	c. higherDimensionalExample.m - This script performs analyses
	    similar to (a) and (b) but with higher dimensional
	    Gaussian inputs. It is equivalent to Fig. 6 in Ref. [1].

	d. NfkappaBDataAnalysis.m -- this function performs mutual
	   information estimation on N-kappaBData.mat. It generates two
   	   plots comparing the N-dependence of mutual information
	   estimates with and without reparamaterization. The version
	   with reparamaterization is equivalent to Fig. 7 in Ref. [1].  
	
	e. NfkappaBData.mat -- this is the data file for data used in
            Fig. 7 in Ref. [1]. The data lists single cell NF-kappaB
            (P65 nuclear localization) and p-ATF-2 activation and in
            response to a doze of the TNF stimulus. See Ref. [1] and

	    	  Cheong, R., Rhee, A., Wang, C. J., Nemenman, I., &
	    	  Levchenko, A. (2011). Information transduction
	    	  capacity of noisy biochemical signaling
	    	  networks. Science 334(6054), 354–358.

	    for additional details about the data.
		  
	f.  finchDataAnalysis.m -- [CMH: INCLUDE AND DESCRIBE]

	g. finchData.mat - this file contains the data used by
	    finchDataAnalysis.m.  [CMH: DESCRIBE THE DATA, REFERENCE
	    EXPERIMENTAL PAPER]
	

About

Holmes-Nemenman modified KSG continuous entropy estimator

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • MATLAB 76.7%
  • C 21.4%
  • C++ 1.8%
  • Makefile 0.1%