# [Conditional mutual information](@id quickstart_mutualinfo)

When conditional mutual information is estimated using a `ConditionalMutualInformationEstimator`, some form of bias correction is usually applied. The `FPVP` estimator is a popular choice.

using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z
x = randn(n)
y = randn(n) .+ x
z = randn(n) .+ y
condmutualinfo(GaussianCMI(), x, z, y) # defaults to `CMIShannon()`
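
For comparison, the unconditional mutual information I(X; Z) should be clearly positive for this chain, since X influences Z through Y. Below is a minimal sketch of that comparison; it assumes the parametric `GaussianMI` estimator and `mutualinfo` follow the same calling convention as `GaussianCMI` and `condmutualinfo` above.

```julia
using CausalityTools

n = 1000
# A chain X → Y → Z, as above
x = randn(n)
y = randn(n) .+ x
z = randn(n) .+ y

# Assumes `GaussianMI`/`mutualinfo` mirror the `GaussianCMI`/`condmutualinfo` calls above.
# I(X; Z) should be clearly positive: X and Z are dependent through Y ...
mutualinfo(GaussianMI(), x, z)
# ... while I(X; Z | Y) should be close to zero.
condmutualinfo(GaussianCMI(), x, z, y)
```
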
using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z
x = rand(Normal(-1, 0.5), n)
y = rand(BetaPrime(0.5, 1.5), n) .+ x
z = rand(Chisq(100), n)
z = (z ./ std(z)) .+ y

# We expect zero (in practice: very low) CMI when computing I(X; Z | Y), because
# the link between X and Z is exclusively through Y, so when observing Y,
# X and Z should appear independent.
condmutualinfo(FPVP(k = 5), x, z, y) # defaults to `CMIShannon()`
using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z
x = rand(Normal(-1, 0.5), n)
y = rand(BetaPrime(0.5, 1.5), n) .+ x
z = rand(Chisq(100), n)
z = (z ./ std(z)) .+ y

# We expect zero (in practice: very low) CMI when computing I(X; Z | Y), because
# the link between X and Z is exclusively through Y, so when observing Y,
# X and Z should appear independent.
condmutualinfo(MesnerShalizi(k = 10), x, z, y) # defaults to `CMIShannon()`
using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z
x = rand(Normal(-1, 0.5), n)
y = rand(BetaPrime(0.5, 1.5), n) .+ x
z = rand(Chisq(100), n)
z = (z ./ std(z)) .+ y

# We expect zero (in practice: very low) CMI when computing I(X; Z | Y), because
# the link between X and Z is exclusively through Y, so when observing Y,
# X and Z should appear independent.
condmutualinfo(CMIShannon(base = 10), Rahimzamani(k = 10), x, z, y)
using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z
x = rand(Normal(-1, 0.5), n)
y = rand(BetaPrime(0.5, 1.5), n) .+ x
z = rand(Chisq(100), n)
z = (z ./ std(z)) .+ y

# We expect zero (in practice: very low) CMI when computing I(X; Z | Y), because
# the link between X and Z is exclusively through Y, so when observing Y,
# X and Z should appear independent.
condmutualinfo(CMIRenyiPoczos(base = 2, q = 1.2), PoczosSchneiderCMI(k = 5), x, z, y)
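
Conversely, I(X; Y | Z) should be clearly positive for this chain, because X and Y are directly coupled, and conditioning on the downstream variable Z does not remove that dependence. A minimal sketch, reusing the `FPVP` estimator from above:

```julia
using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z, as above
x = rand(Normal(-1, 0.5), n)
y = rand(BetaPrime(0.5, 1.5), n) .+ x
z = rand(Chisq(100), n)
z = (z ./ std(z)) .+ y

# I(X; Y | Z) should be clearly positive, since the direct X → Y link
# is not explained away by conditioning on Z.
condmutualinfo(FPVP(k = 5), x, y, z)
```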

## Estimation using MutualInformationEstimators

Any `MutualInformationEstimator` can also be used to compute conditional mutual information via the chain rule of mutual information. However, naively applying these estimators this way performs no bias correction when taking the difference of mutual information terms.
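
Concretely, the chain rule of mutual information gives

```math
I(X; Z \mid Y) = I(X; Y, Z) - I(X; Y),
```

so the conditional term is obtained as a difference of two separately estimated mutual informations, whose estimation errors do not cancel.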

using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z
x = rand(Normal(-1, 0.5), n)
y = rand(BetaPrime(0.5, 1.5), n) .+ x
z = rand(Chisq(100), n)
z = (z ./ std(z)) .+ y

# We expect zero (in practice: very low) CMI when computing I(X; Z | Y), because
# the link between X and Z is exclusively through Y, so when observing Y,
# X and Z should appear independent.
condmutualinfo(CMIShannon(base = 2), KSG1(k = 5), x, z, y)

## Estimation using DifferentialEntropyEstimators

Any `DifferentialEntropyEstimator` can also be used to compute conditional mutual information as a sum of differential entropies. However, naively applying these estimators this way performs no bias correction when summing the entropy terms.
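
Written out, the sum of entropies in question is the standard four-term decomposition

```math
I(X; Z \mid Y) = h(X, Y) + h(Y, Z) - h(X, Y, Z) - h(Y),
```

where h denotes differential entropy and each term is estimated separately with the chosen estimator.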

using CausalityTools
using Distributions
n = 1000
# A chain X → Y → Z
x = rand(Epanechnikov(0.5, 1.0), n)
y = rand(Erlang(1), n) .+ x
z = rand(FDist(5, 2), n) .+ y
condmutualinfo(CMIShannon(), Kraskov(k = 5), x, z, y)

## Estimation using ProbabilitiesEstimators

Any `ProbabilitiesEstimator` can also be used to compute conditional mutual information as a sum of discrete entropies. However, naively applying these estimators this way performs no bias correction when summing the entropy terms.

using CausalityTools
using Distributions
n = 1000
# A chain X → Y → Z
x = rand(Epanechnikov(0.5, 1.0), n)
y = rand(Erlang(1), n) .+ x
z = rand(FDist(5, 2), n) .+ y
est = ValueHistogram(RectangularBinning(5))
# I(X; Z | Y) (left) should be close to zero, while I(X; Y | Z) (right) should be
# clearly positive, since X and Y are directly coupled.
condmutualinfo(CMIShannon(), est, x, z, y), condmutualinfo(CMIShannon(), est, x, y, z)
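
Discretization-based estimates can be sensitive to the choice of binning. As a rough check, one can recompute the estimate with a finer rectangular binning and compare; the bin count of 10 below is an arbitrary choice for illustration.

```julia
using CausalityTools
using Distributions

n = 1000
# A chain X → Y → Z, as above
x = rand(Epanechnikov(0.5, 1.0), n)
y = rand(Erlang(1), n) .+ x
z = rand(FDist(5, 2), n) .+ y

# Compare a coarse and a finer binning; the qualitative conclusion
# (I(X; Z | Y) close to zero) should not depend strongly on the bin count.
est_coarse = ValueHistogram(RectangularBinning(5))
est_fine = ValueHistogram(RectangularBinning(10))
condmutualinfo(CMIShannon(), est_coarse, x, z, y), condmutualinfo(CMIShannon(), est_fine, x, z, y)
```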