
Create InverseCovariance Subclass of EmpiricalCovariance #3

Closed
mnarayan opened this issue Jun 12, 2016 · 5 comments

@mnarayan
Member

The QUIC algorithm estimates the inverse covariance using the sample covariance estimate $S$ as an input:
$\hat{\Theta} = \arg\max_{\Theta} \; \log\det(\Theta) - \mathrm{Tr}(\Theta S) - \lambda \|\Theta\|_1$

As a result, it makes sense for the sample covariance input to QUIC to be computed using the methods inherited from the EmpiricalCovariance class.
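
For reference, the sample-covariance piece is already covered by the standard sklearn API; a quick illustration of the inherited machinery (toy data, nothing skggm-specific):

```python
import numpy as np
from sklearn.covariance import EmpiricalCovariance

# Toy data: n samples, p features.
rng = np.random.RandomState(0)
X = rng.randn(50, 5)

# fit() computes the sample covariance; covariance_, precision_,
# score() and error_norm() all come with the class.
emp = EmpiricalCovariance().fit(X)
S = emp.covariance_  # sample covariance estimate, the input to QUIC
```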

Additionally, the log-likelihood and error metrics for the inverse covariance differ from those of the covariance matrix, so the corresponding methods in EmpiricalCovariance will need to be overridden where relevant. We need the ones below; a rough code sketch follows the list.

  • Negative log-likelihood for $\hat{\Theta}$: $\mathrm{Tr}(\hat{\Theta} S) - \log\det(\hat{\Theta})$
  • KL loss for $(\Theta, \hat{\Theta})$: $\mathrm{Tr}(\hat{\Theta}^{-1}\Theta) - \log\det(\hat{\Theta}^{-1}\Theta) - p$
  • Quadratic loss: $\mathrm{Tr}((\hat{\Theta}\Sigma - I)^2)$
  • Frobenius loss (remains the same, but computed on the precision rather than the covariance)
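
A minimal NumPy sketch of these metrics as standalone functions, following the formulas above literally (the names here are illustrative, not a proposed API):

```python
import numpy as np


def negative_log_likelihood(S, theta_hat):
    # Tr(Theta_hat S) - logdet(Theta_hat)
    return np.trace(theta_hat.dot(S)) - np.linalg.slogdet(theta_hat)[1]


def kl_loss(theta, theta_hat):
    # Tr(Theta_hat^{-1} Theta) - logdet(Theta_hat^{-1} Theta) - p
    p = theta.shape[0]
    A = np.linalg.solve(theta_hat, theta)  # Theta_hat^{-1} Theta, no explicit inverse
    return np.trace(A) - np.linalg.slogdet(A)[1] - p


def quadratic_loss(sigma, theta_hat):
    # Tr((Theta_hat Sigma - I)^2)
    p = sigma.shape[0]
    M = theta_hat.dot(sigma) - np.eye(p)
    return np.trace(M.dot(M))


def frobenius_loss(theta, theta_hat):
    # Same norm as for the covariance, but applied to the precision matrices
    return np.linalg.norm(theta - theta_hat, ord='fro')
```
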
@jasonlaska
Member

I see. Are there any particular methods of interest that we would be inheriting from EmpiricalCovariance?

If not, it seems like inheriting from a class that is not intended to be a "base class" or "mixin" could introduce unexpected dependencies / bugs / behaviors. I do see that GraphLasso does inherit from EmpiricalCovariance so perhaps you are right.

@mnarayan
Member Author

I would want to inherit all the defaults in EmpiricalCovariance for estimating the covariance, but override precision related things as well as the loss functions. Do you have an alternative suggestion?

Or are you concerned that EmpiricalCovariance is not a good base class?
Most of the classes in sklearn.covariance inherit from EmpiricalCovariance. For example, shrunk_covariance has many ridge-type shrinkage estimators that inherit from EmpiricalCovariance. GraphLasso is the odd one out among the classes defined there, in that it estimates the inverse. sklearn.covariance does not have much dedicated infrastructure (loss functions, etc.) for the inverse.
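
A rough skeleton of what such a subclass might look like (just a sketch: the class name and method bodies are placeholders, the QUIC solver call is omitted, and only the sklearn method names score and error_norm are assumed):

```python
import numpy as np
from sklearn.covariance import EmpiricalCovariance
from sklearn.utils.extmath import fast_logdet


class InverseCovariance(EmpiricalCovariance):
    """Sketch: reuse EmpiricalCovariance for the sample covariance,
    but treat the precision matrix as the parameter of interest."""

    def fit(self, X, y=None):
        # Inherited machinery computes the sample covariance S.
        super(InverseCovariance, self).fit(X)
        # Placeholder: a real implementation would run QUIC on
        # self.covariance_ here and store the result in self.precision_.
        return self

    def score(self, X_test, y=None):
        # Log-likelihood parameterized by the precision matrix:
        # -Tr(Theta_hat S_test) + logdet(Theta_hat)
        S_test = np.cov(X_test, rowvar=False)
        return -np.sum(S_test * self.precision_) + fast_logdet(self.precision_)

    def error_norm(self, comp_prec, norm='frobenius'):
        # Frobenius loss computed on the precision rather than the covariance.
        if norm != 'frobenius':
            raise NotImplementedError(norm)
        return np.linalg.norm(self.precision_ - comp_prec, ord='fro')
```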

@mnarayan
Member Author

I went back to look at sklearn's EmpiricalCovariance class. I'm not so sure using it as a base is a good idea.
That class is conceptually a bit of a mess, as its methods mix up covariance and precision quantities. I had assumed the class had a clean conceptual separation between methods for the covariance and methods for the inverse covariance.

For example, the log-likelihood depends on which parameter is of interest. EmpiricalCovariance defines the log-likelihood for the inverse covariance as

 log_likelihood_ = - np.sum(emp_cov * precision) + fast_logdet(precision)

instead of the log-likelihood for the covariance, which would be

 log_likelihood_ = - np.sum(emp_cov * precision) - fast_logdet(covariance)

Maybe, implementation-wise, this is just a minor issue? For instance, we could stick with the current EmpiricalCovariance as the base and later mark the needed modifications to both EmpiricalCovariance and InverseCovariance.
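
(Implementation-wise the two expressions do agree when precision is the exact inverse of covariance, since logdet(precision) = -logdet(covariance); a quick check, purely for illustration:)

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.randn(4, 4)
covariance = A.dot(A.T) + 4 * np.eye(4)  # a random SPD "covariance"
precision = np.linalg.inv(covariance)
emp_cov = covariance                      # stand-in for the empirical covariance

ll_precision_form = -np.sum(emp_cov * precision) + np.linalg.slogdet(precision)[1]
ll_covariance_form = -np.sum(emp_cov * precision) - np.linalg.slogdet(covariance)[1]
print(np.allclose(ll_precision_form, ll_covariance_form))  # True
```

So the parameterization is mainly a naming/conceptual issue rather than a numerical one.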

References:
http://biomet.oxfordjournals.org/content/98/4/807.abstract

@jasonlaska
Member

jasonlaska commented Jun 14, 2016

A few Qs:

  • For the KL loss above, what is $p$? According to this: https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence#Kullback.E2.80.93Leibler_divergence_for_multivariate_normal_distributions, I guess it's the dimension of the matrix?
  • Forgive me, but it's been a while since I've written code like this: is there a better way to compute That^{-1}T than pinv(That) * T in Python? We're looking for something like the equivalent of MATLAB's That\T? I guess the internet says this: linalg.solve(a, b) (see the quick comparison after this list).
  • -(Trace(That*S) - logdet(That)) above works out to - np.sum(emp_cov * precision) + fast_logdet(precision) if Theta is the inverse covariance matrix; was your comment about the names of the parameters in their log_likelihood_ function?
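
For illustration, the two approaches to That^{-1}T side by side; np.linalg.solve avoids forming the (pseudo)inverse explicitly (random SPD stand-ins for the matrices):

```python
import numpy as np

rng = np.random.RandomState(0)
B, C = rng.randn(5, 5), rng.randn(5, 5)
theta_hat = B.dot(B.T) + 5 * np.eye(5)  # stand-in for the estimate That (SPD)
theta = C.dot(C.T) + 5 * np.eye(5)      # stand-in for the true precision T

via_pinv = np.linalg.pinv(theta_hat).dot(theta)  # pinv(That) * T
via_solve = np.linalg.solve(theta_hat, theta)    # "That \ T" in MATLAB terms
print(np.allclose(via_pinv, via_solve))          # True
```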

@mnarayan
Member Author

I'm a bit swamped. Will take some time before I address all comments.

You are correct about p.
Regarding point 2, we don't want to use pinv anywhere in our losses, since all estimates are stored in precision_. Given some true covariance and a precision estimate, just do covariance * precision_.

For our class, what you printed is correct for the inverse covariance log-likelihood.
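
In code, the idea reads roughly like this (a sketch of that interpretation only: the KL-type loss written in terms of the product of the true covariance and the stored precision estimate, so no matrix inversion is needed):

```python
import numpy as np


def kl_loss(true_covariance, precision_):
    # Tr(Sigma Theta_hat) - logdet(Sigma Theta_hat) - p, using the stored
    # precision estimate directly; no pinv or solve required.
    p = true_covariance.shape[0]
    A = true_covariance.dot(precision_)
    return np.trace(A) - np.linalg.slogdet(A)[1] - p
```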

