
Create InverseCovariance Subclass of EmpiricalCovariance #3

Closed
mnarayan opened this issue Jun 12, 2016 · 5 comments

@mnarayan
Member

The QUIC algorithm estimates the inverse covariance using the sample covariance estimate $S$ as an input:
$\hat{\Theta} = \arg\max_{\Theta} \; \log\det(\Theta) - \mathrm{Tr}(\Theta S) - \lambda \|\Theta\|_1$

As a result, it makes sense for the sample covariance input to QUIC to be computed using the methods inherited from the EmpiricalCovariance class.
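
For reference, the sample-covariance piece is already covered by the standard sklearn API; a quick illustration of the inherited machinery (toy data, nothing skggm-specific):

```python
import numpy as np
from sklearn.covariance import EmpiricalCovariance

# Toy data: n samples, p features.
rng = np.random.RandomState(0)
X = rng.randn(50, 5)

# fit() computes the sample covariance; covariance_, precision_,
# score() and error_norm() all come with the class.
emp = EmpiricalCovariance().fit(X)
S = emp.covariance_  # sample covariance estimate, the input to QUIC
```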

Additionally, the log-likelihood and error metrics for the inverse covariance differ from those of the covariance matrix, so the corresponding methods in EmpiricalCovariance will need to be overridden where relevant. We need the ones below; a rough code sketch follows the list.

  • Negative log-likelihood for $\hat{\Theta}$: $\mathrm{Tr}(\hat{\Theta} S) - \log\det(\hat{\Theta})$
  • KL loss for $(\Theta, \hat{\Theta})$: $\mathrm{Tr}(\hat{\Theta}^{-1}\Theta) - \log\det(\hat{\Theta}^{-1}\Theta) - p$
  • Quadratic loss: $\mathrm{Tr}((\hat{\Theta}\Sigma - I)^2)$
  • Frobenius loss (remains the same, but computed on the precision rather than the covariance)
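
A minimal NumPy sketch of these metrics as standalone functions, following the formulas above literally (the names here are illustrative, not a proposed API):

```python
import numpy as np


def negative_log_likelihood(S, theta_hat):
    # Tr(Theta_hat S) - logdet(Theta_hat)
    return np.trace(theta_hat.dot(S)) - np.linalg.slogdet(theta_hat)[1]


def kl_loss(theta, theta_hat):
    # Tr(Theta_hat^{-1} Theta) - logdet(Theta_hat^{-1} Theta) - p
    p = theta.shape[0]
    A = np.linalg.solve(theta_hat, theta)  # Theta_hat^{-1} Theta, no explicit inverse
    return np.trace(A) - np.linalg.slogdet(A)[1] - p


def quadratic_loss(sigma, theta_hat):
    # Tr((Theta_hat Sigma - I)^2)
    p = sigma.shape[0]
    M = theta_hat.dot(sigma) - np.eye(p)
    return np.trace(M.dot(M))


def frobenius_loss(theta, theta_hat):
    # Same norm as for the covariance, but applied to the precision matrices
    return np.linalg.norm(theta - theta_hat, ord='fro')
```
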
@jasonlaska
Member

I see. Are there any particular methods of interest that we would be inheriting from EmpiricalCovariance?

If not, it seems like inheriting from a class that is not intended to be a "base class" or "mixin" could introduce unexpected dependencies / bugs / behaviors. I do see that GraphLasso does inherit from EmpiricalCovariance so perhaps you are right.

@mnarayan
Member Author

I would want to inherit all the defaults in EmpiricalCovariance for estimating the covariance, but override precision related things as well as the loss functions. Do you have an alternative suggestion?

Or are you concerned that EmpiricalCovariance is not a good base class?
Most of the classes in sklearn.covariance inherit from EmpiricalCovariance. For example, shrunk_covariance has many ridge-type shrinkage estimators that inherit from EmpiricalCovariance. GraphLasso is the odd one out among the classes defined there, in that it estimates the inverse. sklearn.covariance does not have much dedicated infrastructure (loss functions, etc.) for the inverse.
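
A rough skeleton of what such a subclass might look like (just a sketch: the class name and method bodies are placeholders, the QUIC solver call is omitted, and only the sklearn method names score and error_norm are assumed):

```python
import numpy as np
from sklearn.covariance import EmpiricalCovariance
from sklearn.utils.extmath import fast_logdet


class InverseCovariance(EmpiricalCovariance):
    """Sketch: reuse EmpiricalCovariance for the sample covariance,
    but treat the precision matrix as the parameter of interest."""

    def fit(self, X, y=None):
        # Inherited machinery computes the sample covariance S.
        super(InverseCovariance, self).fit(X)
        # Placeholder: a real implementation would run QUIC on
        # self.covariance_ here and store the result in self.precision_.
        return self

    def score(self, X_test, y=None):
        # Log-likelihood parameterized by the precision matrix:
        # -Tr(Theta_hat S_test) + logdet(Theta_hat)
        S_test = np.cov(X_test, rowvar=False)
        return -np.sum(S_test * self.precision_) + fast_logdet(self.precision_)

    def error_norm(self, comp_prec, norm='frobenius'):
        # Frobenius loss computed on the precision rather than the covariance.
        if norm != 'frobenius':
            raise NotImplementedError(norm)
        return np.linalg.norm(self.precision_ - comp_prec, ord='fro')
```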

@mnarayan
Member Author

I went back to look at sklearn's EmpiricalCovariance class. I'm not so sure using it as a base is a good idea.
That class is conceptually a bit of a mess, as its methods mix up covariance and precision quantities. I had assumed the class had a clean conceptual separation between methods for the covariance and methods for the inverse covariance.

For example, the log-likelihood depends on which parameter is of interest. EmpiricalCovariance defines the log-likelihood for the inverse covariance as

 log_likelihood_ = - np.sum(emp_cov * precision) + fast_logdet(precision)

instead of the log-likelihood for the covariance, which would be

 log_likelihood_ = - np.sum(emp_cov * precision) - fast_logdet(covariance)

Maybe, implementation-wise, this is just a minor issue? For instance, we could stick with the current EmpiricalCovariance as the base and later mark the needed modifications to both EmpiricalCovariance and InverseCovariance.
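
(Implementation-wise the two expressions do agree when precision is the exact inverse of covariance, since logdet(precision) = -logdet(covariance); a quick check, purely for illustration:)

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.randn(4, 4)
covariance = A.dot(A.T) + 4 * np.eye(4)  # a random SPD "covariance"
precision = np.linalg.inv(covariance)
emp_cov = covariance                      # stand-in for the empirical covariance

ll_precision_form = -np.sum(emp_cov * precision) + np.linalg.slogdet(precision)[1]
ll_covariance_form = -np.sum(emp_cov * precision) - np.linalg.slogdet(covariance)[1]
print(np.allclose(ll_precision_form, ll_covariance_form))  # True
```

So the parameterization is mainly a naming/conceptual issue rather than a numerical one.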

References:
http://biomet.oxfordjournals.org/content/98/4/807.abstract

@jasonlaska
Member

jasonlaska commented Jun 14, 2016

A few Qs:

  • For the KL loss above, what is $p$? According to this: https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence#Kullback.E2.80.93Leibler_divergence_for_multivariate_normal_distributions, I guess it's the dimension of the matrix?
  • Forgive me, but it's been a while since I've written code like this: is there a better way to compute That^{-1}T than pinv(That) * T in Python? We're looking for something like the equivalent of MATLAB's That\T? I guess the internet says this: linalg.solve(a, b) (see the quick comparison after this list).
  • -(Trace(That*S) - logdet(That)) above works out to - np.sum(emp_cov * precision) + fast_logdet(precision) if Theta is the inverse covariance matrix; was your comment about the names of the parameters in their log_likelihood_ function?
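
For illustration, the two approaches to That^{-1}T side by side; np.linalg.solve avoids forming the (pseudo)inverse explicitly (random SPD stand-ins for the matrices):

```python
import numpy as np

rng = np.random.RandomState(0)
B, C = rng.randn(5, 5), rng.randn(5, 5)
theta_hat = B.dot(B.T) + 5 * np.eye(5)  # stand-in for the estimate That (SPD)
theta = C.dot(C.T) + 5 * np.eye(5)      # stand-in for the true precision T

via_pinv = np.linalg.pinv(theta_hat).dot(theta)  # pinv(That) * T
via_solve = np.linalg.solve(theta_hat, theta)    # "That \ T" in MATLAB terms
print(np.allclose(via_pinv, via_solve))          # True
```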

@mnarayan
Member Author

I'm a bit swamped. Will take some time before I address all comments.

You are correct about p.
Regarding point 2, we don't want to use pinv anywhere in our losses, since all estimates are stored in precision_. Given some true covariance and a precision estimate, just do covariance * precision_.

For our class, what you printed is correct for the inverse covariance log-likelihood.
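
In code, the idea reads roughly like this (a sketch of that interpretation only: the KL-type loss written in terms of the product of the true covariance and the stored precision estimate, so no matrix inversion is needed):

```python
import numpy as np


def kl_loss(true_covariance, precision_):
    # Tr(Sigma Theta_hat) - logdet(Sigma Theta_hat) - p, using the stored
    # precision estimate directly; no pinv or solve required.
    p = true_covariance.shape[0]
    A = true_covariance.dot(precision_)
    return np.trace(A) - np.linalg.slogdet(A)[1] - p
```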

