This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Add Covariance and PearsonCorrelation metrics #1684

Merged
nelson-liu merged 10 commits into allenai:master from pearson_correlation_metric on Aug 29, 2018

nelson-liu

This PR implements an online algorithm for calculating Covariance and the sample Pearson correlation coefficient.

This was actually nontrivial; I mostly referenced the TensorFlow streaming_covariance metric in implementing this. Their implementation is a vectorized version of the weighted online algorithm described on the Wikipedia page "Algorithms for calculating variance".

The tests simply ensure that the streaming Covariance and PearsonCorrelation match up with what numpy would calculate, which I believe is a reasonable correctness check.
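
For readers following along, here is a minimal sketch of the kind of batch-wise online update described above, in plain Python. It is not the code from this PR: the names `StreamingCovariance`, `streaming_pearson`, and `two_pass_covariance` are hypothetical, and the sketch assumes unweighted inputs and the sample (n - 1) normalization. It merges each batch's mean and co-moment into the running totals, then checks the result against a straightforward two-pass calculation, in the same spirit as comparing against numpy.

```python
import math
from typing import Iterable, Sequence, Tuple


class StreamingCovariance:
    """Hypothetical illustration of a streaming covariance; not the PR's metric."""

    def __init__(self) -> None:
        self.count = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.co_moment = 0.0

    def update(self, xs: Sequence[float], ys: Sequence[float]) -> None:
        if len(xs) != len(ys):
            raise ValueError("xs and ys must have the same length")
        n = len(xs)
        if n == 0:
            return
        # Statistics of the incoming batch on its own.
        batch_mean_x = sum(xs) / n
        batch_mean_y = sum(ys) / n
        batch_co_moment = sum(
            (x - batch_mean_x) * (y - batch_mean_y) for x, y in zip(xs, ys)
        )
        # Merge the batch into the running totals (pairwise merge formula).
        new_count = self.count + n
        delta_x = batch_mean_x - self.mean_x
        delta_y = batch_mean_y - self.mean_y
        self.co_moment += (
            batch_co_moment + delta_x * delta_y * self.count * n / new_count
        )
        self.mean_x += delta_x * n / new_count
        self.mean_y += delta_y * n / new_count
        self.count = new_count

    def get(self) -> float:
        if self.count < 2:
            raise ValueError("need at least two observations")
        # Assumption: sample covariance, i.e. divide by (n - 1).
        return self.co_moment / (self.count - 1)


def streaming_pearson(
    batches: Iterable[Tuple[Sequence[float], Sequence[float]]]
) -> float:
    """Pearson r from three streaming covariances: (x, y), (x, x), (y, y)."""
    cov_xy = StreamingCovariance()
    var_x = StreamingCovariance()
    var_y = StreamingCovariance()
    for xs, ys in batches:
        cov_xy.update(xs, ys)
        var_x.update(xs, xs)
        var_y.update(ys, ys)
    return cov_xy.get() / math.sqrt(var_x.get() * var_y.get())


def two_pass_covariance(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Reference calculation over the full data, used only for checking."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Feed the same data in uneven batches and compare with the two-pass result.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 1.0, 4.0, 3.0, 7.0, 5.0]
batches = [(xs[:2], ys[:2]), (xs[2:5], ys[2:5]), (xs[5:], ys[5:])]

metric = StreamingCovariance()
for batch_x, batch_y in batches:
    metric.update(batch_x, batch_y)
assert math.isclose(metric.get(), two_pass_covariance(xs, ys))
```

Keeping only a count, two means, and a co-moment means no batch has to be stored after it is seen, which is what makes this usable as a metric accumulated over training batches.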

def __init__(self) -> None:
    self._total_prediction_mean = 0.0
    self._total_label_mean = 0.0
    self._total_comoment = 0.0

nit: it's very hard for my brain not to look at comoment and feel like someone made a typo in comment, which is jarring. would you consider co_moment instead? 😀

(feel free to tell me I'm being a crazy person)

@nelson-liu nelson-liu merged commit 2a45f44 into allenai:master Aug 29, 2018
@nelson-liu nelson-liu deleted the pearson_correlation_metric branch August 29, 2018 15:44
gabrielStanovsky pushed a commit to gabrielStanovsky/allennlp that referenced this pull request Sep 7, 2018
This PR implements an online algorithm for calculating Covariance and the sample Pearson correlation coefficient.

This was actually nontrivial, I mostly referenced the tensorflow [streaming_covariance metric](https://github.com/tensorflow/tensorflow/blob/4dcfddc5d12018a5a0fdca652b9221ed95e9eb23/tensorflow/contrib/metrics/python/ops/metric_ops.py#L3127-L3264) in implementing this. Their implementation is a vectorized version of the weighted algorithm [on this wikipedia page](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online)

The tests simply ensure that the streaming Covariance and PearsonCorrelation match up with what numpy would calculate, which I believe is a reasonable correctness check.