Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add configerable average function #23

Merged
merged 13 commits into from Feb 21, 2021

Conversation

multimeric
Copy link
Contributor

Closes #22, see discussion there.

Uncertainties:

  • Have I covered all public APIs, ensuring they can all be configured?
  • The test statistic ends up being negative, and therefore with a p-value of 1 when used to compare a standard normal and t distribution in the test_different_distributions. Does this make sense, or is it revealing a flaw in the code somewhere?

@codecov
Copy link

codecov bot commented Feb 18, 2021

Codecov Report

Merging #23 (b10e89b) into develop (e735155) will increase coverage by 0.07%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop      #23      +/-   ##
===========================================
+ Coverage    95.95%   96.03%   +0.07%     
===========================================
  Files           18       18              
  Lines         1137     1159      +22     
===========================================
+ Hits          1091     1113      +22     
  Misses          46       46              
Impacted Files Coverage Δ
dcor/homogeneity.py 100.00% <ø> (ø)
dcor/tests/test_independence.py 100.00% <ø> (ø)
dcor/_energy.py 91.66% <100.00%> (+0.75%) ⬆️
dcor/tests/test_homogeneity.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e735155...b10e89b. Read the comment docs.

@vnmabus
Copy link
Owner

vnmabus commented Feb 18, 2021

* Have I covered all public APIs, ensuring they can all be configured?

I think so, although the documentation for the public function is missing (and in a future PR I should merge the public and _imp versions of the functions, as this was only done to achieve keyword only parameters for Python 2, which is no longer supported).

* The test statistic ends up being negative, and therefore with a p-value of 1 when used to compare a standard normal and t distribution in the `test_different_distributions`. Does this make sense, or is it revealing a flaw in the code somewhere?

You mean the statistic with the mean, median or both?

@multimeric
Copy link
Contributor Author

You mean the statistic with the mean, median or both?

I mean with the median. Using the mean is one of your tests, which passes.

@vnmabus
Copy link
Owner

vnmabus commented Feb 19, 2021

You mean the statistic with the mean, median or both?

I mean with the median. Using the mean is one of your tests, which passes.

Ok, I have checked the implementation and it looks ok. The only explanation that I see is that the differences between the Gaussian and t-Student distributions are in the tails of the distribution, and the median is not taking this information into account. Maybe it will notice the difference with a higher number of samples, but it would be very costly. So I would add a test between two different enough distributions with the same mean, and call it a day, unless you have a better explanation.

dcor/_energy.py Outdated Show resolved Hide resolved
dcor/_energy.py Outdated Show resolved Hide resolved
dcor/_energy.py Outdated Show resolved Hide resolved
dcor/homogeneity.py Outdated Show resolved Hide resolved
dcor/homogeneity.py Outdated Show resolved Hide resolved
dcor/homogeneity.py Outdated Show resolved Hide resolved
dcor/tests/test_homogeneity.py Outdated Show resolved Hide resolved
dcor/tests/test_homogeneity.py Show resolved Hide resolved
dcor/homogeneity.py Outdated Show resolved Hide resolved
@vnmabus vnmabus merged commit 71c36e0 into vnmabus:develop Feb 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Energy distance using medians?
2 participants