Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct the sigmas in TwoHalfNorm #143

Merged
merged 7 commits into from
Mar 5, 2024
Merged

Correct the sigmas in TwoHalfNorm #143

merged 7 commits into from
Mar 5, 2024

Conversation

zihaoxu98
Copy link
Collaborator

This PR fixes problem #142

If we have the two asymmetric errors, and simply use them as the sigmas in the TwoHalfNorm, the 16 and 84 percentile won't be correct. This PR solves this issue by numerically solving the sigmas that require 16 and 84 percentile to be the errors. To test it,

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
from scipy.stats import chi2

import appletree as apt
from appletree.randgen import TwoHalfNorm

errors = [100, 20]

sigmas = errors
X = TwoHalfNorm.rvs(sigma_pos=sigmas[0], sigma_neg=sigmas[1], size=int(1e6))
inv_cdf = interp1d(np.linspace(0, 1, len(X)), np.sort(X))
plt.hist(X, bins=100)
plt.show()
print("Without correction:")
print(f"+1 sigma is at {inv_cdf((1 + chi2.cdf(1, 1)) / 2):.2f}, which should be {errors[0]}")
print(f"-1 sigma is at {inv_cdf((1 - chi2.cdf(1, 1)) / 2):.2f}, which should be {-errors[1]}")

sigmas = apt.utils.errors_to_two_half_norm_sigmas(errors)
X = TwoHalfNorm.rvs(sigma_pos=sigmas[0], sigma_neg=sigmas[1], size=int(1e6))
inv_cdf = interp1d(np.linspace(0, 1, len(X)), np.sort(X))
plt.hist(X, bins=100)
plt.show()
print("With correction:")
print(f"+1 sigma is at {inv_cdf((1 + chi2.cdf(1, 1)) / 2):.2f}, which should be {errors[0]}")
print(f"-1 sigma is at {inv_cdf((1 - chi2.cdf(1, 1)) / 2):.2f}, which should be {-errors[1]}")

which outputs

Without correction:
+1 sigma is at 131.10, which should be 100
-1 sigma is at -1.20, which should be -20
With correction:
+1 sigma is at 100.03, which should be 100
-1 sigma is at -20.03, which should be -20

and the distributions before and after the correction

Screenshot 2024-02-28 at 9 56 42 PM Screenshot 2024-02-28 at 9 56 56 PM

@coveralls
Copy link

coveralls commented Feb 29, 2024

Pull Request Test Coverage Report for Build 8158678085

Details

  • 14 of 14 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.08%) to 83.815%

Totals Coverage Status
Change from base Build 8150218355: 0.08%
Covered Lines: 2061
Relevant Lines: 2459

💛 - Coveralls

Copy link
Collaborator

@dachengx dachengx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally think that the functionality is what the description promised. But how users should interpret what sigma_pos and sigma_neg represent and the roots given by errors_to_two_half_norm_sigmas? We might need to define the meaning of the two groups somewhere.

@dachengx
Copy link
Collaborator

dachengx commented Mar 3, 2024

I generally think that the functionality is what the description promised. But how users should interpret what sigma_pos and sigma_neg represent and the roots given by errors_to_two_half_norm_sigmas? We might need to define the meaning of the two groups somewhere.

Also, can the roots given by errors_to_two_half_norm_sigmas provide the narrowest range among all the ranges with the same coverage?

@zihaoxu98
Copy link
Collaborator Author

I generally think that the functionality is what the description promised. But how users should interpret what sigma_pos and sigma_neg represent and the roots given by errors_to_two_half_norm_sigmas? We might need to define the meaning of the two groups somewhere.

the sigma_pos and sigma_neg are the std of the glued gaussians, while they are not the correct percentiles. So errors_to_two_half_norm_sigmas converts 16&84 percentiles to the correct sigmas. I can add more documentation somewhere.

Also, can the roots given by errors_to_two_half_norm_sigmas provide the narrowest range among all the ranges with the same coverage?

The equation to be solved requires 1. x=0 has the highest pdf 2. 16&84 percentiles are at x=+-error. In that sense the solution is unique.

@dachengx dachengx merged commit 65fff49 into master Mar 5, 2024
7 checks passed
@dachengx dachengx deleted the two_half_gaussian branch March 5, 2024 15:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants