Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue17 #19

Merged
merged 2 commits into from
Jul 21, 2016
Merged

Fix issue17 #19

merged 2 commits into from
Jul 21, 2016

Conversation

pjz
Copy link
Contributor

@pjz pjz commented Jun 15, 2016

Potential fix for #17. Since there's only one value present, I assume a normal distribution with a standard deviation that gets lower the more samples there are of that value.

def test_quantile_with_single_centroid_at_zero(self, empty_tdigest):
td = empty_tdigest
td.update(0)
assert td.quantile(0) == 0.5
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be 0.5? The quantile is the value p s.t. F(q) = p. If F is equal to 0 before 0, and 1 at and after 0, then I would expect this to be 1. (That's kind of confusing because of all the 0 and 1...)

@CamDavidsonPilon
Copy link
Owner

I was thinking of a solution like this:

def ...

        if len(self) == 1:                                 
            return int(q >= self.C.min_key())

        for i, key in enumerate(self.C.keys()):

The CDF is a degenerate distribution, so the values should return 0 or 1, depending on if q is above what we have seen thus far. What do you think?

@pjz
Copy link
Contributor Author

pjz commented Jun 22, 2016

You're probably correct ; my thought process was more like "there's one centroid, they asked for the mean, so clearly they should get 0.5". Not only is your reasoning better, but also since quantile and percentile are inverses, it should be true that x == td.quantile(td.percentile(x)) and x = td.percentile(td.quantile(x)), which fails if it's 0.5 as I suggested, but works if it's 1, as you suggested.

@pjz
Copy link
Contributor Author

pjz commented Jul 21, 2016

The failure here is a bit inexplicable, not even in the code I changed, I think, and only in py3.3: 2.7 and 3.5 pass fine. Can you prod it to re-run the tests?

@CamDavidsonPilon
Copy link
Owner

Rerunning - apologies, I didn't see you update the code

@CamDavidsonPilon CamDavidsonPilon merged commit 7770eeb into CamDavidsonPilon:master Jul 21, 2016
This was referenced Jul 21, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants