Make median aggregator always return a double. #653

Merged: 1 commit, Feb 6, 2020

@ftence ftence commented Feb 5, 2020

TLDR: Median aggregator must always return a double to avoid confusion.

A widely accepted definition of the median, such as the one given by NIST, takes the median of an even number of elements to be the mean of the two middle elements. Under that definition, the median of a list of floating-point numbers is always a floating-point number, while the median of a list of integers is non-integral with probability 0.25, assuming the parity of the list length and the parity of the difference between the two middle elements are each equally likely. With probability 0.5 the list has an odd number of elements and the median is an integer; with probability 0.25 the list has an even number of elements and the difference between the two elements used to compute the mean is even, so the median is again an integer; and with probability 0.25 the list has an even number of elements and that difference is odd, so the median is not an integer. The simplest solution is thus to always return a double.
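As a rough illustration of the three cases above, here is a minimal Python sketch of a median that always returns a floating-point value. It is not the code changed by this PR; the function name `median_as_double` and the sample inputs are hypothetical.

```python
def median_as_double(values):
    """Median using the 'mean of the two middle elements' rule, always a float."""
    if not values:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd count: the middle element, converted so the type never varies.
        return float(ordered[mid])
    # Even count: mean of the two middle elements.
    return (ordered[mid - 1] + ordered[mid]) / 2.0


# Hypothetical integer inputs, one per case discussed above.
cases = {
    "odd length": [1, 2, 3],
    "even length, even difference": [1, 3],
    "even length, odd difference": [1, 2],
}

for label, values in cases.items():
    result = median_as_double(values)
    print(label, result, "integral" if result.is_integer() else "non-integral")
```

Only the last case yields a non-integral value, yet all three results share one type, which is the point of always returning a double.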

This PR also fixes the following unexpected behavior, where the type of the reduced GTS depends on which values are present in the inputs:

[
  [ 1 2 3 ] [] [] []
  [ 1 2 3 ]
  MAKEGTS
  [ 1   2   3   ] [] [] []
  [ 2.3 3.2 4.1 ]
  MAKEGTS
]
// Returns a GTS of doubles
[ SWAP NULL reducer.median ] REDUCE

[
  [ 1 2 3 ] [] [] []
  [ 1 2 3 ]
  MAKEGTS
  [     2   3   ] [] [] []
  [     3.2 4.1 ]
  MAKEGTS
]
// Returns a GTS of longs
[ SWAP NULL reducer.median ] REDUCE
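The two cases above can be modelled in plain Python to see where the types diverge. This is only a sketch under assumptions: each GTS is represented as a hypothetical tick-to-value dict, `naive_median` stands in for a median that keeps the input type when only one value is present, and the idea that the first reduced value influences the type of the resulting GTS is an assumption rather than something stated in this PR.

```python
def naive_median(values):
    """Median that returns the middle element unchanged for odd counts."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]  # keeps the input type (int stays int)
    return (ordered[mid - 1] + ordered[mid]) / 2


def reduce_per_tick(series_list, aggregate):
    """Apply aggregate to the values available at each tick."""
    ticks = sorted({tick for series in series_list for tick in series})
    return {
        tick: aggregate([s[tick] for s in series_list if tick in s])
        for tick in ticks
    }


# Hypothetical dict versions of the series built with MAKEGTS above.
longs = {1: 1, 2: 2, 3: 3}
doubles_full = {1: 2.3, 2: 3.2, 3: 4.1}
doubles_partial = {2: 3.2, 3: 4.1}

full = reduce_per_tick([longs, doubles_full], naive_median)
partial = reduce_per_tick([longs, doubles_partial], naive_median)
print(type(full[1]).__name__)     # float: two values at tick 1
print(type(partial[1]).__name__)  # int: only the long is present at tick 1

# Converting every result to float removes the dependence on the inputs.
fixed = reduce_per_tick([longs, doubles_partial], lambda v: float(naive_median(v)))
print({tick: type(value).__name__ for tick, value in fixed.items()})
```

In this model the only difference between the two runs is the value at the first tick, which is an integer when the second series has no point there.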

@hbs hbs merged commit 21f9efa into senx:master Feb 6, 2020
@ftence ftence deleted the median_double branch February 7, 2020 09:51