f32 or i32 support #91

Closed
selvavm opened this issue Jun 19, 2020 · 9 comments

Comments

@selvavm

selvavm commented Jun 19, 2020

First, thanks for your contribution. Your crate is really well documented and easy to use. However, I am not able to use it because it only supports u64.

It would be nice to have support for negative numbers (i32) and floating-point numbers (f32). For f32, I can just multiply by a scaling factor and treat the result as i32, but support for negative numbers is a feature I really need.
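
The scaling workaround mentioned above can be sketched roughly as follows. This is a std-only illustration, not part of the crate: `to_scaled` and `from_scaled` are hypothetical helper names, and the scale factor (for example 1000.0 to keep three decimal places) is an assumption you would pick for your own data.

```rust
/// Hypothetical helper: convert a non-negative float into an integer number of
/// `1 / scale` units so it can be recorded in an unsigned histogram.
/// Returns None for negative, NaN, or infinite inputs.
fn to_scaled(value: f32, scale: f64) -> Option<u64> {
    // Do the arithmetic in f64 to limit rounding error before truncating.
    let scaled = (f64::from(value) * scale).round();
    if scaled.is_finite() && scaled >= 0.0 {
        Some(scaled as u64)
    } else {
        None
    }
}

/// Hypothetical inverse of `to_scaled`, for reporting results as floats again.
fn from_scaled(units: u64, scale: f64) -> f64 {
    units as f64 / scale
}
```

With a scale of 1000.0, `to_scaled(1.5, 1000.0)` gives `Some(1500)`, which could then be passed to the histogram's `record`. Negative inputs are rejected here, so this only covers the floating-point half of the request.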

@jonhoo
Collaborator

jonhoo commented Jun 19, 2020

Thank you! Much of it is also due to @marshallpierce's excellent work.

The way HDR Histograms work isn't compatible with negative numbers, at least not without modifications. This is also the case for the original Java version; see:
https://github.com/HdrHistogram/HdrHistogram/blob/665acdea6840ac499c36cdfa18d43c63ebec12dc/src/main/java/org/HdrHistogram/AbstractHistogram.java#L2404

Floating-point values can be supported, but they need what is basically a completely separate implementation that wraps an inner integer-based histogram. It'd be neat to port DoubleHistogram to Rust as well, but unfortunately it's a larger effort that I don't have time for myself at the moment. If you want to give it a try, I'd be happy to review it. It should be a fairly straightforward translation of the Java version I linked above! I wonder if we may even be able to have a single Histogram type by providing inherent implementations for Histogram<f64>... Not entirely clear. Initially I'd just have a separate DoubleHistogram type.

@marshallpierce
Collaborator

For negative numbers, you could use two u16 histograms to cover the same range as a single i32. Store the positive numbers in one and the negative in another and then sum the two bucket by bucket whenever you're done.
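
A minimal sketch of the two-histogram idea, assuming a plain `BTreeMap` of value to count as a stand-in for the crate's histogram type (so it compiles with the standard library alone and does no HDR bucketing). `SignedCounts` and its methods are hypothetical names, and reading the results in ascending signed order is one possible way to combine the two sides.

```rust
use std::collections::BTreeMap;

/// Stand-in for an unsigned histogram: value -> count.
type Counts = BTreeMap<u64, u64>;

/// Hypothetical wrapper holding one histogram per sign.
#[derive(Default)]
struct SignedCounts {
    negative: Counts,     // keyed by magnitude |v| for v < 0
    non_negative: Counts, // keyed by v for v >= 0
}

impl SignedCounts {
    /// Record a signed value into the histogram matching its sign.
    fn record(&mut self, v: i32) {
        let side = if v < 0 {
            &mut self.negative
        } else {
            &mut self.non_negative
        };
        // unsigned_abs avoids overflow for i32::MIN.
        *side.entry(u64::from(v.unsigned_abs())).or_insert(0) += 1;
    }

    /// Walk all recorded (value, count) pairs in ascending signed order:
    /// largest negative magnitude first, then the non-negative values.
    fn iter_signed(&self) -> impl Iterator<Item = (i64, u64)> + '_ {
        let neg = self.negative.iter().rev().map(|(&m, &c)| (-(m as i64), c));
        let pos = self.non_negative.iter().map(|(&v, &c)| (v as i64, c));
        neg.chain(pos)
    }
}
```

With the real crate, each `Counts` would presumably be a `Histogram<u64>` and `record` would forward the magnitude to it.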

@selvavm
Author

selvavm commented Jun 20, 2020

Thanks @jonhoo for pointing to the JavaDoc. I didn't know negative numbers are not supported.
Thanks @marshallpierce for your suggestion. I had thought about that approach too, but was lost on how to implement it. Let me explain my requirement.

I want my result to have 50 bins. Since HDRHistogram supports percentile iteration, I thought I would iterate in 2% steps (ticks_per_half_distance is 1%). This works perfectly for positive numbers. It also works if my time series contains values in -m...+m; I can use 25 bins for the negative histogram and 25 bins for the positive histogram.

However, my time series contains random values. For example, the unique values are [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9] or [-2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10]. Any idea how I can use your suggestion?

After @jonhoo pointed that out, I looked through their issues and found this extract:

You can probably do that by using two histograms (one for positive and one for negative). But it raises interesting questions: e.g. what is the 99%'ile value supposed to represent (99%'ile from lowest signed value, or 99%'ile of absolute value)? Both have uses.

I believe that if they had provided a solution for the 99%'ile from the lowest signed value, I would be able to use it.

@marshallpierce
Collaborator

Perhaps you could get to the "99%ile from lowest signed value" this way:

  • have two histograms, one for positive and one for negative
  • record each value into the appropriate histogram
  • at the end of the run, calculate the absolute value of the most negative number, let's say m
  • iterate all the values from both histograms. For each value, add m to it, and store the result in a third histogram. This effectively shifts the values so that all are non-negative.
  • calculate whatever percentiles you please from the third histogram
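
The steps above can be sketched as follows. This is a std-only illustration under stated assumptions: a `BTreeMap` of value to count stands in for the crate's histogram (so values are exact, with no HDR bucketing), the negative side is keyed by magnitude, and `shift_non_negative` and `signed_value_at_quantile` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Stand-in for an unsigned histogram: value -> count.
type Counts = BTreeMap<u64, u64>;

/// Build the "third histogram": every value shifted up by `m`, where `m` is
/// the magnitude of the most negative recorded value (0 if there is none).
/// `negative` is keyed by magnitude. Returns (m, shifted counts).
fn shift_non_negative(negative: &Counts, positive: &Counts) -> (u64, Counts) {
    let m = negative.keys().next_back().copied().unwrap_or(0);
    let mut shifted = Counts::new();
    for (&mag, &c) in negative {
        *shifted.entry(m - mag).or_insert(0) += c; // (-mag) + m
    }
    for (&v, &c) in positive {
        *shifted.entry(v + m).or_insert(0) += c;
    }
    (m, shifted)
}

/// Smallest value such that at least a fraction `q` (0.0..=1.0) of all samples
/// are less than or equal to it, mapped back to the signed domain.
fn signed_value_at_quantile(shifted: &Counts, m: u64, q: f64) -> Option<i64> {
    let total: u64 = shifted.values().sum();
    if total == 0 {
        return None;
    }
    // Rank of the sample we are looking for, clamped to 1..=total.
    let target = ((q * total as f64).ceil() as u64).max(1).min(total);
    let mut seen = 0;
    for (&v, &c) in shifted {
        seen += c;
        if seen >= target {
            return Some(v as i64 - m as i64); // undo the shift
        }
    }
    None
}
```

For the 50-bin use case discussed earlier, one could call `signed_value_at_quantile` at q = 0.02, 0.04, and so on up to 1.0. Subtracting `m` at the end is what turns the shifted result into a percentile measured from the lowest signed value.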

@selvavm
Author

selvavm commented Jul 3, 2020

Thanks @marshallpierce; I will try this. Sorry for the late response.

@selvavm
Author

selvavm commented Jul 3, 2020

I think I will close this issue as well. Thanks to both of you for the contribution and support :)

@selvavm selvavm closed this as completed Jul 3, 2020
@justinlovinger

justinlovinger commented Sep 17, 2023

I am interested in floating-point numbers. Has there been any development on that front since this issue was made?

@jonhoo
Collaborator

jonhoo commented Sep 23, 2023

There has not, no. The main work would be porting the (essentially completely different) DoubleHistogram from Java to Rust, which is a non-trivial effort as outlined in #91 (comment).

@justinlovinger

I opened a separate issue to track floating-point support independently of the negative-number discussion in this issue.
