feat: remove mizani as dependency, re-implement logic internally #271
AFAICT changes to snapshots are due to differences in rounding (and/or the resolution of mizani's table lookup).
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #271      +/-   ##
==========================================
+ Coverage   77.01%   81.87%   +4.86%
==========================================
  Files          40       41       +1
  Lines        4229     4310      +81
==========================================
+ Hits         3257     3529     +272
+ Misses        972      781     -191

View full report in Codecov by Sentry.
Thank you for the huge amount of work you put into this and the related PR! Will review shortly.
LGTM!
@abstractqqq in case it's useful for polars_ds, this PR should remove pandas as a dependency (except for importing from
This PR removes mizani as a dependency, so that we don't transitively depend on scipy and pandas. Note that our internal implementation does not depend on numpy, so that we can drop it as a dependency in a later PR.
This implementation is not as elegant as mizani's, or quite as optimized, but should do well for our purposes.
Speed
Our default implementation was about 50% slower than mizani in a simple test, but both are very fast for simple table displays (1.52 ms for 1,000 points for ours, 1 ms for mizani).
The main difference is that we use a simple bisect lookup to find the cutoff corresponding to a value (in order to get the coefficients for transforming within a cutoff band). Mizani uses a table lookup, which cuts the input/response space into 256 bins.
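To make the trade-off concrete, here is a minimal sketch of the two lookup strategies on a single numeric channel. It is illustrative only: the function names, the sample cutoffs, and the nearest-bin rounding are assumptions made for this example, not the code in this PR or in mizani.

```python
from bisect import bisect_right


def interpolate_bisect(x, cutoffs, values):
    """Piecewise-linear interpolation; the band is located with bisect.

    Hypothetical helper: `cutoffs` must be sorted and the same length as `values`.
    """
    if not cutoffs[0] <= x <= cutoffs[-1]:
        raise ValueError("x is outside the cutoff range")
    # Index of the band's left edge; clamp so x == cutoffs[-1] uses the last band.
    i = min(bisect_right(cutoffs, x) - 1, len(cutoffs) - 2)
    x0, x1 = cutoffs[i], cutoffs[i + 1]
    t = (x - x0) / (x1 - x0)  # position within the band, in [0, 1]
    return values[i] + t * (values[i + 1] - values[i])


def build_table(cutoffs, values, n_bins=256):
    """Precompute n_bins evenly spaced responses (the table-lookup approach)."""
    lo, hi = cutoffs[0], cutoffs[-1]
    return [
        interpolate_bisect(min(hi, lo + (hi - lo) * k / (n_bins - 1)), cutoffs, values)
        for k in range(n_bins)
    ]


def interpolate_table(x, table, lo, hi):
    """Return the precomputed response of the nearest bin (assumed rounding rule)."""
    k = round((x - lo) / (hi - lo) * (len(table) - 1))
    return table[k]


cutoffs = [0.0, 0.5, 1.0]
values = [0.0, 255.0, 0.0]  # e.g. one color channel rising, then falling
table = build_table(cutoffs, values)

exact = interpolate_bisect(0.25, cutoffs, values)
binned = interpolate_table(0.25, table, cutoffs[0], cutoffs[-1])
print(exact, binned)
```

The bisect version costs a binary search per value but interpolates exactly within the band. The table version is a constant-time index but quantizes the input to the nearest of 256 bins, so the two can disagree slightly for the same input.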
Fixes: #7