Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v2: conversion to utilize go1.18 generics #14

Merged
merged 4 commits into from
Nov 14, 2022
Merged

v2: conversion to utilize go1.18 generics #14

merged 4 commits into from
Nov 14, 2022

Conversation

mroth
Copy link
Owner

@mroth mroth commented Mar 22, 2022

This will improve the ergonomics of this library since we currently use an interface{} container type.

However, that will require go1.18+, which will be a breaking change, thus the plan will be to roll a v1.0 release on the previous version, and do a v2.0 semantic release for this change (requiring an import path change for modules consumers).

Due to the mechanics of v2 releases in Go modules, this will require a bit of thinking as to how I will handle. For now, this branch/PR represents the playground for adding generics support.

(Additionally, I have experimental fuzz testing in the seperate fuzz branch, which may be rolled into here for the release as well.)

@mroth
Copy link
Owner Author

mroth commented Mar 22, 2022

Quick benchmarks show very minor but inconsistent performance improvements, a longer bake time looks like PickParallel might pick up 2-3% due to lack of type conversions on the hot path:

name                    old time/op  new time/op  delta
Pick/10-8               25.8ns ± 5%  25.8ns ±10%    ~     (p=0.912 n=10+10)
Pick/100-8              38.6ns ± 1%  38.1ns ± 2%  -1.18%  (p=0.002 n=10+10)
Pick/1000-8             52.2ns ± 0%  51.9ns ± 0%  -0.52%  (p=0.000 n=10+10)
Pick/10000-8            68.6ns ± 0%  68.4ns ± 0%  -0.36%  (p=0.000 n=9+10)
Pick/100000-8           93.7ns ± 0%  93.2ns ± 0%  -0.53%  (p=0.000 n=10+10)
Pick/1000000-8           128ns ± 0%   124ns ± 1%  -3.02%  (p=0.000 n=7+10)
PickParallel/10-8       2.96ns ± 9%  2.99ns ± 7%    ~     (p=0.579 n=10+10)
PickParallel/100-8      4.79ns ± 1%  4.67ns ± 1%  -2.56%  (p=0.000 n=10+10)
PickParallel/1000-8     6.59ns ± 0%  6.47ns ± 0%  -1.83%  (p=0.000 n=9+10)
PickParallel/10000-8    9.10ns ±11%  8.60ns ± 2%  -5.46%  (p=0.000 n=9+9)
PickParallel/100000-8   12.0ns ± 0%  12.0ns ± 0%  +0.24%  (p=0.018 n=9+10)
PickParallel/1000000-8  16.5ns ± 0%  16.0ns ± 0%  -3.41%  (p=0.000 n=9+9)

Confirming this is primarily an ergonomics change, rather than a performance one.

@codecov
Copy link

codecov bot commented Mar 23, 2022

Codecov Report

Merging #14 (8a10b41) into main (87e006d) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main       #14   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            1         1           
  Lines           39        41    +2     
=========================================
+ Hits            39        41    +2     
Impacted Files Coverage Δ
weightedrand.go 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 87e006d...8a10b41. Read the comment docs.

@mroth
Copy link
Owner Author

mroth commented Mar 23, 2022

Since v2 will break API compatibility it is a good time to consider any other ergonomic changes. One might be allowing the more common int types instead of uint for weights; and just ignoring negative values.

@mroth
Copy link
Owner Author

mroth commented Mar 28, 2022

cd3b27e introduces an Integer interface which allows any underlying int-like type to be used for a weight in a Choice (for now, the Chooser still stores internally as int).

Might also require some documentation updates to specify how negative weights are handled (or not handled, since they are ignored).

Needs validation of how things can be gracefully transitioned to use constraints.Integer if it is introduced to stdlib at a later time. I don't want to import the experimental package if possible so defining my own type for now seems simple enough, but the need for it to be public could cause problems with API compatibility, since I could technically never get rid of it without doing another breaking semver change in case someone imported it (I don't think anyone would, but if we're playing by the rules). I could type Alias it in the future, it still would be there making our GoDocs ugly...

@mroth mroth force-pushed the v2 branch 2 times, most recently from 51345e7 to 8a10b41 Compare April 30, 2022 02:36
@steve-gray
Copy link

Great package, enjoyed using it so far - so was keen to follow the generification - but ended up being impatient.

With the use of an Integer type, this makes the package a little more complicated to use in the context of scientific applications/mathematics as almost all "interesting" data tends to be various degrees of floating point. With that in mind, I'd rolled my own version of a weighted random picker WeightedPicker[T].

When using a fixed float64 weight, and a "On the user be-it" model to ensure the ranges don't over-run the range (the inserts fail), I got performance pretty much within a rounding error versus yours (mine left, your package right), so I didn't see any sense around playing with the underlying value:

image

However, to allow for varadic input weight types, what I've done is introduce the concept of a domain-mapper function that maps from whatever numeric type I want to float64:

// DomainMapper maps the value from the input type to the domain of floats
type DomainMapper[V generics.Numeric] func(v V) float64

// UnsignedFloatMapper clamps all values to the positive float range
var UnsignedFloatMapper = func(v float64) float64 {
	return math.Abs(v)
}

Then internally, its float64 to the moon. Any thoughts around something like this making it into the main code? Then in principle users can choose their poison on the weights.

@mroth
Copy link
Owner Author

mroth commented Aug 21, 2022

Then internally, its float64 to the moon. Any thoughts around something like this making it into the main code? Then in principle users can choose their poison on the weights.

Very cool. I've thought about float support, but that would change some of the semantics of how the "score" works here conceptually: if I was going that direction, I'd might use a different algorithm (Vose's alias method) that is even faster I've been wrapping my head around, and put in a new library where I can change the API accordingly: https://www.keithschwarz.com/darts-dice-coins/

Needs perf testing and validation of how things can be gracefully
transitioned to use contraints.Integer if it is introduced to stdlib
at a later time. Might also require some documentation updates to
specify how negative weights are handled (or not handled, since
they are ignored).
@mroth mroth merged commit 0892404 into main Nov 14, 2022
@mroth mroth deleted the v2 branch November 14, 2022 20:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants