Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.Sign up
Variants of Map.find #7277
Original bug ID: 7277
I'm suggesting to add the following functions (impl is trivial):
val find_le : key -> 'a t -> (key * 'a) option * 'a option
find_le k m returns a pair where the left component is the binding with the biggest key smaller than k (if existing) and the right component is the value for k (if existing). find_ge symmetrically returns the value for k and the binding with the smallest key bigger than k.
We can of course already do that with Map.split, but in this case the submaps are constructed which may be expensive. So the motivation is performance.
Use case: Imagine you need a mapping where the keys are disjoint ranges (left,right), and you want to implement a function that returns the mapped value for any number in that range. With find_le you can do that by only putting the left points of the ranges into the map (and an additional member check).
Alternatives: The API could be a little bit "smaller" (not returning the exact match):
val find_lt : key -> 'a t -> (key * 'a) option
Or it could be "bigger" (always returning the previous and the next elements):
val find_around : key -> 'a t -> (key * 'a) option * 'a option * (key * 'a) option
IMHO, this is matter of taste. The use case I know wouldn't profit from find_around.
I'll happily submit a PR when the addition is accepted (and the style has been agreed upon).
Comment author: @alainfrisch
I'm concerned by the number of small variants around the proposal that would all look as useful (e.g. find_lt, find_lte, ...). Since performance is the key reason here, the most general form (find_around) might not be enough (but still sometimes useful, since it would be more efficient than multiple calls to simple functions).
Gerd: in your case, would find_lt/find_gt be enough?
A slightly more general interface would be:
val find_first: (key -> bool) -> 'a t -> (key * 'a) option
where the predicate is assumed to be monotonic. From these function, one can create find_lt/find_gt, but also variants with non-strict comparison (find_lte/find_gte), and the overhead could be acceptable.
Comment author: gerd
Well, the interface design would be a little bit simpler if flambda could optimize the unused parts of find_around away. However, to my understanding flambda cannot do this (you need to optimize the whole recursion, and track unused values through it - can somebody verify? - I've attached a sample impl of find_around).
Sure, find_lt/find_gt would be good enough for my application, i.e. a one-sided function. There is still the question where to return an option or whether to raise Not_found. The latter would be more consistent with the existing interface.
find_first/last is also a nice idea (and would solve my problem).
Comment author: @paurkedal
I like Alain's proposal, as it reminds my of Dedekind cuts, and after a bit reflection I realised that it extends the interface beyond the four possible inequalities: