Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Stdlib] Implement find_first, find_last for maps and sets #869

Merged
merged 6 commits into from
Nov 12, 2016

Conversation

g2p
Copy link
Contributor

@g2p g2p commented Oct 24, 2016

Finds the first or last binding or element that satisfies a monotonic predicate.

See discussion in #665.

@g2p g2p force-pushed the map-find-first branch 2 times, most recently from bff462e to 2a8db17 Compare October 24, 2016 21:13
@g2p g2p changed the title Implement find_first, find_last for maps and sets [Stdlib] Implement find_first, find_last for maps and sets Oct 25, 2016
@alainfrisch
Copy link
Contributor

Making the case for extending the stdlib is always difficult. Can you for instance provide pointers to publicly available code that would benefit from the new functions?

@bluddy
Copy link
Contributor

bluddy commented Oct 31, 2016

I've had to implement these functions before, and because of type opaqueness, had to copy-paste the entire map implementation. I think this is a valuable addition.

@yakobowski
Copy link
Contributor

@alainfrisch https://github.com/Frama-C/Frama-C-snapshot/blob/master/src/libraries/stdlib/FCSet.mli, functions nearest_elt_le and nearest_elt_ge, are very similar to what is proposed here.

@alainfrisch
Copy link
Contributor

Thanks @bluddy and @yakobowski .

@yakobowski : Would the addition of the functions proposed here avoid a custom implementation of sets in Frama-C?

@yakobowski
Copy link
Contributor

@alainfrisch I think so, yes. The standard library has caught up with the other changes we used to have (such as merge).

@dbuenzli
Copy link
Contributor

Couldn't new functions use options rather than Not_found exceptions ?

@OvermindDL1
Copy link

As a user, please, 'something option is so much better than exceptions...

/me would give a lot to have a set of functions for the stdlib like MyMap.get myKey myMap that returned an option instead of throwing exceptions all over the place, plus an Option module that had things like map or andThen and so forth...

@bluddy
Copy link
Contributor

bluddy commented Oct 31, 2016

Consistency. Unless we deprecate and rebuild the whole stdlib (not a bad idea but not the place to do it), we can't just start using options. Also, doesn't option allocate more than using exceptions? You can't always count on Flambda to eliminate the allocation.

@dbuenzli
Copy link
Contributor

Consistency. Unless we deprecate and rebuild the whole stdlib (not a bad idea but not the place to do it), we can't just start using options.

Why not ? I don't think one should enshrine design errors in the name of consistency. There's absolutely no loss here at not being consistent you'll be guided by the types. It only makes the API easier and safer to use.

@OvermindDL1 you might be interested by asetmap.

@OvermindDL1
Copy link

OvermindDL1 commented Oct 31, 2016

@OvermindDL1 you might be interested by asetmap.

Lots of useful libraries out there, but I am trying to depend only on stdlib and nothing else for a library that I am building. I keep having to write wrappers around various stdlib things, it is compounding a lot. ^.^

And yes, I find 'something option significantly safer, forces you to at least think about the error cases instead of just trying to remember to Blah.mem first, or whatever lookup test there is in a module, the naming is anything but consistent; it would indeed be very nice to rewrite stdlib from scratch with all modern styles that have been learned over time for efficiency and safety both, release an OCaml 5.0.0 for that.

Why can't Flambda eliminate the allocation? It seems that both option and Result.t would be two very good cases to optimize if not generically then specifically.

EDIT0: Re-making the stdlib after Implicit Modules exist that is, and preferably concurrency and no GIL so everything can be made properly immutable or safe in some way...

EDIT1: Do feel free to ignore me, just hard to resist piping in when I saw a conversation between throwing or option, throwing is very... flow-control disrupting, makes things harder to reason...

@g2p
Copy link
Contributor Author

g2p commented Oct 31, 2016

I want find_first to be consistent with find and throw an exception.
Thanks to exception matching, usage is just as simple as matching on an option.
My code that depends on this isn't public yet (these functions are used to implement a B+Tree), but with this and #665 we have at least five users relying on independent reimplementations of similar functions.

@OvermindDL1
Copy link

OvermindDL1 commented Oct 31, 2016

Thanks to exception matching, usage is just as simple as matching on an option

Exception matching is nice, but it is still easily ignorable and forgettable, where an option mandates making sure to handle both cases and not worrying about something weird happening, like the callstack suddenly unwinding who knows how high just because of some quickly written code. ^.^

@alainfrisch
Copy link
Contributor

About Not_found vs option: I tend to be on @dbuenzli side here. Consistency is a nice goal, but being consistent with a bad design for the sake of consistency only does not seem right. A point could be made that if we plan to add option-returning variants with a specific naming scheme (e.g. find_opt, assoc_opt, etc), the new functions should directly follow that naming scheme

@bluddy
Copy link
Contributor

bluddy commented Nov 1, 2016

Wow, @alainfrisch -- that was unexpected.

If anything though, we'd want to label the exception-based functions with _exn as Jane Street does. That would hint that they're less desirable, and convey some information about their usage of exception.

@alainfrisch
Copy link
Contributor

Also, doesn't option allocate more than using exceptions?

Yes, but setting up the exception handler can be costly as well, and you more or less have to do it around each call anyway.

A very quick test with:

let rec find p = function
  | [] -> raise Not_found
  | x :: l -> if p x then x else find p l

let rec find_opt p = function
  | [] -> None
  | x :: l -> if p x then Some x else find_opt p l

let f1 () =
  try find p l
  with Not_found -> 0

let f2 () =
  match find_opt p l with
  | Some x -> x
  | None -> 0

suggests that the find_opt + match..with version is about 50% faster than find + try..with with let p _ = true, and between 0% (for long lists) and 80% faster (for short lists) with let p _ = false.

@bluddy
Copy link
Contributor

bluddy commented Nov 1, 2016

Good to know. Could this be FLambda at work? Because it can't ever optimize setting up exceptions, but it can optimize options being returned if the call is inlined.

@dbuenzli
Copy link
Contributor

dbuenzli commented Nov 1, 2016

If anything though, we'd want to label the exception-based functions with _exn as Jane Street does.

This is quite ugly. Usually you can find better names conveing your intent for that. For example you can have find for the paths where you are unsure something matches (and hence returns an option) and get for the paths where you know by construction something has to match (and raises Invalid_argument if that thing was not in there).

@alainfrisch
Copy link
Contributor

Could this be FLambda at work?

No, timings were without flambda. Allocating short-lived blocks (such as an immediately deconstructed option) is very cheap (at least with the official OCaml runtime system; Javascript backends are another story).

@dra27
Copy link
Member

dra27 commented Nov 2, 2016

+1 to still having an exception version, but with a name which makes it clear why you'd use it (i.e. find+get) - assuming that it's the case that the exception version with no exception handler is faster than the 'a option version?

@alainfrisch
Copy link
Contributor

(#885 proposes to add non-raising variants for existing functions.)

@OvermindDL1
Copy link

assuming that it's the case that the exception version with no exception handler is faster than the 'a option version?

Most of the time you'd still have something like a MyMap.mem check before a MyMap.get, which would likely be more costly anyway.

@dra27
Copy link
Member

dra27 commented Nov 2, 2016

Most - but definitely not all the time. An obvious example being keyword symbols in a parser, and also any situation where you would have one mem call before more than one get call. Either Daniel's suggestion of using Invalid_argument (which is never supposed to be caught) and/or documentation should make it very clear that find_opt is probably what's wanted 95% of the time.

@alainfrisch
Copy link
Contributor

So, assuming #885 is accepted, what should we do here? To have a consistent naming, the new option-returning functions should be called *_opt, but if we don't plan to include the Not-found-raising versions, the suffix would be technically useless.

@dra27
Copy link
Member

dra27 commented Nov 4, 2016

Given your work in #885, there should obviously be no question that there must be option-returning versions. I also think there's enough here to suggest that there should also be exception-raising versions. The two big questions are:

  1. Should the exception-raising versions raise Not_found or Invalid_argument _. Despite the inconsistency, I think I'd be in favour of Invalid_argument, as it moves in the direction of encouraging correct use: although that's balanced against quite a "gotcha" of catching Not_found for analogous functions elsewhere.
  2. The precise names - for this, while I really prefer Daniel's suggest of intent-based naming (get/find, etc.), I think consistency of naming is probably better, so the option-returning versions should have _opt suffices, and the exception-raising versions be undecorated.

@gasche
Copy link
Member

gasche commented Nov 10, 2016

You could be forgiven by answering Alain's question, giving your external opinion on whether you think this is a good interface. (cc: @c-cube?)

@g2p
Copy link
Contributor Author

g2p commented Nov 10, 2016

The docstrings were changed per @hcarty's suggestions.

The interface was discussed in #665 as well, I picked the best option from that discussion.

@c-cube
Copy link
Contributor

c-cube commented Nov 10, 2016

I don't have a strong opinion on this interface, it looks ok to me. Maybe it would be good to document more explicitely what "monotonic" is (i.e. if [f k && compare k k' <= 0] holds, then [f k'] must hold), and maybe show a possible use case?

@g2p
Copy link
Contributor Author

g2p commented Nov 10, 2016

I added an example; I think it's enough to illustrate what a monotonic function is, since it's not especially ambiguous.

@hcarty
Copy link
Member

hcarty commented Nov 10, 2016

@gasche The interface looks OK to me as well. It seems reasonably clear from the type + name of each function what to expect.

@gasche
Copy link
Member

gasche commented Nov 10, 2016

I'll still let @alainfrisch make merge decisions. I don't want to commit to working on the standard-library-extension project, because I have too much on my plate already and it's frustrating work.

Re-reading this PR discussion, it looks like there is agreement that the interface is good, but it's not completely clear to me whether there is a consensus on the name of the exception-raising variants. I would personally be tempted to "let those who do the work decide" and go for @g2p's preference, but stdlib matters are thorny as current choices impact future choices and it's easy to have regrets in the future. Not my call!

@g2p: could you amend the Changelog entry to indicate the people whose review/discussion/proposals helped the PR move forward? You don't need to cite everyone involved in the discussion, but people whose reviewing effort you found substantial and impactful.

@dbuenzli
Copy link
Contributor

but it's not completely clear to me whether there is a consensus on the name of the exception-raising variants.

Since exception-raising variants are provided I think the current PR is the best naming scheme you can find.

@g2p
Copy link
Contributor Author

g2p commented Nov 10, 2016

Credits added, you're right. Thanks everyone for discussing and reviewing.

Finds the first/last binding where the key satisfies a monotonic
predicate.

Addresses ocaml#665, supersedes ocaml#868.
Thanks @alainfrisch for the idea and most of the implementation.
Finds the first/last element that satisfies a monotonic predicate.
Finds the first/last binding where the key satisfies a monotonic
predicate, returning an option.

Followup to ocaml#885.
Finds the first/last element that satisfies a monotonic predicate,
returning an option.

Followup to ocaml#885.
@g2p
Copy link
Contributor Author

g2p commented Nov 12, 2016

(rebased to fix a conflict in the change log)

@alainfrisch alainfrisch merged commit bf4b142 into ocaml:trunk Nov 12, 2016
@alainfrisch
Copy link
Contributor

Thanks to @g2p and everyone involving in the discussion for moving this forward.

camlspotter pushed a commit to camlspotter/ocaml that referenced this pull request Oct 17, 2017
Finds the first/last binding where the key satisfies a monotonic
predicate.

Addresses ocaml#665, supersedes ocaml#868.
Thanks @alainfrisch for the idea and most of the implementation.
@vicuna vicuna mentioned this pull request Mar 14, 2019
stedolan added a commit to stedolan/ocaml that referenced this pull request Oct 25, 2022
sadiqj pushed a commit to sadiqj/ocaml that referenced this pull request Feb 21, 2023
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Mar 21, 2023
25188da flambda-backend: Missed comment from PR802 (ocaml#887)
9469765 flambda-backend: Improve the semantics of asynchronous exceptions (new simpler version) (ocaml#802)
d9e4dd0 flambda-backend: Fix `make runtest` on NixOS (ocaml#874)
4bbde7a flambda-backend: Simpler symbols (ocaml#753)
ef37262 flambda-backend: Add opaqueness to Obj.magic under Flambda 2 (ocaml#862)
a9616e9 flambda-backend: Add build system hooks for ocaml-jst (ocaml#869)
045ef67 flambda-backend: Allow the compiler to build with stock Dune (ocaml#868)
3cac5be flambda-backend: Simplify Makefile logic for natdynlinkops (ocaml#866)
c5b12bf flambda-backend: Remove unnecessary install lines (ocaml#860)
ff12bbe flambda-backend: Fix unused variable warning in st_stubs.c (ocaml#861)
c84976c flambda-backend: Static check for noalloc: attributes (ocaml#825)
ca56052 flambda-backend: Build system refactoring for ocaml-jst (ocaml#857)
39eb7f9 flambda-backend: Remove integer comparison involving nonconstant polymorphic variants (ocaml#854)
c102688 flambda-backend: Fix soundness bug by using currying information from typing (ocaml#850)
6a96b61 flambda-backend: Add a primitive to enable/disable the tick thread (ocaml#852)
f64370b flambda-backend: Make Obj.dup use a new primitive, %obj_dup (ocaml#843)
9b78eb2 flambda-backend: Add test for ocaml#820 (include functor soundness bug) (ocaml#841)
8f24346 flambda-backend: Add `-dtimings-precision` flag (ocaml#833)
65c2f22 flambda-backend: Add test for ocaml#829 (ocaml#831)
7b27a49 flambda-backend: Follow-up PR#829 (comballoc fixes for locals) (ocaml#830)
ad7ec10 flambda-backend: Use a custom condition variable implementation (ocaml#787)
3ee650c flambda-backend: Fix soundness bug in include functor (ocaml#820)
2f57378 flambda-backend: Static check noalloc (ocaml#778)
aaad625 flambda-backend: Emit begin/end region only when stack allocation is enabled (ocaml#812)
17c7173 flambda-backend: Fix .cmt for included signatures (ocaml#803)
e119669 flambda-backend: Increase delays in tests/lib-threads/beat.ml (ocaml#800)
ccc356d flambda-backend: Prevent dynamic loading of the same .cmxs twice in private mode, etc. (ocaml#784)
14eb572 flambda-backend: Make local extension point equivalent to local_ expression (ocaml#790)
487d11b flambda-backend: Fix tast_iterator and tast_mapper for include functor. (ocaml#795)
a50a818 flambda-backend: Reduce closure allocation in List (ocaml#792)
96c9c60 flambda-backend: Merge ocaml-jst
a775b88 flambda-backend: Fix ocaml/otherlibs/unix 32-bit build (ocaml#767)
f7c2679 flambda-backend: Create object files internally to avoid invoking GAS (ocaml#757)
c7a46bb flambda-backend: Bugfix for Cmmgen.expr_size with locals (ocaml#756)
b337cb6 flambda-backend: Fix build_upstream for PR749 (ocaml#750)
8e7e81c flambda-backend: Differentiate is_int primitive between generic and variant-only versions (ocaml#749)

git-subtree-dir: ocaml
git-subtree-split: 25188da
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Factor out shared yaml file parsing code between the different ood data items. Simplifies the process of adding new data formats to ood.

* Result monad style
* Streamlining
* Spread
* Use Result in monad style
* Remove redefinition of Result.fold
* Formatting
* Use Result.map
* Use custom let instead of >>=
* Bring-up outreachy
* Use let* instead of >>=, cont't
* All Yaml key to be different from filename
* Refactoring
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Factor out shared yaml file parsing code between the different ood data items. Simplifies the process of adding new data formats to ood.

* Result monad style
* Streamlining
* Spread
* Use Result in monad style
* Remove redefinition of Result.fold
* Formatting
* Use Result.map
* Use custom let instead of >>=
* Bring-up outreachy
* Use let* instead of >>=, cont't
* All Yaml key to be different from filename
* Refactoring
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.