New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add some examples to the stdlib's documentation #11476
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I think that this is an excellent idea. See inline comments.
stdlib/atomic.mli
Outdated
# let () = | ||
let threads = Array.init 8 (fun _ -> Thread.create read_file ()) in | ||
Array.iter Thread.join threads; | ||
Printf.printf "read %d bytes\n" (Atomic.get count_bytes_read) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it a bit awkward to exemplify concurrent atomics with a single-domain example that does not actually need atomics. Alternatives include using Domain
directly (con: arguably too low-level in general) or domainslib (con: outside the stdlib, may change or get replaced in the medium-term future).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been testing it in a 4.14 repl on the side (in which it was surprisingly hard to get the Thread
toplevel module?!), so I don't know exactly how I should go about that. Maybe I just need a 5.0-alpha switch to run this.
Domain.spawn
should be on par in terms of complexity with Thread.create
, right? Ignoring the ticker thread…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option is to show not a complete program with Domain.spawn, but just the code of a library function that may be called concurrently. (For example: a "unique id" generator that works by incrementing an internal counter.) You can show the sequential version, point out that it is not domain-safe, and move to Atomic instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved to Domain
. Hopefully someone can check the code, unless I get to trying 5.0-alpha on that.
@pierreweis may be interested in reviewing this.
|
note: this is not tested, my current switch is still 4.14
Could we please have the examples at the end of the modules in a dedicated section (e.g. If you write them in the module preamble they get in the way when you peruse the API reference which what you eventually end up doing 99.9% of the time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! I left a couple of nits if that's okay :-)
stdlib/atomic.mli
Outdated
A basic use case is to have global counters that are updated in a | ||
thread-safe way: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't an even simpler example work? e.g., n
domains each incrementing the counter k times by 1
I am just inclined to think that one shouldn't really need to understand how to work with files to understand how to use atomics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like to have "real" examples, I suppose. What do the maintainers think? @Octachron @gasche for example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this example a tad long as well, on the other hand the BFS sounded like a natural choice for queues. (I'm often too lazy to reach for a queue when I BFS and I use two lists, "frontier" and "next", but let's forget about this nitpick.) The Atomic example also had the issue of using Thread instead of Domain. It's fixed now (I haven't tried to run it yet, I'm in holidays until the end of the month and I somehow convinced myself that replying to messages is okay but running the compiler on PRs is not), but there is still a wide margin for debate. For example, the message we send is that Domain
is the low-level, core module and that users should use higher-level abstractions (outside for the stdlib for now).
I think that the duty is on us to find a "real example" that is simpler but that @c-cube still likes, and maybe also manages to tiptoe around those difficulties.
stdlib/queue.mli
Outdated
]} | ||
|
||
For a more elaborate example, a classic algorithmic use of queues | ||
is to implement a BFS (breadth-first search) through a graph. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BFS through a tree would be less involved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never really needed a BFS through a tree. I think BFS through a graph (of some kind) is one of the main application of queues, so I illustrate with a real example. Same as above I suppose, it depends on what maintainers think the docs of the stdlib should look like.
For |
Oh, I missed this one. Thanks for reminding me, and keep up the good work :-) |
What do we need to move this PR forward? I think that "restructuring the .mli file" could be split off to a separate PR. My understanding is that people would like a better/different example for the atomics, and are basically fine with the rest. |
I'm not sure what the simpler example could be. I agree that the IO stuff is a bit too long (even though it is, imho, a real use case for atomic counters: metrics). |
Concerning the examples in |
I expanded a bit the second example, and replaced the first one with a groundbreaking Proof Of Work™ implementation utilizing multiple cores to break Difficult Mathematical Problems™. Basically just showing how to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the new first example of Atomic better, and overall I like the PR.
I think we should consider merging it; it's better to have examples than not, and I'm happy with the current ones. (We can always refine a bit later if we have concerns.)
Approved. (Note that stdilb PRs require two approvals.)
let () = | ||
let criterion n = n <= 100 in | ||
let threads = | ||
Array.init 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of 8
you could use Domain.recommended_domain_count
here. (This helps drive down the point that domains are not lightweight threads you can spawn as many as you want.)
(* find a number that satisfies [p], by... trying random numbers | ||
until one fits. *) | ||
let find_number_where (p:int -> bool) = | ||
let rand = Random.State.make_self_init() in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I became aware of splitting random generators, I would use a different approach where I seed a generator once at toplevel, and then split it repeatedly (on the main domain), passing a split state to find_number_where
. This lets you easily control seeding (non-deterministic or deterministic are both possible). But the current approach is slightly simpler and also fine, this example is not about Random itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (modulo a few minor stylistic comments)
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Changes
Outdated
@@ -60,6 +60,9 @@ Working version | |||
halving memory usage while remaining tail-recursive. | |||
(Nicolás Ojeda Bär, review by Xavier Leroy and Gabriel Scherer) | |||
|
|||
- #11476: Add examples in documentation of Hashtbl, Queue, Atomic, Format | |||
(Simon Cruanes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you complete the list of reviewers? (There is a "reviewers" list in the top right of the PR main webpage, which is accurate as far as I can tell.)
Merged, thanks! |
Thank you @nojb and to all the reviewers! 😁 |
No description provided.