Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add some examples to the stdlib's documentation #11476

Merged
merged 46 commits into from Sep 14, 2022
Merged

Conversation

c-cube
Copy link
Contributor

@c-cube c-cube commented Aug 4, 2022

No description provided.

@c-cube c-cube changed the title add some examples to Format add some examples to the stdlib's documentation Aug 4, 2022
stdlib/format.mli Outdated Show resolved Hide resolved
Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think that this is an excellent idea. See inline comments.

stdlib/atomic.mli Outdated Show resolved Hide resolved
# let () =
let threads = Array.init 8 (fun _ -> Thread.create read_file ()) in
Array.iter Thread.join threads;
Printf.printf "read %d bytes\n" (Atomic.get count_bytes_read)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a bit awkward to exemplify concurrent atomics with a single-domain example that does not actually need atomics. Alternatives include using Domain directly (con: arguably too low-level in general) or domainslib (con: outside the stdlib, may change or get replaced in the medium-term future).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been testing it in a 4.14 repl on the side (in which it was surprisingly hard to get the Thread toplevel module?!), so I don't know exactly how I should go about that. Maybe I just need a 5.0-alpha switch to run this.

Domain.spawn should be on par in terms of complexity with Thread.create, right? Ignoring the ticker thread…

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to show not a complete program with Domain.spawn, but just the code of a library function that may be called concurrently. (For example: a "unique id" generator that works by incrementing an internal counter.) You can show the sequential version, point out that it is not domain-safe, and move to Atomic instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to Domain. Hopefully someone can check the code, unless I get to trying 5.0-alpha on that.

stdlib/format.mli Outdated Show resolved Hide resolved
stdlib/hashtbl.mli Outdated Show resolved Hide resolved
stdlib/queue.mli Outdated Show resolved Hide resolved
stdlib/queue.mli Outdated Show resolved Hide resolved
@shindere
Copy link
Contributor

shindere commented Aug 4, 2022 via email

@dbuenzli
Copy link
Contributor

dbuenzli commented Aug 6, 2022

Could we please have the examples at the end of the modules in a dedicated section (e.g. {1:examples Examples}}) rather than in the module preamble (where a simple See {{!examples}examples}.} will do).

If you write them in the module preamble they get in the way when you peruse the API reference which what you eventually end up doing 99.9% of the time.

Copy link
Contributor

@ulugbekna ulugbekna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I left a couple of nits if that's okay :-)

stdlib/hashtbl.mli Outdated Show resolved Hide resolved
Comment on lines 20 to 21
A basic use case is to have global counters that are updated in a
thread-safe way:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't an even simpler example work? e.g., n domains each incrementing the counter k times by 1

I am just inclined to think that one shouldn't really need to understand how to work with files to understand how to use atomics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like to have "real" examples, I suppose. What do the maintainers think? @Octachron @gasche for example

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this example a tad long as well, on the other hand the BFS sounded like a natural choice for queues. (I'm often too lazy to reach for a queue when I BFS and I use two lists, "frontier" and "next", but let's forget about this nitpick.) The Atomic example also had the issue of using Thread instead of Domain. It's fixed now (I haven't tried to run it yet, I'm in holidays until the end of the month and I somehow convinced myself that replying to messages is okay but running the compiler on PRs is not), but there is still a wide margin for debate. For example, the message we send is that Domain is the low-level, core module and that users should use higher-level abstractions (outside for the stdlib for now).

I think that the duty is on us to find a "real example" that is simpler but that @c-cube still likes, and maybe also manages to tiptoe around those difficulties.

stdlib/hashtbl.mli Outdated Show resolved Hide resolved
stdlib/queue.mli Outdated
]}

For a more elaborate example, a classic algorithmic use of queues
is to implement a BFS (breadth-first search) through a graph.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BFS through a tree would be less involved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never really needed a BFS through a tree. I think BFS through a graph (of some kind) is one of the main application of queues, so I illustrate with a real example. Same as above I suppose, it depends on what maintainers think the docs of the stdlib should look like.

@dbuenzli
Copy link
Contributor

other modules would benefit too.

For Set and Map it's waiting for reviews in #11410

@xavierleroy
Copy link
Contributor

For Set and Map it's waiting for reviews in #11410

Oh, I missed this one. Thanks for reminding me, and keep up the good work :-)

@gasche
Copy link
Member

gasche commented Aug 21, 2022

What do we need to move this PR forward?

I think that "restructuring the .mli file" could be split off to a separate PR. My understanding is that people would like a better/different example for the atomics, and are basically fine with the rest.

@c-cube
Copy link
Contributor Author

c-cube commented Aug 21, 2022

I'm not sure what the simpler example could be. I agree that the IO stuff is a bit too long (even though it is, imho, a real use case for atomic counters: metrics).

@Octachron
Copy link
Member

Concerning the examples in Atomic, I have the impression that it might work better to drop the first example and expand the second example to explain that the use of atomic guarantees that not elements are silently dropped from the queue.

@c-cube
Copy link
Contributor Author

c-cube commented Sep 14, 2022

I expanded a bit the second example, and replaced the first one with a groundbreaking Proof Of Work™ implementation utilizing multiple cores to break Difficult Mathematical Problems™.

Basically just showing how to use bool Atomic.t to stop multiple threads; and a global thread-safe iteration counter. Both are best expressed with atomic.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the new first example of Atomic better, and overall I like the PR.

I think we should consider merging it; it's better to have examples than not, and I'm happy with the current ones. (We can always refine a bit later if we have concerns.)

Approved. (Note that stdilb PRs require two approvals.)

let () =
let criterion n = n <= 100 in
let threads =
Array.init 8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of 8 you could use Domain.recommended_domain_count here. (This helps drive down the point that domains are not lightweight threads you can spawn as many as you want.)

(* find a number that satisfies [p], by... trying random numbers
until one fits. *)
let find_number_where (p:int -> bool) =
let rand = Random.State.make_self_init() in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I became aware of splitting random generators, I would use a different approach where I seed a generator once at toplevel, and then split it repeatedly (on the main domain), passing a split state to find_number_where. This lets you easily control seeding (non-deterministic or deterministic are both possible). But the current approach is slightly simpler and also fine, this example is not about Random itself.

Copy link
Contributor

@nojb nojb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (modulo a few minor stylistic comments)

stdlib/format.mli Outdated Show resolved Hide resolved
stdlib/format.mli Outdated Show resolved Hide resolved
stdlib/format.mli Outdated Show resolved Hide resolved
stdlib/format.mli Outdated Show resolved Hide resolved
stdlib/hashtbl.mli Outdated Show resolved Hide resolved
stdlib/hashtbl.mli Show resolved Hide resolved
stdlib/moreLabels.mli Outdated Show resolved Hide resolved
stdlib/queue.mli Outdated Show resolved Hide resolved
stdlib/templates/hashtbl.template.mli Outdated Show resolved Hide resolved
c-cube and others added 10 commits September 14, 2022 09:53
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Co-authored-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Changes Outdated
@@ -60,6 +60,9 @@ Working version
halving memory usage while remaining tail-recursive.
(Nicolás Ojeda Bär, review by Xavier Leroy and Gabriel Scherer)

- #11476: Add examples in documentation of Hashtbl, Queue, Atomic, Format
(Simon Cruanes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you complete the list of reviewers? (There is a "reviewers" list in the top right of the PR main webpage, which is accurate as far as I can tell.)

@nojb nojb merged commit e0afc0c into ocaml:trunk Sep 14, 2022
@nojb
Copy link
Contributor

nojb commented Sep 14, 2022

Merged, thanks!

@c-cube
Copy link
Contributor Author

c-cube commented Sep 14, 2022

Thank you @nojb and to all the reviewers! 😁

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants