New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re episode 9: Benchmarks #48

Merged
merged 8 commits into from Sep 16, 2018

Conversation

Projects
None yet
2 participants
@killercup
Copy link
Contributor

killercup commented Sep 15, 2018

I've been meaning to play a bit more with criterion, and this was a
good excuse. I actually started this last week, but just go around to
making a PR.

If you run cargo bench in the new episode/9/bench directory,
it'll give you some nice CLI output, but also render some nice graphs
as HTML (open target/criterion/report/index.html). Oh, and one thing
to note: I've never really done anything with go. If there are some
compiler flags I should've set, please add them. I didn't do anything
special for Rust, though; enabling LTO and setting target-cpu=native
could do some nice things.

I'm actually pretty surprised by the results -- so I won't spoil them for
you. If you want to optimize the code (without using a completly
different approach like my PR), that might be an interesting episode,
too!

@mre

This comment has been minimized.

Copy link
Collaborator

mre commented Sep 15, 2018

Thanks for this!
Quite astonishing.

I got this:

Episode 9/rust fixed    time:   [543.57 ms 550.17 ms 564.15 ms]
Found 2 outliers among 10 measurements (20.00%)
  2 (20.00%) high mild
Benchmarking Episode 9/rust pascal: Collecting 10 samples in estimated 7.4567 s (165Episode 9/rust pascal   time:   [45.389 ms 45.661 ms 46.199 ms]
Found 2 outliers among 10 measurements (20.00%)
  2 (20.00%) high severe
Benchmarking Episode 9/go fixed: Collecting 10 samples in estimated 6.2936 s (110 iteEpisode 9/go fixed      time:   [56.025 ms 56.144 ms 56.285 ms]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe
Benchmarking Episode 9/go worker: Collecting 10 samples in estimated 8.7731 s (110 itEpisode 9/go worker     time:   [78.539 ms 79.726 ms 80.935 ms]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

For plotting, gnuplot is required. Also, it requires a nightly compiler for now.
Maybe we could add these two hints to the bench/README.md?
Could you also update the errata section in the episode.yml?
Finally, not sure if the bench/src/main.rs is really needed?

Apart from that, very nice work! 🙌

@killercup

This comment has been minimized.

Copy link
Contributor Author

killercup commented Sep 15, 2018

For plotting, gnuplot is required.

What happens if you don't have it? You can also do cargo bench -- --noplot by the way.

Also, it requires a nightly compiler for now.

Oh? … Ah! I used nightly cargo which sets edition = "2018". Interesting. Removed.

Finally, not sure if the bench/src/main.rs is really needed?

Purged it from existence.

Maybe we could add these two hints to the bench/README.md?

Will do.

Could you also update the errata section in the episode.yml?

Do you want to host the html output somewhere? We could link to that.

Otherwise, here's the violin plot I got:

bildschirmfoto 2018-09-15 um 16 19 41

@killercup

This comment has been minimized.

Copy link
Contributor Author

killercup commented Sep 16, 2018

For some reason I keep thinking about this. One thing I totally forgot to mention was that you could easily use std's hash map and impl our special version of union_with by yourself:

for (key, val) in current {
    result.entry(key).and_modify(|e| *e += val).or_insert(val);
}

This will probably perform the same as my version using im.

@mre

This comment has been minimized.

Copy link
Collaborator

mre commented Sep 16, 2018

Good work! I think the graph deserves a place in the show notes. You could add it as an image to your new entry in the errata section. It could link to the rendered HTML document.
My suspicion is that it would look fabulous on the new website (https://hello-rust.show/9/)

killercup added some commits Sep 15, 2018

Re episode 9: Benchmarks
I've been meaning to play a bit more with [criterion], and this was a
good excuse. I actually started this last week, but just go around to
making a PR.

If you run `cargo bench` in the new `episode/9/bench` directory,
it'll give you some nice CLI output, but also render some nice graphs
as HTML (`open target/criterion/report/index.html`). Oh, and one thing
to note: I've never really done anything with go. If there are some
compiler flags I should've set, please add them. I didn't do anything
special for Rust, though; enabling LTO and setting `target-cpu=native`
could do some nice things.

I'm actually pretty surprised by the results -- so I won't spoil them for
you. If you want to optimize the code (without using a completly
different approach like my PR), that might be an interesting episode,
too!

[criterion]: https://japaric.github.io/criterion.rs/book/
Clean up benchmark code
Why? Because this morning, I was an idiot. Tonight, I'm an idiot who
just saw [this].

[this]: https://github.com/steveklabnik/indexlist/blob/ffdd3a296b4a9149ea90a676fa35d036644ab90a/benches/benchmarks.rs
Add simpler Rust 'union' impl
The diff between `fixed/` and `union/` is pretty small.
Add benchmarking results violin plot
So we can have a nice, permanent link to it.

@killercup killercup force-pushed the killercup:9-oops-benchmarking-happened branch from 5adc02c to 4e38341 Sep 16, 2018

@killercup killercup force-pushed the killercup:9-oops-benchmarking-happened branch from 4e38341 to f4f6d59 Sep 16, 2018

@killercup

This comment has been minimized.

Copy link
Contributor Author

killercup commented Sep 16, 2018

Alright, updated and rebased on master.

You might need to specify a max-width for that image, it's pretty large. Also, I wanted to use your Makefile to test how this looks but gave up after seeing I needed to have a ../website dir.

@killercup

This comment has been minimized.

Copy link
Contributor Author

killercup commented Sep 16, 2018

Oh, and I added a commit which only slightly changes your "fixed" version to use union. This means the diff is quite small:

--- fixed/src/main.rs	2018-09-02 15:57:02.000000000 +0200
+++ union/src/main.rs	2018-09-16 16:47:13.000000000 +0200
@@ -3,25 +3,30 @@
 use rayon::prelude::*;
 use std::collections::HashMap;
 use std::env;
+use std::error::Error;
 use std::fs;
-use std::sync::Mutex;
 
-#[derive(Debug)]
-enum Error {
-    Lock,
-}
-
-type Words = Mutex<HashMap<String, u32>>;
+type Words = HashMap<String, u32>;
 
 fn main() -> Result<(), Box<Error>> {
-    let w = Words::new(HashMap::new());
-    env::args()
+    let words = env::args()
         .skip(1)
         .collect::<Vec<String>>()
         .par_iter()
-        .for_each(|arg| tally_words(arg.to_string(), &w).unwrap());
+        .map(|arg| {
+            tally_words(arg)
+                .map_err(|e| eprintln!("Error processing {}: {:?}", arg, e))
+                .unwrap_or_default()
+        }).reduce(
+            || Words::new(),
+            |mut result, current| {
+                for (key, val) in current {
+                    result.entry(key).and_modify(|e| *e += val).or_insert(val);
+                }
+                result
+            },
+        );
 
-    let words = w.lock().map_err(|_| Error::Lock)?;
     for (word, count) in words.iter() {
         if *count > 1 {
             println!("{} {}", count, word)
@@ -31,15 +36,13 @@
     Ok(())
 }
 
-fn tally_words(filename: String, w: &Words) -> Result<(), Box<Error>> {
-    let contents = fs::read_to_string(filename).expect("Unable to read file");
+fn tally_words(filename: &str) -> Result<Words, Box<Error>> {
+    let mut words = Words::new();
+    let contents = fs::read_to_string(filename)?;
 
     for s in contents.split_whitespace() {
         let key = s.to_lowercase();
-        {
-            let mut map = w.lock().map_err(|_| Error::Lock)?;
-            *map.entry(key).or_insert(0) += 1;
-        }
+        *words.entry(key).or_insert(0) += 1;
     }
-    Ok(())
+    Ok(words)
 }

Bonus: Got a concurrent map access error while running the "go worker" benchmark. No clue what's up with that, though.

@mre

This comment has been minimized.

Copy link
Collaborator

mre commented Sep 16, 2018

So many good things in this PR. ❤️
Thanks for that!

@mre mre merged commit 6432e97 into hello-rust:master Sep 16, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment