CalculateAverage_JesseVanRooy (Submission 1) #335

Jesse-Van-Rooy · 2024-01-11T22:02:14Z

Check List:

Tests pass (./test.sh <username> shows no differences between expected and actual outputs)
All formatting changes by the build are committed
Your launch script is named calculate_average_<username>.sh (make sure to match casing of your GH user name) and is executable
Output matches that of calculate_average_baseline.sh

Execution time: 2.7s
Execution time of reference implementation: 2m26.472s

Hardware: AMD Ryzen 7 5700G 8C16T, 32GB RAM

Description: See how a very simplistic scalar version stacks up to manually vectorized solutions

Jesse-Van-Rooy · 2024-01-11T23:01:52Z

My GitHub username contains dashes, but dashes are not allowed in .java file names.

gunnarmorling · 2024-01-12T17:42:49Z

src/main/java/dev/morling/onebrc/CalculateAverage_JesseVanRooy.java

+
+    static void process(MemorySegment memorySegment, ThreadResult threadResult) {
+        // initialize hash table
+        final int[] keys = new int[MAP_SIZE];


Is this resized somewhere? There can be up to 10K station names.

No, it just wasn't properly sized for the requirements.
Fixed it.
It doesn't seem to affect performance on my machine, so it should still be good to try.

I wonder, though, how this passed the existing 10K test case. Could you revert it to the previous value and see why that's the case? Maybe there is still a bug somewhere?

It didn't; I forgot to run that test... embarrassing.

I ran some of the tests manually in IntelliJ because I had trouble with the UTF-8 encoding of the output when diffing it against the expected result. (I'm running this on Windows.)

I now fixed the UTF-8 issues by adding '-Dsun.stdout.encoding=UTF-8' to the calculate_average_.sh file (now all 11 tests succeed on my machine).
This also revealed that %n does not print the expected \n in the output on Windows (it prints \r\n instead), so I changed that too.
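
A minimal sketch of the line-ending difference described above. The class and method names are hypothetical and this is not the submission's code; it writes to an in-memory buffer with an explicit UTF-8 PrintStream rather than relying on the -Dsun.stdout.encoding property. It shows that %n expands to the platform line separator, while a literal "\n" is always a single byte:

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

public class LineEndingDemo {

    // Renders one output line as UTF-8 bytes, terminated either by %n
    // (the platform line separator) or by an explicit "\n".
    static byte[] render(String body, boolean usePlatformSeparator) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer, true, StandardCharsets.UTF_8);
        if (usePlatformSeparator) {
            out.printf("%s%n", body); // "\r\n" on Windows, "\n" on Linux/macOS
        } else {
            out.print(body + "\n");   // always exactly one '\n'
        }
        out.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Illustrative sample line; the station name is made up for this demo.
        String body = "{Z\u00fcrich=1.0/2.0/3.0}";
        System.out.println("bytes with %n: " + render(body, true).length);
        System.out.println("bytes with \\n: " + render(body, false).length);
    }
}
```

On Linux or macOS both variants produce the same bytes; on Windows the %n variant is one byte longer, which is enough to make a byte-exact diff against the expected output fail.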

To verify that my output matches the output of the baseline program, I had to make similar changes to the baseline program and script, but I did not commit those, as we are probably not expected to change these files.

I do hope the -Dsun.stdout.encoding=UTF-8 does not mess with the output on other OSes, but I would not expect it to.

If this doesn't affect the outcome on other OSes at all, you could consider adding this to the run script of the baseline and changing the println in the baseline script to a print, while manually adding the \n. This would improve compatibility on Windows.

That's the thing, it passed that test for me, also with that old map size. Just tried it again by changing the size back to a value smaller than 10K. Can you reproduce this? Something seems odd here.

Oh, you probably only changed the map size.
In that case it would work because there are still multiple threads, each with its own map. Since the 10k test has exactly 10k lines, it would never give more than 4096 lines to one thread, so the map size is never exceeded (we use a growing hashmap for the combination step, so there is no problem there either).
If you also change the VALUE_CAPACITY to the previous value, it would fail.
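
A rough sketch of that argument, assuming the 10k lines are spread roughly evenly across threads (the real split is by byte ranges, so this is only an approximation) and that every line can be a distinct station. The class and method names are hypothetical:

```java
public class PerThreadShare {

    // Upper bound on lines per thread if the input is divided evenly
    // (ceiling division).
    static int maxLinesPerThread(int totalLines, int threads) {
        return (totalLines + threads - 1) / threads;
    }

    // True if a per-thread structure of the given capacity can hold
    // one entry per line in the worst case.
    static boolean fits(int totalLines, int threads, int capacity) {
        return maxLinesPerThread(totalLines, threads) <= capacity;
    }

    public static void main(String[] args) {
        int totalLines = 10_000;
        for (int threads : new int[] { 8, 16, 32 }) {
            System.out.println(threads + " threads: up to "
                    + maxLinesPerThread(totalLines, threads) + " lines each, "
                    + "MAP_SIZE 4096 ok=" + fits(totalLines, threads, 4096) + ", "
                    + "VALUE_CAPACITY 512 ok=" + fits(totalLines, threads, 512));
        }
    }
}
```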

I actually thought about this some more:

Is it the case that the original solution (MAP_SIZE = 4096 and VALUE_CAPACITY = 512) worked for you?

Because in that case it still should have flagged a test failure on an 8 core machine (from the Readme: "Programs are run from a RAM disk (i.o. the IO overhead for loading the file from disk is not relevant), using 8 cores of the machine.").

Could it be the case that the test is being run with more cores then? Because the only way I see that the VALUE_CAPACITY = 512 solution works is if there are 20+ threads at work. And that would only happen if 'Runtime.getRuntime().availableProcessors()' returns more than 20.

Ah yes indeed. Tests run with 32 cores, evaluation with 8 then, as per the rules.
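
The "20+ threads" estimate can be checked with a little arithmetic. This is a sketch under the same even-split assumption as the discussion, not the submission's code, and the names are hypothetical:

```java
public class ThreadCountCheck {

    // Smallest thread count for which an even split of totalKeys keeps
    // every thread at or below perThreadCapacity (ceiling division).
    static int minThreadsFor(int totalKeys, int perThreadCapacity) {
        return (totalKeys + perThreadCapacity - 1) / perThreadCapacity;
    }

    public static void main(String[] args) {
        int needed = minThreadsFor(10_000, 512);
        System.out.println("threads needed for VALUE_CAPACITY = 512: " + needed);
        // Core counts mentioned in the thread: 8 for evaluation, 32 for tests.
        System.out.println("8 cores sufficient:  " + (8 >= needed));
        System.out.println("32 cores sufficient: " + (32 >= needed));
        // What the local JVM would report, as used to size the thread pool.
        System.out.println("availableProcessors here: "
                + Runtime.getRuntime().availableProcessors());
    }
}
```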

gunnarmorling · 2024-01-14T18:09:54Z

00:04.066, nice!

gunnarmorling · 2024-01-15T20:40:35Z

@Jesse-Van-Rooy, it seems something is still wrong with hashes: when running against a 1B rows file with 10K keys, results are off. To reproduce, you can create a file via ./1brc/create_measurements3.sh 1000000000 and compare against the output of the baseline (or the current leaders to be faster).

Jesse-Van-Rooy added 2 commits January 11, 2024 22:43

Submission gunnarmorling#1

ba1d1b8

Submission gunnarmorling#1 (Fixed casing of file names)

74ed850

Jesse-Van-Rooy changed the title ~~CalculateAverage_jessevanrooy (Submission 1)~~ CalculateAverage_JesseVanRooy (Submission 1) Jan 11, 2024

Submission gunnarmorling#1 (Added executable to Git permissions)

2cfe183

gunnarmorling reviewed Jan 12, 2024

Jesse-Van-Rooy added 2 commits January 12, 2024 19:43

Submission 1 (Fixed incorrect map size)

e3a11c6

Submission 1 (Fixed output problems on Windows)

383e393

gunnarmorling merged commit 30987d7 into gunnarmorling:main Jan 14, 2024
