CalculateAverage_JesseVanRooy (Submission 1) #335
Conversation
My GitHub username contains dashes, but dashes are not allowed in .java file names.

    static void process(MemorySegment memorySegment, ThreadResult threadResult) {
        // initialize hash table
        final int[] keys = new int[MAP_SIZE];
Is this resized somewhere? There can be up to 10K station names.
No, it just wasn't properly sized for the requirements.
Fixed it.
It doesn't seem to affect performance on my machine, so it should still be good to try.
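The fixed-size table under discussion can be sketched as a flat open-addressing map with linear probing. This is an illustrative reconstruction, not the submission's actual code: the MAP_SIZE value, the assumption of non-zero key hashes, and the probing scheme are all assumptions here.

```java
public class FixedHashTable {
    // Must exceed the worst-case number of distinct keys one thread can see;
    // the challenge allows up to 10K station names, so 2^14 = 16384 is safe.
    static final int MAP_SIZE = 1 << 14;

    final int[] keys = new int[MAP_SIZE];   // 0 marks an empty slot
    final int[] counts = new int[MAP_SIZE];

    // Insert a (non-zero) key hash; probe linearly on collision.
    void add(int keyHash) {
        int slot = keyHash & (MAP_SIZE - 1);
        while (keys[slot] != 0 && keys[slot] != keyHash) {
            slot = (slot + 1) & (MAP_SIZE - 1); // wrap around the table
        }
        keys[slot] = keyHash;
        counts[slot]++;
    }
}
```

If the table can fill up completely, the probe loop never terminates, which is why the capacity must strictly exceed the maximum number of distinct keys.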
I wonder, though, how this passed the existing 10K test case. Could you revert it to the previous value and see why that is? Maybe there's still a bug somewhere?
It didn't, I forgot to run that test ... embarrassing.
I ran some of the tests manually in IntelliJ because I had trouble with the UTF-8 encoding of the output when diffing it against the expected result. (I'm running this on Windows.)
I have now fixed the UTF-8 issues by adding '-Dsun.stdout.encoding=UTF-8' to the calculate_average_.sh file (now all 11 tests succeed on my machine).
This also revealed that %n does not print the expected \n in the output (it prints \r\n instead of \n), so I changed that too.
To verify that my output matches the output of the baseline program, I had to make similar changes to the baseline program and script, but I did not commit those, as we are probably not expected to change these files.
I do hope the -Dsun.stdout.encoding=UTF-8 flag does not mess with the output on other OSes, but I would not expect it to.
If this doesn't affect the outcome on other OSes at all, you could consider adding it to the run script of the baseline and changing the println in the baseline to a print while manually adding the \n. This would improve compatibility on Windows.
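The %n behavior mentioned above can be demonstrated directly: in Java format strings, %n expands to the platform line separator (\r\n on Windows, \n on Unix-like systems), while a literal \n is always a single line feed.

```java
public class LineSeparatorDemo {
    public static void main(String[] args) {
        // %n expands to System.lineSeparator(): "\r\n" on Windows, "\n" on Unix
        String platform = String.format("value=%d%n", 42);
        // "\n" is always a single line feed, regardless of OS
        String fixed = String.format("value=%d\n", 42);
        System.out.print(platform.equals(fixed)
                ? "separators match on this OS"
                : "separators differ on this OS");
    }
}
```

This is why output produced with %n on Windows fails a byte-for-byte diff against a reference file that uses bare line feeds.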
That's the thing: it passed that test for me, even with the old map size. I just tried it again by changing the size back to a value smaller than 10K. Can you reproduce this? Something seems odd here.
Oh, you probably only changed the map size.
In that case it would still work, because there are multiple threads each using their own map, and since the 10K test has exactly 10K lines, no single thread would ever receive more than 4096 lines, hence the map size is never exceeded (we use a growing hash map for the combination step, so there is no problem there either).
If you also change VALUE_CAPACITY back to the previous value, it will fail.
I actually thought about this some more:
Is it the case that the original solution (MAP_SIZE = 4096 and VALUE_CAPACITY = 512) worked for you?
Because in that case it still should have flagged a test failure on an 8-core machine (from the Readme: "Programs are run from a RAM disk (i.o. the IO overhead for loading the file from disk is not relevant), using 8 cores of the machine.").
Could it be that the test is being run with more cores, then? The only way I see the VALUE_CAPACITY = 512 solution working is if there are 20+ threads at work, and that would only happen if 'Runtime.getRuntime().availableProcessors()' returns more than 20.
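The arithmetic behind this can be checked directly. Assuming the 10K-line test file is split roughly evenly across the threads (an assumption about the chunking, not the submission's exact partitioning logic), each per-thread map sees at most ceil(10000 / threads) distinct keys, so an undersized capacity can survive once the thread count is high enough.

```java
public class CapacityCheck {
    // Max distinct keys any single thread can see when `lines` rows
    // are split evenly across `threads` chunks (ceiling division).
    static int maxKeysPerThread(int lines, int threads) {
        return (lines + threads - 1) / threads;
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        // With 32 threads each chunk holds at most 313 of the 10K keys,
        // well under both MAP_SIZE = 4096 and VALUE_CAPACITY = 512;
        // with 8 threads it is 1250, which overflows VALUE_CAPACITY = 512.
        System.out.println(threads + " threads -> at most "
                + maxKeysPerThread(10_000, threads) + " distinct keys per map");
    }
}
```

This matches the "20+ threads" observation: ceil(10000 / 20) = 500 just fits in 512, while 19 threads would need 527 slots.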
Ah yes indeed. Tests run with 32 cores, evaluation with 8 then, as per the rules.
00:04.066, nice!
@Jesse-Van-Rooy, it seems something is still wrong with hashes: when running against a 1B rows file with 10K keys, results are off. To reproduce, you can create a file via
Check List:
- Tests pass (./test.sh <username> shows no differences between expected and actual outputs)
- Your launch script is named calculate_average_<username>.sh (make sure to match casing of your GH user name) and is executable
- Output matches that of calculate_average_baseline.sh
Hardware: AMD Ryzen 7 5700G 8C16T, 32GB RAM
Description: See how a very simplistic scalar version stacks up to manually vectorized solutions