-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add initial Elixir implementation #44
Conversation
No worries, it's good to have a new one :) Unfortunately Elixir is not available in the main repos
Could you please fix this in the Dockerfile. Also no need for sudo. |
Dockerfile updated to install Elixir using the instructions in their website :) |
The results are looking like this.
|
|> Enum.reduce(%{}, fn (line, acc) -> | ||
line | ||
|> String.strip | ||
|> String.split(~r/\s/) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should omit String.split and pass split: true, will probably make it a little bit faster.
I have only took a glance at the solution, but it reminded me of a exercism excersice. You could browse the solution to come up with some idea. Here is mine: http://exercism.io/submissions/a06a51b9c4f54177a29401d0df74b959 |
It's similar in the spirit, but one of the biggest difference is the type I've already pulled and doing furthers inspects. For now I try to speed up this single process implementation, since single I don't assume, that we will get into top ten with single process though. Ilyes notifications@github.com schrieb am Sa., 26. März 2016 um 10:53 Uhr:
|
The Dict module is now deprecated in favour of the Map module.
We are using pattern-matching for the annonymous functions, which makes the code more readable and yields a small performance improvement.
@NobbZ I've upgraded the pull request with your suggestions (thank you a big time!) . I've also started to use Map instead of Dict, since this last one is deprecated. I'm taking a look at the possibility of using Streams to delay the computation until it is really necessary and avoid creating intermediate lists. |
I'm still struggling with setting everything up. After I realized that my project folder isn't shared between docker and system but cloned from git fresh, I have now your branch tested once, but running |
@NobbZ for testing my pull request I've deleted the
|
We now use streams to process input before creating a dictionary to count appearances. This allows to delay computation and reduces the number of intermediate lists created.
Have you already done a new benchmark after doing the latest changes? Any recognizable improvements? |
Oh, and I just realized, you are doing all the work during compiletime. This is considered to be much slower. I will take a look into that issue when home. |
I've measured a small performance improvement, but nothing too noticeable :( |
Okay, still the last in the list, but this is the result of my last run:
Run on a “Intel(R) Core(TM) i5-4210M CPU @ 2.60GHz” with 8 GiB of Memory. When I started I was at roughly 2m30s user-time. I will extract a patch after I got some sleep (~8 h). |
1. Removed `Stream.reject/2` call and moved it's functionality into the `Enum.reduce/3` (doing a pattern match). 2. Removed explicit conversion of the map to a list. Maps are already proper `Enum`s and can be fed into `Enum.sort/2`. 3. Preparing an IO-list and feeding this into `IO.puts/1`. This way does safe us up some function calls (less interference by the scheduler, also it reduces the amount of buffer flushes.
OK, I already wrote it, but it seems as if I forgot to submit... And I totally screwed my local stuff, I'll try something else… |
OK, I created a PR to @belaustegui's repo: belaustegui/wordcount#1, after he has merged over there, it should appear here. |
Improved performance of elixir solution a bit
This last patch has made a great performance improvement. 30 seconds faster!!!!
|
Please see this issue: #50 I'm looking forward to your contribution, let me know if it's ready to be merged. |
I'm still fiddling around locally, but currently I can't report any further improvements. The very last thing hat comes to my mind is a rewrite of the way we currently read the input, but I can't start that today, at the end it's Easter and some family time. So I think it can get merged for now after @belaustegui rebased. |
Conflicts: Dockerfile
Using `IO.puts` caused an empty line to be added at the end of the output, which broke the tests.
I think that it is ready to be merged. |
@@ -4,6 +4,10 @@ cd cpp | |||
g++ wordcount_baseline.cpp -std=c++11 -o wordcount_baseline -O3 | |||
g++ wordcount.cpp -std=c++11 -o wordcount -O3 | |||
|
|||
cd ../elixir | |||
# Elixir has to run the script to compile it (http://stackoverflow.com/questions/35722248/the-command-elixirc-is-compiling-and-executing-the-code) | |||
echo "wadus" | elixir elixir/wordcount.ex > /dev/null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, in fact its not. because there is no binary or whatever created.
The way it is used now, is very bad meta-programming voodoo and does not need any precompilation.
So these 3 lines can be removed for now.
But while I am experimenting locally I already transfered this unidiomatic use of the meta programming capabilities into a proper mix-project which also does create a proper executable (this doesn't give any perf. improvements at all!). But it is much more clean.
But I will only continue that version of mine, when I this got merged and I figured out how to manually controll IO-Streams in a more efficient way ;) StdLib does a good job in keeping manual flushing away from me ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove it before merge-ing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's easier if @belaustegui removes it himself, creating another PR on his fork or providing a patch-file here might produce much of unnecessary noise. Also he seems still to actively follow this issue.
Tests pass, will merge soon. I know you'll hate me because of this, but I added a helper file for the evaluator script: https://github.com/juditacs/wordcount#adding-a-new-program This way the contributors are listed for each file (git blame needs to know the source file). Could you please add a line for elixir? |
Since there is currently no binary created, is that line still needed? |
Good idea, I'll fix it. No, don't add anything. Can I merge? |
Of course. Thanks! |
I'll merge this and we can remove those lines later. |
I've added a basic Elixir implementation. The steps are explained to make clear what the script is doing in each step.
It is currently the slowest implementation across all the tested languages :( . I'm betting it is caused by Elixir creating and destroying the dictionary in each iteration, buy I'm not really sure about it.
Hope someone with more knowledge can help improving this solution.