Add more functions #3
New property 🎉 I think now it's harder to write an implementation that is wrong that passes the property than the other way around.
def frequency_distribution(samples) do
  Enum.reduce(samples, %{}, fn sample, counts ->
    Map.update(counts, sample, 1, fn old_value -> old_value + 1 end)
💅 suggestion: This style is not exactly functional. Consider writing a private function based on recursion, as described here:
https://elixir-lang.org/getting-started/recursion.html
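For illustration, a recursive version along the lines suggested might look roughly like the sketch below. The module name `FrequencySketch` and the private helper `count/2` are hypothetical and are not code from this PR or from Statistex; the sketch assumes `samples` is a list.

```elixir
defmodule FrequencySketch do
  # Hypothetical module, for illustration only; not part of Statistex.

  # Public entry point: start the recursion with an empty map of counts.
  def frequency_distribution(samples) when is_list(samples) do
    count(samples, %{})
  end

  # Base case: no samples left, so return the accumulated counts.
  defp count([], counts), do: counts

  # Recursive case: bump the count for the head, then recurse on the tail.
  defp count([sample | rest], counts) do
    count(rest, Map.update(counts, sample, 1, &(&1 + 1)))
  end
end

FrequencySketch.frequency_distribution([1, 2, 2, 3, 3, 3])
# => %{1 => 1, 2 => 2, 3 => 3}
```

The recursion is tail-recursive and does the same single pass over the list as the `Enum.reduce/3` version, so the difference is mostly one of style.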
Also, it's important to point out that, for large samples, Enum will become very slow. One should consider using Stream or even, for very large data, something GenStage-based such as Flow: https://hexdocs.pm/flow/Flow.html
Bear in mind that Erlang was not built for mathematical computation, so its performance won't be as good as that of other languages (Python or Java, for example). It was built for communication, parallel processing and scalability.
Take a look here: https://stackoverflow.com/questions/13629142/why-is-erlang-slower-than-java-on-all-these-small-math-benchmarks#13629562
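As a rough sketch of the Stream idea, the counting step can consume a lazy enumerable so that earlier transformations do not build intermediate lists. The module name `LazyFrequencySketch` is hypothetical, and nothing here has been benchmarked; whether it is actually faster than the plain Enum version depends on the workload, since the counting itself is a single pass either way.

```elixir
defmodule LazyFrequencySketch do
  # Hypothetical module, for illustration only; not part of Statistex.

  # Accepts any enumerable, including a lazy Stream. Elements are pulled
  # one at a time, so upstream Stream steps are never materialised as lists.
  def frequency_distribution(samples) do
    Enum.reduce(samples, %{}, fn sample, counts ->
      Map.update(counts, sample, 1, &(&1 + 1))
    end)
  end
end

# Stream.map/2 is lazy; the work only happens when Enum.reduce/3 runs.
1..9
|> Stream.map(&rem(&1, 3))
|> LazyFrequencySketch.frequency_distribution()
# => %{0 => 3, 1 => 3, 2 => 3}
```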
Hi there,
thanks for your input.
re functional: For me, FP is more about separating side effects from pure functions, and maybe immutable data structures. I'm quite familiar with the concept of recursion and also use it in this library. I know lots of FP code uses recursion, but reduce is also arguably very functional. If you have a recursive implementation of this function that is more readable and/or faster, I'm happy to accept PRs.
re performance: I'm also aware of the performance trade-offs of Erlang; please refer to the performance section of the README. Parallelization is currently not really a focus of Statistex but might be an option later. The major client of Statistex (Benchee) already calculates the statistics in parallel for multiple data sets, so parallelizing inside Statistex would be unlikely to yield performance improvements.
Or, well, extract statistics functions we had already built but so far only used in the calculation of other values :)