Add more functions #3

Merged (4 commits) on Jul 5, 2019

PragTob (Member) commented on Jul 3, 2019:

Or, well, extract statistics functions we had already built but so far only used in the calculation of other values :)


    # Count how often each distinct sample occurs, returning a map of sample => count.
    def frequency_distribution(samples) do
      Enum.reduce(samples, %{}, fn sample, counts ->
        Map.update(counts, sample, 1, fn old_value -> old_value + 1 end)
      end)
    end


💅 suggestion: This style is not exactly functional. Consider writing a private function based on recursion, as described here:
https://elixir-lang.org/getting-started/recursion.html


Also, it's important to point out that Enum can become very slow for large samples. Consider using Stream or, for very large data sets, something built on GenStage such as Flow: https://hexdocs.pm/flow/Flow.html
Bear in mind that Erlang was not built for mathematical computation, so it won't perform as well as some other languages (Python or Java, for example). It was built for communication, parallel processing, and scalability.
Take a look here: https://stackoverflow.com/questions/13629142/why-is-erlang-slower-than-java-on-all-these-small-math-benchmarks#13629562
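
For reference, a minimal sketch of how the reduce-based function from the diff behaves with lazy input. `Enum.reduce/3` accepts any enumerable, so the same function can consume a `Stream` without an intermediate list being built first. The module name `FrequencySketch` and the sample data are hypothetical, and the sketch only illustrates lazy input; it makes no claim about speed.

```elixir
defmodule FrequencySketch do
  # Same reduce-based counting as in the diff; works for any Enumerable.
  def frequency_distribution(samples) do
    Enum.reduce(samples, %{}, fn sample, counts ->
      Map.update(counts, sample, 1, fn old_value -> old_value + 1 end)
    end)
  end
end

# A lazy stream: values are produced one at a time as reduce pulls them,
# so the mapped samples are never materialized as a list.
lazy_samples = Stream.map(1..1_000_000, fn n -> rem(n, 3) end)

FrequencySketch.frequency_distribution(lazy_samples)
```

On Elixir 1.10 and later, `Enum.frequencies/1` from the standard library performs the same counting.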

PragTob (Member, Author) commented:

Hi there,

thanks for your input.

re functional: For me, FP is more about separating side effects from pure functions, and maybe immutable data structures. I'm quite familiar with the concept of recursion and also use it in this library. I know lots of FP code uses recursion, but reduce is also arguably very functional. If you have a recursive implementation of this function that is more readable and/or faster, I'm happy to accept PRs.
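
To make the comparison concrete, here is a rough sketch of what an explicitly recursive version might look like. It is not code from this PR or from the library; the module name `RecursiveFrequencySketch` and the helper `do_count/2` are hypothetical.

```elixir
defmodule RecursiveFrequencySketch do
  # Public entry point: start with an empty map of counts.
  def frequency_distribution(samples) when is_list(samples) do
    do_count(samples, %{})
  end

  # Base case: no samples left, return the accumulated counts.
  defp do_count([], counts), do: counts

  # Recursive case: count the head, then recurse on the tail (a tail call).
  defp do_count([sample | rest], counts) do
    do_count(rest, Map.update(counts, sample, 1, &(&1 + 1)))
  end
end

RecursiveFrequencySketch.frequency_distribution([1, 2, 2, 3, 3, 3])
# => %{1 => 1, 2 => 2, 3 => 3}
```

Note that this sketch pattern-matches on lists only, whereas the `Enum.reduce/3` version accepts any enumerable.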

re performance: I'm also aware of the performance trade-offs of Erlang; please refer to the performance section of the README. Parallelization is currently not really a focus of Statistex but might be an option later. The major client of Statistex (benchee) already calculates the statistics in parallel for multiple data sets, so parallelizing within Statistex would be unlikely to yield performance improvements.
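
As an illustration of parallelizing across data sets rather than within one, a sketch using `Task.async_stream/3` could look like the following. This is not benchee's actual code; `ParallelSketch`, `summarize/1`, and `summarize_all/1` are hypothetical stand-ins, and the "statistics" computed here are deliberately trivial.

```elixir
defmodule ParallelSketch do
  # Hypothetical stand-in for a per-data-set statistics calculation.
  def summarize(samples) do
    %{sample_size: length(samples), total: Enum.sum(samples)}
  end

  # One task per data set; each data set is processed sequentially inside
  # its own task. Results come back in input order by default.
  def summarize_all(data_sets) do
    data_sets
    |> Task.async_stream(&summarize/1)
    |> Enum.map(fn {:ok, summary} -> summary end)
  end
end

ParallelSketch.summarize_all([[1, 2, 3], [10, 20], [5]])
```

With every data set already occupying its own process, splitting the work inside a single data set would mostly add coordination overhead once there are at least as many data sets as cores.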

PragTob merged commit d1f0913 into master on Jul 5, 2019
PragTob deleted the add-more-functions branch on July 5, 2019 13:49