Octave micro benchmarks don't represent language characteristics #13042
Let me risk another approach on #2412 and #5128.

The Octave benchmark timings look rather extreme in your benchmark table (julialang.org). I have looked at the benchmark code, and it simply does not represent how one would actually program in that language. You write that the micro benchmarks shall “give a sense how … numerical programming in that particular language is … all of the benchmarks are written to test the performance of specific algorithms, expressed in a reasonable idiom in each language”. The Octave micro benchmarks absolutely fail that stated goal.

Let me show you how I would reasonably implement the algorithms in Octave (the code probably also works in proprietary Matlab, but I can't check that). I haven't spent more than a minute thinking about each algorithm.

Could you please clarify what the actual purpose of your benchmarks is? I would be fine with the benchmarks saying: loops and recursion in Octave are slow. However, the benchmark table suggests that, e.g., computing the Mandelbrot set in Octave is much slower than it actually is.
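For a concrete picture of the kind of rewrite meant here, a vectorized Octave version of the Mandelbrot kernel might look roughly like the sketch below. This is an illustrative sketch only, assuming the 80-iteration escape-time convention and the −2…0.5 × −1…1 grid of the reference benchmark; it is not necessarily the code submitted with this issue.

```octave
% Vectorized escape-time Mandelbrot: the grid is iterated as a whole,
% so the only explicit loop is over the iteration count, not over pixels.
function M = mandelperf_vec ()
  maxiter = 80;
  [re, im] = meshgrid (-2.0:0.1:0.5, -1.0:0.1:1.0);
  c = re + 1i * im;
  z = c;
  M = maxiter * ones (size (c));      % points that never escape keep maxiter
  for n = 1:maxiter
    escaped = (abs (z) > 2) & (M == maxiter);
    M(escaped) = n - 1;               % record the escape iteration once
    z = z.^2 + c;
  end
end
```

The per-element arithmetic is the same as in the loop version; the difference is that interpreter overhead is paid once per iteration of the outer loop rather than once per pixel.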
Please reread #5128 (comment). This is well-trod ground.

As stated above, I disagree with that comment for several reasons:
Actually, I do think it would be interesting to compare your implementations; they have a nice balance of vectorizing but not just calling one builtin function to do all the work. It would be good to submit your code as a pull request.
And the Matlab JIT apparently got quite a bit smarter with 2015b, so if we have that on the benchmark machine it'll also be interesting to look at, though we all have better things to do than deal with Matlab's installer.
I am going to prepare a pull request. May I remove the deprecated tests, which have been commented out, to clean up the file?
The same argument has been made about R as well, and about Matlab in the past. The purpose was mainly to be able to judge how Julia compares with other language implementations at writing simple loops and other structures. Clearly, in the case of Matlab, things have improved dramatically; it was reported on julia-users to be competitive on these implementations. Instead of changing the existing Octave benchmarks, I would suggest having a new file that implements these benchmarks the way a current Octave user would write them to get performance. We can have the same for other languages as well. It is possible that Octave will have an LLVM backend in the future (there was a GSoC project, I think), in which case it will also improve on these same benchmarks.
The fact that these "unfair" benchmarks have improved so much in Matlab is a clear market-based indicator that the benchmarks are actually fair – Matlab's customers care enough about the performance of this kind of code that MathWorks spent a lot of money to develop a JIT that can make it faster.
That being said, it seems fine to have alternate vectorized implementations of the benchmarks.
Could you please render a decision (on the mailing list, for example) on what the micro benchmarks shall be for? Roughly, the two options are: 1. compare how the languages perform on the very same low-level implementation (loops, recursion, and so on), i.e. benchmark the compilers/interpreters; or 2. compare how fast each language solves the problems when programmed the way a user of that language actually would.
From a language developer's point of view the first option is a valid one, and when this intent is clearly stated in the text, the benchmark results are actually fair. A user should prefer the second option, because you program to solve problems—you don't program to write loops. (Although some Matlab customers might think differently, as @StefanKarpinski said.) Maybe the only way out is to have both kinds of benchmarks, as suggested by @ViralBShah. The current JIT compiler in Octave is so trivial that you can safely neglect its existence. On the one hand it is a difficult task to make a decent JIT compiler (like the one in Julia); on the other hand, it is rarely needed when you use a high-level language where you don't want to struggle with low-level stuff like loops.

OT: There are also algorithms where Octave is faster than Matlab; see Table 4 on page 9 in this Open Access journal. [jiahao: fixed link]
The original purpose was 1., but over time people have started asking for 2. as well. It is certainly worth having both kinds. We have to figure out how to present the results.
Yes, I'm all in favor of expanded, more comprehensive benchmarks.
Fully disagree. First, JIT is beside the point: you can also combine a good static compiler with array syntax, as in Fortran 90 and various C++ libraries. Second, array programming does not even give you good performance. While I'm sure your implementations are much faster in Octave than what we have, my guess is they're still slower than C or Julia. Third, is that a …
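A small illustration of that second point (a sketch, not code from the benchmark suite): in a vectorized Octave expression, each element-wise operation materializes a temporary array, so even well-vectorized code does more memory traffic than the equivalent compiled loop.

```octave
% Vectorized pi_sum body: concise, but "k .* k" and "1 ./ (...)" each
% materialize a 10000-element temporary before the reduction runs; a
% compiled loop (C, Fortran, Julia) does the same arithmetic with no
% intermediate arrays at all.
k = 1:10000;
s = sum (1 ./ (k .* k));
```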
I think the main point that should be conveyed by the documentation is this: each problem has a different cost function, and there are many, many …
Yes, that's the big takeaway for me also. Julia optimized run-time performance, yes, but these days programmer time is a lot more costly, so the fact that I can get good performance much quicker in Julia than in C/C++ etc. when writing new code is the big win. I like @tbreloff's way of expressing that (minimizing a cost function); it will appeal to the technical/scientific community!
I'm not sure how much clearer we can be about the purpose of these benchmarks than the statement you quoted above.
Your argument seems not to be that we're not measuring what we claim to be, but rather that you don't want us to measure and report it on Julia's home page because the results make Octave look bad. I made this somewhat relevant comment here:
The major failing of these benchmarks seems to be that they compute things that can be computed with vectorized algorithms. There do exist problems that can't be conveniently vectorized, however, and with a little stretch of the imagination you can see that these benchmarks show that those problems will be very slow in Octave whereas they will be as fast as C when written in Julia. This fact has been borne out over and over again as people have ported their iterative or recursive codes from R/Python/Matlab/Octave to Julia and seen 10-1000x speedups.
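A sketch of such a problem (illustrative only, not one of the benchmark kernels): a recurrence in which each value depends on the previous one, so the loop cannot be collapsed into a single array operation.

```octave
% A serially dependent recurrence (the logistic map): x(k+1) is a
% function of x(k), so the loop body cannot be replaced by one
% array operation over the whole vector.
function x = logistic_orbit (x0, r, n)
  x = zeros (1, n);
  x(1) = x0;
  for k = 1:n-1
    x(k+1) = r * x(k) * (1 - x(k));
  end
end
```

A loop like this pays Octave's interpreter overhead on every iteration, while the same loop in Julia or C compiles down to a handful of machine instructions per step.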
It just might be that the description is a little verbose. Maybe it would be good to turn the highlighted statements into bullet points, so they're easy to see, and then add explanation below.
After your comments and after reading the text over and over again, I understand that the only purpose was to benchmark the (JIT) compiler performance of various languages. Under this premise, the Octave implementation is perfectly fair. However, the wording of the text should be changed, because it is easily misinterpreted by me and others (see #2412 and #5128 for examples). Most prominent is the confusion between “algorithm” and “implementation” in the text. IMHO an “algorithm” is a high-level description of how to solve a particular problem, but the JIT compiler benchmarks shall compare particular “implementations” between languages. The difference is that particular implementations of an algorithm could use shortcuts or different patterns that a language has to offer. For these benchmarks they shall not; instead, each implementation shall use the very same sequence of particular operations.
I'd say yes to function calls, array operations, numerical loops, but testing the other topics would call for different benchmarks.
The particular implementation(!) of an algorithm uses loops and function declarations to the extent the language has to offer, completely suppressing idioms of the language. All languages use the same implementation; the point is to compare specific implementations across languages.

The labels in the table also support the confusion between the particular problems being solved and the implementations being used:
- fib should be called recursive_fib (see the sketch below)
- parse_int should be called parse_int_loop
- quicksort should be called in-place_quicksort
- mandel should be called nested_loop_mandel
- pi_sum should be called pi_sum_loop

However, a column header “Implementation” could also serve the purpose.
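To make the algorithm-versus-implementation distinction concrete, here is a sketch of the fib case in Octave (illustrative only; each function would normally live in its own .m file). Both functions compute the same Fibonacci numbers, but only the first is the doubly recursive implementation that a label like recursive_fib would refer to.

```octave
% recursive_fib: the doubly recursive form the micro benchmark pins down.
function f = fib_recursive (n)
  if n < 2
    f = n;
  else
    f = fib_recursive (n - 1) + fib_recursive (n - 2);
  end
end
```

```octave
% An iterative implementation of the same "algorithm" in the loose sense:
% same Fibonacci numbers, but n additions instead of an exponential
% number of recursive calls.
function f = fib_iterative (n)
  a = 0;  b = 1;
  for k = 1:n
    t = a + b;  a = b;  b = t;
  end
  f = a;
end
```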
I see the confusion now. Unfortunately, as far as I understand it, that isn't correct usage of the word "implementation" and the word "algorithm" means what you consider "implementation" to mean. I do think that we should relabel the benchmarks to give them labels that express what they're testing. For some this is a little hard because they test a few things, but I'm sure we can come up with something.
For the purposes of comparison, I've run the modified benchmarks on our benchmark machine.