Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Develop #13

Closed
wants to merge 7 commits into from
Closed

Develop #13

wants to merge 7 commits into from

Conversation

jafingerhut
Copy link
Contributor

Hugo:

Please take a look at these modifications to criterium when you get a chance, and let me know if they or some modified version of them would be interesting to you to pull into your version.

The extra return data is useful to me when recording results from many different Clojure versions and JDKs. The changes to execute-expr reduce the overhead per call to the expression being benchmarked. The change in measurement times is quite noticeable for expressions with short run times.

jafingerhut and others added 7 commits November 3, 2012 06:27
Keeping 8192 previous return values of the benchmarked expression in
memory at once is way too many for expressions that return large data
structures.
… sequence of expressions

The goal is to see whether the results are noticeably different when
executed 'round robin' instead of doing the same expression over and
over for an extended period of time (e.g. 1 min).  Instead try 10
different expressions for 100 millisec each, for example, and then
repeat that 60 times if :sample-count is 60.

One needs to be cautious in interpreting the results, e.g. if one
expression in the sequence tends to leave behind lots of garbage that
won't be collected until a later expression is being executed.
@hugoduncan
Copy link
Owner

Andy,

This looks very interesting! Will try to get to it when conj is over.

@hugoduncan
Copy link
Owner

Thanks for this Andy! All merged.

The memory allocation for the results had been bothering me for a while, so great to see it fixed.

The round robin form looks interesting too - maybe you could add something to the README with a motivation/explanation of when to use it.

@hugoduncan hugoduncan closed this Nov 19, 2012
@jafingerhut
Copy link
Contributor Author

Glad it could be of use.

Note that the memory allocation thing that I fixed in one of my later commits of the group was one that I introduced myself in an earlier commit, I think (storing 8192 earlier results was way too much, reducing it to 4 was better).

I believe that before my changes, your code was not storing any intermediate results from one iteration to the next, but instead was calculating their hash values in each iteration. Avoiding the hash calculation in the timing loop is what I was hoping to avoid.

So my approach is still probably worse than yours in its use of memory, if the expression return value is large. It may or may not be faster in those cases depending upon how long the original code took to calculate hashes in each iteration, versus how long it takes my latest version to do the garbage collection 4 evaluations later.

I will definitely consider adding something to the README with a motivation of the round robin thing. It was motivated by Rich Hickey's recent message to clojure-dev about how to do microbenchmarking in the face of Java's HotSpot JIT, but I'd prefer to have more results showing it actually produces noticeably different results before I believe it is actually useful.

One down side of the round robin idea is that it leaves garbage from the evaluation of one expression around that might be included in GC time while timing a later expression. It would be good to have a clear example of that, too, in the documentation, as a warning for anyone who tries it.

@hugoduncan
Copy link
Owner

The garbage collection of return values happens in either approach, if I understand correctly.

I assume you mean this message from Rich: https://groups.google.com/d/msg/clojure-dev/0c-VNhEKVkI/Z7R1qKqsfN0J

I'm not convinced that round-robin is really a solution to the issue, though I haven't a better solution at the moment, given the need to maintain timing resolution.

One way of preventing GC interaction would be to force a GC between different expressions, but that possibly leads to other issues.

@hugoduncan
Copy link
Owner

@ztellman This code is now on the develop branch - maybe you could comment as to whether it improves the timing resolution, and if not, then open a top level issue for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants