Reduce the effects of Garbage Collection #1
Comments
Looked into this. So for this use case - that might work out or not, we'd either reimplement the relevant parts of It shouldn't be that much, just more than I can fathom in an evening after a long long day :) (wanted to give this a head start before maybe tackling memory measurements, as it should improve results, but yeah, probably not now :) ) |
Which is pretty trivial. Just spawn processes and link/monitor them and wait for messages or timeouts. |
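A minimal sketch of the "spawn, monitor, wait for a message or a timeout" approach described above. The module name `IsolatedRun` and its `run/2` function are hypothetical and are not part of Benchee; only standard Kernel and Process functions are used.

```elixir
defmodule IsolatedRun do
  # Hypothetical helper: runs `fun` in a fresh, monitored process and
  # waits for its result, a crash, or a timeout (in milliseconds).
  def run(fun, timeout \\ 5_000) when is_function(fun, 0) do
    parent = self()
    ref = make_ref()

    # spawn_monitor/1 returns the pid together with a monitor reference.
    {pid, monitor} = spawn_monitor(fn -> send(parent, {ref, fun.()}) end)

    receive do
      {^ref, result} ->
        # Drop the monitor and any :DOWN message that already arrived.
        Process.demonitor(monitor, [:flush])
        {:ok, result}

      {:DOWN, ^monitor, :process, ^pid, reason} ->
        # The process died before it could send a result.
        {:error, reason}
    after
      timeout ->
        Process.exit(pid, :kill)
        Process.demonitor(monitor, [:flush])
        {:error, :timeout}
    end
  end
end
```

Compared to a plain async/await call, this leaves room to pass custom spawn options later, at the price of handling the result, crash and timeout cases by hand.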
Yup I just didn't do this a whole lot yet 😅 I guess it's a good way to add to my experience 😁 It's at least "slightly" more difficult than pure async/await and I'd hoped that there'd be an interface into those that we could just use |
In thinking about this before, I wondered if this might be something useful to add to Elixir itself. Maybe we should propose this feature for |
Yeah I saw a couple of different ways Also I see an argument being made that if you care that much about specific process options maybe you shouldn't be using a high level abstraction anyways 👼 However, pretty sure we'd still need to implement it ourselves here - that'd be elixir 1.6 at the earliest I believe and until we wanna drop support for everything lower than that... :| |
Here's an idea for a different approach. As I understand, the problem with cost of garbage collection is that it is spread unevenly between runs. Disabling GC completely is one way of evening it out - making gc always run is another. This means the execution would be slower than it would normally be, but it would be consistent. The way this could be achieved: during warmup, don't run GC and allow the heap to grow to the desired size, manually run GC with |
👋 Thanks for the input, Michal! Iirc we actually run garbage collection before warmup and before real execution time. The idea to run garbage collection after every invocation - without counting it into the total runtime - is there and is already possible through the Counting garbage collection into the runtime is undesirable imo. When you look at the example above, with an input size of 1k you can see that the sample size is more than 10 times smaller, which also means that both together are more than 10 times slower (due to the overhead GC incurs). So especially for small benchmarks that'd leave you without any distinction, as GC makes up the majority of the run time. At the same time, the behaviour - if wanted - is easy to do on the user side by just adding |
Ah, I saw this discussion some time ago (I think you linked it on twitter), but missed it now when I was looking through issues. Maybe a setting like |
Huh, quite the intricate/interesting ideas you got! :) I like the last idea a lot, it sounds nice and self-tuning. As long as people don't benchmark random stuff (which I'd like to add btw. - e.g. the data generation from property based testing) the intervals in which GCs occur could/should be quite stable, but then again with a growing and shrinking heap it might be tricky. A first version can very well be "do it yourself in an after_each however you see fit" and then we can still implement builtin configuration later - which is also nice as it can be shown as part of the benchmark configuration then. |
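The "do it yourself in an after_each" variant could look roughly like the sketch below. It assumes a Benchee version that supports the `after_each` hook, Elixir 1.12+ for `Mix.install/1`, and that the hook runs in the same process as the benchmarked function (otherwise the forced collection would hit the wrong process). The job name and the `warmup`/`time` values are made up for illustration.

```elixir
# Assumption: Benchee 1.x is available from Hex.
Mix.install([{:benchee, "~> 1.0"}])

Benchee.run(
  %{
    # Illustrative job only; any benchmarked function works here.
    "build list" => fn -> Enum.to_list(1..1_000) end
  },
  warmup: 1,
  time: 2,
  # The hook receives the return value of the benchmarked function.
  # It is ignored here; the hook only forces a collection of the
  # calling process so the next invocation starts from a cleaner heap.
  after_each: fn _result -> :erlang.garbage_collect() end
)
```

Because hooks are not counted into the measured run time, this trades a smaller sample size for less GC noise inside individual measurements.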
One of the other issues with the garbage collection as it stands now is it makes accurately measuring memory usage of a particular scenario extremely unreliable. In my earlier spikes on measuring that sometimes the net memory usage for a scenario would even be measured as a negative number since garbage collection was running randomly. So while this is inconvenient for measuring runtimes, it actually feels like a blocker when it comes to measuring memory usage. I even just learned that the memory measurement feature has been disabled in benchfella since they've run into the same issues we have. |
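To illustrate why such measurements can come out negative, here is a naive before/after sketch. `NaiveMemory.measure/1` is a hypothetical helper, not Benchee's implementation. If the runtime happens to collect garbage while `fun` runs, the process can end up smaller than it started, so the delta drops below zero.

```elixir
defmodule NaiveMemory do
  # Hypothetical helper: reports the change in the calling process's
  # memory footprint (in bytes) across one invocation of `fun`.
  def measure(fun) when is_function(fun, 0) do
    {:memory, before_run} = Process.info(self(), :memory)
    result = fun.()
    {:memory, after_run} = Process.info(self(), :memory)

    # The delta is NOT the amount allocated by `fun`: a garbage
    # collection in between can shrink the heap and make it negative.
    {after_run - before_run, result}
  end
end
```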
We had an experiment in #160 - that didn't seem to help that much - sometimes even made it worse. I checked out that branch again and determined that a simple
You take a big hit in sample size (depending on what you do) but that is ok if you really wanna avoid gc. Tuning warmup time up so that the process can grow to a suitable size better might also help. |
Micro benchmarks in particular can be affected by garbage collection: single runs will be much slower than the others, leading to a skyrocketing standard deviation and unreliable measurements. Sadly, to the best of my knowledge, one can’t turn off GC on the BEAM.
The best breadcrumb to achieve anything like this so far:
This would then go into a new configuration option like:
avoid_gc: true/false
Would also need testing with existing benchmarks to see the effect on standard deviation etc. - likely a large-ish operation :)
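One way such an option might work under the hood, sketched under assumptions: spawn the measured function in a process with a large `min_heap_size`, so the heap rarely needs to grow and collections become less likely. This does not turn GC off. `BigHeap.run/2` and the default of 1,000,000 words are hypothetical; `Process.spawn/2` and the `:garbage_collection` process info item are standard.

```elixir
defmodule BigHeap do
  # Hypothetical helper: runs `fun` in a process with a large initial
  # heap (size given in words, not bytes) and returns its result plus
  # the process's minor GC counter.
  def run(fun, min_heap_words \\ 1_000_000) when is_function(fun, 0) do
    parent = self()
    ref = make_ref()

    {pid, monitor} =
      Process.spawn(
        fn ->
          result = fun.()
          {:garbage_collection, info} = Process.info(self(), :garbage_collection)
          # :minor_gcs is only a rough indicator of GC activity.
          send(parent, {ref, result, Keyword.fetch!(info, :minor_gcs)})
        end,
        [:monitor, min_heap_size: min_heap_words]
      )

    receive do
      {^ref, result, minor_gcs} ->
        Process.demonitor(monitor, [:flush])
        {:ok, result, minor_gcs}

      {:DOWN, ^monitor, :process, ^pid, reason} ->
        {:error, reason}
    end
  end
end
```

Comparing the reported counter for a small and a large `min_heap_words` value on the same function gives a first impression of whether the setting helps for a given benchmark.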