[BEAM-4468] Go SDK Combiner Lifting cache cap & eviction. #7927
Conversation
Run Go Postcommit

R: @youngoli Please take a look!

Run Go PostCommit

I'm investigating the failure that can only be caught on a real runner. It's certainly something in this PR.
d82f004 to 1de532e
I'm pulling back the invocation optimization since there's something I'm missing there.

Run Go PostCommit
youngoli left a comment
Looks good to me.
As a nitpick, there's a word missing in the function comment to ProcessElement. May be worth fixing real quick before merging.
Good catch! Thanks! I've added it to my next commit, for the now-separated invokers PR, as that should be worth 10-15% overhead.
// once adding dependencies is easier.
// Arbitrary limit until a broader improvement can be demonstrated.
const cacheMax = 2000
if len(n.cache) > cacheMax {
Once the cache size reaches 2000, wouldn't this become a loop of combine 2 elements, evict 2 elements? Would evicting the whole cache at that point be a better solution?
Excellent question!
The initial implementation took that approach, but it ended up providing insufficient combining for extremely large bundles, causing the pessimal case of everything getting sent to the GBK un-combined. E.g. if there are 2001 distinct keys that cycle in any order, and the 2001st key is also the 4002nd key, that strategy leads to no combining at all, since we don't provide any opportunity for the overlapping sets to meet.
The observation then is that we need to provide the opportunity for combining, which means keeping the cache as populated as possible. The bounding is to control memory growth, in particular, for Very Large Bundles.
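That pathological case is easy to reproduce with a small simulation (a sketch; `flushAllHits`, the cap of 4, and the 5-key cycle are illustrative, not SDK code). With a flush-everything policy, the flush always lands just before an evicted key returns, so no element ever finds a cached accumulator to combine with:

```go
package main

import "fmt"

// flushAllHits simulates a lifted-combiner cache that dumps the entire
// cache whenever the cap is reached, and counts how many elements found
// an existing cached entry to combine with.
func flushAllHits(keys []int, limit int) int {
	cache := map[int]bool{}
	hits := 0
	for _, k := range keys {
		if cache[k] {
			hits++ // combined with a cached accumulator
			continue
		}
		if len(cache) >= limit {
			cache = map[int]bool{} // flush everything downstream
		}
		cache[k] = true
	}
	return hits
}

func main() {
	// 5 distinct keys cycling through a cache capped at 4.
	var keys []int
	for i := 0; i < 1000; i++ {
		keys = append(keys, i%5)
	}
	fmt.Println(flushAllHits(keys, 4)) // prints 0: nothing ever combines
}
```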
You're right that the random eviction strategy will sometimes have sequences where keys are being evicted and possibly even a key that's about to arrive will be evicted. However, without some extra buffering between DoFns, that's impossible to guess without far more sophisticated code.
Random is just a good "placebo" algorithm for picking evictees, since anything else is harder to get right.
In particular, it's safe in Go to delete map entries in a loop, and the iteration order is random.
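Both properties are easy to demonstrate (a sketch; `evictN` is an illustrative helper, not SDK code). The Go spec explicitly permits deleting map entries inside a range loop, and since iteration order is unspecified (randomized in practice), the map itself serves as a cheap source of "random" evictees:

```go
package main

import "fmt"

// evictN deletes up to n entries from cache while ranging over it and
// returns the evicted keys. Which keys go varies between runs, because
// Go map iteration order is randomized.
func evictN(cache map[string]int, n int) []string {
	evicted := make([]string, 0, n)
	for k := range cache { // order differs between runs
		if len(evicted) == n {
			break
		}
		delete(cache, k) // deleting mid-iteration is safe per the Go spec
		evicted = append(evicted, k)
	}
	return evicted
}

func main() {
	cache := map[string]int{"a": 1, "b": 2, "c": 3, "d": 4}
	gone := evictN(cache, 2)
	fmt.Println(len(gone), len(cache)) // prints 2 2
}
```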
I'm pretty sure there's a paper somewhere about the efficacy of different eviction patterns on random datasets, but I'm not academically inclined enough to find it.
My gut feeling is that the performance benefits of being able to combine here more often because more keys are still cached would outweigh the performance cost of having to remove elements more often (and it might not even be that often, it depends on how many different keys we might expect the user to use).
I thought about suggesting maybe removing a number of elements (50 or so) at once to avoid this... But then I realized that the performance cost is probably near-identical, just front-loaded to a single removal of 50 elements instead of 50 removals of single elements.
So I agree with Robert here that this is probably not a significant problem.
The 2000 number is also arbitrary; I'm more likely to allow it to be tuned by the runner/user in some way, as it seems a touch low for my use case. Either way, it's significantly better than run-away growth.
There are pathological cases to every strategy and in practice the best way to handle that is to detect which case would be better and swap to it when you're most confident.
In this case I'm referencing the Iocaine Powder/Sicilian reasoning strategy for playing Rock Paper Scissors, though with map evictions there are a few more moves.
I've validated that there's something wrong with this patch: it's over-combining when the key evicted happens to be the key being processed. Fix incoming.
Caps the LiftedCombiner cache at 2000 elements; when additional keys are added, an old one is randomly evicted to the next stage.
This optimization is necessary for very large bundles, which may happen in batch jobs. A random eviction policy was better than simply dumping the whole cache when the cap is hit. This implementation takes advantage of Go's built-in random map iteration to choose evictees. We can probably do better, but it's not an implementation I'd like the SDK to maintain, so that is deferred until adding dependencies is simpler.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
- Choose reviewer(s) and mention them in a comment (R: @username).
- Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.

Post-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.