[FLINK-32896][Runtime/Coordination] Incorrect Map.computeIfAbsent(..., ...::new)
usage which misinterprets key as initial capacity
#23518
@flinkbot run azure
@wuchong Can you look at it? I think it's a very simple change :)
LGTM
Looks good to me. Thanks for picking it up, @tzy-0x7cf, and thanks to @itonyli for checking it as well. Can you rebase the branch to make CI pass? The error was introduced in master and has already been fixed.
@flinkbot run azure
Could you squash the commits and rebase the branch onto the most recent master? The Flink CI bot is known to have issues with force-pushes. You can try to work around it by pushing an empty commit after you've reorganized and rebased the branch (in a separate push, to be on the safe side).
1a872e2 to 622108a
[FLINK-32896] [Runtime/Coordination] Incorrect `Map.computeIfAbsent(..., ...::new)` usage which misinterprets key as initial capacity
dc96c6d to 057d004
@flinkbot run azure
Thanks Matthias! I'm new to Flink and would like to contribute more, so if there are any tests or simple issues to work on, please feel free to assign them to me!
That's great to hear. You might be lucky if you look for issues that are labeled as "starter" in Flink's Jira. A bit more context around how the Flink community organizes Jira can be found in this Confluence wiki page. Other sources that might be worth reading (in case you haven't done so, yet):
What is the purpose of the change
There are multiple cases in the code which look like this:
map.computeIfAbsent(..., ArrayList::new)
Not only does this create a new collection (here an ArrayList), but computeIfAbsent also passes the map key as argument to the mapping function, so instead of calling the no-args constructor such as new ArrayList<>(), this actually calls the constructor with int initial capacity parameter, such as new ArrayList<>(initialCapacity).
This can lead either to runtime exceptions in case the map key is negative, since the collection constructors reject negative initial capacity values, or it can lead to bad performance if the key (which is misinterpreted as initial capacity) is pretty low, such as 0, or is pretty large and therefore pre-allocates a lot of memory for the collection.
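The failure mode described above can be reproduced with a small standalone sketch. The class name, the `Map<Integer, List<String>>` type, and the keys are hypothetical and chosen only for illustration; they are not taken from the Flink code base.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of Flink.
class ComputeIfAbsentPitfall {

    /** Returns true if the constructor-reference form throws for the given key. */
    static boolean throwsForKey(int key) {
        Map<Integer, List<String>> map = new HashMap<>();
        try {
            // With an Integer key, ArrayList::new binds to ArrayList(int),
            // so this behaves like k -> new ArrayList<>(k).
            map.computeIfAbsent(key, ArrayList::new).add("value");
            return false;
        } catch (IllegalArgumentException e) {
            // ArrayList(int) rejects a negative initial capacity.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsForKey(3));   // key used as capacity 3, no exception
        System.out.println(throwsForKey(-1));  // key used as capacity -1, exception
    }
}
```

The sketch only triggers the exception case. The over-allocation case (a very large key) follows from the same binding, but is left out here because it would simply allocate a large backing array.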
It might be good to replace them with lambda expressions to make this more explicit:
map.computeIfAbsent(..., k -> new ArrayList<>())
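As a minimal sketch of the lambda form in use (again with a hypothetical class name, helper method, and map type that do not come from the Flink code base), the key is ignored by the mapping function, so negative or very large keys no longer influence how the list is created:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of Flink.
class ComputeIfAbsentFix {

    /** Appends value to the list stored under key, creating the list on first use. */
    static void add(Map<Integer, List<String>> map, int key, String value) {
        // The lambda ignores k, so the no-args ArrayList constructor is called.
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> map = new HashMap<>();
        add(map, -1, "a");          // a negative key is fine now
        add(map, -1, "b");          // the existing list is reused
        add(map, 1_000_000, "c");   // a large key does not pre-allocate a large list
        System.out.println(map.get(-1));
        System.out.println(map.get(1_000_000));
    }
}
```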
Brief change log
Replace map.computeIfAbsent(..., ...::new) with map.computeIfAbsent(..., k -> new ...)
Verifying this change
Please make sure both new and modified tests in this PR follow the conventions defined in our code quality guide: https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
This change is already covered by existing tests, such as HsSpillingStrategyUtilsTest.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (no)
Documentation