@steveniemitz
Contributor

This fixes the locking in ProxyInvocationHandler.as, DoFnSignatures.getSignature, and ByteBuddyDoFnInvokerFactory.getByteBuddyInvokerConstructor from #22161.

The fixes to DoFnSignatures and ByteBuddyDoFnInvokerFactory are trivial: each changes a Map to a ConcurrentHashMap.
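
For illustration only, here is a minimal sketch of that kind of change. `SignatureCacheSketch`, its `String` "signature" value, and the `computations` counter are hypothetical stand-ins, not the actual Beam code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a per-class cache; not the real DoFnSignatures. */
final class SignatureCacheSketch {
  // Counts how often the (pretend) expensive computation actually runs.
  static final AtomicInteger computations = new AtomicInteger();

  // Previously this would be a plain Map guarded by a lock on every call.
  // Retrievals from a ConcurrentHashMap generally do not block.
  private static final Map<Class<?>, String> CACHE = new ConcurrentHashMap<>();

  static String getSignature(Class<?> clazz) {
    // Fast path: a cache hit needs no lock.
    String cached = CACHE.get(clazz);
    if (cached != null) {
      return cached;
    }
    // Slow path: computeIfAbsent applies the function at most once per key.
    return CACHE.computeIfAbsent(
        clazz,
        c -> {
          computations.incrementAndGet();
          return "signature:" + c.getName();
        });
  }
}
```

In steady state every call should take the fast path, which is where the contention on the old lock came from.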

ProxyInvocationHandler is more involved, since multiple data structures must be updated atomically on a cache miss. To do this, we extract the relevant structures into an immutable class. On a cache miss, we construct a new instance of it with new (also immutable) copies of the updated structures. The trade-off is slightly more garbage on a cache miss (since the structures are copied rather than mutated). However, misses ~never occur in steady state, and the structures in question are generally small (~100s of elements), so the extra cost should be negligible.
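
A rough sketch of the pattern, with assumptions called out: `ComputedStateSketch`, `Snapshot`, and the two fields are hypothetical names, and the use of a `volatile` field plus a `synchronized` slow path is one way to publish the new instance, not necessarily what the PR does.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of swapping an immutable snapshot on a cache miss. */
final class ComputedStateSketch {
  /** Immutable bundle of structures that must always be observed together. */
  static final class Snapshot {
    final Set<Class<?>> knownInterfaces;
    final Map<String, String> getterToProperty;

    private Snapshot(Set<Class<?>> knownInterfaces, Map<String, String> getterToProperty) {
      this.knownInterfaces = Collections.unmodifiableSet(knownInterfaces);
      this.getterToProperty = Collections.unmodifiableMap(getterToProperty);
    }

    static Snapshot empty() {
      return new Snapshot(new HashSet<>(), new HashMap<>());
    }

    /** Returns a new snapshot that also contains iface; this instance is unchanged. */
    Snapshot with(Class<?> iface, Map<String, String> newGetters) {
      Set<Class<?>> ifaces = new HashSet<>(knownInterfaces);
      ifaces.add(iface);
      Map<String, String> getters = new HashMap<>(getterToProperty);
      getters.putAll(newGetters);
      return new Snapshot(ifaces, getters);
    }
  }

  // Readers see either the old snapshot or the new one, never a mix of both.
  private volatile Snapshot state = Snapshot.empty();

  Snapshot current() {
    return state;
  }

  void register(Class<?> iface, Map<String, String> getters) {
    if (state.knownInterfaces.contains(iface)) {
      return; // cache hit: no lock taken
    }
    synchronized (this) { // cache miss: copy, then publish with a single write
      Snapshot s = state;
      if (!s.knownInterfaces.contains(iface)) {
        state = s.with(iface, getters);
      }
    }
  }
}
```

The copy in `with` is the extra garbage mentioned above; it is only paid on a miss.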

The impact of these changes is very significant: without them, my test job spends ~30% of its time in the "start bundle" state; with them (as well as the ExecutionStateSampler changes), that drops to ~1-2%.

R: @lukecwik


@github-actions github-actions bot added the java label Jul 5, 2022
@steveniemitz
Contributor Author

Run Java PreCommit

@steveniemitz
Contributor Author

The Pulsar tests are consistently failing again, which seems unrelated to this change:

Caused by: java.lang.IllegalArgumentException: Trying to claim offset 1657045837238 before start of the range [1657045838354, 1657045838364)

Member

@lukecwik lukecwik left a comment

It would be great to add a concurrency test to ensure that things like toString/populateDisplayData/serialize all work when as() is modifying computed properties.

Not sure how easy it will be since you'll need to generate classes at runtime to invoke .as() to.

// exception in writeObject()
@SuppressFBWarnings("SE_BAD_FIELD")
- private final Map<String, BoundValue> options;
+ private final ConcurrentHashMap<String, BoundValue> options;
Member

Suggested change
- private final ConcurrentHashMap<String, BoundValue> options;
+ /**
+  * Enumerating {@code options} must always be done on a copy made before accessing or deriving properties from {@code computedProperties}, since concurrent hash maps are <a href="https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/package-summary.html#Weakly">weakly consistent</a>. This ensures that the keys in {@code options} are always a subset of the properties stored in {@code computedProperties}.
+  */
+ private final ConcurrentHashMap<String, BoundValue> options;
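
To make the ordering in that Javadoc concrete, here is a small sketch. `OptionsEnumerationSketch`, `knownProperties`, and `enumerateConsistently` are hypothetical names, and the concurrent key set stands in for `computedProperties`; the real class is structured differently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration of "copy options first, then read the properties". */
final class OptionsEnumerationSketch {
  // Stand-in for the set of properties known so far.
  private final Set<String> knownProperties = ConcurrentHashMap.newKeySet();
  private final ConcurrentHashMap<String, Object> options = new ConcurrentHashMap<>();

  void set(String property, Object value) {
    // Writers register the property before binding a value to it.
    knownProperties.add(property);
    options.put(property, value);
  }

  /** Returns true if every enumerated option key was a known property. */
  boolean enumerateConsistently() {
    // 1. Copy first. Iterating the live map is weakly consistent, so the
    //    copy is the fixed set of keys we reason about.
    Map<String, Object> copy = new HashMap<>(options);
    // 2. Only then consult the known properties. Anything in the copy was
    //    registered before it was put, so it is visible here.
    return knownProperties.containsAll(copy.keySet());
  }
}
```

Reading the properties first and the options second could observe an option whose property was registered in between, which is the case the comment guards against.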

Co-authored-by: Lukasz Cwik <lcwik@google.com>
@steveniemitz
Contributor Author

> It would be great to add a concurrency test to ensure that things like toString/populateDisplayData/serialize all work when as() is modifying computed properties.
>
> Not sure how easy it will be since you'll need to generate classes at runtime to invoke .as() to.

Hm, yeah what did you have in mind here? I was struggling to think of anything useful to test.

@lukecwik
Member

lukecwik commented Jul 7, 2022

> > It would be great to add a concurrency test to ensure that things like toString/populateDisplayData/serialize all work when as() is modifying computed properties.
> > Not sure how easy it will be since you'll need to generate classes at runtime to invoke .as() to.
>
> Hm, yeah what did you have in mind here? I was struggling to think of anything useful to test.

Generate new classes in a loop, each adding a new getter/setter, call .as() on each, and then set a value on the new property. At the same time, have another thread continuously call toString/populateDisplayData/serialize in a loop, ensuring that we don't throw an exception.
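
Roughly this shape, as a sketch only. It skips the runtime class generation and uses a plain ConcurrentHashMap as a stand-in for the options object; `ConcurrencyStressSketch` and `run` are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical harness: one thread mutates while another keeps reading. */
final class ConcurrencyStressSketch {
  /** Returns the number of completed reader passes; throws if either thread failed. */
  static long run(int writes) {
    Map<String, Integer> target = new ConcurrentHashMap<>(); // stand-in for the options
    AtomicBoolean done = new AtomicBoolean(false);
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      // Writer: keeps adding new "properties", like repeated as() plus a setter call.
      Future<?> writer =
          pool.submit(
              () -> {
                try {
                  for (int i = 0; i < writes; i++) {
                    target.put("property" + i, i);
                  }
                } finally {
                  done.set(true);
                }
              });
      // Reader: keeps calling read-side methods until the writer finishes.
      Future<Long> reader =
          pool.submit(
              () -> {
                long passes = 0;
                do {
                  target.toString();
                  for (Map.Entry<String, Integer> e : target.entrySet()) {
                    e.getValue();
                  }
                  passes++;
                } while (!done.get());
                return passes;
              });
      writer.get(30, TimeUnit.SECONDS);
      // Any exception thrown in the reader surfaces here as an ExecutionException.
      return reader.get(30, TimeUnit.SECONDS);
    } catch (Exception e) {
      throw new RuntimeException("concurrent access failed", e);
    } finally {
      pool.shutdownNow();
    }
  }
}
```

In the real test the writer would generate an interface and call .as() with it, and the reader would call toString/populateDisplayData/serialize on the options.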

@steveniemitz
Contributor Author

> Generate new classes in a loop, each adding a new getter/setter, call .as() on each, and then set a value on the new property. At the same time, have another thread continuously call toString/populateDisplayData/serialize in a loop, ensuring that we don't throw an exception.

Cool, done

@steveniemitz
Copy link
Contributor Author

Run Java PreCommit

@lukecwik lukecwik merged commit 0e14f51 into apache:master Jul 7, 2022
@steveniemitz steveniemitz deleted the lock-tuning branch July 8, 2022 00:56
steveniemitz added a commit to twitter-forks/beam that referenced this pull request Jul 15, 2022
* Optimize locking in several critical-path methods

* spotbugs

* Apply suggestions from code review

Co-authored-by: Lukasz Cwik <lcwik@google.com>

* review comments

* unit test for concurrency

* the buddy of bytes

Co-authored-by: Lukasz Cwik <lcwik@google.com>