@lukecwik
Member

The Java Random and ThreadLocalRandom are thread safe and use atomics and/or synchronization internally.

This is for #21250
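
To make the description concrete, here is a minimal sketch of a generator that touches no atomics or locks when it advances. In OpenJDK, `java.util.Random` updates its seed with a compare-and-set loop on an `AtomicLong`; the class below instead keeps its state in a plain field. The class name `UnsyncedXorShift` and the choice of xorshift64 are assumptions for illustration, not what this PR implements.

```java
/**
 * Hypothetical single-threaded xorshift64 generator, for illustration only.
 * State lives in a plain long field, so advancing it uses no atomics or locks.
 * It is NOT thread safe: confine each instance to a single thread.
 */
final class UnsyncedXorShift {
  private long state;

  UnsyncedXorShift(long seed) {
    // xorshift must never hold the all-zero state, so substitute a constant.
    this.state = (seed == 0) ? 0x9E3779B97F4A7C15L : seed;
  }

  /** Advances the state with three shift/xor steps (the 13, 7, 17 triple). */
  long nextLong() {
    long x = state;
    x ^= x << 13;
    x ^= x >>> 7;
    x ^= x << 17;
    state = x;
    return x;
  }

  /** Returns a value in [0, bound) using modulo reduction (slightly biased). */
  int nextInt(int bound) {
    if (bound <= 0) {
      throw new IllegalArgumentException("bound must be positive");
    }
    // Drop the sign bit so the remainder is never negative.
    return (int) ((nextLong() >>> 1) % bound);
  }

  public static void main(String[] args) {
    UnsyncedXorShift rng = new UnsyncedXorShift(42L);
    for (int i = 0; i < 5; i++) {
      System.out.println(rng.nextInt(100));
    }
  }
}
```

The trade-off is that such a generator must be owned by one thread (for example, one instance per worker), since concurrent calls would race on `state`.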


…ation since the Java Random and ThreadLocalRandom are thread safe and use atomics and/or synchronization internally.

This is for apache#21250
@lukecwik changed the title from "Swap to a random number generator that doesn't use atomics/synchronization" to "Swap to a random number generator that doesn't use atomics/synchronization on hot path" on Feb 18, 2023
@codecov

codecov bot commented Feb 18, 2023

Codecov Report

Merging #25551 (e63d7dc) into master (de7eb2d) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #25551   +/-   ##
=======================================
  Coverage   72.65%   72.65%           
=======================================
  Files         758      758           
  Lines      100681   100681           
=======================================
  Hits        73154    73154           
  Misses      26118    26118           
  Partials     1409     1409           
Flag Coverage Δ
python 82.01% <ø> (ø)

Impacted Files Coverage Δ
sdks/python/apache_beam/utils/interactive_utils.py 95.12% <0.00%> (-2.44%) ⬇️
sdks/python/apache_beam/io/localfilesystem.py 90.97% <0.00%> (-0.76%) ⬇️
sdks/python/apache_beam/transforms/combiners.py 93.05% <0.00%> (-0.39%) ⬇️
...hon/apache_beam/runners/worker/bundle_processor.py 94.45% <0.00%> (+0.23%) ⬆️
...python/apache_beam/runners/worker/worker_status.py 76.66% <0.00%> (+1.33%) ⬆️

@github-actions
Contributor

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment "assign set of reviewers".

@lukecwik
Member Author

Run Java PreCommit

@lukecwik
Member Author

Run Java_PVR_Flink_Batch PreCommit

@lukecwik
Member Author

Run Python_Runners PreCommit

@lukecwik
Member Author

Run SQL_Java11 PreCommit

@lukecwik
Member Author

Run SQL_Java17 PreCommit

// samplingCutoff / samplingTokenUpperBound. This algorithm may be refined
// later.
samplingToken = Math.min(samplingToken + 1, SAMPLING_TOKEN_UPPER_BOUND);
return randomGenerator.nextInt(samplingToken) < SAMPLING_CUTOFF;
Contributor

Even if we switch to a thread-local RNG, I'm concerned that calling nextInt on every single value will be more expensive than doing the fast reservoir sampling algorithm that we had here before.

Member Author

Yeah, this is still WIP and I've marked it as such. I'm still trying to root-cause why the existing implementation shows up on FlexWordCount profiles.
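
The cost concern in this thread can be illustrated with a rough sketch that counts RNG calls for two samplers, each keeping about 1 in `period` elements. The previous Beam implementation is not shown in this conversation, so `SkipSampler` below is only an assumption about the general skip-counting idea, not a reconstruction of that code; all class and field names are hypothetical.

```java
/**
 * Hypothetical comparison of two samplers that each keep about 1 in `period`
 * elements. Class and field names are illustrative, not taken from Beam.
 */
final class SamplingSketch {

  /** Calls nextInt for every element, as the snippet under review does. */
  static final class PerElementSampler {
    private final java.util.Random random;
    private final int period;
    long draws; // number of RNG calls made so far

    PerElementSampler(java.util.Random random, int period) {
      this.random = random;
      this.period = period;
    }

    boolean shouldSample() {
      draws++;
      return random.nextInt(period) == 0;
    }
  }

  /** Calls nextInt only when an element is sampled, then counts down a gap. */
  static final class SkipSampler {
    private final java.util.Random random;
    private final int period;
    private int remaining; // elements still to skip before the next sample
    long draws; // number of RNG calls made so far

    SkipSampler(java.util.Random random, int period) {
      this.random = random;
      this.period = period;
    }

    boolean shouldSample() {
      if (remaining > 0) {
        remaining--; // hot path: one decrement, no RNG call
        return false;
      }
      // Gap is uniform in [0, 2 * period - 2], so its mean is period - 1 and
      // one element in `period` is sampled on average.
      remaining = random.nextInt(2 * period - 1);
      draws++;
      return true;
    }
  }

  public static void main(String[] args) {
    PerElementSampler perElement = new PerElementSampler(new java.util.Random(1), 100);
    SkipSampler skip = new SkipSampler(new java.util.Random(1), 100);
    for (int i = 0; i < 1_000_000; i++) {
      perElement.shouldSample();
      skip.shouldSample();
    }
    System.out.println("per-element RNG calls: " + perElement.draws);
    System.out.println("skip-based RNG calls:  " + skip.draws);
  }
}
```

The per-element sampler makes one RNG call per element, while the skip-based one makes one call per sampled element. Uniform gaps do not give the same distribution as independent per-element draws, so this only illustrates the difference in call counts, not statistical equivalence.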

@lukecwik changed the title from "Swap to a random number generator that doesn't use atomics/synchronization on hot path" to "[WIP] Swap to a random number generator that doesn't use atomics/synchronization on hot path" on Feb 21, 2023
@github-actions
Contributor

github-actions bot commented Jun 6, 2023

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@beam.apache.org list. Thank you for your contributions.

@github-actions bot added the stale label on Jun 6, 2023
@github-actions
Contributor

This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions bot closed this on Jun 14, 2023