sync with open source how #118

Draft
wants to merge 4,461 commits into
base: li_trunk
lesterhaynes
Please add a meaningful description for your change here


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make the review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI.

lostluck and others added 21 commits May 28, 2024 20:48
* [prism] Add basic processing time queue.

* Initial residual handling refactor.

* Re-work teststream initialization. Remove pending element race.

* touch up

* rm merge duplicate

* Simplify watermark hold tracking.

* First successful run!

* Remove duplicated test run.

* Deduplicate processing time heap.

* rm debug text

* Remove some debug prints, cleanup.

* tiny todo cleanup

* ProcessingTime working most of the time!

* Some cleanup

* try to get github suite to pass #1

* touch

* reduce counts a bit, filter tests some.

* Clean up unrelated state changes. Clean up comments somewhat.

* Filter out dataflow incompatible test.

* Refine processing time event comment.

* Remove test touch.

---------

Co-authored-by: lostluck <13907733+lostluck@users.noreply.github.com>
… lower batching limit for rewrite operations which are copying. (#31410)
Signed-off-by: Jeffrey Kinard <jeff@thekinards.com>
* Refactor RowMutationInformation to use string type

* Remove unnecessary test

* Add javadoc

* Add segment too large test cases

* Add hex based test cases to integration test
* Add ApplyBucketsWithInterpolation TFTransform

* Update sdks/python/apache_beam/ml/transforms/tft.py

Co-authored-by: tvalentyn <tvalentyn@users.noreply.github.com>

* add tft documentation link

* change docstring wording around bucket_boundaries

* Update sdks/python/apache_beam/ml/transforms/tft.py

Co-authored-by: tvalentyn <tvalentyn@users.noreply.github.com>

---------

Co-authored-by: tvalentyn <tvalentyn@users.noreply.github.com>
This is useful for cases where the side input may be too large for our
default caching infrastructure but the user would nonetheless prefer to
spend the memory on keeping the entire object live.

This can be especially useful for Maps, where it may be much more efficient
than doing point lookups.
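
The trade-off described in this commit message can be illustrated with a minimal sketch. It does not use the Beam API: `PointLookupView`, `InMemoryView`, and `enrich` are hypothetical names introduced only for illustration, and the backing store is assumed to be a plain dict. The first view pays one fetch per key, while the second spends memory to hold the whole map and answers every lookup locally.

```python
from typing import Callable, Iterable, List, Tuple


class PointLookupView:
    """Hypothetical view that fetches each key from a backing store on demand."""

    def __init__(self, fetch: Callable[[str], int]) -> None:
        self._fetch = fetch
        self.fetch_calls = 0  # counts round trips to the backing store

    def get(self, key: str) -> int:
        self.fetch_calls += 1
        return self._fetch(key)


class InMemoryView:
    """Hypothetical view that loads the whole map once and keeps it live."""

    def __init__(self, entries: Iterable[Tuple[str, int]]) -> None:
        # One bulk load; every later lookup is a local dict access.
        self._data = dict(entries)

    def get(self, key: str) -> int:
        return self._data[key]


def enrich(view, keys: Iterable[str]) -> List[int]:
    """Look up every key through the given view."""
    return [view.get(k) for k in keys]
```

Both views return the same values; the difference is that the point-lookup view performs one fetch per element, which is the cost the in-memory option is meant to avoid for map-shaped side inputs.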
This is the first transform in the (alphabetical) list, so it'd
be good to not have it empty.

Also produce slightly nicer examples for repeated arguments.
* Update bigquery.py

fix #31372

* fix lint
* Don't re-encode byte[] values in SortValues transform

* checkstyle

* Apply code review comments
* add support for FlinkJobServer configuration

* remove hardcoded timeout for FlinkPortableClient
* Emit a warning when large elements are detected.

* Reduce logging frequency

* lint
* Use bytes.

* Update sdks/python/apache_beam/runners/worker/data_plane.py
francisohara24 and others added 30 commits July 1, 2024 12:28
…#31713)

* Remove testRuntimeMigration configuration for test-utils dependencies

* Fix test dependency direct-java configuration
Co-authored-by: Lahari Guduru <lahariguduru@google.com>
* Add support for BasicAuth to Solace

* Address PR comments

* Use `checkStateNotNull`
This allows one to define new transforms with YAML that can be
imported and used in other pipelines.
* Disable caching for one more workflow

* Disable on setup-environment as well
jackson-datatype-joda already pulls joda-time 2.10.14. Sync version
* add missing transitive dependencies

* Fix analyzeClassesDependencies
* Fix Python Playground Dockerfile

* Update version to 11.0.23
grpc-census, grpc-protobuf-lite and grpc-xds are now part of the
google_cloud_platform_libraries_bom
* Replace LGPL dep in Go SDK with an MIT alternative

* Add update in CHANGES.md
…with legacy runner by building on top of MultimapState. (#31453)