@dependabot dependabot bot commented on behalf of github Feb 19, 2025

Bumps apache-beam from 2.62.0 to 2.63.0.

Release notes

Sourced from apache-beam's releases.

Beam 2.63.0 release

We are happy to present the new 2.63.0 release of Beam. This release includes both improvements and new functionality. See the download page for this release.

For more information on changes in 2.63.0, check out the detailed release notes.

I/Os

  • Support gcs-connector 3.x+ in GcsUtil (#33368)
  • Support for X source added (Java/Python) (#X).
  • Introduced --groupFilesFileLoad pipeline option to mitigate side-input related issues in BigQueryIO batch FILE_LOAD on certain runners (including Dataflow Runner V2) (Java) (#33587).

New Features / Improvements

  • Add BigQuery vector/embedding ingestion and enrichment components to apache_beam.ml.rag (Python) (#33413).
  • Upgraded to protobuf 4 (Java) (#33192).
  • [GCSIO] Added retry logic to each batch method of the GCS IO (Python) (#33539)
  • [GCSIO] Enable recursive deletion for GCSFileSystem Paths (Python) (#33611).
  • External, Process based Worker Pool support added to the Go SDK container. (#33572)
  • Support the Process Environment for execution in the Go SDK. (#33651)
  • Prism
    • Prism now uses the same single port for both pipeline submission and execution on workers. Requests are differentiated by worker-id. (#33438)
      • This avoids port starvation and provides clarity on port use when running Prism in non-local environments.
    • Support for @RequiresTimeSortedInputs added. (#33513)
    • Initial support for AllowedLateness added. (#33542)
    • The Go SDK's inprocess Prism runner (AKA the Go SDK default runner) now supports non-loopback mode environment types. (#33572)
    • Support the Process Environment for execution in Prism (#33651)
    • Support the AnyOf Environment for execution in Prism (#33705)
      • This improves support for developing Xlang pipelines, when using a compatible cross language service.
  • Partitions are now configurable for the DaskRunner in the Python SDK (#33805).
  • [Dataflow Streaming] Enable Windmill GetWork Response Batching by default (#33847).
    • With this change, user workers request batched GetWork responses from the backend, and the backend sends multiple WorkItems in the same response proto.
    • The feature can be disabled by passing --windmillRequestBatchedGetWorkResponse=false
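
For launch scripts that assemble pipeline arguments programmatically, the opt-out described in the last item above could be applied with a small helper. This is only a sketch: the flag name and the `false` value come from the release notes, while the helper name `disable_batched_get_work` and the idea of a Python wrapper around the argument list are assumptions made for illustration.

```python
# Flag name taken verbatim from the release notes.
FLAG = "--windmillRequestBatchedGetWorkResponse"


def disable_batched_get_work(args):
    """Return a new argument list with the batching opt-out set exactly once.

    Hypothetical helper: it only edits a list of command-line strings and
    does not call any Beam API.
    """
    # Drop any earlier setting of the flag so the result is unambiguous.
    kept = [a for a in args if a != FLAG and not a.startswith(FLAG + "=")]
    return kept + [FLAG + "=false"]


print(disable_batched_get_work(["--runner=DataflowRunner", FLAG + "=true"]))
```

The helper returns a new list rather than mutating its input, so the original arguments stay available if batching should remain enabled for other jobs.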

Breaking Changes

  • AWS V1 I/Os have been removed (Java). As part of this, x-lang Python Kinesis I/O has been updated to consume the V2 IO and it also no longer supports setting producer_properties (#33430).
  • Upgraded to protobuf 4 (Java) (#33192), but forced Debezium IO to use protobuf 3 (#33541) because Debezium clients are not protobuf 4 compatible. This may cause conflicts when using clients which are only compatible with protobuf 4.
  • Minimum Go version for Beam Go updated to 1.22.10 (#33609)

Bugfixes

  • Fix data loss issues when reading gzipped files with TextIO (Python) (#18390, #31040).
  • [BigQueryIO] Fixed an issue where Storage Write API sometimes doesn't pick up auto-schema updates (#33231)

... (truncated)

Changelog

Sourced from apache-beam's changelog.

[2.63.0] - 2025-02-18

I/Os

  • Support gcs-connector 3.x+ in GcsUtil (#33368)
  • Support for X source added (Java/Python) (#X).
  • Introduced --groupFilesFileLoad pipeline option to mitigate side-input related issues in BigQueryIO batch FILE_LOAD on certain runners (including Dataflow Runner V2) (Java) (#33587).

New Features / Improvements

  • Add BigQuery vector/embedding ingestion and enrichment components to apache_beam.ml.rag (Python) (#33413).
  • Upgraded to protobuf 4 (Java) (#33192).
  • [GCSIO] Added retry logic to each batch method of the GCS IO (Python) (#33539)
  • [GCSIO] Enable recursive deletion for GCSFileSystem Paths (Python) (#33611).
  • External, Process based Worker Pool support added to the Go SDK container. (#33572)
  • Support the Process Environment for execution in the Go SDK. (#33651)
  • Prism
    • Prism now uses the same single port for both pipeline submission and execution on workers. Requests are differentiated by worker-id. (#33438)
      • This avoids port starvation and provides clarity on port use when running Prism in non-local environments.
    • Support for @RequiresTimeSortedInputs added. (#33513)
    • Initial support for AllowedLateness added. (#33542)
    • The Go SDK's inprocess Prism runner (AKA the Go SDK default runner) now supports non-loopback mode environment types. (#33572)
    • Support the Process Environment for execution in Prism (#33651)
    • Support the AnyOf Environment for execution in Prism (#33705)
      • This improves support for developing Xlang pipelines, when using a compatible cross language service.
  • Partitions are now configurable for the DaskRunner in the Python SDK (#33805).
  • [Dataflow Streaming] Enable Windmill GetWork Response Batching by default (#33847).
    • With this change, user workers request batched GetWork responses from the backend, and the backend sends multiple WorkItems in the same response proto.
    • The feature can be disabled by passing --windmillRequestBatchedGetWorkResponse=false

Breaking Changes

  • AWS V1 I/Os have been removed (Java). As part of this, x-lang Python Kinesis I/O has been updated to consume the V2 IO and it also no longer supports setting producer_properties (#33430).
  • Upgraded to protobuf 4 (Java) (#33192), but forced Debezium IO to use protobuf 3 (#33541) because Debezium clients are not protobuf 4 compatible. This may cause conflicts when using clients which are only compatible with protobuf 4.
  • Minimum Go version for Beam Go updated to 1.22.10 (#33609)

Bugfixes

  • Fix data loss issues when reading gzipped files with TextIO (Python) (#18390, #31040).
  • [BigQueryIO] Fixed an issue where Storage Write API sometimes doesn't pick up auto-schema updates (#33231)
  • Prism
    • Fixed an edge case where Bundle Finalization might not become enabled. (#33493).
    • Fixed session window aggregation, which wasn't being performed per-key. (#33542).
  • [Dataflow Streaming Appliance] Fixed commits failing with KeyCommitTooLargeException when a key outputs >180MB of results (#33588).
  • Fixed a Dataflow template creation issue that ignores template file creation errors (Java) (#33636)
  • Correctly documented Pane Encodings in the portability protocols (#33840).
  • Fixed the user mailing list address (#26013).

... (truncated)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [apache-beam](https://github.com/apache/beam) from 2.62.0 to 2.63.0.
- [Release notes](https://github.com/apache/beam/releases)
- [Changelog](https://github.com/apache/beam/blob/master/CHANGES.md)
- [Commits](apache/beam@v2.62.0...v2.63.0)

---
updated-dependencies:
- dependency-name: apache-beam
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
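
The `update-type: version-update:semver-minor` entry in the commit metadata above follows from comparing the two version strings, 2.62.0 and 2.63.0. A minimal sketch of that comparison is shown here. It is not Dependabot's implementation: the function names `parse_version` and `update_type` are hypothetical, and the code assumes plain three-part numeric versions with no pre-release suffixes.

```python
def parse_version(text):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def update_type(old, new):
    """Name the most significant version component that differs."""
    old_v, new_v = parse_version(old), parse_version(new)
    for label, before, after in zip(("major", "minor", "patch"), old_v, new_v):
        if before != after:
            return "version-update:semver-" + label
    # Identical versions: nothing to update.
    return "no-change"


print(update_type("2.62.0", "2.63.0"))  # version-update:semver-minor
```

Parsing the components as integers matters: comparing the raw strings would order "2.9.0" after "2.10.0".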
@dependabot dependabot bot added the dependencies (Pull requests that update a dependency file) and python (Pull requests that update Python code) labels Feb 19, 2025
@liferoad liferoad merged commit 1ee5181 into main Feb 19, 2025
8 checks passed
@dependabot dependabot bot deleted the dependabot/pip/apache-beam-2.63.0 branch February 19, 2025 13:33