Fix #37736: Allow composite transforms to use implicit input chaining #37860

Closed
liferoad wants to merge 2 commits into apache:master from liferoad:fix/beam-37736-composite-transform

@liferoad
Contributor

Issue

#37736

When using type: composite in Beam YAML, each sub-transform requires an explicit input, unlike type: chain which automatically passes the output of one transform to the next.

Fix

Modified expand_composite_transform() in sdks/python/apache_beam/yaml/yaml_transform.py to automatically chain sub-transforms when no explicit inputs/outputs are specified, similar to how chain type transforms work.
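
The chaining rule described above can be sketched on plain dictionaries. This is a minimal illustration, not the code in yaml_transform.py: the helper name `chain_implicitly`, the use of sub-transform names instead of uuids, the `"input"` placeholder for the composite's own input, and the choice to leave an existing composite-level `output` untouched are all assumptions made for the example.

```python
import copy


def chain_implicitly(spec):
    """Return a copy of a composite spec with its sub-transforms chained in order.

    Hypothetical helper: it mirrors the behaviour described in this PR,
    not the actual Beam implementation.
    """
    transforms = spec.get("transforms", [])
    # Leave the spec alone if any sub-transform already declares its own wiring.
    if any("input" in t or "output" in t for t in transforms):
        return spec

    result = copy.deepcopy(spec)
    previous = None
    for t in result["transforms"]:
        if previous is None:
            # The first sub-transform reads the composite's own input, if there is one.
            if "input" in spec:
                t["input"] = "input"  # assumed placeholder name
        else:
            # Every later sub-transform reads the one declared just before it.
            t["input"] = previous
        previous = t["name"]

    # Assumption: only fill in the composite's output when none was given.
    if previous is not None and "output" not in result:
        result["output"] = previous
    return result


spec = {
    "type": "composite",
    "input": "upstream",
    "transforms": [
        {"type": "PyMap", "name": "Square", "config": {"fn": "lambda x: x * x"}},
        {"type": "PyMap", "name": "Increment", "config": {"fn": "lambda x: x + 1"}},
    ],
}
chained = chain_implicitly(spec)
```

A spec in which any sub-transform already sets `input` or `output` is returned unchanged, so explicitly wired composites keep their existing behaviour.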

Testing

Added test_composite_implicit_input_chaining test case in yaml_transform_test.py to verify the fix.

…ms/util.py

Replace root logger calls (logging.info, logging.warning, etc.) with a
module-level named logger (_LOGGER = logging.getLogger(__name__)) in
apache_beam.transforms.util. This allows sdk_harness_log_level_overrides
to properly control log levels for this module.
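
Why a named logger matters can be shown with the standard library alone. The sketch below is an illustration under assumptions: the `setLevel` call stands in for whatever sdk_harness_log_level_overrides does in the SDK harness, and `_ListHandler` is a hypothetical handler used only to capture output.

```python
import logging

# Stands in for `_LOGGER = logging.getLogger(__name__)` inside the module.
_LOGGER = logging.getLogger("apache_beam.transforms.util")


class _ListHandler(logging.Handler):
    """Hypothetical handler that records (logger name, message) pairs."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.name, record.getMessage()))


handler = _ListHandler()
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.WARNING)

# A per-module override can only target a logger that has the module's name.
logging.getLogger("apache_beam.transforms.util").setLevel(logging.DEBUG)

_LOGGER.debug("emitted via the named logger")
logging.debug("dropped: the root logger is still at WARNING")
```

Only the first message reaches the handler. Calls made through the root logger ignore the per-module level, which is the behaviour this commit removes from the module.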
…aining

When a composite transform has no explicit inputs/outputs on its
sub-transforms, automatically chain them similar to how 'chain' type
transforms work. This allows composite transforms to be used as middle
transforms without requiring explicit inputs for each sub-transform.

Added test_composite_implicit_input_chaining to verify the fix.
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds implicit input chaining to Apache Beam's YAML composite transforms. Previously, each sub-transform in a composite needed an explicit input, which made simple linear sequences verbose. With this change, when no sub-transform declares an input or output, the output of each sub-transform is passed automatically as the input to the next.

Highlights

  • Implicit Input Chaining for Composite Transforms: Enabled type: composite transforms in Beam YAML to automatically chain sub-transforms when no explicit inputs or outputs are specified, mirroring the behavior of type: chain transforms.
  • expand_composite_transform Modification: Modified the expand_composite_transform function to detect the absence of explicit inputs/outputs and then programmatically set up implicit chaining between sub-transforms.
  • New Test Case: Added test_composite_implicit_input_chaining to validate the correct functionality of the newly implemented implicit input chaining for composite transforms.
  • Logging Refactor: Refactored direct logging calls in sdks/python/apache_beam/transforms/util.py to use a module-level _LOGGER instance for consistency.

Changelog
  • sdks/python/apache_beam/transforms/util.py
    • Introduced _LOGGER = logging.getLogger(__name__) at the module level.
    • Replaced all direct calls to logging.info, logging.warning, logging.debug, logging.error, and logging.critical with calls to _LOGGER.info, _LOGGER.warning, _LOGGER.debug, _LOGGER.error, and _LOGGER.critical respectively.
  • sdks/python/apache_beam/yaml/yaml_transform.py
    • Added a new block of code within expand_composite_transform to handle implicit input chaining.
    • Checked if any sub-transform in the composite has explicit 'input' or 'output' keys.
    • If no explicit I/O is found, it iterates through original_transforms, setting the 'input' for subsequent transforms to the uuid of the preceding transform.
    • For the first transform, it handles the composite's overall input.
    • If new_transforms are created, the spec is updated, and an implicit 'output' is set for the composite based on the last sub-transform's uuid.
  • sdks/python/apache_beam/yaml/yaml_transform_test.py
    • Added a new test method test_composite_implicit_input_chaining to YamlTransformTest.
    • This test defines a composite transform with two PyMap sub-transforms (Square and Increment) without explicit inputs, and asserts the correct chained output [2, 5, 10] for input [1, 2, 3].
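
The expected values in the test description can be checked by hand with plain Python. This is a sketch of the arithmetic only, not the Beam test itself: `run_chained` and the `steps` list are hypothetical stand-ins for the two PyMap sub-transforms.

```python
from functools import reduce

# Stand-ins for the two PyMap sub-transforms, in declaration order.
steps = [
    ("Square", lambda x: x * x),
    ("Increment", lambda x: x + 1),
]


def run_chained(steps, elements):
    """Feed each element through every step in declaration order."""
    return [
        reduce(lambda value, step: step[1](value), steps, element)
        for element in elements
    ]


result = run_chained(steps, [1, 2, 3])
print(result)  # [2, 5, 10]
```

Reversing the two steps would give [4, 9, 16] instead, so the expected output also confirms that chaining follows declaration order.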

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@liferoad closed this on Mar 14, 2026