A Transform Service that uses Docker Compose #26023

Merged

@chamikaramj chamikaramj commented Mar 29, 2023

A Docker Compose based transform service

Start this service to use the portable transforms offered by Beam. It can also be used to discover the schema-aware transforms available in Beam.

Once the containers are released, any Beam user can start the service with the following steps:

  • Install Docker

  • Set the environment variable BEAM_VERSION to the Beam version.

  • Run the following from the directory that contains the docker-compose.yml file:
    $ docker-compose up

  • The transform service will then be available at "<IP of the current machine>:5001".
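
After the steps above, one quick way to confirm that something is listening on the advertised port is a plain TCP check. The following is a minimal sketch, not part of this PR: the helper names `transform_service_address` and `is_reachable` are hypothetical, and only the port number 5001 comes from the description. A successful connection shows that the port is open; it does not prove that the service behind it is healthy.

```python
import socket

# Port taken from the PR description; everything else here is illustrative.
DEFAULT_PORT = 5001


def transform_service_address(host: str, port: int = DEFAULT_PORT) -> str:
    """Format the "<host>:<port>" address that a pipeline would be pointed at."""
    return f"{host}:{port}"


def is_reachable(host: str, port: int = DEFAULT_PORT, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `is_reachable("localhost")` could be polled after `docker-compose up` until it returns True.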

Please see https://s.apache.org/beam-transform-service for the design.

GitHub issue: #26211


@codecov

codecov bot commented Mar 29, 2023

Codecov Report

Merging #26023 (8205d0b) into master (21301b1) will decrease coverage by 0.01%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master   #26023      +/-   ##
==========================================
- Coverage   72.06%   72.06%   -0.01%     
==========================================
  Files         745      745              
  Lines      101203   101203              
==========================================
- Hits        72932    72928       -4     
- Misses      26811    26815       +4     
  Partials     1460     1460              
Flag Coverage Δ
python 81.09% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...beam/runners/portability/expansion_service_main.py 0.00% <0.00%> (ø)

... and 4 files with indirect coverage changes


@chamikaramj chamikaramj force-pushed the beam_transform_service_prototype branch from faeb84f to e04c687 Compare April 11, 2023 00:03
@chamikaramj chamikaramj marked this pull request as ready for review April 11, 2023 00:03
@chamikaramj chamikaramj force-pushed the beam_transform_service_prototype branch from e04c687 to d22665d Compare April 11, 2023 00:15
@github-actions
Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @AnandInguva for label python.
R: @Abacn for label java.
R: @damccorm for label build.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@chamikaramj
Contributor Author

R: @robertwb

@github-actions
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

@chamikaramj
Contributor Author

Tested by running both of the following pipelines against the same transform service:

  • A Python pipeline that uses Java Kafka read and write transforms.
  • A Java pipeline that uses the Python Dataframe transform.
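
A pipeline of the first kind might look roughly like the sketch below. It is not the test pipeline used for this PR. It assumes the transform service is listening on `localhost:5001` and can be passed wherever Beam accepts an expansion service address; the broker address, topic name, and the helper names `kafka_read_kwargs` and `run` are hypothetical. `ReadFromKafka` and its `expansion_service` argument are part of the Beam Python SDK.

```python
from typing import Any, Dict

# Assumption: the transform service from this PR is running locally on port 5001.
TRANSFORM_SERVICE = "localhost:5001"


def kafka_read_kwargs(
    bootstrap_servers: str,
    topic: str,
    expansion_service: str = TRANSFORM_SERVICE,
) -> Dict[str, Any]:
    """Build ReadFromKafka arguments that route expansion to the service."""
    return {
        "consumer_config": {"bootstrap.servers": bootstrap_servers},
        "topics": [topic],
        # Bound the read so the sketch terminates instead of streaming forever.
        "max_num_records": 10,
        "expansion_service": expansion_service,
    }


def run(bootstrap_servers: str, topic: str) -> None:
    """Read a few Kafka records through the Java transform and print them."""
    # Imported lazily so the helper above can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.io.kafka import ReadFromKafka

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "ReadKafka" >> ReadFromKafka(**kafka_read_kwargs(bootstrap_servers, topic))
            | "Print" >> beam.Map(print)
        )
```

Calling `run("localhost:9092", "my-topic")` would need a reachable Kafka broker, a running transform service, and a runner that supports cross-language transforms.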

@chamikaramj
Contributor Author

Run Java PreCommit

@chamikaramj
Contributor Author

Run Java_GCP_IO_Direct PreCommit

@chamikaramj
Contributor Author

@robertwb friendly ping.

Please note that I hope to add some advanced features (for example, sharing auth credentials between the local machine and the Docker containers) in future PRs.

@chamikaramj chamikaramj force-pushed the beam_transform_service_prototype branch from 5f000ba to 653be3a Compare April 24, 2023 23:53
@chamikaramj
Contributor Author

Run Java PreCommit

@chamikaramj
Contributor Author

Run Spotless PreCommit

@chamikaramj chamikaramj force-pushed the beam_transform_service_prototype branch 2 times, most recently from aa5a310 to 56f9663 Compare April 25, 2023 16:28
@chamikaramj
Contributor Author

Also added support for sharing credentials between the local machine and the expansion service containers.

@chamikaramj
Contributor Author

Run Java PreCommit

@chamikaramj
Contributor Author

Run Java PreCommit

@Abacn Abacn self-requested a review April 26, 2023 21:42
Contributor

If the goal is to keep the released instance of this service in sync with released Beam, then you do not want to create a new Go module at this level (the go.mod and go.sum files).

Otherwise this becomes another thing to try and keep up to date.

Contributor Author

Ack. Probably we can push the transform service under "sdks/java" to avoid adding a new Go module.

Contributor Author

These files were removed.

Contributor

@Abacn Abacn left a comment

LGTM. Just a few minor comments (see above). I validated that the command ./gradlew :sdks:java:transform-service:controller-container:docker runs successfully on Linux amd64.

Note: on Apple M1 it fails with:

#9 ERROR: "/target/launcher/linux_amd64/boot" not found: not found
------
 > [6/9] COPY target/launcher/linux_amd64/boot /opt/apache/beam/:
------
failed to compute cache key: "/target/launcher/linux_amd64/boot" not found: not found

because the generated image was in target/launcher/linux_arm64/boot
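
The mismatch comes from a hard-coded architecture in the path. As an illustration only (this is not how the build in this PR resolves it, and the helper name `launcher_boot_path` is hypothetical), the path can be derived from the host architecture. The sketch assumes the `target/launcher/linux_<arch>/boot` layout seen in the error output above.

```python
import platform
from typing import Optional

# Common platform.machine() values mapped to the arch names used in the path.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def launcher_boot_path(machine: Optional[str] = None) -> str:
    """Return the expected boot launcher path for the given (or host) machine type."""
    name = (machine or platform.machine()).lower()
    if name not in _ARCH_ALIASES:
        raise ValueError(f"unsupported architecture: {name}")
    return f"target/launcher/linux_{_ARCH_ALIASES[name]}/boot"
```

On an Apple M1 host, where `platform.machine()` typically reports `arm64`, this yields the `linux_arm64` path instead of the `linux_amd64` one that the COPY step expected.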

@chamikaramj chamikaramj force-pushed the beam_transform_service_prototype branch from 13766b5 to 8205d0b Compare May 6, 2023 04:36
@chamikaramj
Contributor Author

Thanks. Updated to support both amd64 and arm64.

@chamikaramj
Contributor Author

Run Java PreCommit

@chamikaramj
Contributor Author

Run Spotless PreCommit

@chamikaramj
Contributor Author

Run Community Metrics Prober

@chamikaramj
Contributor Author

Run CommunityMetrics PreCommit

@chamikaramj
Contributor Author

Run Java_GCP_IO_Direct PreCommit

@chamikaramj
Contributor Author

Run Java_Pulsar_IO_Direct PreCommit

@chamikaramj
Contributor Author

Run Java_Amazon-Web-Services_IO_Direct PreCommit

@chamikaramj
Contributor Author

Run Java_Amazon-Web-Services2_IO_Direct PreCommit

@chamikaramj chamikaramj merged commit fc38a47 into apache:master May 7, 2023
cushon pushed a commit to cushon/beam that referenced this pull request May 24, 2024
* A Transform Service that uses Docker Compose

* Adds support for the Python expansion service

* Fix spotless and remove unused config file

* Fix spotless

* Add licenses

* Fix spotless and adds code to copy licenses to Docker containers

* Fix checkstyle and artifact request forwarding

* Adding unit tests for the controller

* Adds support for specifying credentials via a volume

* Rebasing to fix test failures

* Use correct dependencies for Schema-aware transforms

* Addresses reviewer comments

* Addressing reviewer comments