Skip to content

[BEAM-4742] allow custom docker image in portable runner#5902

Closed
ryan-williams wants to merge 2 commits intoapache:masterfrom
ryan-williams:docker-img
Closed

[BEAM-4742] allow custom docker image in portable runner#5902
ryan-williams wants to merge 2 commits intoapache:masterfrom
ryan-williams:docker-img

Conversation

@ryan-williams
Copy link
Contributor

Allow specifying a docker image for the portable runner to use, via DOCKER_IMAGE env var

Also: make output directory in wordcount example, if it doesn't exist.

R: @angoenka

Post-Commit Tests Status (on master branch)

Lang SDK Apex Dataflow Flink Gearpump Samza Spark
Go Build Status --- --- --- --- --- ---
Java Build Status Build Status Build Status Build Status Build Status Build Status Build Status
Python Build Status --- Build Status
Build Status
--- --- --- ---

output = counts | 'format' >> beam.Map(format_result)

out_dir = os.path.dirname(known_args.output)
if not FileSystems.exists(out_dir):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the expectation should be that any output path should be created during pipeline execution and not by the driver program creating the pipeline.

Please revert this change to wordcount and fix the filesystem implementation to create any necessary directories instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting, I originally made a change to LocalFileSystem to create directories on open, but I wasn't sure if that was the right semantics; it sounds like you're saying it is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed #5903 with that change; can close this in favor of that if that's what you prefer, thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, taking a look at #5903.

@staticmethod
def default_docker_image():
if 'USER' in os.environ:
if 'DOCKER_IMAGE' in os.environ:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already controlled by the flag --harness_docker_image:

parser.add_argument('--harness_docker_image',

Do you still want to make the default container selection be based off of DOCKER_IMAGE?
If yes, should it specify the full path and not assume the user wants the :latest suffix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yea, I just saw the pipeline option for this as well! thanks for pointing it out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'll revert this part of the change, I don't think it's necessary)

@ryan-williams
Copy link
Contributor Author

closing in favor of #5903, thanks @lukecwik

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants