
DirectRunner RunnableOnService tempLocation configuration insufficient #17911

@damccorm

Description


The way we specify temp storage locations for RunnableOnService tests is not sufficient, specifically for DirectRunner execution. Right now, RunnableOnService tests are run for DirectRunner and DataflowRunner, which set their temp locations differently:

  • DirectRunner doesn't specify a temp location directly, but test classes use a JUnit @Rule TemporaryFolder. Individual tests use it to set tempLocation as necessary, and set a fake gs:// path for individual GCP IO tests.
  • DataflowRunner tests pass an actual GCS path as tempRoot, and TestDataflowRunner will initialize stagingLocation to this path.

This setup makes it difficult to write RunnableOnService tests that pass for both runners. We should move temp location setup out of individual test classes so that RunnableOnService tests "just work" on any runner.

One solution would be to add logic inside TestPipeline#testingPipelineOptions:

  • If --tempRoot is specified, use it to set tempLocation and stagingLocation. Otherwise, use a JUnit TemporaryFolder to set tempLocation.
  • If tempLocation is a GCS path, use it to set stagingLocation. Otherwise, use a fake GCS path (e.g. gs://foo).
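
The two rules above could be sketched roughly as follows. This is only an illustration of the proposed decision logic, not code from Beam: the class TempLocationResolver, its Locations holder, and the resolve method are hypothetical names, and Files.createTempDirectory stands in for the JUnit temporary folder so the sketch has no test-framework dependency. It also assumes the second rule has the final say on stagingLocation when --tempRoot is a local path.

```java
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical helper illustrating the proposal; not part of Beam. */
final class TempLocationResolver {
    /** Placeholder staging path for runs that never touch GCS (assumed value). */
    static final String FAKE_GCS_PATH = "gs://foo";

    /** Simple holder for the two resolved option values. */
    static final class Locations {
        final String tempLocation;
        final String stagingLocation;

        Locations(String tempLocation, String stagingLocation) {
            this.tempLocation = tempLocation;
            this.stagingLocation = stagingLocation;
        }
    }

    /**
     * Resolves both locations from an optional tempRoot.
     *
     * @param tempRoot value of --tempRoot, or null/empty if it was not given
     */
    static Locations resolve(String tempRoot) throws IOException {
        // Rule 1: prefer --tempRoot; otherwise fall back to a fresh local
        // temporary directory (stand-in for a JUnit temporary folder).
        String tempLocation = (tempRoot != null && !tempRoot.isEmpty())
                ? tempRoot
                : Files.createTempDirectory("runnable-on-service-").toString();

        // Rule 2: only a real GCS path is usable as stagingLocation;
        // anything else gets the fake GCS path.
        String stagingLocation = tempLocation.startsWith("gs://")
                ? tempLocation
                : FAKE_GCS_PATH;

        return new Locations(tempLocation, stagingLocation);
    }
}
```

With this sketch, resolve("gs://some-bucket/tmp") would use that path for both options, while resolve(null) would yield a local temporary directory for tempLocation and gs://foo for stagingLocation. Cleanup of the local directory is left out; a JUnit rule would handle that in a real test.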

Imported from Jira BEAM-436. Original Jira may contain additional context.
Reported by: swegner.
