@alanmyrvold
Member

[BEAM-6081]: Create "Dataflow Reaper" infrastructure to periodically clean up stuck Dataflow jobs

Post-Commit Tests Status (on master branch)

[Build status badges for the Go, Java, and Python SDKs across the Apex, Dataflow, Flink, Gearpump, Samza, and Spark runners]

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

@alanmyrvold
Member Author

R: @pabloem

@pabloem
Member

pabloem commented Mar 5, 2019

Run Seed Job

Member

@pabloem pabloem left a comment

LGTM. Just had a couple questions.

    if err != nil {
        log.Fatalf("Error creating dataflow client, %v", err)
    }
    err = cleanDataflowJobs(client, "apache-beam-testing", 12.0)
Member

I'd be more aggressive about marking jobs as stale. Maybe even 3 or 4 hours? WDYT?

Member Author

Changed to 3.

* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
Member

It seems that these tests don't get triggered continuously? Does it make sense to run tool tests continuously? This could be part of a different PR I guess. Just wondering.

Member Author

Added it to the same Jenkins job that cancels the stale jobs.

@alanmyrvold
Member Author

Run Seed Job

@pabloem
Member

pabloem commented Mar 5, 2019

Run Seed Job

@pabloem
Member

pabloem commented Mar 5, 2019

Run Cancel Stale Dataflow Jobs

@pabloem
Member

pabloem commented Mar 5, 2019

Very cool. LMK if I should merge this.

@alanmyrvold
Member Author

Thanks for running the job. https://builds.apache.org/job/beam_CancelStaleDataflowJobs/2/console succeeded and cancelled jobs in https://console.cloud.google.com/dataflow?project=apache-beam-testing, so this looks fine to merge.

@pabloem pabloem merged commit 991285c into apache:master Mar 5, 2019
ajamato pushed a commit to ajamato/beam that referenced this pull request Mar 12, 2019
…clean up stuck Dataflow jobs (apache#7985)

* [BEAM-6081]: Create "Dataflow Reaper" infrastructure to periodically clean up stuck Dataflow jobs