Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-33282] Migrate from deprecated probot autolabeler to GitHub labeler action #30244

Closed

Conversation

kbendick
Copy link
Contributor

@kbendick kbendick commented Nov 4, 2020

What changes were proposed in this pull request?

This PR removes the old Probot Autolabeler labeling configuration, as the probot autolabeler has been deprecated. I've updated the configs in Iceberg and in Avro, and we also need to update here. This PR adds in an additional workflow for labeling PRs and migrates the old probot config to the new format. Unfortunately, because certain features have not been released upstream, we will not get the exact behavior as before. I have documented where that is and what changes are neeeded, and in the associated ticket I've also discussed other options and why I think this is the best way to go. Definitely a follow up ticket is needed to get the original behavior back in these few cases, but PRs have not been labeled for almost a month and so it's probably best to get it right 95% of the time and occasionally have some UI related PRs labeled as CORE while the issue is resolved upstream and/or further investigated.

Why are the changes needed?

The probot autolabeler is dead and will not be maintained going forward. This has been confirmed with github user [at]mithro in an issue in their repository.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

To test this PR, I first merged the config into my local fork. I then edited it several times and ran tests on that.

Unfortunately, I've overwritten my fork with the apache repo in order to create a proper PR. However, I've also added the config for the same thing in the Iceberg repo as well as the Avro repo.

I have now merged this PR into my local repo and will be running some tests on edge cases there and for validating in general:

I've also discovered that we're likely not killing github actions that run (like large tests etc) when users push to their PR. In most cases, I see that a user has to mark something as "OK to test", but it still seems like we might want to discuss whether or not we should add a cancellation step In order to save time / capacity on the runners. If so desired, we would add an action in each workflow that cancels old runs when a push action occurs on a PR. This will likely make waiting for test runners much faster iff tests are automatically rerun on push by anybody (such as PMCs, PRs that have been marked OK to test, etc). We could free a large number of resources potentially if a cancellation step was added to all of the workflows in the Apache account (as github action API limits are set at the account level).

Admittedly, the fact that the "old" workflow runs weren't cancelled could admittedly be because of the fact that I was working in a fork, but given that there are explicit actions to be added to the start of workflows to cancel old PR workflows and given that we don't have them configured indicates to me that likely this is the case in this repo (and in most apache repos as well), at least under certain circumstances (e.g. repos that don't have "Ok to test"-like webhooks as one example).

This is a separate issue though, which I can bring up on the mailing list once I'm done with this PR. Unfortunately I've been very busy the past two weeks, but if somebody else wanted to work on that I would be happy to support with any knowledge I have.

The last Apache repo to still have the probot autolabeler in it is Beam, at which point we can have Gavin from ASF Infra remove the permissions for the probot autolabeler entirely. See the associated JIRA ticket for the links to other tickets, like the one for ASF Infra to remove the dead probot autolabeler's read and write permissions to our PRs in the Apache organization.

@HyukjinKwon
Copy link
Member

ok to test

@kbendick kbendick changed the title [SPARK-33282] Migrate from dead probot autolabeler to GitHub labeler action [WIP][SPARK-33282] Migrate from dead probot autolabeler to GitHub labeler action Nov 4, 2020
@SparkQA
Copy link

SparkQA commented Nov 4, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35192/

@SparkQA
Copy link

SparkQA commented Nov 4, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35192/

@SparkQA
Copy link

SparkQA commented Nov 4, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35225/

@SparkQA
Copy link

SparkQA commented Nov 4, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35225/

@SparkQA
Copy link

SparkQA commented Nov 4, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35227/

@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35227/

@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35230/

.github/labeler.yml Outdated Show resolved Hide resolved
@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35231/

@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35230/

…/streaming/**/* to allow it to match directories with that pattern that are not at the repo root
@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35231/

@kbendick kbendick changed the title [WIP][SPARK-33282] Migrate from dead probot autolabeler to GitHub labeler action [SPARK-33282] Migrate from dead probot autolabeler to GitHub labeler action Nov 5, 2020
@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35232/

@kbendick
Copy link
Contributor Author

kbendick commented Nov 5, 2020

I created an issue in the actions/labeler repo about releasing a version with proper support for any and all, which would allow us to use the negation (!) operator on matches and get the correct behavior again: actions/labeler#111

- "repl/**/*"
- "bin/spark-shell*"
SQL:
#- any: ["**/sql/**/*", "!python/pyspark/sql/avro/**/*", "!python/pyspark/sql/streaming.py", "!python/pyspark/sql/tests/test_streaming.py"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how the any should look when it's supported in a release upstream.

@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35232/

@HyukjinKwon
Copy link
Member

Thanks for working on this @kbendick. Let me know when you think it's ready to merge. We can merge and see how it gose.

@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35236/

@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35236/

@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35238/

@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35239/

@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35238/

@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35239/

@HyukjinKwon
Copy link
Member

cc @nchammas FYI

@kbendick kbendick changed the title [SPARK-33282] Migrate from dead probot autolabeler to GitHub labeler action [SPARK-33282] Migrate from deprecated probot autolabeler to GitHub labeler action Nov 5, 2020
@kbendick
Copy link
Contributor Author

kbendick commented Nov 5, 2020

Thanks for working on this @kbendick. Let me know when you think it's ready to merge. We can merge and see how it gose.

Thanks @HyukjinKwon! This is ready to be merged.

It might need a small tweak or two given it was a pretty complicated config relative to the others that I've done so far but I would say it's tested and ready to merge so people can finally have their labels back. 👍

@HyukjinKwon
Copy link
Member

Let me merge and see how it goes.

Thanks for this work and thanks for the followup in advance. BTW would you mind file a separate JIRA to follow up?

@HyukjinKwon
Copy link
Member

Merged to master.

@kbendick kbendick deleted the begin-migration-to-github-labeler-action branch November 6, 2020 07:48
@kbendick
Copy link
Contributor Author

kbendick commented Nov 6, 2020

I created an issue for updating the labeler with any to match the exact behavior as before. It also links to the issue I opened in the actions/labeler repo asking if they have a potential timeline for a release that contains the any function. It's up to you if you want to use HEAD of their main branch, which does have that support cc @HyukjinKwon @dongjoon-hyun https://issues.apache.org/jira/browse/SPARK-33370

@HyukjinKwon
Copy link
Member

Just a reminder :-). It might be great to double check if any match is supported now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants