[SPARK-37380][PYTHON] Miscellaneous Python lint infra cleanup #34655
@@ -275,8 +275,7 @@ SPARK_ROOT_DIR="$(dirname "${SCRIPT_DIR}")"

 pushd "$SPARK_ROOT_DIR" &> /dev/null

-# skipping local ruby bundle directory from the search
-PYTHON_SOURCE="$(find . -path ./docs/.local_ruby_bundle -prune -false -o -name "*.py")"
+PYTHON_SOURCE="$(git ls-files '*.py')"
This is an improvement, but I don't think this is that great, either. In general, I don't like that we are building a huge string of every Python file and passing it as an argument.
Instead, we should be using the appropriate include and exclude filters (e.g. via tox.ini) to capture everything to be tested. But I don't want to get into that here.
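Short of moving to include/exclude filters, one way to soften the "huge string" concern is to treat the file list as NUL-delimited records rather than a whitespace-split string. The sketch below is only an illustration, not what `dev/lint-python` does: the helper names are hypothetical, and it assumes it is run from inside a Git checkout. It relies on `git ls-files -z`, which terminates each path with a NUL byte.

```python
import os
import subprocess
from typing import List


def parse_nul_delimited(raw: bytes) -> List[str]:
    """Split `git ls-files -z` output into individual paths."""
    # Each path is terminated by a NUL byte, so the final split item is empty.
    return [os.fsdecode(path) for path in raw.split(b"\0") if path]


def tracked_python_files(repo_root: str = ".") -> List[str]:
    """Return the tracked *.py files, relative to repo_root (hypothetical helper)."""
    result = subprocess.run(
        ["git", "ls-files", "-z", "--", "*.py"],
        cwd=repo_root,
        check=True,
        stdout=subprocess.PIPE,
    )
    return parse_nul_delimited(result.stdout)
```

Because only tracked files are listed, an untracked directory such as a local Ruby bundle never shows up, which is why the explicit `-prune` from the old `find` command is no longer needed.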
@@ -19,6 +19,7 @@ coverage

 # Linter
 mypy
+git+https://github.com/typeddjango/pytest-mypy-plugins.git@b0020061f48e85743ee3335bd62a3a608d17c6bd
This was being used in CI but it was missing from here, so I added it in.
In a future PR, I want to reorganize things so that CI references the requirements file. I don't think it's good that we have dependencies specified here and then specified again inside our CI script.
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #145417 has finished for PR 34655 at commit
If we are already here, let me ask a stupid question: why do we use both
Flake8 bundles other tools, including pycodestyle. So yes, we can probably merge the pycodestyle configs into the Flake8 configs within tox.ini (lines 16 to 19 at commit 4f20898).
I'll go ahead and do that.
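Since Flake8 runs pycodestyle internally, options such as `max-line-length` and `ignore` can live in a single `[flake8]` section. The sketch below shows what folding one section into the other amounts to. It is illustrative only: the section contents are hypothetical values, not Spark's actual `tox.ini`, and `merge_sections` is a made-up helper rather than anything in this PR.

```python
import configparser

# Hypothetical tox.ini contents; these values are NOT taken from Spark.
BEFORE = """
[pycodestyle]
max-line-length = 100
ignore = E226

[flake8]
select = E,W,F
"""


def merge_sections(
    text: str, src: str = "pycodestyle", dst: str = "flake8"
) -> configparser.ConfigParser:
    """Move every option from section `src` into section `dst`."""
    # Disable interpolation so values containing '%' are kept verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if parser.has_section(src):
        if not parser.has_section(dst):
            parser.add_section(dst)
        for key, value in parser.items(src):
            # Keep an existing dst value rather than silently overwriting it.
            if not parser.has_option(dst, key):
                parser.set(dst, key, value)
        parser.remove_section(src)
    return parser


merged = merge_sections(BEFORE)
```

After the merge only the `[flake8]` section remains, carrying both its original `select` option and the options that previously sat under `[pycodestyle]`.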
cee4d6a to 48cd470
Test build #145462 has finished for PR 34655 at commit
I suspect this is because Jenkins is using a different version of Flake8 than the one pinned in GitHub CI; see spark/dev/ansible-for-test-node/roles/jenkins-worker/files/python_environments/spark-py36-spec.txt, line 100 at commit 4e1e33b. (This is why I really want us to single-source all Python dev requirements.)
I will try to fix this. Hopefully, I won't need any special access to Jenkins.
OK, never mind. Fixing this would require pushing updates to the Jenkins workers via Ansible, which I don't want to get into here. Instead, I'll lower the pinned Flake8 version to match what's in Jenkins.
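Until the requirements are single-sourced, this kind of mismatch could be caught mechanically by comparing the pins in the two places. The following is a minimal sketch under loud assumptions: both lists are taken to use simple `name==version` lines (the real Jenkins spec file may not), the version numbers shown are hypothetical, and `parse_pins` and `find_drift` are made-up helper names.

```python
import re
from typing import Dict, Tuple

# Matches simple "name==version" pins; anything else is ignored.
_PIN = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*==\s*([^\s#;]+)")


def parse_pins(text: str) -> Dict[str, str]:
    """Extract {package: version} from requirements-style text."""
    pins = {}
    for line in text.splitlines():
        match = _PIN.match(line)
        if match:
            # Fold case and separators so "Flake8" and "flake8" compare equal.
            name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
            pins[name] = match.group(2)
    return pins


def find_drift(a: Dict[str, str], b: Dict[str, str]) -> Dict[str, Tuple[str, str]]:
    """Return packages pinned in both inputs but at different versions."""
    return {
        name: (a[name], b[name])
        for name in sorted(a.keys() & b.keys())
        if a[name] != b[name]
    }


# Hypothetical pins, for illustration only.
GITHUB_CI = "flake8==3.9.0\nmypy==0.910\n"
JENKINS = "# worker environment\nFlake8==3.8.0  # older pin\nmypy==0.910\n"

drift = find_drift(parse_pins(GITHUB_CI), parse_pins(JENKINS))
```

With these sample inputs, `drift` reports only `flake8`, since `mypy` is pinned identically in both lists.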
Kubernetes integration test starting
Kubernetes integration test status failure
Turns out there was some difference in Flake8's behavior when running Python 3.6 vs. 3.9, so I've adjusted the configs accordingly. Should be fixed now.
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #145467 has finished for PR 34655 at commit
Seems reasonable to me
@zero323 - Is the cleanup of pycodestyle configs along the lines of what you were expecting?
Looks sensible, but I wouldn't mind more eyes on this.
@HyukjinKwon - Is there anyone else you think should review this?
Merged to master.
What changes were proposed in this pull request?
This PR makes a small number of tweaks to our Python lint infra.
Why are the changes needed?
General maintenance of our internal Python infra.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
The existing dev/lint-python script.