Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-37380][PYTHON] Miscellaneous Python lint infra cleanup #34655

Closed
wants to merge 10 commits into from

Conversation

nchammas
Copy link
Contributor

@nchammas nchammas commented Nov 19, 2021

What changes were proposed in this pull request?

This PR makes a small number of tweaks to our Python lint infra.

  • pylintrc is obsolete and not used.
  • tox.ini supports multiline values, so I've broken up the extremely long lines that were in there.
  • A requirement was missing from requirements.txt so I added it in there.
  • I've removed pycodestyle and merged its configs into flake8's, since flake8 includes pycodestyle checks.

Why are the changes needed?

General maintenance of our internal Python infra.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

The existing dev/lint-python script.

@@ -275,8 +275,7 @@ SPARK_ROOT_DIR="$(dirname "${SCRIPT_DIR}")"

pushd "$SPARK_ROOT_DIR" &> /dev/null

# skipping local ruby bundle directory from the search
PYTHON_SOURCE="$(find . -path ./docs/.local_ruby_bundle -prune -false -o -name "*.py")"
PYTHON_SOURCE="$(git ls-files '*.py')"
Copy link
Contributor Author

@nchammas nchammas Nov 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an improvement, but I don't think this is that great, either. In general, I don't like that we are building a huge string of every Python file and passing it as an argument.

Instead, we should be using the appropriate include and exclude filters (e.g. via tox.ini) to capture everything to be tested. But I don't want to get into that here.

@@ -19,6 +19,7 @@ coverage

# Linter
mypy
git+https://github.com/typeddjango/pytest-mypy-plugins.git@b0020061f48e85743ee3335bd62a3a608d17c6bd
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was being used in CI but it was missing from here, so I added it in.

In a future PR, I want to reorganize things so that CI references the requirements file. I don't think it's good that we have dependencies specified here and then specified again inside our CI script.

@HyukjinKwon
Copy link
Member

cc @zero323 and @Fokko FYI

@HyukjinKwon HyukjinKwon changed the title [SPARK-37380] Miscellaneous Python lint infra cleanup [SPARK-37380][PYTHON] Miscellaneous Python lint infra cleanup Nov 19, 2021
@SparkQA
Copy link

SparkQA commented Nov 19, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49891/

@SparkQA
Copy link

SparkQA commented Nov 19, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49891/

@SparkQA
Copy link

SparkQA commented Nov 19, 2021

Test build #145417 has finished for PR 34655 at commit cee4d6a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zero323
Copy link
Member

zero323 commented Nov 19, 2021

If we are already here let me ask a stupid question ‒ why do we use both pycodestyle and flake8. Isn't the former one redundant?

@nchammas
Copy link
Contributor Author

Flake8 bundles other tools, including pycodestyle. So yes, we can probably merge the pycodestyle configs into the Flake8 configs within tox.ini:

spark/dev/tox.ini

Lines 16 to 19 in 4f20898

[pycodestyle]
ignore=E203,E226,E241,E305,E402,E722,E731,E741,W503,W504,E501
max-line-length=100
exclude=*/target/*,python/pyspark/cloudpickle/*.py,shared.py,python/docs/source/conf.py,work/*/*.py,python/.eggs/*,dist/*,.git/*,dev/ansible-for-test-node/*

I'll go ahead and do that.

@SparkQA
Copy link

SparkQA commented Nov 19, 2021

Test build #145462 has finished for PR 34655 at commit 48cd470.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@nchammas
Copy link
Contributor Author

nchammas commented Nov 19, 2021

Test build #145462 has finished for PR 34655 at commit 48cd470.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

I suspect this is because Jenkins is using a different version of Flake8 than what is pinned in GitHub CI. (This is why I really want us to single-source all Python dev requirements.)

https://repo.anaconda.com/pkgs/main/noarch/flake8-3.9.0-pyhd3eb1b0_0.conda

I will try to fix this. Hopefully, I won't need any special access to Jenkins.

@nchammas
Copy link
Contributor Author

OK, never mind. I will need to push updates to the Jenkins workers via Ansible to fix this, which I don't want to get into here.

Instead, I'll bump down the version of Flake8 to match what's in Jenkins.

@SparkQA
Copy link

SparkQA commented Nov 19, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49934/

@SparkQA
Copy link

SparkQA commented Nov 19, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49934/

@nchammas
Copy link
Contributor Author

Turns out there was some difference in Flake8's behavior when running Python 3.6 vs. 3.9, so I've adjusted the configs accordingly. Should be fixed now.

@SparkQA
Copy link

SparkQA commented Nov 19, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49939/

@SparkQA
Copy link

SparkQA commented Nov 19, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49939/

@SparkQA
Copy link

SparkQA commented Nov 20, 2021

Test build #145467 has finished for PR 34655 at commit 5418ec5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me

@nchammas
Copy link
Contributor Author

@zero323 - Is the cleanup of pycodestyle configs along the lines of what you were expecting?

@zero323
Copy link
Member

zero323 commented Nov 22, 2021

@zero323 - Is the cleanup of pycodestyle configs along the lines of what you were expecting?

Looks sensible, but I wouldn't mind more eyes on this.

@nchammas
Copy link
Contributor Author

@HyukjinKwon - Is there anyone else you think should review this?

@HyukjinKwon
Copy link
Member

Merged to master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants