
prepare repo for auto-formatters #1546

Merged
merged 39 commits into from Feb 17, 2022

Conversation

pmeier
Contributor

@pmeier pmeier commented Jan 28, 2022

Addresses pytorch/data#169 (comment). Don't review yet. I'll explain everything as soon as the setup is done.

Contributor Author

@pmeier pmeier left a comment

CI is now fully configured. I intentionally did not yet push the changes that the auto-formatters will make, since there are so many of them that this PR would become unreviewable afterwards. Let me know if you are happy with the configs and I'll pull the trigger as soon as this is approved. I'll also add a paragraph to the contributing guidelines once the configs are approved.

@pmeier pmeier marked this pull request as ready for review January 28, 2022 09:34
@mthrok
Contributor

mthrok commented Jan 28, 2022

Is this effort coordinated with #1534?

@parmeet
Contributor

parmeet commented Jan 28, 2022

> Is this effort coordinated with #1534?

cc: @abhinavarora

@abhinavarora
Contributor

> Is this effort coordinated with #1534?

No, this is not coordinated. @pmeier, I also have #1534, which is similar to what torchaudio and vision have and is in sync with Meta's internal workflow. Would you like to merge the PRs together? I will be happy to address your feedback as well :-)

Contributor

@abhinavarora abhinavarora left a comment

Hi @pmeier , I closed my PR #1534 in favor of this PR which is much more detailed. Overall your changes look great to me! I will verify them with our internal workflow as well and then approve the PR :-)

One question for you: do you think we can use pre-commit to run clang-format as well, like I was doing in #1534? That way, OSS contributors will not have to worry about it when making commits.

@pmeier
Contributor Author

pmeier commented Jan 31, 2022

Hey @abhinavarora, I'm sorry, I was not aware that there was another effort to add this functionality.

> One question for you: do you think we can use pre-commit to run clang-format as well, like I was doing in #1534? That way, OSS contributors will not have to worry about it when making commits.

Yes, that is indeed a better way to deal with this. Let me adapt the PR.
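For reference, a clang-format hook in pre-commit could look roughly like the sketch below; the mirror repo and the pinned rev are illustrative, not necessarily what this PR ends up using:

```yaml
# .pre-commit-config.yaml (sketch, not the final config)
repos:
  - repo: https://github.com/pre-commit/mirrors-clang-format
    rev: v14.0.6  # illustrative pin
    hooks:
      - id: clang-format
        types_or: [c, c++]
```

With something like this in place, `pre-commit run --all-files` would reformat the C/C++ sources along with everything else, so contributors never have to invoke the binary by hand.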

Contributor

@abhinavarora abhinavarora left a comment

Overall LGTM! Just added a nit comment. Feel free to merge the PR after addressing that. Thank you so much for enforcing high code standards in the torchtext repo and bringing us on par with the other PyTorch domain libraries.

@pmeier
Contributor Author

pmeier commented Feb 1, 2022

@abhinavarora

> Feel free to merge the PR after addressing that.

I don't have merge rights, so you'll have to. Just to make sure we are not talking past each other: you want to merge this PR after the auto-format changes are pushed, not before, right?

Regardless, it would be nice to confirm that nothing breaks by getting the unit-test workflows to pass after I have pushed the changes. But the CI currently seems toasted. Are you sure you want to merge this PR with all the changes without being able to confirm through CI that everything is fine? One thing we stumbled over when adding this to torchvision was that we implicitly relied on a custom import order in some modules; usort changed that, and importing torchvision failed.
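A minimal sketch of the kind of import-order dependency described above (package and module names are made up for illustration): a package whose `__init__.py` must import submodule `a` before `b`, because `b` reaches into `pkg.a` at import time. If an import sorter reorders the two lines, importing the package fails.

```python
import importlib
import os
import sys
import tempfile


def build_pkg(root, init_body):
    """Write a tiny package 'pkg' whose __init__.py is init_body."""
    pkg = os.path.join(root, "pkg")
    os.makedirs(pkg)
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write(init_body)
    with open(os.path.join(pkg, "a.py"), "w") as f:
        f.write("REGISTRY = []\n")
    with open(os.path.join(pkg, "b.py"), "w") as f:
        # Touches pkg.a while pkg is still being initialized, so it only
        # works if __init__.py imported 'a' before 'b'.
        f.write("import pkg\npkg.a.REGISTRY.append('b')\n")


def import_ok(init_body):
    """Return True if 'import pkg' succeeds with the given __init__.py."""
    with tempfile.TemporaryDirectory() as root:
        build_pkg(root, init_body)
        sys.path.insert(0, root)
        # Purge any cached copies so each run starts fresh.
        for name in [m for m in list(sys.modules) if m.split(".")[0] == "pkg"]:
            del sys.modules[name]
        importlib.invalidate_caches()
        try:
            importlib.import_module("pkg")
            return True
        except AttributeError:
            return False
        finally:
            sys.path.remove(root)


# The original order works; the "sorted" order breaks the import.
print(import_ok("from pkg import a\nfrom pkg import b\n"))  # True
print(import_ok("from pkg import b\nfrom pkg import a\n"))  # False
```

In the failing case, `b`'s `pkg.a` attribute access raises `AttributeError` because the partially initialized package does not have the `a` attribute yet, which is exactly the class of breakage a blind re-sort can introduce.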

@abhinavarora
Contributor

> Regardless, it would be nice to confirm that nothing breaks by getting the unittest workflows to pass after I have pushed the changes. But the CI currently seems toasted. Are you sure you want to merge this PR with all the changes without being able to confirm everything is fine through CI? One thing we stumbled over when adding this to torchvision was that we implicitly relied on a custom import order in some modules. usort changed that and so importing torchvision failed.

@pmeier Thank you for sharing your experience with torchvision. In that case, let's add the changes from the formatters in this PR. Once added, we can validate the changes on CI. Since some dataset tests on CI are toasted (due to some caching issues on CI), we can also validate the changes by running pytest locally. I will be happy to do that on my end before merging the PR.

To summarize, I could take on the following action items after you add the auto-formatter changes:

  1. Run pytest locally to validate the changes.
  2. Import this PR to Meta's internal repo and validate there.
  3. Merge the PR.

Please let me know if you have any concerns. Thank you again for this effort :-)

@abhinavarora
Contributor


Hi @pmeier, I validated this PR against Meta's internal workflow and ran into some discrepancies. To address them, I took the following steps:

  1. Downgraded the version of usort in pre-commit to match Meta's internal workflow. This is the same version used in audio and vision.
  2. Added a .clang-format style file, as present in the vision and audio repos, so that C/C++ formatting is in sync.

I also merged the latest code and formatted it in this PR so that there are no merge conflicts. I will now go ahead and merge this PR. Thank you so much for your great work in onboarding us to high code-formatting standards!

@pmeier
Contributor Author

pmeier commented Feb 4, 2022

> Downgraded the version of usort in pre-commit to match Meta's internal workflow. This is the same version used in audio and vision.

I don't know how Meta's internals work, but I think upgrading the internal workflows would be the way to go. The only reason torchvision is using usort==0.6.4 is that, at the time of adding the hooks, this was the latest version. In fact, I have a PR open that upgrades it to 1.0.1: pytorch/vision#5106. @NicolasHug wanted to merge this (soon?) after "fixing" the internal tooling.

There are two major improvements of 1.0.1 over 0.6.4:

  1. Multiple import statements that import from the same module will be merged into one.

  2. Imports inside an import statement are sorted lexicographically.

For example,

    from foo import baz
    from foo import spam, ham
    from foo import bar

will be turned into

    from foo import bar, baz, ham, spam

@pmeier
Contributor Author

pmeier commented Feb 4, 2022

There are two problems with clang-format:

  1. Having never run run-clang-format.py myself, I was under the impression that it would fix the inconsistencies automatically. This is not the case: it only shows you the diff. The binary itself has a -i flag that enables in-place fixes. So we should either change the wording in the contributing guide to indicate that the user needs to make the changes themselves, or drop the Python script in favor of running the binary directly.

  2. Running the clang-format binary, either directly or through the Python script, is far from trivial. The binary is not statically linked and requires libtinfo.so.5. In the CI jobs we simply do apt install libtinfo5, but what if your system (like mine) has no option to install this (outdated) version? My next step was to install ncurses=5 from conda-forge, since it provides libtinfo.so.5, only to discover that python>=3.7 requires ncurses>=6. Since the upcoming release will drop Python 3.6 support, a contributor needs a separate environment just to run clang-format. To add insult to injury, you have to manually set LD_LIBRARY_PATH because libtinfo.so.5 is not found by default. After jumping through all these hoops, the final blocker I hit was that I still couldn't run the binary with the -i flag, since

    libtinfo.so.5: no version information available (required by ./clang-format-linux64)

    Although I'm very much in favor of letting an auto-formatter do its job rather than "abusing" it only as a linter, I can't recommend going that route.

    In any case, we should greatly expand the contributing guide to explain the steps necessary to even run clang-format. Plus, since the binary is provided by Meta, could you ask the person responsible to statically link it so we don't have to deal with all this?
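The difference between the two invocations could be sketched roughly as follows; the script flags, binary name, and paths here reflect my local setup and are illustrative, not a documented recipe:

```sh
# The wrapper script only reports a diff; it does not modify files:
python run-clang-format.py -r torchtext/csrc

# Formatting in place means invoking the binary itself with -i,
# after pointing the loader at a directory containing libtinfo.so.5:
export LD_LIBRARY_PATH=/path/to/ncurses5/lib
find torchtext/csrc \( -name '*.cpp' -o -name '*.h' \) -print0 \
  | xargs -0 ./clang-format-linux64 -i
```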

@NicolasHug
Member

> I have a PR open that upgrades it to 1.0.1: pytorch/vision#5106. @NicolasHug wanted to merge this (soon?) after "fixing" the internal tooling.

Yes, you'll have to tweak another thing internally to use the latest ufmt version:
https://fb.workplace.com/groups/pyfmt/permalink/925830131387837/

@codecov

codecov bot commented Feb 16, 2022

Codecov Report

Merging #1546 (8a12398) into main (8808e7e) will decrease coverage by 0.04%.
The diff coverage is 82.65%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1546      +/-   ##
==========================================
- Coverage   85.33%   85.28%   -0.05%     
==========================================
  Files          58       58              
  Lines        2496     2488       -8     
==========================================
- Hits         2130     2122       -8     
  Misses        366      366              
Impacted Files Coverage Δ
torchtext/experimental/datasets/raw/wmt14.py 21.66% <25.00%> (-1.29%) ⬇️
torchtext/data/utils.py 51.63% <45.00%> (ø)
torchtext/data/datasets_utils.py 74.77% <66.66%> (+0.11%) ⬆️
torchtext/data/functional.py 61.66% <66.66%> (ø)
...orchtext/experimental/datasets/raw/wmtnewscrawl.py 68.18% <66.66%> (ø)
torchtext/_download_hooks.py 53.48% <81.81%> (ø)
torchtext/nn/modules/multiheadattention.py 92.40% <83.33%> (ø)
torchtext/utils.py 81.06% <84.61%> (ø)
torchtext/models/roberta/bundler.py 88.33% <87.50%> (ø)
torchtext/models/roberta/modules.py 84.66% <87.50%> (-0.10%) ⬇️
... and 42 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8808e7e...8a12398. Read the comment docs.

@abhinavarora
Contributor

> There are two problems with clang-format […]

@pmeier you are right. I will create an option in run-clang-format.py to also format the files. As far as statically linking the binaries is concerned, let me reach out to the teams that own them to see if this is an option!

@abhinavarora abhinavarora merged commit c31a400 into pytorch:main Feb 17, 2022
@pmeier pmeier deleted the pre-commit branch February 22, 2022 16:10
@pmeier
Contributor Author

pmeier commented Feb 22, 2022

FYI: a recent update in the build dependencies of ufmt broke the pre-commit hook. You are going to see lint CI failures. We moved them out of the hooks temporarily in pytorch/vision#5454.

@parmeet
Contributor

parmeet commented Feb 22, 2022

> FYI: a recent update in the build dependencies of ufmt broke the pre-commit hook. You are going to see lint CI failures. We moved them out of the hooks temporarily in pytorch/vision#5454.

cc: @abhinavarora
