Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove "bad characters" from our codebase #24841

Merged
merged 2 commits into from
Jul 5, 2022

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Jul 5, 2022

We had plenty of "bad characters" in our codebase that were not
invited and came here by accident. We want to get rid of those
"bad characters" once and for all.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member Author

potiuk commented Jul 5, 2022

Turned out we had more "Bad characters" :P in our codebase than literally meets the eye :P

@potiuk potiuk force-pushed the remove-bad-characters branch 2 times, most recently from 4d1cdde to 3f95cc9 Compare July 5, 2022 10:06
@potiuk
Copy link
Member Author

potiuk commented Jul 5, 2022

:( Executable ruby not found

@potiuk
Copy link
Member Author

potiuk commented Jul 5, 2022

This is no go, I do not want to get ruby dependency for pre-commit. Time for 15-liner python script :)

.gitmodules Outdated Show resolved Hide resolved
@ashb
Copy link
Member

ashb commented Jul 5, 2022

What's wrong with using smart/curly quotes in docs?

@potiuk
Copy link
Member Author

potiuk commented Jul 5, 2022

What's wrong with using smart/curly quotes in docs?

For the context - this is the result of discussion in #24797 - where nbsps were shown in some IDES (only some IDES show it , some others will render it as space.

The main reason is consistency.

If we are using them inconsistently, then it's wrong. You either should use always curly quotes in your docs or always straight ones otherwise our docs will look strange if you look closely. Another problem it introduces it's "search & replace" functionality. Whenever you are using global search and replace and want to search for "quotes" you will often miss the curly quote cases - especially when it is "random" and not used consistently.

Most of the "curly ones" comes usually (similarly as nbsp) from copy-pasting from word-like sources, but some people will use the "straight" ones, some people will use the "curly" ones and it's difficult to keep consistency. And the bad thing with it is also that sometimes those curly quotes will sneak-in the code examples and snippets which are part of our documentation and we will not notice. This happened often to me where I wanted to copy some piece of code from the browser, It did not parse/compile or produce strange results.

Of course there are plenty exceptions where you realy can't use curly ones (code snippets etc.) and it's next to impossible to figure out automatically when yoiu can use them.. It's far easier to replace all curly ones with straight ones. The consistency of look&feel is kept and you avoid any of the problems above. And look & feel is marginally impacted IMHO.. And if we REALLY want to keep the curly quotes in some specific files, or folders we can always exclude them by regexp in pre-commit.

We had plenty of "bad characters" in our codebase that were not
invited and came here by accident. We want to get rid of those
"bad characters" once and for all.
@potiuk
Copy link
Member Author

potiuk commented Jul 5, 2022

Updated much improved version:

  1. brought back the same exclusions that "forbid tabs" used to have
  2. No ruby/pre-commit dependency - all is pure python + rich (so there is really one virtualenv used for this and many other pre-commits - pre-commit nicely recognizes multiple pre-commits and will reuse the virtualenv that has the same set of requirements.
  3. Nicer, colorful output informing what has been removed (I added "replacements" after the number comparing to the screenshot).

image

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
@potiuk potiuk merged commit 96b01a8 into apache:main Jul 5, 2022
@potiuk potiuk deleted the remove-bad-characters branch July 5, 2022 15:08
@ephraimbuddy ephraimbuddy added the type:misc/internal Changelog: Misc changes that should appear in change log label Aug 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:CLI area:dev-tools area:helm-chart Airflow Helm Chart area:providers area:webserver Webserver related Issues provider:google Google (including GCP) related issues type:misc/internal Changelog: Misc changes that should appear in change log
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants