Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pyupgrade: Format specifiers #1594

Merged
merged 31 commits into from Jan 11, 2023
Merged

Pyupgrade: Format specifiers #1594

merged 31 commits into from Jan 11, 2023

Conversation

colin99d
Copy link
Contributor

@colin99d colin99d commented Jan 3, 2023

A part of #827. Posting this for visibility. Still has some work to do to be done.

Things that still need done before this is ready:

  • Does not work when the item is being assigned to a variable
  • Does not work if being used in a function call
  • Fix incorrectly removed calls in the function
  • Has not been tested with pyupgrade negative test cases

Tests from pyupgrade can be seen here: https://github.com/asottile/pyupgrade/blob/main/tests/features/format_literals_test.py

@colin99d
Copy link
Contributor Author

colin99d commented Jan 4, 2023

I will try to have this done by Sunday night.

@colin99d
Copy link
Contributor Author

colin99d commented Jan 6, 2023

@charliermarsh I have the following function to detect expressions:

pub fn match_expression(expression_text: &str) -> Result<Expression> {
    match libcst_native::parse_expression(expression_text) {
        Ok(expression) => Ok(expression),
        Err(_) => bail!("Failed to extract CST from source"),
    }
}

It works fine for an expression like this:

    'foo{0}' 'bar{1}'.format(1, 2)

But the second you make it multi-line like so:

    'foo{0}'\n    'bar{1}'.format(1, 2)

The whole thing fails. Is this an error with my coding, or an error with libcst_native?

@charliermarsh
Copy link
Member

I'm wondering if you need to dedent the code. You might be passing LibCST the indented expression, which would cause an error.

@colin99d
Copy link
Contributor Author

colin99d commented Jan 6, 2023

It looks like the issue still happens with the de-indented code: 'foo{0}'\n'bar{1}'.format(1, 2).

@colin99d
Copy link
Contributor Author

colin99d commented Jan 7, 2023

thread 'main' panicked at 'called Result::unwrap() on an Err value: ParserError(ParseError { location: ParseLoc { start_pos: LineCol { line: 2, column: 4, offset: 13 }, end_pos: LineCol { line: 2, column: 12, offset: 21 } }, expected: ExpectedSet { expected: {"EOF"} } }, "'foo{0}'\n 'bar{1}'.format(1, 2)")', src/pyupgrade/plugins/format_specififiers.rs:57:57

Looks like this is a parse error from the parse_expression function in libcst_native. My thoughts are we either:

  1. Make a custom patch in your fork
  2. Find a way to modify the string to add a \ at the end of each line so the parser can understand it, and then remove it once we are done
  3. Don't use libcst_native ( I dont like this one because I like libcst now)

Just let me know which you would like for me to pursue.

@charliermarsh
Copy link
Member

Would you be open to filing an issue on the LibCST repo? In the meantime, we could just skip those cases. (Are they needed for detection, or just autofixing?)

@colin99d
Copy link
Contributor Author

colin99d commented Jan 8, 2023

Here is the issue:
Instagram/LibCST#846
I will just make it a check for now!

new_call,
expr.location,
expr.end_location.unwrap(),
));
}
checker.add_check(check);
checker.diagnostics.push(diagnostic);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you and sorry that this all churned mid-PR for you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im never going to complain about you making this project easier to contribute to.

@colin99d
Copy link
Contributor Author

colin99d commented Jan 9, 2023

Hey @charliermarsh I got all the positive test cases working (except for the one edge case we talked about), and all the negative cases working except one. I wanted to get your feedback before I implemented it.
The following edge case is what causes issues:

'{' '0}'.format(1)

Since the format specifier is split into two different strings, it is NOT supposed to be formatted. However, both the Expr
struct, and the tokens automatically combine this into one string. What would be the best way of detecting that this edge case is preset? Or do we want to ignore it? Since the refactoring would produce changes that could be seen as desirable:
Screenshot 2023-01-08 at 10 07 52 PM

@charliermarsh
Copy link
Member

I think it's best just to ignore it (i.e., treat it as if there is no space). It strikes me as exceptionally rare, to the point that it's likely either a mistake or that it's not worth adding more complexity to our own implementation to handle it.

@charliermarsh
Copy link
Member

We could probably detect it by tokenizing... but it doesn't seem worth it to me.

@colin99d
Copy link
Contributor Author

colin99d commented Jan 9, 2023

I think it's best just to ignore it (i.e., treat it as if there is no space). It strikes me as exceptionally rare, to the point that it's likely either a mistake or that it's not worth adding more complexity to our own implementation to handle it.

100% agree, especially since the python code will still work.

@colin99d
Copy link
Contributor Author

colin99d commented Jan 9, 2023

I just need to fix one bug with this edge case, and then I should be able to take this PR out of draft.

@colin99d colin99d marked this pull request as ready for review January 9, 2023 15:59
@colin99d
Copy link
Contributor Author

@charliermarsh anything else you want to see from this for merge?

@charliermarsh
Copy link
Member

@colin99d - Will review this tonight!

// FOR REVIEWER: If the new and old are identical, don't create a fix. Pyupgrade
// doesnt even want me to report this, so we could just have an enum for errors,
// and if a special one is returned here then we dont even report a fix
if module_text == final_state.to_string() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When does this happen?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah nevermind.

}

/// UP030
pub(crate) fn format_literals(checker: &mut Checker, summary: &FormatSummary, expr: &Expr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colin99d - It turns out that we already had a utility for extracting the format positions, which uses the RustPython string parser underneath and so is very robust. Sorry that I didn't flag this earlier -- I didn't realize it could even be reused here, but it should make things more efficient too since we can do one string parse and share that "summary" amongst a bunch of checks.

contents,
expr.location,
expr.end_location.unwrap(),
));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like one strategy that could work here would be...

  • Take the entire text.
  • Replace the string part via the regex above.
  • Replace the arguments via LibCST.

Kinda tough, hacky, etc... but would solve the missing cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I guess that wouldn't solve the '{' '0}'.format(1) case.

@charliermarsh charliermarsh merged commit c016c41 into astral-sh:main Jan 11, 2023
@colin99d
Copy link
Contributor Author

Thanks for the review!

@colin99d colin99d deleted the formatspecifiers branch January 11, 2023 01:50
renovate bot added a commit to ixm-one/pytest-cmake-presets that referenced this pull request Jan 11, 2023
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [ruff](https://togithub.com/charliermarsh/ruff) | `^0.0.217` ->
`^0.0.218` |
[![age](https://badges.renovateapi.com/packages/pypi/ruff/0.0.218/age-slim)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://badges.renovateapi.com/packages/pypi/ruff/0.0.218/adoption-slim)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://badges.renovateapi.com/packages/pypi/ruff/0.0.218/compatibility-slim/0.0.217)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://badges.renovateapi.com/packages/pypi/ruff/0.0.218/confidence-slim/0.0.217)](https://docs.renovatebot.com/merge-confidence/)
|

---

### Release Notes

<details>
<summary>charliermarsh/ruff</summary>

###
[`v0.0.218`](https://togithub.com/charliermarsh/ruff/releases/tag/v0.0.218)

[Compare
Source](https://togithub.com/charliermarsh/ruff/compare/v0.0.217...v0.0.218)

#### What's Changed

- Implement flake8-simplify SIM112 by
[@&#8203;messense](https://togithub.com/messense) in
[astral-sh/ruff#1764
- Do not autofix PT004 and PT005 by
[@&#8203;harupy](https://togithub.com/harupy) in
[astral-sh/ruff#1763
- Disable release builds on CI by
[@&#8203;charliermarsh](https://togithub.com/charliermarsh) in
[astral-sh/ruff#1761
- Move CONTRIBUTING.md to top-level by
[@&#8203;charliermarsh](https://togithub.com/charliermarsh) in
[astral-sh/ruff#1768
- \[`flake8-bandit`] Add Rule for `S508` (snmp insecure version) &
`S509` (snmp weak cryptography) by
[@&#8203;saadmk11](https://togithub.com/saadmk11) in
[astral-sh/ruff#1771
- Generate RuleCode::origin() via macro by
[@&#8203;not-my-profile](https://togithub.com/not-my-profile) in
[astral-sh/ruff#1770
- Disable doctests by
[@&#8203;charliermarsh](https://togithub.com/charliermarsh) in
[astral-sh/ruff#1772
- Enable isort-style `required-imports` enforcement by
[@&#8203;charliermarsh](https://togithub.com/charliermarsh) in
[astral-sh/ruff#1762
- Pyupgrade: Format specifiers by
[@&#8203;colin99d](https://togithub.com/colin99d) in
[astral-sh/ruff#1594
- Avoid B023 false-positives for some common builtins by
[@&#8203;charliermarsh](https://togithub.com/charliermarsh) in
[astral-sh/ruff#1776

**Full Changelog**:
astral-sh/ruff@v0.0.217...v0.0.218

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://app.renovatebot.com/dashboard#github/ixm-one/pytest-cmake-presets).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNC45Ny4xIiwidXBkYXRlZEluVmVyIjoiMzQuOTcuMSJ9-->

Signed-off-by: Renovate Bot <bot@renovateapp.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants