Deprecate lib2to3 (and 2to3) for future removal #84540

Closed
gpshead opened this issue Apr 22, 2020 · 51 comments
Labels
3.11 bug and security fixes topic-2to3 type-feature A feature request or enhancement

Comments

@gpshead
Member

gpshead commented Apr 22, 2020

BPO 40360
Nosy @gvanrossum, @gpshead, @vstinner, @carljm, @ambv, @ericsnowcurrently, @hroncok, @corona10, @miss-islington, @tirkarthi, @isidentical, @iritkatriel
PRs
  • bpo-40360: Prepare to deprecate lib2to3. #19645
  • bpo-40360: Deprecate lib2to3 module in light of PEP 617 #19663
  • bpo-40360: Add a What's new entry for lib2to3 pending deprecation #19898
  • bpo-40360: Handle PendingDeprecationWarning in test_lib2to3. #21694
  • [3.9] bpo-40360: Handle PendingDeprecationWarning in test_lib2to3. (GH-21694) #21696
  • [3.9] bpo-40360: Handle PendingDeprecationWarning in test_lib2to3. (GH-21694) #21697
  • bpo-40360: Deprecate the lib2to3 package #28116
  • bpo-40360: [doc] Rephrase deprecation note about lib2to3 #28122
  • [3.10] bpo-40360: [doc] Rephrase deprecation note about lib2to3 (GH-28122) #28127
  • bpo-40360: Make the 2to3 deprecation more obvious. #29064
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2021-09-02.16:03:42.312>
    created_at = <Date 2020-04-22.04:40:54.230>
    labels = ['type-feature', 'expert-2to3', '3.11']
    title = 'Deprecate lib2to3 (and 2to3) for future removal'
    updated_at = <Date 2021-10-20.22:28:52.390>
    user = 'https://github.com/gpshead'

    bugs.python.org fields:

    activity = <Date 2021-10-20.22:28:52.390>
    actor = 'iritkatriel'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-09-02.16:03:42.312>
    closer = 'vstinner'
    components = ['2to3 (2.x to 3.x conversion tool)']
    creation = <Date 2020-04-22.04:40:54.230>
    creator = 'gregory.p.smith'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40360
    keywords = ['patch']
    message_count = 51.0
    messages = ['366973', '367005', '367031', '367051', '367208', '367209', '367230', '367235', '367716', '367726', '367730', '367743', '367744', '367766', '367767', '367770', '367771', '367884', '368077', '368388', '373185', '373198', '373327', '373332', '373334', '373444', '373538', '374348', '374349', '374566', '374572', '374573', '374574', '374611', '374634', '374643', '379014', '400876', '400877', '400879', '400883', '400885', '400892', '400905', '400906', '400923', '400927', '400933', '400937', '404324', '404532']
    nosy_count = 14.0
    nosy_names = ['gvanrossum', 'gregory.p.smith', 'vstinner', 'carljm', 'lukasz.langa', 'eric.snow', 'davidhalter', 'hroncok', 'corona10', 'miss-islington', 'xtreak', 'BTaskaya', 'Peter Ludemann', 'iritkatriel']
    pr_nums = ['19645', '19663', '19898', '21694', '21696', '21697', '28116', '28122', '28127', '29064']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue40360'
    versions = ['Python 3.11']

    @gpshead
    Member Author

    gpshead commented Apr 22, 2020

    Based on the PEP-617 acceptance thread on python-dev, lib2to3 is eventually going to run into trouble parsing modern syntax a few releases from now.

    It would be better off maintained outside of the standard library. It gets used by a lot of things and is generally useful, but it would make a lot more sense as a PyPI project than as something only quasi-maintained within the stdlib. (It only gained the ability to parse a couple of modern syntax features via bugfix contributions to the stdlib in the past month or two, meaning a lot of versions of it out there cannot.)

    Black has already forked it.

    goal: PendingDeprecationWarning and documentation as such in 3.9. Move to DeprecationWarning in 3.10 or 3.11 and remove it by ~3.12. Subject to our existing deprecation process guidelines.
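The staged plan above (a pending warning first, a full deprecation warning later) can be illustrated with the standard `warnings` module. This is only a minimal sketch, not the code that was committed for lib2to3: the package name `hypothetical_pkg` and the helper `warn_deprecated` are invented for illustration.

```python
import warnings


def warn_deprecated(stage: str) -> None:
    """Emit the warning for a deprecation stage ('pending' or 'deprecated').

    Hypothetical helper: 'pending' stands for the first release of the plan,
    'deprecated' for the later one.
    """
    if stage == "pending":
        category = PendingDeprecationWarning
    else:
        category = DeprecationWarning
    warnings.warn(
        "hypothetical_pkg is deprecated and scheduled for removal",
        category,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )


# Record the warnings instead of printing them, so the categories can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_deprecated("pending")
    warn_deprecated("deprecated")

categories = [w.category for w in caught]
print(categories)
```

Only the warning category changes between the two stages, which is why the later upgrade can be a small patch.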

    @gpshead gpshead added the 3.9 only security fixes label Apr 22, 2020
    @gpshead gpshead self-assigned this Apr 22, 2020
    @gpshead gpshead added type-feature A feature request or enhancement 3.9 only security fixes labels Apr 22, 2020
    @gpshead gpshead added type-feature A feature request or enhancement topic-2to3 labels Apr 22, 2020
    @gvanrossum
    Member

    I am in favor of this. We could promote LibCST, which is based on Parso, which uses a forked version of pgen2 (the parser in lib2to3). I believe one of these could switch to a fork of pegen as its parser, so it will be able to handle new PEG based syntax in 3.10+.

    Removal by 3.12 might be feasible.

    @carljm
    Contributor

    carljm commented Apr 22, 2020

    I volunteered in the python-dev thread to write a patch to the docs clarifying future status of lib2to3; happy to include the PendingDeprecationWarning as well.

    Re linking to alternatives, we want to make sure we link to alternatives that are committed to updating to support newer Python versions' syntax. This definitely includes LibCST; I can inquire with the parso maintainer about whether it also includes parso. In future it could also include a third-party-maintained copy of lib2to3, if someone picks that up.

    @carljm
    Contributor

    carljm commented Apr 22, 2020

    I opened a PR. It deprecates the lib2to3 library to discourage future use of it for Python 3, but not the 2to3 tool. This of course means that the lib2to3 module will in practice stick around in the stdlib as long as 2to3 is still bundled with Python.

    It seems like the idea in this issue is to deprecate and remove both. I'm not sure what we typically do to deprecate a command-line utility bundled with Python. Given warnings are silent by default, the deprecation warning for lib2to3 won't be visible to users of 2to3. Should I add something to its --help output? Or something more aggressive; an unconditionally-printed warning?
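The visibility concern can be reproduced in-process. The sketch below assumes a release build of CPython, whose default filters ignore `PendingDeprecationWarning`; it installs that filter explicitly rather than relying on the interpreter's startup state. The function name `is_shown` and the message text are made up for illustration.

```python
import warnings


def is_shown(action: str) -> bool:
    """Return True if a PendingDeprecationWarning passes a filter using `action`."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.resetwarnings()  # start from an empty filter list
        warnings.simplefilter(action, PendingDeprecationWarning)
        warnings.warn(
            "hypothetical_tool is pending deprecation",
            PendingDeprecationWarning,
        )
    return bool(caught)


# "ignore" mirrors the default filter for this category in a release build.
shown_by_default = is_shown("ignore")
# "always" is roughly what running with `-W always` asks for.
shown_when_enabled = is_shown("always")
print(shown_by_default, shown_when_enabled)
```

A user who never passes `-W` or sets `PYTHONWARNINGS` is in the first situation, which is the motivation for printing something unconditionally from the command-line tool.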

    @gpshead
    Member Author

    gpshead commented Apr 24, 2020

    New changeset 503de71 by Carl Meyer in branch 'master':
    bpo-40360: Deprecate lib2to3 module in light of PEP-617 (GH-19663)
    503de71

    @gpshead
    Member Author

    gpshead commented Apr 24, 2020

    Okay, the pending deprecation is in. Keeping open as a reminder to turn that into a real DeprecationWarning in 3.10 after the 3.9 branch is cut.

    We'll then want a tracking reminder to remove it in 3.12.

    @carljm
    Contributor

    carljm commented Apr 24, 2020

    @gregory.p.smith

    What do you think about the question I raised above about how to make this deprecation visible to users of the 2to3 CLI tool, assuming the plan is to remove both?

    @gpshead
    Member Author

    gpshead commented Apr 24, 2020

    I think what we're doing with the documentation update is fine. We can add a warning on stderr to the tool in 3.11. But I don't expect people will be using the tool _from_ the latest CPython 3.x by then.

    2to3 is already included with Python 2.7, and the only real use for it is for people who still maintain code on 2.7, so they've got a copy already. There is no value in running a 2to3 shipped with Python 3 vs the latest 2.7. Meaningful updates to it were already backported to 2.7 over time, as it was intentionally exempt from the feature freeze.

    We should have sorted out a PyPI home for lib2to3 by 3.11 time and can also create a PyPI package for the 2to3 tool itself at that point.

    I _think_ there is support for running 2to3 on sources at package install time from setup.py? But I don't expect anything actually maintained and widely used to require that by the time this deprecation lands. If it does, that becomes a plumbing issue within package tools: requiring 2to3 at either build or install time adds an implicit tool dependency on the new PyPI package.

    Maybe I'm just in a good mood about all of this, but none of this seems worrisome.

    @PeterLudemann
    Mannequin

    PeterLudemann mannequin commented Apr 29, 2020

    The documentation change gives two possible successors:

    https://libcst.readthedocs.io/ (https://github.com/Instagram/LibCST)
    https://parso.readthedocs.io/

    And I've also seen this mentioned: https://github.com/pyga/awpa

    Is it possible to settle on one of these as the successor to the lib2to3 parser? It would be nice to avoid a 2nd deprecation in the future ...

    @gvanrossum
    Member

    It's typically not up to the core devs to pick a winning third party library; we tend to recommend libraries that are already essentially category winners, like requests. In a sense pointing to LibCST *and* parso is redundant because LibCST builds on parso. Comparing stars on GitHub:

    • LibCST: 423
    • parso: 296
    • awpa: 10

    @carljm
    Contributor

    carljm commented Apr 30, 2020

    Right, although I think it still makes sense to link both LibCST and parso since they provide different levels of abstraction that would be suitable for different types of tools (e.g. I would rather write an auto-formatter on top of parso, because LibCST's careful parsing and assignment of whitespace would mostly just get in the way, but I'd rather write any kind of refactoring tooling on top of LibCST.)

    Another tool that escaped my mind when writing the PR that should probably be linked also is Baron/RedBaron (https://github.com/PyCQA/redbaron); 457 stars makes it slightly more popular than LibCST (but it's also been around a lot longer.)

    @hroncok
    Mannequin

    hroncok mannequin commented Apr 30, 2020

    Could you please add a what's new entry for this change?

    @hroncok
    Mannequin

    hroncok mannequin commented Apr 30, 2020

    I don't understand why there is a PendingDeprecationWarning and not a DeprecationWarning.

    See https://discuss.python.org/t/pendingdeprecationwarning-is-really-useful/1038/4 and bpo-36404

    @carljm
    Contributor

    carljm commented Apr 30, 2020

    Could you please add a what's new entry for this change?

    The committed change already included an entry in NEWS. Is a "What's New" entry something different?

    I don't understand why there is a PendingDeprecationWarning and not a DeprecationWarning.

    Purely because I was following gps' recommendation in the first comment on this issue. Getting rid of PendingDeprecationWarning seems like an orthogonal decision; if it happens, this can trivially be upgraded to DeprecationWarning as part of a removal sweep.

    @gvanrossum
    Member

    A "What's New" entry would go into Doc/whatsnew/3.9.rst and is much more visible to users looking for exciting bits in the new release (the NEWS file is very large, see e.g. https://docs.python.org/3/whatsnew/changelog.html#changelog).

    The What's New doc typically has a section collecting all the deprecations, e.g. https://docs.python.org/3/whatsnew/3.8.html#deprecated.

    @hroncok
    Mannequin

    hroncok mannequin commented Apr 30, 2020

    Getting rid of PendingDeprecationWarning seems like an orthogonal decision; if it happens, this can trivially be upgraded to DeprecationWarning as part of a removal sweep.

    My thought was that the decision was already made to do so. Hence adding new PendingDeprecationWarnings goes against that decision.

    But maybe I misunderstand and that decision was not made.

    @gvanrossum
    Member

    IIRC PendingDeprecationWarning does not mean that the decision hasn't been made yet. It just means it's less urgent for folks to worry about. I believe we tend to change PendingDeprecationWarning to DeprecationWarning in the last release before something is removed.

    @hroncok
    Mannequin

    hroncok mannequin commented May 1, 2020

    Thanks for the explanation.

    I plan to send a PR to add this to the What's new in 3.9 page early next week. Anyone, feel free to beat me to it.

    @gpshead
    Member Author

    gpshead commented May 4, 2020

    New changeset 18f1c60 by Miro Hrončok in branch 'master':
    bpo-40360: Add a What's New entry for lib2to3 pending deprecation (GH-19898)
    18f1c60

    @vstinner
    Member

    vstinner commented May 7, 2020

    FYI the autopep8 project uses lib2to3.

    @PeterLudemann
    Mannequin

    PeterLudemann mannequin commented Jul 6, 2020

    Looking at the suggested successor tools (redbaron, libCST, parso, awpa) ... all of them appear to use some variant of pgen2. But at some point Python will be using a PEG approach (PEP-617), and therefore the pgen2 approach apparently won't work.

    For a number of projects, it's important to have a parse tree that contains all the "whitespace" information (indent, dedent, comment, newline, etc.) As far as I can tell, the new PEG parser won't provide that, and it seems that none of the successor tools will be able to handle future versions of Python syntax.

    So, three questions:

    1. Am I right that all proposed replacements (redbaron, libCST, parso, awpa) use some variation of an LL(1) parser and therefore will have trouble in the future?
    2. Are there any plans (either part of the core development or as a project) for one of these replacements that is PEG-based? (Or a new project?)
    3. Is Lib/ast.py going to continue being supported? (I infer that it will, with the change from LL(1) to PEG being mostly transparent - https://mail.python.org/archives/list/python-dev@python.org/thread/HOZ2RI3FXUEMAT4XAX4UHFN4PKG5J5GR/#4D3B2NM2JMV2UKIT6EV5Q2A6XK2HXDEH )

    If Lib/ast.py continues to be supported, I think I can see a way of providing functionality similar to lib2to3 (in terms of an AST-ish thing with "whitespace" from the source, sufficient for tools such as yapf, black, pykythe, pytype, mypy, etc.) as a kind of wrapper to ast.py.
    I suppose I should discuss this idea on python-dev? Is there an ongoing discussion? (I couldn't find any but might have been using the wrong search terms)
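To make the gap concrete: `ast` keeps source positions but drops comments, while `tokenize` still reports them. The sketch below only demonstrates that split with two standard-library modules; it is not the wrapper proposed in this comment, and the sample source line is invented.

```python
import ast
import io
import tokenize

source = "x = 1  # answer\n"

# The AST records where the assignment sits, but the comment is not in the tree.
tree = ast.parse(source)
assign = tree.body[0]
position = (assign.lineno, assign.col_offset, assign.end_col_offset)

# The tokenizer still yields the comment together with its (row, column) start.
comments = [
    (tok.start, tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.COMMENT
]

print(position)
print(comments)
```

A wrapper of the kind described would have to merge these two views by position, which is roughly what a concrete syntax tree gives directly.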

    @gvanrossum
    Member

    There's no python-dev discussion; if you want more feedback I recommend starting on python-ideas first (on either forum you may expect pushback because this is not about a proposed change to Python or its workflow).

    The Lib/ast.py module will continue to be the official API for the standard AST. It is a simple wrapper around the builtin parser (at least in CPython -- I don't actually know to what extent other Python implementations support it, but they certainly *could*). And in 3.9 and later the AST is already being produced using the *new* parser.

    We want to deprecate lib2to3 because nobody is interested in maintaining it. Having it in the stdlib, with its strict backwards compatibility requirements, makes it difficult to do a good job at updating it. This is why it's been forked repeatedly -- once forked, the owner of the fork can make changes easily, preserving the API perfectly (if so desired) and maintaining compatibility with older Python versions.

    My own thoughts are that libraries like LibCST and parso have two sides: an API for the AST, and a way to parse source code into an AST. Usually the parsing API is incredibly simple -- e.g. a function to parse a file and another function to parse a string. And there's no reason for the AST API to change just because the parsing algorithm has changed.
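The "two sides" can be sketched with the standard `ast` module standing in for a third-party library. The names `parse_string` and `parse_file` are hypothetical and are not the API of LibCST or parso; the point is only that the parsing entry points are thin, so the parser behind them could be swapped without touching the tree API.

```python
import ast
import tempfile
from pathlib import Path


def parse_string(source: str) -> ast.Module:
    """Parsing side: turn source text into a tree (ast.parse is a stand-in)."""
    return ast.parse(source)


def parse_file(path) -> ast.Module:
    """Parsing side: read a file and delegate to parse_string."""
    return parse_string(Path(path).read_text(encoding="utf-8"))


# Tree side: callers only ever see the returned nodes, never the parser.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.py"
    sample.write_text("answer = 42\n", encoding="utf-8")
    from_file = parse_file(sample)

from_string = parse_string("answer = 42\n")
same_tree = ast.dump(from_file) == ast.dump(from_string)
print(same_tree)
```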

    Finally, we already have a (rough) Python implementation of the PEG parser too -- in fact it's included in Tools/peg_generator (and used to regenerate the metaparser). This reads the same grammar format (i.e. Grammar/python.gram) and generates Python code instead of C code to do the parsing. It's easy to retarget the tokenizer of the generated Python code.

    So a decent way forward might be to pick one of the 3rd party libraries (perhaps parso, which is itself a fork of lib2to3 and what LibCST builds on) and update its parser to use a PEG parser generated using the PEG generator from Tools/peg_generator (which people are welcome to fork).

    This might be a summer-of-code-sized project.

    @gpshead gpshead removed 3.9 only security fixes labels Oct 19, 2020
    @vstinner
    Member

    vstinner commented Sep 1, 2021

    I created PR 28116 to deprecate the lib2to3 package: replace PendingDeprecationWarning with DeprecationWarning.

    In 2021, I don't think that we should keep the 2to3 tool in the stdlib. Python 2 reached end of life 1 year ago.

    For the lib2to3 *parser*, IMO it would be better to maintain it outside the stdlib, and collaborate with other existing forks, like the parser used by Black.

    The change is only about *deprecating* lib2to3, not *removing* it. For the removal, we should check whether the major projects which used it have moved to something else.

    It's kind of a shame that Python stdlib (lib2to3) cannot parse valid Python 3.10 code :-( IMO it's better to deprecate (and then remove) lib2to3.

    @vstinner
    Member

    vstinner commented Sep 1, 2021

    I retarget this issue to Python 3.11, since lib2to3 is *not* deprecated in Python 3.10.

    @vstinner vstinner added 3.11 bug and security fixes and removed 3.10 only security fixes labels Sep 1, 2021
    @gvanrossum
    Member

    How come the deprecation didn't happen in 3.10? Were people just not interested?

    @gpshead
    Member Author

    gpshead commented Sep 1, 2021

    I think we just forgot to make the change in time. 3.11 is fine. We're not _maintaining_ lib2to3 or describing it as fit for any modern purpose regardless. It's just code that'll sit around in the back of the 3.10 stdlib but not be able to parse the new syntax in 3.10.

    @vstinner
    Member

    vstinner commented Sep 1, 2021

    Guido: How come the deprecation didn't happen in 3.10? Were people just not interested?

    Well, if nobody deprecates it, it's not deprecated. It is as simple as that :-)

    IMO it's ok to only deprecate it in Python 3.11, unless Pablo *really* wants to deprecate lib2to3 just before Python 3.10 final.

    I dislike adding new warnings between a beta1 release and the final release :-( In my experience, it *does* break projects which care about warnings.

    @gvanrossum
    Member

    We can add to the 3.10 docs that it is deprecated without any code change. And in 3.11 we can add a warning.

    @ambv
    Contributor

    ambv commented Sep 2, 2021

    New changeset d589a7e by Victor Stinner in branch 'main':
    bpo-40360: Deprecate the lib2to3 package (GH-28116)
    d589a7e

    @ambv
    Contributor

    ambv commented Sep 2, 2021

    We can add to the 3.10 docs that it is deprecated without any code change.

    This was already the case:
    https://docs.python.org/3/library/2to3.html#module-lib2to3

    The wording was a bit clumsy so I rephrased in #72309.

    @ambv
    Contributor

    ambv commented Sep 2, 2021

    New changeset f0b63d5 by Łukasz Langa in branch 'main':
    bpo-40360: [doc] Rephrase deprecation note about lib2to3 (GH-28122)
    f0b63d5

    @miss-islington
    Contributor

    New changeset 559af74 by Miss Islington (bot) in branch '3.10':
    bpo-40360: [doc] Rephrase deprecation note about lib2to3 (GH-28122)
    559af74

    @vstinner
    Member

    vstinner commented Sep 2, 2021

    I'm closing the issue: lib2to3 is now deprecated in Python 3.11. I propose opening a new issue for Python 3.13 or newer to remove it.

    @ambv
    Contributor

    ambv commented Sep 2, 2021

    The "pending" deprecation status of lib2to3 in 3.9 and 3.10 is no worse than a vanilla deprecation in terms of visibility. It will appear just the same when run with pytest or -X dev.
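The point about equal visibility can be checked with a small sketch. It assumes that `-X dev` behaves like installing a `default` warning filter, and it installs that filter by hand instead of restarting the interpreter. The function name `recorded_under` and the messages are illustrative only.

```python
import warnings


def recorded_under(action: str) -> list:
    """Return the names of the warning categories that pass a blanket filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.resetwarnings()  # drop the interpreter's startup filters
        warnings.simplefilter(action)  # applies to every warning category
        warnings.warn("pending notice", PendingDeprecationWarning)
        warnings.warn("full notice", DeprecationWarning)
    return [w.category.__name__ for w in caught]


names = recorded_under("default")
print(names)
```

Under such a filter the two categories are reported alike; they differ only under the interpreter's stock filters, where the pending variant is ignored.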

    However, upgrading the deprecation between 3.10.0rc1 and 3.10.0rc2 really was too late. So we'll have it deprecated fully in 3.11 and removed in 3.13. One more year doesn't hurt us much but might be helpful for library maintainers to have more time to move off of the included lib2to3.

    Invoking 2to3 also generates the warning so I guess this can be closed:

    $ ./python.exe Tools/scripts/2to3
    /private/tmp/cpy2/Tools/scripts/2to3:3: DeprecationWarning: lib2to3 package is deprecated and may not be able to parse Python 3.10+
      from lib2to3.main import main
    At least one file or directory argument required.
    Use --help to show usage.

    Thanks for the patches, Gregory, Carl, and Victor! ✨ 🍰 ✨

    Now we just have to remember to actually remove the damn thing in 3.13 😂

    @ambv
    Contributor

    ambv commented Oct 19, 2021

    New changeset fdbdf3f by Gregory P. Smith in branch 'main':
    bpo-40360: Make the 2to3 deprecation more obvious. (GH-29064)
    fdbdf3f

    @iritkatriel
    Member

    Created bpo-45544 to close all open issues and list them there.
