Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reindent.py inserts spaces in multiline literals #57139

Open
dimaqq mannequin opened this issue Sep 7, 2011 · 22 comments
Open

reindent.py inserts spaces in multiline literals #57139

dimaqq mannequin opened this issue Sep 7, 2011 · 22 comments
Labels
3.8 only security fixes 3.9 only security fixes 3.10 only security fixes build The build process and cross-build type-bug An unexpected behavior, bug, or error

Comments

@dimaqq
Copy link
Mannequin

dimaqq mannequin commented Sep 7, 2011

BPO 12930
Nosy @serwy, @merwok, @dimaqq, @serhiy-storchaka
Dependencies
  • bpo-13447: Add tests for some scripts in Tools/scripts
  • Files
  • caioromao-fix-12930-v4.patch
  • bug12930-testfiles.tgz: test
  • testfile-original.py
  • testfile-expected.py
  • testfile-issue.py
  • tab_test.py
  • save_strings.patch: Patch which replaces tab characters in strings with '\t'
  • issue12930.patch
  • test_reindent.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2011-09-07.14:29:37.117>
    labels = ['type-bug', '3.8', '3.9', '3.10']
    title = 'reindent.py inserts spaces in multiline literals'
    updated_at = <Date 2020-10-26.00:05:12.927>
    user = 'https://github.com/dimaqq'

    bugs.python.org fields:

    activity = <Date 2020-10-26.00:05:12.927>
    actor = 'Dima.Tisnek'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Demos and Tools']
    creation = <Date 2011-09-07.14:29:37.117>
    creator = 'Dima.Tisnek'
    dependencies = ['13447']
    files = ['23194', '23195', '23500', '23501', '23502', '24074', '24120', '27366', '27367']
    hgrepos = []
    issue_num = 12930
    keywords = ['patch']
    message_count = 22.0
    messages = ['143679', '143715', '143718', '143999', '144291', '144313', '144314', '144321', '146112', '146169', '146209', '146212', '146213', '148063', '150064', '150425', '150426', '171678', '171679', '171682', '171685', '379621']
    nosy_count = 6.0
    nosy_names = ['roger.serwy', 'eric.araujo', 'Dima.Tisnek', 'Caio.Romao', 'Jonathan.Rogers', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue12930'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @dimaqq
    Copy link
    Mannequin Author

    dimaqq mannequin commented Sep 7, 2011

    Given this as input:
    #!/usr/bin/python
    def x():
    s = """line one
    line two
    line three"""
    return s

    reindent.py changes it to:
    #!/usr/bin/python
    def x():
    s = """line one
    line two
    line three"""
    return s

    Which means that I cannot use reindent.py on any source that includes multiline literals that are not docs.

    Btw, it is generally weird that reindented file ends up with 2 spaces before "line two".

    @dimaqq dimaqq mannequin added the type-bug An unexpected behavior, bug, or error label Sep 7, 2011
    @CaioRomao
    Copy link
    Mannequin

    CaioRomao mannequin commented Sep 8, 2011

    This patch fixes the reported issue.

    First time contributor here, feel free to bash.

    @CaioRomao
    Copy link
    Mannequin

    CaioRomao mannequin commented Sep 8, 2011

    New patch, fixing issue pointed out by gutworth and some others that came up.

    I've used the following as a test input:

    -----8<----------8<--------

    #!/usr/bin/python
    def x():
      """
      This is a doc
      """
      '''
      Another doc.'''
      s = """line one
    line two
          line three
          '''
    line five
    """
      var = '''test'''
      """Third doc"""
      return s
    -----8<----------8<

    The patch got way bigger than the initial version and feels a little hackish, but I couldn't come up with something better.

    @CaioRomao
    Copy link
    Mannequin

    CaioRomao mannequin commented Sep 14, 2011

    Third version, with slightly less code and my name added to the Misc/ACKS file.

    @merwok
    Copy link
    Member

    merwok commented Sep 19, 2011

    Follow the “review” link for some comments.

    Do you have a test file that I could use to reproduce the bug and make sure it’s fixed?

    @CaioRomao
    Copy link
    Mannequin

    CaioRomao mannequin commented Sep 19, 2011

    New patch version ack-ing Éric's suggestion.

    Note: I'm now confused as to whether I should add my name to the ACKS file or not, heh. This patch doesn't include the change.

    @CaioRomao
    Copy link
    Mannequin

    CaioRomao mannequin commented Sep 19, 2011

    Attaching files for testing in a gzipped tarball:

    • testfile-original.py: file to be reindented with reindent.py
    • testfile-issue.py: resulting file after using the current Tools/scripts/reindent.py
    • testfile-expected.py: expected output file; the result after applying the patch in this thread

    @dimaqq
    Copy link
    Mannequin Author

    dimaqq mannequin commented Sep 20, 2011

    Thanks Caio, your test case covers my issue; seeing these spelt out got me thinking, there are perhaps 3~4 different cases:

    def f0():
      s = """select
    some sql
    from
    somewhere;
    -- cannot be reindented"""
    
    def f1():
      """ Multiline 
      text docstring
      should be reindented """
    
    def f2():
      """ should example be reindented inside docstring?
      if f2():
        print "great"
      """
    
    def f3():
      """ # should doctest be reindented inside docstring?
      >>> if f3():
      ...   print "yes"
      ... else:
      ...   print "no"
      ...
      no
      """

    @CaioRomao
    Copy link
    Mannequin

    CaioRomao mannequin commented Oct 21, 2011

    It's been a while since this got any activity. Was the provided testfile not enough or any issue found? Just let me know and I'll make adjustments asap.

    @dimaqq
    Copy link
    Mannequin Author

    dimaqq mannequin commented Oct 22, 2011

    good enough for me :)

    @merwok
    Copy link
    Member

    merwok commented Oct 23, 2011

    I want to look at this but lack time right now. Could someone make one up-to-date patch with all changes, code and tests? It will be much easier to review and test than an archive.

    @merwok
    Copy link
    Member

    merwok commented Oct 23, 2011

    Forget that, there are no automated tests for tools. So, a text version of the files would be nice.

    @CaioRomao
    Copy link
    Mannequin

    CaioRomao mannequin commented Oct 23, 2011

    Attaching files from tarball as requested. See http://bugs.python.org/issue12930#msg144314 for explanation

    @merwok
    Copy link
    Member

    merwok commented Nov 21, 2011

    Thanks, I’ll make time to review the change this week.

    @serwy
    Copy link
    Mannequin

    serwy mannequin commented Dec 22, 2011

    I applied the patch and it failed against the attached "tab_test.py" file. For reference, every '\t' in the file is followed by "Tab".

    @JonathanRogers
    Copy link
    Mannequin

    JonathanRogers mannequin commented Jan 1, 2012

    I don't think reindent.py should change any bytes inside string literals since it can't know anything about what those strings mean or how they'll be used by the program at run time. Unfortunately, it starts out by unconditionally calling the .expandtabs() method on each input line, so tab characters are lost. The only change to a string literal I can imagine that would be safe is to replace tab characters with '\t'.

    I am trying to use reindent.py on Python source files which include triple-quoted, multi-line string literals containing makefile and Python snippets. In both cases, running reindent.py changes the meaning of of that contained in the literal.

    @JonathanRogers
    Copy link
    Mannequin

    JonathanRogers mannequin commented Jan 1, 2012

    Rather than expanding tab characters inside string literals, it's safer to replace them with '\t'.

    @serwy
    Copy link
    Mannequin

    serwy mannequin commented Sep 30, 2012

    This issue also affects reindent.py in Python 3.x.

    The attached patch adds a "-s" switch to reindent.py to disallow changes to the contents of strings.

    The rstrip_and_expand_tabs function tokenizes the input to identify all strings and then expands and rstrips all parts of the input not within a string.

    @serwy
    Copy link
    Mannequin

    serwy mannequin commented Sep 30, 2012

    Attached is a simple test of the "-s" functionality for reindent.py. It works on Linux but I'm not sure about Windows.

    @serhiy-storchaka
    Copy link
    Member

    Definitely a bugfix should not add any new switches. Patch would simply lead to preserving of all string literals and nothing more. I will look at Caio's patches a little later.

    Now Python contains automated tests for some tools.

    @serwy
    Copy link
    Mannequin

    serwy mannequin commented Oct 1, 2012

    Definitely a bugfix should not add any new switches. Patch would simply lead to preserving of all string literals and nothing more.
    I added the switch so that the existing behavior of reindent would stay
    the same in case some other scripts rely on that behavior. If you'd
    rather not include the switch, then I'm ok with removing it.

    The patch causes reindent to no longer removes trailing whitespace on
    multi-line doc strings. Is that side-effect acceptable?

    @iritkatriel iritkatriel added 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes labels Oct 25, 2020
    @dimaqq
    Copy link
    Mannequin Author

    dimaqq mannequin commented Oct 26, 2020

    Given the trajectory of this bug, would it be easier to remove reindent.py from cpython code base entirely?

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel iritkatriel added the build The build process and cross-build label Nov 24, 2023
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes 3.9 only security fixes 3.10 only security fixes build The build process and cross-build type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants