tarfile: _proc_gnulong uses removesuffix("/") but comment claims consistency with frombuf() which uses rstrip("/") #149980

@lpyu001

Bug report

Bug description:

Hi, I found a minor bug in Lib/tarfile.py. _proc_gnulong strips trailing slashes
from directory names using removesuffix("/"), with a comment stating this is "to be
consistent with frombuf()". However, frombuf() (and _proc_builtin) use
rstrip("/"). The two are not equivalent:

  • rstrip("/") removes all trailing slashes.
  • removesuffix("/") removes at most one trailing slash.
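
A minimal illustration of the difference, using a made-up name ("dir//") rather than anything taken from tarfile itself. It assumes Python 3.9 or later, where str.removesuffix is available:

```python
# Illustrative only: "dir//" and "dir/" are example values, not tarfile data.
double = "dir//"
single = "dir/"

stripped_all = double.rstrip("/")        # removes every trailing slash
stripped_one = double.removesuffix("/")  # removes at most one trailing slash

print(repr(stripped_all))   # 'dir'
print(repr(stripped_one))   # 'dir/'

# With a single trailing slash the two calls agree, which is why the
# difference only shows up for names ending in two or more slashes.
same_for_single = single.rstrip("/") == single.removesuffix("/")
print(same_for_single)      # True
```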

For tar archives in GNU_FORMAT, a directory entry whose name is long enough
to require a GNU long-name header AND ends with multiple slashes is normalized
differently from a short directory entry with the same trailing-slash pattern.
This contradicts the stated intent of the comment.

Code locations (from current main):

# Lib/tarfile.py — frombuf 
if obj.isdir():
    obj.name = obj.name.rstrip("/")

# Lib/tarfile.py — _proc_builtin 
if self.isdir():
    self.name = self.name.rstrip("/")

# Lib/tarfile.py — _proc_gnulong (GNU long-name path)  ← inconsistent
# Remove redundant slashes from directories. This is to be consistent
# with frombuf().
if next.isdir():
    next.name = next.name.removesuffix("/")

Reproducer

import io, tarfile

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tf:
    # Long name (>100 chars) forces a GNU long-name header.
    long_name = ("d" * 150) + "//"
    info = tarfile.TarInfo(name=long_name)
    info.type = tarfile.DIRTYPE
    tf.addfile(info)

    short_name = "shortdir//"
    info2 = tarfile.TarInfo(name=short_name)
    info2.type = tarfile.DIRTYPE
    tf.addfile(info2)

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tf:
    for m in tf.getmembers():
        print(repr(m.name))

Actual output:

'dddd…dd/'    # 151 chars: only one slash stripped (via _proc_gnulong)
'shortdir'    # 8 chars: all slashes stripped (via frombuf / _proc_builtin)

Expected: both entries should be normalized identically (rstrip semantics,
matching what the comment claims).

Suggested fix

Change removesuffix("/") to rstrip("/") in _proc_gnulong, matching the
comment's stated intent and the behavior of frombuf / _proc_builtin.
A regression test using a long directory name with multiple trailing slashes
should be added to Lib/test/test_tarfile.py.

I'd be happy to submit a PR.

CPython versions tested on:

3.14

Operating systems tested on:

Windows
