Update example of str.split, bytes.split (python#121287)
In `{str,bytes}.strip(chars)`, multiple characters are not treated as a
prefix/suffix, but as a set of individual characters. This may leave users
unsure whether `split` behaves the same way.
Users may incorrectly expect that
`'Good morning, John.'.split(', .') == ['Good', 'morning', 'John']`.

Adding a bit of clarification in the doc.
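
To make the difference concrete, here is a minimal sketch contrasting the three behaviors discussed above. The variable names are illustrative only, and the regular expressions are just one possible way to split on several delimiters; they are not taken from the patch itself.

```python
import re

text = 'Good morning, John.'

# strip() treats its argument as a set of characters and removes any of
# them from both ends of the string.
stripped = ' .Good morning, John. '.strip(', .')

# split() treats its argument as one literal separator. The substring
# ', .' never occurs in text, so nothing is split.
unsplit = text.split(', .')

# To split on any of several delimiters, re.split() with a character
# class is one option. The trailing '.' yields an empty last field,
# which is filtered out here.
words = [w for w in re.split(r'[, .]+', text) if w]

# A multi-character separator only matches as a whole, so the lone '<'
# stays inside the last field.
fields = '1<>2<>3<4'.split('<>')

print(stripped)  # Good morning, John
print(unsplit)   # ['Good morning, John.']
print(words)     # ['Good', 'morning', 'John']
print(fields)    # ['1', '2', '3<4']
```

The same distinction applies to `bytes.split`, with bytes literals and a bytes pattern in place of the strings.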

Co-authored-by: Yuxin Wu <ppwwyyxx@users.noreply.github.com>
ppwwyyxx and ppwwyyxx committed Jul 5, 2024
1 parent 8ecb896 commit 892e3a1
Showing 1 changed file with 10 additions and 6 deletions.
16 changes: 10 additions & 6 deletions Doc/library/stdtypes.rst
@@ -2095,8 +2095,9 @@ expression support in the :mod:`re` module).
If *sep* is given, consecutive delimiters are not grouped together and are
deemed to delimit empty strings (for example, ``'1,,2'.split(',')`` returns
``['1', '', '2']``). The *sep* argument may consist of multiple characters
- (for example, ``'1<>2<>3'.split('<>')`` returns ``['1', '2', '3']``).
- Splitting an empty string with a specified separator returns ``['']``.
+ as a single delimiter (to split with multiple delimiters, use
+ :func:`re.split`). Splitting an empty string with a specified separator
+ returns ``['']``.

For example::

@@ -2106,6 +2107,8 @@ expression support in the :mod:`re` module).
['1', '2,3']
>>> '1,2,,3,'.split(',')
['1', '2', '', '3', '']
+ >>> '1<>2<>3<4'.split('<>')
+ ['1', '2', '3<4']

If *sep* is not specified or is ``None``, a different splitting algorithm is
applied: runs of consecutive whitespace are regarded as a single separator,
@@ -3149,10 +3152,9 @@ produce new objects.
If *sep* is given, consecutive delimiters are not grouped together and are
deemed to delimit empty subsequences (for example, ``b'1,,2'.split(b',')``
returns ``[b'1', b'', b'2']``). The *sep* argument may consist of a
- multibyte sequence (for example, ``b'1<>2<>3'.split(b'<>')`` returns
- ``[b'1', b'2', b'3']``). Splitting an empty sequence with a specified
- separator returns ``[b'']`` or ``[bytearray(b'')]`` depending on the type
- of object being split. The *sep* argument may be any
+ multibyte sequence as a single delimiter. Splitting an empty sequence with
+ a specified separator returns ``[b'']`` or ``[bytearray(b'')]`` depending
+ on the type of object being split. The *sep* argument may be any
:term:`bytes-like object`.

For example::
@@ -3163,6 +3165,8 @@ produce new objects.
[b'1', b'2,3']
>>> b'1,2,,3,'.split(b',')
[b'1', b'2', b'', b'3', b'']
+ >>> b'1<>2<>3<4'.split(b'<>')
+ [b'1', b'2', b'3<4']

If *sep* is not specified or is ``None``, a different splitting algorithm
is applied: runs of consecutive ASCII whitespace are regarded as a single