Avoid ambiguous backslashes in regular expression strings #416

Merged
bemoody merged 1 commit into main from raw-backslash on Sep 1, 2022

@bemoody bemoody commented Aug 18, 2022

@tompollard found that ambiguous backslashes give DeprecationWarnings on his system.

(Not that I think there is any danger of this behavior changing in Python, but I think it's fair to say this syntax is confusing and should be avoided.)

For example, "\s\n" is a three-character string (backslash, "s", newline), equal to "\\s\n". If "\\s\n" is what you mean, it's better to write that and be unambiguous.

r"\s\n" is a four-character string ("\\s\\n"), but re.compile interprets it as equivalent to "\\s\n". So when writing regular expression patterns, it's best to use raw (r) strings where possible.
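
As a quick illustration of the point above, the following sketch compares the two spellings. The variable names and the sample text are made up for this example; the ordinary string is written with a doubled backslash so the snippet itself triggers no warning.

```python
import re

# Ordinary string with an explicit escaped backslash: backslash, "s", newline.
ordinary = "\\s\n"
# Raw string: backslash, "s", backslash, "n".
raw = r"\s\n"

print(len(ordinary))      # 3
print(len(raw))           # 4
print(raw == "\\s\\n")    # True

# re treats both patterns as "one whitespace character, then a newline".
sample = "a \nb"
print(re.findall(ordinary, sample))  # [' \n']
print(re.findall(raw, sample))       # [' \n']
```

The strings differ, but the compiled patterns match the same text, which is why the raw form is the less surprising choice for regular expressions.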

Sequences such as "\s" should be avoided in ordinary Python
strings due to the potential for confusion, and will result in
warnings in some Python versions.

For regular expressions, it's better to use raw strings (r"\s")
instead, which make parsing unambiguous without needing to double
backslashes.

(Since re.compile understands all the ordinary Python backslash
sequences, it should be safe to change non-raw regexp strings to
raw strings as long as they don't contain any escaped backslashes
or quotes.)
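
A minimal sketch of the caveat in the parenthetical above, using made-up patterns rather than anything from the wfdb code base. It shows a conversion that is safe, one that is not because the string already contains an escaped backslash, and one extra case that seems worth checking by hand: "\b" is a backspace in an ordinary string but a word boundary in a regular expression.

```python
import re

# Safe: no escaped backslashes or quotes, so adding the r prefix keeps the meaning.
tab_plain = re.fullmatch("a\tb", "a\tb") is not None   # literal tab in the pattern
tab_raw = re.fullmatch(r"a\tb", "a\tb") is not None    # re interprets \t itself

# Not safe: the non-raw string already spells an escaped backslash.
non_raw = "\\d+"     # pattern text is \d+   (one or more digits)
naive_raw = r"\\d+"  # pattern text is \\d+  (a literal backslash, then "d"s)
fixed_raw = r"\d+"   # correct conversion: also drop the doubled backslash

digits_non_raw = re.fullmatch(non_raw, "123") is not None
digits_naive = re.fullmatch(naive_raw, "123") is not None
digits_fixed = re.fullmatch(fixed_raw, "123") is not None

# Worth checking by hand: "\b" means backspace in a string, word boundary in re.
backspace_hit = re.search("\bcat", "a cat") is not None
boundary_hit = re.search(r"\bcat", "a cat") is not None

print(tab_plain, tab_raw)                          # True True
print(digits_non_raw, digits_naive, digits_fixed)  # True False True
print(backspace_hit, boundary_hit)                 # False True
```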
@tompollard

Thanks Benjamin, looks good to me.

This fixes all of the "invalid escape sequence" warnings that I was seeing:

[env] raw-backslash 2s ± pytest
=================================================== test session starts ====================================================
platform darwin -- Python 3.9.13, pytest-7.1.2, pluggy-1.0.0
rootdir: /Users/tompollard/sand/soundfile/wfdb-python
plugins: xdist-2.5.0, forked-1.4.0
collected 63 items                                                                                                         

tests/test_annotation.py .....                                                                                       [  7%]
tests/test_datasource.py ........                                                                                    [ 20%]
tests/test_multi_record.py .....                                                                                     [ 28%]
tests/test_plot.py .......                                                                                           [ 39%]
tests/test_record.py ..................................                                                              [ 93%]
tests/test_url.py ..                                                                                                 [ 96%]
tests/io/test_convert.py ..                                                                                          [100%]

===================================================== warnings summary =====================================================
tests/test_annotation.py::TestAnnotation::test_1
tests/test_annotation.py::TestAnnotation::test_2
tests/test_annotation.py::TestAnnotation::test_3
  /Users/tompollard/sand/soundfile/wfdb-python/wfdb/io/download.py:176: DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead
    ann_data = np.fromstring(content, dtype=np.dtype("<u1"))

tests/test_annotation.py::TestAnnotation::test_1
tests/test_annotation.py::TestAnnotation::test_2
tests/test_annotation.py::TestAnnotation::test_3
tests/test_annotation.py::TestAnnotation::test_5
  /Users/tompollard/sand/soundfile/wfdb-python/wfdb/io/annotation.py:944: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
    if fs_bytes == [] and cl_bytes == []:

tests/test_record.py: 15 warnings
  /Users/tompollard/sand/soundfile/wfdb-python/wfdb/io/download.py:146: DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead
    sig_data = np.fromstring(content, dtype=dtype)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================================= 63 passed, 22 warnings in 46.47s =============================================
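
For anyone wanting to reproduce the "invalid escape sequence" check outside pytest, here is a rough sketch. The helper name has_invalid_escape and the sample sources are hypothetical and not part of this pull request; the only assumption about Python itself is that compiling such a literal emits a warning whose message contains "invalid escape sequence" (a DeprecationWarning or SyntaxWarning depending on the version).

```python
import warnings


def has_invalid_escape(source: str) -> bool:
    """Return True if compiling `source` reports an invalid escape sequence."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including ones hidden by the default filters.
        warnings.simplefilter("always")
        try:
            compile(source, "<snippet>", "exec")
        except SyntaxError:
            # Assumption: source that does not compile at all is treated as flagged.
            return True
    return any("invalid escape sequence" in str(w.message) for w in caught)


# The source text below contains the literal "\s+" (single backslash).
print(has_invalid_escape('pattern = "\\s+"'))
# The same pattern written as a raw string literal.
print(has_invalid_escape('pattern = r"\\s+"'))
```

The first call should report True and the second False, matching the change made in this pull request.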

@bemoody bemoody merged commit b1c3284 into main Sep 1, 2022
@bemoody bemoody deleted the raw-backslash branch September 1, 2022 19:34