Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Programmatic management of BytesWarning doesn't work for native triggers. #87692

Open
masklinn mannequin opened this issue Mar 17, 2021 · 14 comments
Open

Programmatic management of BytesWarning doesn't work for native triggers. #87692

masklinn mannequin opened this issue Mar 17, 2021 · 14 comments
Labels
3.8 only security fixes 3.9 only security fixes 3.10 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@masklinn
Copy link
Mannequin

masklinn mannequin commented Mar 17, 2021

BPO 43526
Nosy @gpshead, @vstinner, @methane, @masklinn, @serhiy-storchaka

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-03-17.09:03:05.688>
labels = ['3.8', 'type-bug', 'library', '3.9', '3.10']
title = "Programmatic management of BytesWarning doesn't work for native triggers."
updated_at = <Date 2021-11-16.00:57:58.798>
user = 'https://github.com/masklinn'

bugs.python.org fields:

activity = <Date 2021-11-16.00:57:58.798>
actor = 'vstinner'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2021-03-17.09:03:05.688>
creator = 'xmorel'
dependencies = []
files = []
hgrepos = []
issue_num = 43526
keywords = []
message_count = 14.0
messages = ['388912', '388913', '388915', '389154', '399064', '399076', '399192', '399193', '399194', '399203', '406360', '406361', '406382', '406383']
nosy_count = 5.0
nosy_names = ['gregory.p.smith', 'vstinner', 'methane', 'xmorel', 'serhiy.storchaka']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue43526'
versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

@masklinn
Copy link
Mannequin Author

masklinn mannequin commented Mar 17, 2021

When setting BytesWarning programmatically (via the warnings API), though the warnings.filters value matches what's obtained via python -b and an explicit warnings.warn trigger will trigger, "native" triggers of the warning fail to trigger properly:

    import warnings
    warnings.simplefilter('default', category=BytesWarning)
    str(b'')
    warnings.warn('test', category=BytesWarning)

If run using python, this will print:

test.py:4: BytesWarning: test
  warnings.warn('test', category=BytesWarning)

There is no warning for the string-ification of the bytes instance.

If run using python -b, the behaviour is as one would expect:

test.py:3: BytesWarning: str() on a bytes instance
  str(b'')
test.py:4: BytesWarning: test
  warnings.warn('test', category=BytesWarning)

Inspecting warnings.filters shows now difference in their contents, in both cases it is:

[('default', None, <class 'BytesWarning'>, None, 0), ('default', None, <class 'DeprecationWarning'>, '__main__', 0), ('ignore', None, <class 'DeprecationWarning'>, None, 0), ('ignore', None, <class 'PendingDeprecationWarning'>, None, 0), ('ignore', None, <class 'ImportWarning'>, None, 0), ('ignore', None, <class 'ResourceWarning'>, None, 0)]

(in Python 3.9).

The warning module's own suggestion:

    import sys
    if not sys.warnoptions:
        import warnings
        warnings.simplefilter("default") # Change the filter in this process

also fails to enable BytesWarning.

If this is intended behaviour, which seems to be the case according to ncoghlan's comment https://bugs.python.org/issue32230#msg307721, it should be clearly documented, as it's rather frustrating.

@masklinn masklinn mannequin added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Mar 17, 2021
@masklinn
Copy link
Mannequin Author

masklinn mannequin commented Mar 17, 2021

Addendum: is there a way to force -b from within the running Python program?

@serhiy-storchaka
Copy link
Member

Yes, it is intended behaviour. BytesWarning is only emitted when you set a special internal flag (by the -b option). Warning filters control what happen with that warning later: be it ignored, printed, or converted to exception.

In normal circumstances you should never deal with BytesWarning. The -b option is only used for testing your program for some possible bugs caused by migration from Python 2. If your program always worked only with Python 3, the -b option has no use for you.

It may be more complicated if you write a library which support bytes and strings. Since you do not know in what program it will be used, it may be nice to to test it with mixed bytes and str data and ensure that it works with bytes warnings enabled.

@terryjreedy terryjreedy added 3.10 only security fixes and removed 3.7 (EOL) end of life labels Mar 20, 2021
@masklinn
Copy link
Mannequin Author

masklinn mannequin commented Mar 20, 2021

In normal circumstances you should never deal with BytesWarning. The -b option is only used for testing your program for some possible bugs caused by migration from Python 2. If your program always worked only with Python 3, the -b option has no use for you.

That's technically true for perfect programs, but practically if the program is not tested enough it's possible for it to contain unexpected instances of stringifying bytes which pass by mostly unnoticed, just with garbage bits of output which may or may not get reported.

Noticing such cases in an existing codebase is exactly what prompted me to try and enable BytesWarning (and then try and filter them to just conversion warnings, as the category is quite noisy with somewhat uninteresting warnings), and then spend a while trying to understand why programmatically enabling BytesWarning didn't do anything.

As noted in the last paragraph it's fine if -b is required to enable the feature, I'd just like to know if it could be documented clearly as it's somewhat frustrating to pore over the documentation of warnings, BytesWarning, and -b, diff the state of the filters configuration, etc… and end up finding out from a bpo comment whether the behaviour you're observing is intentional or not.

@masklinn
Copy link
Mannequin Author

masklinn mannequin commented Aug 6, 2021

Serhiy an other question (because I've encountered it *again*), do you think it'd be possible to split out the specific warning of stringifying (but *not* explicitely repr-ing) a bytes instance from everything else?

There are use-cases for it, but I think it's rather error prone when a bytes unexpectedly finds itself mixed into either string or non-string data which stringify "normally" and generates what is often garbage output.

@serhiy-storchaka
Copy link
Member

I am not against documenting the behavior of -b and BytesWarning clearly. I don't think that anyone would be against. Just somebody have to provide a PR.

Explicitly repr-ing a bytes instance does not produce a warning, and never produced, so I don't understand what do you mean.

@masklinn
Copy link
Mannequin Author

masklinn mannequin commented Aug 7, 2021

I am not against documenting the behavior of -b and BytesWarning clearly. I don't think that anyone would be against. Just somebody have to provide a PR.

Right but what about the ability to enable warning on stringification without enabling the warning on comparison? There are packages / situations where comparing strings and bytes makes a lot of sense (e.g. use string and byte versions of data as keys in the same dict), but I think stringifying bytes rarely if ever does.

Explicitly repr-ing a bytes instance does not produce a warning, and never produced, so I don't understand what do you mean.

Well that's absolutely ideal as I did not check if it did, just wanted to be clear that I wasn't looking for a warning in that case.

@masklinn
Copy link
Mannequin Author

masklinn mannequin commented Aug 7, 2021

And though I did not check, I expect the -b flag exists mostly because of the performance impact of the warning any time bytes are checked for equality, but surely that impact would be limited and probably not very relevant for the stringification of bytes and that could (eventually) be moved out of the -b flag and to the regular warnings system?

I recall Inada-san tends to be quite involved in BytesWarning, is adding people to the nosy list something that's done or is it in bad taste?

@serhiy-storchaka
Copy link
Member

And though I did not check, I expect the -b flag exists mostly because of the performance impact of the warning any time bytes are checked for equality, but surely that impact would be limited and probably not very relevant for the stringification of bytes and that could (eventually) be moved out of the -b flag and to the regular warnings system?

If working Python 3 program suddenly became emitting BytesWarning it will confuse users.

Finally we will get rid of the -b option and BytesWarning. str() for bytes object will be equal to repr(), without warnings and errors. If you want to change it, use -b. Maybe introducing PYTHONBYTESWARNING will help. But I do not think we should change the behavior for these who do not ask for it specially if it will differ from the future behavior.

Right but what about the ability to enable warning on stringification without enabling the warning on comparison?

I think it would work. But it is significant amount of work (add corresponding C API, parsing and copying code, documentation, tests), and for small benefit. Would not be better to filter out warnings by message?

I think that adding PYTHONBYTESWARNING can help you.

@masklinn
Copy link
Mannequin Author

masklinn mannequin commented Aug 7, 2021

If working Python 3 program suddenly became emitting BytesWarning it will confuse users.

Oh yeah no I meant making it a normal warning, without needing the -b flag, not enabling it by default.

Because controlling / configuring warnings can be done programmatically and dynamically from inside the program, but in my understanding -b is either set or unset when calling it, if you're not setting it you're SOL. Or so I understand, I didn't find any switch accessible from inside the Python program.

I think it would work. But it is significant amount of work (add corresponding C API, parsing and copying code, documentation, tests)

Wouldn't it just be removing the check on -b in the tp_str slot of bytes and bytearray?

and for small benefit. Would not be better to filter out warnings by message?

Well yes and no, my issue remains the one from the beginning: it's quite easy to unexpectedly stringify bytes, which usually isn't what's desired. Ensuring that doesn't happen requires making sure every developer and environment (non-production at least) does run python with -b to enable the warning, which is not as easy as the software being able to set that internally (which is doable for various deprecation warnings and friends).

@gpshead
Copy link
Member

gpshead commented Nov 15, 2021

fwiw, at work we modified our interpreter to treat PYTHONBYTESWARNING=1 or 2 as if -b or -bb were passed. All of our tests run by default with a PYTHONBYTESWARNING=2 in their environment.

@gpshead
Copy link
Member

gpshead commented Nov 15, 2021

clarification after poking around: s/all/a lot/ - YouTube used it to help their Python 3 migration. Regardless, the env. var has been useful.

@vstinner
Copy link
Member

While you should not do that at home ;-), it's now *technically* possible to change the BytesWarning flag at runtime using a a *private* *internal* API:

$ ./python
Python 3.11.0a2+ (heads/remove_asyncore:010015d2eb, Nov 11 2021,
>>> "unicode" == b"bytes"
False

# Change the internal BytesWarning flag
>>> import _testinternalcapi
>>> config=_testinternalcapi.get_config()
>>> config['bytes_warning']=1
>>> _testinternalcapi.set_config(config)

>>> "unicode" == b"bytes"
<stdin>:1: BytesWarning: Comparison between bytes and string
False

@vstinner
Copy link
Member

If this is intended behaviour, which seems to be the case according to ncoghlan's comment https://bugs.python.org/issue32230#msg307721, it should be clearly documented, as it's rather frustrating.

The -b command line documentation is not explicitly enough?

"Issue a warning when comparing bytes or bytearray with str or bytes with int. Issue an error when the option is given twice (-bb)."

https://docs.python.org/dev/using/cmdline.html#cmdoption-b

Modifying warning filters don't change how Python emits or not warnings. It only changes how emitted warnings are handlded: ignored, displayed, treated as errors.

The relationship between -b and BytesWarning is different than other warnings.

Should we add a note near the BytesWarning documentation?
https://docs.python.org/dev/library/exceptions.html#BytesWarning

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.8 only security fixes 3.9 only security fixes 3.10 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants