Breaking change in 2.28.0 when using string enums as Headers (working in v2.27.1) #6159

bonastreyair · 2022-06-09T20:00:51Z

When using string enums as key values for the headers dict parameter with requests.get() function in v2.28.0 it raises an InvalidHeader error but using previous version v2.27.1 works well without any error. It seems there was an unexpected breaking change with that release.

Expected Result

Dict keys for the headers parameter when using requests.get() are still valid if a string enum is used.

Actual Result

Instead it raises an InvalidHeader error stating that:

Header must be of type str or bytes, not enum.

Traceback (most recent call last):
  File "/home/yair/PycharmProjects/tests/requests_error.py", line 9, in <module>
    requests.get("http://URL", headers={CustomEnum.TRACE_ID: "90e85293-afd1-4b48-adf0-fa6daf02359e"})
  File "/home/yair/.local/lib/python3.8/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/home/yair/.local/lib/python3.8/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/yair/.local/lib/python3.8/site-packages/requests/sessions.py", line 573, in request
    prep = self.prepare_request(req)
  File "/home/yair/.local/lib/python3.8/site-packages/requests/sessions.py", line 484, in prepare_request
    p.prepare(
  File "/home/yair/.local/lib/python3.8/site-packages/requests/models.py", line 369, in prepare
    self.prepare_headers(headers)
  File "/home/yair/.local/lib/python3.8/site-packages/requests/models.py", line 491, in prepare_headers
    check_header_validity(header)
  File "/home/yair/.local/lib/python3.8/site-packages/requests/utils.py", line 1037, in check_header_validity
    raise InvalidHeader(
requests.exceptions.InvalidHeader: Header part (<CustomEnum.TRACE_ID: 'X-B3-TraceId'>) from {<CustomEnum.TRACE_ID: 'X-B3-TraceId'>: '90e85293-afd1-4b48-adf0-fa6daf02359e'} must be of type str or bytes, not <enum 'CustomEnum'>

Reproduction Steps

from enum import Enum
import requests


class CustomEnum(str, Enum):
    TRACE_ID = "X-B3-TraceId"


requests.get("http://URL", headers={CustomEnum.TRACE_ID: "90e85293-afd1-4b48-adf0-fa6daf02359e"})

System Information

$ python -m requests.help

{
  "chardet": {
    "version": "3.0.4"
  },
  "charset_normalizer": {
    "version": "2.0.4"
  },
  "cryptography": {
    "version": "2.8"
  },
  "idna": {
    "version": "3.2"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.8.10"
  },
  "platform": {
    "release": "5.13.0-48-generic",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "1010106f",
    "version": "19.0.0"
  },
  "requests": {
    "version": "2.28.0"
  },
  "system_ssl": {
    "version": "1010106f"
  },
  "urllib3": {
    "version": "1.26.6"
  },
  "using_charset_normalizer": false,
  "using_pyopenssl": true
}

The text was updated successfully, but these errors were encountered:

bonastreyair · 2022-06-09T20:07:06Z

I know it is easily solvable by making use of the CustomEnum.TRACE_ID.value to use the stored string instead of using the enum directly with CustomEnum.TRACE_ID... but it worked before with v2.27.1, so it feels like an unexpected breaking change... 👀

nateprewitt · 2022-06-09T20:09:50Z

Thanks for the report, @bonastreyair! So I think this is on the edge of support. Headers have always been defined to be a str or bytes and we've been programmatically enforcing that for header values for 6 years. While this worked in 2.27.1, it wasn't an intentional function of the headers param. I'd like to see how widespread this issue is before we make a decision on cases like this.

In the meantime, Requests has no special logic for Enums, so casting it to a string prior to calling Requests will not affect behavior. That would be the recommendation while we gauge impact. Otherwise, 2.27.1 will continue to work as it did originally.

paulclarkaranz · 2022-06-10T04:47:01Z

Here's another instance fwiw: taverntesting/tavern#788
In this case it's just a subclass of str being passed in as header values: https://github.com/taverntesting/tavern/blob/ede36ca062b546cfaf3e6b0940fbac30733586a4/tavern/util/formatted_str.py

FelixWohlfrom · 2022-06-10T14:27:41Z

We also have similar issues in company internal tools.
Isn't the explicit check for str and byte contradicting the duck typing approach of python? At least previously the code was working fine, even though we passed e.g. xml elements that were then implicitly converted to strings.

sigmavirus24 · 2022-06-10T16:27:27Z

We also have similar issues in company internal tools.
Isn't the explicit check for str and byte contradicting the duck typing approach of python? At least previously the code was working fine, even though we passed e.g. xml elements that were then implicitly converted to strings.

No. It's not contradicting that because any time we magically convert things for users, it tends to be wrong about 50% of the time. Being explicit in our documentation hasn't been effective and these validity checks are designed to help users not send junk that will almost certainly not do what they want. Much of what has been explicitly described above either for narrow cases but not for everyone. Enums for example are not required to have a string values. How does one consistently string-ify it then? Support for people subclassing built-in types notorious for breaking those subclasses is also not a logical choice because again it will surprise people who try it by breaking in a weird way it doing the opposite of what they expect

jaypatrickhoward · 2022-06-13T14:34:40Z

We encountered this same issue when passing header values whose type is a subclass of str. Might it make sense to allow subclasses to pass validation? In this case the requests library would not be doing any conversion for the user.

nateprewitt · 2022-06-13T16:31:40Z

I wrote up a patch for this on Friday. I've been waiting to see how widely this approach is used, because we've only had 4 reports with ~75% of downloads (50 million/week) now using 2.28.0. I think this falls into a very niche usage, but I agree bytes/str subclasses did work for both name and value prior to this release.

We're going to monitor for any other issues in 2.28.0 to determine if we need a patch release. I'd be willing to merge the fix above into that with a heavy caveat that this isn't strictly supported. Any breakages due to subclasses deviating in behavior from the base bytes/str type likely won't be fixed.

Long term, I think we're going to continue taking a much firmer stance on inputs because people do crazy things when the values aren't bytes/str and expect them to just work. This becomes much more problematic when the requests are being sent to systems corrupted due to incorrect "stringification". We can't reasonably handle every case, so we need to draw a line somewhere on what the APIs are intended to support to provide stability.

jaypatrickhoward · 2022-06-13T17:45:33Z

In our case we're interacting with Salesforce, whose API consists of XML messages. We parse a session ID out of the response from the authentication endpoint using lxml. This value is of type lxml.etree._ElementUnicodeResult which is a subclass of str. We pass it to the constructor of a class from salesforce-bulk. This class uses requests under the covers to communicate with Salesforce. It naively assumes that the session ID we supplied can be passed to requests as a header value without issue.

Our fix was to cast the session ID to an actual str before giving it to salesforce-bulk.

kevinr-electric · 2022-06-21T15:21:41Z

Just want to add support for this issue by saying we are also having this issue. Our use is basically:

class ContentType(str, Enum):
    JSON = "application/json"
    XML = "text/xml"

...
def execute_request(...):
...
    headers = {"Content-Type": ContentType.JSON}
    response = requests.request(
                url=url,
                method=method,
                params=params,
                headers=headers,
                data=data
            )

nickswebsite · 2022-07-19T16:10:27Z

Using enums for header names is a common (and generally regarded as a good) practice. +1.

In the ContentType enum given above, you have what looks like a string, acts like a string, and quacks like a string. It should be treated as a string. requests shouldn't be responsible for being fed misbehaving subclasses of builtin types IMO.

Anyway, just trying to add one to the 'user engagement' since that was mentioned as a consideration in the 2.28.1 release.

nateprewitt · 2022-07-19T16:21:33Z

Using enums for header names is a common (and generally regarded as a good) practice. +1.

I don't believe anything in this change precludes the use of enums in your code. Requests is being explicit of what we accept. enum.name becomes enum.name.value and everything works fine.

briantist · 2022-07-30T02:04:27Z

Also seeing this in an Ansible collection, since the test containers updated their requirements to allow a newer version of requests.

token must be of type str or bytes, not <class 'ansible.utils.unsafe_proxy.AnsibleUnsafeText'> ansible-collections/community.hashi_vault#289

CarliJoy · 2022-09-06T16:22:44Z

As explained in #6232 -> we use a string subclass to prevent secrets leaking in certain circumstances, i.e. in Stack Traces.

Just converting them to str would contradict this intention. So in opposite to using Enums, there is not really a workaround for us.

sigmavirus24 · 2022-09-06T16:57:34Z

Just converting them to str would contradict this intention

To be clear, to convert it to a str before passing to requests would be a problem, but requests converting it to a string isn't a problem in your eyes? That makes absolutely zero sense to me. Requests isn't doing anything special to handle your secret.

Also, you seem to be relying on this class to avoid leaking secrets in stack traces and assuming you/your colleagues will always remember to use it rather than looking for secrets in stack traces before allowing them to be printed to logs by using a formatter to ensure no secrets are printed ever (just to be safe). Defense in depth is always better than expecting a library to do something for you at one or two layers below you.

CarliJoy · 2022-09-07T10:20:09Z

Actually the Secret class is of course one of multiple layers.
We ensure on multiple places (i.e. using our auth libs), that secrets given are wrapped in this special class.
We simply add this test to all places in our code that accepts secrets.
This helps us also to track secrets within our code base.
We not solely rely on it.

And of course we use loggers/formatters that prevent leaking secrets but enforcing that is much harder then wrapping secrets.
Better do both.

Also other libraries follow a similar approach. For instance httpx for URLs:
https://github.com/encode/httpx/blob/master/httpx/_urls.py#L512

And actually requests does not (as far I found) convert the special class back to a normal string.

    def test_header_validation(self, httpbin):
        """Ensure prepare_headers regex isn't flagging valid header contents."""
        class StringSubClass(str):
            def __repr__(self):
                return "Secret"

        valid_headers = {
            "foo": "bar baz qux",
            "bar": b"fbbq",
            "baz": "",
            "qux": "1",
            "sub": StringSubClass("VerySecret")
        }
        r = requests.get(httpbin("get"), headers=valid_headers)
        for key in valid_headers.keys():
            valid_headers[key] == r.request.headers[key]
            # Ensure type is not modified
            assert type(valid_headers[key]) == type(r.request.headers[key])

This modified test works as expected.

Even the function to_native_string called in prepare_header (and even used for keys only) in _intnal_utils.py does not convert a value back to a naive string. Of course some send logic could/will do convert the value i.e. to a byte. But to leak the secret a exception needs to be raised in function that is called with this converted value without being handled by the library...

So requests does not have to handle the secret any special.
It would only leak the secret in a stack trace, if the header is converted to a normal str class before passing it to a function that is raising the exception which is printed.
Yes of course that is not 100% protection against a possible leakage but good enough for us as very unlikely.

Consider the opposite: We tell our devs to use the Secret everywhere, but then also need to tell them to wrap calls to requests to make them visible.
It like: Hey on all terminals enter your password in the input showing **** for your input but in this special terminal just enter it in plain text.
Yes maybe also the terminal showing ***** is saving the password in the background as plain text but that is a different story. The main point here how to teach people on how to handle a secret.

To make the case more general again:

Subclasses of strings are common for a number of reason, i.e. a json/xml/yaml libary that uses it to track formatting of a string, a a string enum, mark stuff a secret etc...
Enforcing stuff to be naive string class will break this things. If it is a subclass of a string, it should be good enough.

I do not see a case in which both the isinstance check is okay and the regex matches a valid input and but the string representation is still wrong.
In this case also the output of print(value) will show the wrong value.

I know that users of libs do crazy stuff but could you elaborate what kind of wrong "stringification" could happen if both checks are valid?

CarliJoy · 2022-10-18T09:20:42Z

@nateprewitt any news here? I really would love to see your patch applied.

teruokun · 2022-11-16T20:41:35Z

Quick ping here, looking over the proposed patch I don't see a good reason not to apply it, especially given that this kind of behavior actually violates the Python principle of 'duck typing'. It's unclear why Requests needs to be deeply coupled beyond isinstance. What's the use-case where limiting to the precise coordinate of str or bytes is truly necessary? And again, we aren't talking about string conversion, but simply if it says it's an instance of str or bytes, letting users be responsible for what happens if it isn't true seems reasonable, for the same reason single-underscore private methods aren't truly private in python, but a convention.

rajeshkanugu · 2023-01-24T12:27:21Z

you can resolve this issue by converting to a string before parsing it as a key to the headers directory
`
from enum import Enum
class CustomEnum(Enum):
TRACE_ID = "X-B3-TraceId"

headers = {CustomEnum.TRACE_ID: "90e85293-afd1-4b48-adf0-fa6daf02359e"}
requests.get("http://url/", headers=headers)

steven-aset mentioned this issue Jun 10, 2022

RestRequestException when using custom headers taverntesting/tavern#788

Closed

nateprewitt mentioned this issue Jun 29, 2022

v2.28.1 #6182

Merged

briantist mentioned this issue Jul 30, 2022

token must be of type str or bytes, not <class 'ansible.utils.unsafe_proxy.AnsibleUnsafeText'> ansible-collections/community.hashi_vault#289

Closed

briantist mentioned this issue Aug 5, 2022

Fix "Unsafe" token values that originate from lookups ansible-collections/community.hashi_vault#291

Merged

nateprewitt mentioned this issue Sep 6, 2022

Headers that are subclass of str/byte are not accepted #6232

Closed

CarliJoy mentioned this issue Oct 18, 2022

allow string/byte subtypes for header #6230

Closed

sigmavirus24 mentioned this issue Nov 15, 2022

Headers should allow subclasses of str or bytes #6285

Closed

nateprewitt added this to the 2.29.0 milestone Nov 16, 2022

sethmlarson mentioned this issue Feb 17, 2023

Allow str/bytes subclasses to be used as header parts #6356

Merged

sethmlarson closed this as completed in #6356 Mar 3, 2023

This was referenced Dec 24, 2023

ansible-core 2.16.1/2.16.2 break action plugin parameters due to changed casting ansible/ansible#82414

Closed

Fix for core changes: stop stripping Unsafe, adapt adapter to config manager changes ansible-collections/community.hashi_vault#416

Merged

github-actions bot locked as resolved and limited conversation to collaborators Mar 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Breaking change in 2.28.0 when using string enums as Headers (working in v2.27.1) #6159

Breaking change in 2.28.0 when using string enums as Headers (working in v2.27.1) #6159

bonastreyair commented Jun 9, 2022

bonastreyair commented Jun 9, 2022

nateprewitt commented Jun 9, 2022

paulclarkaranz commented Jun 10, 2022

FelixWohlfrom commented Jun 10, 2022

sigmavirus24 commented Jun 10, 2022

jaypatrickhoward commented Jun 13, 2022

nateprewitt commented Jun 13, 2022

jaypatrickhoward commented Jun 13, 2022 •

edited

kevinr-electric commented Jun 21, 2022

nickswebsite commented Jul 19, 2022

nateprewitt commented Jul 19, 2022

briantist commented Jul 30, 2022

CarliJoy commented Sep 6, 2022

sigmavirus24 commented Sep 6, 2022

CarliJoy commented Sep 7, 2022

CarliJoy commented Oct 18, 2022

teruokun commented Nov 16, 2022

rajeshkanugu commented Jan 24, 2023

Breaking change in 2.28.0 when using string enums as Headers (working in v2.27.1) #6159

Breaking change in 2.28.0 when using string enums as Headers (working in v2.27.1) #6159

Comments

bonastreyair commented Jun 9, 2022

Expected Result

Actual Result

Reproduction Steps

System Information

bonastreyair commented Jun 9, 2022

nateprewitt commented Jun 9, 2022

paulclarkaranz commented Jun 10, 2022

FelixWohlfrom commented Jun 10, 2022

sigmavirus24 commented Jun 10, 2022

jaypatrickhoward commented Jun 13, 2022

nateprewitt commented Jun 13, 2022

jaypatrickhoward commented Jun 13, 2022 • edited

kevinr-electric commented Jun 21, 2022

nickswebsite commented Jul 19, 2022

nateprewitt commented Jul 19, 2022

briantist commented Jul 30, 2022

CarliJoy commented Sep 6, 2022

sigmavirus24 commented Sep 6, 2022

CarliJoy commented Sep 7, 2022

CarliJoy commented Oct 18, 2022

teruokun commented Nov 16, 2022

rajeshkanugu commented Jan 24, 2023

jaypatrickhoward commented Jun 13, 2022 •

edited