[BUG] [python] generated code fails to raise proper exception for non-UTF-8 responses #19862

@igorgatis

Bug Report Checklist

Description

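The code generated by python-pydantic-v1 fails to raise the proper exception when an error response is not UTF-8 encoded. The client decodes the error body as UTF-8 unconditionally, so the caller gets a UnicodeDecodeError instead of the original ApiException.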

openapi-generator version

Spotted in v7.8.0, but the problem still persists in master as of the opening of this issue.

Generation Details

Just use the python-pydantic-v1 generator. Here is the relevant excerpt of the generated code:

        try:
            # perform request and return response
            response_data = self.request(
            ...
        except ApiException as e:
            if e.body:
                e.body = e.body.decode('utf-8')  # <-- HERE
            raise e

Steps to reproduce

This is tricky to reproduce: the server must reply with an error response in an encoding other than UTF-8 (e.g. charset=ISO-8859-1). Here is the error output:

  File ".../.venv/lib/python3.11/site-packages/my_api/api_client.py", line 219, in __call_api
    e.body = e.body.decode('utf-8')
             ^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 78: invalid continuation byte
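
The failure can be approximated without a server. The following is a minimal sketch, not the generated client itself: `ApiException` here is a hypothetical stand-in class, and `call_api` only mimics the except block quoted above. The body is a made-up ISO-8859-1 payload containing byte 0xe7.

```python
class ApiException(Exception):
    """Hypothetical stand-in for the generated client's ApiException."""

    def __init__(self, status, body):
        super().__init__(f"({status})")
        self.status = status
        self.body = body


def call_api():
    """Mimics the except block from the generated api_client.py."""
    try:
        # Pretend the server replied 400 with an ISO-8859-1 encoded body.
        # 0xe7 and 0xe3 are single-byte Latin-1 characters, not valid UTF-8.
        raise ApiException(400, b"Opera\xe7\xe3o inv\xe1lida")
    except ApiException as e:
        if e.body:
            e.body = e.body.decode("utf-8")  # raises UnicodeDecodeError
        raise e


def reproduce():
    """Returns the exception the caller actually sees."""
    try:
        call_api()
    except Exception as exc:  # broad on purpose, to inspect the type
        return exc
    return None


seen = reproduce()
print(type(seen).__name__)              # UnicodeDecodeError
print(type(seen.__context__).__name__)  # ApiException
```

The ApiException is only reachable through `__context__`, so code that catches ApiException to inspect the status or body never runs.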

Related issues/PRs

(could not find any)

Suggest a fix
--- a/modules/openapi-generator/src/main/resources/python-pydantic-v1/api_client.mustache
+++ b/modules/openapi-generator/src/main/resources/python-pydantic-v1/api_client.mustache
@@ -220,6 +220,12 @@ class ApiClient:
                                                      collection_formats)
             url += "?" + url_query

+        def extract_charset(content_type):
+            match = None
+            if content_type is not None:
+                match = re.search(r"charset=([a-zA-Z\-\d]+)[\s;]?", content_type)
+            return match.group(1) if match else "utf-8"
+
         try:
             # perform request and return response
             response_data = {{#asyncio}}await {{/asyncio}}{{#tornado}}yield {{/tornado}}self.request(
@@ -231,7 +237,12 @@ class ApiClient:
                 _request_timeout=_request_timeout)
         except ApiException as e:
             if e.body:
-                e.body = e.body.decode('utf-8')
+                try:
+                    charset = extract_charset((e.headers or {}).get('content-type'))
+                    e.body = e.body.decode(charset)
+                except (UnicodeDecodeError, LookupError):
+                    # Keep original body if it cannot be decoded or charset is not recognized.
+                    pass
             raise e

         self.last_response = response_data
@@ -247,12 +258,9 @@ class ApiClient:
           if response_type == "bytearray":
               response_data.data = response_data.data
           else:
-              match = None
               content_type = response_data.getheader('content-type')
-              if content_type is not None:
-                  match = re.search(r"charset=([a-zA-Z\-\d]+)[\s;]?", content_type)
-              encoding = match.group(1) if match else "utf-8"
-              response_data.data = response_data.data.decode(encoding)
+              charset = extract_charset(content_type)
+              response_data.data = response_data.data.decode(charset)

           # deserialize response data
           if response_type == "bytearray":
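
For experimenting outside the template, the idea behind the patch can be sketched as standalone Python. `extract_charset` uses the same regular expression as the diff; `decode_body` is a hypothetical helper name introduced only for this illustration. It catches LookupError alongside UnicodeDecodeError, since `bytes.decode` raises LookupError when Python has no codec for the given name.

```python
import re


def extract_charset(content_type):
    """Return the charset named in a Content-Type header, defaulting to utf-8."""
    match = None
    if content_type is not None:
        match = re.search(r"charset=([a-zA-Z\-\d]+)[\s;]?", content_type)
    return match.group(1) if match else "utf-8"


def decode_body(body, content_type):
    """Hypothetical helper: decode bytes, or return them unchanged on failure."""
    try:
        return body.decode(extract_charset(content_type))
    except (UnicodeDecodeError, LookupError):
        # UnicodeDecodeError: the bytes are invalid for the chosen charset.
        # LookupError: Python has no codec registered under that name.
        return body


latin1_body = b"Opera\xe7\xe3o inv\xe1lida"

# Header names the right charset: the body is decoded to str.
decoded = decode_body(latin1_body, "text/plain; charset=ISO-8859-1")

# No header: falls back to utf-8, decoding fails, original bytes are kept.
undecodable = decode_body(latin1_body, None)

# Unknown codec name: lookup fails, original bytes are kept.
unknown = decode_body(latin1_body, "text/plain; charset=no-such-codec")
```

Note that the character class in the regex does not match underscores or quoted values, so a header such as `charset=Shift_JIS` would be cut short at `Shift`; whether that matters depends on the servers involved.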
