[BUG] Generating invalid HTTP headers #1142

Closed
Lakitna opened this issue Apr 28, 2021 · 21 comments · Fixed by #1904
Labels: Difficulty: Beginner, Priority: Medium, Type: Bug


@Lakitna

Lakitna commented Apr 28, 2021

Describe the bug
Schemathesis generates headers like:

x-api-key: \x80

The \x80 character is not a part of ISO-8859-1 (AKA Latin-1). This is an issue because according to the HTTP 1.1 standard, all HTTP headers are encoded with ISO-8859-1. This means that my tests will always fail because of something I can't actually control.

To Reproduce

  1. In your OpenAPI Schema, add a security schema:
      securitySchemes:
        ApiKey:
          type: apiKey
          in: header
          name: x-custom-api-key-header
    security:
      - ApiKey: []
  2. Run Schemathesis

Expected behavior

I expect Schemathesis to not generate things that are unsupported due to the HTTP spec. By definition, these things will not have stable behaviour, so it'll be impossible to test with Schemathesis.

Environment (please complete the following information):

  • OS: Windows 10
  • Python version: 3.9.2
  • Schemathesis version: 3.6.3
  • Spec version: Open API 3.0.3
Lakitna added the Status: Needs Triage and Type: Bug labels on Apr 28, 2021
Stranger6667 added the Difficulty: Beginner and Priority: Medium labels and removed the Status: Needs Triage label on Apr 28, 2021
@Stranger6667
Member

Cool! Thank you for the detailed report! :)

I will take a look today

@Stranger6667
Member

Stranger6667 commented Apr 28, 2021

I've skimmed through RFC 7230, and it seems like \x80 is a valid value for an HTTP header.
Section 3.2 defines header field value like this:

field-value    = *( field-content / obs-fold )
field-content  = field-vchar [ 1*( SP / HTAB ) field-vchar ]
field-vchar    = VCHAR / obs-text

obs-fold       = CRLF 1*( SP / HTAB )
               ; obsolete line folding
               ; see Section 3.2.4

Having field-vchar defined as VCHAR / obs-text gives us alternatives:

  • any visible USASCII character, which is x00-x7F
  • x80-xFF (as defined in section 3.2.6)

Which gives the total interval x00-xFF, which contains the mentioned value of x80 (matching the obs-text rule).

For such headers, schemathesis uses this Hypothesis strategy - st.text(alphabet=st.characters(min_codepoint=0, max_codepoint=255)) which corresponds to that interval.

Maybe I missed something, let me know what you think :)
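
The character classes in the grammar above can be checked with a short sketch. It assumes RFC 5234's definition of VCHAR as %x21-7E (so control characters such as x00 are not VCHAR) and the obs-text range %x80-FF from RFC 7230 section 3.2.6. The function name `classify_octet` is hypothetical and used only for illustration.

```python
def classify_octet(code: int) -> str:
    """Name the RFC 7230 character class of one octet value (illustrative helper)."""
    if not 0 <= code <= 0xFF:
        raise ValueError("header octets are limited to 0x00-0xFF")
    if 0x21 <= code <= 0x7E:
        return "VCHAR"      # visible US-ASCII, per RFC 5234
    if 0x80 <= code <= 0xFF:
        return "obs-text"   # RFC 7230, section 3.2.6
    if code in (0x20, 0x09):
        return "SP/HTAB"    # allowed only between two field-vchar characters
    return "other"          # remaining controls (e.g. 0x00, 0x7F) match none of the rules


for code in (0x00, 0x41, 0x7F, 0x80, 0xFF):
    print(f"0x{code:02x}: {classify_octet(code)}")
```

Under these assumptions, x80 does match obs-text, while x00 and x7F fall outside every rule of the grammar.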

@Lakitna
Author

Lakitna commented Apr 28, 2021

Hmm reading a bit in 3.2:

3.2.4:

Historically, HTTP has allowed field content with text in the
ISO-8859-1 charset [ISO-8859-1], supporting other charsets only
through use of [RFC2047] encoding. In practice, most HTTP header
field values use only a subset of the US-ASCII charset [USASCII].
Newly defined header fields SHOULD limit their field values to
US-ASCII octets. A recipient SHOULD treat other octets in field
content (obs-text) as opaque data.

Where US-ASCII = \x00-\x7f

I think I'm feeling that one. It looks like I'm seeing this behaviour with everything >= \x80.

Maybe some servers implement the full \x00-\xFF. But mine clearly does not :\

I ran some more tests. It looks like my server returns a generic 400 with an empty body and no content type for the following situations:

API key header == \x00

And all the following values in the API key header
"0x81" contains the following non-Latin-1 character: 0x81
"0xbf" contains the following non-Latin-1 character: 0xbf
"0xe1" contains the following non-Latin-1 character: 0xe1
"0xd4" contains the following non-Latin-1 character: 0xd4
"0x95" contains the following non-Latin-1 character: 0x95
"0x89" contains the following non-Latin-1 character: 0x89
"0xd2" contains the following non-Latin-1 character: 0xd2
"0xd2" contains the following non-Latin-1 character: 0xd2
"0x76 0xa8" contains the following non-Latin-1 character: 0xa8
"0x67 0xc6 0xfa 0x9b 0x1a" contains the following non-Latin-1 character:  0xc6 0xfa0x9b
"0x7c 0xaf 0x56 0xbc" contains the following non-Latin-1 character:  0xaf0xbc
"0xe5 0xc5" contains the following non-Latin-1 character:  0xe50xc5
"0xd8 0x7b 0x55 0xc9" contains the following non-Latin-1 character:  0xd80xc9
"0xdb 0x75 0x13 0x66 0x2e 0x26" contains the following non-Latin-1 character: 0xdb
"0xc1" contains the following non-Latin-1 character: 0xc1
"0x57 0x9a" contains the following non-Latin-1 character: 0x9a
"0x61 0xe9" contains the following non-Latin-1 character: 0xe9
"0x91 0x64" contains the following non-Latin-1 character: 0x91
"0x91" contains the following non-Latin-1 character: 0x91
"0x91" contains the following non-Latin-1 character: 0x91
"0x91" contains the following non-Latin-1 character: 0x91
"0xbe 0xb9 0x71 0x47" contains the following non-Latin-1 character:  0xbe0xb9
"0x91 0x75 0x49 0x8c" contains the following non-Latin-1 character:  0x910x8c
"0x3e 0x94" contains the following non-Latin-1 character: 0x94
"0x55 0xef" contains the following non-Latin-1 character: 0xef
"0xfa" contains the following non-Latin-1 character: 0xfa
"0xb1 0xb7 0xe6 0x67 0xa3" contains the following non-Latin-1 character:  0xb1 0xb7 0xe60xa3
"0xdc" contains the following non-Latin-1 character: 0xdc
"0xa1" contains the following non-Latin-1 character: 0xa1
"0x30 0xfd" contains the following non-Latin-1 character: 0xfd
"0x7d 0xc3" contains the following non-Latin-1 character: 0xc3
"0xc3" contains the following non-Latin-1 character: 0xc3
"0xc3" contains the following non-Latin-1 character: 0xc3
"0x94 0xe7 0xaf" contains the following non-Latin-1 character:  0x94 0xe70xaf
"0x94 0xaf 0xaf" contains the following non-Latin-1 character:  0x94 0xaf0xaf
"0x94 0x94 0xaf" contains the following non-Latin-1 character:  0x94 0x940xaf
"0x94 0x94 0xaf" contains the following non-Latin-1 character:  0x94 0x940xaf
"0xf3" contains the following non-Latin-1 character: 0xf3
"0xa9 0x13" contains the following non-Latin-1 character: 0xa9
"0xd6 0xd3 0xe7" contains the following non-Latin-1 character:  0xd6 0xd30xe7
"0xe8 0xff 0x82 0x61" contains the following non-Latin-1 character:  0xe8 0xff0x82
"0x62 0xc4 0xf8 0x29 0xac 0x51 0xb0 0xdd 0x61" contains the following non-Latin-1 character:  0xc4 0xf8 0xac 0xb00xdd
"0x8f 0x79" contains the following non-Latin-1 character: 0x8f
"0xef 0x94 0x6f" contains the following non-Latin-1 character:  0xef0x94
"0x4e 0xeb 0x3a 0x3 0x3 0xa7 0x8f 0xb1 0x59 0xd0 0x7 0x4f 0x6e 0xe0" contains the following non-Latin-1 character:  0xeb 0xa7 0x8f 0xb1 0xd00xe0
"0x4 0x9a 0x0 0x32 0x95 0x2f 0xb 0x46 0xb" contains the following non-Latin-1 character:  0x9a0x95
"0xee" contains the following non-Latin-1 character: 0xee
"0xee" contains the following non-Latin-1 character: 0xee
"0x32 0x9d 0x9 0x71" contains the following non-Latin-1 character: 0x9d
"0x93" contains the following non-Latin-1 character: 0x93
"0xf5" contains the following non-Latin-1 character: 0xf5
"0x33 0xe3 0x40 0xb7 0xde 0x77 0x77 0xd8 0x67" contains the following non-Latin-1 character:  0xe3 0xb7 0xde0xd8
"0xc8 0xa1 0xe6" contains the following non-Latin-1 character:  0xc8 0xa10xe6
"0xf7 0xe" contains the following non-Latin-1 character: 0xf7
"0x43 0xa9 0x3f 0x5f 0xb0 0x19 0xc8" contains the following non-Latin-1 character:  0xa9 0xb00xc8
"0x7f 0xd2" contains the following non-Latin-1 character: 0xd2
"0x26 0xe7 0x40 0x12 0x33" contains the following non-Latin-1 character: 0xe7
"0x7c 0x15 0xc8 0xaa 0x25 0x97 0xd3 0x50" contains the following non-Latin-1 character:  0xc8 0xaa 0x970xd3
"0x59 0xef" contains the following non-Latin-1 character: 0xef
"0xef 0xef" contains the following non-Latin-1 character:  0xef0xef
"0x9b 0x2c 0xa4 0x7d 0x61 0xd3 0x70" contains the following non-Latin-1 character:  0x9b 0xa40xd3
"0x31 0xec 0x34" contains the following non-Latin-1 character: 0xec
"0x45 0x13 0x89 0x7e 0x3d 0x27 0x4" contains the following non-Latin-1 character: 0x89
"0x88 0x85 0xa0" contains the following non-Latin-1 character:  0x88 0x850xa0
"0xa2" contains the following non-Latin-1 character: 0xa2
"0xd5" contains the following non-Latin-1 character: 0xd5
"0xed" contains the following non-Latin-1 character: 0xed
"0xed" contains the following non-Latin-1 character: 0xed
"0x2d 0xb4 0x25" contains the following non-Latin-1 character: 0xb4
"0x9c 0xf3 0x1a 0xc6" contains the following non-Latin-1 character:  0x9c 0xf30xc6
"0x0 0xe4 0x1b 0x71 0xf3 0xc0 0x61 0x26 0xad 0x55 0xc1 0x41 0xd7 0x7d 0xd0" contains the following non-Latin-1 character:  0xe4 0xf3 0xc0 0xad 0xc1 0xd70xd0
"0xb6 0xeb" contains the following non-Latin-1 character:  0xb60xeb
"0xc6 0x36 0x95 0x49 0xfa" contains the following non-Latin-1 character:  0xc6 0x950xfa
"0x5e 0xfa" contains the following non-Latin-1 character: 0xfa
"0x5e 0xfa" contains the following non-Latin-1 character: 0xfa
"0x31 0x67 0x85 0x32 0x35 0xd3 0xd4" contains the following non-Latin-1 character:  0x85 0xd30xd4
"0xdb" contains the following non-Latin-1 character: 0xdb
"0xdb 0x73 0x41" contains the following non-Latin-1 character: 0xdb
"0xdb 0x31" contains the following non-Latin-1 character: 0xdb
"0xaf 0x70" contains the following non-Latin-1 character: 0xaf
"0xb0 0x68 0xd5 0x48" contains the following non-Latin-1 character:  0xb00xd5
"0xb0" contains the following non-Latin-1 character: 0xb0
"0xb0" contains the following non-Latin-1 character: 0xb0
"0xb0" contains the following non-Latin-1 character: 0xb0
"0xb0" contains the following non-Latin-1 character: 0xb0
"0xb0" contains the following non-Latin-1 character: 0xb0
"0x2d 0x88" contains the following non-Latin-1 character: 0x88
"0x65 0x8d 0x86 0x0" contains the following non-Latin-1 character:  0x8d0x86
"0xc0 0xc6 0xe2" contains the following non-Latin-1 character:  0xc0 0xc60xe2
"0xdf 0xd0" contains the following non-Latin-1 character:  0xdf0xd0
"0xdf" contains the following non-Latin-1 character: 0xdf
"0xe4 0x3f 0x74 0x39" contains the following non-Latin-1 character: 0xe4
"0xe4 0x3f 0x74 0x39" contains the following non-Latin-1 character: 0xe4
"0xe4 0xe4 0x74 0x39" contains the following non-Latin-1 character:  0xe40xe4
"0x30 0x83" contains the following non-Latin-1 character: 0x83
"0x48 0xc9" contains the following non-Latin-1 character: 0xc9
"0x78 0x43 0x41 0x90 0x40" contains the following non-Latin-1 character: 0x90
"0xab 0xa1 0xe5" contains the following non-Latin-1 character:  0xab 0xa10xe5
"0x43 0xd5 0x24 0xb6" contains the following non-Latin-1 character:  0xd50xb6
"0x2c 0x41 0x8b 0x54" contains the following non-Latin-1 character: 0x8b
"0x91 0xf 0x88" contains the following non-Latin-1 character:  0x910x88
"0xa9 0x43 0x22" contains the following non-Latin-1 character: 0xa9
"0xe 0x48 0x88 0x7f" contains the following non-Latin-1 character: 0x88
"0x32 0x48 0x88 0x7f" contains the following non-Latin-1 character: 0x88
"0x16 0x91" contains the following non-Latin-1 character: 0x91
"0xb6 0xbb 0xef 0xe9" contains the following non-Latin-1 character:  0xb6 0xbb 0xef0xe9
"0xb6" contains the following non-Latin-1 character: 0xb6
"0xb6" contains the following non-Latin-1 character: 0xb6
"0xb6" contains the following non-Latin-1 character: 0xb6
"0xb6" contains the following non-Latin-1 character: 0xb6
"0xb6" contains the following non-Latin-1 character: 0xb6
"0xb6" contains the following non-Latin-1 character: 0xb6
"0x5d 0x6a 0x7b 0x2f 0x33 0xeb" contains the following non-Latin-1 character: 0xeb
"0x96" contains the following non-Latin-1 character: 0x96
"0x38 0xdf" contains the following non-Latin-1 character: 0xdf
"0xdf 0xdf" contains the following non-Latin-1 character:  0xdf0xdf
"0x94" contains the following non-Latin-1 character: 0x94
"0x5c 0xc1 0x8a" contains the following non-Latin-1 character:  0xc10x8a
"0x5c 0xc1 0x8a" contains the following non-Latin-1 character:  0xc10x8a
"0x9b 0x9c 0xde 0x38" contains the following non-Latin-1 character:  0x9b 0x9c0xde
"0x19 0x9c 0x3c 0x61 0x25" contains the following non-Latin-1 character: 0x9c
"0xdf" contains the following non-Latin-1 character: 0xdf
"0x91" contains the following non-Latin-1 character: 0x91
"0xbb 0x7b" contains the following non-Latin-1 character: 0xbb
"0xaa 0x86 0x27 0xe3 0x43" contains the following non-Latin-1 character:  0xaa 0x860xe3
"0x32 0xa1 0xfe" contains the following non-Latin-1 character:  0xa10xfe
"0x3f 0xd9" contains the following non-Latin-1 character: 0xd9
"0xb8 0x96 0xd2 0x66 0xcd 0x7a" contains the following non-Latin-1 character:  0xb8 0x96 0xd20xcd
"0x33 0xdf 0x4a 0x5b 0xaf 0x30 0x24 0xe0" contains the following non-Latin-1 character:  0xdf 0xaf0xe0
"0xb1" contains the following non-Latin-1 character: 0xb1
"0x32 0x5e 0x9e 0x47 0x37 0xb9" contains the following non-Latin-1 character:  0x9e0xb9
"0xfb 0x31" contains the following non-Latin-1 character: 0xfb
"0xb6" contains the following non-Latin-1 character: 0xb6
"0x80" contains the following non-Latin-1 character: 0x80
"0xfa 0xda" contains the following non-Latin-1 character:  0xfa0xda
"0xf2 0x77 0xed 0xcc 0x76" contains the following non-Latin-1 character:  0xf2 0xed0xcc
"0x2f 0x7b 0x27 0xc6 0xc0" contains the following non-Latin-1 character:  0xc60xc0
"0xd0 0xc1 0x8 0x3b" contains the following non-Latin-1 character:  0xd00xc1
"0xd0 0xc1 0x8 0x3b" contains the following non-Latin-1 character:  0xd00xc1
"0xed 0x2b 0x84" contains the following non-Latin-1 character:  0xed0x84
"0xd1 0x57 0xef 0xb8 0x37 0x5c" contains the following non-Latin-1 character:  0xd1 0xef0xb8
"0x34 0xeb 0xb0" contains the following non-Latin-1 character:  0xeb0xb0
"0xda 0xc5" contains the following non-Latin-1 character:  0xda0xc5
"0xf1" contains the following non-Latin-1 character: 0xf1
"0xd4 0x24 0x94 0x91 0x68 0xa5 0x13 0x43" contains the following non-Latin-1 character:  0xd4 0x94 0x910xa5
"0xbf 0xf9" contains the following non-Latin-1 character:  0xbf0xf9
"0xfd" contains the following non-Latin-1 character: 0xfd
"0xf7" contains the following non-Latin-1 character: 0xf7
"0xf5" contains the following non-Latin-1 character: 0xf5
"0x31 0x88 0x32" contains the following non-Latin-1 character: 0x88
"0xc8 0x14 0xf8 0x14 0x0" contains the following non-Latin-1 character:  0xc80xf8
"0xa4 0xff 0x4c 0xb9 0xbf" contains the following non-Latin-1 character:  0xa4 0xff 0xb90xbf
"0x5 0xc6 0x9f 0xf9 0x77 0xb6 0xad" contains the following non-Latin-1 character:  0xc6 0x9f 0xf9 0xb60xad
"0xb5" contains the following non-Latin-1 character: 0xb5
"0x13 0xaf 0x5d 0xf5 0xd9 0xd8" contains the following non-Latin-1 character:  0xaf 0xf5 0xd90xd8
"0x25 0x2f 0x56 0x49 0xd8" contains the following non-Latin-1 character: 0xd8
"0xb8 0xbe 0x9b 0x63 0x48 0xf" contains the following non-Latin-1 character:  0xb8 0xbe0x9b
"0xf1 0x0 0xa1 0x8f 0x3e" contains the following non-Latin-1 character:  0xf1 0xa10x8f
"0x48 0xf0 0xe1 0x6b 0xbe 0xf 0xd6 0xec 0x69 0xee 0xf5 0x8" contains the following non-Latin-1 character:  0xf0 0xe1 0xbe 0xd6 0xec 0xee0xf5
"0x87 0x8f" contains the following non-Latin-1 character:  0x870x8f
"0x30 0x9d" contains the following non-Latin-1 character: 0x9d
"0xd7" contains the following non-Latin-1 character: 0xd7
"0x37 0xeb 0x35 0x61 0xc2 0xf1" contains the following non-Latin-1 character:  0xeb 0xc20xf1
"0xc6" contains the following non-Latin-1 character: 0xc6
"0x34 0x6b 0xe8" contains the following non-Latin-1 character: 0xe8
"0xb3" contains the following non-Latin-1 character: 0xb3
"0x7 0xe2 0x7c 0xaf" contains the following non-Latin-1 character:  0xe20xaf
"0xde" contains the following non-Latin-1 character: 0xde
"0xa8 0x2d" contains the following non-Latin-1 character: 0xa8
"0xd7 0x0 0x18" contains the following non-Latin-1 character: 0xd7
"0xe8 0x76 0x59" contains the following non-Latin-1 character: 0xe8
"0xa9 0xe" contains the following non-Latin-1 character: 0xa9
"0x91" contains the following non-Latin-1 character: 0x91
"0x23 0x88 0x13 0xac 0x5c 0x68 0x5a 0x2" contains the following non-Latin-1 character:  0x880xac
"0xfa 0x7a" contains the following non-Latin-1 character: 0xfa
"0xee 0x31" contains the following non-Latin-1 character: 0xee
"0x81" contains the following non-Latin-1 character: 0x81
"0xd3 0xe2" contains the following non-Latin-1 character:  0xd30xe2
"0xe2 0xe2" contains the following non-Latin-1 character:  0xe20xe2
"0x9b" contains the following non-Latin-1 character: 0x9b
"0x9b 0xf8 0x27 0xfa" contains the following non-Latin-1 character:  0x9b 0xf80xfa
"0x9b 0xf8 0x27 0xfa" contains the following non-Latin-1 character:  0x9b 0xf80xfa
"0xe7 0x70 0x37" contains the following non-Latin-1 character: 0xe7
"0x7c 0xc0" contains the following non-Latin-1 character: 0xc0
"0xac 0x14 0x5e" contains the following non-Latin-1 character: 0xac
"0x31 0x30 0x57 0x76 0x8a 0xf8" contains the following non-Latin-1 character:  0x8a0xf8
"0x73 0x6b 0x8 0x4b 0x6b 0xc4 0xda" contains the following non-Latin-1 character:  0xc40xda
"0xd2" contains the following non-Latin-1 character: 0xd2
"0x25 0xba 0x58 0x38 0x66 0xbb" contains the following non-Latin-1 character:  0xba0xbb
"0xea" contains the following non-Latin-1 character: 0xea
"0xec 0xd4 0x22" contains the following non-Latin-1 character:  0xec0xd4
"0xec 0xec 0x22" contains the following non-Latin-1 character:  0xec0xec
"0xec 0xec 0x22" contains the following non-Latin-1 character:  0xec0xec
"0xf4 0xe 0xb4 0xde 0x21" contains the following non-Latin-1 character:  0xf4 0xb40xde
"0xfa" contains the following non-Latin-1 character: 0xfa
"0xa8" contains the following non-Latin-1 character: 0xa8
"0xbf" contains the following non-Latin-1 character: 0xbf
"0xa2" contains the following non-Latin-1 character: 0xa2
"0xa2" contains the following non-Latin-1 character: 0xa2
"0xa2" contains the following non-Latin-1 character: 0xa2
"0xa2" contains the following non-Latin-1 character: 0xa2
"0x4f 0x98 0x4c" contains the following non-Latin-1 character: 0x98
"0xbc" contains the following non-Latin-1 character: 0xbc
"0x33 0x9a 0x62 0x31" contains the following non-Latin-1 character: 0x9a
"0x94 0x92" contains the following non-Latin-1 character:  0x940x92
"0x72 0xf9 0xff 0xfb 0x24 0x18 0xc7" contains the following non-Latin-1 character:  0xf9 0xff 0xfb0xc7
"0x86" contains the following non-Latin-1 character: 0x86
"0xda" contains the following non-Latin-1 character: 0xda
"0x88" contains the following non-Latin-1 character: 0x88
"0x22 0xf6" contains the following non-Latin-1 character: 0xf6
"0x9a" contains the following non-Latin-1 character: 0x9a

I ran 1000 tests, so I'm pretty confident that I've got all cases where my server fails.

The server I'm using by the way is a local Azure Functions instance with an HTTP trigger.

Edit: Conclusion after looking at this for too long... The accepted character range of headers for my server is 0x01-0x7f.
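
A report like the one above can be produced with a small helper. This is a sketch, not the script used in this thread: the name `rejected_chars` is hypothetical, and the accepted range 0x01-0x7f is the one observed for this particular Azure Functions server.

```python
from typing import List

ACCEPTED_MIN, ACCEPTED_MAX = 0x01, 0x7F  # range observed for the server in this thread


def rejected_chars(value: str) -> List[str]:
    """Return hex codes of characters outside the accepted range, in order of appearance."""
    return [hex(ord(ch)) for ch in value if not ACCEPTED_MIN <= ord(ch) <= ACCEPTED_MAX]


for sample in ("\x76\xa8", "\x00", "abc"):
    shown = " ".join(hex(ord(ch)) for ch in sample)
    print(f'"{shown}" rejected: {rejected_chars(sample)}')
```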

@Stranger6667
Member

Hmm, interesting.

If Schemathesis had a way to customize this behavior, i.e. to adjust the range of possible characters in HTTP headers, would that help your use case?

Btw, I noticed that at the moment, the strategy for the bearer auth type (if specified via security schemes) doesn't limit the range of possible characters at all. I will fix this part separately.

@Lakitna
Author

Lakitna commented Apr 29, 2021

Yeah that would help for me.

One extra thing I noticed though, Postman also does not support all these characters in headers. For example: 0x80 (= ƒ) results in this error:

[Image: Postman error message]

I would be happy with an option to reduce the char range. At the same time, I think it's worth thinking about future users too. Not having them go through this would be preferable. That being said, I do see the value of generating with the full range for more security-focused tests. So, how about an option to change the char range with a default of 0x01-0x7f?

@Stranger6667
Member

> I would be happy with an option to reduce the char range. At the same time, I think it's worth thinking about future users too. Not having them go through this would be preferable. That being said, I do see the value of generating with the full range for more security-focused tests. So, how about an option to change the char range with a default of 0x01-0x7f?

I think it is a reasonable thing to do if on average it will increase the Schemathesis effectiveness with default settings.

I don't have such numbers at the moment, but I am working on an evaluation suite for the "Property-Based Testing of Web APIs" paper. As soon as it is ready, I'll look at this aspect; so far I didn't see popular stacks from Python / Rust / Go / JavaScript worlds complaining about that matter, but I will certainly re-check it more thoroughly.

@Lakitna
Author

Lakitna commented Apr 30, 2021

> so far I didn't see popular stacks from Python / Rust / Go / JavaScript worlds complaining about that matter

It's a first for me too, but now that I've seen it once... I would have never thought to check Postman for this stuff before!

> increase the Schemathesis effectiveness with default settings

That's the goal! This got me thinking though. You'll know better how possible this is.

When running into an issue like this, can you identify it as an 'Only accepts Latin-1 in headers' situation automatically? If that's possible, you can easily keep the full char range and instruct users to set a narrower range when they're using a server like mine.

It would make it a conscious choice to change the range, without the headaches I encountered while troubleshooting.

> Property-Based Testing of Web APIs

Sounds interesting!

workaround

For now, I've wrapped case.validate_response into the following function. It simply checks the content-type header. If that one does not exist, I know my server has returned the fallback 400.

import schemathesis


def validate_response(case: schemathesis.models.Case, response: schemathesis.models.Response):
    """
    Validate the response. Includes a workaround for Azure's default 400.

    Normally we return a 400 with a Problem, this is a workaround because we can't always control
    this.

    This is needed because of how schemathesis works. It will try to add all kinds of characters
    to the HTTP headers. However, Azure Functions can only handle `0x01` through `0x7f`.
    Schemathesis will try to insert other characters as well, causing the OAS validation to fail.

    This function lives here pending the open issue on this:
    https://github.com/schemathesis/schemathesis/issues/1142
    """
    if response.headers.get("content-type") is not None:
        # The server produced a regular response, so no workaround is needed
        return case.validate_response(response)

    try:
        case.validate_response(response)
    except schemathesis.exceptions.CheckFailed:
        # Validation is expected to fail for the fallback response;
        # confirm that it really is the empty 400
        assert response.status_code == 400
        assert response.headers.get("content-type") is None
        assert response.content == b""
    else:
        raise AssertionError("Expected an exception to be raised")

@Stranger6667
Member

> When running into an issue like this, can you identify it as an 'Only accepts Latin-1 in headers' situation automatically? If that's possible, you can easily keep the full char range and instruct users to set a narrower range when they're using a server like mine.

I think it depends on the server implementation and how it responds; there is probably no direct way for Schemathesis to detect it automatically. However, I have a prototype that uses targeted property-based testing (via Hypothesis's target function) to improve the more general situation. The idea: when the server responds with a non-4xx response in a positive testing scenario, pass 1 to the target call; otherwise pass 0. This way Hypothesis will adjust data generation so it is more likely to generate non-400 responses (which is where I believe such cases should fall). However, the effect is visible only with a higher number of generated examples (1000+).
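
The scoring rule described above can be sketched as a plain function. This is an illustration under stated assumptions, not the prototype itself: `target_score` is a hypothetical name, and inside a Hypothesis test its result would be passed to `hypothesis.target(...)`.

```python
def target_score(status_code: int) -> float:
    """Score a response for targeted testing: 1.0 for non-4xx, 0.0 for 4xx."""
    return 0.0 if 400 <= status_code < 500 else 1.0


# Inside a Hypothesis test (not run here), the score would be reported like this:
#     hypothesis.target(target_score(response.status_code), label="non-4xx")
for status in (200, 400, 404, 500):
    print(status, target_score(status))
```

Note that, as described, a 5xx response also scores 1.0; a real implementation might want to treat server errors separately.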

Alternatively, if there is a way to detect some known cases, then Schemathesis can run some requests upfront and then adjust data generation strategies. Could you share a bit more details about your server and how I can reproduce it locally? I'll try to play around with it

@Lakitna
Author

Lakitna commented Apr 30, 2021

> Hypothesis will adjust data generation so it is more likely to generate non-400 responses

That'd be good, indeed. As long as it still does generate some 4xx codes. But I don't think that should be an issue in most cases. Having a wide variety of situations is the great power of property-based testing. IMO that should translate to a variety of status codes as well.

Along the same lines, I have been thinking of enforcing at least one 2xx response per endpoint. Right now, I have a passing test of an endpoint that has not been implemented. All calls will return 404. It's a bit out of scope of my current story though, and I couldn't quickly squeeze it in with hooks (missing afterAll hook).
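
Such a check could be sketched as an assertion over status codes recorded during a run. Everything here is hypothetical: the `endpoints_without_success` helper and the shape of the recorded data are assumptions, and collecting the codes is left open because no afterAll hook is available.

```python
from typing import Dict, List


def endpoints_without_success(seen: Dict[str, List[int]]) -> List[str]:
    """Return endpoints for which no recorded response had a 2xx status code."""
    return sorted(
        endpoint
        for endpoint, codes in seen.items()
        if not any(200 <= code < 300 for code in codes)
    )


# Hypothetical record of status codes observed per endpoint during a test run.
recorded = {
    "GET /items": [200, 400, 404],
    "POST /not-implemented": [404, 404, 404],
}
print(endpoints_without_success(recorded))
```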

> Alternatively, if there is a way to detect some known cases, then Schemathesis can run some requests upfront and then adjust data generation strategies

I guess OASs examples would be convenient. Right now the first call always follows the examples, but Hypothesis does not seem to iterate on those values. Not sure if that's a bad thing though 🤔

> Could you share a bit more details about your server and how I can reproduce it locally?

I'm running Python Azure Functions locally. I'm using a standard HTTP trigger to make my calls. If you follow the guide below you'll quickly end up with a local Azure Functions app with HTTP triggered function :)

See: https://docs.microsoft.com/en-us/azure/azure-functions/create-first-function-cli-python?tabs=azure-cli%2Cbash%2Cbrowser

@Stranger6667
Member

Stranger6667 commented May 5, 2021

Thanks for providing more context! I'll check this a bit later.

Meanwhile, the headers can be filtered with the filter_headers hook:

import schemathesis


@schemathesis.hook
def filter_headers(context, headers):
    for name, value in headers.items():
        if not in_range(name) or not in_range(value):
            return False
    return True


def in_range(string):
    return all(0x01 <= ord(char) <= 0x7f for char in string)

More about hooks - https://schemathesis.readthedocs.io/en/stable/extending.html#hooks

@Lakitna
Author

Lakitna commented Oct 22, 2021

I actually ran into this today with a coworker with 0x100 which is not in the 0x00-0xff range. This time it is the Python HTTP lib that raises an exception.

Did something change with a recent release?

Stacktrace of shrunk error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/schemathesis/runner/impl/core.py", line 490, in network_test
    response = _network_test(
  File "/usr/local/lib/python3.9/site-packages/schemathesis/runner/impl/core.py", line 544, in _network_test
    response = case.call(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/schemathesis/models.py", line 325, in call
    response = session.request(**data)  # type: ignore
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 394, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 239, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/usr/local/lib/python3.9/http/client.py", line 1279, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.9/http/client.py", line 1320, in _send_request
    self.putheader(hdr, value)
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 224, in putheader
    _HTTPConnection.putheader(self, header, *values)
  File "/usr/local/lib/python3.9/http/client.py", line 1252, in putheader
    values[i] = one_value.encode('latin-1')
UnicodeEncodeError: 'latin-1' codec can't encode character '\u0100' in position 0: ordinal not in range(256)
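
The last frame of the traceback can be reproduced with the standard library alone. The sketch below mirrors the `one_value.encode('latin-1')` call shown in `http.client.putheader`; the helper name `encodable_as_header` is hypothetical.

```python
def encodable_as_header(value: str) -> bool:
    """Report whether the value survives the Latin-1 encoding step used by http.client."""
    try:
        value.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


for sample in ("\x7f", "\x80", "\xff", "\u0100"):
    print(f"U+{ord(sample):04X}: {encodable_as_header(sample)}")
```

Python's latin-1 codec accepts every code point up to 0xFF, including 0x80. 0x100 is the first one it rejects, which matches the "ordinal not in range(256)" message.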

@Stranger6667
Member

Thank you for leaving a comment. It is strange, I will look at it over the weekend.

@Lakitna
Author

Lakitna commented Oct 22, 2021

Sounds good. For now, we've fixed it with the hook ;)

@Stranger6667
Member

@Lakitna

I think I made a mistake in 2418188 when I tried to optimize header generation. It removes filtering in more cases than it should; it should fall back to filtering.

Could you please share the API schema of your header parameters (just the top level of the schema keyword should be enough) so it will be easier for me to confirm that the issue is fixed?

@hoog1511

hoog1511 commented Oct 25, 2021

Hello @Stranger6667, I'm the coworker mentioned by @Lakitna. Here are the headers used within our schema. We also insert another header through the command line, which is a string of up to 40 characters.

  - name: someID
    in: header
    description: -
    schema:
      type: string
      format: uuid
      example: "8c36e86c-13b9-4102-a44f-646015dfd981"
    required: true
  - name: SomeOtherID
    in: header
    description: -
    schema:
      type: string
      example: "ABC_DEF_GHIJKL_MNOP"
    required: true
  - name: SomeTimestamp
    in: header
    description: -
    pattern: "\d{10}"
    type: string
    example: "1632741919"

@Stranger6667
Member

@hoog1511 thanks!

@hoog1511

hoog1511 commented Oct 25, 2021

@Stranger6667 the bug seems to have been introduced in version 3.10.0, I reverted to version 3.9.7 and the encoding error seems to no longer appear during testing. Hope this helps.

@Stranger6667
Member

@hoog1511

Thank you, this confirms that the root cause is this commit. It should be relatively easy to fix, I'll work on this today/tomorrow

@Stranger6667
Member

FYI, the fix for the recent issue is released in 3.11

@KioLion

KioLion commented Apr 22, 2022

Hello people,

I have just encountered a similar bug in my application. The bug occurs when using "Bearer {token}" for authentication. The token is automatically generated by OpenIdConnect while using FastAPI.
In some cases (only some generated tokens have the issue) the generated token cannot be encoded as latin-1, so I get an error similar to Lakitna's: a UnicodeEncodeError.

Is this a known issue already? Is there any fix available?

Thanks in advance :)

@Stranger6667
Member

Now there is a way to control what characters are used for strings (globally, not for headers specifically). I also will consider adjusting the defaults for header generation in Schemathesis 4.0. Meanwhile, if anybody has any issue with header generation, feel free to open a new issue.

Stranger6667 added this to the 3.22 milestone on Nov 29, 2023