CherryPy will always decode basic authentication information with ISO-8859-1 #1680

Closed
@PJaros

I have described the observed behavior on Stack Overflow.

import cherrypy

class SimpleWebpage(object):
    @cherrypy.expose
    def index(self):
        return "<html><head></head><body>Authenticated</body></html>"

def hexcode(s):
    return ' '.join(hex(ord(x))[2:] for x in s)

def test_hexcode(realm, username, password):
    username_hexcode = hexcode(username)
    password_hexcode = hexcode(password)
    print(f"realm: {realm!r}, username: {username!r}-{username_hexcode!r}, "
          f"password: {password!r}-{password_hexcode!r}")
    return False

cherrypy.tree.mount(SimpleWebpage(), '/',
                    {'/': {'tools.auth_basic.checkpassword': test_hexcode,
                           'tools.auth_basic.on': True,
                           'tools.auth_basic.realm': 'MY_REALM',}})

cherrypy.config.update({'tools.sessions.on': True,})
cherrypy.server.socket_host = '0.0.0.0'
cherrypy.engine.autoreload.unsubscribe()
cherrypy.engine.start()
cherrypy.engine.block()

Issuing
curl -u 'curl:€öäü' -i -X GET http://10.25.5.17:8080/
will print
realm: 'MY_REALM', username: 'curl'-'63 75 72 6c', password: 'â\x82¬Ã¶Ã¤Ã¼'-'e2 82 ac c3 b6 c3 a4 c3 bc'
on the Python console.
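The printed password is what you get when the UTF-8 bytes sent by curl are decoded as ISO-8859-1. A minimal sketch that reproduces this outside of CherryPy (the variable names are mine, not anything from CherryPy):

```python
# Reproduce the mojibake: UTF-8 bytes interpreted as ISO-8859-1.
password = '€öäü'

# What a UTF-8 client such as curl puts on the wire (before base64).
wire_bytes = password.encode('utf-8')
wire_hex = ' '.join(f'{b:02x}' for b in wire_bytes)

# Decoding those bytes as ISO-8859-1 maps every byte to one character,
# so each multi-byte UTF-8 sequence turns into several characters.
misdecoded = wire_bytes.decode('iso-8859-1')

# The original text is still recoverable by reversing the wrong decode.
recovered = misdecoded.encode('iso-8859-1').decode('utf-8')

print(wire_hex)
print(repr(misdecoded))
print(recovered)
```

So no information is lost, but a `checkpassword` callback would have to undo the ISO-8859-1 decoding itself before comparing passwords.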

RFC 7617 explains how encoding should be handled for basic authentication.

Currently CherryPy offers no way to indicate which charset it expects, or to control how it decodes the credentials. Right now clients use different charsets:

On Redhat 6

  • curl 7.19.7 (x86_64-redhat-linux-gnu) sends UTF-8

On Windows 10 Version 1709

  • Firefox 57.0.2 (32-Bit) sends (probably) ISO-8859-1
  • curl 7.56.1 (i686-pc-cygwin) sends UTF-8
  • Google-Chrome 63.0.3239.108 (64-Bit) sends UTF-8
  • Internet Explorer 11.786.15063.0 sends (probably) ISO-8859-1
  • Edge 40.15063.674.0 sends (probably) ISO-8859-1

A possible solution would be for CherryPy to send WWW-Authenticate: Basic realm="foo", charset="UTF-8" and decode the authentication string as UTF-8 instead of ISO-8859-1. But I'm sure that some other things might need to be considered, and I haven't tested whether Firefox will switch to UTF-8 with this header.
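To make the proposal concrete, here is a rough sketch using only the standard library. It is not CherryPy's implementation or API: `build_challenge` and `decode_basic_credentials` are hypothetical helpers, and the fallback to ISO-8859-1 for clients that ignore the charset hint is my own assumption, not something RFC 7617 mandates.

```python
import base64


def build_challenge(realm):
    # Hypothetical helper: value for the WWW-Authenticate header.
    # Assumes the realm contains no double quotes or backslashes.
    return f'Basic realm="{realm}", charset="UTF-8"'


def decode_basic_credentials(header_value):
    # Hypothetical helper: parse an Authorization header value.
    scheme, _, token = header_value.partition(' ')
    if scheme.lower() != 'basic':
        raise ValueError('not a Basic authorization header')
    raw = base64.b64decode(token.strip())
    try:
        # Preferred: the charset advertised in the challenge.
        text = raw.decode('utf-8')
    except UnicodeDecodeError:
        # Assumed fallback for clients that still send ISO-8859-1.
        text = raw.decode('iso-8859-1')
    # The user-id cannot contain ':', so split on the first one only.
    username, sep, password = text.partition(':')
    if not sep:
        raise ValueError('missing ":" separator in credentials')
    return username, password


if __name__ == '__main__':
    token = base64.b64encode('curl:€öäü'.encode('utf-8')).decode('ascii')
    print(build_challenge('MY_REALM'))
    print(decode_basic_credentials('Basic ' + token))
```

The fallback is only a heuristic: a byte sequence that happens to be valid UTF-8 will always be read as UTF-8, even if the client meant ISO-8859-1.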
