UTF8 path string invalid when using app.mount() #602

Open
onny opened this issue Mar 21, 2014 · 10 comments
Labels
Bug This issue is an actual confirmed bug that needs fixing

Comments


onny commented Mar 21, 2014

test.py:

#!/usr/bin/python
import bottle
import testapp

bottle.debug(True)
app = bottle.Bottle()

app.mount('/test',testapp.app)

app.run(reloader=True, host='0.0.0.0', port=8080)

testapp.py:

import bottle

app = bottle.Bottle()

@app.route("/:category", method=["GET","POST"])
def admin(category):
    try:
        return category
    except Exception as e:
        print("e:" + str(e))

Trying to access http://127.0.0.1:8080/test/äöü results in the following error:

Error: 400 Bad Request
Invalid path string. Expected UTF-8

Running Python 3.4.0 with python-bottle 0.12.5.

Author

onny commented Apr 19, 2014

Note that UTF-8 paths generally work when app.mount() is not involved:
test_working.py:

#!/usr/bin/python
# -*- coding: utf-8 -*-

import bottle
import testapp

bottle.debug(True)
app = bottle.Bottle()

@app.route("/test/:category", method=["GET","POST"])
def admin(category):
    try:
        return category
    except Exception as e:
        print("e:" + str(e))

app.run(reloader=True, host='0.0.0.0', port=8080)

Visiting http://127.0.0.1:8080/test/äöü prints the special chars without any problems :/

@defnull defnull added the Bug label Apr 19, 2014
Author

onny commented May 3, 2014

app.mount() accepts special characters if I comment out the HTTPError below, so that the UnicodeError is ignored:

    def _handle(self, environ):
        path = environ['bottle.raw_path'] = environ['PATH_INFO']

        if py3k:
            try:
                environ['PATH_INFO'] = path.encode('latin1').decode('utf8')
            except UnicodeError:
                print("unicode error")
                # return HTTPError(400, 'Invalid path string. Expected UTF-8')

Member

defnull commented Sep 13, 2018

The actual bug (or bad design decision) seems to be that bottle overwrites environ['PATH_INFO'] in Bottle._handle() with a re-encoded value, which breaks the WSGI spec for mounted WSGI apps, including bottle itself. The mounted app will try to re-encode the already re-encoded string again, assuming it came from a valid WSGI environment.

Bottle should not change environ['PATH_INFO'] at all, but instead re-encode the path in Request.path() on demand, and use that for request matching. I'm not sure if this change might break existing applications, but this may be a hard enough bug (breaking WSGI spec) that a backwards incompatible fix would be justifiable.
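The double re-encoding described above can be reproduced without Bottle. This is a minimal sketch, assuming the conversion is the `path.encode('latin1').decode('utf8')` expression quoted earlier in this thread; `reencode` is a hypothetical helper name, not part of Bottle.

```python
# Hypothetical stand-in for the PATH_INFO conversion quoted earlier in
# this thread; not Bottle's actual code.
def reencode(path_info: str) -> str:
    """Convert a WSGI 'bytes decoded as latin1' string to real UTF-8 text."""
    return path_info.encode("latin1").decode("utf8")

# What a WSGI server passes for /test/äöü on Python 3:
# the UTF-8 bytes of the path, decoded as latin1.
wsgi_path = "/test/äöü".encode("utf8").decode("latin1")

# First pass (outer app): succeeds and yields the real text.
once = reencode(wsgi_path)

# Second pass (mounted app sees the already converted value): fails.
try:
    reencode(once)
    second_pass_failed = False
except UnicodeError:
    second_pass_failed = True

print(repr(once), second_pass_failed)
```

The second pass fails because the already decoded text no longer consists of UTF-8 bytes disguised as latin1 characters, which is what the conversion expects.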


sharpaper commented Jun 19, 2019

This bug is 5 years old and still exists. It causes problems with UTF-8 URLs because Bottle returns 400. Can somebody knowledgeable about Bottle please look into fixing this? Thank you.


sharpaper commented Jun 19, 2019

To reproduce:

import bottle
from bottle import get

@get("/<name>")
def index(name):
    return name

go to http://localhost/%E8

result:

Error: 400 Bad Request
Sorry, the requested URL 'http://localhost/%C3%A8' caused an error:
Invalid path string. Expected UTF-8

@sharpaper

This issue seems related to this other one: #792
It looks like both are fixed in the new version 0.13-dev. Would it be possible to merge these Unicode fixes into the next release, instead of waiting for the release of the whole 0.13 version? Unicode mishandling is a major bug and a blocker. Thanks.

Member

defnull commented Jun 20, 2019

Since you are not mounting or redirecting, your issue is not related to this bug or #792. Please open a new issue. Also #792 seems to be fixed by now.

Member

defnull commented Jun 20, 2019

Also, your exact example works fine for valid utf-8 strings, encoded or not. %E8 is not a valid UTF-8 string, so the error message is quite accurate.
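A quick standard-library check illustrates the point. `%E8` is the latin1 encoding of "è", while its UTF-8 encoding is `%C3%A8`. The helper name `is_valid_utf8` is made up for this sketch.

```python
from urllib.parse import unquote_to_bytes

def is_valid_utf8(percent_encoded: str) -> bool:
    """Return True if the percent-decoded bytes form valid UTF-8."""
    try:
        unquote_to_bytes(percent_encoded).decode("utf8")
        return True
    except UnicodeDecodeError:
        return False

latin1_e_grave = is_valid_utf8("%E8")     # lone byte 0xE8
utf8_e_grave = is_valid_utf8("%C3%A8")    # bytes 0xC3 0xA8, i.e. "è"

print(latin1_e_grave, utf8_e_grave)
```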

Member

defnull commented Jun 20, 2019

The original issue (mounting apps) is still present, though. Pull requests are welcomed.


nobrin commented Feb 26, 2020

I ran into this bug in my app too. I work around it as shown below, and the app works fine.
Tested with Python 3.6.8 + Bottle.py 0.12.18.

#!/usr/bin/env python3
import functools
import bottle

def pathinfo_adjust_wrapper(func):
    # A wrapper for _handle() method
    @functools.wraps(func)
    def _(environ):
        environ["PATH_INFO"] = environ["PATH_INFO"].encode("utf8").decode("latin1")
        return func(environ)
    return _
api = bottle.Bottle()
api._handle = pathinfo_adjust_wrapper(api._handle)

@api.route("/<path:path>")
def callback(path):
    return {"name": path}

application = bottle.default_app()
application.mount("/api/", api)

application.run(host="0.0.0.0")

Access the URL "/api/日本語/filename" (the middle part is Japanese).

$ curl 127.0.0.1:8080/api/%E6%97%A5%E6%9C%AC%E8%AA%9E/filename
{"name": "\u65e5\u672c\u8a9e/filename"}

It seems to work. This approach does not need any code change in bottle.py.

In this code, pathinfo_adjust_wrapper() encodes PATH_INFO as UTF-8 and decodes it as latin1, which is the inverse of what the original _handle() does. The mounted app's _handle() can then convert it again without raising UnicodeError.
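The inverse relationship can be checked in isolation. This is a minimal sketch with hypothetical helper names (`to_wsgi`, `from_wsgi`); it only mirrors the two expressions quoted in this thread and does not call Bottle.

```python
def from_wsgi(path_info: str) -> str:
    """The conversion quoted from _handle(): latin1 bytes read as UTF-8."""
    return path_info.encode("latin1").decode("utf8")

def to_wsgi(text: str) -> str:
    """The workaround's inverse step: UTF-8 bytes read as latin1."""
    return text.encode("utf8").decode("latin1")

# Value the mounted app receives after the outer app already converted it.
decoded = "/日本語/filename"

# The wrapper restores the WSGI-style form, so converting again works.
restored = to_wsgi(decoded)
roundtrip = from_wsgi(restored)

print(roundtrip == decoded)
```

Because every byte value maps to a latin1 character, `to_wsgi` cannot fail on valid text, which is why the wrapper is safe to apply before the mounted app's own conversion.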

I love 💖 the bottle.py framework for developing web apps. Thank you!

sirex added a commit to sirex/bottle that referenced this issue Sep 29, 2020
It looks like bottlepy#602 is gone, but I'm not sure exactly when it was fixed.

I see that there is a suspicious change [here](bottlepy@d85a698):

    environ['PATH_INFO'] = path.encode('latin1').decode('utf8', 'ignore')

There, the original cause of the error was removed and replaced by `'ignore'`.

Also fixed deprecation warnings, where:

    @self.subapp.route('')

Does exactly the same thing as this:

    @self.subapp.route('/test/<test>')

When `''` is passed to `route()`, the `makelist()` utility assumes that nothing was passed and falls back to path autogeneration from the callback function. Not sure if this is expected behaviour?
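For context on the `'ignore'` change mentioned in the commit message, the sketch below shows what that error handler does when the conversion runs a second time on an already decoded path. It illustrates Python's codec behaviour only and makes no claim about how Bottle routes the result.

```python
# Path as the mounted app would see it after the outer app converted it.
already_decoded = "/test/äöü"

# Strict decoding (the old behaviour quoted earlier in this thread) raises.
try:
    already_decoded.encode("latin1").decode("utf8")
    strict_failed = False
except UnicodeError:
    strict_failed = True

# With 'ignore', bytes that are not valid UTF-8 are silently dropped.
ignored = already_decoded.encode("latin1").decode("utf8", "ignore")

print(strict_failed, repr(ignored))
```

The error disappears, but so do the non-ASCII characters in this particular input, so `'ignore'` hides the double conversion rather than undoing it.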
defnull pushed a commit that referenced this issue Dec 31, 2020
martinkirch pushed a commit to martinkirch/showergel that referenced this issue Apr 8, 2021