Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Show decoded urls in log #1115

Closed
dima74 opened this issue May 5, 2017 · 9 comments

Comments

Projects
None yet
3 participants
@dima74
Copy link

commented May 5, 2017

Hello! Werkzeug logs urls as encoded strings. For example if I have route with Cyrillic characters: @app.route('/привет') in Flask (https://gist.github.com/dima74/5e025da90ee90a78b65bc3e3c69623ea), then Werkzeug will log GET /%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82 HTTP/1.1. It would be very nice if there is option for Werkzeug to log decoded urls. (Thanks for Werkzeug!)

@untitaker

This comment has been minimized.

Copy link
Member

commented May 5, 2017

@dima74

This comment has been minimized.

Copy link
Author

commented May 6, 2017

Yes, I agree with you, thank you

@dima74

This comment has been minimized.

Copy link
Author

commented May 6, 2017

What if only letters will be decode, that is characters c for which c.isalpha() == True. Something like this:

import urllib.parse
import re


def decode_letters_in_url(url_encoded):
    result = ''
    url_encoded_parts = re.split('((?:%..)+)', url_encoded)
    for part in url_encoded_parts:
        if part.startswith('%'):
            part_decoded = urllib.parse.unquote(part)
            for letter in part_decoded:
                result += letter if letter.isalpha() else urllib.parse.quote(letter)
        else:
            result += part
    return result
>>> decode_letters_in_url('/query?args=1,2')
'/query?args=1,2'
>>> decode_letters_in_url('/russian/%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82')
'/russian/привет'
>>> decode_letters_in_url('/args?%D0%B0%D1%80%D0%B3%D1%83%D0%BC%D0%B5%D0%BD%D1%82%D1%8B=1,2,3')
'/args?аргументы=1,2,3'
>>> decode_letters_in_url('/query?spaces=_%20_%20_')
'/query?spaces=_%20_%20_'
>>> decode_letters_in_url('/%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82?space=%20russian=%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%B9')
'/привет?space=%20russian=русский'
@untitaker

This comment has been minimized.

Copy link
Member

commented May 6, 2017

Do you need this for production or for development? In development we can make this much simpler since we don't need to worry about the developer owning himself.

@dima74

This comment has been minimized.

Copy link
Author

commented May 6, 2017

For development. As far as I know Werkzeug is for development, not for production, doesn't it? In production I use Gunicorn, and I want that Gunicorn logs decoded url too, but it is another story.

@untitaker

This comment has been minimized.

Copy link
Member

commented May 6, 2017

@dima74

This comment has been minimized.

Copy link
Author

commented May 6, 2017

OK. Now I use only Werkzeug server for development

@untitaker

This comment has been minimized.

Copy link
Member

commented May 6, 2017

@davidism davidism changed the title [Future] Option to get decoded urls in log Show decoded urls in log Dec 10, 2018

@davidism

This comment has been minimized.

Copy link
Member

commented Jan 12, 2019

uri_to_iri has been fixed to leave unsafe and invalid characters quoted, so this should probably be possible without a separate function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.