Show decoded urls in log #1115

Closed
dima74 opened this issue May 5, 2017 · 9 comments

@dima74 dima74 commented May 5, 2017

Hello! Werkzeug logs URLs as percent-encoded strings. For example, if I have a Flask route with Cyrillic characters, @app.route('/привет') (https://gist.github.com/dima74/5e025da90ee90a78b65bc3e3c69623ea), then Werkzeug logs GET /%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82 HTTP/1.1. It would be very nice if there were an option for Werkzeug to log decoded URLs. (Thanks for Werkzeug!)

Member

@untitaker untitaker commented May 5, 2017

Author

@dima74 dima74 commented May 6, 2017

Yes, I agree with you, thank you.

Author

@dima74 dima74 commented May 6, 2017

What if only letters are decoded, that is, characters c for which c.isalpha() is True? Something like this:

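import re
import urllib.parse


def decode_letters_in_url(url_encoded):
    """Decode percent-escapes that stand for letters; keep everything else quoted."""
    result = ''
    # The capture group makes re.split keep the escape runs, at odd indices.
    url_encoded_parts = re.split('((?:%[0-9A-Fa-f]{2})+)', url_encoded)
    for index, part in enumerate(url_encoded_parts):
        if index % 2 == 1:
            part_decoded = urllib.parse.unquote(part)
            for letter in part_decoded:
                # safe='' so that a decoded '/' (%2F) is quoted again as well.
                result += letter if letter.isalpha() else urllib.parse.quote(letter, safe='')
        else:
            result += part
    return result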
>>> decode_letters_in_url('/query?args=1,2')
'/query?args=1,2'
>>> decode_letters_in_url('/russian/%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82')
'/russian/привет'
>>> decode_letters_in_url('/args?%D0%B0%D1%80%D0%B3%D1%83%D0%BC%D0%B5%D0%BD%D1%82%D1%8B=1,2,3')
'/args?аргументы=1,2,3'
>>> decode_letters_in_url('/query?spaces=_%20_%20_')
'/query?spaces=_%20_%20_'
>>> decode_letters_in_url('/%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82?space=%20russian=%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%B9')
'/привет?space=%20russian=русский'
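For development use, a letters-only decoder like the one above could be attached to the log output with a standard `logging.Filter`. The following is only a sketch under assumptions: `decode_letters` and `DecodedURLFilter` are hypothetical names introduced here, not part of Werkzeug, and the sketch relies on the development server emitting its request lines through the logger named "werkzeug".

```python
import logging
import re
from urllib.parse import quote, unquote

# Runs of well-formed percent-escapes, e.g. "%D0%BF".
_ESCAPES = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def decode_letters(text):
    """Decode percent-escape runs, quoting again every character that is not a letter."""
    def repl(match):
        return "".join(
            ch if ch.isalpha() else quote(ch, safe="")
            for ch in unquote(match.group(0))
        )
    return _ESCAPES.sub(repl, text)


class DecodedURLFilter(logging.Filter):
    """Hypothetical filter that rewrites a record's message with decoded letters."""

    def filter(self, record):
        # Format the message first, then replace it with the decoded text.
        record.msg = decode_letters(record.getMessage())
        record.args = ()  # the message is already fully formatted
        return True


# Assumption: request lines are logged through the "werkzeug" logger.
logging.getLogger("werkzeug").addFilter(DecodedURLFilter())
```

Because the filter only changes what is written to the log, routing and request handling still see the original encoded URL.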

Member

@untitaker untitaker commented May 6, 2017

Do you need this for production or for development? In development we can make this much simpler, since we don't need to worry about developers compromising their own machine.

Author

@dima74 dima74 commented May 6, 2017

For development. As far as I know, the Werkzeug server is for development, not for production, isn't it? In production I use Gunicorn, and I would like Gunicorn to log decoded URLs too, but that is another story.

Member

@untitaker untitaker commented May 6, 2017

Author

@dima74 dima74 commented May 6, 2017

OK. Right now I use the Werkzeug server only for development.

Member

@untitaker untitaker commented May 6, 2017

@davidism davidism changed the title from "[Future] Option to get decoded urls in log" to "Show decoded urls in log" Dec 10, 2018
Member

@davidism davidism commented Jan 12, 2019

uri_to_iri has been fixed to leave unsafe and invalid characters quoted, so this should probably be possible without a separate function.
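To illustrate what "leave unsafe and invalid characters quoted" can mean in practice, here is a standard-library sketch. It is not Werkzeug's `uri_to_iri` implementation: `decode_for_display` is a hypothetical name, and the set of characters kept quoted (URI reserved characters, `%`, whitespace, and non-printable characters) is an assumption made for this example.

```python
import re
from urllib.parse import quote

# Assumption for this sketch: these characters stay percent-encoded.
_RESERVED = set(":/?#[]@!$&'()*+,;=%")
_ESCAPE_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def _decode_run(match):
    raw = bytes.fromhex(match.group(0).replace("%", ""))
    # surrogateescape maps each undecodable byte 0xNN to the code point U+DCNN.
    text = raw.decode("utf-8", errors="surrogateescape")
    out = []
    for ch in text:
        code = ord(ch)
        if 0xDC80 <= code <= 0xDCFF:
            # Invalid UTF-8 byte: restore its original escape.
            out.append(f"%{code - 0xDC00:02X}")
        elif ch in _RESERVED or ch.isspace() or not ch.isprintable():
            # Unsafe character: keep it quoted.
            out.append(quote(ch, safe=""))
        else:
            out.append(ch)
    return "".join(out)


def decode_for_display(url):
    """Decode percent-escapes, keeping unsafe and invalid sequences quoted."""
    return _ESCAPE_RUN.sub(_decode_run, url)
```

Unlike a letters-only approach, this sketch also decodes safe non-letters such as digits, while a truncated sequence like a lone `%D0` is left as it was instead of turning into a replacement character.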

@davidism davidism added this to the 0.15 milestone Jan 12, 2019
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 13, 2020