Don't replace spaces for %20 in URLs #270

kylef · 2013-12-03T09:57:29Z

The Markdown parser from John Gruber doesn't do this, and there is no mention of this in the specification.

This is a broken implementation because it doesn't encode everything. I.e., there may already be a `%20` in the URL as well as a space. The original `%20` should become `%2520` and the space should become `%20`. If this is meant to percent-encode URLs, then it should use `urllib.quote` so it handles these cases and also encodes every other character that needs encoding.
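A minimal sketch of the edge case described here, assuming Python 3, where the `urllib.quote` function mentioned in this thread lives at `urllib.parse.quote`. The sample path is made up for illustration.

```python
from urllib.parse import quote

# Hypothetical path containing both a raw space and a literal "%20".
path = "/my file%20name.txt"

# Space-only replacement: the literal "%20" and the encoded space
# become indistinguishable after the substitution.
naive = path.replace(" ", "%20")
print(naive)  # /my%20file%20name.txt

# Full percent-encoding: "%" itself is escaped to "%25", so the
# literal "%20" becomes "%2520" and stays distinct from the space.
full = quote(path)
print(full)   # /my%20file%2520name.txt
```

Note that `quote` treats `/` as safe by default, so it suits the path portion of a URL rather than a whole URL string.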

waylan · 2013-12-04T03:33:49Z

John Gruber has stated in the past that just because his implementation behaves a certain way doesn't mean it is the correct behavior. I would say this is such a case. Markdown.pl's behavior is a bug IMO.

Of course, the next question is: what is the correct behavior? I looked to other implementations for that. The current behavior was a fix to issue #152 based on what other implementations do. Follow the link to babelmark from that bug report.

Of course, that doesn't address the edge case you provided. A better fix is certainly welcome. Feel free to try out various edge cases on babelmark and see how other implementations handle them.

One final thought. Your patch did not address the tests. With your patch applied we have failing tests. If you'd like me to accept a pull request it needs to have passing tests. Feel free to add additional edge cases to the tests as well.

kylef · 2013-12-04T09:30:33Z

Okay, thanks for the feedback @waylan. I will update the pull-request to correctly encode it if this is expected behaviour when I get a chance.

Btw, you should have a look at hooking up travis-ci to run the tests; then it can mark the pull-request as a build failure.

waylan · 2014-01-10T04:35:34Z

After giving this more thought, I'm not sure about my previous position on this issue. The fact is, an author might paste in a URL that is already URL-encoded. In that case, we want to leave it as-is (for example, `%20` should not become `%2520`). However, if a URL is not already URL-encoded, it should be. Not sure of the best way to do that. Perhaps unencode and then re-encode?
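A rough sketch of the "unencode and then re-encode" idea, assuming Python 3's `urllib.parse`. The helper name `normalize_path` is hypothetical and is not what the project ended up doing.

```python
from urllib.parse import quote, unquote

def normalize_path(path):
    """Decode any existing percent-escapes, then encode the result once."""
    return quote(unquote(path))

# A raw space and an already-encoded space end up identical.
print(normalize_path("/a b"))       # /a%20b
print(normalize_path("/a%20b"))     # /a%20b

# Caveat: an escaped reserved character loses its escaping, because
# quote() treats "/" as safe. The URL's meaning changes here.
print(normalize_path("/50%2Foff"))  # /50/off
```

The last case shows why the round trip is not a reliable fix: once decoded, there is no way to tell which characters the author had escaped on purpose.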

Also, only the path section of the URL should be percent-encoded. The query string part should be "plus sign encoded" (use `urllib.quote_plus`). Regardless of which way I go, the current implementation is clearly wrong as it percent-encodes all spaces -- even in the query string.
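The path/query distinction can be sketched like this, assuming Python 3, where `urllib.quote_plus` is `urllib.parse.quote_plus`. The helper `encode_url` and the choice of `safe="=&"` (to keep the query's key/value structure intact) are assumptions for illustration, and the sketch does not deal with input that is already encoded.

```python
from urllib.parse import urlsplit, urlunsplit, quote, quote_plus

def encode_url(url):
    """Percent-encode the path and plus-encode the query of an unencoded URL."""
    parts = urlsplit(url)
    path = quote(parts.path)                    # space -> %20
    query = quote_plus(parts.query, safe="=&")  # space -> +
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))

print(encode_url("http://example.com/my file?q=a b&x=1"))
# http://example.com/my%20file?q=a+b&x=1
```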

waylan · 2014-01-10T04:53:49Z

This has been fixed in d0e088d. Silly me, I forgot to mention this issue in the commit message.

kylef · 2014-01-10T07:43:35Z

@waylan That commit did what I did, instead of quoting the whole thing as your comment mentions? Is this correct?

waylan · 2014-01-11T01:37:44Z

@kylef yes, I just turned off the incorrect encoding (percent encoding query strings) until I can figure out a reliable way to properly encode a url without messing with already encoded urls. Perhaps the only way to do that is to do nothing.

waylan closed this Jan 10, 2014

kylef deleted the percent-encoding branch January 10, 2014 07:43

This was referenced Jun 18, 2018

Double escaping of ampersand in URLs #669

Closed

Fix double escaping of amp in attributes #670

Merged
