Exclude edge cases where get_relative_url depends on CWD #2296

oprypin · 2021-02-06T03:04:38Z

When only one of the two passed paths starts with a slash, this function would produce results that depend on (and expose parts of) the actual current working directory.
In a similar manner, it was also possible to "break out" of the "current directory" by starting one of the paths with `../../..` etc.

Additionally, the fact that paths always end up being resolved relative to the current working directory made this function less efficient than it needs to be.
Fun fact: `get_relative_url('path_a', 'path_b')` actually ended up looking up how to get to `/current/working/directory/path_a` from `/current/working/directory/path_b`.
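
The CWD dependence can be reproduced with the standard library alone. The following is a minimal sketch, assuming a POSIX system and assuming that the stdlib function in question is `posixpath.relpath`; `relpath_in` is a hypothetical helper for illustration, not part of MkDocs:

```python
import os
import posixpath
import tempfile


def relpath_in(cwd, path, start):
    """Return posixpath.relpath(path, start) as computed with `cwd` as the working directory."""
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        return posixpath.relpath(path, start)
    finally:
        os.chdir(previous)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as deep:
        # Two relative paths that stay inside the working directory:
        # the answer is the same wherever we run.
        print(relpath_in(deep, "path_a", "path_b"), relpath_in("/", "path_a", "path_b"))
        # Only one path has a leading slash: the answer computed inside `deep`
        # contains the components of `deep` itself.
        print(relpath_in(deep, "path_a", "/path_b"), relpath_in("/", "path_a", "/path_b"))
```

The second `print` is the interesting one: the two values differ only because of where the process happens to be running.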

Luckily, none of these behaviors can ever actually come into effect (the function is always called without a leading slash and without paths deliberately trying to escape), but the fix is still good to reduce surprise.

The actual fix is that the leading slash is now ignored (or, put the opposite way, we always add it).
Now when either of the two arguments tries to go higher than the top level, it just ends up at the top level (e.g. `foo/../..` is effectively `.`); only then does the "relative" calculation happen.
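
The idea can be sketched roughly as follows. This is a simplified illustration rather than the code from this pull request: `relative_url_sketch` is a hypothetical name, `posixpath.relpath` is assumed to be the underlying stdlib call, and the sketch leaves out the function's handling of file names in `other` and of trailing slashes.

```python
import posixpath


def relative_url_sketch(url, other):
    """Relative path from `other` to `url`, with both anchored at a virtual root."""
    # Prefixing '/' makes both paths absolute, so posixpath.relpath never
    # consults os.getcwd(); normalization of an absolute path also drops any
    # '..' that would climb above the root.
    return posixpath.relpath("/" + url, "/" + other)


print(relative_url_sketch("foo/bar", "foo"))      # bar
print(relative_url_sketch("foo/../../bar", "."))  # bar
print(relative_url_sketch("/foo", "foo/bar"))     # ..
print(relative_url_sketch("../../a", "here"))     # ../a
```

Because both arguments are anchored at the same root, a leading slash on either one no longer changes the result, and an empty string simply means the root.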

This pull request makes the behavior 100% the same as in #2272, but without the disadvantage of being a "custom replacement for the standard lib function". It also brings 35% of the performance advantage.

Build times of a site that has ~350 pages:

When running from directory /home/oprypin/repos/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/athena-website:
- Before: 11.1 sec
- After: 8.22 sec
When running from directory /home/oprypin/repos/athena-website:
- Before: 9.02 sec
- After: 8.15 sec

(meanwhile, #2272 achieves 6.65 sec)


waylan · 2021-02-06T21:44:42Z

If your thinking is that this might get accepted more quickly than your other PR, you're wrong. My assumption is that the standard lib is correct in its handling of any input, and performance is not enough of a motivator to get me to spend any time on it. And well, I just don't have the time to go through every scenario in your tests and confirm that every one is correct. Sorry, but I'm not willing to take your assertions at face value. I need to be convinced.

To make matters worse, you obscure some of the tests with this:

    for url_slash in ('', '/'):
        for other_slash in ('', '/'):

Spell out every combination in text so I can see them all. Then, point out the ones which fail before the change (side-by-side tests showing which pass/fail for each implementation would be ideal). That will give me a lot less to focus on.

oprypin · 2021-02-06T21:48:18Z

How can you keep saying that the standard lib is correct if the usage of it is outright wrong? A function that resolves paths relative to the current working directory is being used to resolve abstract paths. How is it in any way OK that `get_relative_url` produces a different result depending on which directory you happen to be running in?

To clarify: this comment describes the current state, not the state after this pull request closes those loopholes.

I will print out an expanded list here, sure.

oprypin · 2021-02-06T21:59:25Z

> To make matters worse, you obscure some of the tests with this

It's not so much obscuring as pointing out that the leading slash doesn't matter.

Sorry for the prior lack of examples, that's a good point.

Printed them out here

oprypin · 2021-02-06T22:04:23Z

UPD: a clearer version:

Both with slashes - normally never used like this

`get_relative_url(url, other)`	now	before
`'/foo/bar'`, `'/foo'`	`bar`	same
`'/foo/bar.txt'`, `'/foo'`	`bar.txt`	same
`'/foo'`, `'/foo/bar'`	`..`	same
`'/foo'`, `'/foo/bar.txt'`	`.`	same
`'/foo/../../bar'`, `'/.'`	`bar`	same
`'/foo/../../bar'`, `'/foo'`	`../bar`	same
`'/foo//./bar/baz'`, `'/foo/bar/baz'`	`.`	same
`'/a/b/.././../c'`, `'/.'`	`c`	same
`'/a/b/c/d/ee'`, `'/a/b/c/d/e'`	`../ee`	same
`'/a/b/c/d/ee'`, `'/a/b/z/d/e'`	`../../../c/d/ee`	same
`'/foo'`, `'/bar.'`	`foo`	same
`'/foo'`, `'/bar./'`	`../foo`	same
`'/foo'`, `'/foo/bar./'`	`..`	same
`'/foo'`, `'/foo/bar./.'`	`..`	same
`'/foo'`, `'/foo/bar././'`	`..`	same
`'/foo/'`, `'/foo/bar././'`	`../`	same
`'/foo'`, `'/foo'`	`.`	same
`'/.foo'`, `'/.foo'`	`.foo`	same
`'/.foo/'`, `'/.foo'`	`.foo/`	same
`'/.foo'`, `'/.foo/'`	`.`	same
`'/.foo/'`, `'/.foo/'`	`./`	same
`'////'`, `'/'`	`./`	same
`'/a///'`, `'/'`	`a/`	same
`'/a///'`, `'/a'`	`./`	same
`'/.'`, `'/here'`	`..`	same
`'/..'`, `'/here'`	`..`	same
`'/../..'`, `'/here'`	`..`	same
`'/../../a'`, `'/here'`	`../a`	same
`'/..'`, `'/here.txt'`	`.`	same
`'/a'`, `'/'`	`a`	same
`'/a'`, `'/..'`	`a`	same
`'/a'`, `'/b'`	`../a`	same
`'/a'`, `'/b/..'`	`../a`	same
`'/a'`, `'/b/../..'`	`a`	same
`'/'`, `'/'`	`./`	same
`'/.'`, `'/'`	`.`	same
`'/'`, `'/.'`	`./`	same
`'/.'`, `'/.'`	`.`	same
`'/a/..../b'`, `'/a/../b'`	`../a/..../b`	same
`'/a/я/b'`, `'/a/я/c'`	`../b`	same
`'/a/я/b'`, `'/a/яя/c'`	`../../я/b`	same

Neither has slashes - 100% of real usages are in this category as far as I see

`get_relative_url(url, other)`	now	before
`'foo/bar'`, `'foo'`	`bar`	same
`'foo/bar.txt'`, `'foo'`	`bar.txt`	same
`'foo'`, `'foo/bar'`	`..`	same
`'foo'`, `'foo/bar.txt'`	`.`	same
`'foo/../../bar'`, `'.'`	`bar`	`../bar`
`'foo/../../bar'`, `'foo'`	`../bar`	`../../bar`
`'foo//./bar/baz'`, `'foo/bar/baz'`	`.`	same
`'a/b/.././../c'`, `'.'`	`c`	same
`'a/b/c/d/ee'`, `'a/b/c/d/e'`	`../ee`	same
`'a/b/c/d/ee'`, `'a/b/z/d/e'`	`../../../c/d/ee`	same
`'foo'`, `'bar.'`	`foo`	same
`'foo'`, `'bar./'`	`../foo`	same
`'foo'`, `'foo/bar./'`	`..`	same
`'foo'`, `'foo/bar./.'`	`..`	same
`'foo'`, `'foo/bar././'`	`..`	same
`'foo/'`, `'foo/bar././'`	`../`	same
`'foo'`, `'foo'`	`.`	same
`'.foo'`, `'.foo'`	`.foo`	same
`'.foo/'`, `'.foo'`	`.foo/`	same
`'.foo'`, `'.foo/'`	`.`	same
`'.foo/'`, `'.foo/'`	`./`	same
`'///'`, `''`	`./`	`../../../../`
`'a///'`, `''`	`a/`	same
`'a///'`, `'a'`	`./`	same
`'.'`, `'here'`	`..`	same
`'..'`, `'here'`	`..`	`../..`
`'../..'`, `'here'`	`..`	`../../..`
`'../../a'`, `'here'`	`../a`	`../../../a`
`'..'`, `'here.txt'`	`.`	`..`
`'a'`, `''`	`a`	same
`'a'`, `'..'`	`a`	same
`'a'`, `'b'`	`../a`	same
`'a'`, `'b/..'`	`../a`	same
`'a'`, `'b/../..'`	`a`	same
`''`, `''`	`.`	`ValueError('no path specified')`
`'.'`, `''`	`.`	same
`''`, `'.'`	`.`	`ValueError('no path specified')`
`'.'`, `'.'`	`.`	same
`'a/..../b'`, `'a/../b'`	`../a/..../b`	same
`'a/я/b'`, `'a/я/c'`	`../b`	same
`'a/я/b'`, `'a/яя/c'`	`../../я/b`	same

No-slash relative to with-slash - normally never used like this

`get_relative_url(url, other)`	now	before
`'foo/bar'`, `'/foo'`	`bar`	`../home/oprypin/repos/mkdocs/foo/bar`
`'foo/bar.txt'`, `'/foo'`	`bar.txt`	`../home/oprypin/repos/mkdocs/foo/bar.txt`
`'foo'`, `'/foo/bar'`	`..`	`../../home/oprypin/repos/mkdocs/foo`
`'foo'`, `'/foo/bar.txt'`	`.`	`../home/oprypin/repos/mkdocs/foo`
`'foo/../../bar'`, `'/.'`	`bar`	`home/oprypin/repos/bar`
`'foo/../../bar'`, `'/foo'`	`../bar`	`../home/oprypin/repos/bar`
`'foo//./bar/baz'`, `'/foo/bar/baz'`	`.`	`../../../home/oprypin/repos/mkdocs/foo/bar/baz`
`'a/b/.././../c'`, `'/.'`	`c`	`home/oprypin/repos/mkdocs/c`
`'a/b/c/d/ee'`, `'/a/b/c/d/e'`	`../ee`	`../../../../../home/oprypin/repos/mkdocs/a/b/c/d/ee`
`'a/b/c/d/ee'`, `'/a/b/z/d/e'`	`../../../c/d/ee`	`../../../../../home/oprypin/repos/mkdocs/a/b/c/d/ee`
`'foo'`, `'/bar.'`	`foo`	`home/oprypin/repos/mkdocs/foo`
`'foo'`, `'/bar./'`	`../foo`	`../home/oprypin/repos/mkdocs/foo`
`'foo'`, `'/foo/bar./'`	`..`	`../../home/oprypin/repos/mkdocs/foo`
`'foo'`, `'/foo/bar./.'`	`..`	`../../home/oprypin/repos/mkdocs/foo`
`'foo'`, `'/foo/bar././'`	`..`	`../../home/oprypin/repos/mkdocs/foo`
`'foo/'`, `'/foo/bar././'`	`../`	`../../home/oprypin/repos/mkdocs/foo/`
`'foo'`, `'/foo'`	`.`	`../home/oprypin/repos/mkdocs/foo`
`'.foo'`, `'/.foo'`	`.foo`	`home/oprypin/repos/mkdocs/.foo`
`'.foo/'`, `'/.foo'`	`.foo/`	`home/oprypin/repos/mkdocs/.foo/`
`'.foo'`, `'/.foo/'`	`.`	`../home/oprypin/repos/mkdocs/.foo`
`'.foo/'`, `'/.foo/'`	`./`	`../home/oprypin/repos/mkdocs/.foo/`
`'///'`, `'/'`	`./`	same
`'a///'`, `'/'`	`a/`	`home/oprypin/repos/mkdocs/a/`
`'a///'`, `'/a'`	`./`	`../home/oprypin/repos/mkdocs/a/`
`'.'`, `'/here'`	`..`	`../home/oprypin/repos/mkdocs`
`'..'`, `'/here'`	`..`	`../home/oprypin/repos`
`'../..'`, `'/here'`	`..`	`../home/oprypin`
`'../../a'`, `'/here'`	`../a`	`../home/oprypin/a`
`'..'`, `'/here.txt'`	`.`	`home/oprypin/repos`
`'a'`, `'/'`	`a`	`home/oprypin/repos/mkdocs/a`
`'a'`, `'/..'`	`a`	`home/oprypin/repos/mkdocs/a`
`'a'`, `'/b'`	`../a`	`../home/oprypin/repos/mkdocs/a`
`'a'`, `'/b/..'`	`../a`	`../home/oprypin/repos/mkdocs/a`
`'a'`, `'/b/../..'`	`a`	`home/oprypin/repos/mkdocs/a`
`''`, `'/'`	`.`	`ValueError('no path specified')`
`'.'`, `'/'`	`.`	`home/oprypin/repos/mkdocs`
`''`, `'/.'`	`.`	`ValueError('no path specified')`
`'.'`, `'/.'`	`.`	`home/oprypin/repos/mkdocs`
`'a/..../b'`, `'/a/../b'`	`../a/..../b`	`../home/oprypin/repos/mkdocs/a/..../b`
`'a/я/b'`, `'/a/я/c'`	`../b`	`../../../home/oprypin/repos/mkdocs/a/я/b`
`'a/я/b'`, `'/a/яя/c'`	`../../я/b`	`../../../home/oprypin/repos/mkdocs/a/я/b`

Slash relative to no-slash - normally never used like this

`get_relative_url()`	now	before
`'/foo/bar'`, `'foo'`	`bar`	`../../../../../foo/bar`
`'/foo/bar.txt'`, `'foo'`	`bar.txt`	`../../../../../foo/bar.txt`
`'/foo'`, `'foo/bar'`	`..`	`../../../../../../foo`
`'/foo'`, `'foo/bar.txt'`	`.`	`../../../../../foo`
`'/foo/../../bar'`, `'.'`	`bar`	`../../../../bar`
`'/foo/../../bar'`, `'foo'`	`../bar`	`../../../../../bar`
`'/foo//./bar/baz'`, `'foo/bar/baz'`	`.`	`../../../../../../../foo/bar/baz`
`'/a/b/.././../c'`, `'.'`	`c`	`../../../../c`
`'/a/b/c/d/ee'`, `'a/b/c/d/e'`	`../ee`	`../../../../../../../../../a/b/c/d/ee`
`'/a/b/c/d/ee'`, `'a/b/z/d/e'`	`../../../c/d/ee`	`../../../../../../../../../a/b/c/d/ee`
`'/foo'`, `'bar.'`	`foo`	`../../../../foo`
`'/foo'`, `'bar./'`	`../foo`	`../../../../../foo`
`'/foo'`, `'foo/bar./'`	`..`	`../../../../../../foo`
`'/foo'`, `'foo/bar./.'`	`..`	`../../../../../../foo`
`'/foo'`, `'foo/bar././'`	`..`	`../../../../../../foo`
`'/foo/'`, `'foo/bar././'`	`../`	`../../../../../../foo/`
`'/foo'`, `'foo'`	`.`	`../../../../../foo`
`'/.foo'`, `'.foo'`	`.foo`	`../../../../.foo`
`'/.foo/'`, `'.foo'`	`.foo/`	`../../../../.foo/`
`'/.foo'`, `'.foo/'`	`.`	`../../../../../.foo`
`'/.foo/'`, `'.foo/'`	`./`	`../../../../../.foo/`
`'////'`, `''`	`./`	`../../../../`
`'/a///'`, `''`	`a/`	`../../../../a/`
`'/a///'`, `'a'`	`./`	`../../../../../a/`
`'/.'`, `'here'`	`..`	`../../../../..`
`'/..'`, `'here'`	`..`	`../../../../..`
`'/../..'`, `'here'`	`..`	`../../../../..`
`'/../../a'`, `'here'`	`../a`	`../../../../../a`
`'/..'`, `'here.txt'`	`.`	`../../../..`
`'/a'`, `''`	`a`	`../../../../a`
`'/a'`, `'..'`	`a`	`../../../../a`
`'/a'`, `'b'`	`../a`	`../../../../../a`
`'/a'`, `'b/..'`	`../a`	`../../../../../a`
`'/a'`, `'b/../..'`	`a`	`../../../../a`
`'/'`, `''`	`./`	`../../../../`
`'/.'`, `''`	`.`	`../../../..`
`'/'`, `'.'`	`./`	`../../../../`
`'/.'`, `'.'`	`.`	`../../../..`
`'/a/..../b'`, `'a/../b'`	`../a/..../b`	`../../../../../a/..../b`
`'/a/я/b'`, `'a/я/c'`	`../b`	`../../../../../../../a/я/b`
`'/a/я/b'`, `'a/яя/c'`	`../../я/b`	`../../../../../../../a/я/b`

oprypin · 2021-02-06T22:15:21Z

Ha, I found an even more interesting result. The tables above were produced when running under the directory `/home/oprypin/repos/mkdocs`. Now see what happens when running under the directory `/`:

The old implementation now almost fully matches the new implementation, but no longer matches its own output from a different current working directory.

`get_relative_url()`	now	before
`'/foo/bar'`, `'/foo'`	`bar`	same
`'/foo/bar.txt'`, `'/foo'`	`bar.txt`	same
`'/foo'`, `'/foo/bar'`	`..`	same
`'/foo'`, `'/foo/bar.txt'`	`.`	same
`'/foo/../../bar'`, `'/.'`	`bar`	same
`'/foo/../../bar'`, `'/foo'`	`../bar`	same
`'/foo//./bar/baz'`, `'/foo/bar/baz'`	`.`	same
`'/a/b/.././../c'`, `'/.'`	`c`	same
`'/a/b/c/d/ee'`, `'/a/b/c/d/e'`	`../ee`	same
`'/a/b/c/d/ee'`, `'/a/b/z/d/e'`	`../../../c/d/ee`	same
`'/foo'`, `'/bar.'`	`foo`	same
`'/foo'`, `'/bar./'`	`../foo`	same
`'/foo'`, `'/foo/bar./'`	`..`	same
`'/foo'`, `'/foo/bar./.'`	`..`	same
`'/foo'`, `'/foo/bar././'`	`..`	same
`'/foo/'`, `'/foo/bar././'`	`../`	same
`'/foo'`, `'/foo'`	`.`	same
`'/.foo'`, `'/.foo'`	`.foo`	same
`'/.foo/'`, `'/.foo'`	`.foo/`	same
`'/.foo'`, `'/.foo/'`	`.`	same
`'/.foo/'`, `'/.foo/'`	`./`	same
`'////'`, `'/'`	`./`	same
`'/a///'`, `'/'`	`a/`	same
`'/a///'`, `'/a'`	`./`	same
`'/.'`, `'/here'`	`..`	same
`'/..'`, `'/here'`	`..`	same
`'/../..'`, `'/here'`	`..`	same
`'/../../a'`, `'/here'`	`../a`	same
`'/..'`, `'/here.txt'`	`.`	same
`'/a'`, `'/'`	`a`	same
`'/a'`, `'/..'`	`a`	same
`'/a'`, `'/b'`	`../a`	same
`'/a'`, `'/b/..'`	`../a`	same
`'/a'`, `'/b/../..'`	`a`	same
`'/'`, `'/'`	`./`	same
`'/.'`, `'/'`	`.`	same
`'/'`, `'/.'`	`./`	same
`'/.'`, `'/.'`	`.`	same
`'/a/..../b'`, `'/a/../b'`	`../a/..../b`	same
`'/a/я/b'`, `'/a/я/c'`	`../b`	same
`'/a/я/b'`, `'/a/яя/c'`	`../../я/b`	same

`get_relative_url()`	now	before
`'foo/bar'`, `'foo'`	`bar`	same
`'foo/bar.txt'`, `'foo'`	`bar.txt`	same
`'foo'`, `'foo/bar'`	`..`	same
`'foo'`, `'foo/bar.txt'`	`.`	same
`'foo/../../bar'`, `'.'`	`bar`	same
`'foo/../../bar'`, `'foo'`	`../bar`	same
`'foo//./bar/baz'`, `'foo/bar/baz'`	`.`	same
`'a/b/.././../c'`, `'.'`	`c`	same
`'a/b/c/d/ee'`, `'a/b/c/d/e'`	`../ee`	same
`'a/b/c/d/ee'`, `'a/b/z/d/e'`	`../../../c/d/ee`	same
`'foo'`, `'bar.'`	`foo`	same
`'foo'`, `'bar./'`	`../foo`	same
`'foo'`, `'foo/bar./'`	`..`	same
`'foo'`, `'foo/bar./.'`	`..`	same
`'foo'`, `'foo/bar././'`	`..`	same
`'foo/'`, `'foo/bar././'`	`../`	same
`'foo'`, `'foo'`	`.`	same
`'.foo'`, `'.foo'`	`.foo`	same
`'.foo/'`, `'.foo'`	`.foo/`	same
`'.foo'`, `'.foo/'`	`.`	same
`'.foo/'`, `'.foo/'`	`./`	same
`'///'`, `''`	`./`	same
`'a///'`, `''`	`a/`	same
`'a///'`, `'a'`	`./`	same
`'.'`, `'here'`	`..`	same
`'..'`, `'here'`	`..`	same
`'../..'`, `'here'`	`..`	same
`'../../a'`, `'here'`	`../a`	same
`'..'`, `'here.txt'`	`.`	same
`'a'`, `''`	`a`	same
`'a'`, `'..'`	`a`	same
`'a'`, `'b'`	`../a`	same
`'a'`, `'b/..'`	`../a`	same
`'a'`, `'b/../..'`	`a`	same
`''`, `''`	`.`	`ValueError('no path specified')`
`'.'`, `''`	`.`	same
`''`, `'.'`	`.`	`ValueError('no path specified')`
`'.'`, `'.'`	`.`	same
`'a/..../b'`, `'a/../b'`	`../a/..../b`	same
`'a/я/b'`, `'a/я/c'`	`../b`	same
`'a/я/b'`, `'a/яя/c'`	`../../я/b`	same

`get_relative_url()`	now	before
`'foo/bar'`, `'/foo'`	`bar`	same
`'foo/bar.txt'`, `'/foo'`	`bar.txt`	same
`'foo'`, `'/foo/bar'`	`..`	same
`'foo'`, `'/foo/bar.txt'`	`.`	same
`'foo/../../bar'`, `'/.'`	`bar`	same
`'foo/../../bar'`, `'/foo'`	`../bar`	same
`'foo//./bar/baz'`, `'/foo/bar/baz'`	`.`	same
`'a/b/.././../c'`, `'/.'`	`c`	same
`'a/b/c/d/ee'`, `'/a/b/c/d/e'`	`../ee`	same
`'a/b/c/d/ee'`, `'/a/b/z/d/e'`	`../../../c/d/ee`	same
`'foo'`, `'/bar.'`	`foo`	same
`'foo'`, `'/bar./'`	`../foo`	same
`'foo'`, `'/foo/bar./'`	`..`	same
`'foo'`, `'/foo/bar./.'`	`..`	same
`'foo'`, `'/foo/bar././'`	`..`	same
`'foo/'`, `'/foo/bar././'`	`../`	same
`'foo'`, `'/foo'`	`.`	same
`'.foo'`, `'/.foo'`	`.foo`	same
`'.foo/'`, `'/.foo'`	`.foo/`	same
`'.foo'`, `'/.foo/'`	`.`	same
`'.foo/'`, `'/.foo/'`	`./`	same
`'///'`, `'/'`	`./`	same
`'a///'`, `'/'`	`a/`	same
`'a///'`, `'/a'`	`./`	same
`'.'`, `'/here'`	`..`	same
`'..'`, `'/here'`	`..`	same
`'../..'`, `'/here'`	`..`	same
`'../../a'`, `'/here'`	`../a`	same
`'..'`, `'/here.txt'`	`.`	same
`'a'`, `'/'`	`a`	same
`'a'`, `'/..'`	`a`	same
`'a'`, `'/b'`	`../a`	same
`'a'`, `'/b/..'`	`../a`	same
`'a'`, `'/b/../..'`	`a`	same
`''`, `'/'`	`.`	`ValueError('no path specified')`
`'.'`, `'/'`	`.`	same
`''`, `'/.'`	`.`	`ValueError('no path specified')`
`'.'`, `'/.'`	`.`	same
`'a/..../b'`, `'/a/../b'`	`../a/..../b`	same
`'a/я/b'`, `'/a/я/c'`	`../b`	same
`'a/я/b'`, `'/a/яя/c'`	`../../я/b`	same

`get_relative_url()`	now	before
`'/foo/bar'`, `'foo'`	`bar`	same
`'/foo/bar.txt'`, `'foo'`	`bar.txt`	same
`'/foo'`, `'foo/bar'`	`..`	same
`'/foo'`, `'foo/bar.txt'`	`.`	same
`'/foo/../../bar'`, `'.'`	`bar`	same
`'/foo/../../bar'`, `'foo'`	`../bar`	same
`'/foo//./bar/baz'`, `'foo/bar/baz'`	`.`	same
`'/a/b/.././../c'`, `'.'`	`c`	same
`'/a/b/c/d/ee'`, `'a/b/c/d/e'`	`../ee`	same
`'/a/b/c/d/ee'`, `'a/b/z/d/e'`	`../../../c/d/ee`	same
`'/foo'`, `'bar.'`	`foo`	same
`'/foo'`, `'bar./'`	`../foo`	same
`'/foo'`, `'foo/bar./'`	`..`	same
`'/foo'`, `'foo/bar./.'`	`..`	same
`'/foo'`, `'foo/bar././'`	`..`	same
`'/foo/'`, `'foo/bar././'`	`../`	same
`'/foo'`, `'foo'`	`.`	same
`'/.foo'`, `'.foo'`	`.foo`	same
`'/.foo/'`, `'.foo'`	`.foo/`	same
`'/.foo'`, `'.foo/'`	`.`	same
`'/.foo/'`, `'.foo/'`	`./`	same
`'////'`, `''`	`./`	same
`'/a///'`, `''`	`a/`	same
`'/a///'`, `'a'`	`./`	same
`'/.'`, `'here'`	`..`	same
`'/..'`, `'here'`	`..`	same
`'/../..'`, `'here'`	`..`	same
`'/../../a'`, `'here'`	`../a`	same
`'/..'`, `'here.txt'`	`.`	same
`'/a'`, `''`	`a`	same
`'/a'`, `'..'`	`a`	same
`'/a'`, `'b'`	`../a`	same
`'/a'`, `'b/..'`	`../a`	same
`'/a'`, `'b/../..'`	`a`	same
`'/'`, `''`	`./`	same
`'/.'`, `''`	`.`	same
`'/'`, `'.'`	`./`	same
`'/.'`, `'.'`	`.`	same
`'/a/..../b'`, `'a/../b'`	`../a/..../b`	same
`'/a/я/b'`, `'a/я/c'`	`../b`	same
`'/a/я/b'`, `'a/яя/c'`	`../../я/b`	same

oprypin · 2021-02-06T22:55:10Z

> To make matters worse, you obscure some of the tests with this

Now this is also addressed.

oprypin · 2021-02-13T22:56:50Z

I have done everything you've asked for and everything at all possible. Please work with me on this. I don't know at all why you dismiss this.

> My assumption is that the standard lib is correct in its handling of any input

I have shown that no, this stdlib function does not handle this usage correctly because it was not designed for it. The output of the old function depends on the current working directory.

Please, even run this yourself (without this PR):

>>> import os
>>> from mkdocs.utils import get_relative_url

>>> os.chdir('/tmp/testing')
>>> get_relative_url('bar', '../foo')
'../testing/bar'
>>> get_relative_url('../bar', 'foo')
'../../bar'

>>> os.chdir('/')
>>> get_relative_url('bar', '../foo')
'../bar'
>>> get_relative_url('../bar', 'foo')
'../bar'

Actually, now that I fully realize this, I have made a much better table. Apparently, there is not a single situation where the new and old functions differ and the old function's result does not depend on the current working directory.

If anyone ever used those cases, they are currently getting garbage from the old function.

The new function's result is consistent in those cases, and for the already-consistent cases of the old function, the result 100% matches.

`get_relative_url()`	now	before
`'foo/bar'`, `'foo'`	`bar`	same
`'foo/bar.txt'`, `'foo'`	`bar.txt`	same
`'foo'`, `'foo/bar'`	`..`	same
`'foo'`, `'foo/bar.txt'`	`.`	same
`'foo/../../bar'`, `'.'`	`bar`	Depends on `os.getcwd()`
`'foo/../../bar'`, `'foo'`	`../bar`	Depends on `os.getcwd()`
`'foo//./bar/baz'`, `'foo/bar/baz'`	`.`	same
`'a/b/.././../c'`, `'.'`	`c`	same
`'a/b/c/d/ee'`, `'a/b/c/d/e'`	`../ee`	same
`'a/b/c/d/ee'`, `'a/b/z/d/e'`	`../../../c/d/ee`	same
`'foo'`, `'bar.'`	`foo`	same
`'foo'`, `'bar./'`	`../foo`	same
`'foo'`, `'foo/bar./'`	`..`	same
`'foo'`, `'foo/bar./.'`	`..`	same
`'foo'`, `'foo/bar././'`	`..`	same
`'foo/'`, `'foo/bar././'`	`../`	same
`'foo'`, `'foo'`	`.`	same
`'.foo'`, `'.foo'`	`.foo`	same
`'.foo/'`, `'.foo'`	`.foo/`	same
`'.foo'`, `'.foo/'`	`.`	same
`'.foo/'`, `'.foo/'`	`./`	same
`'///'`, `''`	`./`	Depends on `os.getcwd()`
`'a///'`, `''`	`a/`	same
`'a///'`, `'a'`	`./`	same
`'.'`, `'here'`	`..`	same
`'..'`, `'here'`	`..`	Depends on `os.getcwd()`
`'../..'`, `'here'`	`..`	Depends on `os.getcwd()`
`'../../a'`, `'here'`	`../a`	Depends on `os.getcwd()`
`'..'`, `'here.txt'`	`.`	Depends on `os.getcwd()`
`'a'`, `''`	`a`	same
`'a'`, `'..'`	`a`	same
`'a'`, `'b'`	`../a`	same
`'a'`, `'b/..'`	`../a`	same
`'a'`, `'b/../..'`	`a`	same
`''`, `''`	`.`	`ValueError('no path specified')`
`'.'`, `''`	`.`	same
`''`, `'.'`	`.`	`ValueError('no path specified')`
`'.'`, `'.'`	`.`	same
`'a/..../b'`, `'a/../b'`	`../a/..../b`	same
`'a/я/b'`, `'a/я/c'`	`../b`	same
`'a/я/b'`, `'a/яя/c'`	`../../я/b`	same

`get_relative_url()`	now	before
`'foo/bar'`, `'/foo'`	`bar`	Depends on `os.getcwd()`
`'foo/bar.txt'`, `'/foo'`	`bar.txt`	Depends on `os.getcwd()`
`'foo'`, `'/foo/bar'`	`..`	Depends on `os.getcwd()`
`'foo'`, `'/foo/bar.txt'`	`.`	Depends on `os.getcwd()`
`'foo/../../bar'`, `'/.'`	`bar`	Depends on `os.getcwd()`
`'foo/../../bar'`, `'/foo'`	`../bar`	Depends on `os.getcwd()`
`'foo//./bar/baz'`, `'/foo/bar/baz'`	`.`	Depends on `os.getcwd()`
`'a/b/.././../c'`, `'/.'`	`c`	Depends on `os.getcwd()`
`'a/b/c/d/ee'`, `'/a/b/c/d/e'`	`../ee`	Depends on `os.getcwd()`
`'a/b/c/d/ee'`, `'/a/b/z/d/e'`	`../../../c/d/ee`	Depends on `os.getcwd()`
`'foo'`, `'/bar.'`	`foo`	Depends on `os.getcwd()`
`'foo'`, `'/bar./'`	`../foo`	Depends on `os.getcwd()`
`'foo'`, `'/foo/bar./'`	`..`	Depends on `os.getcwd()`
`'foo'`, `'/foo/bar./.'`	`..`	Depends on `os.getcwd()`
`'foo'`, `'/foo/bar././'`	`..`	Depends on `os.getcwd()`
`'foo/'`, `'/foo/bar././'`	`../`	Depends on `os.getcwd()`
`'foo'`, `'/foo'`	`.`	Depends on `os.getcwd()`
`'.foo'`, `'/.foo'`	`.foo`	Depends on `os.getcwd()`
`'.foo/'`, `'/.foo'`	`.foo/`	Depends on `os.getcwd()`
`'.foo'`, `'/.foo/'`	`.`	Depends on `os.getcwd()`
`'.foo/'`, `'/.foo/'`	`./`	Depends on `os.getcwd()`
`'///'`, `'/'`	`./`	same
`'a///'`, `'/'`	`a/`	Depends on `os.getcwd()`
`'a///'`, `'/a'`	`./`	Depends on `os.getcwd()`
`'.'`, `'/here'`	`..`	Depends on `os.getcwd()`
`'..'`, `'/here'`	`..`	Depends on `os.getcwd()`
`'../..'`, `'/here'`	`..`	Depends on `os.getcwd()`
`'../../a'`, `'/here'`	`../a`	Depends on `os.getcwd()`
`'..'`, `'/here.txt'`	`.`	Depends on `os.getcwd()`
`'a'`, `'/'`	`a`	Depends on `os.getcwd()`
`'a'`, `'/..'`	`a`	Depends on `os.getcwd()`
`'a'`, `'/b'`	`../a`	Depends on `os.getcwd()`
`'a'`, `'/b/..'`	`../a`	Depends on `os.getcwd()`
`'a'`, `'/b/../..'`	`a`	Depends on `os.getcwd()`
`''`, `'/'`	`.`	`ValueError('no path specified')`
`'.'`, `'/'`	`.`	Depends on `os.getcwd()`
`''`, `'/.'`	`.`	`ValueError('no path specified')`
`'.'`, `'/.'`	`.`	Depends on `os.getcwd()`
`'a/..../b'`, `'/a/../b'`	`../a/..../b`	Depends on `os.getcwd()`
`'a/я/b'`, `'/a/я/c'`	`../b`	Depends on `os.getcwd()`
`'a/я/b'`, `'/a/яя/c'`	`../../я/b`	Depends on `os.getcwd()`

`get_relative_url()`	now	before
`'/foo/bar'`, `'foo'`	`bar`	Depends on `os.getcwd()`
`'/foo/bar.txt'`, `'foo'`	`bar.txt`	Depends on `os.getcwd()`
`'/foo'`, `'foo/bar'`	`..`	Depends on `os.getcwd()`
`'/foo'`, `'foo/bar.txt'`	`.`	Depends on `os.getcwd()`
`'/foo/../../bar'`, `'.'`	`bar`	Depends on `os.getcwd()`
`'/foo/../../bar'`, `'foo'`	`../bar`	Depends on `os.getcwd()`
`'/foo//./bar/baz'`, `'foo/bar/baz'`	`.`	Depends on `os.getcwd()`
`'/a/b/.././../c'`, `'.'`	`c`	Depends on `os.getcwd()`
`'/a/b/c/d/ee'`, `'a/b/c/d/e'`	`../ee`	Depends on `os.getcwd()`
`'/a/b/c/d/ee'`, `'a/b/z/d/e'`	`../../../c/d/ee`	Depends on `os.getcwd()`
`'/foo'`, `'bar.'`	`foo`	Depends on `os.getcwd()`
`'/foo'`, `'bar./'`	`../foo`	Depends on `os.getcwd()`
`'/foo'`, `'foo/bar./'`	`..`	Depends on `os.getcwd()`
`'/foo'`, `'foo/bar./.'`	`..`	Depends on `os.getcwd()`
`'/foo'`, `'foo/bar././'`	`..`	Depends on `os.getcwd()`
`'/foo/'`, `'foo/bar././'`	`../`	Depends on `os.getcwd()`
`'/foo'`, `'foo'`	`.`	Depends on `os.getcwd()`
`'/.foo'`, `'.foo'`	`.foo`	Depends on `os.getcwd()`
`'/.foo/'`, `'.foo'`	`.foo/`	Depends on `os.getcwd()`
`'/.foo'`, `'.foo/'`	`.`	Depends on `os.getcwd()`
`'/.foo/'`, `'.foo/'`	`./`	Depends on `os.getcwd()`
`'////'`, `''`	`./`	Depends on `os.getcwd()`
`'/a///'`, `''`	`a/`	Depends on `os.getcwd()`
`'/a///'`, `'a'`	`./`	Depends on `os.getcwd()`
`'/.'`, `'here'`	`..`	Depends on `os.getcwd()`
`'/..'`, `'here'`	`..`	Depends on `os.getcwd()`
`'/../..'`, `'here'`	`..`	Depends on `os.getcwd()`
`'/../../a'`, `'here'`	`../a`	Depends on `os.getcwd()`
`'/..'`, `'here.txt'`	`.`	Depends on `os.getcwd()`
`'/a'`, `''`	`a`	Depends on `os.getcwd()`
`'/a'`, `'..'`	`a`	Depends on `os.getcwd()`
`'/a'`, `'b'`	`../a`	Depends on `os.getcwd()`
`'/a'`, `'b/..'`	`../a`	Depends on `os.getcwd()`
`'/a'`, `'b/../..'`	`a`	Depends on `os.getcwd()`
`'/'`, `''`	`./`	Depends on `os.getcwd()`
`'/.'`, `''`	`.`	Depends on `os.getcwd()`
`'/'`, `'.'`	`./`	Depends on `os.getcwd()`
`'/.'`, `'.'`	`.`	Depends on `os.getcwd()`
`'/a/..../b'`, `'a/../b'`	`../a/..../b`	Depends on `os.getcwd()`
`'/a/я/b'`, `'a/я/c'`	`../b`	Depends on `os.getcwd()`
`'/a/я/b'`, `'a/яя/c'`	`../../я/b`	Depends on `os.getcwd()`

`get_relative_url()`	now	before
`'/foo/bar'`, `'/foo'`	`bar`	same
`'/foo/bar.txt'`, `'/foo'`	`bar.txt`	same
`'/foo'`, `'/foo/bar'`	`..`	same
`'/foo'`, `'/foo/bar.txt'`	`.`	same
`'/foo/../../bar'`, `'/.'`	`bar`	same
`'/foo/../../bar'`, `'/foo'`	`../bar`	same
`'/foo//./bar/baz'`, `'/foo/bar/baz'`	`.`	same
`'/a/b/.././../c'`, `'/.'`	`c`	same
`'/a/b/c/d/ee'`, `'/a/b/c/d/e'`	`../ee`	same
`'/a/b/c/d/ee'`, `'/a/b/z/d/e'`	`../../../c/d/ee`	same
`'/foo'`, `'/bar.'`	`foo`	same
`'/foo'`, `'/bar./'`	`../foo`	same
`'/foo'`, `'/foo/bar./'`	`..`	same
`'/foo'`, `'/foo/bar./.'`	`..`	same
`'/foo'`, `'/foo/bar././'`	`..`	same
`'/foo/'`, `'/foo/bar././'`	`../`	same
`'/foo'`, `'/foo'`	`.`	same
`'/.foo'`, `'/.foo'`	`.foo`	same
`'/.foo/'`, `'/.foo'`	`.foo/`	same
`'/.foo'`, `'/.foo/'`	`.`	same
`'/.foo/'`, `'/.foo/'`	`./`	same
`'////'`, `'/'`	`./`	same
`'/a///'`, `'/'`	`a/`	same
`'/a///'`, `'/a'`	`./`	same
`'/.'`, `'/here'`	`..`	same
`'/..'`, `'/here'`	`..`	same
`'/../..'`, `'/here'`	`..`	same
`'/../../a'`, `'/here'`	`../a`	same
`'/..'`, `'/here.txt'`	`.`	same
`'/a'`, `'/'`	`a`	same
`'/a'`, `'/..'`	`a`	same
`'/a'`, `'/b'`	`../a`	same
`'/a'`, `'/b/..'`	`../a`	same
`'/a'`, `'/b/../..'`	`a`	same
`'/'`, `'/'`	`./`	same
`'/.'`, `'/'`	`.`	same
`'/'`, `'/.'`	`./`	same
`'/.'`, `'/.'`	`.`	same
`'/a/..../b'`, `'/a/../b'`	`../a/..../b`	same
`'/a/я/b'`, `'/a/я/c'`	`../b`	same
`'/a/я/b'`, `'/a/яя/c'`	`../../я/b`	same

oprypin · 2021-02-13T22:58:39Z

The code used to make that last table: oprypin@2ce90ec

oprypin · 2021-02-13T23:22:28Z

A fuzzer proving the same - very easy to understand this time:
https://gist.github.com/oprypin/ca27096823e806244632496c15d1c8d7

It shows that the new function 100% matches the old function in every case where the old function's result is self-consistent, i.e. does not vary with the current working directory.
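
The kind of check such a fuzzer performs can be sketched like this. This is not the code from the linked gist: it is a simplified stand-in that uses plain `posixpath.relpath` for the old behavior and a slash-anchored `posixpath.relpath` for the new one, skips the file-name and trailing-slash handling, and assumes a POSIX system. All function names are hypothetical.

```python
import os
import posixpath
import random
import tempfile


def old_style(cwd, url, other):
    """Plain relpath, evaluated with `cwd` as the working directory."""
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        return posixpath.relpath(url, other)
    finally:
        os.chdir(previous)


def new_style(url, other):
    """Relpath with both arguments anchored at a virtual root."""
    return posixpath.relpath("/" + url, "/" + other)


def random_path(rng):
    """Build a short non-empty path, optionally with a leading slash."""
    parts = [rng.choice(["a", "b", ".", ".."]) for _ in range(rng.randint(1, 4))]
    return rng.choice(["", "/"]) + "/".join(parts)


def fuzz(trials=500, seed=0):
    """Return (consistent, cwd_dependent) counts; raise AssertionError on a real mismatch."""
    rng = random.Random(seed)
    consistent = cwd_dependent = 0
    with tempfile.TemporaryDirectory() as deep:
        for _ in range(trials):
            url, other = random_path(rng), random_path(rng)
            results = {old_style(cwd, url, other) for cwd in ("/", deep)}
            if len(results) > 1:
                # The old result varies with the CWD, so there is nothing to compare.
                cwd_dependent += 1
                continue
            assert results == {new_style(url, other)}, (url, other)
            consistent += 1
    return consistent, cwd_dependent


print(fuzz())
```

Only two working directories are tried here, so "consistent" means "consistent across those two"; a more thorough run would sample more directories and a richer path alphabet.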

> I need to be convinced.

If this doesn't convince you, then I need to be convinced with the reasoning why it doesn't.

waylan · 2021-02-14T00:14:09Z

So at this point you have more than demonstrated that an issue exists. But now I need to go through every single one of those tests and confirm that what your "fix" is outputting is correct. That is going to take some time. Please be patient.

waylan · 2021-04-04T19:21:51Z

Finally got around to reviewing this. Sorry for the delay and thanks for the work.

oprypin added 2 commits February 6, 2021 03:45

Also bring over the documentation

da41b28

Unroll some tests

552a5a7

waylan merged commit fe6a389 into mkdocs:master Apr 4, 2021

waylan mentioned this pull request Apr 4, 2021

Optimize getting relative page URLs #2272

Closed

oprypin mentioned this pull request May 16, 2021

Optimize getting relative page URLs, now with less custom code #2407

Merged

oprypin mentioned this pull request Mar 25, 2022

relative url start with "../" in nav no longer works #2752

Closed
