Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URLs for sites not in the webserver root are broken [was: The image url is not correct.] #986

Closed
ghost opened this issue Jan 19, 2014 · 47 comments
Assignees
Labels
Milestone

Comments

@ghost
Copy link

ghost commented Jan 19, 2014

My site is in a subfolder, such as http://www.abc.com/blog/. The url of image in the gallery is not correct.
If I put a jpg file in galleries/software/screen.jpg, after building, the url will be http://www.abc.com/galleries/software/screen.jpg. In face it should be http://www.abc.com/blog/galleries/software/screen.jpg.

The images in the posts are good.

Nikola version 6.2.1

@Kwpolska
Copy link
Member

I believe there was an issue of this sort already. Nikola seems not to like sites deployed to a subdirectory. But this needs more investigation.

@Kwpolska
Copy link
Member

Looks like it’s a part of #912. Copy-pasting so it’s in a sane location, also re-titled the issue properly:

Unfortunately, I do not know the gallery code well enough to easily hack on it (without any doubts I will break something, and I know it’s easy to break). @ralsina, mind taking a look?

It looks like we have:

render_gallery_index.img_list → 'galleries/Finestrat/2013-09-06 18.40.30.jpg'
render_gallery_index.thumb    → 'output/galleries/Finestrat/2013-09-06 18.40.30.thumbnail.jpg'

And that probably causes our problem. post.permalink also should be checked, as it also creates incorrect URLs. I think link is our cause because it tries to do /[anything], and then passes that (a) as the return value or (b) to urljoin, which happily fucks it up, ignoring the /zeug/ part.

@Kwpolska Kwpolska mentioned this issue Jan 21, 2014
@da2x
Copy link
Contributor

da2x commented Jan 21, 2014

Is BASE_URL not working in this case? or not set correctly to http://www.abc.com/blog/?

@Kwpolska
Copy link
Member

I have identified the issue before (when the issue came up, might not be up-to-date) to be urljoin doing this: http://abc.com/blog/ + /galleries/ = http://abc.com/galleries/

@Kwpolska
Copy link
Member

>>> urljoin('http://a.com/foo/', 'bar/')
'http://a.com/foo/bar/'

>>> urljoin('http://a.com/foo/', '/bar/')
'http://a.com/bar/'

@da2x
Copy link
Contributor

da2x commented Jan 21, 2014

@Kwpolska, that is because beginning with / refers to root. Should be easy enough to strip out the leading slash.

@Kwpolska
Copy link
Member

We are actually doing this in some places! .lstrip('/') — will fix now.

@ghost ghost assigned Kwpolska Jan 21, 2014
Kwpolska added a commit that referenced this issue Jan 21, 2014
…-subdirectory work

Signed-off-by: Chris “Kwpolska” Warrick <kwpolska@gmail.com>
@Kwpolska Kwpolska reopened this Jan 21, 2014
@Kwpolska
Copy link
Member

not fixed yet, sorry.

@Kwpolska
Copy link
Member

I had to take out a semi-fix because it crashed the tests. It won’t be included in the new release then.

@Kwpolska Kwpolska reopened this Jan 22, 2014
@mike4999
Copy link

i'd like to try and reproduce this, anyone with some public m.w.e.? is it known to affect 9f2d691 ? (current master as of the time im writting)

@Kwpolska
Copy link
Member

@mike4999 yes: it’s all broken everywhere. urljoin is the main culprit, there is also some crazy code in galleries.

@mike4999
Copy link

@Kwpolska so to reproduce all that needs to be done is create galleries in a project with
BASE_URL = http://www.abc.com/blog/ right?

@Kwpolska
Copy link
Member

yes. More things are broken, including links in RSS, sitemaps; see 87bb8b7 for more (possibly not everything)

@ralsina
Copy link
Member

ralsina commented Jan 23, 2014

One way to fix it is write a replacement urljoin that strips the initial "/" in the paths.

@Kwpolska
Copy link
Member

or better, if the input starts with a / it could do it mindful of the base url.

@ralsina
Copy link
Member

ralsina commented Jan 27, 2014

Working on it.

@ralsina
Copy link
Member

ralsina commented Jan 27, 2014

Reproduced.

@max-arnold
Copy link
Contributor

RSS feeds are also broken.

@ralsina
Copy link
Member

ralsina commented Jan 27, 2014

@max-arnold which rss feeds, the gallery ones?

@max-arnold
Copy link
Contributor

Posts for example. http://ar0.me/blog/en/rss.xml http://ar0.me/blog/rss.xml Both SITE_URL and BASE_URL are set to http://ar0.me/blog/. Galleries too.

@smartass101
Copy link
Contributor

This bug has an even more obvious effect on the blog name link on the navbar due to the base template using abs_link("/") to get the blog URL.
I think writing our own urljoin is better and more consistent. It would also require less code change.
Suggested implementation:

def urljoin(base, url):
    if url.startswith("/"): # mind our SITE_URL root
        template = urllib.parse.urljoin("http://base_placeholder/", url)
        return template.replace("http://base_placeholder", base.rstrip("/"))
    else:
        return urllib.parse.urljoin(base, url)

@Kwpolska
Copy link
Member

Probably wasn’t meant to be closed. Reopening.

@Kwpolska Kwpolska reopened this Jan 28, 2014
@ralsina
Copy link
Member

ralsina commented Jan 28, 2014

Indeed, I just fixed a part of it.

@max-arnold
Copy link
Contributor

Also, I'm struggling to set absolute path for favicon. It is located at http://example.com/static/favicon.ico. Blog is located at http://example.com/blog/. If I specify it this way:

FAVICONS = {
    ("shortcut icon", "/static/favicon.ico", "16x16"),
}

actual generated link will become relative static/favicon.ico.

@Kwpolska
Copy link
Member

@max-arnold, workaround (untested but should work): ('shortcut icon', 'http://example.com/static/favicon.ico', '16x16')

@max-arnold
Copy link
Contributor

@Kwpolska It works, but embeds domain name and http scheme into the link. Any chances it will be fixed properly someday?

@ralsina
Copy link
Member

ralsina commented Jan 31, 2014

@max-arnold I am taking a piece-by-piece approach here. Can you open a separate issue describing that specific problem? I promise to fix it in the following day.

@ralsina
Copy link
Member

ralsina commented Jan 31, 2014

@max-arnold I am looking at the favicon issue.

What Nikola is doing is considering "/static/favicon.ico" as meaning "this path inside what Nikola generates". And you want it to mean "this path relative to the site root".

I agree that your reading makes sense and I am trying to figure out a fix.

@ralsina
Copy link
Member

ralsina commented Jan 31, 2014

@max-arnold ok, so, good news, bad news ;-)

Bad news: it's very hard to make the favicon path mean what you want it to mean. Basically, Nikola always generates URLs as if it's the only thing in the world, then makes them relative so the site can be "relocated". That means / is Nikola's output at that point in processing.

Good news: you can use if the site is in /blog/ you can use ../static/favicon.ico and get a better URL

@max-arnold
Copy link
Contributor

Yes, explicit way to produce both absolute and relative links in Nikola would be good.

@ralsina
Copy link
Member

ralsina commented Jan 31, 2014

@max-arnold I suppose we could use some overloaded syntax. For example, we have link:// for links that are converted into configurable destinations. We could use link:///foo to mean "just use /foo"

@max-arnold
Copy link
Contributor

Good news: you can use if the site is in /blog/ you can use ../static/favicon.ico and get a better URL

Thanks, this is better workaround than embedding absolute urls.

@max-arnold
Copy link
Contributor

In theory, to get absolute urls in conf.py we can use something like this, although I don't like another config variable:

BLOG_PREFIX = '/blog/'
BASE_URL = 'http://example.com' + BLOG_PREFIX

SOME_URL = BLOG_PREFIX + 'path/to/page.html'

What to do in templates/posts/articles/galleries is more tricky. Adding new overloaded syntax for links in markdown/rest is probably bad idea.

@Kwpolska
Copy link
Member

The syntax is already implemented, all it needs is:

if dst_url.scheme == 'link':
    if dst_url.startswith('/'):
        return dst_url

Though I personally believe re-using link:// is not the best idea.

@ralsina
Copy link
Member

ralsina commented Jan 31, 2014

@Kwpolska well, more or less that, yes, I just proposed a branch for it. It's a harmless workaround until we can find a better solution, and it may be handy in itself. link is undocumented anyway ;-)

@Kwpolska
Copy link
Member

We should create some more documentation of our internals. It could even sit in the github wiki instead of getnikola.com.

Kwpolska added a commit that referenced this issue Jan 31, 2014
Fix absolute paths in favicons (Issue #986)
ralsina added a commit that referenced this issue Feb 11, 2014
@ralsina
Copy link
Member

ralsina commented Feb 11, 2014

So, I fixed links in galleries and RSS. Are there any other broken links? Or we just close this one?

@Kwpolska
Copy link
Member

Sitemaps.

@ralsina
Copy link
Member

ralsina commented Feb 11, 2014

I don't get that for sitemaps. Setting BASE_URL to http://getnikola.com/blog/ I get these:

<loc>http://getnikola.com/blog/categories/blog.html</loc>

@Kwpolska
Copy link
Member

Actually, no. Sitemaps work. @asmeurer’s site is broken because he set BASE_URL incorrectly…

Though there is still an issue with URL_TYPE set to absolute or full_path (the default setting of rel_path works)

@ralsina
Copy link
Member

ralsina commented Feb 11, 2014

@Kwpolska can you give me a hint of where it fails and how? I can fix these things really quickly but diagnosing takes more time than I have today :-)

@Kwpolska
Copy link
Member

Sure thing!

Well, basically, everything fails. SITE_URL = 'http://getnikola.com/subsite/'

with URL_TYPE = 'full_path' (snippets of index.html):

    <link rel="canonical" href="http://getnikola.com/index.html">
        <a class="navbar-brand" href="http://getnikola.com/">Demo Site</a>
                <li><a href="/archive.html">Archives</a>
                </li><li><a href="/categories/index.html">Tags</a>
                </li><li><a href="/rss.xml">RSS</a>

with URL_TYPE = 'absolute':

    <link rel="canonical" href="http://getnikola.com/index.html">
        <a class="navbar-brand" href="http://getnikola.com/">Demo Site</a>
                <li><a href="http://getnikola.com/archive.html">Archives</a>
                </li><li><a href="http://getnikola.com/categories/index.html">Tags</a>
                </li><li><a href="http://getnikola.com/rss.xml">RSS</a>

No links are correct.

@ralsina
Copy link
Member

ralsina commented Feb 11, 2014

Looks like that "feature" was never tested much. I'd say let's close this bug and open one or two more for those.

@Kwpolska
Copy link
Member

Considering it’s not the default and affecting only willing people, we can do this. For the record, tried this: 3210de1

And travis crashed badly. I took it out because it was right before the release and I didn’t have time to tinker with it and debug.

@ralsina
Copy link
Member

ralsina commented Feb 11, 2014

I can work with that. So, closing this one, and opened #1046

@ralsina ralsina closed this as completed Feb 11, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants