Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ATOM_PATH setting #3015

Merged
merged 8 commits into from Apr 18, 2018
Merged

Add ATOM_PATH setting #3015

merged 8 commits into from Apr 18, 2018

Conversation

@davidak
Copy link
Contributor

davidak commented Apr 9, 2018

Pull Request Checklist

  • I’ve read the guidelines for contributing.
  • I updated AUTHORS.txt and CHANGES.txt (if the change is non-trivial) and documentation (if applicable).
  • I tested my changes.

Description

Add ATOM_PATH setting, similar to RSS_PATH.

Reference #2971

@davidak
Copy link
Contributor Author

davidak commented Apr 9, 2018

I started working on this feature.

Currently i struggle to test it. Maybe Pythons “Development Mode” can help.

@davidak
Copy link
Contributor Author

davidak commented Apr 9, 2018

Now i have a shell with the dependencies installed and nikola installed with setup.py develop.

Nix is a great help as it has this integrated: https://nixos.org/nixpkgs/manual/#development-mode

Just put default.nix in project folder and run nix-shell.

with import <nixpkgs> {};
with pkgs;
with python36Packages;

buildPythonPackage rec {
  name = "nikola";
  src = ./.;
  propagatedBuildInputs = [
    pygments pillow dateutil docutils Mako unidecode lxml Yapsy PyRSS2Gen
    Logbook blinker setuptools natsort requests piexif markdown phpserialize
    jinja2 doit lessc sass
  ];
  checkInputs = [ pytest pytestcov mock glibcLocales pyyaml ];
}

This works at least on Linux, on macOS is a dependency missing i think. Would you like a PR for that?

Now i'm able to build my website, no change in atom path as expected.

@davidak davidak force-pushed the davidak:atom_path branch from 1f356cd to 97f4df2 Apr 9, 2018
@davidak
Copy link
Contributor Author

davidak commented Apr 9, 2018

Now it works!

Please review.

This feature changes the default behavior of the atom link generation. Should that be documented clearer?

@davidak davidak changed the title [WIP] Add ATOM_PATH setting Add ATOM_PATH setting Apr 9, 2018
Copy link
Contributor

felixfontein left a comment

Looks good, except: in case ATOM_PATH was not set by the user, but INDEX_PATH was set, the Atom for the main index is suddenly put somewhere else with this change.
Because of that, I would initialize ATOM_PATH with the value of INDEX_PATH in case that was set by the user (somewhere in nikola.py), and adjust the documentation accordingly. Otherwise, we would be breaking backwards compatibility.

@@ -92,7 +92,8 @@ def get_path(self, classification, lang, dest_type='page'):
"""Return a path for the given classification."""
if dest_type == 'rss':
return [self.site.config['RSS_PATH'](lang)], True

This comment has been minimized.

Copy link
@felixfontein

felixfontein Apr 9, 2018

Contributor

Please change the True to 'auto' (I apparently forgot to change this a long time ago).

@@ -92,7 +92,8 @@ def get_path(self, classification, lang, dest_type='page'):
"""Return a path for the given classification."""
if dest_type == 'rss':
return [self.site.config['RSS_PATH'](lang)], True
# 'page' (index) or 'feed' (Atom)
if dest_type == 'feed':
return [self.site.config['ATOM_PATH'](lang)], True

This comment has been minimized.

Copy link
@felixfontein

felixfontein Apr 9, 2018

Contributor

Please also change the True to "auto" here.

This comment has been minimized.

Copy link
@davidak

davidak Apr 9, 2018

Author Contributor

Is 'auto' ok or exactly "auto"? I think it makes no difference and is better in this case.

This comment has been minimized.

Copy link
@felixfontein

felixfontein Apr 9, 2018

Contributor

Ah sorry, I wanted to write 'auto' here as well :)

@felixfontein
Copy link
Contributor

felixfontein commented Apr 9, 2018

Apropos, I'm ususally testing with pip install -e /path/to/nikola/git/checkout. Not sure how that works with Nix, though.

@davidak
Copy link
Contributor Author

davidak commented Apr 9, 2018

@felixfontein Nix does that, this way pip installs every dependency that is not installed by Nix into /tmp/.... This way you get a temporary development environment without messing up your python install.

When i set

RSS_PATH = INDEX_PATH
ATOM_PATH = INDEX_PATH

the links in the index.html are

<link rel="alternate" type="application/rss+xml" title="RSS" href="blog.xml">
--
  | <link rel="alternate" type="application/atom+xml" title="Atom" href="blog.atom">

Now the RSS and Atom link are similar, but have strange file names. Still breaking backwards compatibility.

With this setting

RSS_PATH = INDEX_PATH + "/rss"
ATOM_PATH = INDEX_PATH + "/index"

the links are

<link rel="alternate" type="application/rss+xml" title="RSS" href="blog/rss.xml">
--
  | <link rel="alternate" type="application/atom+xml" title="Atom" href="blog/index.atom">

I can set this for Atom and leave RSS as is, then it's backwards compatible. But we should change this in 8.0 to have consistent links among RSS and Atom by default.

I can't set it like this in nikola.py

'ATOM_PATH': INDEX_PATH + '/index', # TODO: '' in 8 to be consistent with RSS

NameError: name 'INDEX_PATH' is not defined

I have to sleep now. Day is dawning already.

@davidak davidak force-pushed the davidak:atom_path branch 2 times, most recently from f1a967b to dc46106 Apr 9, 2018
@Kwpolska
Copy link
Member

Kwpolska commented Apr 9, 2018

As you noticed in #3016, Atom feeds work differently to RSS. We generate them with the same data as index.html, only in a different storage format. With that, I’m not sure if this PR/feature request will work out.

'ATOM_PATH': INDEX_PATH + '/index', # TODO: '' in 8 to be consistent with RSS

This feature will be v8-only.

To do this properly, you’d do it after config is loaded from conf.py, and then use something like:

config['ATOM_PATH'] = config['ATOM_PATH'] if config['ATOM_PATH'] else config['INDEX_PATH']
# ninja edit:
config['ATOM_PATH'] = config['ATOM_PATH'] or config['INDEX_PATH']

[Nix stuff] works at least on Linux, on macOS is a dependency missing i think. Would you like a PR for that?

I don’t think we need that. The more typical route (virtualenv + pip install -e .) is enough for most people, and we really don’t want yet another dependency list to update.

@davidak
Copy link
Contributor Author

davidak commented Apr 9, 2018

As you noticed in #3016, Atom feeds work differently to RSS. We generate them with the same data as index.html, only in a different storage format.

@Kwpolska What is the reason for that? It seems not to be useful to subscribe to the index-1.atom, since it will never change. Or is there a different use case for Atom feeds?

With your suggested change, i get href="blog.atom"> when ATOM_PATH = '' and INDEX_PATH = "blog" in config.py. Is that what you want or did you want the past behavior?

That can be achived with this lines added in nikola.py at line 616:

        # Use ATOM_PATH when available
        self.config['ATOM_PATH'] = self.config['ATOM_PATH'] or self.config['INDEX_PATH'] + '/index'

Resulting in href="blog/index.atom">.

I suggest to have the same behavior with RSS and Atom path, at least in 8 since it breaks backwards compatibility.

@Kwpolska
Copy link
Member

Kwpolska commented Apr 9, 2018

That’s the way this feature was written. I suppose the use-case was to have pagination between atom feeds. But perhaps we could clean it up and drop that?

My code snippet was just an untested example. That said, if you’re letting people customize the path, you should also ask for the file extension (some people might want atom.xml or whatever)

@davidak
Copy link
Contributor Author

davidak commented Apr 9, 2018

But perhaps we could clean it up and drop that?

If there is no good reason for this behavior, sure. That's a topic for the other issue.

My code snippet was just an untested example. That said, if you’re letting people customize the path, you should also ask for the file extension (some people might want atom.xml or whatever)

My first idea was to have /rss.xml and /atom.xml for my blog, but after a little research i found out that the .atom file extension is recommended in RFC 4287. So i will leave it index.atom. Should we still let people change it?

What default behavior do you want for this feature?

  1. INDEX_PATH + '/index' (backwards compatible)
  2. index (consistent with RSS)
  3. something else
@Kwpolska
Copy link
Member

Kwpolska commented Apr 9, 2018

I vote for option 1.

fix #2971
@davidak davidak force-pushed the davidak:atom_path branch from dc46106 to 972d660 Apr 9, 2018
@davidak
Copy link
Contributor Author

davidak commented Apr 9, 2018

Thanks. I changed it accordingly.

I noticed that when i set ATOM_PATH = "käse" (with german umlauts), the link in index.html is broken: href="k%C3%A4se.atom">. But i think that's not new and nobody should use such link.

@Kwpolska
Copy link
Member

Kwpolska commented Apr 9, 2018

Are you sure it’s broken? It’s percent-encoding, and looks like UTF-8, so it will work in most sane environments. (Will work on modern Linux. May crash on macOS and Windows if web server is not careful enough.)

@davidak
Copy link
Contributor Author

davidak commented Apr 9, 2018

It works, but look strange. So, no problem.

@felixfontein
Copy link
Contributor

felixfontein commented Apr 10, 2018

I've tried running your branch with one of my blogs, and I got:

TaskError - taskid:render_taxonomies:/index.atom
PythonAction Error
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/doit/action.py", line 424, in execute
    returned_value = self.py_callable(*self.args, **kwargs)
  File "/home/.../nikola/nikola.py", line 2401, in atom_feed_renderer
    with io.open(output_path, "w+", encoding="utf-8") as atom_file:
PermissionError: [Errno 13] Permission denied: '/index.atom'

I didn't set ATOM_PATH, and INDEX_PATH is '' (default value).

After replacing self.config['INDEX_PATH'] + '/index' with os.path.join(self.config['INDEX_PATH'], 'index') (probably doesn't work under Windows, but for a quick test that's enough). After this change, everything worked (and produced the same output as before). So I guess you need to catch the case where self.config['INDEX_PATH'] is empty or ends with a slash. (Too bad os.path.join cannot be used under Windows...)

@tritium21
Copy link
Contributor

tritium21 commented Apr 13, 2018

... Why can't os.path.join be used on windows? As primarily a windows users, who uses the path module quite a bit, I'm quite surprised to hear that a function I use a lot cant be used on the platform I use it on every day...

@Kwpolska
Copy link
Member

Kwpolska commented Apr 13, 2018

@tritium21 you’re missing context. os.path.join works fine on Windows to produce paths that use backslashes. Which can’t be used in URLs.

@tritium21
Copy link
Contributor

tritium21 commented Apr 14, 2018

@Kwpolska if you're building URLs (not for paths for files on disk, which i assumed the context was), then, yes, the os.path alias is not what to use. import posixpath directly (yes, it is always available, even on windows)

@felixfontein
Copy link
Contributor

felixfontein commented Apr 15, 2018

@davidak: you'll need to rebase this. The behavior of RSS_PATH changed (see #3036), so your implementation of ATOM_PATH probably needs to be adjusted accordingly. Actually, this should make the + '/index' part superfluous.

@felixfontein
Copy link
Contributor

felixfontein commented Apr 15, 2018

Looks like I was able to do that myself.

@felixfontein
Copy link
Contributor

felixfontein commented Apr 15, 2018

It now works with my blogs (using the default ATOM_PATH and INDEX_PATH being unset).

@davidak: does the current version also works with your blog?

@felixfontein
Copy link
Contributor

felixfontein commented Apr 15, 2018

Another thing to look at: 987f186

@davidak
Copy link
Contributor Author

davidak commented Apr 17, 2018

@felixfontein i have also tested it with my blog and default RSS_PATH, default ATOM_PATH and INDEX_PATH being unset.

URLs as expected:

<link rel="alternate" type="application/rss+xml" title="RSS" href="rss.xml">
--
  | <link rel="alternate" type="application/atom+xml" title="Atom" href="index.atom">

And also with INDEX_PATH = "blog":

<link rel="alternate" type="application/rss+xml" title="RSS" href="rss.xml">
--
  | <link rel="alternate" type="application/atom+xml" title="Atom" href="blog/index.atom">

and no change with ATOM_PATH = "" in addition to that:

<link rel="alternate" type="application/rss+xml" title="RSS" href="rss.xml">
--
  | <link rel="alternate" type="application/atom+xml" title="Atom" href="blog/index.atom">

with ATOM_PATH = "index" it is:

<link rel="alternate" type="application/rss+xml" title="RSS" href="rss.xml">
--
  | <link rel="alternate" type="application/atom+xml" title="Atom" href="index/index.atom">

with ATOM_PATH = "/" i get:

.  render_taxonomies:/index.atom
########################################
TaskError - taskid:render_taxonomies:/index.atom
PythonAction Error
Traceback (most recent call last):
  File "/nix/store/zfhsdk9f6vn3b9vs1mcja90rkrblk681-doit-0.30.3/lib/python3.6/site-packages/doit/action.py", line 403, in execute
    returned_value = self.py_callable(*self.args, **kwargs)
  File "/home/davidak/code/nikola/nikola/nikola.py", line 2412, in atom_feed_renderer
    with io.open(output_path, "w+", encoding="utf-8") as atom_file:
PermissionError: [Errno 13] Permission denied: '/index.atom'

with ATOM_PATH = "." i get:

<link rel="alternate" type="application/rss+xml" title="RSS" href="rss.xml">
--
  | <link rel="alternate" type="application/atom+xml" title="Atom" href="index.atom">

That was my initial requirement for this feature, great.

davidak and others added 2 commits Apr 17, 2018
@felixfontein
Copy link
Contributor

felixfontein commented Apr 17, 2018

If this is merged after #3043, ATOM_PATH needs to be adjusted as in that PR. Otherwise, if #3043 is merged after this one, it has to be adjusted. Whatever comes last :-)

@felixfontein felixfontein requested a review from Kwpolska Apr 17, 2018
@Kwpolska
Copy link
Member

Kwpolska commented Apr 17, 2018

#3043 was just merged.

Copy link
Member

Kwpolska left a comment

LGTM, except for the fact that ALL_PAGE_DEPS (brand new, needs merge/rebase from master) should now be used instead of GLOBAL_CONTEXT.

@@ -225,6 +225,10 @@
# output / TRANSLATION[lang] / RSS_PATH / rss.xml
# RSS_PATH = ""

# Final location for the blog main Atom feed is:
# output / TRANSLATION[lang] / ATOM_PATH / index.atom
# ATOM_PATH = ""

This comment has been minimized.

Copy link
@Kwpolska

Kwpolska Apr 17, 2018

Member

(FYI, this file does not need to be updated.)

This comment has been minimized.

Copy link
@felixfontein

felixfontein Apr 18, 2018

Contributor

Should we leave this change anyway?

This comment has been minimized.

Copy link
@Kwpolska

Kwpolska Apr 18, 2018

Member

Whatever.

@@ -95,7 +95,8 @@ def get_path(self, classification, lang, dest_type='page'):
self.site.config['RSS_PATH'](lang),
self.site.config['RSS_FILENAME_BASE'](lang)
], 'auto'
# 'page' (index) or 'feed' (Atom)
if dest_type == 'feed':
return [self.site.config['ATOM_PATH'](lang)], 'always'

This comment has been minimized.

Copy link
@Kwpolska

Kwpolska Apr 17, 2018

Member

@felixfontein: why is page number support not implemented for feeds?

This comment has been minimized.

Copy link
@felixfontein

felixfontein Apr 18, 2018

Contributor

Well, that would be a simple change. But:

  1. nobody asked for it so far (I think);
  2. there's still the big open question why the many .atom feed files are needed after all (#3016).

I'd like to see an answer to #3016 first.

@Kwpolska Kwpolska merged commit fda3c35 into getnikola:master Apr 18, 2018
3 checks passed
3 checks passed
Codacy/PR Quality Review Up to standards. A positive pull request.
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@Kwpolska
Copy link
Member

Kwpolska commented Apr 18, 2018

Thanks for contributing, @davidak!

@davidak davidak deleted the davidak:atom_path branch Apr 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.