Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for EarlyTasks. #1632

Closed
wants to merge 8 commits into from

Conversation

Projects
4 participants
@felixfontein
Copy link
Contributor

commented Mar 8, 2015

The earlytask_impl branch implements support for EarlyTasks in Nikola, which are tasks that are executed before Nikola scans for posts. See #1562 for more details. After some testing (sorry it took so long) I think it works rather well now, so it's time to start a PR :-)

What this branch improves over current Nikola:

  • there's a better separation between Tasks and LateTasks: LateTasks won't be processed before Tasks aren't done; this makes in particular a difference when multithreading/multiprocessing is used;
  • scanning for posts is now collected to happen at precisely one place: in stage 0, which is before the stage where all Task objects are processed, and after all EarlyTask objects are processed;

What this branch requires:

  • an up-to-date version of doit

What this branch cannot deliver at the moment:

  • nikola build <filename> does not work without further support in doit (see pydoit/doit#20, haven't had time to really work on it yet);
  • nikola build <taskname> only works for tasks named after the plugin itself, and not for (potential) otherwise named tasks created by the plugins;
  • when running nikola in parallel (via -n 4 for example), one has to tell doit (via -P thread) to use threads instead of processes.

(fixes #1553 and #1562)

Review on Reviewable

@felixfontein felixfontein added this to the Whenever milestone Mar 8, 2015

@Kwpolska

This comment has been minimized.

Copy link
Member

commented Mar 8, 2015

How up-to-date is up-to-date? Change the dependency in requirements.txt to match.

Also, can't we tell doit to use threads from code?

@Kwpolska

This comment has been minimized.

Copy link
Member

commented Mar 8, 2015

Also, the tests fail on Travis, which is a very bad thing you absolutely need to fix, especially because there are dependency errors in integration tests.

@felixfontein

This comment has been minimized.

Copy link
Contributor Author

commented Mar 8, 2015

As up-to-date as it can be ;-) I updated the requirement, but before merging we probably have to update it again.

Tell doit to use threads from code: yes, that's the plan. I'll check for a more elegant way later (instead of just inserting -P thread into the command line arguments).

Tests: they probably fail because the new post list rendering code requires the post to be already rendered. I updated the dependencies.

@felixfontein

This comment has been minimized.

Copy link
Contributor Author

commented Mar 8, 2015

Hmm, at the moment I don't see a completely satisfactory way to force the use of threads via code, except by injecting -P thread into the command line arguments. Overwriting _execute is not a good idea (in case the signature of doit's Run._execute every changes), but that's essentially the place where the default value comes from. @schettino72 or do I miss something?

@schettino72

This comment has been minimized.

Copy link
Member

commented Mar 9, 2015

It needs doit 0.27 or above. Apart from setting the minimum version on setup.py and requirements files, please also set minversion.

Please refresh my memory... Why parallel execution with multi-process doesnt work?

You can set (almost) any command line attribute on DOIT_CONFIG, you need to check the source code for the key name as it might be different from the command line parameter name. In your case it is:

DOIT_CONFIG = {
    'num_process': 4,
    'par_type': "thread",
}
@felixfontein

This comment has been minimized.

Copy link
Contributor Author

commented Mar 9, 2015

Ah, I forgot about DOIT_CONFIG. I now added 'par_type' as well as 'minversion'. Thanks for mentioning that!

About multi-processing: the implementation of multi-processing in doit requires to pickle task objects, which doesn't work well with closures (search for "pickled" in #1562). Since Nikola, default plugins and 3rd party plugins use closures in tasks, this could be somewhat problematic. At least for me, it always fails quite early when using multi-processing.

@Kwpolska

This comment has been minimized.

Copy link
Member

commented Mar 15, 2015

You set minversion incorrectly. It should be (0, 27, 0) (a tuple and not a string)

@felixfontein

This comment has been minimized.

Copy link
Contributor Author

commented Mar 16, 2015

Hmm, this looks like a bug in doit (or a Python string problem) to me; after all, the documentation (http://pydoit.org/cmd_run.html#minversion) says you should use a string. I guess the problem is in version_tuple in doit/cmd_base.py:

parts = ver_in.split('.') if isinstance(ver_in, str) else ver_in

in combination with:

from __future__ import print_function, unicode_literals

in nikola/__main__.py, where later DOIT_CONFIG['minversion'] = '0.27.0' is set.

A simple fix (which ignores the main problem) would be to use a tuple, but that somewhat bothers me since it is not clear if that's officially allowed (the documentation only mentions strings, not tuples). @schettino72: should using tuples be officially allowed?

@felixfontein

This comment has been minimized.

Copy link
Contributor Author

commented Mar 16, 2015

(The problem doesn't happen with Python 3, BTW, that's why I never experienced it locally...)

@schettino72

This comment has been minimized.

Copy link
Member

commented Mar 17, 2015

It should support both string and tuple. The documentation on the site is incomplete, docs on source code are good: https://github.com/pydoit/doit/blob/master/doit/cmd_base.py#L14

And there is a bug on python2 it should check for six.string_types not for plain str.
https://github.com/pydoit/doit/blob/master/doit/cmd_base.py#L19

It is an easy fix (patches welcome), but I suggest you guys just use a tuple or you will have to wait for next doit release.

@schettino72

This comment has been minimized.

Copy link
Member

commented Mar 22, 2015

FYI. I fixed the the minversion problem. I also improved the docs saying both strings and tuples are supported. To be released on doit 0.28

pydoit/doit@1b8f8b4

felixfontein added a commit that referenced this pull request Apr 2, 2015

@felixfontein

This comment has been minimized.

Copy link
Contributor Author

commented Apr 2, 2015

Thanks a lot! I'm now using a tuple.

@ralsina

This comment has been minimized.

Copy link
Member

commented Apr 24, 2015

So doit 0.28 is out. What's the status of this? Does it work? Does it have any limitations?

@Kwpolska

This comment has been minimized.

Copy link
Member

commented Apr 24, 2015

It cannot work yet, because we need to fix things to actually support doit v0.28.0. See #1655.

@felixfontein

This comment has been minimized.

Copy link
Contributor Author

commented Apr 27, 2015

The main limitation is still that it is currently not possible to specify targets on the command line, like nikola build output/index.html, and also it is not possible to specify task names which are different from their plugin's name. While I think the second might not be that important, the first one still bothers me quite a lot. How this will resolve depends a lot on what doit will do about it; the current discussion can be found here: pydoit/doit#46

@Kwpolska Kwpolska referenced this pull request May 12, 2015

Closed

Version 8 #1718

26 of 26 tasks complete

@felixfontein felixfontein force-pushed the earlytask_impl branch from ea46730 to 0e8c12f May 17, 2015

@felixfontein

This comment has been minimized.

Copy link
Contributor Author

commented May 17, 2015

I now rebased the earlytask_impl branch to current master; this also includes support for doit 0.28. (The old branch for doit 0.27 is preserved as https://github.com/getnikola/nikola/tree/earlytask_impl_old, just in case.)

Note that with doit 0.28, you still cannot specify targets via nikola build <targetname>, but what you can do (at least with the current doit master) nikola build <taskname>:<targetname>, for example nikola build render_tags:output/categories/demo.xml. (Simply running nikola build render_tags works too, of course.)

@ralsina

This comment has been minimized.

Copy link
Member

commented May 20, 2015

@felixfontein in what ways does this break backwards compatibility? If it really does, then I'll do a 7.4.2 before merging it.

@felixfontein

This comment has been minimized.

Copy link
Contributor Author

commented May 20, 2015

Well, currently what doesn't work is nikola build <file name to build>, i.e. specifying a target directly (see my comment from 3 days ago for more details). Also, running nikola build <name of task> doesn't work when <name of task> doesn't equals the plugin's name. Whether the first one will change (to be backwards compatible) depends on what will be implemented in doit; the second can work if the other task names are explicitly given to doit, there's no support for that implemented in earltask_impl at the moment.

There might be some other minor incompatibilities, in particular with other task plugins than the default ones.

Also, I'd prefer to rebase/merge with master again after when you merged the scanposts refactoring, since these two branches will have some conflicts.

@felixfontein felixfontein force-pushed the earlytask_impl branch from 0e8c12f to 87e9801 May 23, 2015

@ralsina

This comment has been minimized.

Copy link
Member

commented May 23, 2015

The task/target specification limitations, while a change, I don't think is too bad. We could even check that plugins generate tasks with the right name.

@felixfontein

This comment has been minimized.

Copy link
Contributor Author

commented May 25, 2015

Hmm, a problem I just noted: if a task has more than one target, there are two problems:
a) if you write nikola build <taskname>:<targetname>, it won't find the task if you don't specify the first target;
b) nikola check -f marks all other targets except the first one as an unknown file.
Both points also affect current Nikola. Since most task plugins never generate more than one file per task, this is hardly ever noticed. Also, one can get around of a) by leaving the task name away.
(In fact, I only noticed this since one of my plugins generates two files per task -- a HTML file and a CSS file --, and I was surprised to find the second file among the orphans.)

I don't know what to do about a) (@schettino72: any idea?). For b), this is currently a result because doit's list --all command returns a list of tasks in the form <taskname>:<target>, where target is the first target in the list if more than one is given. One has to manually call doit info <taskname>:<target> for every taskname and (first) target to get a list of all targets. Hmm, now that I think about it, both problems (a and b) are somewhat related.

@ralsina

This comment has been minimized.

Copy link
Member

commented May 25, 2015

I suppose nikola check could do something smarter than reading the output of the list command. It just never seemed necessary so far.

If you file a bug I'll get around to it eventually.

@felixfontein felixfontein force-pushed the earlytask_impl branch 3 times, most recently from 01699f4 to dadb91e Aug 15, 2016

@ralsina ralsina modified the milestones: v8.0.0, 7.8.1 Aug 29, 2016

felixfontein added some commits Jan 18, 2015

Split tasks into stages, with three (four) default stages: EarlyTask,…
… Task and LateTask.

Scanning posts is done in additional stage between EarlyTask and Task.
Using delayed loading per plugin task object when cmd.execute_tasks is True, which allows to use `nikola build <task_name>`.
Setting doit minimal version.
Setting doit parallelism mechanism to threading, as multi-processing does not work due to pickling.

@felixfontein felixfontein force-pushed the earlytask_impl branch from dadb91e to b272ac1 Oct 9, 2016

@felixfontein

This comment has been minimized.

Copy link
Contributor Author

commented Oct 9, 2016

At least some of the current failures seem to be caused by the problem described in pydoit/doit#152. I'll take another look when that's fixed.

@Kwpolska Kwpolska modified the milestones: v7.8.1, v7.8.2 Oct 13, 2016

@Kwpolska Kwpolska modified the milestones: v7.8.2, v7.8.3 Jan 8, 2017

@felixfontein

This comment has been minimized.

Copy link
Contributor Author

commented Mar 12, 2017

Does anyone care to keep this? I've stopped using and maintaining this a long time ago (some eight months or so), and it doesn't work well anyway (since doit doesn't implement some required features).

@Kwpolska

This comment has been minimized.

Copy link
Member

commented Mar 12, 2017

👍 for killing it

@Kwpolska Kwpolska closed this Mar 18, 2017

@Kwpolska Kwpolska deleted the earlytask_impl branch Mar 18, 2017

@Kwpolska Kwpolska moved this from In Progress to Ideas in Version 8 Jun 4, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.