Processors #424

dbarrosop · 2019-08-08T18:51:18Z

Processors implement a set of functions that are called at different stages:

When a task starts - before any of the hosts starts. The event would receive the Task object.
When a task completes - after all of the hosts are completed. The event would receive the Task and the AggregatedResult objects.
When a host starts - for each host, right before starting it. The event would receive the Task and the Host objects.
When a host completes - for each host, right after completing it. The event would receive the Task, the Host and the MultiResult objects.

This allows you to process those events very easily and do things like:

Your own logging/tracing
To add observability
Stream results
Print results
etc...

The idea is that you can have as many processors as you want and all of them will be called.
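A minimal sketch of that idea, not the code in this PR: `run_with_processors`, `LoggingProcessor` and `CountingProcessor` are hypothetical names, and hosts and results are plain strings rather than nornir's `Host` and `MultiResult` objects. The hook names assume the `task_started`/`task_completed`/`host_started`/`host_completed` naming used in this thread.

```python
from typing import Callable, Dict, List


class LoggingProcessor:
    """Collects one log line per event (stand-in for printing or tracing)."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def task_started(self, task: str) -> None:
        self.lines.append(f"task {task} started")

    def task_completed(self, task: str, results: Dict[str, str]) -> None:
        self.lines.append(f"task {task} completed on {len(results)} hosts")

    def host_started(self, task: str, host: str) -> None:
        self.lines.append(f"{host}: started")

    def host_completed(self, task: str, host: str, result: str) -> None:
        self.lines.append(f"{host}: {result}")


class CountingProcessor:
    """Counts completed hosts, to show that several processors can coexist."""

    def __init__(self) -> None:
        self.completed = 0

    def task_started(self, task: str) -> None:
        pass

    def task_completed(self, task: str, results: Dict[str, str]) -> None:
        pass

    def host_started(self, task: str, host: str) -> None:
        pass

    def host_completed(self, task: str, host: str, result: str) -> None:
        self.completed += 1


def run_with_processors(
    task: str, func: Callable[[str], str], hosts: List[str], processors: list
) -> Dict[str, str]:
    """Hypothetical runner: every processor is called on every event."""
    for p in processors:
        p.task_started(task)
    results: Dict[str, str] = {}
    for host in hosts:  # serial here; a real runner may use threads
        for p in processors:
            p.host_started(task, host)
        results[host] = func(host)
        for p in processors:
            # Called as soon as this host is done, before the others finish.
            p.host_completed(task, host, results[host])
    for p in processors:
        p.task_completed(task, results)
    return results


logger = LoggingProcessor()
counter = CountingProcessor()
results = run_with_processors(
    "greet", lambda h: f"hello {h}", ["host1", "host2"], [logger, counter]
)
```

Because `host_completed` fires inside the loop, each processor sees a host's result before the remaining hosts have run.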

As an example I re-implemented print_results using a processor. As you can see the implementation is a bit cleaner.

Another benefit is that, because the functions are called when the events happen, you don't have to wait until all the hosts are completed to get feedback; you can start processing results as they are made available.

ktbyers · 2019-08-08T19:16:47Z

@dbarrosop Can you expand a little bit on the meaning of when a host starts/host completes (that meaning is pretty obvious for a task, but I didn't really follow it for a host)?

dbarrosop · 2019-08-09T06:47:26Z

Ok, added further clarifications to the description. Obviously all of that will be properly documented :)

dbarrosop · 2019-08-17T14:09:30Z

This one is ready, if nobody objects I will merge it next week. I will also release a new version of nornir.

ktbyers

Looks good to me.

A few minor things (I am fine with the current PR though):

The "Processor" term definitely had me scratching my head for a bit on what this was about and for a while I thought it was more oriented towards a task replacement pattern (instead of what it is).

You frequently used the term "Events" so I wonder if something like "Event Processor" or "Results Processor" might make sense for at least the documentation.

Should we use an ABC for the Processor class?
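For illustration only, this is roughly what an ABC could look like. It is a sketch, not what the PR implements: the class layout is an assumption, and the four hook names are taken from the names used in this thread.

```python
from abc import ABC, abstractmethod


class Processor(ABC):
    """Hypothetical abstract base class for processors."""

    @abstractmethod
    def task_started(self, task):
        ...

    @abstractmethod
    def task_completed(self, task, result):
        ...

    @abstractmethod
    def host_started(self, task, host):
        ...

    @abstractmethod
    def host_completed(self, task, host, result):
        ...


class Incomplete(Processor):
    """Implements only one hook, so it cannot be instantiated."""

    def task_started(self, task):
        pass


class Complete(Processor):
    """Implements every hook, so it can be instantiated."""

    def task_started(self, task):
        pass

    def task_completed(self, task, result):
        pass

    def host_started(self, task, host):
        pass

    def host_completed(self, task, host, result):
        pass


try:
    Incomplete()
    incomplete_ok = True
except TypeError:
    # ABCs refuse to instantiate classes with unimplemented abstract methods.
    incomplete_ok = False

complete = Complete()
```

The trade-off is that an ABC forces every processor to define all hooks, even ones it does not care about.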

ktbyers · 2019-08-20T17:12:41Z

docs/tutorials/intro/processors.ipynb

+   "source": [
+    "The first thing you probably noticed is that we got all those messages on screen printed for us. That was done by our processor `PrintResult`. You probably also noticed we got the `AggregatedResult` back but we didn't even bother saving it into a variable as we don't needed it here.\n",
+    "\n",
+    "Now, let's see if `SaveResultToDict` did something to the dictionary `data`L"


Extra trailing L at end of sentence here.

ktbyers · 2019-08-20T17:14:10Z

docs/tutorials/intro/processors.ipynb

+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As you can see, performing various actions on the results becomes quite easy thanks to the processors. You still get the result back but thanks to this plugins you may not needed them anymore.\n",


Minor spelling should be "need" instead of "needed".

ogenstad · 2019-08-23T18:52:33Z

Nice work!

I think it would be nice to also support processors for sub-tasks, currently it's tied to the Nornir object and not for Task.run().

We talked on Slack about maybe moving what's currently the host processors to Task.start().

I'm struggling a bit with the names task_started, task_completed, host_started and host_completed.

I think I would like to replace the current names for host_X -> task_X and have the current task_X be something like run_X or execution_X.

For sub tasks I think it would be fine to only have the start/complete processors be for the main .run(), but then have a start/complete for each task and subtask (I hope this makes sense :) )

What are your thoughts?

ktbyers · 2019-08-24T02:08:17Z

@ogenstad Good points.

I wonder if we should just map it closer to Nornir concepts/naming i.e. something like nornir_start, nornir_complete, task_start, task_complete (where task here is task object so if you did subtasks you would have multiple of these as you referenced above).

dbarrosop · 2019-09-06T14:48:00Z

Subtasks issue fixed.

Re naming, I like the current naming, let me explain why. I think the problem is that we are very ambiguous in our language when it comes to nornir: we call everything a task, but we need to distinguish between a task (the concept of it, the instructions we want to perform) and an actual run/instance of a task on a given host. So I think we need to clarify that and follow suit. My suggestion is to call them:

1. task for the actual concept, i.e., the code/plugins. This aligns with how we already refer to plugins.
2. task_instance (tai for short) for the individual execution of a task on a given host.

(2) has the problem that we will have to rename some parts of the code, for instance, task signatures will change to:

def my_task(tai):
    ...

Thoughts?

ogenstad · 2019-09-06T14:53:54Z

That sounds good to me, my main issue was that the word host was used for what is above described as a task_instance. I think task_instance is much more clear and also works better with sub-tasks.

ktbyers · 2019-09-06T17:34:52Z

I am still worried this naming is going to be confusing.

Say for example, we have:

nr.run(task=foo)

def foo(task):
    task.run(whatever)

And here nr contains host1 and host2, then we would have:

Task Start Event shortly after nr.run is executed.
Task Instance Start Event (host1-foo)
Task Instance Start Event (host2-foo)
Task Instance Start Event (host1-whatever)
Task Instance Start Event (host2-whatever)
Task Instance Completed Event (host1-whatever)
Task Instance Completed Event (host2-whatever)
Task Instance Completed Event (host1-foo)
Task Instance Completed Event (host2-foo)
Task Completed Event i.e. all hosts are done executing

And here by our new terminology we have two Tasks i.e. foo and whatever, but we only have one Task Start processor event.
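A toy reproduction of that count, as a sketch under stated assumptions: `TaskInstance`, `run` and the event strings are hypothetical stand-ins, not nornir code, and hosts run serially, so the per-host events are grouped by host rather than interleaved as in the list above.

```python
from typing import Callable, List

events: List[str] = []


class TaskInstance:
    """Stand-in for one execution of a task plugin on one host."""

    def __init__(self, host: str) -> None:
        self.host = host

    def run(self, plugin: Callable[["TaskInstance"], None]) -> None:
        # Every call, top-level or nested, emits instance start/complete events.
        events.append(f"instance_started {self.host}-{plugin.__name__}")
        plugin(self)
        events.append(f"instance_completed {self.host}-{plugin.__name__}")


def run(plugin: Callable[[TaskInstance], None], hosts: List[str]) -> None:
    """Stand-in for nr.run: one start/complete pair for the whole run."""
    events.append("task_started")
    for host in hosts:  # serial, so hosts are processed one after another
        TaskInstance(host).run(plugin)
    events.append("task_completed")


def whatever(task: TaskInstance) -> None:
    pass


def foo(task: TaskInstance) -> None:
    task.run(whatever)


run(foo, ["host1", "host2"])
task_starts = events.count("task_started")
instance_starts = [e for e in events if e.startswith("instance_started")]
```

Two plugins (`foo` and `whatever`) run on two hosts, giving four instance start events but a single `task_started`.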

Just let me know if I misunderstood this, however. It is definitely possible :-)

I think in practice when explaining this to other people...I would pretty much say, Task Start actually means execution starting or nr run starting.

This stems in part from the dual run methods i.e. the one at the nr.run level and the one at the task instance level.

I am pretty much fine with any of the proposed names, however, as long as the changes don't break existing code...David confirmed to me earlier they don't so we should be good on that front.

I also don't want to drag on too long so we should probably just decide and move on.

dbarrosop · 2019-09-07T13:34:02Z

And here by our new terminology we have two Tasks i.e. foo and whatever, but we only have one Task Start processor event.

That's a good point and it basically highlights we need a better definition of task so let me try again taking into consideration the point you just raised:

1. A task plugin is a function that takes a task instance as first argument and any extra arguments the task requires, for instance my_task(tai, my_arg1, my_arg2), and returns a Result. A task plugin can call other task plugins via tai.run; these calls are referred to as subtasks.
2. A task is an instruction given to a Nornir object to run a task plugin. This is done via the Nornir.run method. Note that, as mentioned in the previous point, a task plugin can call other task plugins via the tai argument; however, these are not considered tasks, they are considered subtasks.
3. A task instance is the actual execution of a task on a given host. If a task plugin contains subtasks, they don't spawn a new task instance and they are considered to be part of the parent task plugin.

This leads to the following observation though: to make this consistent with the processors we need to either revert to the previous behavior, where we didn't call any processors on subtasks, or add a new event specifically for the subtasks. I am going to implement the latter in the meantime but feel free to continue the discussion. I think the lack of definitions plus the will to make things simpler via utility functions (like task.run) have led to a few poor decisions that make things a bit confusing, and I'd love to make things clearer so we can decide what we need to change for a nicer API (thinking about nornir 3).
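To make the definitions concrete, here is a small sketch of a separate subtask event. Everything in it is an assumption for illustration: the hook names `task_instance_started` and `subtask_instance_started`, the `TreeProcessor` class and the `run_task` helper are hypothetical, not taken from the PR.

```python
from typing import Any, Callable, Dict, List


class TreeProcessor:
    """Builds {host: {"task": name, "subtasks": [names]}} from events."""

    def __init__(self) -> None:
        self.tree: Dict[str, Dict[str, Any]] = {}

    def task_instance_started(self, name: str, host: str) -> None:
        self.tree[host] = {"task": name, "subtasks": []}

    def subtask_instance_started(self, name: str, host: str) -> None:
        self.tree[host]["subtasks"].append(name)


class TaskInstance:
    """Stand-in for the `tai` argument passed to a task plugin."""

    def __init__(self, host: str, processor: TreeProcessor) -> None:
        self.host = host
        self.processor = processor

    def run(self, plugin: Callable[..., Any], **kwargs: Any) -> Any:
        # Calls made through tai.run are subtasks, so they get their own event
        # and do not create a new task instance.
        self.processor.subtask_instance_started(plugin.__name__, self.host)
        return plugin(self, **kwargs)


def run_task(
    plugin: Callable[..., Any], hosts: List[str], processor: TreeProcessor
) -> Dict[str, Any]:
    """Stand-in for Nornir.run: one task instance per host."""
    results: Dict[str, Any] = {}
    for host in hosts:
        processor.task_instance_started(plugin.__name__, host)
        results[host] = plugin(TaskInstance(host, processor))
    return results


def say(tai: TaskInstance, text: str) -> str:
    return f"{tai.host}: {text}"


def greet(tai: TaskInstance) -> List[str]:
    # greet is the task plugin; both calls to say are subtasks.
    return [tai.run(say, text="hello"), tai.run(say, text="bye")]


proc = TreeProcessor()
results = run_task(greet, ["host1", "host2"], proc)
```

After the run, `proc.tree` holds one entry per host with the task plugin name and the subtasks it called, in order.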

dbarrosop · 2019-09-08T09:35:15Z

Ok, check the latest version. It needs some cleaning on the docs side (waiting to confirm we like this). You can see in test_processor.py how it works and how you can build a data structure describing the execution of the task.

ktbyers · 2019-09-10T23:16:40Z

This terminology is fine with me.

I will probably end up referring to item 2 as the "primary task" (or some similar adjective to try to make the distinction clearer) as I think there will be quite a bit of confusion on the term "task" (at least until "task instance" and "tai.run()" become a more common pattern).


dbarrosop added 4 commits August 8, 2019 18:23

minor cleaning

49da3c3

poc processors

7f674b5

progress

b8a1466

added missing dep

23cdeee

completed

1e2361f

dbarrosop marked this pull request as ready for review August 17, 2019 14:08

added docs for processors

610668f

dbarrosop added 2 commits August 18, 2019 19:21

revert

813e877

revert

8cf2b4a

ktbyers approved these changes Aug 20, 2019


progress

d5a055a

dbarrosop force-pushed the processors branch from 4fbc271 to d5a055a Compare September 6, 2019 14:45

dbarrosop added 2 commits September 8, 2019 11:32

progress

36a5dc8

progress

95f68b9

dbarrosop added 2 commits September 21, 2019 14:16

asdsad

8506b80

Merge branch 'develop' of github.com:nornir-automation/nornir into pr…

1c8763a

…ocessors

dbarrosop merged commit 200caca into develop Sep 21, 2019

dbarrosop deleted the processors branch September 21, 2019 12:38
