Make Processor Plugins hierarchical #7

DiegoPino · 2020-03-17T19:48:46Z

What is this?

See #5 and #4 and #6

The idea is that Plugins, which are driven by config entities as defined here https://github.com/esmero/strawberry_runners/pull/5/files#diff-6cb3b61e72b132f4e76eaf33127a920e, are not only sorted by weight but can, and should, also be hierarchical. Why? Because we want Post Processor Plugins to be able to work on the outputs of other Post Processors. E.g. one Post Processor extracts files from a PDF, then another uses those files to process HOCR.

How to accomplish this?

Our configuration entity needs more logic. The first step would be to add two properties for that:

Parent
Depth

These would allow us to use something similar to this form in the Entity List Builder, so people can move/drag Plugin instances into a hierarchy: https://api.drupal.org/api/drupal/core%21modules%21system%21tests%21modules%21tabledrag_test%21src%21Form%21TableDragTestForm.php/class/TableDragTestForm/8.7.x

Parent can be NULL (a top-level Post Processor) or the UUID/ID of another Post Processor Config Entity.
Depth can be used to quickly find siblings, etc.
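
A minimal sketch of how Parent and Depth could drive ordering, written in Python only for illustration (the real config entities are Drupal/PHP). The `ProcessorConfig` class, its field names, and the processor IDs are hypothetical stand-ins, not the actual entity definition. Siblings are sorted by weight, and parents always come before their children:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessorConfig:
    """Hypothetical stand-in for a Post Processor config entity."""
    id: str
    weight: int = 0
    parent: Optional[str] = None  # None means a top-level Post Processor
    depth: int = 0


def execution_order(configs):
    """Return configs depth-first: parents before children, siblings by weight."""
    children = {}
    for config in configs:
        children.setdefault(config.parent, []).append(config)

    ordered = []

    def visit(parent_id, depth):
        siblings = sorted(children.get(parent_id, []), key=lambda c: (c.weight, c.id))
        for config in siblings:
            config.depth = depth  # depth is derived from the parent chain
            ordered.append(config)
            visit(config.id, depth + 1)

    visit(None, 0)
    return ordered


configs = [
    ProcessorConfig("hocr", weight=0, parent="pdf_pages"),
    ProcessorConfig("exif", weight=1),
    ProcessorConfig("pdf_pages", weight=0),
]
for config in execution_order(configs):
    print("  " * config.depth + config.id)
```

In this sketch Depth is recomputed from Parent rather than stored independently, which avoids the two values drifting apart; storing it is still useful for quick sibling lookups.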

Our Event Subscriber, which will get all the JSON events or the EVENT itself, then needs better logic to be able to build a tree of execution. That means that if we push data into a QUEUE, ITEM B cannot be processed until ITEM A is done. That is quite complex and we can discuss how to deal with it. Options are (thinking out loud):
A. ITEM A is the one that, as part of its own processing, adds a new QUEUE ITEM for ITEM B. That means each TOP Post Processor (parent/sibling) is responsible for generating its output and also for triggering the next processing step (please ask if this makes no sense).
OR
B. We have many QUEUES, one per Depth, and each QUEUE, once processed, triggers a new Event that sets a flag allowing the NEXT DEPTH to be processed. This can lead to unnecessary processing.


giancarlobi · 2020-03-18T11:56:38Z

@DiegoPino I'm thinking about this. I need to make the workflow clearer in my head. Option A seems to be the right one, but I need to resolve other doubts, such as: do we need to enable plugins per ADO or per as: entry? What if I want to add EXIF data manually? And more. I'll use this afternoon to think about this.

giancarlobi · 2020-03-18T14:42:38Z

@DiegoPino Can we think of a hierarchy of Plugins as a workflow? Then a Plugin can participate in more than one workflow, right? Each workflow starts with a top Plugin, the first of the chain.
Then we have to pair a workflow with a specific as: , right? Or assign a workflow to an ADO type? Or ...
More afternoon thinking about this.

DiegoPino · 2020-03-18T14:45:34Z

@giancarlobi yes, a diagram of the workflow can help. Let me see if I can get something done today.
Option A, after sleeping on it, makes more sense to me. Imagine it like traversing a tree from the root to the leaves. Each processor triggers its next child after processing. So the event really just adds the top-level processor to the queue. Each Processor is then responsible for checking whether there is a child processor that depends on it and, once ready, adding the corresponding element to a queue.
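
A rough sketch of Option A, in Python purely for illustration. The `CHILDREN` map, the processor IDs, and `run_processor` are hypothetical placeholders; the real thing would be a Drupal queue worker. The event only enqueues the top-level processors, and each processed item enqueues its own children with its output as their input:

```python
from collections import deque

# Hypothetical parent -> children map built from the config entities.
CHILDREN = {
    None: ["pdf_pages"],
    "pdf_pages": ["hocr"],
    "hocr": [],
}


def run_processor(processor_id, source):
    """Placeholder for real plugin work; returns a label for the output."""
    return f"{processor_id}({source})"


def on_event(queue, source):
    """The event only enqueues the top-level processors."""
    for processor_id in CHILDREN[None]:
        queue.append((processor_id, source))


def work(queue):
    """Queue worker: process one item, then enqueue its children (Option A)."""
    log = []
    while queue:
        processor_id, source = queue.popleft()
        output = run_processor(processor_id, source)
        log.append(output)
        for child_id in CHILDREN.get(processor_id, []):
            queue.append((child_id, output))  # child consumes the parent's output
    return log


queue = deque()
on_event(queue, "book.pdf")
print(work(queue))
```

Because a child is only enqueued after its parent has finished, ITEM B can never run before ITEM A, and no per-depth flags are needed.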

About adding data manually.

We could need a flag? I would say a signature produced by the queue worker, and then some conditional that says: if ALREADY present and (not sure how to detect this) manual, don't process again? I feel Islandora was not doing derivation right, because there is no way to, for example, trigger only a Thumbnail via the UI, and only if there is no Thumbnail yet. We have no thumbnails of course. But how we decide when processing is needed can be a decision based on what is already there (your own EXIF) plus a file change? I feel we could almost use some type of version control via checksums, like git does? Plus timestamps? Open to ideas, but we need to be consistent when we code and also make sure we document this.
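
To make the checksum idea concrete, here is a small Python sketch. The `record` dict and its keys (`source_checksum`, `manual`) are assumptions for illustration only; nothing like this is defined yet. Processing is skipped when the data was entered manually or when the source file has not changed since the last run:

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 of the source file contents."""
    return hashlib.sha256(data).hexdigest()


def needs_processing(file_bytes, record):
    """Decide whether a derivative should be (re)generated.

    `record` is a hypothetical dict stored next to the derivative, e.g.
    {"source_checksum": "...", "manual": False}; None means nothing exists yet.
    """
    if record is None:
        return True  # nothing there yet
    if record.get("manual"):
        return False  # never overwrite data a person entered by hand
    return record.get("source_checksum") != checksum(file_bytes)


original = b"tiff bytes"
record = {"source_checksum": checksum(original), "manual": False}
print(needs_processing(original, record))
print(needs_processing(b"changed tiff bytes", record))
```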

Thanks my friend

giancarlobi · 2020-03-18T14:56:00Z

> @giancarlobi yes, a diagram of the workflow can help. Let me see if I can get something done today.
> Option A, after sleeping on it, makes more sense to me. Imagine it like traversing a tree from the root to the leaves. Each processor triggers its next child after processing. So the event really just adds the top-level processor to the queue. Each Processor is then responsible for checking whether there is a child processor that depends on it and, once ready, adding the corresponding element to a queue.

@DiegoPino Yes, we really need a workflow diagram.
Perfect, the event just adds the top-level processor of a specific workflow.
Each processor has to know which workflow it is a member of, so it can make the right decision when it ends. Also because each processor could be a member of more than one workflow.

giancarlobi · 2020-03-18T14:59:21Z

About the second question, I'm running my neurons and will answer later 😄

giancarlobi · 2020-03-18T15:24:42Z

> About adding data manually.
>
> We could need a flag? I would say a signature produced by the queue worker, and then some conditional that says: if ALREADY present and (not sure how to detect this) manual, don't process again?

@DiegoPino I'd like something simple, in keeping with the Archipelago philosophy, and SBF JSON based.

Derivatives are generated only if the user needs them; the default is no derivatives
User can enable derivatives via UI, Webform, script, ...
User has to be able to fine-tune the selection of derivatives for each single as:
User has to be able to set a default for all as: of an ADO
User has to be able to request regeneration of derivatives

So, what about a flag in the SBF JSON?
We can set a main flag for all as: at root level and/or a flag at as: level.
The flag stores the processor state (ToDO, DONE, ReDO, ERROR, ...).
The UI/Webform/script writes the flag into the SBF JSON, then the event reads the flags and executes the workflow when required.
When the workflow ends, the flag is set based on the workflow result.
When an ADO as: file is updated, we can suggest that the user regenerate derivatives, but the user has to decide; nothing automatic.
What do you think, is this too manual?
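
A sketch of how such flags could be read and updated, in Python for illustration. The key name `ap:processing`, the simplified `as:image` layout, and the upper-case state strings are all assumptions made for this example, not an agreed format. An entry-level flag overrides the root-level default, and the flag is rewritten with the result once processing ends:

```python
# Hypothetical key name; the thread does not settle on one.
FLAG_KEY = "ap:processing"

# Simplified, assumed SBF JSON shape with one root flag and one entry flag.
sbf_json = {
    FLAG_KEY: "TODO",  # default for every as: entry of this ADO
    "as:image": {
        "urn:uuid:1": {"name": "cover.jpg"},
        "urn:uuid:2": {"name": "scan.tif", FLAG_KEY: "DONE"},
    },
}


def effective_flag(doc, entry):
    """An entry-level flag wins over the root-level default."""
    return entry.get(FLAG_KEY, doc.get(FLAG_KEY))


def run_pending(doc, process):
    """Run `process` on entries flagged TODO or REDO and store the result."""
    for key, entries in doc.items():
        if not key.startswith("as:"):
            continue
        for entry in entries.values():
            if effective_flag(doc, entry) in ("TODO", "REDO"):
                try:
                    process(entry)
                    entry[FLAG_KEY] = "DONE"
                except Exception:
                    entry[FLAG_KEY] = "ERROR"


run_pending(sbf_json, lambda entry: print("processing", entry["name"]))
```

With no flag at either level, `effective_flag` returns `None` and nothing runs, which matches the "default is no derivatives" rule.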

DiegoPino · 2020-03-18T15:54:03Z

I like asking people. I feel automatic and invisible is not the right way. If needed, we can always add a rule-based system that executes derivatives automatically for certain types of users who won't understand what is needed, and, as you say, that can be just a flag that we set as hidden on certain forms. Good!

So we need to decide on those states, right? We need logs + info in the JSON about the status.

About your question of workflows:

YES. So here is how plugins work (imagine Blocks, which are also plugins):
A plugin is just code that does stuff.
Each Plugin, to be useful, needs some settings. So really the settings, provided by the Configuration Entity, are what people see/interact with. The same Plugin can be used by different "config entities".
A Workflow would then be a set of Config entities, all connected to each other, each one triggering the logic of one or more Plugins.
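
The split between plugin (code) and config entity (settings) can be sketched like this, in Python for illustration only. `PdfToImagesPlugin`, `ProcessorConfigEntity`, and the `dpi` setting are invented names, not the Drupal plugin API:

```python
from dataclasses import dataclass, field


class PdfToImagesPlugin:
    """Hypothetical plugin: just code, no settings of its own."""

    def process(self, source, settings):
        return f"{source} -> images at {settings['dpi']} dpi"


@dataclass
class ProcessorConfigEntity:
    """Settings that people see and edit; points at the plugin to run."""
    id: str
    plugin: object
    settings: dict = field(default_factory=dict)

    def run(self, source):
        return self.plugin.process(source, self.settings)


plugin = PdfToImagesPlugin()
# The same plugin reused by two config entities with different settings.
workflow = [
    ProcessorConfigEntity("pdf_low_res", plugin, {"dpi": 72}),
    ProcessorConfigEntity("pdf_high_res", plugin, {"dpi": 300}),
]
for entity in workflow:
    print(entity.run("book.pdf"))
```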

That said, we don't have multiple Workflows right now, so I would suggest one of two things. Either there is another Config Entity that groups all the single Plugin Config Entities into a Workflow (let's say it's named "PDF Processing for Books").

Or we start simple and have no Workflows yet, just a single Workflow. Once we get that running, we add the top wrapper to have named groups of Configurations that run Plugins/Processors.

Let me know if this makes sense?

giancarlobi · 2020-03-18T16:20:07Z

@DiegoPino that really makes sense.
We can start with small single steps, then add more pieces to our puzzle.

a single Plugin == a single Workflow
a simple Flavour Status Flag: NULL (not present) = Do nothing; 0_FLName = To Process FLName; 1_FLName = OK, Done; 2_FLName = Error
Do you like this?

giancarlobi · 2020-03-18T16:21:12Z

Obviously a Flavour Status Flag such as 0_FLName could be expanded into JSON syntax.
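
For example, the expansion could look roughly like this (a Python sketch; the output key names `flavour`, `code`, and `status`, the status labels, and the `exif` flavour name are assumptions, only the 0/1/2 meanings come from the proposal above):

```python
import json

# Codes from the proposal: 0 = to process, 1 = OK done, 2 = error.
STATUS = {0: "to_process", 1: "done", 2: "error"}


def parse_flag(flag):
    """Expand a flag such as "0_exif" into a JSON-friendly dict.

    None (flag not present) means: do nothing.
    """
    if flag is None:
        return None
    code, _, flavour = flag.partition("_")
    return {"flavour": flavour, "code": int(code), "status": STATUS[int(code)]}


print(json.dumps(parse_flag("0_exif")))
```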

DiegoPino · 2020-03-18T16:23:43Z

Nice!

DiegoPino · 2020-12-07T14:12:53Z

Solved!

DiegoPino added the enhancement New feature or request label Mar 17, 2020

DiegoPino mentioned this issue Mar 17, 2020: ISSUE-4: First pass on SBF runners plugin system #5 (Merged)

DiegoPino closed this as completed Dec 7, 2020
