Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Providing input data to Actors using a shared context #109

Closed
joachimvh opened this issue Mar 12, 2018 · 20 comments
Closed

Providing input data to Actors using a shared context #109

joachimvh opened this issue Mar 12, 2018 · 20 comments

Comments

@joachimvh
Copy link
Member

Some context: Miel wants to make a Memento actor. All HTTP calls that get executed should be sent to that actor, who then forwards the request to an actual HTTP actor after doing some changes to the request. This itself is actually quite easy in Comunica: there just needs to be a Bus with this Memento actor in between actors that request an HTTP query and those that resolve one.

The problem is that this Memento actor needs some metadata (a date) that is provided at the entry level (e.g., through the command line input). The question now is how to best make sure that date arrives at the Memento actor, preferably in such a way that it arrives together with the actual HTTP requests to make sure the date isn't actually used for requests that should not need this.

Some random ideas:

  1. The one that was discussed the most. Create a general (read-only) "context" object that gets passed to all actors in the call chain. The init actor that parses the command line arguments would then fill in this object. This would require a change to the input interface of the actors to also take the context as input (which is easy) and a change to how all actors currently call their mediator to make sure they pass the context object down the chain (which would be more work). A potential problem is that somehow the init actor needs to use the same keys when writing the context object as the memento actor will use to read out the metadata, so these need to be synced somehow.

  2. (The other ideas are things that were less discussed and maybe just mentioned). Have some sort of shared memory that all actors can access. In this case the context object would be stored there instead of being passed to all the actors. One problem here is that the metadata doesn't arrive at the same time as the query. It might also decrease the independency of all actors?

  3. Have some sort of Bus where this context gets sent on separately. That way all actors that are interested can subscribe to this Bus. This also has the disadvantage of not having the metadata arrive at the same time as the query.

  4. ... (I'm sure there are still many potential solutions. Like overloading the run function of the HTTP actors and other weird stuff).

@rubensworks
Copy link
Member

Option 2 reminds me of the blackboard pattern, which is also often used in cases like these, so we could probably learn some things from that.

@rubensworks
Copy link
Member

It may also be worth to mention the other option of using the query operation context for the most part, and creating/adapting the deeper actor action interfaces that exist between the QPF actor and the HTTP actor. I'll look into this option a bit more in any case, it may be simpler than it sounds.

@RubenVerborgh
Copy link
Contributor

RubenVerborgh commented Mar 12, 2018

The context might conceptually solve it, but what obligation is there for actors to look at the context? For instance, it could be cheaper to ignore the context, so actors would explicitly need to acknowledge that they reckon with it.

I'm thinking of a kind of decorator pattern, where you have a Memento actor as follows:

  1. receives a job X with constraint "datetime = t"
  2. puts the job X back on the bus without the constraint, asking for a solution
  3. some actor Y takes on job X, resulting in n subtasks
  4. the Memento actor sticks the constraint "datetime = t" on all of the subtasks

In any case, I think this mechanism is crucial and deserves a brainstorm meeting.

@joachimvh
Copy link
Member Author

The part where Memento has access to the subtasks would be hard with the current framework, there is no way for it to know what is going on deeper down the pipeline until it gets a result back. For Memento it is not required though, the idea here would be

  1. receives a job X with constraint "datetime = t"
  2. puts the job X back on the bus without the constraint, asking for a solution
  3. some actor Y takes on job X, and returns the solution (being the HTTP result specifically here)
  4. the Memento actor does whatever it needs to do with that result (which could be another call to actor Y).

But the main problem of the issue is how to do step 1, that is: how to get that constraint to the Memento actor.

@RubenVerborgh
Copy link
Contributor

RubenVerborgh commented Mar 12, 2018

I would Step 1 to be really simple, actually. Something like ConstrainedProblem { origProblem, constraints: [ DateTimeConstraint {} ] }.

@rubensworks
Copy link
Member

@RubenVerborgh The constraint approach sounds interesting, but it would probably require refactoring all actors and interfaces to make the actions constrainable actions. So we have to make sure about this before we do it. Also, it would at least require some additional checking mechanism in all actors. (but as @joachimvh has mentioned, it does not fully solve the problem of passing down information to deeper actors in the tree)


For reference, the only interfaces that need to be changed if we just want to pass down the current context deeper are IActionRdfDereferencePaged and IActionRdfDereference.
Instead of both taking a single URL, they would then also take an optional context, just like the query operation actors.
I think this would probably be the easiest and most consistent way of solving this.

@joachimvh
Copy link
Member Author

The problem might be in the future if there are new actors that also need some sort of input from the root level. Only changing the actors that need to change sort of decrease the independency since IActionRdfDereferencePaged now needs to contain a field just because it might be needed further down the road. So I do think that if we do it by passing a context object, it should be changed at the level of the abstract class so all actors have to support context.

@RubenVerborgh
Copy link
Contributor

Indeed; I think we should solve the general problem, not just the Memento case.

@mielvds
Copy link

mielvds commented Mar 13, 2018

Don't let the (big) effort of changing the code hold you back, it's just going to become more work in the long run :)

@rubensworks
Copy link
Member

rubensworks commented Mar 13, 2018

Alright, here are three possible architectures for the constraints. (Note that the query operation context could also be removed in favor of these)

Constrainable Actions

This changes the parameter of run(IAction) and test(IAction) to run(IConstrainableAction) and test(IConstrainableAction).

interface IConstrainableAction {
  public getAction(checker: (constraint: IConstraint) => boolean): IAction throws UnsatisfiedConstraintException
}
type IConstraint = { type: string; value: any };

The callback would be used to iterate over all constraints (possibly none) in this IConstrainableAction, and throw an exception if one of them is not satisfied. Otherwise, the action is returned.

This will make it so that all actors now will be forced to check the constraints before they can read the actual action.

Pros

  • Constraint checking is forced.

Cons

  • Will require a lot of refactoring.
  • No way of doing partial constraint checking and passing down the other constraints.

Partial Constrainable Actions

This also changes the parameter of run(IAction) and test(IAction) to run(IPartialConstrainableAction) and test(IPartialConstrainableAction).

interface IPartialConstrainableAction {
  public getPartial(partialChecker: (constraint: IConstraint) => boolean): IPartialConstrainableAction
  public getAction(checker: (constraint: IConstraint) => boolean): IAction throws UnsatisfiedConstraintException
}
type IConstraint = { type: string; value: any };

The partialChecker callback would be used to iterate over all constraints (possibly none). All constraints that were not satisfied, will be included in the returned action as well.
The other method is the same as in IConstrainableAction, and should be used in the final actors that will not pass on the action further down anymore.

This will make it so that all actors now will be forced to check the constraints before they can read the actual action, and they can do partial constraint checking.

Pros

  • Constraint checking is forced.
  • Compatible with partial constraint checking, as the remaining constraints can be passed down.

Cons

  • Will require a lot of refactoring.

Constraints Field

This requires not changes to the run and test methods. It only requires an additional field in the IAction interface (which is currently empty).

interface IAction {
  public constraints: IConstraint[];
}

In this case, the actors are responsible for checking the constrainst themselves.

Pros

  • Simpler and less invasive changes
  • Compatible with partial constraint checking, as the remaining constraints can be passed down.

Cons

  • As the responsibility now lies in the actor itself, developers could forget to check the constraints, which would lead to bugs.

Conclusion

I think Partial Constrainable Actions is the best long-term solution of these 3.

@rubensworks rubensworks mentioned this issue Mar 14, 2018
@rubensworks
Copy link
Member

I think we'll still need a plain context object as well (possible added to IAction as well) for things that are not a constraint, but merely optional data.

A logger (#114) is one example of something that needs to be passed down, but is not a constraint.

@rubensworks rubensworks removed this from To Do in Development 1.0.0 Mar 14, 2018
@rubensworks rubensworks added this to To Do in Development Major Mar 14, 2018
@joachimvh
Copy link
Member Author

I'm slightly confused now. Is this now to solve the problem of getting the input context to the Memento actor, or to do the restraint thingy Ruben V mentioned? And where would these constraints come from and be defined?

Would it be possible to show in a small example how this would be applied to the Memento problem for example?

@rubensworks
Copy link
Member

Is this now to solve the problem of getting the input context to the Memento actor, or to do the restraint thingy Ruben V mentioned?

Both :-) The Memento use case could be solved using constraints as I see it. But not sufficient for all cases (such as logging), for which a context would still be required. These constraints would be defined by the engine caller (as the root call will also be a constrainable context).

For the Memento use case, this could look like this:

actorInitSparql.run(new ConstrainableAction({ query: 'SELECT ...' }, [ { type: 'memento:datetime', value: '...' } ]));

The Memento actor would then be subscribed to the HTTP bus, where it accepts action with the memento:datetime constraint, and transform it to an action on the bus where this datetime has been transformed to an additional HTTP header.

@joachimvh
Copy link
Member Author

It's the getAction function that is throwing me off. So a ConstrainableAction contains an Action and a list of constraints (key/value pairs), right? But what should an Actor provide as input for the getAction function? I assume most of them would just provide null as an input since they have no restrictions on the input? And Memento would then call input.getAction((constraint) => constraint['memento:datetime])? I just don't see how this is different from passing a context (or list of constraints) and checking if the required key in the context is present when it is required (i.e., Memento checks if its key is in there, if it is it adds the new header), which is what the Constraints Field option is.

So I guess I just don't really understand yet how the getAction functions is supposed to work and who is supposed to call it with what parameters, which might be clear from all my sentences ending with a question mark. :P

@rubensworks
Copy link
Member

Sure, the actor could still implement the checker in an incorrect/unsafe way, which would make it similar to the context.

The alternative would be to move this checking behaviour to the Actor class, but even then, stuff might be implemented wrongly, there's probably no way around that.

Perhaps it's best to not complicate things too much (as people can break it anyways if they really want to), and just add a context and constraints field to the IAction interface, and explain in jsdoc how they should be used.

@mielvds
Copy link

mielvds commented Mar 27, 2018

@joachimvh @rubensworks any resolution for this? Not having Memento would break compatibility with the archives, which is not problematic, but a pity.

@joachimvh
Copy link
Member Author

I discussed this with @rubensworks in our last meeting. We'll probably look into some way to pass a context-like thing down the pipeline (which is doable since all the query actors already do this). But will be for after the release.

@rubensworks rubensworks added this to To do in Development 1.2.0 via automation Jun 13, 2018
@rubensworks rubensworks removed this from To Do in Development Major Jun 13, 2018
@rubensworks rubensworks changed the title Providing input data to Actors Providing input data to Actors using a shared context Jun 13, 2018
@rubensworks
Copy link
Member

Note to self: for this context we'll need a proper namespacing strategy to avoid entry name conflicts for different usages. For instance, we could require the bus/actor package name as a prefix.

@mielvds
Copy link

mielvds commented Jul 11, 2018

@rubensworks what's the status on this?

@rubensworks
Copy link
Member

This is the third issue on the dev 1.2.0 project board, so within the next couple of weeks I guess. Unless other major maintenance issues would arise.

rubensworks added a commit that referenced this issue Jul 24, 2018
This is an abstraction of the query operation context,
and allows data to be passed to any actor.

Closes #109
rubensworks added a commit that referenced this issue Jul 26, 2018
This is an abstraction of the query operation context,
and allows data to be passed to any actor.

Closes #109
rubensworks added a commit that referenced this issue Jul 27, 2018
This is an abstraction of the query operation context,
and allows data to be passed to any actor.

Closes #109
Development 1.2.0 automation moved this from To do to Done Jul 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

No branches or pull requests

4 participants