Upgrading Scenarios #813

Open
cantino opened this issue May 2, 2015 · 11 comments

@cantino
Member

cantino commented May 2, 2015

Currently, Huginn Scenarios contain a set of Agents. They are much like tags, in that an Agent can be “in” many Scenarios at once, and the Scenario serves mostly as a label and as a means of import and export. There are a couple of things that currently make Scenario reuse challenging.

First, it’s challenging to share and reuse Scenarios because they are not variablized, meaning that if I make a RainWarningScenario and give it to you, you have to manually edit the contained WeatherAgent and change the location that it watches, instead of just setting a location variable on the Scenario itself. This gets more difficult with complex Scenarios where necessary configuration may be scattered across a bunch of Agents.

Second, it’s not currently possible to make Scenarios that themselves act like Agents, meaning that you can’t currently create a blackbox “Should Send Alert Scenario”, containing some number of Agents, where some Agent(s) in the Scenario subscribe to outside Events, other contained Agents process them, and finally some Agent(s) emit them again back to the outside world. If this were possible, you could create complex Scenarios and treat them as single abstract Agents in a larger system.
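A minimal sketch of the black-box idea, in plain Python rather than Huginn's Ruby. The `Agent` and `Scenario` classes, the `receivers` list, and the rain-alert handlers are all hypothetical illustrations, not Huginn's data model. The sketch assumes the internal Agent graph is acyclic.

```python
from collections import deque


class Agent:
    """Hypothetical agent: a handler maps one incoming event to a list of emitted events."""

    def __init__(self, name, handler):
        self.name = name
        self.handler = handler
        self.receivers = []  # agents inside the scenario that listen to this one


class Scenario:
    """Hypothetical black box: events enter via input agents and leave via output agents."""

    def __init__(self, name, inputs, outputs):
        self.name = name
        self.inputs = inputs
        self.outputs = outputs

    def receive(self, event):
        emitted = []
        # Every input agent gets a copy of the outside event.
        queue = deque((agent, event) for agent in self.inputs)
        while queue:
            agent, incoming = queue.popleft()
            for out in agent.handler(incoming):
                if agent in self.outputs:
                    emitted.append(out)  # visible to the outside world
                for receiver in agent.receivers:
                    queue.append((receiver, out))  # stays inside the box
        return emitted


# Three internal agents wired parse -> decide -> alert.
parse = Agent("parse", lambda e: [{"chance": float(e["chance"])}])
decide = Agent("decide", lambda e: [e] if e["chance"] >= 0.5 else [])
alert = Agent("alert", lambda e: [{"message": f"Rain likely ({e['chance']:.0%})"}])
parse.receivers.append(decide)
decide.receivers.append(alert)

should_alert = Scenario("Should Send Alert", inputs=[parse], outputs=[alert])
print(should_alert.receive({"chance": "0.8"}))
print(should_alert.receive({"chance": "0.2"}))
```

From the outside, `should_alert` behaves like a single Agent: one event goes in, zero or more come out, and the internal wiring is not visible to the caller.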

Solving these two problems would make Huginn Scenarios much more powerful and reusable.

Proposed changes:

  • Agents can no longer be in multiple Scenarios at the same time.
    • Add migrator that duplicates any Agents that exist in multiple Scenarios, wiring them into the Event flow correctly.
  • Options
    • Scenarios have options that can be edited. All contained Agents have access to these options via a Liquid tag. (Something like scenario_option ‘foo’)
    • Scenarios have editable default_options.
    • When exporting a Scenario, the default_options are included, but the options are not. This way you can share a Scenario without sharing your configuration of it. (Or perhaps inclusion of options is optional?)
  • Event flow
    • One or more Agents can be selected as the “input” Agents of a Scenario.
    • One or more Agents can be selected as the “output” Agents of a Scenario.
    • Add an indexed scenario_id to Event
    • Scenarios that have at least one output Agent show up in Agent's Source lists, so an Agent can listen for all Events from a given Scenario.
    • A Scenario can subscribe to Events from an Agent, and all Agents in the Scenario marked as “input” Agents will receive those Events. (Is there a cleaner way to do this than having it on the Scenario edit page?)
  • Scenarios are displayed in the primary Agent diagram as a single node, or perhaps as a set of nodes with a box around them.
  • Reuse
    • Option 1: Every Scenario has a “prototype” Scenario that is abstract and cannot receive or emit events. You can make “instances” of the base prototype Scenario for use, which cannot be edited, but can receive and emit events.
    • Option 2: All Scenarios can be used and edited. You can make a copy of a Scenario for reuse. When developing a Scenario, it’s up to the user to be careful to avoid having a bunch of different copies with small edits in them. (I think I prefer this option.)
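The Options bullets above could be sketched roughly as follows. This is plain Python, not Huginn code: the `Scenario` class, `render`, and the regex standing in for a Liquid tag are hypothetical, and falling back from `options` to `default_options` is an assumption the proposal does not spell out.

```python
import re


class Scenario:
    """Hypothetical scenario with shareable defaults and private overrides."""

    def __init__(self, name, default_options=None, options=None):
        self.name = name
        self.default_options = dict(default_options or {})
        self.options = dict(options or {})

    def option(self, key):
        # Assumption: a user-set option wins, otherwise the default applies.
        if key in self.options:
            return self.options[key]
        return self.default_options[key]

    def export(self, include_options=False):
        # Defaults are always exported; personal options only on request.
        data = {"name": self.name, "default_options": dict(self.default_options)}
        if include_options:
            data["options"] = dict(self.options)
        return data


# Regex stand-in for a Liquid tag such as {% scenario_option 'location' %}.
# Real Liquid parsing is more involved; this only illustrates the lookup.
TAG = re.compile(r"\{%\s*scenario_option\s+'([^']+)'\s*%\}")


def render(template, scenario):
    return TAG.sub(lambda m: str(scenario.option(m.group(1))), template)


rain = Scenario(
    "RainWarning",
    default_options={"location": "94103"},
    options={"location": "10001"},
)
print(render("Weather for {% scenario_option 'location' %}", rain))
print(rain.export())
```

With this shape, whoever imports the exported Scenario starts from the defaults and sets a single `location` value on the Scenario instead of editing each contained Agent.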

Realistically, I don’t think I have time to build all of this myself, but I do have time to help manage it if I can find some volunteers to collaborate on the effort. I’d also be glad to mentor an intern over the summer on this effort, if anyone here is interested, or knows someone who would be.

Questions and thoughts:

  • This can happen in parallel with the effort to build a community Scenario sharing site (Create a Huginn Scenario directory site, #390). @nerdbaggy
  • How do we make Scenario options work with FormConfigurable? @dsander
  • In addition to Scenario options, do we need a concept of Scenario memory? Should Agents be able to write to shared memory that others can access without using Events? This would facilitate coupled systems for synchronization (“When I post to Twitter, send it to Facebook. When I post to Facebook, send it to Twitter. Please avoid infinite loops!”) and other more complex problem solving. See Making Huginn better at bi-directional sync (#708). @adambedford
  • I’ve been thinking about behavior trees, a concept from video game AI, and thinking about how they could be used in Huginn for more complex systems. However, as a programmer, I’m hesitant to make a larger and larger visual programming environment. (Perhaps what I’d prefer would be an asynchronous programming language with function calls that can encapsulate complex, long-running asynchronous behavior.)
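As a rough illustration of the Scenario memory question, two mirroring Agents could share one store and skip anything that has already been mirrored. Everything here is hypothetical (`ScenarioMemory`, `MirrorAgent`, the `"mirrored"` key), and matching posts by a hash of their text is an assumption made to keep the example small.

```python
import hashlib


class ScenarioMemory:
    """Hypothetical key/value store shared by all Agents in one Scenario."""

    def __init__(self):
        self._store = {}

    def get(self, key, default=None):
        return self._store.get(key, default)

    def set(self, key, value):
        self._store[key] = value


def fingerprint(text):
    # Assumption: two posts with identical text are the same post.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class MirrorAgent:
    """Copies a post to `target` unless shared memory says it was already mirrored."""

    def __init__(self, target, memory):
        self.target = target
        self.memory = memory
        self.sent = []  # stands in for real API calls to the target service

    def receive(self, text):
        seen = set(self.memory.get("mirrored", ()))
        key = fingerprint(text)
        if key in seen:
            return False  # this is the echo of a post we mirrored ourselves
        seen.add(key)
        self.memory.set("mirrored", seen)
        self.sent.append(text)
        return True


memory = ScenarioMemory()
to_facebook = MirrorAgent("facebook", memory)
to_twitter = MirrorAgent("twitter", memory)

print(to_facebook.receive("hello"))  # original tweet, mirrored to Facebook
print(to_twitter.receive("hello"))   # the Facebook copy arrives, loop is cut
```

One consequence of hashing the text is that a deliberate repost of identical content would also be skipped, so a real design would probably want a better identity for posts.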

Feedback and discussion greatly appreciated! @knu, @chriseidhof, @alias1, @andrewcurioso, @virtadpt, @bennlich, @akilism, @albertsun, @CloCkWeRX, @icblenke, @dsander, @gtramontina and everyone else.

@virtadpt
Collaborator

virtadpt commented May 2, 2015

"Agents can no longer be in multiple Scenarios at the same time."

Hear, hear. If one doesn't keep a close eye on which scenario an agent needs to go in (I've adopted the convention of putting " - " in front of each agent's name) it's easy for them to wind up someplace unexpected, which makes debugging suck.

"Options"

This would be useful, sort of like a subset of the Credentials export functionality. Having an option to check or un-check to clear the options on a scenario as it's exported would be nice, to minimize the risk of accidentally outing one's API keys or credentials.

"Event flow"

If I'm reading this right you mean tinkertoying scenarios - GenericRSSInputNetwork, GenericXMPPOutputNetwork, stuff like that.

"A Scenario can subscribe to Events from an Agent, and all Agents in the Scenario marked as “input” Agents will receive those Events."

I'm not entirely sure how useful this bit would be but I'd like to hear from everybody else before forming an opinion.

"Scenarios are displayed in the primary Agent diagram as a single node, or perhaps as a set of nodes with a box around them."

I'm inclined to ++ the latter, if only to make it easier to look at the name of the input variable the agent network requires. The former would be helpful if there was a way to click on or hover over it to glimpse the nodes in that particular network, I think.

"Reuse - Option 2"

This would be my preferred means of implementation because specifics could be tweaked as necessary. For example, setting the "clean" option in instances of the WebsiteAgent because some websites and RSS feeds are only parsable when it's turned on, and some are only parsable when it's turned off. Needing two prototype scenarios set both ways would be clunky to say the least. Being able to clone a scenario would be great under this option.

I like the idea of scenario memory for just the sample you gave. It would be a huge timesaver for crossposting.

As for adding behavior trees, I'd be more inclined to implement those outside of Huginn as separate software systems (virtadpt/exocortex-halo) and expose REST APIs to instances of WebsiteAgent to make better use of system resources and not tax Huginn's scheduler overmuch. These seem specialized enough that they're probably not a good fit for Huginn.

@0xdevalias
Member

Overall sounds like good ideas to me.

I like the idea of the scenario as a whole being a 'black box' with defined inputs/outputs. I think there would be a lot more value in subscribing it to inputs if they had a 'type' though (eg. EventDomain/EventType I talked about a while back)

Displaying on the graph as a single node, but with the ability to 'inspect' it and see its inner agents is a good idea I think.

If we can black box a scenario, and its agents are self contained, for composability we should allow scenarios to use other scenarios within them. Which brings up the question of versioning.

Shared memory for scenarios for sure. This is similar to how the rulesets for a pico can all read/write from the pico PDS (personal data store), or from their own local store (for private memory) in CloudOS and/or https://github.com/welcomer/framework

Don't know enough about behaviour trees to comment, but as far as async language/etc, that's why we chose to use scala and akka for the welcomer framework. Not sure of the parallels in rails-land (though I still have it on my backlog to look into running huginn on jruby / jRoR)

@virtadpt
Collaborator

virtadpt commented May 3, 2015

One thing: It would be handy for scenarios to have some space set aside in the JSON document for internal documentation or arbitrary text, explaining what it was used for, maybe with some versioning or licensing information.

@dsander
Collaborator

dsander commented May 3, 2015

This sounds like a great idea.

Agents can no longer be in multiple Scenarios at the same time.

Shouldn't that be something the user decides? I find it handy to have a source agent in multiple scenarios which contain the agents that are 'processing' the events the source emits.
I could move the source agent out of the scenario, but that would mean an export of the scenario would miss the source agent. If I move the processing agents into one scenario I would end up with a scenario doing two different things.

Add migrator that duplicates any Agents that exist in multiple Scenarios, wiring them into the Event flow correctly.

In my example that would mean you would end up with duplicated source agents that are both doing the same check against an external service. That brings back #543, which we could use to advise the user to restructure their scenarios to avoid having duplicate source agents.

Scenarios have options that can be edited. All contained Agents have access to these options via a Liquid tag. (Something like scenario_option ‘foo’)

You would use the scenario options to simplify the agent configuration? For simple (form configurable) agents it could be enough to expose the autocompletable fields when the scenario is imported.

Options, Event flow

Maybe the Scenario should inherit from Agent (or a slimmed down BaseAgent), so it can be used in the existing event propagation flow and job queues?

@gtramontina
Contributor

I think I like this idea of scenarios being more like a blackbox too...

A few things/questions that come to my mind (I couldn't read the whole thread yet, so forgive me if anything here has been mentioned already):

  1. "Agents can no longer be in multiple Scenarios at the same time." – to be clear, are we talking about agent instances? If they really can't be in multiple scenarios at the same time, but we can communicate (in terms of input/output) between scenarios, we could have a "single-agent" scenario to aid that case.
  2. How (if at all) does this relate to extracting agents into plugins/gems (Initial effort on pulling Agents into gems #554, Moving Agents into local gems #293)?
  3. Could we borrow some ideas from https://github.com/soundcloud/pipeline-generator? I'm thinking more about the way "jobs" are described and connected to each other – and represented visually.
  4. Regarding scenario options and default_options, maybe that could be handled during the scenario installation/import process. Huginn would detect that there are required fields/settings, such as credentials, and ask the user to set it up, because this can be very environment-specific.
  5. We could think of a way of describing the agent graph and event flow with a simple language (I don't know of any off the top of my head, but something like the following – note that this is very pseudo, but based on Neo4j's Cypher syntax):
// -[external_in_event]->(agent_1)-[out_1]->
// -[out_1]->(agent_2)-[out_2]->
// -[out_1]->(agent_3)-[out_3]->
// -[out_2]->(agent_4)                              # note that this agent does not emit anything.
// -[out_3]->(agent_5)-[out_event_to_the_world]->   # would need to find a way to identify external in/out events
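
One hedged way to make that description machine-readable is to store each line as an (input event, agent, output event) triple and derive the rest. The tuple layout and the function names `boundaries` and `links` are hypothetical; only the agent and event names come from the pseudo-syntax above.

```python
# Each triple mirrors one line of the pseudo-Cypher: -[in]->(agent)-[out]->
# None means the agent does not emit anything.
flows = [
    ("external_in_event", "agent_1", "out_1"),
    ("out_1", "agent_2", "out_2"),
    ("out_1", "agent_3", "out_3"),
    ("out_2", "agent_4", None),
    ("out_3", "agent_5", "out_event_to_the_world"),
]


def boundaries(flows):
    """Return (external inputs, external outputs) of the described graph."""
    produced = {out for _, _, out in flows if out is not None}
    consumed = {inp for inp, _, _ in flows}
    # Consumed but never produced: must come from outside.
    # Produced but never consumed: must be meant for the outside.
    return sorted(consumed - produced), sorted(produced - consumed)


def links(flows):
    """Return agent-to-agent edges implied by matching event names."""
    return sorted(
        (source, receiver)
        for _, source, out in flows
        if out is not None
        for inp, receiver, _ in flows
        if inp == out
    )


print(boundaries(flows))
print(links(flows))
```

Under this reading, external in/out events do not need special markers: they fall out of the description as events with no producer or no consumer inside the graph.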

@elvetemedve
Contributor

Just a tiny suggestion: I would like to see the nodes in the diagram view coloured in the same way that scenario labels are shown in the agent view.

@cantino
Member Author

cantino commented Mar 10, 2016

Nice idea @elvetemedve, would you mind submitting a separate issue for this?

@elvetemedve
Contributor

@cantino Here you go: #1345

@axsuul
Contributor

axsuul commented Jun 22, 2017

I love the scenario blackbox idea. In my use case, I've had to duplicate a lot of my scenarios by hand since I'm doing the same thing a lot for slightly different variables (i.e. crawling different pages of a site, just the url is different). Being able to reuse scenarios with different variables would be interesting and powerful.

@thiagotalma
Contributor

Has anyone here been an active user of Yahoo Pipes?
Pipes can give us many good ideas for improving Huginn.

I was a heavy user of Pipes and discovered Huginn when I was orphaned by the death of the project.

Huginn's approach is different from Pipes, but I was able to adapt. Still, I sometimes find myself thinking "it would be great if Huginn did this the way Pipes did."

One of the features that I miss the most is the one that is being discussed here.

Let me give you an example.
I have to monitor multiple feeds, perform specific procedures depending on each item, then merge them into a single feed ensuring uniqueness. The problem is that there are more than 30 feeds and many of the procedures are the same, but they are not all the same.

In Pipes I would create some operators and reuse them for each stream. In Huginn I have to duplicate the entire structure. If there is any modification in the operator I have to make the same modification 30 times (okay, I know there are many ways to solve this, but it's just a didactic example).

Now with this idea I can create the scenarios as operators and reuse them in each of the 30 feeds, just as it could be done in Pipes.

Perhaps this discussion will conclude that a scenario should remain what it is today, a simple label for agents, and that a new concept should be created instead: a reusable, shareable Operator that receives an input, processes it, and returns an output.

So, I'm eagerly tracking the progress on this issue.
It's a shame I'm not a Ruby programmer to help them.

@virtadpt
Collaborator

virtadpt commented Dec 1, 2017

In some of my scenarios, I do pretty much what you describe. The way I handle that use case is like this:

  • Allocate agents that pull the feeds in question.
  • Group those agents by the transformations that need to be done (none, filter, extraction, math, whatever).
  • Feed all of them, where feasible, into a single agent that carries out the transformation.
  • Feed all of the transformations into a Deduplication Agent.
  • Output the Deduplication Agent.

It takes some planning but it's pretty easy once you have the pattern down.
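
The pattern above can be sketched outside Huginn as a small pipeline. This is plain Python with hypothetical names (`run_pipeline`, `feeds`, `groups`, `transforms`), and deduplicating on a `url` field is an assumption chosen for the example, not a statement about how the Deduplication Agent is configured.

```python
def run_pipeline(feeds, groups, transforms):
    """Group feeds by transformation, apply it, merge, then drop duplicates."""
    merged = []
    for feed_name, items in feeds.items():
        # One shared transformation per group instead of one per feed.
        transform = transforms[groups[feed_name]]
        for item in items:
            result = transform(item)
            if result is not None:  # None means the item was filtered out
                merged.append(result)

    # Deduplication step: keep the first item seen for each URL.
    seen = set()
    unique = []
    for item in merged:
        if item["url"] not in seen:
            seen.add(item["url"])
            unique.append(item)
    return unique


feeds = {
    "feed_a": [
        {"url": "u1", "title": "Foo"},
        {"url": "u2", "title": "ad: Bar"},
    ],
    "feed_b": [
        {"url": "u1", "title": "Foo"},
        {"url": "u3", "title": "Baz"},
    ],
}
groups = {"feed_a": "filter", "feed_b": "none"}
transforms = {
    "none": lambda item: item,
    "filter": lambda item: None if item["title"].startswith("ad:") else item,
}

print([item["url"] for item in run_pipeline(feeds, groups, transforms)])
```

Changing a transformation then means editing one entry in `transforms`, which is the property the thread is asking reusable Scenarios to provide.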
