This repository has been archived by the owner on Jan 20, 2022. It is now read-only.

Storm supports multiple output streams, we could use this #678

Open
johnynek opened this issue Jul 21, 2016 · 1 comment

Comments

@johnynek
Collaborator

Consider the case of:

val s1 = src.map(f1)
val s2 = src.map(f2)

s1.sumByKey(store1)
  .also(s2.sumByKey(store2))

In this case, we serialize the data from src twice: once to a node that runs f1 and once to another node that runs f2. In fact, Storm (and, I think, Heron) supports multiple output streams from a node. So a single node could run both f1 and f2 and emit their results on different named outputs. Each store would then subscribe to only one of the two outputs.

This could eliminate a Storm/Heron node from the graph and remove some serialization.
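
To make the idea concrete, here is a minimal toy sketch in plain Scala, with no Storm or Summingbird dependency. Every name in it (`MultiStreamSketch`, `Emitted`, `fusedNode`, `subscribe`, and the stream names "s1" and "s2") is hypothetical and only illustrates the shape of the optimization, not how a planner would actually implement it. In Storm itself, named streams are declared with `OutputFieldsDeclarer.declareStream`, and a consumer subscribes to one of them by passing a stream id to a grouping such as `shuffleGrouping(componentId, streamId)`.

```scala
// Toy model only: all names are hypothetical, nothing here is Summingbird or Storm API.
object MultiStreamSketch {
  // A value tagged with the name of the output stream it was emitted on.
  final case class Emitted(stream: String, value: Any)

  // One fused node: each input is read once, then f1's result goes out
  // on stream "s1" and f2's result goes out on stream "s2".
  def fusedNode[A](f1: A => Any, f2: A => Any): A => List[Emitted] =
    a => List(Emitted("s1", f1(a)), Emitted("s2", f2(a)))

  // A downstream consumer that subscribes to a single named stream.
  def subscribe(stream: String, emitted: Seq[Emitted]): Seq[Any] =
    emitted.collect { case Emitted(`stream`, v) => v }

  def main(args: Array[String]): Unit = {
    val src = List(1, 2, 3)
    val out = src.flatMap(fusedNode[Int](_ + 1, _ * 10))
    println(subscribe("s1", out)) // List(2, 3, 4)
    println(subscribe("s2", out)) // List(10, 20, 30)
  }
}
```

In this model the source data reaches only one node, and each "store" sees only the stream it subscribed to, which is the saving described above.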

/cc @pankajroark @NPraneeth

@pankajroark
Contributor

Yes, this would be good to optimize as well. There are a bunch more places where we can eliminate nodes; it would be good to keep track of them via issues. I'll create more.
