Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interpreting activties in timeline #91

Open
cfraz89 opened this issue Dec 12, 2022 · 4 comments
Open

Interpreting activties in timeline #91

cfraz89 opened this issue Dec 12, 2022 · 4 comments

Comments

@cfraz89
Copy link
Contributor

cfraz89 commented Dec 12, 2022

    I am little worried this won't scale with all of the Eventuals.
  • Activity - Scheduled, Completed, Failed, TimeoutOut, HeartbeatTimeout - Has Seq
  • Sleep - Started, Complete - Has Seq
  • Workflow - Schedule, Completed, Failed, TimeoutOut - Has Seq
  • Condition - Started - Has Seq
  • ExpectSingal - Started, Timedout - Has Seq
  • Signal - Received - Has ID
  • Events - Published - Has Seq

I think I would... group by seq and then use heuristics to determine 1. start 2. end and 3. other events
For Signals, just show them on the graph on their own (like start and end).

const groups = groupBy(events.filter(isHistoryEvent), event => event.seq);
const eventuals = Object.keys(groups).map(seq => {
    const started = groups[seq].find(isScheduledEvent);
    const ended = groups[seq].find(or(isCompletedEvent, isFailedEvent, isWorkflowTimedOut));
    const other = groups[seq].filter(event => event !== started && event !== ended);
    return { started, ended, other };
});
const otherEvents = events.filter(or(isSignalReceived, isWorkflowStarted, isWorkflowFailed, isWorkflowCompleted, isWorkflowTimedOut));

Originally posted by @thantos in #84 (comment)

@cfraz89 cfraz89 changed the title I am little worried this won't scale with all of the Eventuals. Interpreting activties in timeline Dec 12, 2022
@cfraz89
Copy link
Contributor Author

cfraz89 commented Dec 12, 2022

The code is matching activities right now, but don't see why it can't be broadened to match any schedule/end event, using the same heuristic style as your sampel code. While I do generally like more declarative code, I think in this case doing the grouping / find thing results in a lot of extra processing, doing a bunch of passes over nested lists. It can just be done in one pass by building the output map as you go:

let spanningActivities = Record<seq, TimelineActivity>{}
const otherActivities = TimelineActivity[]
 events.forEach((event) => {
    if (match(event, isWorkflowStarted)) {
      start = event;
    } else if (match(event, isScheduledEvent)) {
      //creates timelineactivity with start time and event type
      spanningActivities[event.seq] = {type: activityType(event.type), start: event.start}
    } else if (match(event, or(isCompletedEvent, isFailedEvent, isWorkflowTimedOut)) {
    //update timelineactivity with state and end time
     spanningActivities[event.seq].state = {status: activityStatus(event), end: event.end}
    } else if (event.match(or(isSignalReceived, isWorkflowStarted, isWorkflowFailed, isWorkflowCompleted, isWorkflowTimedOut)) {
    otherActivities.push(activityOf(event))
    }
}

return  [...Object.entries(spanningActivities), ...otherActivities]

As far as I can tell the outcome is the same, just a bit lower level, a bit more verbose, but saving a bunch of computation.

@thantos
Copy link
Contributor

thantos commented Dec 12, 2022

After grouping, the nested lists should be small, I doubt you'd be able to come up with a realistic history where you'd tell the performance difference. The groupby is doing most of the work here.

Clean/correct code > micro-optimizations. Then optimize if there is an actual issue.

Your implementation:

  1. relies on the order
  2. will always take the last event that matches the condition (mine takes the first)
  3. depends on both events existing in order (not fault tolerant)
  4. doesn't filter out the start or end event from "other"
  5. ignores the non-state/end events that may relate to an "Eventual"/activity.

If your stateful accumulator can resolve this issues (with more code), then I think it would be fine, but... correct > micro-optimized.

@sam-goodwin
Copy link
Owner

Isn't the maximum size of a "group" 2? So it's not really that much extra processing - the finds are constant time

@thantos
Copy link
Contributor

thantos commented Dec 12, 2022

Maybe not max (ex: we may get a "timeout" and then later get the "complete") or we may have intra-activity events in the future like a heartbeat checkpoint that gets stored in the workflow.

Though I'd constrain the size of a group to an average of < magnitude 10. (aka essentially constant).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants