Research how alerting will render charts on the server side #54987

Closed
mikecote opened this issue Jan 15, 2020 · 12 comments
Labels
Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@mikecote
Contributor

No description provided.

@mikecote mikecote added Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Jan 15, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@peterschretlen
Contributor

Related #49908

@stacey-gammon
Contributor

Implementing #55622 will probably be the solution to this. Then the question just becomes, "how does the alert know the right url". If it's coming from an embeddable, we can create a dedicated embeddable url viewer that takes type and input in the URL and renders it. We could probably do something similar with alerts built off a custom expression language, if we have a generic expression embeddable.

And maybe if the alert is from a regular search, we could build a link to discover showing the results of that search.

I think #25247 will be very useful here.

@pmuellr
Member

pmuellr commented Jan 23, 2020

That seems like a lot of work for a one-time generation of a png to attach to an outgoing alert message, but it does seem like a nice super-general approach!

It's probably even more important to have a relevant link back into the app itself, than a graph, so it would be super-nice if both of those URLs were the same, or you could do path-math on them to calculate one from the other. Eg,

http://.../siem/foo 
http://.../siem/foo?render-image=400x300 
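
A minimal sketch of the "path-math" idea, assuming the image variant differs from the in-app URL only by a `render-image` query parameter, as in the two example URLs above. The function names and the `ImageSize` type are hypothetical and not part of any existing Kibana API.

```typescript
// Hypothetical helpers: the `render-image` parameter is taken from the example
// URLs in the comment above; nothing here is an existing Kibana API.
interface ImageSize {
  width: number;
  height: number;
}

// Derive the image-rendering URL from the in-app (context) URL.
function toRenderUrl(contextUrl: string, size: ImageSize): string {
  const url = new URL(contextUrl);
  url.searchParams.set("render-image", `${size.width}x${size.height}`);
  return url.toString();
}

// Recover the in-app URL from an image-rendering URL.
function toContextUrl(renderUrl: string): string {
  const url = new URL(renderUrl);
  url.searchParams.delete("render-image");
  return url.toString();
}
```

Because each URL can be computed from the other, an alert would only need to store one of them.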

@pmuellr
Member

pmuellr commented Jan 23, 2020

It makes me wonder if there is maybe a lo-fi solution we could implement that would be "good enough" (and extend it later). Basically, most alerts that would want to include a "graph" in their message are likely going to want a 2-d chart of some kind, probably over time, showing their service on fire. And they likely would either have the data for that chart already, or be able to get it easily. Perhaps we could provide just a couple of base charts they could generate: throw the data in there, generate a png as easily as possible, and include that in the message.

For Node, "generate a png as easily as possible" seems to be an unsolved problem; popular packages still seem to use ImageMagick, PhantomJS, or Cairo (wow!?!) as canvas renderers. This used to be my bailiwick, decades ago! Why is this still hard? I've used emscripten'ized versions of Graphviz in the past, and they work great! If we had to generate network graphs, problem solved!

@stacey-gammon
Contributor

> It's probably even more important to have a relevant link back into the app itself, than a graph, so it would be super-nice if both of those URLs were the same, or you could do path-math on them to calculate one from the other.

If we had the URL service, I think I would see this working like :

function fireSlackNotification({ title, message, previewLinkId, previewLinkState, contextLinkId, contextLinkState }) {
  // URL of the standalone view that reporting turns into an image.
  const previewUrl = directAccessLinkPlugin.getLinkGenerator(previewLinkId).createUrl(previewLinkState);
  const png = reportingPlugin.createPng(previewUrl);

  // URL that takes the user back into the app.
  const contextUrl = directAccessLinkPlugin.getLinkGenerator(contextLinkId).createUrl(contextLinkState);

  sendSlackMessage({ previewUrl, contextUrl, title, message });
}

We could also store the actual URLs, instead of the generatorId plus state, but the idea behind storing the id and state is that the plugins that registered those specific link generators can handle migrations, not the plugin developers of the alerting framework (how to migrate all the URL strings stored with alerts?). Like the createUrl functionality would be something like:

class LinkGenerator {
  createUrl(state) {
    // Always bring stored state up to the current version before building the URL.
    const migratedState = this.migrateState(state);
    return buildUrl(migratedState);
  }

  migrateState(state) {
    if (state.version === ...) { return ... }
    if (state.version === ...) { return ... }
  }
}

If you stored the URL string and not the link generator id, then this needs to be handled in the app itself. That is okay, but then you are stuck with supporting those older URLs for a longer time, whereas this way, the alerting team could write a migration that keeps the state up to date.

  return alerts.map(alert => ({
    ...alert,
    previewLinkState: directAccessLinkPlugin.getLinkGenerator(alert.previewLinkId).migrateState(alert.previewLinkState),
  }));

Then the app can deprecate older state versions and our saved objects are up to date, only bookmarked URLs will fail.
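
To make the id-plus-state approach concrete, here is a self-contained sketch, assuming a simple in-memory registry. Every name in it (`LinkGeneratorRegistry`, `VersionedState`, `dashboardLinkGenerator`, the `dashboard` to `dashboardId` rename, and the example.com URL) is hypothetical and only illustrates the shape of the idea, not the actual URL service.

```typescript
// All names below are hypothetical; this is not the Kibana URL service API.
interface VersionedState {
  version: number;
  [key: string]: unknown;
}

interface LinkGenerator {
  migrateState(state: VersionedState): VersionedState;
  createUrl(state: VersionedState): string;
}

class LinkGeneratorRegistry {
  private readonly generators = new Map<string, LinkGenerator>();

  register(id: string, generator: LinkGenerator): void {
    this.generators.set(id, generator);
  }

  getLinkGenerator(id: string): LinkGenerator {
    const generator = this.generators.get(id);
    if (generator === undefined) {
      throw new Error(`No link generator registered for "${id}"`);
    }
    return generator;
  }
}

// Generator owned by an imaginary dashboard plugin. Version 1 stored the id
// under `dashboard`; version 2 renamed that field to `dashboardId`.
const dashboardLinkGenerator: LinkGenerator = {
  migrateState(state: VersionedState): VersionedState {
    if (state.version === 1) {
      const { dashboard, ...rest } = state;
      return { ...rest, dashboardId: dashboard, version: 2 };
    }
    return state;
  },
  createUrl(state: VersionedState): string {
    // Migrate first so that old stored state still yields a valid URL.
    const migrated = dashboardLinkGenerator.migrateState(state);
    const id = encodeURIComponent(String(migrated.dashboardId));
    return `http://example.com/dashboard?id=${id}`;
  },
};

interface StoredAlert {
  previewLinkId: string;
  previewLinkState: VersionedState;
}

// The alerting-side migration: each plugin's generator upgrades its own state.
function migrateAlerts(registry: LinkGeneratorRegistry, alerts: StoredAlert[]): StoredAlert[] {
  return alerts.map((alert) => ({
    ...alert,
    previewLinkState: registry
      .getLinkGenerator(alert.previewLinkId)
      .migrateState(alert.previewLinkState),
  }));
}
```

The point of the design is that the alerting framework never needs to understand the state it stores; it only needs the generator id to find the plugin that does.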

For alerts that came from Embeddables, the view URL would be something generic:

// previewLinkId = `embeddableViewer` - all embeddables can use the same id here...
const url = directAccessLinkPlugin.getLinkGenerator(previewLinkId).createUrl({
  input, type
});
// http://.../embeddableViewer?input=...

but the link back to the embeddable in its original context would be much different:

const url = directAccessLinkPlugin.getLinkGenerator(contextLinkId).createUrl({
  input, type
});
// http://.../kibana/dashboard?....

So if you create a threshold alert from a panel inside a dashboard, the image preview will just be of that single panel, but clicking on the link would send the user back to the whole dashboard that contained that panel.
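
As an illustration of a generic viewer URL that carries `type` and `input`, the sketch below serializes both into query parameters and reads them back. The `embeddableViewer` path follows the example above; the parameter encoding (JSON in the `input` parameter), the function names, and the `EmbeddableViewState` type are assumptions made for this example.

```typescript
// Hypothetical encoding of an embeddable's type and input into a viewer URL.
interface EmbeddableViewState {
  type: string;
  input: Record<string, unknown>;
}

function createEmbeddableViewerUrl(baseUrl: string, state: EmbeddableViewState): string {
  const url = new URL("embeddableViewer", baseUrl);
  url.searchParams.set("type", state.type);
  // The input is arbitrary JSON, so serialize it into a single parameter.
  url.searchParams.set("input", JSON.stringify(state.input));
  return url.toString();
}

function parseEmbeddableViewerUrl(viewerUrl: string): EmbeddableViewState {
  const params = new URL(viewerUrl).searchParams;
  const type = params.get("type");
  const input = params.get("input");
  if (type === null || input === null) {
    throw new Error("Viewer URL is missing the type or input parameter");
  }
  return { type, input: JSON.parse(input) as Record<string, unknown> };
}
```

A real implementation would also have to consider URL length limits for large inputs, which this sketch ignores.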

@pmuellr
Member

pmuellr commented Jan 27, 2020

That all sounds about right.

One thing though - I expect the contextUrl to be fairly long-lived (and include relevant date info), but the previewUrl - in some cases anyway - only needs to exist for the life of an action execution. Eg, for slack and email, we actually want the bits in the png image, which we will "attach" to the outgoing message, and then that url won't ever be used again. For actions that can't embed image data directly (eg, a webhook service that allows you to pass an image as a URL reference), the previewUrl would probably have the same lifetime as the contextUrl.

@peterschretlen
Contributor

peterschretlen commented Feb 10, 2020

Scale and throughput is one area that worries me for rendering alerts. Alerts are time-sensitive, and have spiky loads: one minute there are none, the next 1000 alerts fire in a short period. That’s not a great fit for the current reporting architecture.

The render pipeline for reporting (or any browser based render) is pretty long, something like:
Job -> Generate URL & state -> Load app in Chromium -> Calls made to server to fetch data -> Render DOM -> Screenshot

This pipeline has some bottlenecks, the main ones I think are:

  • “load app” (Loading Kibana just takes a long time)
  • “render dom” (chromium will have throughput limits for how many pages it can load concurrently and the speed it can render them)

Could we remove these bottlenecks? This is not a fully baked idea, but say we could:

  1. Not load the app, and instead render just an "alert chart" as HTML & CSS server side, using ReactDOMServer.renderToStaticMarkup() or some equivalent?
  2. If the input is static HTML & CSS, create a stateless chromium render service with higher throughput? It wouldn’t need to make calls to kibana or ES, or even load any scripts. Then it could be externalized, e.g. run a pool of them behind a load balancer. Or create a serverless implementation.

I’m sure there are practical considerations I’m not accounting for here, but there may be other ways to remove these bottlenecks.

I realize that’s the exact opposite of the generalized reporting discussed above. But whether we use a general mechanism or something specialized to alerting, I think we need to address the bottlenecks as well.
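
To illustrate what "render just an alert chart as static markup" could look like, here is a deliberately small sketch that builds an SVG line chart as a string with no framework at all. This swaps React's static rendering for hand-written SVG so the example stays self-contained; `renderAlertChartSvg`, `Point`, and `ChartOptions` are hypothetical names, and the output would still need a rasterizer to become a png.

```typescript
// Hypothetical, framework-free renderer: builds a static SVG line chart as a
// string. It needs no browser, no scripts, and no calls back to the server.
interface Point {
  x: number;
  y: number;
}

interface ChartOptions {
  width: number;
  height: number;
  threshold?: number; // optional alert threshold drawn as a dashed line
}

function renderAlertChartSvg(points: Point[], options: ChartOptions): string {
  const { width, height, threshold } = options;
  const open = `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}" viewBox="0 0 ${width} ${height}">`;
  if (points.length === 0) {
    return `${open}</svg>`;
  }

  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  if (threshold !== undefined) {
    ys.push(threshold); // keep the threshold inside the visible range
  }
  const minX = Math.min(...xs);
  const minY = Math.min(...ys);
  // Fall back to 1 to avoid dividing by zero when all values are equal.
  const spanX = Math.max(...xs) - minX || 1;
  const spanY = Math.max(...ys) - minY || 1;

  const scaleX = (x: number): number => ((x - minX) / spanX) * width;
  // SVG's y axis points down, so flip it.
  const scaleY = (y: number): number => height - ((y - minY) / spanY) * height;

  const line = points.map((p) => `${scaleX(p.x)},${scaleY(p.y)}`).join(" ");
  const thresholdLine =
    threshold === undefined
      ? ""
      : `<line x1="0" y1="${scaleY(threshold)}" x2="${width}" y2="${scaleY(threshold)}" stroke="red" stroke-dasharray="4"/>`;

  return `${open}<polyline fill="none" stroke="black" points="${line}"/>${thresholdLine}</svg>`;
}
```

Static markup like this is what would make a stateless render service feasible, since the input is fully self-describing.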

@stacey-gammon
Contributor

For 1. the problem is that we have built our components to not require any particular framework, and that is a React-specific solution.

2. is similar - embeddables can render anything.

We could have embeddables "opt-in" to 1. and 2. for improved speed, but what is our fallback? No rendering, or the old reporting infrastructure?

I have thought about 2. as a solution to "shareable embeddables" (#52960). Similar "opt-in" thing, embeddables could convert themselves into a stateless version.

Here is another thought: what if we cached the images created by embeddables? This would technically be easier, and if 1000 alerts fire in a short time, what are the chances that each one renders something different? The considerations there are similar to those for "make it slow": security and relative time ranges.
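
One way the caching idea could look, as a sketch under stated assumptions: the cache key includes a security context and an absolute (already resolved) time range, rounded to a bucket so that alerts firing in the same window share one image. `RenderCache`, `RenderRequest`, `cacheKey`, and the bucket size are all hypothetical, and the cache has no eviction.

```typescript
// Hypothetical render cache; the key fields reflect the security and
// relative-time-range concerns raised in the comment above.
interface RenderRequest {
  embeddableType: string;
  input: Record<string, unknown>;
  // Absolute range in ms, resolved from any relative range such as "last 15 minutes".
  from: number;
  to: number;
  // Identifies what the requesting user may see, e.g. a space or set of roles.
  securityContext: string;
}

function cacheKey(request: RenderRequest, bucketMs: number): string {
  // Rounding to a bucket lets alerts that fire within the same window share an image.
  const from = Math.floor(request.from / bucketMs) * bucketMs;
  const to = Math.floor(request.to / bucketMs) * bucketMs;
  // Note: JSON.stringify is sensitive to key order inside `input`.
  return JSON.stringify([request.securityContext, request.embeddableType, request.input, from, to]);
}

class RenderCache {
  private readonly entries = new Map<string, Promise<Uint8Array>>();
  private readonly bucketMs: number;

  constructor(bucketMs: number) {
    this.bucketMs = bucketMs;
  }

  getOrRender(
    request: RenderRequest,
    render: (request: RenderRequest) => Promise<Uint8Array>
  ): Promise<Uint8Array> {
    const key = cacheKey(request, this.bucketMs);
    const cached = this.entries.get(key);
    if (cached !== undefined) {
      return cached; // also covers renders that are still in flight
    }
    const pending = render(request);
    this.entries.set(key, pending);
    // Drop failed renders so that a later alert can retry.
    pending.catch(() => this.entries.delete(key));
    return pending;
  }
}
```

Storing the promise rather than the finished image means a burst of identical requests triggers only one render, which is the case the 1000-alerts scenario cares about.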

@peterschretlen
Contributor

peterschretlen commented Feb 10, 2020

> For 1) the problem is that we have built our components to not require any particular framework and that is a react specific solution. 2) is similar - embeddables can render anything.

@stacey-gammon I agree doing this across embeddables would be a no-go. I was thinking more of a solution specific to alerting, which could impose some constraints/conventions for these "alert charts" that would appear in the alert creation flyout and the alert details page.

#52960 - exporting the embeddable container - does sound like a better, more general way of handling this.

The caching is a good point - if I'm monitoring a group of N things, they might be shown as N series on the same chart. So if they all alert at the same time, I might not need to render N images. Something to consider when designing these alert charts.

@stacey-gammon
Contributor

Chatted in Slack....

If we base this off the expression language and restrict it to renderers that we know map 1:1 to an elasticchart, we might be able to do something like that. The renderer is the last function in the expression string, so we can split the data functions from the renderer function. The data functions, when executed, can provide the data used to generate the alert trigger. The embeddable could expose an expression string as optional output. If given, you can create an expression alert from this embeddable. This would still mean another reporting infrastructure, and we'd still need the old reports for dashboards and non-expression embeddables, which would be more difficult because they are not backed by a single expression function, and even if they were, the renderer would not map 1:1 with a single renderer.
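
The split described here can be sketched as follows, assuming a pipeline-style expression string whose functions are separated by `|`. This is a naive splitter written for illustration, not the real expression parser: it respects quoted strings but not escaped quotes, and the renderer names in `SUPPORTED_RENDERERS` are made up.

```typescript
// Hypothetical helper: splits a pipeline-style expression on top-level "|"
// and treats the last function as the renderer.
interface SplitExpression {
  dataFunctions: string[];
  renderer: string;
}

function splitExpression(expression: string): SplitExpression {
  const parts: string[] = [];
  let current = "";
  let quote = ""; // the open quote character, or "" when outside a string

  for (let i = 0; i < expression.length; i++) {
    const char = expression.charAt(i);
    if (quote !== "") {
      if (char === quote) quote = "";
      current += char;
    } else if (char === '"' || char === "'") {
      quote = char;
      current += char;
    } else if (char === "|") {
      parts.push(current.trim());
      current = "";
    } else {
      current += char;
    }
  }
  parts.push(current.trim());

  const functions = parts.filter((part) => part.length > 0);
  const renderer = functions.pop();
  if (renderer === undefined) {
    throw new Error("Expression is empty");
  }
  return { dataFunctions: functions, renderer };
}

// Made-up allowlist of renderers assumed to map 1:1 to a chart type.
const SUPPORTED_RENDERERS = new Set(["line_chart", "bar_chart"]);

function canRenderServerSide(expression: string): boolean {
  const rendererName = splitExpression(expression).renderer.split(/\s+/)[0] ?? "";
  return SUPPORTED_RENDERERS.has(rendererName);
}
```

With a split like this, the data functions could be run on their own to evaluate the alert condition, and only expressions ending in an allowlisted renderer would take the fast path.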

@mikecote
Contributor Author

mikecote commented Aug 6, 2020

Closing issue and merging with #49908.

@mikecote mikecote closed this as completed Aug 6, 2020
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022

6 participants