Investigate Server --> Client communication channel support #98881

Open
mshustov opened this issue Apr 30, 2021 · 11 comments
Labels
enhancement New value added to drive a business result Feature:http Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@mshustov
Contributor

mshustov commented Apr 30, 2021

Update 23 July 2023

We seem to have consensus that SSE is the easiest to deploy in our users' environments and provides enough power for our use cases. The biggest risk is that several plugins open an SSE connection simultaneously, leaving only a few connections for other API requests. With the push to remove bfetch we'll be encouraging more users to switch to HTTP/2, but adoption might take time.

If we need to maintain several SSE connections in the short to medium term, we might need to "multiplex" SSE "streams" over a single SSE connection and expose this as a core service. The purpose of this issue is to document and align around a short to medium term plan.
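
For illustration, here is a minimal sketch of what such a multiplexing core service could look like on the browser side, assuming a hypothetical `/api/core/events` endpoint that tags each event with a logical channel name (all names here are illustrative, not an agreed design):

```ts
// Hypothetical client-side core service that shares one SSE connection
// across plugins by demultiplexing events on a "channel" field.
type Handler = (payload: unknown) => void;

export class EventStreamService {
  private source?: EventSource;
  private readonly handlers = new Map<string, Set<Handler>>();

  // Subscribe to a logical stream; the single EventSource is created lazily
  // on first use and reused by every subsequent subscriber.
  subscribe(channel: string, handler: Handler): () => void {
    if (!this.source) {
      this.source = new EventSource('/api/core/events'); // illustrative endpoint
      this.source.onmessage = (e) => {
        const { channel: ch, payload } = JSON.parse(e.data);
        this.handlers.get(ch)?.forEach((h) => h(payload));
      };
    }
    if (!this.handlers.has(channel)) {
      this.handlers.set(channel, new Set());
    }
    this.handlers.get(channel)!.add(handler);
    return () => {
      this.handlers.get(channel)?.delete(handler);
    };
  }
}

// Usage (illustrative): a plugin listens on its own logical stream.
// const unsubscribe = eventStream.subscribe('licensing', (msg) => console.log(msg));
```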


There is a growing number of cases when the Kibana server wants to inform the browser part about an event that occurred in the system. Since the Kibana server doesn't provide this functionality out-of-the-box, Kibana plugins have to work around this limitation with patterns like long-polling or manual request/response batching (the bfetch plugin).

There are at least two potential candidates to implement server-client communication:

  • WebSockets
  • Server-Sent Events
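
For context on why SSE is attractive here: it runs over a plain HTTP response, so no protocol upgrade is involved. A minimal sketch using only Node's built-in http module and the browser EventSource API (endpoint path and port are illustrative):

```ts
// server.ts: minimal SSE endpoint using Node's http module (illustrative)
import { createServer } from 'http';

createServer((req, res) => {
  if (req.url === '/api/events') {
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      Connection: 'keep-alive',
    });
    // Push an event every 5 seconds until the client disconnects.
    const timer = setInterval(() => {
      res.write(`data: ${JSON.stringify({ ts: Date.now() })}\n\n`);
    }, 5000);
    req.on('close', () => clearInterval(timer));
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(3000);

// client (browser): the EventSource API reconnects automatically.
// const es = new EventSource('/api/events');
// es.onmessage = (e) => console.log(JSON.parse(e.data));
```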

We should evaluate risks before introducing one in the Core:

  • proxy and load balancing support
    • ideally, no additional setup required for an intermediate proxy
    • proxy doesn't break communication
  • number of supported parallel connections (might be blocked by lack of http2 support)
  • changes required to the Kibana Security model
  • changes required to the Kibana authentication model

cc @streamich @lizozom

@mshustov mshustov added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc triage_needed enhancement New value added to drive a business result labels Apr 30, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-core (Team:Core)

@joshdover
Contributor

There is a growing number of cases when the Kibana server wants to inform the browser part about an event that occurred in the system.

Any related issues or known use cases we can link to here?

@lizozom
Contributor

lizozom commented May 3, 2021

@joshdover here's one

@mshustov
Contributor Author

mshustov commented May 4, 2021

@joshdover the upcoming Notification service, and the licensing plugin needing to notify the client side about license status updates.
IIRC the Core team faced this problem while working on SO tagging or global search. @pgayvallet do you remember the use case?

@joshdover
Contributor

I think we need to be quite careful about introducing a new networking protocol into Kibana. Our customers deploy Kibana behind a number of different proxies and other systems, and not all are configured to support HTTP/2 and/or WebSockets at this time.

HTTP/2

One major hurdle to introducing HTTP/2 support is the requirement to use TLS. Though TLS is not strictly required by the HTTP/2 spec, all major browsers only allow HTTP/2 connections over TLS.

I suspect that the interactive setup mode project (#89287) may move us closer to being able to require TLS, however we'd still need a long grace period before we could require that all customers enable TLS. We also don't have a fool-proof way to detect how many customers are using TLS since termination could be happening at the load balancer, rather than at Kibana itself.

The connection limit problem really becomes an issue for users who have multiple Kibana tabs open, since the cap (browsers typically allow only ~6 concurrent HTTP/1.1 connections per origin) is enforced across all tabs. It may be interesting to see if we can work around this with a SharedWorker that uses a single dedicated connection shared across multiple tabs, using SSE under the hood. It definitely feels like we're trying to implement HTTP/2 over HTTP/1.1, though, and I'm not optimistic it will work out. For example, workers must copy all data that is passed to windows or other workers, which may be non-trivial overhead.
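
For reference, a rough sketch of that SharedWorker idea: the worker owns the single EventSource and relays events to every connected tab (file name and endpoint are illustrative):

```ts
// shared-events.worker.ts (runs in a SharedWorker scope; all names illustrative)
const ports: MessagePort[] = [];
const source = new EventSource('/api/events'); // the single SSE connection shared by all tabs

source.onmessage = (e) => {
  // Relay to every connected tab; the payload is copied (structured clone) per tab.
  ports.forEach((port) => port.postMessage(e.data));
};

// Each tab that constructs the SharedWorker fires a 'connect' event with its port.
(self as any).onconnect = (event: MessageEvent) => {
  ports.push(event.ports[0]);
  event.ports[0].start();
};

// In each tab:
//   const worker = new SharedWorker('/shared-events.worker.js');
//   worker.port.onmessage = (e) => console.log('server event', e.data);
//   worker.port.start();
```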

I think we really need to consider leveraging HTTP/2 so that mechanisms like bfetch aren't necessary anymore. It may mean a less optimal experience for customers without HTTP/2 support in their stack, but they should be able to fall back gracefully. We may even be able to detect this client-side and use bfetch as a fallback during the transition period to requiring TLS. We can then start to notify them in the UI when they're using HTTP/1.1 and push them to reconfigure their stack to support HTTP/2 for performance improvements.
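
Client-side detection of the negotiated protocol is possible via the Navigation Timing API; a sketch of what such a check could look like (the fallback wiring itself is not shown):

```ts
// Rough sketch: detect whether the current page was served over HTTP/2.
// nextHopProtocol reports 'h2' for HTTP/2 and 'http/1.1' otherwise.
function isServedOverHttp2(): boolean {
  const [nav] = performance.getEntriesByType('navigation') as PerformanceNavigationTiming[];
  return nav?.nextHopProtocol === 'h2';
}

// e.g. pick a transport during client bootstrap (illustrative):
// const transport = isServedOverHttp2() ? 'streaming' : 'bfetch';
```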

For general performance, my vote would be to start supporting HTTP/2 before exploring other, more specialized approaches like WebSockets. Over HTTP/2, long-polling (or even just regular polling) would in theory be much less expensive and more performant, since it reuses a long-lived TCP connection that is already up to full speed, and header compression helps too.

I think trying to exhaust our options with HTTP/2 (and optionally, SSE) would be wise before we look at WebSockets. HTTP/2 is a much more widely supported technology, has a built-in fallback to HTTP/1.1, and requires much less developer education to adopt. It would help Kibana's client-side performance across a wide range of touch points in the product, not least of which is initial page load time.

It's important that we continue to consider how to accommodate our users' deployment environments, but we've also seen that customers who frequently update the Elastic Stack are more likely to be willing and able to upgrade related systems like load balancers and proxies. Typically, the customers who do not upgrade the Stack frequently are the same ones using older proxy configurations that do not support HTTP/2. HTTP/2 is now 6 years old and widely supported.

The primary hurdle remaining is the TLS requirement, but I think we can document and notify our users to guide them towards a more performant Kibana (all while increasing the security of their Stack).

@legrego
Member

legrego commented May 4, 2021

I suspect that the interactive setup mode project (#89287) may move us closer to being able to require TLS, however we'd still need a long grace period before we could require that all customers enable TLS.

++ interactive setup mode is a step in the right direction, but our initial scope of work excludes TLS setup for Kibana's web server. Once we have a setup mode, it'll be less work to add TLS, but the primary reason we removed it from the initial scope is browser trust: we either have to somehow provision certificates that all browsers will trust out-of-the-box (Let's Encrypt is not a silver bullet), or we teach our users to ignore browser security warnings when we present an untrusted certificate (😬)

We also don't have a fool-proof way to detect how many customers are using TLS since termination could be happening at the load balancer, rather than at Kibana itself.

This should be fairly easy to do with client-side telemetry, if that's a route we want to explore. We can't capture telemetry on older versions, but it would give us more than we have today.

@joshdover
Contributor

This should be fairly easy to do with client-side telemetry, if that's a route we want to explore. We can't capture telemetry on older versions, but it would give us more than we have today.

Great point, I've opened an issue: #99229

@pgayvallet
Contributor

but they should be able to fall back gracefully. We may even be able to detect this client-side and use bfetch as a fallback during the transition period to requiring TLS

IMHO the solution should be to have bfetch switch its transport implementation depending on the current capabilities. It currently only supports one transport, let's call it chunked-content. When we support HTTP/2, and if the instance's configuration / infra supports it, it should use SSE instead, and fall back to the current chunked-content otherwise. That way, consumers of the bfetch plugin don't have to care about these implementation details.
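
A sketch of what that could look like as an abstraction, with the concrete transport chosen once based on detected capabilities; the names are illustrative and not the real bfetch API:

```ts
// Illustrative only; not the actual bfetch API.
interface BatchedTransport {
  // Send a batch of requests and emit each result as it arrives.
  send(requests: object[], onResult: (result: object) => void): void;
}

class ChunkedContentTransport implements BatchedTransport {
  send(requests: object[], onResult: (result: object) => void) {
    // Current behaviour: POST the batch and parse newline-delimited chunks
    // from the streamed HTTP/1.1 response body.
  }
}

class SseTransport implements BatchedTransport {
  send(requests: object[], onResult: (result: object) => void) {
    // HTTP/2 path: submit the batch, then receive results over an SSE stream.
  }
}

// Consumers never see the difference; the transport is picked once at setup time.
function createTransport(http2Available: boolean): BatchedTransport {
  return http2Available ? new SseTransport() : new ChunkedContentTransport();
}
```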

@exalate-issue-sync exalate-issue-sync bot added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:small Small Level of Effort labels Nov 4, 2021
@pgayvallet
Contributor

HTTP/2 support has been added, and we know this is the direction we want to go for SSE, so I'll consider the investigations done and close this.

@pgayvallet
Contributor

(@afharo you were right in the end!) Closed too soon - we will use this for our experiments around SSE

@pgayvallet pgayvallet reopened this Jul 11, 2024
@pgayvallet pgayvallet removed loe:small Small Level of Effort impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. triage_needed labels Jul 11, 2024
@tsullivan
Member

There is a growing number of cases when the Kibana server wants to inform the browser part about an event that occurred in the system. Since the Kibana server doesn't provide this functionality out-of-the-box, Kibana plugins have to work around this limitation with patterns like long-polling or manual request/response batching (the bfetch plugin).

I opened a new issue to brain-dump and discuss why I think that long-polling and manual request/response batching will likely continue to be the best strategy for keeping application state in Kibana up-to-date: #189131. Basically, Elasticsearch doesn't (yet) support an event stream that subscribers can listen to. That means polling has to happen somewhere, and it's probably least complex for that polling to happen in the browser client.
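
For illustration, the browser-side polling shape being described is roughly this (route, interval, and state update are placeholders):

```ts
// Minimal browser-side polling loop (illustrative route and interval).
// A recursive setTimeout avoids overlapping requests if one poll runs long.
async function pollStatus(intervalMs = 10_000): Promise<void> {
  try {
    const res = await fetch('/api/my_plugin/status'); // illustrative endpoint
    if (res.ok) {
      const status = await res.json();
      // ...update application state with the fresh status...
    }
  } catch {
    // Network hiccup: skip this cycle and retry on the next tick.
  }
  setTimeout(() => pollStatus(intervalMs), intervalMs);
}

pollStatus();
```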
