Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streaming: support streaming and a javascript test datasource (v2 observer) #16729

Merged
merged 31 commits into from
Apr 25, 2019

Conversation

ryantxu
Copy link
Member

@ryantxu ryantxu commented Apr 23, 2019

This PR makes streaming support a core feature. Rather than:

image

It now looks something like:
image

Diagram: https://www.draw.io/#G19wNQ1N141Unx9v8NBF2UV88_UT82UDRL

@ryantxu ryantxu changed the title Streaming: support streaming and a javascript only test datasource (v2 observer) Streaming: support streaming and a javascript test datasource (v2 observer) Apr 23, 2019
Copy link
Member

@torkelo torkelo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than returning a result that the runner subscribes to, this passes an observer with the datasource query. This greatly simplifies the implementation overhead:

For the data source implementer or the code in the PanelQueryState? Passing in the observer looks clean but having to also return a streams events with the shutdown makes it feel a bit disjointed. But I do see that some of the OpenStreams logic is now gone in this version.

Which one do you prefer?

* When streaming, this must return the list of expected events
* other events will be ignored and shutdown
*/
streams?: DataStreamEvent[];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the naming here feels very confusing, the obserer takes events, but the request method should return expected events? Is this more like a DataStreamState rather than an event?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the part I am least happy with -- yes DataStreamState is better.

The key issues we need to solve are:

  • keeping consistent order as stream events arrive (otherwise the series colors change!)
  • surviving a refresh event intelligently
  • closing streams reliably -- when done or query changes

In the approach above, we:

  1. return a list of DataStreamState from the ds.query() -- close open streams that don't match open keys.
  2. as events arrive, it checks that they match a key from the latest response, otherwise close them

I would much rather keep the ds.query() result unchanged, and somehow manage order and stale streams with just the callbacks and query method. I think its possible, but have not figured it out yet :)

@ryantxu
Copy link
Member Author

ryantxu commented Apr 23, 2019

Which one do you prefer?

v2 is easier for most datasource implementations: you get passed a function and start writing values to it. However the semantics of how you shutdown/unsubscribe the server are still akward.

The v1 approach is easier if the datasource is backed by a rxjs Subject, but it adds extra management overhead to simple cases where there is really just one observer (the PanelQueryRunner).

I am focused on v2 now since it feels simpler/cleaner, but have left v1 because it works 😄and is closest to the existing streaming support.

…a-alt

* grafana/master:
  Refactor: Make SelectOptionItem a generic type to enable select value typing (grafana#16718)
  docs: fix upgrade instructions
  Chore: Small improvements to grafana-cli (grafana#16670)
  Chore: Use x/xerrors instead of pkg/errors (grafana#16668)
  Chore: a bit of spring cleaning (grafana#16710)
  Fixes grafana#15863 (grafana#16684)
  Docs: Update notification services (grafana#16657)
  PanelQueryRunner: add datasource name to queries (grafana#16712)
@ryantxu
Copy link
Member Author

ryantxu commented Apr 23, 2019

@torkelo -- I now feel much better about the state of this PR! The main trick was hiding the aggregation details in PanelQueryState#getPanelData() -- it takes a simple path when not streaming and a more complicated one when it is.

I think we should:

  1. do whatever minor cleanup/changes you think make sense
  2. merge
  3. focus on: MetricsPanelCtrl: use shared queryRunner to support query execution #16659

Meanwhile, I will:

Copy link
Member

@torkelo torkelo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Looks like it has problems when switch dashboard and come back, streaming does not resume. Might be a problem in StreamHandler or in the closing logic. Will investigate later today I hope
  • Looks like the streaming triggers renders for non visible panels (results in Graph errors when you have 2 panels and edit one of them and the other is streaming and triggers renders). Will try to investigate later.


// Add the stream to our list
let found = false;
const active = this.streams.map(s => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wonder if streams would work better as object map (by key) instead of an array. Maybe makes other stuff clunky, not sure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried map... But consistent order is important. And we have to loop on the array anyway. When you add to theap it should not change the order of existing series

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, makes sense, array it will be then 👍


const request = this.stream.request;
data.meta = {
request: request.requestId,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't find this used anywhere, there is a requestId prop on QueryResultMeta as well

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch -- i just removed it. This is an artifact of maintaining #16676

I'll revist this later -- the meta.requestId (not request!) gets used to pick the right sub-request to show in the QueryEditor -- it shows thing like how long it has been running etc.

@torkelo
Copy link
Member

torkelo commented Apr 24, 2019

Looks like it has problems when switch dashboard and come back, streaming does not resume. Might be a problem in StreamHandler or in the closing logic. Will investigate later today I hope

Fixed this problem, was a StreamHandler that tried to reuse workers that had been unsubscribed (so had no observer).

Looks like the streaming triggers renders for non visible panels (results in Graph errors when you have 2 panels and edit one of them and the other is streaming and triggers renders). Will try to investigate later.

This problem is a bit trickier. We have to somehow block renders in PanelChrome when receiving data updates. One solution was to skip setState when not visible but save state and do it on next render (render triggeres when exiting fullscreen / edit mode)

// if we are getting data while another panel is in fullscreen / edit mode
// we need to store the data but not update state yet
this.delayedStateUpdate = stateUpdate;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After #15554, we could consider adding something like:
panel.getQueryRunner().enableStreamEvents( this.props.isInView ) when isInView changes:
https://github.com/grafana/grafana/pull/15554/files#diff-791865e8ee91eb8f3e2cfa67eff71ce5R96

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if the panel query runner is shared with another panel that is visible? :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better idea!

We just unsubscribe() when isInView becomes false, and re-subscribe when isInView becomes true. subscribe() immediately updates the state with any changes that happened while disconnected.

As implemented, the streams stay open even if everyone has disconnected. They only get closed if the next query/refresh does not match any open streams.

Lets wait till #15554 for this change since that PR makes visibility part of the state :)

…into streaming-test-data-alt

* 'streaming-test-data-alt' of github.com:ryantxu/grafana:
  PanelChrome: delay state update while other panel is in fullscreen/edit
  TestData: don't reuse unsubbed stream workers
  PanelQueryState: give code a bit of air (space)
  PanelQuery: Minor refactoring
  sqlstore: use column name in order by (grafana#16583)
  user friendly guide (grafana#16743)
  Provisioning: Interpolate env vars in provisioning files (grafana#16499)
  admin: add more stats about roles (grafana#16667)
  Feature: Migrate Legend components to grafana/ui (grafana#16468)
  playlist: fix loading dashboards by tag (grafana#16727)
  CloudWatch: Use default alias if there is no alias for metrics (grafana#16732)
  Provisioning: Support FolderUid in Dashboard Provisioning Config (grafana#16559)
@ryantxu
Copy link
Member Author

ryantxu commented Apr 25, 2019

I made one more change -- I did not like that the check for isRunning was based on the state variable. The usage is to check if we have an active open Promise... so I changed it to check that explicitly.

@torkelo
Copy link
Member

torkelo commented Apr 25, 2019

Argh, so close to merge this but a few things with the PanelQueryRunner and PanelQueryState is bugging me, feel it very hard to read and understand some parts of this code.

  1. Complex call chains. PanelQueryState has the dataStreamObserver that calls this.streamCallback that is set externally form PanelQueryRunner that points to throttled function that then calls back into PanelQueryState.getPanelData() (and then publish the result on the subject). Since the PanelQueryRunner owns the subject this roundabout is needed.
  2. PanelQueryState.getPanelData() has side effects (updates the stream array). Maybe this can be moved from getPanelData to dataStreamObserver?

@ryantxu
Copy link
Member Author

ryantxu commented Apr 25, 2019

  1. Complex call chains.
    The main issue here is needing to process all requests from the datasource, but throttle the updates to any subscribers.

The streamCallback is weird, but makes this distinction easier to support

  1. dataStreamObserver is only called when it gets data from a stream -- getPanelData is called whenever you want data. so it makes sure the data is valid then returns it. Importantly, this happens after a query executes and within the throttled query callback. I can rename it to validateStreamsAndGetData() (or something similarly wordy) and maybe that will help explain the state manipulation in the getter

@ryantxu
Copy link
Member Author

ryantxu commented Apr 25, 2019

feel it very hard to read and understand some parts of this code.

Ha! it was hard to understand and write parts of this code :)

…into streaming-test-data-alt

* 'streaming-test-data-alt' of github.com:ryantxu/grafana:
  PanelQueryRunner: added more tests
@ryantxu ryantxu merged commit 470634e into grafana:master Apr 25, 2019
@ryantxu ryantxu deleted the streaming-test-data-alt branch April 25, 2019 18:01
ryantxu added a commit to ryantxu/grafana that referenced this pull request Apr 25, 2019
…MetricPanelCtrl

* grafana/master: (34 commits)
  Streaming: support streaming and a javascript test datasource (grafana#16729)
  GraphLegendEditor: use stats picker rather than switches (grafana#16759)
  Feature: add cron setting for the ldap settings (grafana#16673)
  Build: Disables gosec until identified performance problems (grafana#16764)
  Chore: bump jQuery to 3.4.0 including prototype pollution vulnerability fix (grafana#16761)
  elasticsearch: add 7.x version support (grafana#16646)
  Provisioning: Add API endpoint to reload provisioning configs (grafana#16579)
  Config: Show user-friendly error message instead of stack trace (grafana#16564)
  Chore: Lowered implicit anys limit to 5954
  Feature: Enable React based options editors for Datasource plugins (grafana#16748)
  sqlstore: use column name in order by (grafana#16583)
  user friendly guide (grafana#16743)
  Provisioning: Interpolate env vars in provisioning files (grafana#16499)
  admin: add more stats about roles (grafana#16667)
  Feature: Migrate Legend components to grafana/ui (grafana#16468)
  playlist: fix loading dashboards by tag (grafana#16727)
  CloudWatch: Use default alias if there is no alias for metrics (grafana#16732)
  Provisioning: Support FolderUid in Dashboard Provisioning Config (grafana#16559)
  Refactor: Make SelectOptionItem a generic type to enable select value typing (grafana#16718)
  docs: fix upgrade instructions
  ...
ryantxu added a commit to ryantxu/grafana that referenced this pull request Apr 25, 2019
* grafana/master:
  Streaming: support streaming and a javascript test datasource (grafana#16729)
  GraphLegendEditor: use stats picker rather than switches (grafana#16759)
  Feature: add cron setting for the ldap settings (grafana#16673)
  Build: Disables gosec until identified performance problems (grafana#16764)
  Chore: bump jQuery to 3.4.0 including prototype pollution vulnerability fix (grafana#16761)
  elasticsearch: add 7.x version support (grafana#16646)
  Provisioning: Add API endpoint to reload provisioning configs (grafana#16579)
  Config: Show user-friendly error message instead of stack trace (grafana#16564)
  Chore: Lowered implicit anys limit to 5954
  Feature: Enable React based options editors for Datasource plugins (grafana#16748)
  sqlstore: use column name in order by (grafana#16583)
  user friendly guide (grafana#16743)
  Provisioning: Interpolate env vars in provisioning files (grafana#16499)
  admin: add more stats about roles (grafana#16667)
  Feature: Migrate Legend components to grafana/ui (grafana#16468)
  playlist: fix loading dashboards by tag (grafana#16727)
  CloudWatch: Use default alias if there is no alias for metrics (grafana#16732)
  Provisioning: Support FolderUid in Dashboard Provisioning Config (grafana#16559)
ryantxu added a commit to ryantxu/grafana that referenced this pull request Apr 25, 2019
* grafana/master: (31 commits)
  Streaming: support streaming and a javascript test datasource (grafana#16729)
  GraphLegendEditor: use stats picker rather than switches (grafana#16759)
  Feature: add cron setting for the ldap settings (grafana#16673)
  Build: Disables gosec until identified performance problems (grafana#16764)
  Chore: bump jQuery to 3.4.0 including prototype pollution vulnerability fix (grafana#16761)
  elasticsearch: add 7.x version support (grafana#16646)
  Provisioning: Add API endpoint to reload provisioning configs (grafana#16579)
  Config: Show user-friendly error message instead of stack trace (grafana#16564)
  Chore: Lowered implicit anys limit to 5954
  Feature: Enable React based options editors for Datasource plugins (grafana#16748)
  sqlstore: use column name in order by (grafana#16583)
  user friendly guide (grafana#16743)
  Provisioning: Interpolate env vars in provisioning files (grafana#16499)
  admin: add more stats about roles (grafana#16667)
  Feature: Migrate Legend components to grafana/ui (grafana#16468)
  playlist: fix loading dashboards by tag (grafana#16727)
  CloudWatch: Use default alias if there is no alias for metrics (grafana#16732)
  Provisioning: Support FolderUid in Dashboard Provisioning Config (grafana#16559)
  Refactor: Make SelectOptionItem a generic type to enable select value typing (grafana#16718)
  docs: fix upgrade instructions
  ...
ryantxu added a commit to ryantxu/grafana that referenced this pull request Apr 29, 2019
* grafana/master:
  build: restore postgres integration tests (grafana#16801)
  docs: explain correct access control model of GCS buckets (grafana#16792)
  Chore: Fixed no implicit any Typescript errors (grafana#16799)
  Feature: introduce LdapActiveSyncEnabled setting (grafana#16787)
  Plugins: ReactPanelPlugin to VizPanelPlugin (grafana#16779)
  UX: Improve Grafana usage for smaller screens  (grafana#16783)
  ThresholdEditor: Minor style fix for smaller screens (grafana#16791)
  Build: Use isolated modules for ts-jest (grafana#16786)
  LDAP Refactoring to support syncronizing more than one user at a time. (grafana#16705)
  build: removes unused vendored files
  (fix/explore): remove vertical-align looks better for long logs (grafana#16736)
  Chore: bump jQuery to 3.4.0 in grafana/ui (grafana#16781)
  Devenv: Updated home dashboard and added new influxdb test dashboard
  Chore: Lowered implicit anys limit to 5946
  RefreshPicker: minor design update (grafana#16774)
  Streaming: support streaming and a javascript test datasource (grafana#16729)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants