Provide generalized aggregation #5

Closed
wants to merge 2 commits into from


@filmaj filmaj commented Jan 17, 2018

This implements a generalized method for aggregating time-series data.
Data can be aggregated over week or month intervals with a variety of
aggregation methods to choose from.

This will be useful for providing chart views at different levels (such
as two-year periods vs. just showing the last month). Additionally, the
generalized form of aggregation can be used to smooth out graphs where
the sampling frequency changed with an update to Hubble Enterprise.

The aggregation is done by splitting the time data into consecutive,
gapless periods of time (weeks starting on Monday, or calendar months).
The aggregated value is then computed and returned for each period.
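
As an illustration only (not the code in this pull request), the splitting step could be sketched in Python as follows. The names `period_start`, `next_period_start` and `split_into_periods` are hypothetical, and the sketch assumes the input is a list of `(date, value)` samples. Periods without any samples are kept as empty buckets so that the sequence stays gapless.

```python
from datetime import date, timedelta


def period_start(day, period):
    """Return the first day of the week (Monday) or month containing `day`."""
    if period == "week":
        # date.weekday() returns 0 for Monday, so this steps back to Monday.
        return day - timedelta(days=day.weekday())
    if period == "month":
        return day.replace(day=1)
    raise ValueError(f"unsupported period: {period!r}")


def next_period_start(start, period):
    """Return the first day of the period that follows the one starting at `start`."""
    if period == "week":
        return start + timedelta(days=7)
    # Month: roll over to the first day of the following month.
    if start.month == 12:
        return date(start.year + 1, 1, 1)
    return date(start.year, start.month + 1, 1)


def split_into_periods(samples, period):
    """Group (date, value) samples into consecutive, gapless periods.

    Returns a chronologically ordered list of (period_start, values).
    """
    if not samples:
        return []
    ordered = sorted(samples, key=lambda sample: sample[0])
    buckets = {}
    for day, value in ordered:
        buckets.setdefault(period_start(day, period), []).append(value)

    result = []
    current = period_start(ordered[0][0], period)
    last = period_start(ordered[-1][0], period)
    while current <= last:
        # Periods without samples still appear, with an empty value list.
        result.append((current, buckets.get(current, [])))
        current = next_period_start(current, period)
    return result
```

Computing the period start from the date itself, rather than counting seven-day steps from the first sample, is what keeps weeks aligned to Mondays regardless of where the series begins.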

Aggregation methods define how to aggregate the values within individual
time periods. The following aggregation methods are supported:

  • sum
  • mean
  • min
  • max
  • first (the chronologically first available value for that period)
  • last
  • median
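
To make the methods concrete, here is a minimal Python sketch of how they could map onto functions. It is an assumption for illustration, not the implementation in this pull request: `AGGREGATORS` and `aggregate` are hypothetical names, and the values of a period are assumed to arrive in chronological order so that `first` and `last` can simply index into the list.

```python
from statistics import mean, median

# Hypothetical lookup table: method name -> function over one period's values.
AGGREGATORS = {
    "sum": sum,
    "mean": mean,
    "min": min,
    "max": max,
    "first": lambda values: values[0],   # chronologically first value
    "last": lambda values: values[-1],   # chronologically last value
    "median": median,
}


def aggregate(values, method):
    """Aggregate the chronologically ordered values of one period."""
    if method not in AGGREGATORS:
        raise ValueError(f"unsupported aggregation method: {method!r}")
    if not values:
        # Assumption: a period without data has no aggregated value.
        return None
    return AGGREGATORS[method](values)
```

For example, `aggregate([3, 1, 2], "first")` returns `3`, while `aggregate([3, 1, 2], "median")` returns `2`.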

Additionally, the periods at the beginning and the end of the time series
can be incomplete (there isn’t data for each day in the period). Whether
such periods are included is controlled by the setting includeIncomplete,
which supports the following values:

  • none
  • start (includes an incomplete period at the beginning of the series)
  • end
  • both
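
The effect of the four values can be sketched in Python as below. This is a hedged illustration rather than the code in this pull request: `trim_incomplete` is a hypothetical helper, and each period is assumed to be a `(label, is_complete)` pair whose completeness has already been determined.

```python
def trim_incomplete(periods, include_incomplete="none"):
    """Drop incomplete periods at the edges of a series, depending on the mode.

    `periods` is a chronologically ordered list of (label, is_complete) pairs.
    Only the first and the last period are ever considered for removal.
    """
    if include_incomplete not in ("none", "start", "end", "both"):
        raise ValueError(f"unsupported value: {include_incomplete!r}")

    trimmed = list(periods)
    # Drop an incomplete leading period unless the mode keeps the start.
    if trimmed and not trimmed[0][1] and include_incomplete not in ("start", "both"):
        trimmed = trimmed[1:]
    # Drop an incomplete trailing period unless the mode keeps the end.
    # Assumption: a lone incomplete period counts as both start and end,
    # so it survives only with "both".
    if trimmed and not trimmed[-1][1] and include_incomplete not in ("end", "both"):
        trimmed = trimmed[:-1]
    return trimmed
```

With `periods = [("2018-01", False), ("2018-02", True), ("2018-03", False)]`, the value `"none"` keeps only `2018-02`, `"start"` keeps the first two periods, `"end"` keeps the last two, and `"both"` keeps all three.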

Finally, the pull request usage chart is changed to make use of the new
aggregation facilities to reduce the granularity from daily to monthly
data for now. This might be changed when we implement detail views.

I also added several unit tests to check the aggregation methods (for
off-by-one errors in particular).

pluehne and others added 2 commits January 24, 2018 13:54
This implements a generalized method for aggregating time-series data.
Data can be aggregated over week or month intervals with a variety of
aggregation methods to choose from.

This will be useful for providing chart views at different levels (such
as two-year periods vs. just showing the last month). Additionally, the
generalized form of aggregation can be used to smooth out graphs where
the sampling frequency changed with an update to Hubble Enterprise.

The aggregation is done by splitting the time data into consecutive,
gapless periods of time (weeks starting on Monday, or calendar months).
The aggregated value is then computed and returned for each period.

Aggregation methods define how to aggregate the values within individual
time periods. The following aggregation methods are supported:

- sum
- mean
- min
- max
- first (the chronologically first available value for that period)
- last
- median

Periods with incomplete data at the beginning or the end of the time
series are excluded from the aggregation.

Finally, the pull request usage chart is changed to make use of the new
aggregation facilities to reduce the granularity from daily to monthly
data for now. This might be changed when we implement detail views.

I also added several unit tests to check the aggregation methods (for
off-by-one errors in particular) as well as a short piece of
documentation on the new configuration options.
@filmaj filmaj force-pushed the patrick/improved-aggregation branch from a188523 to 7aa2df9 Compare January 24, 2018 19:56
@filmaj filmaj closed this Jan 24, 2018
@codecov-io

Codecov Report

Merging #5 into master will increase coverage by 10.52%.
The diff coverage is 70.37%.
