
MetricsView and column timeseries decoupling #2925

Merged: 36 commits, Oct 26, 2023

Conversation

@egor-ryashin (Contributor) commented Aug 15, 2023

#2850

Examples:

image

image

where model is:

select * from AdBids where publisher = 'Yahoo' union all
select * from AdBids where publisher = 'Google' and date_trunc('day', timestamp) = timestamp '2022-02-09'

Summary by CodeRabbit

  • New Feature: Enhanced the efficiency of database queries by optimizing variable declaration, reducing potential bugs and improving performance.
  • Refactor: Improved the structure and readability of the MetricsViewTimeSeries struct and its methods, enhancing modularity and error handling.
  • Test: Added comprehensive test cases for the truncateTime function, ensuring accurate time truncation across different time grains and zones.
  • New Feature: Updated the prepareTimeSeries function to handle missing intervals in a time series, improving data consistency and representation.
  • Test: Added UI test to verify the visibility of the "Total Bid Price" button, enhancing the reliability of the user interface.
  • Refactor: Modified the assertLeaderboards function for more accurate leaderboard entries retrieval.


func (q *MetricsViewTimeSeries) resolveDuckDB(ctx context.Context, rt *runtime.Runtime, instanceID string, mv *runtimev1.MetricsView, priority int) error {
ms, err := resolveMeasures(mv, q.InlineMeasures, q.MeasureNames)
ntr, err := resolveNormaliseTimeRange(ctx, rt, instanceID, priority, mv.Model, mv.TimeDimension, tr)
Contributor:

This still maintains coupling (the coupling is problematic for the new granular access policies).

But is time range normalization even necessary for the metrics view API? I think it's okay to require the UI to pass a time granularity.

@egor-ryashin (Contributor, author):

Added null-filling for UI:
image
Model:

select * from AdBids where publisher = 'Yahoo' union all
select * from AdBids where publisher = 'Google' and ( 
date_trunc('day', timestamp) in( timestamp '2022-02-09', timestamp '2022-02-07', timestamp '2022-02-05')
or date_trunc('hour', timestamp) in( timestamp '2022-02-08 01:00:00', timestamp '2022-02-08 02:00:00')
or date_trunc('hour', timestamp) in( timestamp '2022-02-10 01:00:00', timestamp '2022-02-10 02:00:00')
or date_trunc('hour', timestamp) in( timestamp '2022-02-08 22:00:00', timestamp '2022-02-08 23:00:00')
)

pjain1 added a commit that referenced this pull request Aug 29, 2023
since #2925 is not merged yet, we are still using this instead of MV query so cache key will have metricsview policy
@pjain1 pjain1 mentioned this pull request Aug 29, 2023
Egor Ryashin added 2 commits August 30, 2023 16:55
@egor-ryashin egor-ryashin marked this pull request as draft August 31, 2023 09:58
@egor-ryashin egor-ryashin marked this pull request as ready for review September 6, 2023 15:20
@egor-ryashin (Contributor, author):

On my laptop it works fine:

npx playwright test --trace on dashboards.spec.ts

Running 3 tests using 1 worker
  Slow test file: [chromium] › dashboards.spec.ts (35.1s)
  Consider splitting slow test files to speed up parallel execution
  3 passed (36.3s)

@djbarnwal (Member) left a comment:

Have added comments around the code changes.

Moreover -

  1. On changing state and refreshing the page, the time series charts are empty
  2. Dimension comparisons do not work


unmount();
},
{ retry: 5 }
Member:

cc @AdityaHegde Can you review these test changes?

Comment on lines 57 to 60
let dtStart = DateTime.fromISO(start, { zone }).startOf(dtu);
const dtEnd = DateTime.fromISO(end, { zone }).startOf(dtu);
let dtCompStart = DateTime.fromISO(compStart, { zone }).startOf(dtu);
const dtCompEnd = DateTime.fromISO(compEnd, { zone }).startOf(dtu);
Member:

Fetch times are pre-computed with offsets and startOf anchors. Do we still need this here?


if (
i < original.length &&
dtStart.equals(DateTime.fromISO(original[i].ts, { zone }))
Member:

We should do a native ISO string to ISO string comparison which should be much faster than using luxon.

if (comparison) {
if (
k < comparison.length &&
dtCompStart.equals(DateTime.fromISO(comparison[k].ts, { zone }))
Member:

Replace with ISO string equality

Comment on lines 107 to 115
switch (dtu) {
case "year":
dtStart = dtStart.plus({ years: 1 });
dtCompStart = dtCompStart.plus({ years: 1 });
break;
case "quarter":
dtStart = dtStart.plus({ quarters: 1 });
dtCompStart = dtCompStart.plus({ quarters: 1 });
break;
Member:

This switch case can be replaced with something like -
dtStart = getOffset(dtStart, timeGrain.duration, TimeOffsetType.ADD)

const result = prepareTimeSeries(
original,
comparison,
<TimeGrain>(<any>{ label: "hour", duration: Period.HOUR }),
Member:

Use the TIME_GRAIN object present in config.ts directly. ex - TIME_GRAIN.TIME_GRAIN_HOUR

@@ -3,7 +3,8 @@ import { bisectData } from "@rilldata/web-common/components/data-graphic/utils";
import { roundToNearestTimeUnit } from "./round-to-nearest-time-unit";
import { getDurationMultiple, getOffset } from "../../../lib/time/transforms";
import { removeZoneOffset } from "../../../lib/time/timezone";
import { TimeOffsetType } from "../../../lib/time/types";
import { TimeOffsetType, TimeGrain } from "../../../lib/time/types";
import { DateTime, DateTimeUnit } from "luxon";
Member:

Avoid using luxon directly. We have most functions exposed in /time/transforms

Egor Ryashin added 2 commits October 12, 2023 16:25
@begelundmuller begelundmuller requested review from k-anshul and removed request for begelundmuller October 20, 2023 12:48
@@ -235,10 +141,35 @@ func (q *MetricsViewTimeSeries) resolveDruid(ctx context.Context, olap drivers.O
return err
}

tz := time.UTC
if q.TimeZone != "" {
Member:

This could be computed just once, outside the for loop?

var data []*runtimev1.TimeSeriesValue
var nullRecords *structpb.Struct
for rows.Next() {
rowMap := make(map[string]any)
Member:

This can be initialised once outside the for loop, since we are okay with the map's contents being overwritten in the next iteration?
Also consider passing the length to make, since we already know it.

if nullRecords == nil {
nullRecords = generateNullRecords(records)
}
if start.Before(t) {
Member:

Just trying to understand what lines 154-163 do.

  1. start.Before(t): isn't this condition always true?
  2. Can we not assign start directly to either q.TimeStart (if non-null) or the first row in the record, to simplify the if/else conditions?
  3. firstDay and firstMonth are always set to 1,1. Is that correct?

@egor-ryashin (Contributor, author) commented Oct 23, 2023:

  1. start can be equal to t; if there are gaps in the data, start can be less than t.
  2. No, that's incorrect. The code for this wasn't merged at the time of the commit.

@egor-ryashin (Contributor, author) commented Oct 23, 2023:

Assume we have the range TimeStart = '2023-10-01' and TimeEnd = '2023-10-05', but the data exists only for '2023-10-05':

┌─────────┬─────────────────────┐
│ measure │        range        │
├─────────┼─────────────────────┤
│       1 │ 2023-10-05 00:00:00 │
└─────────┴─────────────────────┘

The code adds null entries from '2023-10-01' to '2023-10-04':

┌─────────┬─────────────────────┐
│ measure │        range        │
│  int32  │      timestamp      │
├─────────┼─────────────────────┤
│         │ 2023-10-01 00:00:00 │
│         │ 2023-10-02 00:00:00 │
│         │ 2023-10-03 00:00:00 │
│         │ 2023-10-04 00:00:00 │
│       1 │ 2023-10-05 00:00:00 │
└─────────┴─────────────────────┘

start < TimeStart here initially

Member:

start can be equal t, if there are gaps in data start can be less than t

But we still don't need start.Before(t)? The addNulls will do nothing if start == t.

@egor-ryashin (Contributor, author) commented Oct 23, 2023:

if start.Before(t) { // there's a data gap, the first data entry is after start
	if zeroTime.Equal(start) { // check that start hasn't been initialized yet
		if q.TimeStart != nil { // if range is not defined then don't have to fill nulls
			start = truncateTime(q.TimeStart.AsTime(), q.TimeGranularity, tz, int(fdow), int(fmoy))
			data = addNulls(data, nullRecords, start, t, q.TimeGranularity) // range is defined then fill gaps in range [start, t)
		}
	} else { // start has been initialized
		data = addNulls(data, nullRecords, start, t, q.TimeGranularity) // fill gaps in range [start, t]
	}
}

yes, we don't need to fill nulls when start = t

@egor-ryashin (Contributor, author):

image

Member:

I had understood the addNulls part. I wanted to point at simplifying the if/else conditions, since the start.Before(t) check seems redundant.

@egor-ryashin (Contributor, author):

it's required when there's a gap in the middle too

├───────┼─────────────────────┤
│       │ 2023-10-04 00:00:00 │
│     1 │ 2023-10-05 00:00:00 │
│       │ 2023-10-06 00:00:00 │
│     2 │ 2023-10-08 00:00:00 │
└───────┴─────────────────────┘

coderabbitai bot commented Oct 25, 2023:

Walkthrough

The changes primarily focus on improving the handling of time series data across various components. This includes refactoring and enhancing the MetricsViewTimeSeries struct, adding test cases for time-related functions, and updating UI components to handle time adjustments. The changes aim to improve efficiency, modularity, error handling, and data consistency.

Changes

  • runtime/queries/.../column_timeseries.go: Moved the declaration of rowMap outside the loop to prevent redeclaration and overwriting on each iteration.
  • runtime/queries/.../metricsview_timeseries.go, runtime/queries/.../metricsview_timeseries_test.go: Refactored the MetricsViewTimeSeries struct and its methods; added test cases for the truncateTime function.
  • runtime/server/.../queries_metrics_timeseries_test.go: Updated test cases to handle different time ranges and null values.
  • runtime/testruntime/.../timeseries_gap_model.sql: Added SQL statements to select specific data sets.
  • web-common/src/.../MetricsTimeSeriesCharts.svelte, web-common/src/.../prepare-timeseries.spec.ts, web-common/src/.../utils.ts: Updated UI components and tests to handle time adjustments.
  • web-local/test/ui/dashboards.spec.ts, web-local/test/ui/utils/dashboardHelpers.ts: Added a UI test case and modified a helper function.




@k-anshul (Member) left a comment:

Approving for the platform changes. Passing to @djbarnwal and @begelundmuller for further review.

@@ -235,10 +159,26 @@ func (q *MetricsViewTimeSeries) resolveDruid(ctx context.Context, olap drivers.O
return err
}

if nullRecords == nil {
Member:

This can be moved outside the for loop as well?

data = append(data, &runtimev1.TimeSeriesValue{
Ts: timestamppb.New(t),
Records: records,
})
start = addTo(t, q.TimeGranularity)
}
if q.TimeEnd != nil && nullRecords != nil {
Member:

If we move the init of nullRecords outside the for loop, we can remove the nullRecords != nil condition as well.

coderabbitai bot left a comment:

Review Status

Actionable comments generated: 6

Configuration used: CodeRabbit UI

Commits: files that changed from the base of the PR, between 4d675ea and 382bba4.
Files ignored due to filter (1)
  • runtime/testruntime/testdata/timeseries/dashboards/timeseries_gaps.yaml
Files selected for processing (9)
  • runtime/queries/column_timeseries.go (1 hunk)
  • runtime/queries/metricsview_timeseries.go (7 hunks)
  • runtime/queries/metricsview_timeseries_test.go (1 hunk)
  • runtime/server/queries_metrics_timeseries_test.go (5 hunks)
  • runtime/testruntime/testdata/timeseries/models/timeseries_gap_model.sql (1 hunk)
  • web-common/src/features/dashboards/time-series/prepare-timeseries.spec.ts (1 hunk)
  • web-common/src/features/dashboards/time-series/utils.ts (1 hunk)
  • web-local/test/ui/dashboards.spec.ts (1 hunk)
  • web-local/test/ui/utils/dashboardHelpers.ts (1 hunk)
Files skipped from review due to trivial changes (5)
  • runtime/queries/column_timeseries.go
  • runtime/queries/metricsview_timeseries_test.go
  • runtime/testruntime/testdata/timeseries/models/timeseries_gap_model.sql
  • web-local/test/ui/dashboards.spec.ts
  • web-local/test/ui/utils/dashboardHelpers.ts
Additional comments: 23
runtime/server/queries_metrics_timeseries_test.go (9)
  • 216-216: The test TestServer_Timeseries_exclude_notnull_like now expects no results. Ensure that this change is intentional and that it doesn't break any existing functionality.

  • 266-266: The test TestServer_Timeseries_numeric_dim now expects no results. Ensure that this change is intentional and that it doesn't break any existing functionality.

  • 344-364: The test TestServer_Timeseries_TimeRange_Day has been added. It tests the functionality of retrieving time series data for a specific day with a specific measure. The test seems to be well-structured and covers the necessary assertions.

  • 366-390: The test TestServer_Timeseries_TimeRange_Start has been added. It tests the functionality of retrieving time series data starting from a specific date. The test seems to be well-structured and covers the necessary assertions.

  • 392-422: The test TestServer_Timeseries_TimeRange_Start_2day_before has been added. It tests the functionality of retrieving time series data starting from a date two days before the data. The test seems to be well-structured and covers the necessary assertions.

  • 424-451: The test TestServer_Timeseries_TimeRange_End has been added. It tests the functionality of retrieving time series data up to a specific end date. The test seems to be well-structured and covers the necessary assertions.

  • 453-483: The test TestServer_Timeseries_TimeRange_End_2day_after has been added. It tests the functionality of retrieving time series data up to a date two days after the data. The test seems to be well-structured and covers the necessary assertions.

  • 486-520: The test TestServer_Timeseries_TimeRange_middle_nulls has been added. It tests the functionality of retrieving time series data with null values in the middle of the data. The test seems to be well-structured and covers the necessary assertions.

  • 719-719: The test TestServer_Timeseries_1day now expects no results. Ensure that this change is intentional and that it doesn't break any existing functionality.

web-common/src/features/dashboards/time-series/prepare-timeseries.spec.ts (10)
  • 6-25: The test case "should fill in missing intervals" is well written. It tests the prepareTimeSeries function with a set of timestamps and expects the function to fill in the missing intervals. The test case is clear and concise, and it checks the functionality of the prepareTimeSeries function correctly.

  • 27-46: The test case "should fill in missing intervals in winter" is similar to the previous one but tests the function with a different set of timestamps. This test case is also well written and checks the functionality of the prepareTimeSeries function correctly.

  • 48-67: The test case "should fill in missing intervals, CET" tests the prepareTimeSeries function with a different timezone. This is a good practice as it ensures that the function works correctly across different timezones.

  • 69-77: The helper function formatDateToEnUS is used to format the date in a specific format. This function is used in multiple test cases, which is a good practice as it reduces code duplication.

  • 79-109: The test case "adjusts the timestamp for the given timezone" tests the adjustOffsetForZone function with different timezones. This test case is well written and checks the functionality of the adjustOffsetForZone function correctly.

  • 111-146: The test case "comparison, should fill in missing intervals" tests the prepareTimeSeries function with a comparison set of timestamps. This test case is well written and checks the functionality of the prepareTimeSeries function correctly.

  • 148-183: The test case "comparison, should fill in missing intervals, America/Argentina/Buenos_Aires (no DST)" is similar to the previous one but tests the function with a different timezone. This test case is also well written and checks the functionality of the prepareTimeSeries function correctly.

  • 185-207: The test case "should include original records" tests the prepareTimeSeries function with a set of records and expects the function to include the original records in the result. This test case is well written and checks the functionality of the prepareTimeSeries function correctly.

  • 209-245: The test case "should include comparison records" tests the prepareTimeSeries function with a comparison set of records and expects the function to include the comparison records in the result. This test case is well written and checks the functionality of the prepareTimeSeries function correctly.

  • 247-283: Duplicate test case.

Committable suggestion (Beta): remove the duplicate test case.
runtime/queries/metricsview_timeseries.go (4)
  • 141-142: The rowMap variable declaration has been moved outside the loop to prevent redeclaration and overwriting on each iteration. This is a good performance optimization as it reduces the number of memory allocations.

  • 191-236: The Export method has been refactored to use helper methods for generating filenames and building SQL queries. This improves the readability and maintainability of the code.

  • 277-303: > Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [247-298]

The buildMetricsTimeseriesSQL method has been significantly refactored. It now accepts an additional olap drivers.OLAPStore parameter and uses the buildMetricsTimeseriesSQL method. This improves the modularity and maintainability of the code.

  • 350-452: The truncateTime, generateNullRecords, addNulls, and addTo helper functions have been added. These functions handle time-related operations more effectively, including adjusting timestamps for time zones and filling in missing intervals in a time series. This improves the accuracy and consistency of the data.

Comment on lines 247 to 283
test("should include comparison records", () => {
const original = [
{
ts: "2020-01-01T00:00:00Z",
records: {
clicks: 100,
revenue: 10,
},
},
];

const comparison = [
{
ts: "2020-01-02T00:00:00Z",
records: {
clicks: 200,
revenue: 20,
},
},
];

const result = prepareTimeSeries(
original,
comparison,
<TimeGrain>(<any>{ label: "hour", duration: Period.HOUR }),
"UTC",
"2020-01-01T00:00:00Z",
"2020-01-01T01:00:00Z",
"2020-01-02T00:00:00Z",
"2020-01-02T01:00:00Z"
);

expect(result[0].clicks).toEqual(100);
expect(result[0].revenue).toEqual(10);
expect(result[0]["comparison.clicks"]).toEqual(200);
expect(result[0]["comparison.revenue"]).toEqual(20);
});
coderabbitai bot:

This test case appears to be a duplicate of the previous one. It tests the same functionality with the same inputs and expects the same results. If this is not intentional, consider removing this duplicate test case to reduce redundancy.


Comment on lines 81 to 92
if err != nil {
return err
}

if opts.PreWriteHook != nil {
err = opts.PreWriteHook(q.generateFilename(q.MetricsView))
if err != nil {
return err
}
}

tmp := make([]*structpb.Struct, 0, len(q.Result.Data))
meta := append([]*runtimev1.MetricsViewColumn{{
Name: q.MetricsView.TimeDimension,
}}, q.Result.Meta...)
for _, dt := range q.Result.Data {
dt.Records.Fields[q.MetricsView.TimeDimension] = structpb.NewStringValue(dt.Ts.AsTime().Format(time.RFC3339Nano))
tmp = append(tmp, dt.Records)
}

switch opts.Format {
case runtimev1.ExportFormat_EXPORT_FORMAT_UNSPECIFIED:
return fmt.Errorf("unspecified format")
case runtimev1.ExportFormat_EXPORT_FORMAT_CSV:
return writeCSV(meta, tmp, w)
case runtimev1.ExportFormat_EXPORT_FORMAT_XLSX:
return writeXLSX(meta, tmp, w)
case runtimev1.ExportFormat_EXPORT_FORMAT_PARQUET:
return writeParquet(meta, tmp, w)
}

return nil
}

func (q *MetricsViewTimeSeries) generateFilename(mv *runtimev1.MetricsViewSpec) string {
filename := strings.ReplaceAll(q.MetricsViewName, `"`, `_`)
if q.TimeStart != nil || q.TimeEnd != nil || q.Filter != nil && (len(q.Filter.Include) > 0 || len(q.Filter.Exclude) > 0) {
filename += "_filtered"
}
return filename
}

func (q *MetricsViewTimeSeries) resolveDuckDB(ctx context.Context, rt *runtime.Runtime, instanceID string, mv *runtimev1.MetricsViewSpec, priority int, policy *runtime.ResolvedMetricsViewSecurity) error {
ms, err := resolveMeasures(mv, q.InlineMeasures, q.MeasureNames)
r, err := ctrl.Get(context.Background(), &runtimev1.ResourceName{Kind: runtime.ResourceKindMetricsView, Name: q.MetricsViewName}, false)
if err != nil {
return err
}

measures, err := toColumnTimeseriesMeasures(ms)
if err != nil {
return err
}

tsq := &ColumnTimeseries{
TableName: mv.Table,
TimestampColumnName: mv.TimeDimension,
TimeRange: &runtimev1.TimeSeriesTimeRange{
Start: q.TimeStart,
End: q.TimeEnd,
Interval: q.TimeGranularity,
},
Measures: measures,
MetricsView: mv,
MetricsViewFilter: q.Filter,
TimeZone: q.TimeZone,
MetricsViewPolicy: policy,
FirstDayOfWeek: mv.FirstDayOfWeek,
FirstMonthOfYear: mv.FirstMonthOfYear,
}
err = rt.Query(ctx, instanceID, tsq, priority)
if err != nil {
return err
}

r := tsq.Result

q.Result = &runtimev1.MetricsViewTimeSeriesResponse{
Meta: r.Meta,
Data: r.Results,
}

return nil
}

func toColumnTimeseriesMeasures(measures []*runtimev1.MetricsViewSpec_MeasureV2) ([]*runtimev1.ColumnTimeSeriesRequest_BasicMeasure, error) {
res := make([]*runtimev1.ColumnTimeSeriesRequest_BasicMeasure, len(measures))
for i, m := range measures {
res[i] = &runtimev1.ColumnTimeSeriesRequest_BasicMeasure{
SqlName: m.Name,
Expression: m.Expression,
}
}
return res, nil
}

func (q *MetricsViewTimeSeries) resolveDruid(ctx context.Context, olap drivers.OLAPStore, mv *runtimev1.MetricsViewSpec, priority int, policy *runtime.ResolvedMetricsViewSecurity) error {
sql, tsAlias, args, err := q.buildDruidMetricsTimeseriesSQL(mv, policy)
mv := r.GetMetricsView().Spec
sql, tsAlias, args, err := q.buildMetricsTimeseriesSQL(olap, mv, q.ResolvedMVSecurity)
coderabbitai bot:

The context.Background() is used in the ctrl.Get method call. It's generally a good practice to use the context passed into the function (ctx in this case) instead of creating a new one, to ensure that any deadlines, cancellation signals, or other context-related values are propagated correctly. This is important for managing the lifecycle of the request and for proper handling of timeouts and cancellations.

- r, err := ctrl.Get(context.Background(), &runtimev1.ResourceName{Kind: runtime.ResourceKindMetricsView, Name: q.MetricsViewName}, false)
+ r, err := ctrl.Get(ctx, &runtimev1.ResourceName{Kind: runtime.ResourceKindMetricsView, Name: q.MetricsViewName}, false)

Comment on lines 119 to 125
tz := time.UTC
if q.TimeZone != "" {
tz, err = time.LoadLocation(q.TimeZone)
if err != nil {
return err
}
}
coderabbitai bot:

The time.LoadLocation function can return an error if the provided time zone string is not recognized. It's good to see that this error is being handled. However, it might be beneficial to provide more context in the error message to help with debugging if an invalid time zone is provided.

- return err
+ return fmt.Errorf("invalid timezone '%s': %w", q.TimeZone, err)

Comment on lines +127 to +135
fdow := mv.FirstDayOfWeek
if mv.FirstDayOfWeek > 7 || mv.FirstDayOfWeek <= 0 {
fdow = 1
}

fmoy := mv.FirstMonthOfYear
if mv.FirstMonthOfYear > 12 || mv.FirstMonthOfYear <= 0 {
fmoy = 1
}

The code checks if the FirstDayOfWeek and FirstMonthOfYear values are within the valid range and assigns a default value if they are not. This is a good practice as it ensures that the values used in the subsequent calculations are valid. However, it might be beneficial to log a warning if the provided values are out of range, to help with debugging and to alert the user that their input was invalid and a default value was used instead.
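The clamp-and-warn pattern the review suggests could look like the sketch below. `clampOrdinal` is a hypothetical helper; the PR clamps inline without logging:

```go
package main

import (
	"fmt"
	"log"
)

// clampOrdinal validates a 1-based calendar ordinal (first day of week in
// 1..7, first month of year in 1..12), logging a warning and falling back
// to 1 when the input is out of range. Illustrative only.
func clampOrdinal(name string, v, max int32) int32 {
	if v <= 0 || v > max {
		log.Printf("warning: %s=%d outside [1,%d], defaulting to 1", name, v, max)
		return 1
	}
	return v
}

func main() {
	fdow := clampOrdinal("FirstDayOfWeek", 9, 7)    // out of range: warns, returns 1
	fmoy := clampOrdinal("FirstMonthOfYear", 3, 12) // valid: passes through
	fmt.Println(fdow, fmoy) // 1 3
}
```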

Comment on lines 55 to 133
result.push({
ts,
ts_position,
bin: originalPt.bin,
...originalPt.records,
...toComparisonKeys(comparisonPt || {}, offsetDuration, zone),
});

if (
i < original.length &&
dtStart.equals(DateTime.fromISO(original[i].ts, { zone }))
) {
result[j] = {
...result[j],
...original[i].records,
};
i++;
}
if (comparison) {
if (
k < comparison.length &&
dtCompStart.equals(DateTime.fromISO(comparison[k].ts, { zone }))
) {
result[j] = {
...result[j],
...toComparisonKeys(comparison[k], offsetDuration, zone),
};
k++;
} else {
result[j] = {
...result[j],
...toComparisonKeys(
{
ts: dtCompStart.toISO(),
},
offsetDuration,
zone
),
};
}
}

switch (dtu) {
case "year":
dtStart = dtStart.plus({ years: 1 });
dtCompStart = dtCompStart.plus({ years: 1 });
break;
case "quarter":
dtStart = dtStart.plus({ quarters: 1 });
dtCompStart = dtCompStart.plus({ quarters: 1 });
break;
case "month":
dtStart = dtStart.plus({ months: 1 });
dtCompStart = dtCompStart.plus({ months: 1 });
break;
case "week":
dtStart = dtStart.plus({ weeks: 1 });
dtCompStart = dtCompStart.plus({ weeks: 1 });
break;
case "day":
dtStart = dtStart.plus({ days: 1 });
dtCompStart = dtCompStart.plus({ days: 1 });
break;
case "hour":
dtStart = dtStart.plus({ hours: 1 });
dtCompStart = dtCompStart.plus({ hours: 1 });
break;
case "minute":
dtStart = dtStart.plus({ minutes: 1 });
dtCompStart = dtCompStart.plus({ minutes: 1 });
break;
case "second":
dtStart = dtStart.plus({ seconds: 1 });
dtCompStart = dtCompStart.plus({ seconds: 1 });
break;
case "millisecond":
dtStart = dtStart.plus({ milliseconds: 1 });
dtCompStart = dtCompStart.plus({ milliseconds: 1 });
break;
}
j++;
}

The new hunk has introduced a more complex logic for preparing time series data. It now handles missing intervals in the time series by filling them with empty records. This is done by iterating over the entire time range and checking if there is a data point for each interval. If a data point is missing, it creates a new one with an empty record. This is a good improvement for data consistency and visualization, as it ensures that the time series will have a continuous flow, even if some data points are missing.

However, the switch-case block for incrementing the dtStart and dtCompStart variables based on the dtu value seems repetitive and could be simplified. Consider using a mapping object to map the dtu values to their corresponding increment methods. This would make the code more maintainable and easier to read.

-    switch (dtu) {
-      case "year":
-        dtStart = dtStart.plus({ years: 1 });
-        dtCompStart = dtCompStart.plus({ years: 1 });
-        break;
-      case "quarter":
-        dtStart = dtStart.plus({ quarters: 1 });
-        dtCompStart = dtCompStart.plus({ quarters: 1 });
-        break;
-      case "month":
-        dtStart = dtStart.plus({ months: 1 });
-        dtCompStart = dtCompStart.plus({ months: 1 });
-        break;
-      case "week":
-        dtStart = dtStart.plus({ weeks: 1 });
-        dtCompStart = dtCompStart.plus({ weeks: 1 });
-        break;
-      case "day":
-        dtStart = dtStart.plus({ days: 1 });
-        dtCompStart = dtCompStart.plus({ days: 1 });
-        break;
-      case "hour":
-        dtStart = dtStart.plus({ hours: 1 });
-        dtCompStart = dtCompStart.plus({ hours: 1 });
-        break;
-      case "minute":
-        dtStart = dtStart.plus({ minutes: 1 });
-        dtCompStart = dtCompStart.plus({ minutes: 1 });
-        break;
-      case "second":
-        dtStart = dtStart.plus({ seconds: 1 });
-        dtCompStart = dtCompStart.plus({ seconds: 1 });
-        break;
-      case "millisecond":
-        dtStart = dtStart.plus({ milliseconds: 1 });
-        dtCompStart = dtCompStart.plus({ milliseconds: 1 });
-        break;
-    }
+    const timeIncrements = {
+      year: { years: 1 },
+      quarter: { quarters: 1 },
+      month: { months: 1 },
+      week: { weeks: 1 },
+      day: { days: 1 },
+      hour: { hours: 1 },
+      minute: { minutes: 1 },
+      second: { seconds: 1 },
+      millisecond: { milliseconds: 1 },
+    };
+
+    dtStart = dtStart.plus(timeIncrements[dtu]);
+    dtCompStart = dtCompStart.plus(timeIncrements[dtu]);

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits: files that changed from the base of the PR, between 382bba4 and 02f3473.
Files selected for processing (1)
  • runtime/queries/metricsview_timeseries.go (7 hunks)
Additional comments: 6
runtime/queries/metricsview_timeseries.go (6)
  • 81-92: The code retrieves the MetricsView resource using the Controller and Get methods. It then extracts the MetricsView specification and uses it to build the SQL query for the metrics time series. This is a good practice as it ensures that the most recent MetricsView specification is used for the query. However, it might be beneficial to handle the case where the MetricsView resource does not exist or the Get method returns an error. This would improve error handling and make the code more robust.

  • 119-135: The code checks the TimeZone field of the MetricsViewTimeSeries struct and loads the corresponding time zone. If the TimeZone field is empty, it defaults to UTC. It also checks the FirstDayOfWeek and FirstMonthOfYear fields of the MetricsView spec and assigns default values if they are out of range. This is a good practice as it ensures that the time zone and date parameters used in the subsequent calculations are valid. However, it might be beneficial to log a warning if the provided values are out of range, to help with debugging and to alert the user that their input was invalid and a default value was used instead.

  • 141-142: The rowMap variable is declared outside the loop, which prevents its redeclaration and overwriting in each iteration. This is a good practice as it improves efficiency by avoiding unnecessary memory allocations.

  • 191-236: The Export method has been updated to retrieve the MetricsView resource and use its specification to generate the filename for the exported data. This ensures that the filename is always up-to-date with the latest MetricsView specification. The method then serializes the query result and writes it to an io.Writer in the specified format. This is a good practice as it ensures that the exported data is always based on the most recent query result and MetricsView specification.

  • 285-296: The code builds the SQL query for the metrics time series based on the dialect of the OLAP store. It uses different methods to build the SQL query for DuckDB and Druid dialects. This is a good practice as it ensures that the SQL query is compatible with the specific OLAP dialect. However, it might be beneficial to handle the case where the OLAP dialect is not supported. This would improve error handling and make the code more robust.

  • 350-452: The code includes several helper functions for handling time-related operations. These functions include truncateTime, which truncates a time value to a specified granularity; generateNullRecords, which generates a record with null values; addNulls, which adds null records to a time series for missing intervals; and addTo, which adds a specified time granularity to a time value. These functions are well-designed and improve the modularity and readability of the code. They also ensure correct representation of timestamps and handle missing intervals in the time series.

Comment on lines +162 to +169
if zeroTime.Equal(start) {
if q.TimeStart != nil {
start = truncateTime(q.TimeStart.AsTime(), q.TimeGranularity, tz, int(fdow), int(fmoy))
data = addNulls(data, nullRecords, start, t, q.TimeGranularity)
}
} else {
data = addNulls(data, nullRecords, start, t, q.TimeGranularity)
}

The code calls addNulls from two branches: once when start equals zeroTime and q.TimeStart is set (after truncating q.TimeStart to the grain), and once in the else branch. Only one branch runs per invocation, but duplicating the addNulls call makes the control flow harder to follow and easy to get out of sync. Consider resolving the effective start first, then calling addNulls once.

@begelundmuller begelundmuller merged commit 88a34ff into main Oct 26, 2023
4 checks passed
@begelundmuller begelundmuller deleted the decouple-ts-and-mv branch October 26, 2023 14:49