feat: Implement Session Stats API #22770

Swatinem · 2020-12-17T11:56:42Z

This is a new API for querying sessions and has a query interface similar to discover, and returns timeseries data.

Example URL: http://localhost:8000/api/0/organizations/sentry/sessions/?project=1&statsPeriod=1d&interval=6h&field=sum(session)&field=count_unique(user)&groupBy=release&groupBy=session.status
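For reference, the example URL above can be assembled from its parameters. This is a minimal sketch using only the Python standard library; `build_sessions_url` is a hypothetical helper for illustration, not something this PR adds. Repeated `field` and `groupBy` parameters are expressed as lists.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_sessions_url(base, org_slug, params):
    """Hypothetical helper: build a sessions query URL from a dict of parameters."""
    # doseq=True expands list values into repeated query parameters.
    query = urlencode(params, doseq=True)
    return f"{base}/api/0/organizations/{org_slug}/sessions/?{query}"


params = {
    "project": 1,
    "statsPeriod": "1d",
    "interval": "6h",
    "field": ["sum(session)", "count_unique(user)"],
    "groupBy": ["release", "session.status"],
}
url = build_sessions_url("http://localhost:8000", "sentry", params)

# Round-trip the query string to confirm the repeated parameters survive.
parsed = parse_qs(urlsplit(url).query)
print(parsed["field"], parsed["groupBy"])
```

Note that `urlencode` percent-encodes the parentheses in `sum(session)`; the decoded values are the same as in the hand-written URL.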

TODO:

- handle errors (unknown fields, etc.) correctly
- group by project, restrict filtered projects to ones we have permission to access
- default to all org projects if no project is specifically selected
- get a data generator or something going… (maybe someone has one already)
- implement filtering
- more "virtual" fields, such as "release.major", etc.

markstory · 2021-01-04T15:53:45Z

src/sentry/api/endpoints/organization_sessions.py

+        # print(result_totals, result_timeseries)
+
+        with sentry_sdk.start_span(op="sessions.endpoint", description="massage_sessions_result"):
+            result = massage_sessions_result(query, result_totals, result_timeseries)


You could use a 'serializer' to convert the internal formats to the ones the client is expecting.

Swatinem · 2021-01-12T11:30:57Z

Some notes to clarify some strange things we saw while testing:

Why do I get 3 intervals when using `statsPeriod=2d&interval=1d`?

Let's visualize this using epic box drawing:

          ┌─ 2021-01-10T11:05:01 ─ 2021-01-12T11:05:01 ─┐
├─────────────────────┼─────────────────────┼─────────────────────┤
└─2021-01-10T00:00:00─└─2021-01-11T00:00:00─└─2021-01-12T00:00:00─

The statsPeriod is not rounded, but exact, so it spans 3 rounded intervals. This is also reflected in snuba:

greaterOrEquals(started, toDateTime('2021-01-10T11:05:01', 'Universal')) AND less(started, toDateTime('2021-01-12T11:05:01', 'Universal'))

If you know how it works, it is kind of expected. However, in this example, you only get half of the data for the first interval, which is weird from a user perspective. It would be better to round the statsPeriod to the interval as well.
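To make the three-interval result concrete, here is a small sketch of the bucketing, assuming buckets are aligned to the Unix epoch and the end of the range is exclusive (as in the snuba condition above). `bucket_starts` is a hypothetical name used only for this illustration.

```python
from datetime import datetime, timedelta, timezone


def bucket_starts(start, end, interval):
    """Start times of all epoch-aligned buckets touched by [start, end)."""
    step = int(interval.total_seconds())
    first = int(start.timestamp()) // step * step
    # The end is exclusive, so look at the last second that is still inside the range.
    last = (int(end.timestamp()) - 1) // step * step
    return [
        datetime.fromtimestamp(ts, tz=timezone.utc)
        for ts in range(first, last + step, step)
    ]


now = datetime(2021, 1, 12, 11, 5, 1, tzinfo=timezone.utc)
buckets = bucket_starts(now - timedelta(days=2), now, timedelta(days=1))
for bucket in buckets:
    print(bucket.isoformat())
```

The exact two-day range starts and ends in the middle of a day, so it touches the buckets for 2021-01-10, 2021-01-11 and 2021-01-12.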

Why are multi-day intervals so weird?

When using `statsPeriod=4w&interval=2w`, we actually get:

["2020-12-10T00:00:00Z","2020-12-24T00:00:00Z","2021-01-07T00:00:00Z"]

… which is not really what we would expect: today (2021-01-12) we are already partway through the last interval. Why?

Well, both sentry (the code that I copy/pasted from somewhere else) and snuba do very basic arithmetic rounding/truncating based on timestamps, for example:

(toDateTime(multiply(intDiv(toUInt32(started), 1209600), 1209600), 'Universal') AS _snuba_bucketed_started)

It just so happens that this is the arithmetic rounding you get for Unix timestamps, so the bucket boundaries will be wildly inconsistent across years. At least sentry and snuba agree on the rounding rules, so the intervals match ;-)
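The same truncation can be reproduced outside of snuba. The sketch below mirrors the `intDiv(ts, 1209600) * 1209600` arithmetic in plain Python; the time of day used for "now" is an assumption taken from the earlier example, and `truncate_to_bucket` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

TWO_WEEKS = 14 * 24 * 60 * 60  # 1209600 seconds, as in the snuba expression


def truncate_to_bucket(dt, bucket_seconds):
    """Integer-divide the Unix timestamp by the bucket size, then multiply back."""
    ts = int(dt.timestamp())
    return datetime.fromtimestamp(ts // bucket_seconds * bucket_seconds, tz=timezone.utc)


end = datetime(2021, 1, 12, 11, 5, 1, tzinfo=timezone.utc)  # assumed "now"
start = end - timedelta(weeks=4)

buckets = []
current = truncate_to_bucket(start, TWO_WEEKS)
while current < end:
    buckets.append(current)
    current += timedelta(seconds=TWO_WEEKS)

formatted = [b.strftime("%Y-%m-%dT%H:%M:%SZ") for b in buckets]
print(formatted)
# ['2020-12-10T00:00:00Z', '2020-12-24T00:00:00Z', '2021-01-07T00:00:00Z']
```

Because the Unix epoch (1970-01-01) fell on a Thursday, buckets that are a multiple of 7 days long always start on a Thursday, regardless of when the query is made.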

Conclusion:

- Don’t use interval > 1d, just don’t.
- I use an internal util function to do the statsPeriod parsing. Maybe it makes sense to introduce a new parameter to round to a certain interval. Though I should be very careful doing that, since I have no idea what other implications this has throughout sentry. I hope @markstory can tell me more?
- @matejminar For comparing current period with last period, you can use `statsPeriod=2w&interval=1d` for the current period, and `statsPeriodStart=4w&statsPeriodEnd=2w&interval=1d` for the previous period.
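As an illustration of the last point, the two parameter sets describe back-to-back windows of equal length. This sketch only models the date arithmetic; it assumes `statsPeriodStart` and `statsPeriodEnd` are both offsets back from "now", and `relative_range` is a hypothetical helper, not the util used in the endpoint.

```python
from datetime import datetime, timedelta, timezone


def relative_range(now, start_ago, end_ago=timedelta(0)):
    """Hypothetical helper: absolute (start, end) for offsets measured back from now."""
    return now - start_ago, now - end_ago


now = datetime(2021, 1, 12, 11, 5, 1, tzinfo=timezone.utc)

# statsPeriod=2w
current = relative_range(now, timedelta(weeks=2))
# statsPeriodStart=4w&statsPeriodEnd=2w
previous = relative_range(now, timedelta(weeks=4), timedelta(weeks=2))

print("previous:", previous[0].isoformat(), "to", previous[1].isoformat())
print("current: ", current[0].isoformat(), "to", current[1].isoformat())
```

The previous window ends exactly where the current one starts, so the two series can be compared bucket by bucket.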

markstory · 2021-01-12T15:08:29Z

> Though I should be very careful doing that, since I have no idea what other implications this has throughout sentry. I hope @markstory can tell me more?

An additional parameter should be relatively safe. Would you need to change the rounding of the interval, or would adjusting the start date to be aligned with a bucket boundary work? For example, a start of 2021-01-12 10:03:00 with a 1d bucket size could be aligned to 2021-01-12 00:00:00 (as the day is not done) or 2021-01-13 00:00:00 (to get partial results from the most recent bin). I'm partial towards aligning on the future value.

We have some precedent for moving the start dates, as we quantize start dates to 5 min boundaries in discover to improve cache efficacy.
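A rough sketch of the two alignment options, plus the 5-minute quantization mentioned above, might look like this. `align` is a hypothetical function for illustration and is not the code discover uses.

```python
from datetime import datetime, timezone


def align(dt, bucket_seconds, round_up=False):
    """Align a datetime to an epoch-based bucket boundary (down by default)."""
    ts = int(dt.timestamp())
    aligned = ts - ts % bucket_seconds
    if round_up and aligned != ts:
        aligned += bucket_seconds
    return datetime.fromtimestamp(aligned, tz=timezone.utc)


start = datetime(2021, 1, 12, 10, 3, 0, tzinfo=timezone.utc)

day_floor = align(start, 86400)                 # 2021-01-12 00:00:00
day_ceil = align(start, 86400, round_up=True)   # 2021-01-13 00:00:00
five_min = align(start, 300)                    # 2021-01-12 10:00:00

print(day_floor.isoformat(), day_ceil.isoformat(), five_min.isoformat())
```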

Swatinem · 2021-01-12T19:00:12Z

I think this should round "upward", so for example now at 19:57 on 2021-01-12, when I request 2 days' worth of 1-day intervals, it would round to 2021-01-11T00:00:00 <= X < 2021-01-13T00:00:00 (or <= 2021-01-12T23:59:59), so it has 2 days in it.
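Sketched in code, the "round upward" idea could look like the following: round the end up to the next bucket boundary, then subtract the full period. `rounded_range` is a hypothetical name and only an approximation of the behaviour described here, not the util from this PR.

```python
from datetime import datetime, timedelta, timezone


def rounded_range(now, period, interval):
    """Round the end up to the next interval boundary, then go back one full period."""
    step = int(interval.total_seconds())
    ts = int(now.timestamp())
    end_ts = (ts + step - 1) // step * step  # ceiling to a multiple of step
    end = datetime.fromtimestamp(end_ts, tz=timezone.utc)
    return end - period, end


now = datetime(2021, 1, 12, 19, 57, 0, tzinfo=timezone.utc)
start, end = rounded_range(now, timedelta(days=2), timedelta(days=1))
print(start.isoformat(), "<= X <", end.isoformat())
```

With these inputs the range is 2021-01-11T00:00:00 <= X < 2021-01-13T00:00:00, which covers exactly two 1-day buckets.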

I mean either way the rounding problem also becomes a timezone problem, since we round to UTC, so maybe expectations of people in other timezones would be different in that case, hm.
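To illustrate the timezone concern: a bucket that starts at UTC midnight starts at a different wall-clock time, and possibly on a different calendar day, for a user elsewhere. The fixed UTC-8 offset below is an arbitrary assumption chosen for the example.

```python
from datetime import datetime, timedelta, timezone

utc_bucket_start = datetime(2021, 1, 12, tzinfo=timezone.utc)

# Assumption for illustration: a user at a fixed UTC-8 offset.
local_tz = timezone(timedelta(hours=-8))
local_start = utc_bucket_start.astimezone(local_tz)

print(local_start.isoformat())  # 2021-01-11T16:00:00-08:00
```

For that user, the "2021-01-12" bucket begins at 16:00 on 2021-01-11 local time.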

markstory

It would be good to have a couple of endpoint tests that check successful queries, queries with invalid fields, and project permission problems.

markstory

Thanks for adding the endpoint tests. You may need to relax your assertions to get the tests to pass.

tests/snuba/api/endpoints/test_organization_sessions.py

Swatinem added 3 commits December 17, 2020 12:53

feat: Implement Session Stats API

01045be

This is a new API for querying sessions and has a query interface similar to discover, and returns timeseries data.

start unit testing the internal sessions API

74c493c

implement virtual groupby

65c266f

vercel bot deployed to Preview – storybook December 18, 2020 13:27 View deployment

vercel bot deployed to Preview – sentry December 18, 2020 13:27 View deployment

restrict session.duration to healthy sessions only

31b5774

vercel bot deployed to Preview – storybook December 18, 2020 13:53 View deployment

vercel bot deployed to Preview – sentry December 18, 2020 13:53 View deployment

Swatinem requested review from jan-auer and mitsuhiko December 21, 2020 10:49

Swatinem marked this pull request as ready for review December 21, 2020 10:49

Swatinem requested a review from a team as a code owner December 21, 2020 10:49

markstory reviewed Jan 4, 2021

dashed reviewed Jan 4, 2021

src/sentry/api/endpoints/organization_sessions.py (outdated, resolved)

Swatinem added 2 commits January 5, 2021 10:48

Merge branch 'master' into feat/sessions-api

51e0bdc

ensure stable tests

128b3a1

vercel bot deployed to Preview – storybook January 5, 2021 12:09 View deployment

vercel bot deployed to Preview – sentry January 5, 2021 12:09 View deployment

Merge branch 'master' into feat/sessions-api

8bcc951

vercel bot deployed to Preview – sentry January 11, 2021 13:36 View deployment

vercel bot deployed to Preview – storybook January 11, 2021 13:36 View deployment

implement groupBy project

717a32c

vercel bot deployed to Preview – storybook January 12, 2021 12:49 View deployment

vercel bot deployed to Preview – sentry January 12, 2021 12:49 View deployment

Swatinem added 2 commits January 13, 2021 13:41

add a new util to get a date range with rollup window

da5c49b

simplicfy code dealing with query rollups a bit

00d22fa

vercel bot deployed to Preview – storybook January 13, 2021 13:13 View deployment

vercel bot deployed to Preview – sentry January 13, 2021 13:13 View deployment

simplify query handling a bit

a2a5dea

vercel bot deployed to Preview – sentry January 13, 2021 14:06 View deployment

vercel bot deployed to Preview – storybook January 13, 2021 14:06 View deployment

Merge branch 'master' into feat/sessions-api

dd1c2f6

vercel bot deployed to Preview – sentry January 13, 2021 14:24 View deployment

vercel bot deployed to Preview – storybook January 13, 2021 14:24 View deployment

markstory reviewed Jan 13, 2021

add some endpoint tests

b65b134

vercel bot deployed to Preview – sentry January 14, 2021 14:35 View deployment

vercel bot deployed to Preview – storybook January 14, 2021 14:35 View deployment

markstory approved these changes Jan 14, 2021

tests/snuba/api/endpoints/test_organization_sessions.py (outdated, resolved)

Swatinem added 2 commits January 15, 2021 12:45

implement proper error handling for field mismatches

2504a51

Merge remote-tracking branch 'origin/master' into feat/sessions-api

32a0ee5

vercel bot deployed to Preview – storybook January 15, 2021 11:48 View deployment

vercel bot deployed to Preview – sentry January 15, 2021 11:48 View deployment

add missing field to test

b8914a8

vercel bot deployed to Preview – storybook January 15, 2021 12:51 View deployment

vercel bot deployed to Preview – sentry January 15, 2021 12:51 View deployment

super py2.7

4ce9657

vercel bot deployed to Preview – storybook January 15, 2021 13:08 View deployment

vercel bot deployed to Preview – sentry January 15, 2021 13:08 View deployment

Swatinem merged commit 3e70867 into master Jan 15, 2021

Swatinem deleted the feat/sessions-api branch January 15, 2021 13:29

github-actions bot locked and limited conversation to collaborators Jan 31, 2021
