Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aggregation by time interval should respect the natural boundaries of the argument #387

Closed
wants to merge 1 commit into from

Conversation

jvshahid
Copy link
Contributor

@jvshahid jvshahid commented Oct 9, 2014

minute, hour are straight forward. Grouping by 1d should make sure we bucket points between 00:00 UTC (midnight of consecutive days), 1w/7d should bucket points between consecutive mondays, etc. This story should deal with utc only, handling different timezones will be handled in a different issue.

@jurriaan
Copy link

jurriaan commented Apr 8, 2014

What about months and years?

@jvshahid
Copy link
Contributor Author

jvshahid commented Apr 8, 2014

That too

@annismckenzie
Copy link

+1 for also including months and years so grouping by month includes the values between the first and the last day in that month. That would be a huuuuuuge boon. I'm currently rolling group by time(1d)and doing the rollups manually. Continuous queries do help a bit here but it's still a pain and also quite slow when the time isn't restricted for these queries.

@EggDice
Copy link

EggDice commented Aug 19, 2014

So right now what are the boundaries for 1w? On our system it gives back data between Thursdays. Is there a way to change this?

@ernoaapa
Copy link

Having similar problems when grouping by week. I would need a monday to sunday grouping when group by time(1w)

@toddboom toddboom added this to the 0.8.4 milestone Oct 9, 2014
@toddboom toddboom self-assigned this Oct 9, 2014
@jvshahid
Copy link
Contributor Author

jvshahid commented Oct 9, 2014

@toddboom is currently working on this story, I'm just commenting on the issue to let everyone know.

@toddboom toddboom force-pushed the fix-387-logical-time-boundaries branch 2 times, most recently from 35fb30b to a286f5f Compare October 9, 2014 21:36
toddboom added a commit that referenced this pull request Oct 9, 2014
@toddboom
Copy link
Contributor

toddboom commented Oct 9, 2014

This is almost done. I just need to clean up the test cases for month and year boundaries. (And fix some broken tests.)

@annismckenzie
Copy link

Best news I've heard all week! 👯

@toddboom
Copy link
Contributor

Ok, I think this guy is ready to go.

The only caveat is that weeks start on Sunday now, which might be frustrating for folks who wanted it to start on Monday. Don't worry, though - I'm going to open up a new issue to make this configurable.

@@ -12,9 +12,11 @@ import (

// Returns the parsed duration in nanoseconds, support 'u', 's', 'm',
// 'h', 'd' and 'w' suffixes.
func ParseTimeDuration(value string) (int64, error) {
func ParseTimeDuration(value string) (int64, bool, error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the documentation to explain what the extra return value means

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bool return value is also used a few times and we ignore it in most places, should this be a separate function ? Just a thought, I don't have a strong opinion

@jvshahid
Copy link
Contributor Author

Other than my comments above, i think the pr is gtg

@jvshahid jvshahid added review and removed working labels Oct 10, 2014
@toddboom toddboom force-pushed the fix-387-logical-time-boundaries branch from cd99682 to 809d6f3 Compare October 10, 2014 19:23
@jvshahid jvshahid removed the 3 - Done label Oct 14, 2014
@toddboom toddboom closed this in 9f89aae Oct 20, 2014
@toddboom toddboom removed the review label Oct 20, 2014
@toddboom toddboom deleted the fix-387-logical-time-boundaries branch March 25, 2015 19:36
@jjfazio
Copy link

jjfazio commented Jun 26, 2015

@toddboom @jvshahid I don't see the support for month and year grouping in the documentation. Per V 0.9.0 -"The time function takes the time interval which can be in microseconds, seconds, minutes, hours, days or weeks. To specify the units you can use the respective suffix u, s, m, h, d and w." Is this a mistake or is month and year grouping still not supported?

@pkaeding
Copy link

@jjfazio the problem with month and year buckets is that they are not a uniform length. This would mean that some buckets would be a different size from other buckets, which may lead to misleading results.

EDIT: Oops, I was a little too quick to reply here. It looks like this is addressed in the PR?

@jamhall
Copy link

jamhall commented Aug 9, 2015

So, is there any news on this? Has it been merged? Is it currently on hold? Is it part of the 9.3 release? What's the current status? For my team and I, we absolutely cannot use influxdb as a time series database without support for year and month grouping. Thanks in advance.

EDIT: I've looked at the changelog and it says this has been released in v0.8.4-rc.1, however when trying month or year, it does not work.

Example:

group by time(y) returns ERR: time dimension must have one duration argument or

group by time(1y) returns ERR: error parsing query: found y, expected ) at line 1, char 83

I'm using v0.9.2.

EDIT 2: It looks like the year and month functionality was removed. What was the reasoning for this? Thank you!

@beckettsean
Copy link
Contributor

@jamhall version 0.8 and 0.9 are almost entirely different products. 0.8 was the final version of the older schema and cluster architecture.

There is no month or year grouping in InfluxDB 0.9 and it is not currently on the roadmap. I'm unable to find a current open feature request for 0.9, so I encourage you to open one.

@josepablocam
Copy link

@beckettsean oh no! I just found this. I was looking forward to doing some work with influxdb, but for my area monthly/yearly groupings are pretty critical. Is there an easy way around this, aside from writing out columns with the time dimensions that I might usually use?

@osallou
Copy link

osallou commented Aug 18, 2015

You can make queries in date range and aggregate yourself , but it is more
work....

Le mar. 18 août 2015 06:18, josepablocam notifications@github.com a
écrit :

@beckettsean https://github.com/beckettsean oh no! I just found this. I
was looking forward to doing some work with influxdb, but for my area
monthly/yearly groupings are pretty critical. Is there an easy way around
this, aside from writing out columns with the time dimensions that I might
usually use?


Reply to this email directly or view it on GitHub
#387 (comment).

@beckettsean
Copy link
Contributor

@josepablocam there are no simple workarounds, no. The GROUP BY clause only takes intervals of time, not explicit time boundaries, and nothing in the system equates to a month or a year, since they are irregular concepts.

@rodxavier
Copy link

I'm on 0.9.3 and I'm trying to group by week but it seems that each grouping starts on a Thursday? Is there any way to change this behaviour to either Monday or Sunday?

@beckettsean
Copy link
Contributor

@rodxavier there is no current feature to change when 1w periods run. See #1837 for more information.

@radeno
Copy link

radeno commented Aug 2, 2016

Was this issue fixed or not?

We have used influx 1.0.0.beta3 and problem still appears.

Example:
Start date is Monday.

WHERE time >= '2015-07-27T00:00:00.000Z' AND time <= '2016-07-27T23:59:59.000Z' GROUP BY time(1w)

returns first result as 2015-07-23T00:00:00Z which is Thursday week before.

Same results are returned when instead time(1w) is used time(7d) what is different, i think. Time window is not calendar week, but simple 7 days.

@hariteritreee
Copy link

when I group by time(1w). I still see week starting from Thursday . I want week to start from sunday or monday. Please let me know how to get it done. I spent 4,5 hours trying to figure out. mostly all issues retargeting here.

SELECT SUM(value) FROM measurement_name WHERE time >= '2016-10-09T18:30:00Z' AND time < '2016-11-07T18:29:59Z' GROUP BY time(1w)

Please reply...

@radeno
Copy link

radeno commented Nov 7, 2016

This ugly bug can be temporary fixed by adding offset into grouping by time:

To get from monday:
SELECT SUM(value) FROM measurement_name WHERE time >= '2016-10-09T18:30:00Z' AND time < '2016-11-07T18:29:59Z' GROUP BY time(1w, 4d)

@hariteritreee
Copy link

Thanks @radeno ...

@hariteritreee
Copy link

@radeno any idea why below query

SELECT SUM(value) FROM report WHERE channel='email' And time >= '2016-11-08T18:30:00Z' AND time < '2016-11-10T07:48:50.743524Z' GROUP BY time(1h)

gives time less than start time interval?

for example :)
time sum
1478628000000000000 0
1478631600000000000 0
1478635200000000000 0

@beckettsean
Copy link
Contributor

@hariteritreee please use the influxdb@googlegroups.com mailing list for questions. We try to save GHI for feature requests and bug reports.

The answer to your question is in the docs. Please review the GROUP BY time() section of the recently updated Data Exploration page.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet