Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Week 0 labeled as week 52 of prior year #2651

Open
domoritz opened this issue May 24, 2020 · 7 comments
Open

Week 0 labeled as week 52 of prior year #2651

domoritz opened this issue May 24, 2020 · 7 comments
Labels
discussion For discussing proposed changes

Comments

@domoritz
Copy link
Member

Time units are useful for binning temporal data. Vega implements time units by converting ranges of time into the same timestamps for a time unit. For example, when using the month time unit, all dates with the same month should end up in the same time unit bin.

However, this does not work correctly for week time unit. In the specification below, two timestamps become different time unit bins although they are both in week 52. The issue with this behavior is that week time units cannot be used as a group by (e.g. in vega/vega-lite#6526).

{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "height": 20,
  "data": [
    {
      "name": "data",
      "values": [
        {"date": 1356249600000},
        {"date": 1357027200000}
      ],
      "transform": [
        {
          "field": "date",
          "type": "timeunit",
          "units": ["week"],
          "as": ["week", "week_end"]
        }
      ]
    }
  ],
  "scales": [
    {
      "name": "y",
      "type": "point",
      "domain": {"data": "data", "field": "week"},
      "range": "height"
    }
  ],
  "axes": [
    {"scale": "y", "orient": "left", "format": "%Y W%U", "formatType": "time"}
  ]
}
@domoritz domoritz added the bug For bugs or other software errors label May 24, 2020
@roying
Copy link

roying commented May 24, 2020

Vega transform timeunit "week" produces start dates of weeks (Sundays).
https://vega.github.io/vega/docs/transforms/timeunit/

If you want to group data across years by week number,
why not just use Vega expression week() to convert the start date to week number?
https://vega.github.io/vega/docs/expressions/#datetime-functions

@domoritz
Copy link
Member Author

We are using the time unit transform in Vega-Lite for all other time units already so I'd prefer not to special case week. Another issue is formatting because time unit produces timestamps that can be formatted. For example, if you use time unit week and hour, we would not get a timestamp and thus cannot format it in the axis anymore.

The docs also say that "Note that the output dates default to the year 2012", which is not true in the case above where one date ends up in 2011.

@domoritz domoritz added the blocking VL The issue is blocking Vega-Lite label May 24, 2020
@jheer
Copy link
Member

jheer commented May 24, 2020

I'm not (yet) convinced there is a problem. In the example above, the two dates are indeed in different weeks in both of our time zones. See code snippets below.

Also keep in mind that week numbers have "interesting" conventions based on what constitutes week 0 versus week 1, as discussed in the d3-time-format docs: "all days in a new year preceding the first Sunday are considered to be in week 0". Thus week zero, when floored to the beginning of that week, can indeed be in the previous year.

// local time zone (for me, CET)
[1356249600000, 1357027200000]
  .map(d => new Date(d))
  .map(d3.timeFormat('%U'))
// output: ["52", "00"]
// offset by 9 hours time zone (for me, map to PST)
[1356249600000, 1357027200000]
  .map(d => new Date(d - 9*60*60*1000))
  .map(d3.timeFormat('%U'))
// output: ["52", "00"]

@domoritz
Copy link
Member Author

domoritz commented May 24, 2020

If I run the spec, I get two times week 52.

image

The issue I'm facing is that in the example below, I get two week 52s (the example only works after vega/vega-lite#6526, btw). When time binning by week, I would expect every week to only appear once.

In the transformed data (post time unit), I get one date in 2011 and the rest in 2012.

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {"url": "data/seattle-weather.csv"},
  "transform": [{
    "filter": "year(datum.date) > 2013"
  }],
  "mark": "bar",
  "encoding": {
    "x": {
      "timeUnit": "week",
      "field": "date",
      "type": "ordinal"
    },
    "y": {
      "aggregate": "mean",
      "field": "precipitation",
      "type": "quantitative"
    }
  }
}

image

@jheer
Copy link
Member

jheer commented May 24, 2020

Week 0 for 2012 is the same as Week 52 for 2011. You get 52 for 2011 because the date is floored to the beginning of that week. (Well, technically there is no week 0 for 2012 since it starts on a Sunday, but the same logic applies, given the week 0 value from 2013.)

The timeunit transform appears to be working as designed. However, I'm happy to consider possible changes if you have a concrete proposal. I agree it would be nice to have a W00 label in your example rather than W52 from the prior year, but I have to think more to see if there is an elegant solution.

@domoritz
Copy link
Member Author

Hmm, that's tricky. We need a year that has both a week 0 and a week 53.

Luckily, 2000 has 54 weeks. Unfortunately, it starts on a Saturday and ends on a Sunday so we can't just switch to snapping time units to Saturdays (and it's already a leap year so we are not getting a longer year).

So, we have to give up either the fact that time unit bins are at regular intervals or have repeating labels.

The cleanest solution would be some way to label W52 in 2011 as W00 in 2012. Anyway, I see now why Vega is doing the right thing here.

@domoritz domoritz added discussion For discussing proposed changes and removed blocking VL The issue is blocking Vega-Lite bug For bugs or other software errors labels May 24, 2020
@domoritz domoritz changed the title Week time units are not binning correctly Week time units can be labeled as the same week 52 despite being different weeks May 24, 2020
@roying
Copy link

roying commented May 25, 2020

IMHO, it is a bad idea to coerce all dates to default (leap) year 2012 by using transform timeunit with ["week"] (instead of ["year", "week"]).

For data from non-leap years, the aggregation around week of Feb 28 will only include 6 days (instead of 7) because there is no data for Feb 29 for non-leap years.

The results will be difficult to validate and may contain a subtle error that is difficult to debug.

As indicated earlier, creating a new field with week() will give the correct week number (including week "0").

@jheer jheer changed the title Week time units can be labeled as the same week 52 despite being different weeks Week 0 labeled as week 52 of prior year May 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion For discussing proposed changes
Projects
None yet
Development

No branches or pull requests

3 participants