Calendarise self-defined date-times (e.g. business days and time) and respect structural missingness #18

Open
earowang opened this issue Apr 22, 2018 · 6 comments
Labels
roadmap future implementation

Comments

@earowang
Member

earowang commented Apr 22, 2018

tsibble is designed to work with many types of index objects, as long as S3 methods such as index_valid() and pull_interval() are defined for the custom index class (for example, the timeDate class); fill_na() then works out of the box.

Trading/business hours differ from one market or store to another, and no data are recorded outside those hours. However, fill_na() inserts NA for the non-trading times, because tsibble treats the index as regular calendar periods. tsibble needs a more general class that handles custom time ranges and respects these structurally missing observations.
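To make the problem concrete, here is a minimal base R sketch (it does not call tsibble). It assumes two trading days of five-minute stamps between 7:00am and 9:05pm, as in the calls example, and counts how many overnight slots a calendar-agnostic fill would create. The helper name day_stamps() is made up for illustration.

```r
# Five-minute stamps for one trading day, 07:00 to 21:05 inclusive (UTC).
# `day_stamps()` is a hypothetical helper, not part of tsibble.
day_stamps <- function(day) {
  seq(
    as.POSIXct(paste(day, "07:00:00"), tz = "UTC"),
    as.POSIXct(paste(day, "21:05:00"), tz = "UTC"),
    by = "5 min"
  )
}

# What is actually recorded: Monday and Tuesday, trading hours only.
observed <- c(day_stamps("2003-03-03"), day_stamps("2003-03-04"))

# What a calendar-agnostic fill spans: every five minutes, overnight included.
full <- seq(min(observed), max(observed), by = "5 min")

length(observed)                 # 340 recorded slots
length(full) - length(observed)  # 118 overnight slots that would be filled with NA
```

Those 118 slots are structurally missing rather than missing observations, which is the distinction a custom calendar would need to carry.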

For example, the data set calls from the fpp2 package contains five-minute call volume handled on weekdays between 7:00am and 9:05pm, from 3 March 2003 to 23 May 2003. A possible interface may look like this?

Define your own calendar function:

my_cal <- calendarise(
  # a typical business day starts from 7am and ends at 9:05pm w/o breaks
  from = "07:00:00", 
  to = "21:05:00",
  `break` = NULL, # backticks needed: break is a reserved word in R
  # set Sat and Sun as no working day
  wday = exclude(6:7),
  # set a particular date as no working day
  date = exclude("2003-04-21") # Easter break 
)

Then apply it to a vector of date-times, and tsibble respects the missing time gaps:

as_tsibble(fpp2::call2, index = my_cal(index))
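One way to read the proposal is as a function factory: the calendar definition is captured once and returns a function that can be applied to any vector of date-times. The sketch below is only an assumption about how that could work in base R. The name make_calendar() and its arguments exclude_wday and exclude_date are hypothetical simplifications of calendarise() and exclude(), breaks within a day are not handled, and the returned function only flags which stamps fall inside business time.

```r
# Hypothetical, simplified stand-in for the proposed calendarise().
# Returns a predicate: TRUE where a date-time falls inside business time.
make_calendar <- function(from, to,
                          exclude_wday = integer(),
                          exclude_date = character()) {
  force(from)
  force(to)
  function(x) {
    stopifnot(inherits(x, "POSIXct"))
    clock <- format(x, "%H:%M:%S")        # zero-padded, so string comparison is safe
    wday  <- as.integer(format(x, "%u"))  # ISO weekday: 1 = Monday, ..., 7 = Sunday
    day   <- format(x, "%Y-%m-%d")
    clock >= from & clock <= to &
      !(wday %in% exclude_wday) &
      !(day %in% exclude_date)
  }
}

my_cal <- make_calendar(
  from = "07:00:00",
  to = "21:05:00",
  exclude_wday = 6:7,           # Saturday and Sunday
  exclude_date = "2003-04-21"   # Easter break
)

x <- as.POSIXct(
  c("2003-03-03 07:00:00",   # Monday, opening time
    "2003-03-03 21:10:00",   # Monday, after closing
    "2003-03-08 10:00:00",   # Saturday
    "2003-04-21 10:00:00"),  # excluded date
  tz = "UTC"
)
my_cal(x)
#> [1]  TRUE FALSE FALSE FALSE
```

A predicate like this would be enough to tell structural gaps apart from genuinely missing observations; what my_cal() should return in the real interface is left open in the proposal.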

How others handle custom business days and hours:

@earowang earowang added the roadmap future implementation label Apr 22, 2018
@earowang earowang changed the title How to specify and support custom time ranges (e.g. business days and time) Calendarise self-defined date-times (e.g. business days and time) Apr 22, 2018
@earowang earowang changed the title Calendarise self-defined date-times (e.g. business days and time) Calendarise self-defined date-times (e.g. business days and time) and respect missing observations Apr 22, 2018
@earowang earowang changed the title Calendarise self-defined date-times (e.g. business days and time) and respect missing observations Calendarise self-defined date-times (e.g. business days and time) and respect structural missingness Aug 17, 2018
@DavisVaughan

DavisVaughan commented Aug 27, 2018

I've also got some work done on exporting a subset of QuantLib that handles calendar dates and holidays. It's not done yet, and also doesn't really handle intraday systems yet, but I want it to.

https://github.com/DavisVaughan/calendarrr

There is also RQuantLib, but it's massive and a pain to install. This is self-contained.

It doesn't have native support for excluding weekends, but I don't think it would be too difficult to add. There is actually this "bespoke calendar" that starts with no holidays and no weekends defined, and the user can define what they are. This is quite useful as a base calendar. https://github.com/lballabio/QuantLib/blob/master/ql/time/calendars/bespokecalendar.cpp

I could see how you could attach a calendar object to a tsibble and then it knows how to adjust the calculations based on the holidays and excluded days from that calendar.

@earowang
Member Author

I was poking around calendarrr yesterday. It's a good starting point, although it doesn't provide time adjustment within a day. Maybe you wanna share some of your thoughts here.

@DavisVaughan

On one hand, I'd like to modify the QuantLib source directly to make the internal adjustments needed to mark times within a day as not allowed (to model trading hours or something similar). The core of that problem would be defining (in C++) isAllowedTime() (this can be named whatever) for the base calendar, adding adjustments if necessary for a few other calendars that inherit from it, and then altering adjust() to adjust first for holidays and second for intraday hours. There might also be some work required in adjusting the advance() method, but I'm not sure yet.

The downside of this is that if QuantLib changes and we want to merge in that new code, it's not as straightforward as a copy-paste, because we have changed their source code directly.

I'm not sure if there is a good way to add new methods to the classes that are already there without modifying their code directly, but that would be ideal.

Alternatively, this is something that the Quantlib team might be interested in, so they might be open to having this in quantlib directly.

@earowang
Member Author

I'm also interested in knowing whether it's possible to vectorise cal_advance() so that it takes a vector of n, and to have seq() and the arithmetic operators +/- work with calendarrr.

To incorporate a calendar into the tsibble framework, I suppose a new argument calendar = NULL is needed in build_tsibble(), and hence a new attribute calendar in the tbl_ts.

@DavisVaughan

Does it make much sense to vectorize both dates and n in cal_advance()? How would this behave?
cal_advance(Sys.Date() + 1:2, n = c(1, 2) )

For seq(), I think we could provide a limited interface to the Schedule class.
https://www.quantlib.org/slides/dima-ql-intro-1.pdf
Slide 32

For + and -, I've been thinking about how these could (should?) work with vectors. The only thing I've come up with is to create a new data type, call it Date_cal, that would have a calendar as an attribute on the vector. Then Date_cal + 2 would know where the holidays are and would default to adding 2 days. Could also do Date_cal + months(2) from lubridate. I don't particularly like this though.

If tbl_ts was the object that had the calendar attribute, then in theory the date vector would not need to have the attribute on it as well. Especially for mutate() calls where this would be most useful. To me, this makes the most sense, because the entire tbl_ts object is what has the calendar associated with it. The index vector is just an index vector, so I'd expect tbl_ts$index to just return a Date object, not a special Date_cal thing.
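A rough sketch of the table-level option, in base R only. A plain data frame stands in for a tbl_ts, and both the calendar attribute and the helper next_business_day() are hypothetical; neither exists in tsibble or calendarrr. The point is that the index column stays an ordinary Date vector while calendar-aware helpers read the holidays from the table.

```r
# A plain data frame stands in for a tbl_ts; the `calendar` attribute is hypothetical.
calls <- data.frame(index = as.Date(c("2003-04-17", "2003-04-18")))
attr(calls, "calendar") <- list(holidays = as.Date("2003-04-21"))

# Hypothetical helper: the next business day after each index value,
# skipping weekends and the holidays stored on the table.
next_business_day <- function(tbl) {
  holidays <- attr(tbl, "calendar")$holidays
  step_one <- function(d) {
    repeat {
      d <- d + 1
      is_weekend <- format(d, "%u") %in% c("6", "7")  # ISO weekday, 6 = Sat, 7 = Sun
      if (!is_weekend && !(d %in% holidays)) return(d)
    }
  }
  out <- tbl$index
  for (i in seq_along(out)) out[i] <- step_one(out[i])
  out
}

class(calls$index)
#> [1] "Date"
next_business_day(calls)
#> [1] "2003-04-18" "2003-04-22"
```

The second value skips Saturday, Sunday and the Monday holiday. With this layout no Date_cal class is needed, but plain + and - on the index remain calendar-unaware, which is the trade-off being weighed here.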

I'm a bit torn on what to do for this + / - implementation because of this.

@earowang
Member Author

earowang commented Aug 28, 2018

Probably vectorize both as in +.Date, but give an error if they are not of the same length, instead of a warning?

x <- Sys.Date() + 1:2
x
#> [1] "2018-08-28" "2018-08-29"
x + 1
#> [1] "2018-08-29" "2018-08-30"
x + 1:2
#> [1] "2018-08-29" "2018-08-31"
x + 1:3
#> Warning in unclass(e1) + unclass(e2): longer object length is not a
#> multiple of shorter object length
#> [1] "2018-08-29" "2018-08-31" "2018-08-31"
x[1] + 1:3
#> [1] "2018-08-29" "2018-08-30" "2018-08-31"
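The stricter rule could be sketched as follows. This is only an illustration of the length check on plain Dates: add_days_strict() is a made-up name, it has no calendar logic, and allowing length-one recycling on either side is an assumption based on the x[1] + 1:3 case above.

```r
# Hypothetical wrapper around Date addition with a strict recycling rule:
# lengths must match, or one side must have length 1; anything else is an error.
add_days_strict <- function(dates, n) {
  len <- c(length(dates), length(n))
  if (len[1] != len[2] && !any(len == 1L)) {
    stop(
      "`dates` and `n` must have the same length, or one of them length 1.",
      call. = FALSE
    )
  }
  dates + n
}

x <- as.Date("2018-08-27") + 1:2

add_days_strict(x, 1:2)
#> [1] "2018-08-29" "2018-08-31"
add_days_strict(x[1], 1:3)
#> [1] "2018-08-29" "2018-08-30" "2018-08-31"

# Mismatched lengths now fail instead of recycling with a warning.
result <- try(add_days_strict(x, 1:3), silent = TRUE)
inherits(result, "try-error")
#> [1] TRUE
```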

fill_na() and lag/lead/difference all need to rely on the calendar. fill_na() currently relies on seq() and arithmetic operators to generate the full time sequence:

seq_generator <- function(x) {

I'd rather not go for an if-else statement (if !is.null(calendar)) that then calls lots of specialist functions from other packages, which would complicate the code.
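One possible way to avoid the branch is to let S3 dispatch on the calendar choose the sequence generator. The sketch below is an assumption, not how tsibble works: the generic full_index() and the class business_hours are invented for illustration, and only opening and closing times are modelled.

```r
# Hypothetical generic: dispatch on the calendar so callers need no if-else.
full_index <- function(calendar, from, to, by) UseMethod("full_index")

# No calendar (NULL) or an unrecognised one: the plain regular sequence.
full_index.default <- function(calendar, from, to, by) {
  seq(from, to, by = by)
}

# A business-hours calendar: the same sequence, minus out-of-hours stamps.
full_index.business_hours <- function(calendar, from, to, by) {
  x <- seq(from, to, by = by)
  clock <- format(x, "%H:%M:%S")  # zero-padded, so string comparison is safe
  x[clock >= calendar$open & clock <= calendar$close]
}

cal <- structure(
  list(open = "07:00:00", close = "21:05:00"),
  class = "business_hours"
)
from <- as.POSIXct("2003-03-03 07:00:00", tz = "UTC")
to   <- as.POSIXct("2003-03-04 21:05:00", tz = "UTC")

length(full_index(NULL, from, to, by = "5 min"))
#> [1] 458
length(full_index(cal, from, to, by = "5 min"))
#> [1] 340
```

A calendar = NULL default then falls through to the existing behaviour, and each new calendar type adds a method rather than another branch.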

Also, it would be nice to set a default calendar as a global option, like what bizdays does here.
