Investigate business days / hours impact #210

Open
antoinecarme opened this issue Aug 30, 2022 · 4 comments
@antoinecarme
Owner

Investigate business days / hours impact.

Impact: better handling of irregular physical timestamps (business dates). Incremental change. A workaround may exist using a non-physical time equivalent.

  1. PyAF models do not take weekends or lunch breaks into account when computing future dates.
  2. This can affect date-dependent time-series models.
  3. No impact on signal transformations.
  4. Impact on time-based trends (linear, polynomial, etc.). No impact on other/stochastic trends (lag1, etc.).
  5. Impact on seasonal values (for DayOfWeek, will t+3 be a Monday or a Thursday?).
  6. No impact on AR-like models (the previous date can skip weekends).
  7. Business hours should be investigated further (the next business hour skips lunch ;).
  8. The time column will be smarter and will generate more business-friendly forecast dates and forecast values. Model explanation improves.
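
To make point 5 concrete, here is a minimal sketch (not PyAF code) comparing the day of week reached at t+3 when stepping by calendar days versus business days. The start date and horizon are arbitrary choices for illustration, and pd.offsets.BDay assumes a Monday-to-Friday week.

```python
import pandas as pd

# Illustrative values only: a Thursday as the last observed date, horizon of 3.
last_observed = pd.Timestamp("2000-01-06")
horizon = 3

# Calendar stepping: what a fixed one-day delta would produce.
calendar_date = last_observed + pd.Timedelta(days=horizon)

# Business stepping: weekends are skipped (Monday-to-Friday week assumed).
business_date = last_observed + pd.offsets.BDay(horizon)

print("calendar t+3:", calendar_date.date(), calendar_date.day_name())
print("business t+3:", business_date.date(), business_date.day_name())
```

The two dates fall on different weekdays, so a DayOfWeek seasonal would pick a different seasonal value depending on which stepping rule is used.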

Implementation impacts:

  1. The date delta is computed as the mean difference between two consecutive dates. The notion of "consecutive" will be impacted. Is the most frequent difference better than the average difference in this case?
  2. The next date value will skip some intermediate "non-business" values.
  3. For the tests, we can use pd.date_range and pd.bdate_range as time values:
import pandas as pd

# 1000 consecutive business days
pd.bdate_range('2000-1-1', periods=1000)

# 1000 consecutive business hours
pd.bdate_range('2000-1-1', periods=1000, freq='BH')
  4. Impact on plots?
  5. Activate by default?
  6. Automatic detection based on the HourOfWeek/DayOfWeek distribution?
  7. Use the pd.bdate_range implementation to compute the next date?
  8. Not sure whether this depends on the locale/country/culture, etc.
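
The questions in points 1 and 6 can be explored with a small experiment. The sketch below is only an illustration, not the PyAF implementation: it builds ten business days, compares the mean and the most frequent difference between consecutive dates, and applies a naive weekend check as a stand-in for "automatic detection". The variable names and the detection rule are assumptions.

```python
import pandas as pd

# Ten consecutive business days (two full Monday-to-Friday weeks).
dates = pd.Series(pd.bdate_range("2000-01-03", periods=10))

# Differences between consecutive dates: mostly 1 day, 3 days over a weekend.
diffs = dates.diff().dropna()

mean_delta = diffs.mean()            # pulled above 1 day by the weekend gap
modal_delta = diffs.mode().iloc[0]   # most frequent difference

# Naive detection rule (an assumption): no observation falls on Sat/Sun.
looks_like_business_days = bool((dates.dt.dayofweek < 5).all())

print("mean delta :", mean_delta)
print("modal delta:", modal_delta)
print("business-day calendar suspected:", looks_like_business_days)
```

On this data the modal difference is exactly one day while the mean is larger, which suggests the most frequent difference is the more natural step when weekends are skipped.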
@antoinecarme
Owner Author

next business day

>>> import pandas as pd
>>> pd.bdate_range('2000-1-1', periods=2)
DatetimeIndex(['2000-01-03', '2000-01-04'], dtype='datetime64[ns]', freq='B')
>>> pd.bdate_range('2000-1-2', periods=2)
DatetimeIndex(['2000-01-03', '2000-01-04'], dtype='datetime64[ns]', freq='B')
>>> pd.bdate_range('2000-1-3', periods=2)
DatetimeIndex(['2000-01-03', '2000-01-04'], dtype='datetime64[ns]', freq='B')
>>> pd.bdate_range('2000-1-4', periods=2)
DatetimeIndex(['2000-01-04', '2000-01-05'], dtype='datetime64[ns]', freq='B')


>>> lTwoNextBusinessDays = pd.bdate_range('2000-1-1', periods=2)
>>> lTwoNextBusinessDays[0]
Timestamp('2000-01-03 00:00:00', freq='B')

@antoinecarme
Owner Author

>>> import pandas as pd
>>> def next_business_day(x):
...     lNextTwoBusinessDays = pd.bdate_range(x, periods=2)
...     lDays = [d for d in lNextTwoBusinessDays if (d > pd.Timestamp(x))]
...     return lDays[0]
... 
>>> next_business_day('2000-1-1')
Timestamp('2000-01-03 00:00:00', freq='B')
>>> next_business_day('2000-1-2')
Timestamp('2000-01-03 00:00:00', freq='B')
>>> next_business_day('2000-1-3')
Timestamp('2000-01-04 00:00:00', freq='B')
>>> next_business_day('2000-1-4')
Timestamp('2000-01-05 00:00:00', freq='B')
>>> next_business_day('2000-1-5')
Timestamp('2000-01-06 00:00:00', freq='B')
>>> next_business_day('2000-1-6')
Timestamp('2000-01-07 00:00:00', freq='B')
>>> next_business_day('2000-1-7')
Timestamp('2000-01-10 00:00:00', freq='B')
>>> next_business_day('2000-1-8')
Timestamp('2000-01-10 00:00:00', freq='B')
>>> next_business_day('2000-1-9')
Timestamp('2000-01-10 00:00:00', freq='B')
>>> next_business_day('2000-1-10')
Timestamp('2000-01-11 00:00:00', freq='B')
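
A possible alternative to filtering a bdate_range is to add a pandas business-day offset directly. The sketch below assumes pd.offsets.BDay for a Monday-to-Friday week; the helper name next_business_day_offset and the Sunday-to-Thursday weekmask are hypothetical, chosen only to illustrate the locale question raised above.

```python
import pandas as pd

def next_business_day_offset(x, offset=None):
    """Return the first business day strictly after x under the given offset."""
    if offset is None:
        # Default: Monday-to-Friday business week.
        offset = pd.offsets.BDay()
    return pd.Timestamp(x) + offset

# Hypothetical calendar with a Sunday-to-Thursday working week.
sun_thu = pd.offsets.CustomBusinessDay(weekmask="Sun Mon Tue Wed Thu")

for day in ["2000-1-1", "2000-1-3", "2000-1-7"]:
    print(day,
          "->", next_business_day_offset(day).date(),
          "| Sun-Thu week ->", next_business_day_offset(day, sun_thu).date())
```

With the default offset this should agree with the dates returned by next_business_day above, while the custom weekmask shows how the "next" date shifts once the working week changes.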

@antoinecarme
Owner Author

>>> import pandas as pd
>>> 
>>> def next_business_hour(x):
...     lNextTwoBusinessHours = pd.date_range(x, periods=2, freq = 'BH')
...     lHours = [h for h in lNextTwoBusinessHours if (h > pd.Timestamp(x))]
...     print("next_business_hour" , (x , lHours[0]))
...     return lHours[0]
... 
>>> next_business_hour('2000-1-10 08:00:00')
next_business_hour ('2000-1-10 08:00:00', Timestamp('2000-01-10 09:00:00', freq='BH'))
Timestamp('2000-01-10 09:00:00', freq='BH')
>>> next_business_hour('2000-1-10 09:00:00')
next_business_hour ('2000-1-10 09:00:00', Timestamp('2000-01-10 10:00:00', freq='BH'))
Timestamp('2000-01-10 10:00:00', freq='BH')
>>> next_business_hour('2000-1-10 10:00:00')
next_business_hour ('2000-1-10 10:00:00', Timestamp('2000-01-10 11:00:00', freq='BH'))
Timestamp('2000-01-10 11:00:00', freq='BH')
>>> next_business_hour('2000-1-10 11:00:00')
next_business_hour ('2000-1-10 11:00:00', Timestamp('2000-01-10 12:00:00', freq='BH'))
Timestamp('2000-01-10 12:00:00', freq='BH')
>>> next_business_hour('2000-1-10 12:00:00')
next_business_hour ('2000-1-10 12:00:00', Timestamp('2000-01-10 13:00:00', freq='BH'))
Timestamp('2000-01-10 13:00:00', freq='BH')
>>> next_business_hour('2000-1-10 12:03:00')
next_business_hour ('2000-1-10 12:03:00', Timestamp('2000-01-10 13:03:00', freq='BH'))
Timestamp('2000-01-10 13:03:00', freq='BH')
>>> next_business_hour('2000-1-10 13:00:00')
next_business_hour ('2000-1-10 13:00:00', Timestamp('2000-01-10 14:00:00', freq='BH'))
Timestamp('2000-01-10 14:00:00', freq='BH')
>>> next_business_hour('2000-1-10 22:00:00')
next_business_hour ('2000-1-10 22:00:00', Timestamp('2000-01-11 09:00:00', freq='BH'))
Timestamp('2000-01-11 09:00:00', freq='BH')
>>> next_business_hour('2000-1-10 23:00:00')
next_business_hour ('2000-1-10 23:00:00', Timestamp('2000-01-11 09:00:00', freq='BH'))
Timestamp('2000-01-11 09:00:00', freq='BH')
>>> next_business_hour('2000-1-10 00:00:00')
next_business_hour ('2000-1-10 00:00:00', Timestamp('2000-01-10 09:00:00', freq='BH'))
Timestamp('2000-01-10 09:00:00', freq='BH')
>>> 
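
Note that in the session above 12:00 maps to 13:00, so the default business-hour frequency does not skip lunch. One way to model a lunch break, sketched here under assumptions, is a pd.offsets.BusinessHour with two intervals per day. The 09:00-12:00 and 13:00-17:00 hours and the helper name add_business_hour are illustrative, not part of PyAF.

```python
import pandas as pd

# Assumed working hours (illustrative only): 09:00-12:00 and 13:00-17:00.
LUNCH_AWARE = pd.offsets.BusinessHour(start=["09:00", "13:00"],
                                      end=["12:00", "17:00"])

def add_business_hour(x, offset=LUNCH_AWARE):
    """Advance x by one business hour, skipping time outside the intervals."""
    return pd.Timestamp(x) + offset

for stamp in ["2000-1-10 10:00:00", "2000-1-10 11:30:00", "2000-1-10 12:30:00"]:
    print(stamp, "->", add_business_hour(stamp))
```

Unlike next_business_hour above, adding an offset to a timestamp outside business hours first rolls forward to the next opening time and then adds the hour, so the two approaches are not interchangeable for off-hours inputs.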

antoinecarme self-assigned this Mar 3, 2023
@antoinecarme
Owner Author

Not sure whether this feature will be implemented. User value?

Delayed. Priority: low
