Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement the skip option #522

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

implement the skip option #522

wants to merge 5 commits into from

Conversation

yoch
Copy link

@yoch yoch commented Nov 14, 2017

Try to implement the SKIP option (which solves issue #149).

To be compatible with the RFC-7529 (#285), I also added RSCALE element because skip depends on it, but without the support of non-gregorian calendars.

Edit from pganssle:

Just for tracking, here's a to-do-list of in-scope cases for skip to be implemented in.

  • The start date of the recurrence has a day-of-month value greater
    than the smallest day-of-month value for any month in any year in
    the specified calendar system.
  • A "BYMONTHDAY" element in an "RRULE" has a day-of-month value
    greater than the smallest day-of-month value for any month in any
    year in the specified calendar system.
  • A combination of a "BYMONTHDAY" element and a "BYMONTH" element
    in an "RRULE" has a value corresponding to a leap day in the
    specified calendar system.
  • A "BYYEARDAY" element in an "RRULE" has an absolute value greater
    than the smallest number of days in any year in the specified
    calendar system.
  • A "BYWEEKNO" element in an "RRULE" has an absolute value greater
    than the smallest number of weeks in any year in the specified
    calendar system.

@yoch
Copy link
Author

yoch commented Nov 14, 2017

I haven't created new tests yet, but here some examples:

>>> from dateutil.rrule import *
>>> from datetime import datetime
>>> from pprint import pprint
>>> start_date = datetime(2014, 12, 31) 
>>> rr=rrule(freq=MONTHLY, count=4, dtstart=start_date); print(rr); pprint(list(rr))
DTSTART:20141231T000000
RRULE:FREQ=MONTHLY;COUNT=4
[datetime.datetime(2014, 12, 31, 0, 0),
 datetime.datetime(2015, 1, 31, 0, 0),
 datetime.datetime(2015, 3, 31, 0, 0),
 datetime.datetime(2015, 5, 31, 0, 0)]
>>> rr=rrule(freq=MONTHLY, count=4, dtstart=start_date, rscale='GREGORIAN', skip='BACKWARD'); print(rr); pprint(list(rr))
DTSTART:20141231T000000
RRULE:FREQ=MONTHLY;COUNT=4;RSCALE=GREGORIAN;SKIP=BACKWARD
[datetime.datetime(2014, 12, 31, 0, 0),
 datetime.datetime(2015, 1, 31, 0, 0),
 datetime.datetime(2015, 2, 28, 0, 0),
 datetime.datetime(2015, 3, 31, 0, 0)]
>>> rr=rrule(freq=MONTHLY, count=4, dtstart=start_date, rscale='GREGORIAN', skip='FORWARD'); print(rr); pprint(list(rr))
DTSTART:20141231T000000
RRULE:FREQ=MONTHLY;COUNT=4;RSCALE=GREGORIAN;SKIP=FORWARD
[datetime.datetime(2014, 12, 31, 0, 0),
 datetime.datetime(2015, 1, 31, 0, 0),
 datetime.datetime(2015, 3, 1, 0, 0),
 datetime.datetime(2015, 3, 31, 0, 0)]
>>> rr=rrule(freq=MONTHLY, count=4, dtstart=start_date, skip='FORWARD'); print(rr); pprint(list(rr))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/cygdrive/d/Projets/dateutil/dateutil/rrule.py", line 676, in __init__
    assert skip is None, 'SKIP is not allowed without RESCALE'
AssertionError: SKIP is not allowed without RESCALE

@yoch
Copy link
Author

yoch commented Nov 14, 2017

Important remark: this SKIP doesn't cover all the possibles cases.

The RFC give these cases :

  1. The start date of the recurrence is a leap day in the specified
    calendar system.
  2. The start date of the recurrence is in a leap month in the
    specified calendar system.
  3. The start date of the recurrence has a day-of-month value greater
    than the smallest day-of-month value for any month in any year in
    the specified calendar system.
  4. A "BYMONTHDAY" element in an "RRULE" has a day-of-month value
    greater than the smallest day-of-month value for any month in any
    year in the specified calendar system.
  5. A "BYMONTH" element in an "RRULE" has a value corresponding to a
    leap month in the specified calendar system.
  6. A combination of a "BYMONTHDAY" element and a "BYMONTH" element
    in an "RRULE" has a value corresponding to a leap day in the
    specified calendar system.
  7. A "BYYEARDAY" element in an "RRULE" has an absolute value greater
    than the smallest number of days in any year in the specified
    calendar system.
  8. A "BYWEEKNO" element in an "RRULE" has an absolute value greater
    than the smallest number of weeks in any year in the specified
    calendar system.
  1. cannot occurs because dtstart must be valid;
  2. and 5. cannot occurs in Gregorian calendar (no leap month);
  3. is adressed by this PR;
  4. , 6. , 7. and 8. need to be adressed.

Copy link
Member

@pganssle pganssle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First off, thank you very much for this contribution, it is much appreciated and a much-requested feature.

I have made a few specific notes in-line on the implementation, but with regards to rscale, I'm a bit torn as to whether that even needs to be included. From an API perspective, it seems easiest to just have None be equivalent to 'GREGORIAN' and enforce that rscale and skip must both be set when parsing with rrulestr and when emitting rrulestr. That would make the Python interface more convenient but less API-compliant, but I think we're better off with convenience here.

The biggest blocker I see on this PR, though is the lack of tests. Please add tests for this new feature that cover all new lines. For each new rrule test, also please add a _rrulestr_reverse_test that your rrule (I'd like to fix this so it's more automatic, but I don't think that's going to happen for this PR).

Thanks!

P.S. Not a blocker at all (I can do it myself), but if you can rebase against master before making changes that would be ideal).

@@ -658,6 +659,23 @@ def __init__(self, freq, dtstart=None,
self._bysecond = tuple(sorted(self._bysecond))
self._original_rule['bysecond'] = self._bysecond

# rscale and optional skip values
if rscale is not None:
assert rscale == 'GREGORIAN', 'unsupported RSCALE'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First off, I'm not comfortable with using assertions for error handling. This should raise an error of some sort unless you're just asserting some invariant of the code (consider that assertions are turned off if Python is run with python -O.

I'm of two minds about how to implement this. One thing we could do is have a fixed list of calendar types that are considered valid, even if unsupported, adding a class attribute like this:

# List taken from https://www.unicode.org/repos/cldr/tags/latest/common/bcp47/calendar.xml
VALID_RSCALES = {'GREGORIAN', 'GREGORY',
                 'BUDDHIST', 'CHINESE', 'COPTIC', 'DANGI', ... }

Then change this check to:

if rscale is not None:
    rscale = rscale.upper()
    if rscale in self.VALID_RSCALES:
        if rscale not in {'GREGORY', 'GREGORIAN'}:
            raise NotImplementedError('Non-gregorian calendar '
                                      '{} not supported.'.format(rscale))
    else:
        raise ValueError('Invalid RSCALE component: {}'.format(rscale))

Note that I also support lower and mixed-case calendar names there with the upper call. That said, an alternative to using string comparison at all is to simply have an enum-type class that enumerates the supported calendars. That seems like it might have some advantages here, but depending on the implementation it might make it harder for people to extend dateutil to support other calendars. String comparison seems like a good start.

else:
self._skip = 'OMIT'

assert self._skip in ['OMIT', 'BACKWARD', 'FORWARD'], 'invalid SKIP'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case the possible values for SKIP are probably bounded such that an enum-like structure (can't use real enums because we still support Python 2.7) would be better here. In that case, I'd find an assertion of type or value more acceptable since it's probably a superfluous check that can be omitted in optimized code.

@@ -678,6 +696,14 @@ def __str__(self):
dateutil-specific extension BYEASTER.
"""

def _fmt(obj):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this function? The other two branches you added are never hit.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this function, strings are formated uncorrectly by: parts.append(partfmt.format(name=name, vals=(','.join(str(v) for v in value)))). (The second branch never hit because I didn't add tests, but the last one may not be used.)

if day > daysinmonth:
if self._skip == 'BACKWARD':
day = daysinmonth
leap = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this called leap? Shouldn't it be called skipped? Or shifted or something?

@pganssle
Copy link
Member

@yoch Are you intending to implement 4, 6, 7 and 8 in a separate PR, or is this a "WIP" PR that you plan on updating to include those cases, or is this just a partial implementation and you're not planning on completely implementing the skip option?

I think we can include a full implementation of skip (minus 1, 2 and 5) in dateutil 2.7.0, otherwise we can defer merging this until after the 2.7.0 release so that the full implementation can be ready for the 2.8.0 release.

day = 1
month += 1
if month == 13:
month = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's not actually possible to hit this case in the Gregorian calendar when only allowing skip condition 3, since December always has 31 days.

I'd have to check the RFC rules to see if, with the new extension, you can do something like bymonthday=32 and have it fall forward. I suspect that this is unnecessary, since that should be equivalent to a bymonthday=1 rule.

If this case is forbidden by the RFC (or if this set of conditions is only ever reached for end-of-month skip conditions), just drop this whole if condition and replace it with this:

# In the Gregorian calendar, December has 31 days, so we should never find
# a skip condition that would roll over to the next year
assert month <= 12, "Skip forward logic resulted in anomalous month"

@yoch
Copy link
Author

yoch commented Nov 14, 2017

@pganssle With regard to the missing points, it turns out that it's more complex and probably need to modify the existing code more deeply, so I plan to make a separate PR, if I can.

@pganssle
Copy link
Member

With regard to the missing points, it turns out that it's more complex and probably need to modify the existing code more deeply, so I plan to make a separate PR, if I can.

Feel free to continue using this branch or to create a new PR based off of this one. I don't want to have to release with a half-done feature, so until the feature is more or less complete I wouldn't want to merge it until after the 2.7.0 release, which likely won't be for a month or so. If you finish the other parts I can merge it as part of the 2.7.0 release.

@yoch
Copy link
Author

yoch commented Nov 15, 2017

Cases 4 and 6 are now correctly handled (no idea why appveyor fails, but the errors doesn't seems related to the commit).

@@ -431,6 +431,28 @@ def testMonthlySkipForward(self):
datetime(2015, 3, 1, 0, 0),
datetime(2015, 3, 31, 0, 0)])

def testMonthlyByMonthSkipBackward(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the new stuff cover the bymonthday=30 case as well? Can you add tests for that?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet, in fact this was a fix for BYMONTH. But I've fixed it for BYMONTHDAY now.

Now I don't know how to deal with the negatives cases (do we also interpret FORWARD and BACKWARD in negative sense or not).

@pganssle
Copy link
Member

Appveyor failure was due to pytest-dev/py#169, seems fixed now.

@pganssle
Copy link
Member

@yoch I edited your first post to add a "to-do" list tracking the cases, because I was having a hard time remembering which cases were left to do. I wasn't sure if you would be able to update a separate post by me, so I edited your post so that you'd have permissions to update the list.

Thanks again for your hard work on this.

@pganssle
Copy link
Member

@yoch If you rebase against master, the master branch has a fix for the Appveyor issue.

This adds both the SKIP and RSCALE options to rrule, per 7529. They are
added together because SKIP requires RSCALE, but non-gregorian calendars
are not currently implemented.
pganssle added a commit to yoch/dateutil that referenced this pull request Feb 2, 2019
@pganssle pganssle force-pushed the rfc7529 branch 2 times, most recently from d0dabc9 to 18ae390 Compare February 2, 2019 16:52
@pganssle
Copy link
Member

pganssle commented Feb 2, 2019

I've started some refactoring on this, I'm not sure why the BYMONTHDAY and BYYEARDAY elements were checked off, because I think actually none of the byxxx rules are handled. I guess this has to be pushed to 2.9.0.

@pganssle pganssle modified the milestones: 2.8.0, 2.9.0 Feb 2, 2019
Since we do not actually have support for RSCALE in the rrule class,
it is premature to allow users to specify it. Instead, we will validate
it if present in strings and emit it if necessary to comply with the RFC
but otherwise postpone an explicit API for handling RSCALE until after
real support is added.
Rather than using hard-code strings, we will use dedicated objects for
the skip options, similar to the way other special values are handled.
@mojimi
Copy link

mojimi commented Jul 17, 2020

What can be done to help move this forward? Highly needed feature

immerrr pushed a commit to immerrr/dateutil that referenced this pull request Mar 31, 2021
@immerrr
Copy link

immerrr commented Mar 31, 2021

@pganssle I have taken a stab at implementing the skipping in 2dfd4a7. It fixes the failing tests that were added to this branch in 39d51f6

My change is somewhat rough: it may raise errors on bymonthday=32 and does not handle negative bymonthdays at all, but it shows the general idea. WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants