CLN: enforce deprecation of frequencies deprecated for offsets #57986
I enforced deprecation of aliases |
/preview |
Website preview of this PR available at: https://pandas.pydata.org/preview/pandas-dev/pandas/57986/ |
pandas/_libs/tslibs/offsets.pyx
Outdated
    )
    elif is_period and name.upper() in c_OFFSET_DEPR_FREQSTR:
    elif is_period and name.upper() in c_OFFSET_REMOVED_FREQSTR:
in this branch I'm seeing something like: if it's removed, then warn that it's deprecated. Is this right?
Do we need c_OFFSET_REMOVED_FREQSTR at all, or is it possible to just let it raise?
> in this branch I'm seeing something like: if it's removed, then warn that it's deprecated. Is this right?
agreed, this check looks strange. It's my mistake, I will correct it.
> Do we need c_OFFSET_REMOVED_FREQSTR at all, or is it possible to just let it raise?
I am unsure, maybe we can keep c_OFFSET_REMOVED_FREQSTR at least for a while. I think it would be good to explain, when we raise, why this frequency is invalid. Because it's possible to use this frequency in other cases, it wouldn't be entirely correct to say that it's "Invalid frequency".
Ok sure, that's a good point!
If we're going to keep it around then, maybe we can tell users which to use instead? At least for offsets?
yes, I agree. I corrected the error message. Now we advise users which offset frequency to use instead of the invalid one. I think CI failures are unrelated to my changes.
Thanks for updating!
I've left a comment, but also, in general, this still looks very complex...if we're enforcing deprecations, then this might be a good chance to simplify the logic here?
OK with adding complexity to give a good error message if someone passes freq='M' instead of 'ME', as that's probably still fairly common, but periods are far less used
The code is currently very hard to read - which is OK as a temporary phase during which we're enforcing a deprecation - but ultimately the goal should be to end up with something that's cleaner than it was when we started. Is that possible here?
pandas/_libs/tslibs/offsets.pyx
Outdated
    )
    name = c_OFFSET_DEPR_FREQSTR.get(name.upper())
    raise ValueError(INVALID_FREQ_ERR_MSG.format(name))
    name = c_OFFSET_REMOVED_FREQSTR.get(name.upper())
Are you sure about this line? It looks very odd to me to do
    if is_period and name.upper() in c_OFFSET_REMOVED_FREQSTR:
        name = c_OFFSET_REMOVED_FREQSTR.get(name.upper())
If you need to map 'M' (period) to its associated offset alias ('ME'), then is there another dictionary, or another way, to do that? c_OFFSET_REMOVED_FREQSTR will presumably be removed at some point
It looks to me that c_OFFSET_REMOVED_FREQSTR is functioning as both:
- aliases which used to be valid for offsets, but have been renamed
- period aliases, to get the corresponding offset aliases
How about creating another dictionary, which maps period aliases to their corresponding offset aliases? I think this one could be a lot smaller - for example, it wouldn't need "BY": "BYE", right?
Then, to_offset could be a lot simpler, something like
    if not is_period and name in c_OFFSET_REMOVED_FREQSTR:
        # raise error message
    if is_period:
        if name in c_PERIOD_TO_OFFSET_FREQSTR:
            name = c_PERIOD_TO_OFFSET_FREQSTR[name]
        else:
            # raise: `name` not supported for period
?
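For readers following along, the suggestion above can be written out as a small runnable sketch. Everything here is an assumption for illustration: the two tables are tiny stand-ins holding only aliases mentioned in this thread, `resolve_alias` is a hypothetical helper rather than anything in pandas, and the error wording is modelled on the message quoted later in the review.

```python
# Hypothetical stand-in tables; the real pandas dictionaries are larger.
OFFSET_REMOVED_FREQSTR = {"M": "ME", "Q": "QE", "Y": "YE", "BY": "BYE"}
# Smaller map from period aliases to offset aliases (no "BY" entry needed).
PERIOD_TO_OFFSET_FREQSTR = {"M": "ME", "Q": "QE", "Y": "YE"}


def resolve_alias(name: str, is_period: bool) -> str:
    """Return the offset alias to use for `name` (illustrative helper only)."""
    if not is_period:
        if name in OFFSET_REMOVED_FREQSTR:
            # Removed offset alias: fail, but say what to use instead.
            raise ValueError(
                f"'{name}' is no longer supported for offsets. "
                f"Please use '{OFFSET_REMOVED_FREQSTR[name]}' instead."
            )
        return name
    if name in PERIOD_TO_OFFSET_FREQSTR:
        # Period alias: translate to the corresponding offset alias.
        return PERIOD_TO_OFFSET_FREQSTR[name]
    raise ValueError(f"'{name}' is not supported as a period frequency.")
```

In this sketch `resolve_alias("M", is_period=True)` returns `"ME"`, while `resolve_alias("M", is_period=False)` raises and points the user at `"ME"`. Aliases that are spelled the same for periods and offsets would need either their own entries in the period map or a separate set, which is the open question raised further down the thread.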
> Are you sure about this line? It looks very odd to me to do
>     if is_period and name.upper() in c_OFFSET_REMOVED_FREQSTR:
>         name = c_OFFSET_REMOVED_FREQSTR.get(name.upper())
> If you need to map 'M' (period) to its associated offset alias ('ME'), then is there another dictionary, or another way, to do that? c_OFFSET_REMOVED_FREQSTR will presumably be removed at some point
thank you for the comment. It seems odd, I agree. I removed this check.
> It looks to me that c_OFFSET_REMOVED_FREQSTR is functioning as both:
> - aliases which used to be valid for offsets, but have been renamed
> - period aliases, to get the corresponding offset aliases
> How about creating another dictionary, which maps period aliases to their corresponding offset aliases? I think this one could be a lot smaller - for example, it wouldn't need "BY": "BYE", right?
> Then, to_offset could be a lot simpler, something like
>     if not is_period and name in c_OFFSET_REMOVED_FREQSTR:
>         # raise error message
>     if is_period:
>         if name in c_PERIOD_TO_OFFSET_FREQSTR:
>             name = c_PERIOD_TO_OFFSET_FREQSTR[name]
>         else:
>             # raise: `name` not supported for period
> ?
thanks, I added the dictionary PERIOD_TO_OFFSET_FREQSTR and made changes as you suggested.
Thanks for updating! A few things
- good to have the c_PERIOD_TO_OFFSET_FREQSTR dict, but I don't see why it's being used here:
      if not is_period:
          if name.upper() in c_OFFSET_REMOVED_FREQSTR:
              raise ValueError(
                  f"\'{name}\' is no longer supported for offsets. Please "
                  f"use \'{c_OFFSET_REMOVED_FREQSTR.get(name.upper())}\' "
                  f"instead."
              )
          # below we raise for lowrecase monthly and bigger frequencies
          if (name.upper() != name and
                  name.lower() not in {"h", "min", "s", "ms", "us", "ns"} and
                  name.upper() not in c_PERIOD_TO_OFFSET_FREQSTR and
                  name.upper() in c_OFFSET_TO_PERIOD_FREQSTR):
              raise ValueError(INVALID_FREQ_ERR_MSG.format(name))
  If you've already checked if not is_period, then why do you need to check if it's in c_PERIOD_TO_OFFSET_FREQSTR?
- lowrecase: typo
- is this part temporary?
      elif name in {"d", "b"}:
          name = name.upper()
      elif (name.upper() not in {"B", "D"}
              and not name.upper().startswith("W")):
  If so, could you add a comment explaining why it needs to be there, possibly linking to an open issue? Ideally we should get to the point where we can get rid of all this complexity, so let's make it clear what the road towards that endpoint is
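To make the case-sensitivity rule in the quoted snippet easier to reason about, here is a minimal sketch of the idea: sub-daily aliases stay valid in lowercase, while a lowercase spelling of a monthly-or-larger alias is rejected. The sets, the `check_case` helper, and the error text are all hypothetical stand-ins, not the pandas tables or the real `INVALID_FREQ_ERR_MSG`.

```python
# Hypothetical stand-ins, chosen only from aliases named in this thread.
LOWERCASE_OK = {"h", "min", "s", "ms", "us", "ns"}  # sub-daily, lowercase is valid
UPPERCASE_ONLY = {"ME", "QE", "YE"}                  # monthly and bigger offsets


def check_case(name: str) -> str:
    """Return `name` unchanged, or raise if it is a lowercase spelling of an
    alias that is only valid in uppercase (illustrative helper only)."""
    if name in LOWERCASE_OK:
        # Handle the always-valid lowercase aliases first so they can
        # never fall into the rejection branch below.
        return name
    if name != name.upper() and name.upper() in UPPERCASE_ONLY:
        raise ValueError(f"Invalid frequency: {name}")
    return name
```

Checking the lowercase-allowed set before anything else is one way to keep an alias such as 's' from being rejected by a later, broader condition.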
we need the check if it isn't in for example without this check |
thanks, I corrected the typo |
Yes, it's the temporary part. I left the comment below. |
thanks for explaining - is there a way to do that part without using |
but what should we do with aliases which are the same for both: period and offsets, such as After enforcing the deprecation of |
I'd suggest either that, or to add a set which contains aliases which are valid for both |
thanks, then maybe we can leave it as it is? |
cool, I think this is on the right track I think there's a logic error somewhere, as it currently gives
but 's' should definitely be supported here, right? |
xref #52064, #55792, #55553, #55496
Enforced deprecation of aliases M, Q, Y, etc. in favour of ME, QE, YE, etc. for offsets. Now the aliases are case-sensitive.
P.S. Corrected a note in v3.0.0 related to PR #57627