New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timestamp methods #17876

Closed
jbrockmendel opened this Issue Oct 15, 2017 · 11 comments

Comments

Projects
None yet
4 participants
@jbrockmendel
Member

jbrockmendel commented Oct 15, 2017

[ ] - Timestamp classmethods should return cls, not hard-coded Timestamp
[ ] - ... if and only if is_timestamp is updated to recognized subclasses.
[ ] - Timestamp._get_field does not need to be in the user-facing namespace
[x] - Timestamp._get_named_field does not need to be in the user-facing namespace
[ ] - Timestamp._get_start_end_field does not need to be in the user-facing namespace
[ ] - NaT / timedelta should behave like NaT / Timedelta #17955
[ ] - _assert_tzawareness_compat doesn't actually return the declared return type
[ ] - Is it the case that _localize_tso argument obj will always have obj.tzinfo == None? This is the case in all existing usages.

A few quick questions:

  1. __new__ returns Timestamp instead of cls. Is this intentional? Ditto Timestamp.replace returning Timestamp instead of self.__class__.

    • This might be solved by making create_timestamp_from_ts a classmethod. Maybe this function was written before cython supported cdef classmethods?
  2. Similarly, is_timestamp will miss subclasses of Timestamp. That is appreciably faster than isinstance(obj, Timestamp), so this may just be something we have to live with.

  3. Some Timestamp methods are cpdef but may not need to be user-facing, i.e. could just be cdef: _get_field, _get_named_field, _get_start_end_field

  4. Timestamp._get_named_field is only used once, for weekday_name. This calls

    val = self._maybe_convert_value_to_local()
    out = get_date_name_field(np.array([val], dtype=np.int64), field)
    return out[0]

which seems like huge overkill for return {0: 'Monday', ..., 6: 'Sunday'}[self.weekday()]. Was there a decision at some point that the latter is less performant?

@gfyoung gfyoung added the Internals label Oct 15, 2017

@jreback

This comment has been minimized.

Contributor

jreback commented Oct 15, 2017

4. is not locale safe (and less general); imagine returning month names for example
1. prob could be changed
2. not sure this actually matters
3. prob could be changed

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Oct 15, 2017

Yah, definitely all super-low priority.

Not sure about your numbering scheme, except for 4. locale-safeness would be a good reason for the status-quo, but the current version also hard-codes English names.

In a quick %timeit comparison, using a dict lookup is ~3x faster.

@jorisvandenbossche

This comment has been minimized.

Member

jorisvandenbossche commented Oct 16, 2017

@jbrockmendel fixed the numbering (sometimes it is a bit annoying that you can write whathever number in markdown and it fixes it for you :-))

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Oct 17, 2017

Related enough:

>>> pd.NaT / pd.Timedelta(seconds=1)
nan
>>> pd.NaT / pd.Timedelta(seconds=1).to_pytimedelta()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for /: 'NaTType' and 'datetime.timedelta'
  1. Treating NaT/timedelta the same as NaT/Timedelta would amount to changing an isinstance(other, Timedelta) check to isinstance(other, timedelta) (or actually, a more performant C version). Is treating these differently an intentional choice?
@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Oct 17, 2017

  1. prob could be changed
  2. not sure this actually matters

Fair enough. Changing 1 without changing 2 is liable to cause headaches down the road, so I'm going to mentally mark these as "wont fix"

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Oct 17, 2017

Two more:

  1. _localize_tso docstring reads Take a TSObject in UTC and localizes to timezone tz.. In practice I think obj.tzinfo is always None on the way in, but there are no checks to that effect. Is there any scenario in which this function would be called and obj.tzinfo would be non-null (and non-UTC)?

  2. Timestamp method cdef int _assert_tzawareness_compat(_Timestamp self, object other) except -1: never actually returns an int, so the return type can be removed -- and with it the ugly "except -1" clause. Sidenote: the cython people tell me that declaring _Timestamp self is perfunctory.

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Oct 22, 2017

Updated numbering scheme above. 1-4 have been answered. 5) will be actionable if I get the go-ahead.

@jreback

This comment has been minimized.

Contributor

jreback commented Oct 23, 2017

pd.NaT / pd.Timedelta(seconds=1).to_pytimedelta()
Traceback (most recent call last):
File "", line 1, in
TypeError: unsupported operand type(s) for /: 'NaTType' and 'datetime.timedelta'

should probably work, I think if we can de-couple NaT from datetime/timedelta directly (but make it act like duck), then a few things will be easier

@jbrockmendel jbrockmendel referenced this issue Oct 23, 2017

Merged

nat division by timedelta #17955

3 of 4 tasks complete
@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Oct 23, 2017

should probably work, I think if we can de-couple NaT from datetime/timedelta directly (but make it act like duck), then a few things will be easier

OK. Implementing this in #17955. Regardless of the end-game for de-coupling NaT, #17793 should be a step in the right direction.

@jreback

This comment has been minimized.

Contributor

jreback commented Oct 23, 2017

can you update top section with check boxes as needed (then can reference the closing PR).

@jbrockmendel jbrockmendel referenced this issue Oct 27, 2017

Open

tslibs TODO list #17652

28 of 59 tasks complete
@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Oct 27, 2017

Just ported this list above over to the TODO list in #17652; this can be closed if you want to de-clutter the tracker.

@jreback jreback closed this Oct 27, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment