Fix docstrings, error messages #19672

jbrockmendel · 2018-02-13T05:43:38Z

The discussion in #19558 makes me think that getting PeriodArray and DatetimeArray implemented sooner rather than later will make things easier for other goals. In preparation for that, this PR goes through (most of) the methods that will need to be ported and adapts their error messages to be class-agnostic and to use str.format syntax.

Also fixes a couple of copy/paste typos in docstrings.

jorisvandenbossche

Added a few comments, but in general I would defer this until we actually change things, instead of already trying to be "prepared" or "robust" for such changes without knowing exactly how they will be done in practice.

I would rather put effort into actually finding out how we can do such a refactor to dispatch ops to Array-likes, backing both ops for Series/Block and Index. But that is something you have to sync closely with @TomAugspurger on, to make sure you don't conflict with what he is doing (and I think help is certainly welcome! We just need to make sure we are not doing duplicate work).

jorisvandenbossche · 2018-02-13T09:05:50Z

pandas/core/indexes/datetimelike.py

@@ -794,7 +796,7 @@ def isin(self, values):

    def shift(self, n, freq=None):
        """
-        Specialized shift which produces a DatetimeIndex
+        Specialized shift which produces a new object of the same type as self


I don't think this is a useful description for users (they don't necessarily know what 'self' is)

This is a good note. But the reason for this change isn't future-looking; this is also the docstring for TimedeltaIndex.shift.

jorisvandenbossche · 2018-02-13T10:19:41Z

pandas/core/indexes/datetimelike.py

-                                .format(typ=type(other)))
+                raise TypeError("cannot add {cls} and {typ}"
+                                .format(cls=type(self).__name__,
+                                        typ=type(other).__name__))


I am not sure this is an improvement in robustness: if Series were to use the same code, this 'cls' would become either "Series" or "TimedeltaArray". "Series" is not that informative, and "TimedeltaArray" should not bubble up to the user when using a Series (that would be an internal implementation detail).

The above is always a TimedeltaIndex given the elif statement above

jorisvandenbossche · 2018-02-13T10:19:59Z

pandas/core/indexes/datetimelike.py

@@ -804,7 +806,7 @@ def shift(self, n, freq=None):

        Returns
        -------
-        shifted : DatetimeIndex
+        shifted : type(self)


jorisvandenbossche · 2018-02-13T10:20:26Z

pandas/core/indexes/datetimes.py


        if isinstance(other, (datetime, compat.string_types)):
            if isinstance(other, datetime):
                # GH#18435 strings get a pass from tzawareness compat
                self._assert_tzawareness_compat(other)

            other = _to_m8(other, tz=self.tz)
-            result = func(other)
+            result = binop(self, other)


why this change?

Because DatetimeIndex.__mro__ has 12 entries, so with the standard super usage it isn't obvious where the method is defined.
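To illustrate the point about long MROs, here is a minimal sketch that uses toy classes rather than the real pandas hierarchy. It shows one way to find which class in an MRO actually defines a method. The helper name `defining_class` and the toy classes are assumptions made for illustration only.

```python
def defining_class(obj, name):
    """Return the first class in type(obj).__mro__ whose own namespace defines `name`."""
    for klass in type(obj).__mro__:
        if name in vars(klass):
            return klass
    raise AttributeError(name)


# Toy hierarchy standing in for a deep mixin chain (not the pandas classes).
class Base(object):
    def __eq__(self, other):
        return NotImplemented


class OpsMixin(Base):
    pass


class ToyIndex(OpsMixin):
    pass


print(defining_class(ToyIndex(), "__eq__").__name__)  # Base
```

With a dozen entries in the MRO, a lookup like this has to be done mentally whenever `getattr(super(...), opname)` is read, which is the readability concern raised in the reply above.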

TomAugspurger · 2018-02-13T11:38:19Z

Haven't looked closely yet, but I haven't touched datetimeindex yet.

Including the correct class name in the error message has been an issue, so those changes should be worthwhile.

Should be able to look closer in a few days.

jbrockmendel · 2018-02-13T15:40:50Z

Just pushed a fix for the shift docstring that re-evaluates it within TimedeltaIndex and DatetimeIndex. I'm wondering if there is a metaclass-based (or, hopefully, lighter-weight) way to do this without having to redefine the function in each subclass. Could that be what _decorators.docstring_wrapper (which is not used) is for?

codecov · 2018-02-13T17:58:42Z

Codecov Report

Merging #19672 into master will increase coverage by <.01%.
The diff coverage is 93.1%.

@@            Coverage Diff             @@
##           master   #19672      +/-   ##
==========================================
+ Coverage   91.58%   91.58%   +<.01%     
==========================================
  Files         150      150              
  Lines       48867    48875       +8     
==========================================
+ Hits        44755    44763       +8     
  Misses       4112     4112

Flag	Coverage Δ
#multiple	`89.96% <93.1%> (ø)`	⬆️
#single	`41.75% <41.37%> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/core/indexes/datetimelike.py	`97.05% <100%> (ø)`	⬆️
pandas/core/indexes/datetimes.py	`95.33% <90.9%> (+0.01%)`	⬆️
pandas/core/indexes/timedeltas.py	`90.63% <93.33%> (+0.07%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2fdf1e2...a39b22a.

jbrockmendel · 2018-02-15T04:20:33Z

Thoughts on this? There are a number of broken cases to fix in and around these methods; ideally we'd close this out first.

jorisvandenbossche · 2018-02-15T08:39:35Z

pandas/core/indexes/datetimelike.py

-                                .format(typ=type(other)))
+                raise TypeError("cannot add {cls} and {typ}"
+                                .format(cls=type(self).__name__,
+                                        typ=type(other).__name__))


The above is always a TimedeltaIndex given the elif statement above

jorisvandenbossche · 2018-02-15T08:39:41Z

pandas/core/indexes/datetimelike.py

-                                    .format(typ=type(other).__name__))
+                    raise TypeError("cannot subtract {cls} and {typ}"
+                                    .format(cls=type(self).__name__,
+                                            typ=type(other).__name__))


jorisvandenbossche · 2018-02-15T08:47:09Z

pandas/core/indexes/datetimes.py


    def _sub_datelike(self, other):
        # subtract a datetime from myself, yielding a TimedeltaIndex
        from pandas import TimedeltaIndex
        if isinstance(other, DatetimeIndex):
            # require tz compat
            if not self._has_same_tz(other):
-                raise TypeError("DatetimeIndex subtraction must have the same "
-                                "timezones or no timezones")
+                raise TypeError("{cls} subtraction must have the same "


Same here, this will always be subtraction of DatetimeIndexes?

It is now, but one of the on-deck PRs will have ndarray[datetime64] go down this path too to fix:

    >>> dti = pd.date_range('2016-01-01', periods=3)
    >>> arr = np.array(['2015-01-01', '2015-02-02', '2015-03-03'], dtype='datetime64[ns]')
    >>> dti - arr
    DatetimeIndex(['1971-01-01', '1970-12-01', '1970-11-03'], dtype='datetime64[ns]', freq=None)

How about, instead of using type(self).__name__, we make a little function which can format these? I agree with @jorisvandenbossche's point here: in the case of an Index type the dtype is clear, but not for a Series.

This seems worthwhile but a pretty low priority. Closing for now.

jorisvandenbossche · 2018-02-15T08:47:29Z

pandas/core/indexes/datetimes.py

-                            .format(typ=type(other).__name__))
+            raise TypeError("cannot subtract {cls} and {typ}"
+                            .format(cls=type(self).__name__,
+                                    typ=type(other).__name__))


jorisvandenbossche · 2018-02-15T08:48:28Z

pandas/core/indexes/timedeltas.py

@@ -59,23 +59,25 @@ def _td_index_cmp(opname, cls, nat_result=False):
    """

    def wrapper(self, other):
-        msg = "cannot compare a TimedeltaIndex with type {0}"
-        func = getattr(super(TimedeltaIndex, self), opname)
+        msg = "cannot compare a {cls} with type {typ}"


same here? self is always a TimedeltaIndex

TDI/DTI/PI have very similar implementations for these methods, some of which can be de-duplicated if we make these class-agnostic. More to the point, we want these messages to remain accurate if/when they get moved to the array level. Also in principle this could be subclassed and we'd like the name to come out right there too.

can you add some tests which lock down these error messages? IOW the types that are supposed to appear
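As a rough illustration of the class-agnostic messages discussed in this thread, the sketch below uses toy classes (not the pandas implementation) to show how `type(self).__name__` lets a subclass report its own name, and how the resulting messages could be pinned down in a check. The class names and the `error_message` helper are hypothetical.

```python
class ToyTimedeltaIndex(object):
    """Toy stand-in for an index class; not the pandas implementation."""

    def __add__(self, other):
        if isinstance(other, ToyTimedeltaIndex):
            return type(self)()
        # type(self).__name__ resolves at call time, so subclasses
        # report their own name without overriding this method.
        raise TypeError("cannot add {cls} and {typ}"
                        .format(cls=type(self).__name__,
                                typ=type(other).__name__))


class ToySubclass(ToyTimedeltaIndex):
    pass


def error_message(func):
    """Call func() and return the message of the TypeError it raises."""
    try:
        func()
    except TypeError as err:
        return str(err)
    raise AssertionError("expected a TypeError")


print(error_message(lambda: ToyTimedeltaIndex() + "a"))  # cannot add ToyTimedeltaIndex and str
print(error_message(lambda: ToySubclass() + 1.5))        # cannot add ToySubclass and float
```

Comparing the full message string, rather than only checking that a TypeError was raised, is what locks down which type names are expected to appear.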

jorisvandenbossche · 2018-02-15T08:50:04Z

pandas/core/indexes/datetimelike.py

@@ -794,7 +796,7 @@ def isin(self, values):

    def shift(self, n, freq=None):
        """
-        Specialized shift which produces a DatetimeIndex
+        Specialized shift which produces a %(klass)s


Instead of doing the overriding in the subclass just to fix this name, we can maybe reword the docstring to be clear enough?
Eg mention it works for Datetime- and TimedeltaIndex?

That was the idea in the earlier version. I'm open to suggestions.

jreback · 2018-02-18T15:59:14Z

pandas/core/indexes/datetimes.py


    def _sub_datelike(self, other):
        # subtract a datetime from myself, yielding a TimedeltaIndex
        from pandas import TimedeltaIndex
        if isinstance(other, DatetimeIndex):
            # require tz compat
            if not self._has_same_tz(other):
-                raise TypeError("DatetimeIndex subtraction must have the same "
-                                "timezones or no timezones")
+                raise TypeError("{cls} subtraction must have the same "


How about, instead of using type(self).__name__, we make a little function which can format these? I agree with @jorisvandenbossche's point here: in the case of an Index type the dtype is clear, but not for a Series.
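One possible shape for such a formatting function is sketched below. This is only a guess at what was meant: the name `describe_operand`, the bracketed output format, and the toy `ToySeries` class are all assumptions, and nothing here is taken from pandas itself.

```python
def describe_operand(obj):
    """Hypothetical helper: class name, plus dtype when the object exposes one."""
    name = type(obj).__name__
    dtype = getattr(obj, "dtype", None)
    if dtype is None:
        return name
    return "{name}[{dtype}]".format(name=name, dtype=dtype)


class ToySeries(object):
    """Toy container whose class name alone does not reveal its dtype."""

    def __init__(self, dtype):
        self.dtype = dtype


msg = ("cannot subtract {left} and {right}"
       .format(left=describe_operand(ToySeries("timedelta64[ns]")),
               right=describe_operand(3.0)))
print(msg)  # cannot subtract ToySeries[timedelta64[ns]] and float
```

Routing every message through one helper would keep the wording consistent whether the operand is an Index, a Series, or a plain scalar.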

jreback · 2018-02-18T15:59:41Z

pandas/core/indexes/datetimes.py

@@ -901,6 +903,11 @@ def _sub_datelike_dti(self, other):
            new_values[mask] = libts.iNaT
        return new_values.view('i8')

+    @Substitution(klass='DatetimeIndex')
+    @Appender(DatetimeIndexOpsMixin.shift.__doc__)


this should be a shared doc-string instead (and same for the other things you added). this is the existing pattern.
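A simplified sketch of a shared-docstring pattern follows. It is loosely modeled on the idea of a template with a `%(klass)s` placeholder, but the `shared_doc` decorator, the `_shared_docs` contents, and the toy classes are assumptions for illustration, not the pandas decorators themselves.

```python
# Shared template keyed by method name; %(klass)s is filled in per class.
_shared_docs = {
    "shift": """
        Specialized shift which produces a %(klass)s

        Returns
        -------
        shifted : %(klass)s
        """,
}


def shared_doc(key, **params):
    """Hypothetical decorator: set __doc__ from the shared template."""
    def decorate(func):
        func.__doc__ = _shared_docs[key] % params
        return func
    return decorate


class ToyOpsMixin(object):
    def _shift_impl(self, n, freq=None):
        return self  # placeholder; real shifting logic is out of scope


class ToyDatetimeIndex(ToyOpsMixin):
    @shared_doc("shift", klass="ToyDatetimeIndex")
    def shift(self, n, freq=None):
        return self._shift_impl(n, freq=freq)


class ToyTimedeltaIndex(ToyOpsMixin):
    @shared_doc("shift", klass="ToyTimedeltaIndex")
    def shift(self, n, freq=None):
        return self._shift_impl(n, freq=freq)


print(ToyTimedeltaIndex.shift.__doc__)
```

Note that this approach still needs a thin wrapper method in each subclass; only the docstring text itself is shared.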

jreback · 2018-02-18T16:00:17Z

pandas/core/indexes/timedeltas.py

@@ -59,23 +59,25 @@ def _td_index_cmp(opname, cls, nat_result=False):
    """

    def wrapper(self, other):
-        msg = "cannot compare a TimedeltaIndex with type {0}"
-        func = getattr(super(TimedeltaIndex, self), opname)
+        msg = "cannot compare a {cls} with type {typ}"


can you add some tests which lock down these error messages? IOW the types that are supposed to appear

jreback · 2018-02-18T16:00:35Z

pandas/core/indexes/timedeltas.py

@@ -409,7 +412,7 @@ def _add_datelike(self, other):
            result = checked_add_with_arr(i8, other.value,
                                          arr_mask=self._isnan)
            result = self._maybe_mask_results(result, fill_value=iNaT)
-        return DatetimeIndex(result, name=self.name, copy=False)


what changed here?

It just got indented (next line in the diff) b/c this branch is never reached except for in the indented case.

jbrockmendel added 4 commits February 12, 2018 21:21

Fix docstrings, make error messages class-agnostic

ddb78de

revert some less obvious changes

08d575e

one more formatter

f0d203f

make inheritance more transparent

a7775ea

jorisvandenbossche reviewed Feb 13, 2018


improve DTI/TDI shift docstring

c779f64

fix unreached NameError

5c2868a

gfyoung added the Internals (Related to non-user accessible pandas implementation) and Docs labels Feb 14, 2018

jbrockmendel changed the title ~~DatetimeArray/PeriodArray prelims~~ Fix docstrings, error messages Feb 14, 2018

jorisvandenbossche reviewed Feb 15, 2018


jbrockmendel mentioned this pull request Feb 15, 2018

Implement maybe_cache for compat between immutable/mutable classes #19709

Closed

4 tasks

Merge branch 'master' into array_prelims

a39b22a

jbrockmendel mentioned this pull request Feb 16, 2018

Parametrize PeriodIndex tests #19659

Merged

4 tasks

jreback requested changes Feb 18, 2018


jreback added the Error Reporting (Incorrect or improved errors from pandas) label Feb 18, 2018

jbrockmendel closed this Feb 19, 2018

jbrockmendel deleted the array_prelims branch June 22, 2018 03:32
