
use range in RangeIndex instead of _start etc. #26581

Merged
merged 4 commits into from Jun 5, 2019
3 changes: 3 additions & 0 deletions doc/source/whatsnew/v0.25.0.rst
@@ -473,6 +473,9 @@ Other Deprecations
   the :meth:`SparseArray.to_dense` method instead (:issue:`26421`).
 - The functions :func:`pandas.to_datetime` and :func:`pandas.to_timedelta` have deprecated the ``box`` keyword. Instead, use :meth:`to_numpy` or :meth:`Timestamp.to_datetime64` or :meth:`Timedelta.to_timedelta64`. (:issue:`24416`)
 - The :meth:`DataFrame.compound` and :meth:`Series.compound` methods are deprecated and will be removed in a future version (:issue:`26405`).
+- The internal ``_start``, ``_stop`` and ``_step`` attributes of :class:`RangeIndex` have been deprecated.
+  Use the public attributes :attr:`~RangeIndex.start`, :attr:`~RangeIndex.stop` and :attr:`~RangeIndex.step` instead (:issue:`26581`).
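After this change a ``RangeIndex`` wraps a built-in ``range`` object, so the new public attributes mirror the built-in ones. A minimal illustration with a plain ``range`` (pandas itself is not imported here):

```python
# RangeIndex(0, 10, 2) wraps range(0, 10, 2); its public .start/.stop/.step
# mirror the attributes of the built-in range shown here.
r = range(0, 10, 2)
print(r.start, r.stop, r.step)  # 0 10 2
print(len(r), list(r))          # 5 [0, 2, 4, 6, 8]
```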


.. _whatsnew_0250.prior_deprecations:

29 changes: 29 additions & 0 deletions pandas/core/dtypes/common.py
@@ -1,4 +1,5 @@
 """ common type operations """
+from typing import Union
 import warnings

 import numpy as np
@@ -125,6 +126,34 @@ def ensure_int_or_float(arr: ArrayLike, copy=False) -> np.array:
     return arr.astype('float64', copy=copy)


+def ensure_python_int(value: Union[int, np.integer]) -> int:
+    """
+    Ensure that a value is a python int.
+
+    Parameters
+    ----------
+    value : int or numpy.integer
+
+    Returns
+    -------
+    int
+
+    Raises
+    ------
+    TypeError: if the value isn't an int or can't be converted to one.
+    """
+    if not is_scalar(value):
+        raise TypeError("Value needs to be a scalar value, was type {}"
+                        .format(type(value)))
+    msg = "Wrong type {} for value {}"
+    try:
+        new_value = int(value)
+        assert new_value == value
+    except (TypeError, ValueError, AssertionError):
+        raise TypeError(msg.format(type(value), value))
+    return new_value
+
+
 def classes(*klasses):
     """ evaluate if the tipo is a subclass of the klasses """
     return lambda tipo: issubclass(tipo, klasses)
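A standalone sketch of the conversion check ``ensure_python_int`` performs; the real helper additionally rejects non-scalars via pandas' ``is_scalar``, which is omitted here, and the helper name below is hypothetical:

```python
def ensure_python_int_sketch(value):
    """Convert value to a plain Python int, rejecting lossy conversions."""
    msg = "Wrong type {} for value {}"
    try:
        new_value = int(value)
        # The round-trip comparison rejects values such as 1.5 or "5"
        # that int() would accept but not represent faithfully.
        assert new_value == value
    except (TypeError, ValueError, AssertionError):
        raise TypeError(msg.format(type(value), value))
    return new_value

print(ensure_python_int_sketch(7.0))  # 7
```

Note that ``int("5")`` succeeds but ``5 == "5"`` is False, so strings are rejected by the assertion rather than silently converted.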
21 changes: 11 additions & 10 deletions pandas/core/dtypes/concat.py
@@ -541,36 +541,37 @@ def _concat_rangeindex_same_dtype(indexes):
     """
     from pandas import Int64Index, RangeIndex

-    start = step = next = None
+    start = step = next_ = None

     # Filter the empty indexes
     non_empty_indexes = [obj for obj in indexes if len(obj)]

     for obj in non_empty_indexes:
+        rng = obj._range  # type: range
Inline review comment on the ``rng = obj._range  # type: range`` line, from a pandas Member:

    I'm a little surprised the annotation here doesn't throw an error - is there an actual type you can put in here?

Reply from @topper-123 (Contributor, author), Jun 4, 2019:

    A range is actually a type in Python 3 :-). So issubclass(range, typing.Sequence) evaluates to True.

    EDIT: If you meant that obj is untyped, then I think it's because untyped objects can have any attributes.
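The claim in the reply is easy to verify: ``range`` is an ordinary class in Python 3 and is registered with the ``Sequence`` ABC, so it is valid in type comments and annotations:

```python
import typing
from collections.abc import Sequence

# range is a real class in Python 3, usable in annotations and
# type comments, and it is registered as a Sequence.
print(isinstance(range, type))             # True
print(issubclass(range, Sequence))         # True
print(issubclass(range, typing.Sequence))  # True
```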


         if start is None:
             # This is set by the first non-empty index
-            start = obj._start
-            if step is None and len(obj) > 1:
-                step = obj._step
+            start = rng.start
+            if step is None and len(rng) > 1:
+                step = rng.step
         elif step is None:
             # First non-empty index had only one element
-            if obj._start == start:
+            if rng.start == start:
                 return _concat_index_same_dtype(indexes, klass=Int64Index)
-            step = obj._start - start
+            step = rng.start - start

-        non_consecutive = ((step != obj._step and len(obj) > 1) or
-                           (next is not None and obj._start != next))
+        non_consecutive = ((step != rng.step and len(rng) > 1) or
+                           (next_ is not None and rng.start != next_))
         if non_consecutive:
             return _concat_index_same_dtype(indexes, klass=Int64Index)

         if step is not None:
-            next = obj[-1] + step
+            next_ = rng[-1] + step

     if non_empty_indexes:
         # Get the stop value from "next" or alternatively
         # from the last non-empty index
-        stop = non_empty_indexes[-1]._stop if next is None else next
+        stop = non_empty_indexes[-1].stop if next_ is None else next_
         return RangeIndex(start, stop, step)

     # Here all "indexes" had 0 length, i.e. were empty.
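The loop above decides whether the concatenated result can stay a ``RangeIndex`` or must fall back to ``Int64Index``. A simplified sketch of the consecutiveness check for two non-empty ranges with the same step (``try_fuse`` is a hypothetical helper, not the pandas implementation):

```python
def try_fuse(a, b):
    """Return one range covering a followed by b, or None if they are
    not consecutive (simplified: both non-empty, same explicit step)."""
    if len(a) == 0 or len(b) == 0 or a.step != b.step:
        return None
    if b.start != a[-1] + a.step:
        return None  # gap or overlap: pandas falls back to Int64Index
    return range(a.start, b.stop, a.step)

print(try_fuse(range(0, 4, 2), range(4, 8, 2)))  # range(0, 8, 2)
print(try_fuse(range(0, 4, 2), range(5, 9, 2)))  # None
```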
10 changes: 5 additions & 5 deletions pandas/core/frame.py
@@ -2282,7 +2282,7 @@ def info(self, verbose=None, buf=None, max_cols=None, memory_usage=None,
     text_col 5 non-null object
     float_col 5 non-null float64
     dtypes: float64(1), int64(1), object(1)
-    memory usage: 200.0+ bytes
+    memory usage: 248.0+ bytes

Prints a summary of columns count and its dtypes but not per column
information:
@@ -2292,7 +2292,7 @@ def info(self, verbose=None, buf=None, max_cols=None, memory_usage=None,
     RangeIndex: 5 entries, 0 to 4
     Columns: 3 entries, int_col to float_col
     dtypes: float64(1), int64(1), object(1)
-    memory usage: 200.0+ bytes
+    memory usage: 248.0+ bytes

Pipe output of DataFrame.info to buffer instead of sys.stdout, get
buffer content and writes to a text file:
@@ -2494,7 +2494,7 @@ def memory_usage(self, index=True, deep=False):
     4  1  1.0  1.0+0.0j  1  True

     >>> df.memory_usage()
-    Index 80
+    Index 128
     int64 40000
     float64 40000
     complex128 80000
@@ -2513,7 +2513,7 @@ def memory_usage(self, index=True, deep=False):
     The memory footprint of `object` dtype columns is ignored by default:

     >>> df.memory_usage(deep=True)
-    Index 80
+    Index 128
     int64 40000
     float64 40000
     complex128 80000
@@ -2525,7 +2525,7 @@ def memory_usage(self, index=True, deep=False):
     many repeated values.

     >>> df['object'].astype('category').memory_usage(deep=True)
-    5168
+    5216
     """
     result = Series([c.memory_usage(index=False, deep=deep)
                      for col, c in self.iteritems()], index=self.columns)
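The updated byte counts in these docstrings follow from ``RangeIndex`` now holding a ``range`` object. Like the built-in, its footprint is independent of the logical length; a quick check with a plain ``range`` (exact byte counts are CPython- and platform-specific, so none are asserted here):

```python
import sys

# A range stores only start/stop/step (plus a cached length), so the
# object's own size does not grow with the number of elements.
small = sys.getsizeof(range(5))
huge = sys.getsizeof(range(10 ** 12))
print(small == huge)  # True
```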