Commit

Merge branch 'master' of https://github.com/mikebailey/pandas into mikebailey-master
jreback committed Apr 29, 2014
2 parents 37f8dc4 + bcf7f0d commit ac189a2
Showing 10 changed files with 441 additions and 30 deletions.
6 changes: 6 additions & 0 deletions doc/source/api.rst
@@ -1143,6 +1143,12 @@ Time/Date Components
DatetimeIndex.tz
DatetimeIndex.freq
DatetimeIndex.freqstr
DatetimeIndex.is_month_start
DatetimeIndex.is_month_end
DatetimeIndex.is_quarter_start
DatetimeIndex.is_quarter_end
DatetimeIndex.is_year_start
DatetimeIndex.is_year_end

Selecting
~~~~~~~~~
8 changes: 8 additions & 0 deletions doc/source/release.rst
@@ -92,9 +92,17 @@ API Changes
- ``hour,minute,second,weekofyear``
- ``week,dayofweek,dayofyear,quarter``
- ``microsecond,nanosecond,qyear``
- ``is_month_start,is_month_end``
- ``is_quarter_start,is_quarter_end``
- ``is_year_start,is_year_end``
- ``min(),max()``
- ``pd.infer_freq()``

- Add ``is_month_start``, ``is_month_end``, ``is_quarter_start``, ``is_quarter_end``,
``is_year_start``, ``is_year_end`` accessors for ``DatetimeIndex`` / ``Timestamp`` which return a boolean array
of whether the timestamp(s) are at the start/end of the month/quarter/year defined by the
frequency of the ``DatetimeIndex`` / ``Timestamp`` (:issue:`4565`, :issue:`6998`)

- ``pd.infer_freq()`` will now raise a ``TypeError`` if given an invalid ``Series/Index``
type (:issue:`6407`, :issue:`6463`)
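
To make "defined by the frequency" concrete, the following is a minimal stdlib sketch, not the pandas implementation (that logic lives in ``tslib.get_start_end_field``, which this page does not show). The helper name ``start_end_flags`` and its ``end_month`` parameter are hypothetical. It assumes a calendar-day, end-anchored frequency such as ``Q-FEB``, where ``end_month`` plays the role of the frequency's anchor month; business-day and start-anchored frequencies (``BQ``, ``QS-FEB``) are not modeled.

```python
import calendar
from datetime import date


def start_end_flags(d, end_month=12):
    """Hypothetical helper: start/end flags for a year that ends in `end_month`.

    Assumption: calendar-day, end-anchored frequency only (e.g. Q-FEB -> 2).
    """
    last_day = calendar.monthrange(d.year, d.month)[1]
    month_start = d.day == 1
    month_end = d.day == last_day
    # The fiscal year starts in the month after the anchor month.
    start_month = end_month % 12 + 1
    return {
        "is_month_start": month_start,
        "is_month_end": month_end,
        # Quarters start/end every three months, counted from the anchor.
        "is_quarter_start": month_start and (d.month - start_month) % 3 == 0,
        "is_quarter_end": month_end and (d.month - end_month) % 3 == 0,
        "is_year_start": month_start and d.month == start_month,
        "is_year_end": month_end and d.month == end_month,
    }


feb_end = start_end_flags(date(2013, 2, 28), end_month=2)
mar_start = start_end_flags(date(2013, 3, 1), end_month=2)
```

With ``end_month=2`` the sketch agrees with the ``Q-FEB`` cases asserted in ``test_datetimeindex_accessors``: 2013-02-28 is a month, quarter and year end, and 2013-03-01 is a month, quarter and year start.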

33 changes: 33 additions & 0 deletions doc/source/timeseries.rst
@@ -408,6 +408,39 @@ regularity will result in a ``DatetimeIndex`` (but frequency is lost):
.. _timeseries.offsets:

Time/Date Components
~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are several time/date properties that one can access from ``Timestamp`` or a collection of timestamps like a ``DatetimeIndex``.

.. csv-table::
:header: "Property", "Description"
:widths: 15, 65

year, "The year of the datetime"
month,"The month of the datetime"
day,"The day of the datetime"
hour,"The hour of the datetime"
minute,"The minutes of the datetime"
second,"The seconds of the datetime"
microsecond,"The microseconds of the datetime"
nanosecond,"The nanoseconds of the datetime"
date,"Returns datetime.date"
time,"Returns datetime.time"
dayofyear,"The ordinal day of year"
weekofyear,"The week ordinal of the year"
week,"The week ordinal of the year"
dayofweek,"The day of the week with Monday=0, Sunday=6"
weekday,"The day of the week with Monday=0, Sunday=6"
quarter,"Quarter of the date: Jan-Mar = 1, Apr-Jun = 2, etc."
is_month_start,"Logical indicating if first day of month (defined by frequency)"
is_month_end,"Logical indicating if last day of month (defined by frequency)"
is_quarter_start,"Logical indicating if first day of quarter (defined by frequency)"
is_quarter_end,"Logical indicating if last day of quarter (defined by frequency)"
is_year_start,"Logical indicating if first day of year (defined by frequency)"
is_year_end,"Logical indicating if last day of year (defined by frequency)"

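
As a rough illustration of what the ordinal properties in the table mean, here is a stdlib-only sketch that computes the same quantities for a single ``datetime.date``. It is not how pandas computes them; the helper name ``date_components`` is hypothetical, and the sketch assumes ``weekofyear`` follows ISO 8601 week numbering, which is consistent with the values asserted in the test changes in this commit.

```python
from datetime import date


def date_components(d):
    """Hypothetical helper mirroring a few of the properties in the table."""
    return {
        "dayofyear": d.timetuple().tm_yday,  # ordinal day of the year, 1-based
        "weekofyear": d.isocalendar()[1],    # ISO 8601 week number (assumption)
        "dayofweek": d.weekday(),            # Monday=0, Sunday=6
        "quarter": (d.month - 1) // 3 + 1,   # Jan-Mar = 1, Apr-Jun = 2, etc.
    }


new_year = date_components(date(1998, 1, 1))
may_first = date_components(date(1998, 5, 1))
```

For 1998-01-01 this gives day of week 3 (Thursday), day of year 1, week 1 and quarter 1; for 1998-05-01 it gives day of year 121, week 18 and quarter 2, the same values the daily-frequency test expects at positions 0 and 120.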

DateOffset objects
------------------

5 changes: 5 additions & 0 deletions doc/source/v0.14.0.txt
@@ -79,6 +79,9 @@ API changes
- ``hour,minute,second,weekofyear``
- ``week,dayofweek,dayofyear,quarter``
- ``microsecond,nanosecond,qyear``
- ``is_month_start,is_month_end``
- ``is_quarter_start,is_quarter_end``
- ``is_year_start,is_year_end``
- ``min(),max()``
- ``pd.infer_freq()``

@@ -89,6 +92,8 @@ API changes
s.year
s.index.year

- Add ``is_month_start``, ``is_month_end``, ``is_quarter_start``, ``is_quarter_end``, ``is_year_start``, ``is_year_end`` accessors for ``DatetimeIndex`` / ``Timestamp`` which return a boolean array of whether the timestamp(s) are at the start/end of the month/quarter/year defined by the frequency of the ``DatetimeIndex`` / ``Timestamp`` (:issue:`4565`, :issue:`6998`)

- More consistent behaviour for some groupby methods:

groupby ``head`` and ``tail`` now act more like ``filter`` rather than an aggregation:
6 changes: 6 additions & 0 deletions pandas/core/base.py
@@ -336,3 +336,9 @@ def nunique(self):
dayofyear = _field_accessor('dayofyear', "The ordinal day of the year")
quarter = _field_accessor('quarter', "The quarter of the date")
qyear = _field_accessor('qyear')
is_month_start = _field_accessor('is_month_start', "Logical indicating if first day of month (defined by frequency)")
is_month_end = _field_accessor('is_month_end', "Logical indicating if last day of month (defined by frequency)")
is_quarter_start = _field_accessor('is_quarter_start', "Logical indicating if first day of quarter (defined by frequency)")
is_quarter_end = _field_accessor('is_quarter_end', "Logical indicating if last day of quarter (defined by frequency)")
is_year_start = _field_accessor('is_year_start', "Logical indicating if first day of year (defined by frequency)")
is_year_end = _field_accessor('is_year_end', "Logical indicating if last day of year (defined by frequency)")
8 changes: 4 additions & 4 deletions pandas/tests/test_base.py
@@ -214,7 +214,7 @@ def test_value_counts_unique_nunique(self):
# freq must be specified because repeat makes freq ambiguous
o = klass(np.repeat(values, range(1, len(o) + 1)), freq=o.freq)
else:
o = klass(np.repeat(values, range(1, len(o) + 1)))
o = klass(np.repeat(values, range(1, len(o) + 1)))

expected_s = Series(range(10, 0, -1), index=values[::-1], dtype='int64')
tm.assert_series_equal(o.value_counts(), expected_s)
@@ -246,7 +246,7 @@ def test_value_counts_unique_nunique(self):
if isinstance(o, PeriodIndex):
o = klass(np.repeat(values, range(1, len(o) + 1)), freq=o.freq)
else:
o = klass(np.repeat(values, range(1, len(o) + 1)))
o = klass(np.repeat(values, range(1, len(o) + 1)))

if isinstance(o, DatetimeIndex):
# DatetimeIndex: nan is cast to NaT and included
@@ -278,7 +278,7 @@ def test_value_counts_inferred(self):
s = klass(s_values)
expected = Series([4, 3, 2, 1], index=['b', 'a', 'd', 'c'])
tm.assert_series_equal(s.value_counts(), expected)

self.assert_numpy_array_equal(s.unique(), np.unique(s_values))
self.assertEquals(s.nunique(), 4)
# don't sort, have to sort after the fact as not sorting is platform-dep
@@ -410,7 +410,7 @@ def setUp(self):

def test_ops_properties(self):
self.check_ops_properties(['year','month','day','hour','minute','second','weekofyear','week','dayofweek','dayofyear','quarter'])
self.check_ops_properties(['date','time','microsecond','nanosecond'], lambda x: isinstance(x,DatetimeIndex))
self.check_ops_properties(['date','time','microsecond','nanosecond', 'is_month_start', 'is_month_end', 'is_quarter_start', 'is_quarter_end', 'is_year_start', 'is_year_end'], lambda x: isinstance(x,DatetimeIndex))

class TestPeriodIndexOps(Ops):
_allowed = '_allow_period_index_ops'
21 changes: 18 additions & 3 deletions pandas/tseries/index.py
@@ -14,7 +14,7 @@
from pandas.compat import u
from pandas.tseries.frequencies import (
infer_freq, to_offset, get_period_alias,
Resolution, get_reso_string)
Resolution, get_reso_string, get_offset)
from pandas.tseries.offsets import DateOffset, generate_range, Tick, CDay
from pandas.tseries.tools import parse_time_string, normalize_date
from pandas.util.decorators import cache_readonly
@@ -28,6 +28,7 @@
import pandas.algos as _algos
import pandas.index as _index

from pandas.tslib import isleapyear

def _utc():
import pytz
@@ -43,7 +44,14 @@ def f(self):
utc = _utc()
if self.tz is not utc:
values = self._local_timestamps()
return tslib.get_date_field(values, field)
if field in ['is_month_start', 'is_month_end',
'is_quarter_start', 'is_quarter_end',
'is_year_start', 'is_year_end']:
month_kw = self.freq.kwds.get('startingMonth', self.freq.kwds.get('month', 12)) if self.freq else 12
freqstr = self.freqstr if self.freq else None
return tslib.get_start_end_field(values, field, freqstr, month_kw)
else:
return tslib.get_date_field(values, field)
f.__name__ = name
f.__doc__ = docstring
return property(f)
@@ -1439,6 +1447,12 @@ def freqstr(self):
_weekday = _dayofweek
_dayofyear = _field_accessor('dayofyear', 'doy')
_quarter = _field_accessor('quarter', 'q')
_is_month_start = _field_accessor('is_month_start', 'is_month_start')
_is_month_end = _field_accessor('is_month_end', 'is_month_end')
_is_quarter_start = _field_accessor('is_quarter_start', 'is_quarter_start')
_is_quarter_end = _field_accessor('is_quarter_end', 'is_quarter_end')
_is_year_start = _field_accessor('is_year_start', 'is_year_start')
_is_year_end = _field_accessor('is_year_end', 'is_year_end')

@property
def _time(self):
@@ -1774,6 +1788,7 @@ def to_julian_date(self):
self.nanosecond/3600.0/1e+9
)/24.0)


def _generate_regular_range(start, end, periods, offset):
if isinstance(offset, Tick):
stride = offset.nanos
@@ -1831,7 +1846,7 @@ def date_range(start=None, end=None, periods=None, freq='D', tz=None,
Frequency strings can have multiples, e.g. '5H'
tz : string or None
Time zone name for returning localized DatetimeIndex, for example
Asia/Hong_Kong
Asia/Hong_Kong
normalize : bool, default False
Normalize start/end dates to midnight before generating date range
name : str, default None
122 changes: 100 additions & 22 deletions pandas/tseries/tests/test_timeseries.py
@@ -1473,7 +1473,7 @@ def test_timestamp_fields(self):
# extra fields from DatetimeIndex like quarter and week
idx = tm.makeDateIndex(100)

fields = ['dayofweek', 'dayofyear', 'week', 'weekofyear', 'quarter']
fields = ['dayofweek', 'dayofyear', 'week', 'weekofyear', 'quarter', 'is_month_start', 'is_month_end', 'is_quarter_start', 'is_quarter_end', 'is_year_start', 'is_year_end']
for f in fields:
expected = getattr(idx, f)[-1]
result = getattr(Timestamp(idx[-1]), f)
@@ -2192,7 +2192,7 @@ def test_join_with_period_index(self):

class TestDatetime64(tm.TestCase):
"""
Also test supoprt for datetime64[ns] in Series / DataFrame
Also test support for datetime64[ns] in Series / DataFrame
"""

def setUp(self):
@@ -2202,37 +2202,115 @@ def setUp(self):

def test_datetimeindex_accessors(self):
dti = DatetimeIndex(
freq='Q-JAN', start=datetime(1997, 12, 31), periods=100)
freq='D', start=datetime(1998, 1, 1), periods=365)

self.assertEquals(dti.year[0], 1998)
self.assertEquals(dti.month[0], 1)
self.assertEquals(dti.day[0], 31)
self.assertEquals(dti.day[0], 1)
self.assertEquals(dti.hour[0], 0)
self.assertEquals(dti.minute[0], 0)
self.assertEquals(dti.second[0], 0)
self.assertEquals(dti.microsecond[0], 0)
self.assertEquals(dti.dayofweek[0], 5)
self.assertEquals(dti.dayofweek[0], 3)

self.assertEquals(dti.dayofyear[0], 31)
self.assertEquals(dti.dayofyear[1], 120)
self.assertEquals(dti.dayofyear[0], 1)
self.assertEquals(dti.dayofyear[120], 121)

self.assertEquals(dti.weekofyear[0], 5)
self.assertEquals(dti.weekofyear[1], 18)
self.assertEquals(dti.weekofyear[0], 1)
self.assertEquals(dti.weekofyear[120], 18)

self.assertEquals(dti.quarter[0], 1)
self.assertEquals(dti.quarter[1], 2)

self.assertEquals(len(dti.year), 100)
self.assertEquals(len(dti.month), 100)
self.assertEquals(len(dti.day), 100)
self.assertEquals(len(dti.hour), 100)
self.assertEquals(len(dti.minute), 100)
self.assertEquals(len(dti.second), 100)
self.assertEquals(len(dti.microsecond), 100)
self.assertEquals(len(dti.dayofweek), 100)
self.assertEquals(len(dti.dayofyear), 100)
self.assertEquals(len(dti.weekofyear), 100)
self.assertEquals(len(dti.quarter), 100)
self.assertEquals(dti.quarter[120], 2)

self.assertEquals(dti.is_month_start[0], True)
self.assertEquals(dti.is_month_start[1], False)
self.assertEquals(dti.is_month_start[31], True)
self.assertEquals(dti.is_quarter_start[0], True)
self.assertEquals(dti.is_quarter_start[90], True)
self.assertEquals(dti.is_year_start[0], True)
self.assertEquals(dti.is_year_start[364], False)
self.assertEquals(dti.is_month_end[0], False)
self.assertEquals(dti.is_month_end[30], True)
self.assertEquals(dti.is_month_end[31], False)
self.assertEquals(dti.is_month_end[364], True)
self.assertEquals(dti.is_quarter_end[0], False)
self.assertEquals(dti.is_quarter_end[30], False)
self.assertEquals(dti.is_quarter_end[89], True)
self.assertEquals(dti.is_quarter_end[364], True)
self.assertEquals(dti.is_year_end[0], False)
self.assertEquals(dti.is_year_end[364], True)

self.assertEquals(len(dti.year), 365)
self.assertEquals(len(dti.month), 365)
self.assertEquals(len(dti.day), 365)
self.assertEquals(len(dti.hour), 365)
self.assertEquals(len(dti.minute), 365)
self.assertEquals(len(dti.second), 365)
self.assertEquals(len(dti.microsecond), 365)
self.assertEquals(len(dti.dayofweek), 365)
self.assertEquals(len(dti.dayofyear), 365)
self.assertEquals(len(dti.weekofyear), 365)
self.assertEquals(len(dti.quarter), 365)
self.assertEquals(len(dti.is_month_start), 365)
self.assertEquals(len(dti.is_month_end), 365)
self.assertEquals(len(dti.is_quarter_start), 365)
self.assertEquals(len(dti.is_quarter_end), 365)
self.assertEquals(len(dti.is_year_start), 365)
self.assertEquals(len(dti.is_year_end), 365)

dti = DatetimeIndex(
freq='BQ-FEB', start=datetime(1998, 1, 1), periods=4)

self.assertEquals(sum(dti.is_quarter_start), 0)
self.assertEquals(sum(dti.is_quarter_end), 4)
self.assertEquals(sum(dti.is_year_start), 0)
self.assertEquals(sum(dti.is_year_end), 1)

# Ensure is_start/end accessors throw ValueError for CustomBusinessDay, CBD requires np >= 1.7
if not _np_version_under1p7:
bday_egypt = offsets.CustomBusinessDay(weekmask='Sun Mon Tue Wed Thu')
dti = date_range(datetime(2013, 4, 30), periods=5, freq=bday_egypt)
self.assertRaises(ValueError, lambda: dti.is_month_start)

dti = DatetimeIndex(['2000-01-01', '2000-01-02', '2000-01-03'])

self.assertEquals(dti.is_month_start[0], 1)

tests = [
(Timestamp('2013-06-01', offset='M').is_month_start, 1),
(Timestamp('2013-06-01', offset='BM').is_month_start, 0),
(Timestamp('2013-06-03', offset='M').is_month_start, 0),
(Timestamp('2013-06-03', offset='BM').is_month_start, 1),
(Timestamp('2013-02-28', offset='Q-FEB').is_month_end, 1),
(Timestamp('2013-02-28', offset='Q-FEB').is_quarter_end, 1),
(Timestamp('2013-02-28', offset='Q-FEB').is_year_end, 1),
(Timestamp('2013-03-01', offset='Q-FEB').is_month_start, 1),
(Timestamp('2013-03-01', offset='Q-FEB').is_quarter_start, 1),
(Timestamp('2013-03-01', offset='Q-FEB').is_year_start, 1),
(Timestamp('2013-03-31', offset='QS-FEB').is_month_end, 1),
(Timestamp('2013-03-31', offset='QS-FEB').is_quarter_end, 0),
(Timestamp('2013-03-31', offset='QS-FEB').is_year_end, 0),
(Timestamp('2013-02-01', offset='QS-FEB').is_month_start, 1),
(Timestamp('2013-02-01', offset='QS-FEB').is_quarter_start, 1),
(Timestamp('2013-02-01', offset='QS-FEB').is_year_start, 1),
(Timestamp('2013-06-30', offset='BQ').is_month_end, 0),
(Timestamp('2013-06-30', offset='BQ').is_quarter_end, 0),
(Timestamp('2013-06-30', offset='BQ').is_year_end, 0),
(Timestamp('2013-06-28', offset='BQ').is_month_end, 1),
(Timestamp('2013-06-28', offset='BQ').is_quarter_end, 1),
(Timestamp('2013-06-28', offset='BQ').is_year_end, 0),
(Timestamp('2013-06-30', offset='BQS-APR').is_month_end, 0),
(Timestamp('2013-06-30', offset='BQS-APR').is_quarter_end, 0),
(Timestamp('2013-06-30', offset='BQS-APR').is_year_end, 0),
(Timestamp('2013-06-28', offset='BQS-APR').is_month_end, 1),
(Timestamp('2013-06-28', offset='BQS-APR').is_quarter_end, 1),
(Timestamp('2013-03-29', offset='BQS-APR').is_year_end, 1),
(Timestamp('2013-11-01', offset='AS-NOV').is_year_start, 1),
(Timestamp('2013-10-31', offset='AS-NOV').is_year_end, 1)]

for ts, value in tests:
self.assertEquals(ts, value)


def test_nanosecond_field(self):
dti = DatetimeIndex(np.arange(10))