PERF: Speed up Period construction (#50149)
* PERF: Speed up Period construction

* Try to fix CI?

* Avoid hackiness

* debug CI

* Modify condition

* revert whitespace

* fix tests
lithomas1 committed Jan 18, 2023
1 parent 3ec3ac2 commit d2d1797
Showing 3 changed files with 11 additions and 13 deletions.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.0.0.rst
@@ -868,6 +868,7 @@ Performance improvements
 - Performance improvement in :func:`merge` when not merging on the index - the new index will now be :class:`RangeIndex` instead of :class:`Int64Index` (:issue:`49478`)
 - Performance improvement in :meth:`DataFrame.to_dict` and :meth:`Series.to_dict` when using any non-object dtypes (:issue:`46470`)
 - Performance improvement in :func:`read_html` when there are multiple tables (:issue:`49929`)
+- Performance improvement in :class:`Period` constructor when constructing from a string or integer (:issue:`38312`)
 - Performance improvement in :func:`to_datetime` when using ``'%Y%m%d'`` format (:issue:`17410`)
 - Performance improvement in :func:`to_datetime` when format is given or can be inferred (:issue:`50465`)
 - Performance improvement in :func:`read_csv` when passing :func:`to_datetime` lambda-function to ``date_parser`` and inputs have mixed timezone offsetes (:issue:`35296`)
10 changes: 6 additions & 4 deletions pandas/_libs/tslibs/parsing.pyx
@@ -377,7 +377,11 @@ def parse_datetime_string_with_reso(
         &out_tzoffset, False
     )
     if not string_to_dts_failed:
-        if out_bestunit == NPY_DATETIMEUNIT.NPY_FR_ns or out_local:
+        timestamp_units = {NPY_DATETIMEUNIT.NPY_FR_ns,
+                           NPY_DATETIMEUNIT.NPY_FR_ps,
+                           NPY_DATETIMEUNIT.NPY_FR_fs,
+                           NPY_DATETIMEUNIT.NPY_FR_as}
+        if out_bestunit in timestamp_units or out_local:
             # TODO: the not-out_local case we could do without Timestamp;
             # avoid circular import
             from pandas import Timestamp
@@ -389,9 +393,7 @@ def parse_datetime_string_with_reso(
         # Match Timestamp and drop picoseconds, femtoseconds, attoseconds
         # The new resolution will just be nano
         # GH 50417
-        if out_bestunit in {NPY_DATETIMEUNIT.NPY_FR_ps,
-                            NPY_DATETIMEUNIT.NPY_FR_fs,
-                            NPY_DATETIMEUNIT.NPY_FR_as}:
+        if out_bestunit in timestamp_units:
             out_bestunit = NPY_DATETIMEUNIT.NPY_FR_ns
         reso = {
             NPY_DATETIMEUNIT.NPY_FR_Y: "year",
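
Note on the parsing.pyx change: the hunks above reuse one set of units (nanosecond and finer) both to decide whether to go through Timestamp and to clamp the reported resolution to nanoseconds. A minimal pure-Python sketch of that idea follows. The Unit enum, needs_timestamp and clamp_to_nano are hypothetical stand-ins for illustration only; they are not names from pandas or NumPy.

```python
from enum import Enum, auto


class Unit(Enum):
    """Hypothetical stand-in for the NPY_DATETIMEUNIT values used in the diff."""
    us = auto()
    ns = auto()
    ps = auto()
    fs = auto()
    as_ = auto()  # "as" is a Python keyword, hence the trailing underscore


# Units at nanosecond resolution or finer, mirroring timestamp_units in the diff.
TIMESTAMP_UNITS = frozenset({Unit.ns, Unit.ps, Unit.fs, Unit.as_})


def needs_timestamp(best_unit: Unit, out_local: bool) -> bool:
    """True when the parsed string is nanosecond-or-finer, or carries a local offset."""
    return best_unit in TIMESTAMP_UNITS or out_local


def clamp_to_nano(best_unit: Unit) -> Unit:
    """Report nanosecond or finer units as nanosecond; leave coarser units alone."""
    return Unit.ns if best_unit in TIMESTAMP_UNITS else best_unit
```

Because the set includes the nanosecond unit itself, clamping a nanosecond input is a no-op, which is why a single set can serve both checks.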
13 changes: 4 additions & 9 deletions pandas/_libs/tslibs/period.pyx
@@ -2592,18 +2592,13 @@ class Period(_Period):
 
             freqstr = freq.rule_code if freq is not None else None
             dt, reso = parse_datetime_string_with_reso(value, freqstr)
-            try:
-                ts = Timestamp(value)
-            except ValueError:
-                nanosecond = 0
-            else:
-                nanosecond = ts.nanosecond
-                if nanosecond != 0:
-                    reso = "nanosecond"
+            if reso == "nanosecond":
+                nanosecond = dt.nanosecond
             if dt is NaT:
                 ordinal = NPY_NAT
 
-            if freq is None:
+            if freq is None and ordinal != NPY_NAT:
+                # Skip NaT, since it doesn't have a resolution
                 try:
                     freq = attrname_to_abbrevs[reso]
                 except KeyError:
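
Note on the period.pyx change: the constructor no longer parses the string a second time with Timestamp; it trusts the resolution returned by parse_datetime_string_with_reso and only infers a frequency when the parsed value is not NaT. The control flow can be sketched in plain Python as below. RESO_TO_FREQ is an illustrative subset with assumed alias values, and infer_freq is a hypothetical helper, not the pandas implementation.

```python
from typing import Optional

# Illustrative subset of a resolution-name -> frequency-alias table.
# The alias values here are assumptions made for this sketch.
RESO_TO_FREQ = {"year": "A", "month": "M", "day": "D", "nanosecond": "N"}


def infer_freq(freq: Optional[str], reso: str, is_nat: bool) -> Optional[str]:
    """Infer a frequency alias from the parsed resolution.

    An explicit freq always wins. NaT is skipped because it has no
    resolution to infer from, so the result stays None in that case.
    """
    if freq is None and not is_nat:
        return RESO_TO_FREQ[reso]
    return freq
```

For example, infer_freq(None, "day", False) looks up the alias for "day", while infer_freq(None, "day", True) leaves the frequency unset.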
