Skip to content

BUG: Added deprecation warning to Timestamp constructor #61149

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/whatsnew/v3.0.0.rst
Original file line number Diff line number Diff line change
@@ -416,6 +416,7 @@ Other Deprecations
- Deprecated allowing non-keyword arguments in :meth:`Series.to_string` except ``buf``. (:issue:`57280`)
- Deprecated behavior of :meth:`.DataFrameGroupBy.groups` and :meth:`.SeriesGroupBy.groups`, in a future version ``groups`` by one element list will return tuple instead of scalar. (:issue:`58858`)
- Deprecated behavior of :meth:`Series.dt.to_pytimedelta`, in a future version this will return a :class:`Series` containing python ``datetime.timedelta`` objects instead of an ``ndarray`` of timedelta; this matches the behavior of other :meth:`Series.dt` properties. (:issue:`57463`)
- Deprecated empty string in :class:`Timestamp` (:issue:`61149`)
- Deprecated lowercase strings ``d``, ``b`` and ``c`` denoting frequencies in :class:`Day`, :class:`BusinessDay` and :class:`CustomBusinessDay` in favour of ``D``, ``B`` and ``C`` (:issue:`58998`)
- Deprecated lowercase strings ``w``, ``w-mon``, ``w-tue``, etc. denoting frequencies in :class:`Week` in favour of ``W``, ``W-MON``, ``W-TUE``, etc. (:issue:`58998`)
- Deprecated parameter ``method`` in :meth:`DataFrame.reindex_like` / :meth:`Series.reindex_like` (:issue:`58667`)
8 changes: 8 additions & 0 deletions pandas/_libs/tslibs/timestamps.pyx
Original file line number Diff line number Diff line change
@@ -2587,6 +2587,14 @@ class Timestamp(_Timestamp):
tzinfo is None):
return ts_input
elif isinstance(ts_input, str):
if ts_input == "":
warnings.warn(
"Passing an empty string to Timestamp is deprecated and will raise "
"a ValueError in a future version.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add Use `pd.NaT` directly instead.

FutureWarning,
stacklevel = find_stack_level()
)

# User passed a date string to parse.
# Check that the user didn't also pass a date attribute kwarg.
if any(arg is not None for arg in _date_attributes):
40 changes: 35 additions & 5 deletions pandas/tests/indexing/test_loc.py
Original file line number Diff line number Diff line change
@@ -39,7 +39,10 @@
to_timedelta,
)
import pandas._testing as tm
from pandas.api.types import is_scalar
from pandas.api.types import (
is_datetime64_any_dtype,
is_scalar,
)
from pandas.core.indexing import _one_ellipsis_message
from pandas.tests.indexing.common import check_indexing_smoketest_or_raises

@@ -2084,7 +2087,21 @@ def test_loc_setitem_with_expansion_nonunique_index(self, index):
assert key not in index # otherwise test is invalid
# TODO: using a tuple key breaks here in many cases

exp_index = index.insert(len(index), key)
msg = "Passing an empty string to Timestamp"
contains_datetime = False
if isinstance(index, MultiIndex):
for i in range(index.nlevels):
if is_datetime64_any_dtype(index.get_level_values(i)):
contains_datetime = True
break
if contains_datetime:
with tm.assert_produces_warning(FutureWarning, match=msg):
exp_index = index.insert(len(index), key)
Comment on lines +2098 to +2099
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks problematic - the warning message will say not to pass "" to Timestamp but the user is doing no such thing. It looks like we'll have to change the internals here to avoid triggering the warning message. This applies to all the warnings in this PR where you aren't explicitly passing "" to Timestamp.

If you want any assistance in how to change, let me know and I'm happy to take a look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please advise on what the expected behavior should be? The issue lies in inserting a single key into a MultiIndex when one of the sub-levels is a Timestamp. Since key is a single string in all of these test cases, the key for the Timestamp sub-level defaults to "", which triggers Timestamp("") and therefore throws the warning.

I'm just a little confused why a single key is being inserted into a MultiIndex and if the empty sub-levels are an expected part of the test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current behavior - when index levels are not present to fill with NaT for datetimes - looks correct to me.

I am wondering if DatetimeArray._scalar_from_string should allow value="" to return NaT. That would prevent this warning. cc @mroeschke

else:
exp_index = index.insert(len(index), key)
else:
exp_index = index.insert(len(index), key)

if isinstance(index, MultiIndex):
assert exp_index[-1][0] == key
else:
@@ -2094,19 +2111,32 @@ def test_loc_setitem_with_expansion_nonunique_index(self, index):

# Add new row, but no new columns
df = orig.copy()
df.loc[key, 0] = N
if contains_datetime:
with tm.assert_produces_warning(FutureWarning, match=msg):
df.loc[key, 0] = N
else:
df.loc[key, 0] = N
tm.assert_frame_equal(df, expected)

# add new row on a Series
ser = orig.copy()[0]
ser.loc[key] = N
if contains_datetime:
with tm.assert_produces_warning(FutureWarning, match=msg):
ser.loc[key] = N
else:
ser.loc[key] = N
tm.assert_frame_equal(df, expected)
# the series machinery lets us preserve int dtype instead of float
expected = expected[0].astype(np.int64)
tm.assert_series_equal(ser, expected)

# add new row and new column
df = orig.copy()
df.loc[key, 1] = N
if contains_datetime:
with tm.assert_produces_warning(FutureWarning, match=msg):
df.loc[key, 1] = N
else:
df.loc[key, 1] = N
expected = DataFrame(
{0: list(arr) + [np.nan], 1: [np.nan] * N + [float(N)]},
index=exp_index,
8 changes: 6 additions & 2 deletions pandas/tests/io/formats/test_to_csv.py
Original file line number Diff line number Diff line change
@@ -302,15 +302,19 @@ def test_to_csv_date_format_in_categorical(self):
ser = pd.Series(pd.to_datetime(["2021-03-27", pd.NaT], format="%Y-%m-%d"))
ser = ser.astype("category")
expected = tm.convert_rows_list_to_csv_str(["0", "2021-03-27", '""'])
assert ser.to_csv(index=False) == expected
msg = "Passing an empty string to Timestamp"
with tm.assert_produces_warning(FutureWarning, match=msg):
assert ser.to_csv(index=False) == expected

ser = pd.Series(
pd.date_range(
start="2021-03-27", freq="D", periods=1, tz="Europe/Berlin"
).append(pd.DatetimeIndex([pd.NaT]))
)
ser = ser.astype("category")
assert ser.to_csv(index=False, date_format="%Y-%m-%d") == expected
msg = "Passing an empty string to Timestamp"
with tm.assert_produces_warning(FutureWarning, match=msg):
assert ser.to_csv(index=False, date_format="%Y-%m-%d") == expected

def test_to_csv_float_ea_float_format(self):
# GH#45991
8 changes: 7 additions & 1 deletion pandas/tests/scalar/test_nat.py
Original file line number Diff line number Diff line change
@@ -109,7 +109,13 @@ def test_nat_vector_field_access():
"value", [None, np.nan, iNaT, float("nan"), NaT, "NaT", "nat", "", "NAT"]
)
def test_identity(klass, value):
assert klass(value) is NaT
if value == "" and klass == Timestamp:
msg = "Passing an empty string to Timestamp"
with tm.assert_produces_warning(FutureWarning, match=msg):
result = klass(value)
else:
result = klass(value)
assert result is NaT


@pytest.mark.parametrize("klass", [Timestamp, Timedelta])
5 changes: 5 additions & 0 deletions pandas/tests/scalar/timestamp/test_constructors.py
Original file line number Diff line number Diff line change
@@ -62,6 +62,11 @@ def test_constructor_float_not_round_with_YM_unit_raises(self):
with pytest.raises(ValueError, match=msg):
Timestamp(150.5, unit="M")

def test_constructor_with_empty_string(self):
msg = "Passing an empty string to Timestamp"
with tm.assert_produces_warning(FutureWarning, match=msg):
Timestamp("")

@pytest.mark.parametrize(
"value, check_kwargs",
[
4 changes: 3 additions & 1 deletion pandas/tests/test_multilevel.py
Original file line number Diff line number Diff line change
@@ -313,7 +313,9 @@ def test_multiindex_dt_with_nan(self):
names=[None, "Date"],
),
)
df = df.reset_index()
msg = "Passing an empty string to Timestamp"
with tm.assert_produces_warning(FutureWarning, match=msg):
df = df.reset_index()
result = df[df.columns[0]]
expected = Series(["a", "b", "c", "d"], name=("sub", np.nan))
tm.assert_series_equal(result, expected)
Loading
Oops, something went wrong.