Deprecate Aliases as orient Argument in DataFrame.to_dict #32516

Merged 32 commits on Mar 14, 2020
3cf3ed9
updated to dict function and tests involved
elmonsomiat Mar 6, 2020
2e783e0
rolled back to python_env in yaml file
elmonsomiat Mar 6, 2020
dcc4c2e
added space between lines
elmonsomiat Mar 6, 2020
b755897
added space between lines
elmonsomiat Mar 6, 2020
dc1207b
documented tests0
elmonsomiat Mar 7, 2020
1a78f3c
formated to black
elmonsomiat Mar 7, 2020
0e5a874
Merge branch 'master' of github.com:pandas-dev/pandas into feature/to…
elmonsomiat Mar 7, 2020
43d702c
added whats new
elmonsomiat Mar 7, 2020
e52a3b2
removed line from doc whats new
elmonsomiat Mar 7, 2020
ad5afa5
added comment on other enhancements to test the ci docs
elmonsomiat Mar 7, 2020
f161803
removed changes from docs
elmonsomiat Mar 7, 2020
a2edef1
reset whatsnew file
elmonsomiat Mar 7, 2020
e069a60
added orient lower back
elmonsomiat Mar 7, 2020
7075467
added deprecation warning and rolled back to accepting short versions…
elmonsomiat Mar 8, 2020
fb7de33
updated black formatting
elmonsomiat Mar 8, 2020
4164a87
removed full orient strings to return warning"
elmonsomiat Mar 8, 2020
c598b32
removed deprecation warning test
elmonsomiat Mar 8, 2020
7b9e90f
added whatsnew for deprecation
elmonsomiat Mar 8, 2020
3d6efa5
pulled master to get whatsnew changes
elmonsomiat Mar 8, 2020
6e117ee
added deprecation warning test
elmonsomiat Mar 8, 2020
36f9561
added deprecation warning test
elmonsomiat Mar 8, 2020
a17765e
removed whatsnew
elmonsomiat Mar 8, 2020
9619403
added --again-- whatsnew
elmonsomiat Mar 8, 2020
fa43f06
added back whatsnew
elmonsomiat Mar 9, 2020
d0bc7dc
changed deprecation warning for future warning
elmonsomiat Mar 9, 2020
9e3daba
updated test name
elmonsomiat Mar 9, 2020
d1b9c50
rolled back to original ValueError test for .to_dict
elmonsomiat Mar 9, 2020
9557349
updated whatsnew
elmonsomiat Mar 11, 2020
dcc9d43
moved mapping of strings inside if statements
elmonsomiat Mar 14, 2020
228b71e
Merge branch 'master' of github.com:pandas-dev/pandas into feature/to…
elmonsomiat Mar 14, 2020
beb1f32
Merge branch 'feature/to_dict_orient' of github.com:elmonsomiat/panda…
elmonsomiat Mar 14, 2020
4173ce6
remvoed unused contains upper
elmonsomiat Mar 14, 2020
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.1.0.rst
@@ -175,7 +175,7 @@ Deprecations
 - Lookups on a :class:`Series` with a single-item list containing a slice (e.g. ``ser[[slice(0, 4)]]``) are deprecated, will raise in a future version. Either convert the list to tuple, or pass the slice directly instead (:issue:`31333`)
 - :meth:`DataFrame.mean` and :meth:`DataFrame.median` with ``numeric_only=None`` will include datetime64 and datetime64tz columns in a future version (:issue:`29941`)
 - Setting values with ``.loc`` using a positional slice is deprecated and will raise in a future version. Use ``.loc`` with labels or ``.iloc`` with positions instead (:issue:`31840`)
--
+- :meth:`DataFrame.to_dict` has deprecated accepting short names for ``orient`` in future versions (:issue:`32515`)
 
 .. ---------------------------------------------------------------------------
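
To make the scope of this deprecation concrete, here is a minimal stand-alone sketch (plain Python, no pandas) of which `orient` spellings count as short names under the check this PR adds. The helper name `is_short_orient` and the two module-level constants are hypothetical and not part of pandas.

```python
# Hypothetical helper, not part of pandas: mirrors the condition added in this PR.
FULL_ORIENTS = {"dict", "list", "series", "split", "records", "index"}
ALIAS_PREFIXES = ("d", "l", "s", "r", "i")


def is_short_orient(orient: str) -> bool:
    """Return True if `orient` would trigger the FutureWarning."""
    orient = orient.lower()  # the PR lowercases before checking
    return orient.startswith(ALIAS_PREFIXES) and orient not in FULL_ORIENTS


for value in ["s", "rec", "Index", "records", "xinvalid"]:
    print(f"{value!r}: short name -> {is_short_orient(value)}")
```

Any prefix match qualifies, so a spelling such as `"rec"` is treated the same way as `"r"`, while a differently cased full name such as `"Index"` is not flagged.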

50 changes: 44 additions & 6 deletions pandas/core/frame.py
@@ -1401,11 +1401,45 @@ def to_dict(self, orient="dict", into=dict):
             )
         # GH16122
         into_c = com.standardize_mapping(into)
-        if orient.lower().startswith("d"):
+
+        orient = orient.lower()
WillAyd marked this conversation as resolved.
Line 1328 above might also need a change?

Member

I would think so - are you interested in submitting a PR to update the documentation?

Contributor Author

This needs to be removed in a future PR. See below: there is a FutureWarning saying that short names will no longer be accepted; once that takes effect, we can remove this line.


Cool, so to confirm: in the future, any code like `df.to_dict(orient='s')` will stop working?

Contributor Author

Yep, it should.

+        # GH32515
+        if orient.startswith(("d", "l", "s", "r", "i")) and orient not in {
+            "dict",
+            "list",
+            "series",
+            "split",
+            "records",
+            "index",
+        }:
+            warnings.warn(
+                "Using short name for 'orient' is deprecated. Only the "
+                "options: ('dict', list, 'series', 'split', 'records', 'index') "
+                "will be used in a future version. Use one of the above "
+                "to silence this warning.",
+                FutureWarning,
+            )
+
+            if orient.startswith("d"):
+                orient = "dict"
+            elif orient.startswith("l"):
+                orient = "list"
+            elif orient.startswith("sp"):
+                orient = "split"
+            elif orient.startswith("s"):
+                orient = "series"
+            elif orient.startswith("r"):
+                orient = "records"
+            elif orient.startswith("i"):
+                orient = "index"
+
+        if orient == "dict":
             return into_c((k, v.to_dict(into)) for k, v in self.items())
-        elif orient.lower().startswith("l"):
+
+        elif orient == "list":
             return into_c((k, v.tolist()) for k, v in self.items())
-        elif orient.lower().startswith("sp"):
+
+        elif orient == "split":
             return into_c(
                 (
                     ("index", self.index.tolist()),
@@ -1419,9 +1453,11 @@ def to_dict(self, orient="dict", into=dict):
                         ),
                     )
                 )
-        elif orient.lower().startswith("s"):
+
+        elif orient == "series":
             return into_c((k, com.maybe_box_datetimelike(v)) for k, v in self.items())
-        elif orient.lower().startswith("r"):
+
+        elif orient == "records":
             columns = self.columns.tolist()
             rows = (
                 dict(zip(columns, row))
@@ -1431,13 +1467,15 @@ def to_dict(self, orient="dict", into=dict):
                 into_c((k, com.maybe_box_datetimelike(v)) for k, v in row.items())
                 for row in rows
             ]
-        elif orient.lower().startswith("i"):
+
+        elif orient == "index":
             if not self.index.is_unique:
                 raise ValueError("DataFrame index must be unique for orient='index'.")
             return into_c(
                 (t[0], dict(zip(self.columns, t[1:])))
                 for t in self.itertuples(name=None)
             )
+
         else:
             raise ValueError(f"orient '{orient}' not understood")
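
The alias handling in this file's diff can be reproduced outside pandas. The following is a sketch only: `normalize_orient` and `FULL_ORIENTS` are hypothetical names, the warning text is shortened, and in pandas the logic sits inline in `DataFrame.to_dict` rather than in a helper. It illustrates why the `"sp"` branch has to be tested before the `"s"` branch.

```python
import warnings

FULL_ORIENTS = ("dict", "list", "series", "split", "records", "index")


def normalize_orient(orient: str) -> str:
    """Map a short alias to its full name, warning as the PR does.

    Hypothetical stand-alone helper; not a pandas function.
    """
    orient = orient.lower()
    if orient.startswith(("d", "l", "s", "r", "i")) and orient not in FULL_ORIENTS:
        warnings.warn(
            "Using short name for 'orient' is deprecated.",  # shortened message
            FutureWarning,
        )
        # "sp" must be tested before "s"; otherwise "sp" would resolve to "series".
        for prefix, full in (
            ("d", "dict"),
            ("l", "list"),
            ("sp", "split"),
            ("s", "series"),
            ("r", "records"),
            ("i", "index"),
        ):
            if orient.startswith(prefix):
                return full
    return orient  # full names and unknown values pass through unchanged


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(normalize_orient("sp"), normalize_orient("s"), normalize_orient("split"))
    print(len(caught), "warning(s) raised")
```

Unknown values such as `"xinvalid"` pass through this sketch untouched, which matches the diff: they fall through to the final `else` branch and raise `ValueError` there.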

21 changes: 15 additions & 6 deletions pandas/tests/frame/methods/test_to_dict.py
@@ -70,8 +70,17 @@ def test_to_dict_invalid_orient(self):
         with pytest.raises(ValueError, match=msg):
             df.to_dict(orient="xinvalid")
 
+    @pytest.mark.parametrize("orient", ["d", "l", "r", "sp", "s", "i"])
+    def test_to_dict_short_orient_warns(self, orient):
+        # GH#32515
+        df = DataFrame({"A": [0, 1]})
+        with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
+            df.to_dict(orient=orient)
+
     @pytest.mark.parametrize("mapping", [dict, defaultdict(list), OrderedDict])
     def test_to_dict(self, mapping):
+        # orient= should only take the listed options
+        # see GH#32515
         test_data = {"A": {"1": 1, "2": 2}, "B": {"1": "1", "2": "2", "3": "3"}}
 
         # GH#16122
@@ -81,27 +90,27 @@ def test_to_dict(self, mapping):
             for k2, v2 in v.items():
                 assert v2 == recons_data[k][k2]
 
-        recons_data = DataFrame(test_data).to_dict("l", mapping)
+        recons_data = DataFrame(test_data).to_dict("list", mapping)
 
         for k, v in test_data.items():
             for k2, v2 in v.items():
                 assert v2 == recons_data[k][int(k2) - 1]
 
-        recons_data = DataFrame(test_data).to_dict("s", mapping)
+        recons_data = DataFrame(test_data).to_dict("series", mapping)
 
         for k, v in test_data.items():
             for k2, v2 in v.items():
                 assert v2 == recons_data[k][k2]
 
-        recons_data = DataFrame(test_data).to_dict("sp", mapping)
+        recons_data = DataFrame(test_data).to_dict("split", mapping)
         expected_split = {
             "columns": ["A", "B"],
             "index": ["1", "2", "3"],
             "data": [[1.0, "1"], [2.0, "2"], [np.nan, "3"]],
         }
         tm.assert_dict_equal(recons_data, expected_split)
 
-        recons_data = DataFrame(test_data).to_dict("r", mapping)
+        recons_data = DataFrame(test_data).to_dict("records", mapping)
         expected_records = [
             {"A": 1.0, "B": "1"},
             {"A": 2.0, "B": "2"},
@@ -113,15 +122,15 @@ def test_to_dict(self, mapping):
             tm.assert_dict_equal(l, r)
 
         # GH#10844
-        recons_data = DataFrame(test_data).to_dict("i")
+        recons_data = DataFrame(test_data).to_dict("index")
 
         for k, v in test_data.items():
             for k2, v2 in v.items():
                 assert v2 == recons_data[k2][k]
 
         df = DataFrame(test_data)
         df["duped"] = df[df.columns[0]]
-        recons_data = df.to_dict("i")
+        recons_data = df.to_dict("index")
         comp_data = test_data.copy()
         comp_data["duped"] = comp_data[df.columns[0]]
         for k, v in comp_data.items():
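
The two behaviours these tests pin down (short names warn, unknown names raise) can also be checked with the standard library alone. The sketch below assumes a hypothetical stand-in, `check_orient`, in place of `DataFrame.to_dict`, and uses `warnings.catch_warnings` where the pandas tests use `tm.assert_produces_warning`.

```python
import warnings

FULL_ORIENTS = {"dict", "list", "series", "split", "records", "index"}


def check_orient(orient: str) -> None:
    """Hypothetical stand-in for the validation inside DataFrame.to_dict."""
    orient = orient.lower()
    if orient in FULL_ORIENTS:
        return
    if orient.startswith(("d", "l", "s", "r", "i")):
        warnings.warn("short name for 'orient' is deprecated", FutureWarning)
        return
    raise ValueError(f"orient '{orient}' not understood")


def warning_categories(orient: str) -> list:
    """Call check_orient and return the categories of all warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        check_orient(orient)
    return [w.category for w in caught]


# Same aliases as the parametrized test in the diff.
for alias in ["d", "l", "r", "sp", "s", "i"]:
    assert warning_categories(alias) == [FutureWarning]

# Full names stay silent; unknown values still raise ValueError.
assert warning_categories("records") == []
try:
    check_orient("xinvalid")
except ValueError as err:
    print(err)
```

Recording warnings with the `"always"` filter keeps the check independent of whatever warning filters the surrounding test run has installed.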