Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: to_dict(orient="index", into=T) returns T[K, dict], expecting T[K, T] #34109

Open
2 tasks done
TomGoBravo opened this issue May 10, 2020 · 8 comments
Open
2 tasks done
Labels
Enhancement Needs Discussion Requires discussion from core team before further action

Comments

@TomGoBravo
Copy link

  • I have checked that this issue has not already been reported.
  • [] I have confirmed this bug exists on the latest version of pandas.
  • (optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample, a copy-pastable example

Live at https://repl.it/@TomGoBravo/RoughUncommonDatabase

import pandas as pd
import numpy as np
from collections import OrderedDict

df = pd.DataFrame(
    np.random.randint(10, size=(3, 2)),
    columns=list('AB'),
)

print(df.to_dict(orient='index', into=OrderedDict))
# Today this outputs something like:
# OrderedDict([(0, {'A': 7, 'B': 9}),
#              (1, {'A': 8, 'B': 9}),
#              (2, {'A': 9, 'B': 0})])
# but I'm expecting the values to be OrderedDict too instead
# of {}

Problem description

I expected the into parameter to be used for all dicts returned by to_dict but the implementation for orient="index" has dict hard-coded. A work around is fairly simple but thought I'd report the issue.

Expected Output

I'm expecting the values of the indexed mapping to also be of type into, not dict. The fix is replacing dict at https://github.com/pandas-dev/pandas/blob/master/pandas/core/frame.py#L1503 with into_c. I tried changing this line of code and re-ran the test_fast.sh. It didn't seem to break any tests, I get 78277 passed, 8446 skipped, 1000 xfailed, 31 warnings in 967.05s (0:16:07) vs 78277 passed, 8446 skipped, 1000 xfailed, 31 warnings in 1546.42s (0:25:46) running master with no change.

@TomGoBravo TomGoBravo added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels May 10, 2020
@TomGoBravo
Copy link
Author

I bumped into this while writing act-now-coalition/covid-data-model#394

@jreback
Copy link
Contributor

jreback commented May 10, 2020

why is this useful at all?
dicts are ordered as of 3.6 anyhow so this provides little utility

@TomGoBravo
Copy link
Author

I used OrderedDict as an example non-default into parameter.

@TomAugspurger
Copy link
Contributor

Mmm I don't think that's the original behavior of into. I'm hesitant to overload the meaning of it to apply to nested dictionaries too.

I'm also unsure about adding another parameter to support this since it won't make sense for other orients.

@TomGoBravo
Copy link
Author

TomGoBravo commented May 11, 2020 via email

@TomAugspurger TomAugspurger removed the Needs Triage Issue that has not been reviewed by a pandas team member label May 19, 2020
@TomAugspurger
Copy link
Contributor

@TomGoBravo can you share a bit more about your use-case? Right now, I'm leaning towards closing this as out of scope, since I don't want to complicate the signature for orients this doesn't apply to.

@mroeschke mroeschke added the Needs Info Clarification about behavior needed to assess issue label May 19, 2020
@TomGoBravo
Copy link
Author

I'm copying a DataFrame into a dict that maps from label/index value to a custom mapping for the row with that label. There is probably a better way to do it but what I came up with is at https://github.com/covid-projections/covid-data-model/blob/4f713e0b1c3e17303b2fbb12f5ce2c35eb64d427/test/dataset_utils_test.py#L15-L47

@mroeschke mroeschke changed the title BUG: to_dict(orient="index", into=T) returns T[K, dict], expecting T[K, T] ENH: to_dict(orient="index", into=T) returns T[K, dict], expecting T[K, T] Mar 27, 2021
@mroeschke mroeschke added Enhancement Needs Discussion Requires discussion from core team before further action and removed Needs Info Clarification about behavior needed to assess issue Bug labels Mar 27, 2021
@DDEle
Copy link

DDEle commented Jun 10, 2021

I think we need another argument for the inner constructor. In the case where the document states, someone may use the default value defined with defaultdict for the outer dictionary while a different default value could be needed for fileds of records.

My own case is similar to the original post. I want to serialize a data frame as a Lua table used in my fandom wiki. I use luadata which needs a dictionary-like input. As I want to keep the order of recorded and field, I need some way to pass into the constructors of both inner and outer dictionaries.

Unfurtuanly, I can not propose a name for the new argument as I think my English is not great enough to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Needs Discussion Requires discussion from core team before further action
Projects
None yet
Development

No branches or pull requests

5 participants