
ENH: Copy attrs on join (possibly depending on left, right, etc.) #60351

Closed
@rommeswi

Description


Feature Type

  • Adding new functionality to pandas

    Changing existing functionality in pandas

    Removing existing functionality in pandas

Problem Description

df.join() does not retain the attrs of the dataset. Given that the attrs manual states that "many operations that create new datasets will retain attrs", this seems like an omission.
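A minimal reproduction sketch of the report (the column names and the attrs value are made up for illustration; whether the joined frame keeps attrs depends on the installed pandas version, so no output is asserted here):

```python
import pandas as pd

# Calling frame carries metadata in attrs (hypothetical example value).
left = pd.DataFrame({"a": [1, 2]}, index=["x", "y"])
left.attrs = {"source": "sensor-A"}

# The other frame has no attrs set.
right = pd.DataFrame({"b": [3, 4]}, index=["x", "y"])

joined = left.join(right)

print("left.attrs:  ", left.attrs)
print("joined.attrs:", joined.attrs)  # result varies by pandas version
```

Comparing the two printed dicts shows whether attrs survived the join on a given installation.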

Feature Description

Join is different from concat because there is a clear dataframe that the operation is called on. Therefore, it would seem natural for df.join() to retain the attrs of the initial dataframe.

Alternative Solutions

It would also be possible to make the attrs dependent on "how" but this would only be natural for "left" and "right".

Additional Context

No response

Activity

timhoffm (Contributor) commented on Nov 18, 2024

This should be handled the same way as concat: Propagate only if all inputs have the same attrs.
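The rule "propagate only if all inputs have the same attrs" can be sketched in plain Python as follows. The function name combine_attrs is hypothetical and is not part of pandas; it only illustrates the rule on plain dicts.

```python
from copy import deepcopy
from typing import Any, Dict, Iterable, Mapping


def combine_attrs(inputs: Iterable[Mapping[str, Any]]) -> Dict[str, Any]:
    """Return a copy of the shared attrs if all inputs agree, else {}.

    Hypothetical helper for illustration only; not a pandas API.
    """
    attrs_list = [dict(attrs) for attrs in inputs]
    if not attrs_list:
        return {}
    first = attrs_list[0]
    if all(attrs == first for attrs in attrs_list[1:]):
        # Copy so the result does not share mutable state with the inputs.
        return deepcopy(first)
    return {}


print(combine_attrs([{"unit": "m"}, {"unit": "m"}]))  # {'unit': 'm'}
print(combine_attrs([{"unit": "m"}, {"unit": "s"}]))  # {}
print(combine_attrs([{"unit": "m"}, {}]))             # {}
```

Under this rule, a frame with attrs joined to a frame without attrs yields empty attrs, which differs from the "keep the calling frame's attrs" behavior requested in the issue description.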

concat is currently a hard-coded special case,

if method == "concat":

but we may want to delegate the attrs combination back to the operation instead, i.e.

if hasattr(other, "__combined_attrs__"):
    self.attrs = other.__combined_attrs__()

Note that other is the "combination object" for these calls, i.e. _MergeOperation, _Concatenator etc., which will have to grow the logic for combining attrs.
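A rough sketch of that delegation, using stand-in classes rather than the real pandas internals. Frame and MergeOp are hypothetical, and only the __combined_attrs__ hook name comes from the snippet above:

```python
from copy import deepcopy


class Frame:
    """Hypothetical stand-in for a DataFrame; it only carries attrs."""

    def __init__(self, attrs=None):
        self.attrs = dict(attrs or {})

    def finalize(self, other):
        # Delegate the attrs decision to the combination object, if it offers one.
        if hasattr(other, "__combined_attrs__"):
            self.attrs = other.__combined_attrs__()
        return self


class MergeOp:
    """Hypothetical stand-in for a merge-style combination object."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def __combined_attrs__(self):
        # Propagate only when both inputs agree; otherwise drop attrs.
        if self.left.attrs == self.right.attrs:
            return deepcopy(self.left.attrs)
        return {}


same = Frame().finalize(MergeOp(Frame({"unit": "m"}), Frame({"unit": "m"})))
mixed = Frame().finalize(MergeOp(Frame({"unit": "m"}), Frame({"unit": "s"})))
print(same.attrs)   # {'unit': 'm'}
print(mixed.attrs)  # {}
```

With this shape, each combination object owns its own policy, so a join-specific object could choose a different rule without touching the finalize step.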

Alternatively, one could leave the combination logic in __finalize__ but provide a uniform interface on all "combination objects" to expose their inputs. Currently, that's non-uniform: _Concatenator.objs versus _MergeOperation.left/_MergeOperation.right.

rhshadrach (Member) commented on Nov 18, 2024

Thanks for the report! @timhoffm - can you post a reproducible example?

added the labels Needs Info (Clarification about behavior needed to assess issue), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), and metadata (_metadata, .attrs), and removed Needs Triage (Issue that has not been reviewed by a pandas team member) on Nov 18, 2024
added a commit that references this issue on Nov 19, 2024
1d4c974

timhoffm (Contributor) commented on Nov 19, 2024

#60357 should fix this. I've chosen the somewhat smaller refactoring and not pushed the combination logic back into the "combination objects". In fact, #59141 removed _Concatenator in favor of simple functions. Therefore, I've now chosen the common API to be "provides the inputs via input_objs parameter".
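For readers following along, here is a rough sketch of what "provides the inputs via input_objs" could look like. It assumes the combination object exposes its inputs as an attribute named input_objs; the helper finalize_attrs and the SimpleNamespace stand-ins are hypothetical and do not reproduce the actual code in #60357.

```python
from copy import deepcopy
from types import SimpleNamespace


def finalize_attrs(result, other):
    """Copy attrs onto result only if all of other's inputs share them.

    Hypothetical helper; `other` is any object exposing `input_objs`.
    """
    objs = getattr(other, "input_objs", None)
    if not objs:
        return result
    first = objs[0].attrs
    # Require non-empty attrs that are equal across every input.
    if first and all(obj.attrs == first for obj in objs[1:]):
        result.attrs = deepcopy(first)
    return result


left = SimpleNamespace(attrs={"unit": "m"})
right = SimpleNamespace(attrs={"unit": "m"})
no_attrs = SimpleNamespace(attrs={})

# Merge-like and concat-like operations expose their inputs the same way.
merge_like = SimpleNamespace(input_objs=[left, right])
concat_like = SimpleNamespace(input_objs=[left, right, no_attrs])

merged = finalize_attrs(SimpleNamespace(attrs={}), merge_like)
concatenated = finalize_attrs(SimpleNamespace(attrs={}), concat_like)
print(merged.attrs)        # {'unit': 'm'}
print(concatenated.attrs)  # {}
```

The point of the uniform attribute is that the finalize step no longer needs to know whether the inputs came from a two-sided merge or an n-ary concat.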

Note that join() is implemented via concat() or merge() depending on the case. I've only added explicit tests for these fundamental operations, not for join(), but could add that if desired.

added 4 commits that reference this issue on Nov 19, 2024
8bd828c
1e8c04b
6697172
7815e73

