ENH: Add totality validation to merge method #58547

@z3rone

Feature Type

  • Adding new functionality to pandas

Problem Description

The available validation methods lack checks for (left-/right-)totality. I frequently encounter cases where I need to manually check that, e.g., a one-to-one merge also finds a match in the right DataFrame for every row in the left DataFrame, or vice versa.

Feature Description

Add the following to one_to_one, one_to_many and many_to_one merge validations:

  • left_total ... Each row in the left DataFrame is matched to (at least) one row in the right DataFrame
  • right_total ... Each row in the right DataFrame is matched to (at least) one row in the left DataFrame
  • total ... Both left_total and right_total must hold

It should be possible to combine a join relation with a totality constraint by joining the two with a +, for example one_to_one+left_total.
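To make the proposal concrete, here is a minimal sketch of how the combined syntax could behave, written as a wrapper around the existing merge. The function name merge_checked and the parsing of the "relation+totality" string are assumptions for illustration only; nothing like this exists in pandas today. The relation part is passed to the existing validate argument, and totality is checked with an outer join on the key columns using indicator=True.

```python
import pandas as pd

TOTALITY_OPTIONS = ("left_total", "right_total", "total")


def merge_checked(left, right, on, how="inner", validate="one_to_one+total"):
    """Hypothetical wrapper: merge with a 'relation+totality' validation string."""
    # Split e.g. "one_to_one+left_total" into ("one_to_one", "left_total").
    relation, _, totality = validate.partition("+")
    keys = [on] if isinstance(on, str) else list(on)

    if totality:
        if totality not in TOTALITY_OPTIONS:
            raise ValueError(f"unknown totality constraint: {totality!r}")

        # Only the distinct key combinations matter for totality.
        probe = left[keys].drop_duplicates().merge(
            right[keys].drop_duplicates(),
            on=keys,
            how="outer",
            indicator=True,  # adds a "_merge" column: left_only / right_only / both
        )
        unmatched_left = int((probe["_merge"] == "left_only").sum())
        unmatched_right = int((probe["_merge"] == "right_only").sum())

        if totality in ("left_total", "total") and unmatched_left:
            raise pd.errors.MergeError(
                f"{unmatched_left} key(s) in left have no match in right"
            )
        if totality in ("right_total", "total") and unmatched_right:
            raise pd.errors.MergeError(
                f"{unmatched_right} key(s) in right have no match in left"
            )

    # The relation itself is checked by the existing `validate` argument.
    return left.merge(right, on=keys, how=how, validate=relation)


left_df = pd.DataFrame({"k": [1, 2, 3], "a": ["x", "y", "z"]})
right_df = pd.DataFrame({"k": [1, 2], "b": [0.1, 0.2]})

# Every right row has a partner in left, so this passes.
result = merge_checked(left_df, right_df, on="k", validate="one_to_one+right_total")

# Key 3 in left has no partner in right, so left_total is violated.
try:
    merge_checked(left_df, right_df, on="k", validate="one_to_one+left_total")
except pd.errors.MergeError as exc:
    print("left_total violated:", exc)
```

Checking totality on the de-duplicated keys keeps the probe join small and independent of the how argument the caller actually wants for the result.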

Alternative Solutions

Currently, doing an outer join and checking for NaN values in the "foreign" columns works to find unmatched rows. However, this fails if the initial DataFrames already contain NaN values in those columns.
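The failure mode described here can be reproduced with a small example. The frames and column names below are made up for illustration. The second check uses the existing indicator=True option of merge, which marks where each row came from and is therefore not confused by NaN values already present in the data; it still has to be written by hand for every merge.

```python
import pandas as pd

left = pd.DataFrame({"key": [1, 2, 3], "x": [10, 20, 30]})
# Key 2 is present in `right`, but its value is already NaN.
right = pd.DataFrame({"key": [1, 2], "y": [0.5, float("nan")]})

merged = left.merge(right, on="key", how="outer", indicator=True)

# NaN-based workaround: flags key 2 as well, although it did find a match.
nan_based = sorted(merged.loc[merged["y"].isna(), "key"].tolist())

# Indicator-based check: only rows that really have no partner in `right`.
indicator_based = sorted(merged.loc[merged["_merge"] == "left_only", "key"].tolist())

print("NaN-based unmatched keys:      ", nan_based)
print("indicator-based unmatched keys:", indicator_based)
```

Here the NaN-based check reports keys 2 and 3, while only key 3 is actually unmatched.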

Additional Context

No response

Labels

  • Enhancement
  • Needs Discussion (requires discussion from core team before further action)
  • Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
