@AbelJSanchez

Added test_sum_string_dtype_coercion(), which checks that df.sum() on numeric strings results in concatenation, not coercion to dtype int64 or float64.

I wrote three different assertions:

  1. Sum of two DataFrames with "integer" strings
  2. Sum of two DataFrames with "floating point" strings
  3. Sum of two DataFrames with both kinds of numeric strings
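
For readers unfamiliar with the distinction, here is a minimal pure-Python sketch of what "concatenation" versus "coercion" means for the values used in the test. It does not call pandas; the `rows` list is only an illustrative stand-in for columns "a" and "b".

```python
# Two rows of numeric-looking strings, mirroring columns "a" and "b" in the test.
rows = [("483", "94"), ("3", "759")]

# Concatenation: adding str values joins them end to end.
concatenated = [a + b for a, b in rows]

# Coercion: converting to int first turns the same operation into arithmetic.
coerced = [int(a) + int(b) for a, b in rows]

print(concatenated)  # ['48394', '3759']
print(coerced)       # [577, 762]
```

The test asserts the first outcome and guards against the second.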

…ngs results in concatenation and not coercion to dtype int64 or float64
Comment on lines +1050 to +1052
df = DataFrame({"a": ["483", "3"], "b": ["94", "759"]})
result = df.sum(axis=1)
expected = Series(["48394", "3759"])
Member

Could you use pytest.mark.parametrize like

@pytest.mark.parametrize("input_data,expected_data", [[{"a": ...}, ["48394", ...]], ...])

so we don't have to repeat 3 separate setups?
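
A rough sketch of what the parametrized version might look like, assuming the test body stays the same as in the reviewed lines. Only the first case comes from the PR. The "floating point" and mixed values are illustrative placeholders, and `pd.testing.assert_series_equal` stands in for whatever assertion helper the test module already uses. Whether the assertions pass depends on the pandas version, since the linked issue reports coercion in some cases.

```python
import pandas as pd
import pytest


@pytest.mark.parametrize(
    "input_data,expected_data",
    [
        # "Integer" strings: the case from the reviewed lines.
        ({"a": ["483", "3"], "b": ["94", "759"]}, ["48394", "3759"]),
        # "Floating point" strings (illustrative values, not from the PR).
        ({"a": ["1.5", "2.0"], "b": ["3.25", "4.5"]}, ["1.53.25", "2.04.5"]),
        # Both kinds of numeric strings (illustrative values, not from the PR).
        ({"a": ["1", "2.5"], "b": ["3.0", "4"]}, ["13.0", "2.54"]),
    ],
)
def test_sum_string_dtype_coercion(input_data, expected_data):
    # One shared setup replaces the three separate ones.
    df = pd.DataFrame(input_data)
    result = df.sum(axis=1)
    expected = pd.Series(expected_data)
    pd.testing.assert_series_equal(result, expected)
```

Each tuple becomes its own test case in the pytest report, so a failure points at the specific input that was coerced.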

Author

Yes, I can. I will make these changes tonight and resubmit.

@mroeschke added the "Testing" label (pandas testing functions or related to the test suite) on Dec 4, 2025

Development

Successfully merging this pull request may close these issues.

BUG: df.sum() of string columns depends on whether or not they can be coerced to numbers
