@Xenomorph149

Summary of Changes:

  • Enhanced the error messaging in the .query() method for pandas DataFrames when duplicate column names are present.
  • Prior to this change, invoking .query() on a DataFrame with duplicate column names resulted in an unclear TypeError, making it difficult for users to understand the root cause.
  • With this update, users will now receive a more descriptive and helpful ValueError, similar to when columns are accessed directly, with a message such as:
    "ValueError: cannot reindex on an axis with duplicate labels"
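The kind of early check described above can be sketched as a small standalone wrapper. This is only an illustration and not the code from this PR: the helper name `query_checked` and the wording of its error message are hypothetical, and the real change would live inside pandas itself.

```python
import pandas as pd


def query_checked(df: pd.DataFrame, expr: str) -> pd.DataFrame:
    """Hypothetical helper: refuse to query a frame with duplicate column labels."""
    if not df.columns.is_unique:
        # Collect each duplicated label once so the message names the culprits.
        dupes = df.columns[df.columns.duplicated()].unique().tolist()
        raise ValueError(
            f"cannot query a DataFrame with duplicate column labels: {dupes}"
        )
    return df.query(expr)
```

Calling `query_checked(df, "A <= 4 and B <= 8")` on a frame whose columns are `["A", "B", "A"]` would raise the `ValueError` before the expression is ever evaluated, while a frame with unique labels is passed straight through to `.query()`.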

Reasoning Behind the Change:

  • Previously, .query() gave no clear feedback when users ran queries on DataFrames with duplicate column names.
  • By improving the error message, we enhance the overall user experience, making it easier for users to diagnose and resolve issues related to duplicate columns.
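To illustrate why a duplicated label is ambiguous in a query expression, here is a minimal sketch with made-up sample data: selecting a duplicated label returns every matching column as a DataFrame rather than a single Series, so a comparison such as `A <= 4` no longer refers to one column.

```python
import pandas as pd

# Duplicate labels have to be passed through `columns`; the data is illustrative.
df = pd.DataFrame(
    [[1, 10, 5], [2, 8, 4], [3, 6, 3]],
    columns=["A", "B", "A"],
)

print(df.columns.is_unique)      # False

selected = df["A"]               # both "A" columns come back together
print(type(selected).__name__)   # DataFrame
print(selected.shape)            # (3, 2)
```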

Testing Approach:

  • I tested the change by creating a sample DataFrame with duplicate column names and attempting to execute a .query() operation.
  • Below is the test case used:
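import pandas as pd

# Create a DataFrame with duplicate column names.
# A dict literal cannot hold two "A" keys (the later value silently
# replaces the earlier one), so pass the labels through `columns`.
df = pd.DataFrame(
    [[1, 10, 5], [2, 8, 4], [3, 6, 3], [4, 4, 2], [5, 2, 1]],
    columns=["A", "B", "A"],  # Duplicate column name "A"
)

# Test the query functionality
try:
    result = df.query("A <= 4 and B <= 8")
    print(result)
except Exception as e:
    # Show the exception type as well, since the change is about TypeError vs. ValueError
    print(f"Error: {type(e).__name__}: {e}")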
  • After applying the fix, the code will raise a ValueError, indicating that queries cannot be executed due to duplicate column names, making it much easier to pinpoint the issue.

Issue Addressed:

  • This PR resolves the issue documented in #60863, where .query() failed to provide a clear error message when used on DataFrames containing duplicate column names.

@github-actions
Contributor

github-actions bot commented Mar 9, 2025

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Mar 9, 2025
@mroeschke
Member

Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.

@mroeschke mroeschke closed this Mar 10, 2025