Add Melt method to DataFrame #7578
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage Diff
| | main | #7578 | +/- |
|---|---|---|---|
| Coverage | 69.05% | 69.10% | +0.04% |
| Files | 1483 | 1483 | |
| Lines | 274362 | 274693 | +331 |
| Branches | 28270 | 28294 | +24 |
| Hits | 189466 | 189824 | +358 |
| Misses | 77510 | 77484 | -26 |
| Partials | 7386 | 7385 | -1 |
Not sure who all would want to look at this, but I have another PR here. @tarekgh @ericstj @jeffhandley
Pull request overview
Adds a DataFrame.Melt() API to reshape data from wide to long format (similar to pandas.melt), enabling a common “unpivot” transformation within Microsoft.Data.Analysis.
Changes:
- Introduces DataFrame.Melt(...) plus helper methods for validation, sizing, column initialization, and filling.
- Implements optional null/empty filtering (dropNulls) and mixed-type handling (stringifying values when needed).
- Adds new unit tests covering core melt scenarios and some invalid-input cases.
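The null filtering and mixed-type handling mentioned above can be illustrated with a small plain-Python sketch. This is not the PR's C# code: the function name `build_value_column`, the `drop_nulls` parameter, and the rule that both `None` and the empty string count as missing are assumptions made for illustration.

```python
def build_value_column(columns, drop_nulls=False):
    """Flatten several value columns into one list, stringifying if types are mixed."""
    values = [v for col in columns for v in col]
    if drop_nulls:
        # Assumption: "null/empty" means None or the empty string.
        values = [v for v in values if v is not None and v != ""]
    # Collect the element types that are actually present, ignoring None.
    kinds = {type(v) for v in values if v is not None}
    if len(kinds) > 1:
        # Mixed element types: fall back to strings so the column is homogeneous.
        values = [None if v is None else str(v) for v in values]
    return values


mixed = build_value_column([[1, 2], ["a", ""]], drop_nulls=True)
same = build_value_column([[1, None], [3, 4]])
```

In a full melt, dropping a value would also skip the matching ID and variable entries so that all output columns stay aligned; the sketch only shows the value side.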
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/Microsoft.Data.Analysis/DataFrame.cs | Adds the Melt() API and its helper methods to produce a long-format DataFrame. |
| test/Microsoft.Data.Analysis.Tests/DataFrameTests.cs | Adds theory data + tests validating melt output and a few invalid-input cases. |
…e underlying data type.
I have implemented all the Copilot suggestions!
Add DataFrame.Melt() method for transforming wide to long format
Description
This PR implements a Melt() method for the DataFrame class that transforms data from wide format to long format, similar to Pandas' pandas.melt() function. This is a fundamental data reshaping operation that "unpivots" multiple value columns into a pair of variable-value columns.
Fixes #7577
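To make the wide-to-long "unpivot" concrete, here is a minimal sketch in plain Python over a dict of lists. It is an illustration of the operation, not the PR's implementation: the names `melt`, `id_vars`, `value_vars`, `var_name`, and `value_name` are borrowed from pandas, and the row order (all rows for the first value column, then the next) follows pandas and is only assumed to match the .NET method.

```python
def melt(table, id_vars, value_vars, var_name="variable", value_name="value"):
    """Unpivot value_vars into (variable, value) pairs, repeating id_vars."""
    n_rows = len(next(iter(table.values()))) if table else 0
    out = {name: [] for name in id_vars}
    out[var_name] = []
    out[value_name] = []
    # Emit one block of rows per value column.
    for col in value_vars:
        for i in range(n_rows):
            for name in id_vars:
                out[name].append(table[name][i])
            out[var_name].append(col)        # which column the value came from
            out[value_name].append(table[col][i])
    return out


wide = {"id": [1, 2], "math": [90, 80], "art": [70, 60]}
long_table = melt(wide, id_vars=["id"], value_vars=["math", "art"])
```

Each input row yields one output row per value column, so a table with 2 rows and 2 value columns produces 4 rows.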
What does this change do?
The Melt() method:
Why this approach?
Performance optimizations:
Design decisions:
API signature:
Changes included
- Melt() method and supporting helper methods
  - CalculateTotalOutputRows(): Pre-calculates output size for efficient allocation
  - InitializeIdColumns(): Sets up ID columns with correct size
  - CreateValueColumn(): Creates appropriately typed value column
  - FillMeltedData(): Performs the actual unpivoting operation
Example usage
Additional notes
This implementation brings the .NET DataFrame API closer to feature parity with Pandas and supports common data transformation workflows needed for analysis and visualization. The method is optimized for performance while maintaining code readability and maintainability.
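The helper breakdown listed under "Changes included" suggests a two-pass shape: count the output rows first, allocate the columns once, then fill them. The following plain-Python sketch shows that shape under stated assumptions. The function names, the `drop_nulls` flag, and the treatment of `None` and the empty string as missing are hypothetical and are not taken from the PR's C# code.

```python
def is_missing(v):
    # Assumption: None and the empty string both count as missing.
    return v is None or v == ""


def count_output_rows(table, value_vars, drop_nulls):
    """First pass: how many rows the melted table will contain."""
    if not drop_nulls:
        n_rows = len(table[value_vars[0]]) if value_vars else 0
        return n_rows * len(value_vars)
    return sum(1 for col in value_vars for v in table[col] if not is_missing(v))


def melt_preallocated(table, id_vars, value_vars, drop_nulls=False):
    total = count_output_rows(table, value_vars, drop_nulls)
    # Allocate every output column once at its final size.
    out = {name: [None] * total for name in id_vars}
    out["variable"] = [None] * total
    out["value"] = [None] * total
    pos = 0
    # Second pass: write each kept value and its ID values at the next slot.
    for col in value_vars:
        for i, v in enumerate(table[col]):
            if drop_nulls and is_missing(v):
                continue
            for name in id_vars:
                out[name][pos] = table[name][i]
            out["variable"][pos] = col
            out["value"][pos] = v
            pos += 1
    return out


wide = {"id": [1, 2], "a": [10, None], "b": ["", 40]}
melted = melt_preallocated(wide, ["id"], ["a", "b"], drop_nulls=True)
```

Counting first costs an extra scan of the value columns when filtering is enabled, but it avoids growing the output columns incrementally.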