[Autoloop] [Autoloop: build-tsb-pandas-typescript-migration] Iteration 6: GroupBy #21
Closed
github-actions[bot] wants to merge 2 commits into
Column-oriented 2-D labeled table (pandas.DataFrame) with:
- Three factory methods: fromColumns, fromRecords, from2D
- Shape/ndim/size/empty properties
- Column access: col, get, has
- Slicing: head, tail, iloc, loc
- Column mutations: assign, drop, select, rename
- Missing-value handling: isna, notna, dropna, fillna
- Boolean filter
- Aggregations: sum, mean, min, max, std, count, describe
- Sorting: sortValues (single/multi-column, mixed dirs), sortIndex
- apply (axis=0 column-wise, axis=1 row-wise)
- Iteration: items, iteritems, iterrows
- Conversion: toRecords, toDict, toArray
- Index manipulation: resetIndex, setIndex
- toString (aligned text table)

Full test suite (35+ test cases) and interactive playground page.

Run: https://github.com/githubnext/tsessebe/actions/runs/23971604724
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
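To make "column-oriented" concrete, here is a minimal sketch of converting row records into per-column arrays and back. It is not the code from this commit: the names recordsToColumns, columnsToRecords, and shapeOf are hypothetical, and filling missing keys with null is an assumption made for the illustration.

```typescript
// Illustrative sketch only; names and null-filling behavior are assumptions,
// not the tsb DataFrame implementation.
type Scalar = string | number | boolean | null;
type Row = Record<string, Scalar>;
type Columns = Record<string, Scalar[]>;

/** Pivot row records into one array per column, in first-seen column order. */
function recordsToColumns(records: readonly Row[]): Columns {
  const names = new Set<string>();
  for (const record of records) {
    for (const name of Object.keys(record)) {
      names.add(name);
    }
  }
  const columns: Columns = {};
  for (const name of names) {
    // A key missing from a record becomes null in that column.
    columns[name] = records.map((record) => record[name] ?? null);
  }
  return columns;
}

/** Pivot per-column arrays back into row records. */
function columnsToRecords(columns: Columns): Row[] {
  const names = Object.keys(columns);
  const length = names.length === 0 ? 0 : columns[names[0]].length;
  return Array.from({ length }, (_, i) => {
    const row: Row = {};
    for (const name of names) {
      row[name] = columns[name][i];
    }
    return row;
  });
}

/** [rowCount, columnCount], analogous to a shape property. */
function shapeOf(columns: Columns): [number, number] {
  const names = Object.keys(columns);
  return [names.length === 0 ? 0 : columns[names[0]].length, names.length];
}

const records: Row[] = [
  { name: "ann", score: 10 },
  { name: "bob", score: 20, team: "b" },
];
const columns = recordsToColumns(records);
const roundTrip = columnsToRecords(columns);
const shape = shapeOf(columns);
```

Storing one array per column keeps column-wise operations (aggregations, sorting by a column) as simple array passes, at the cost of a pivot whenever rows are needed.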
Implement SeriesGroupBy and DataFrameGroupBy, the split-apply-combine engine, mirroring pandas.core.groupby.

src/groupby/groupby.ts:
- SeriesGroupBy: group a Series by a label array, index, or another Series
- DataFrameGroupBy: group a DataFrame by one or more column names
- Built-in aggregations: sum, mean, min, max, count, first, last, std
- agg(func | builtinName | spec) for uniform or per-column aggregation
- transform(): same-shape result (e.g., demeaning)
- apply(): arbitrary function returning one scalar per group
- getGroup(key): retrieve a sub-Series or sub-DataFrame for a group
- Symbol.iterator: iterate over [key, group] pairs
- toLabel() helper safely coerces bigint/Date keys to Label-compatible strings

tests/groupby/groupby.test.ts:
- 35+ unit tests covering construction, all aggregations, transform, apply, getGroup, iteration, error handling
- 4 property-based tests (fast-check): sum invariant, count invariant, ngroups bounds

playground/groupby.html:
- Full interactive tutorial page for GroupBy

Run: https://github.com/githubnext/tsessebe/actions/runs/23971938070
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
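The split-apply-combine idea behind these aggregations can be sketched on plain arrays. This is not the contents of src/groupby/groupby.ts: splitByKey, aggregate, and the reducers below are hypothetical names, and returning NaN from the sample standard deviation for groups with fewer than two values is an assumption modeled on pandas' default behavior.

```typescript
// Illustrative sketch only; helper names are hypothetical, not the tsb API.
type Label = string | number | boolean;

/** Split: collect the values that share each key, in first-seen key order. */
function splitByKey(values: readonly number[], by: readonly Label[]): Map<Label, number[]> {
  if (values.length !== by.length) {
    throw new RangeError("values and by must have the same length");
  }
  const groups = new Map<Label, number[]>();
  values.forEach((value, i) => {
    const key = by[i];
    const bucket = groups.get(key);
    if (bucket === undefined) {
      groups.set(key, [value]);
    } else {
      bucket.push(value);
    }
  });
  return groups;
}

/** Apply and combine: reduce each group to one scalar, keyed by group label. */
function aggregate(
  values: readonly number[],
  by: readonly Label[],
  reducer: (group: number[]) => number,
): Map<Label, number> {
  const result = new Map<Label, number>();
  for (const [key, group] of splitByKey(values, by)) {
    result.set(key, reducer(group));
  }
  return result;
}

const sum = (xs: number[]): number => xs.reduce((acc, x) => acc + x, 0);
const mean = (xs: number[]): number => sum(xs) / xs.length;

/** Sample standard deviation (n - 1 denominator). Assumption: NaN when n < 2. */
const sampleStd = (xs: number[]): number => {
  if (xs.length < 2) return NaN;
  const m = mean(xs);
  return Math.sqrt(sum(xs.map((x) => (x - m) ** 2)) / (xs.length - 1));
};

const scores = [10, 20, 30, 40, 50];
const teams = ["a", "b", "a", "b", "a"];

const sums = aggregate(scores, teams, sum);       // a: 90, b: 60
const means = aggregate(scores, teams, mean);     // a: 30, b: 30
const stds = aggregate(scores, teams, sampleStd); // a: 20, b: sqrt(200)

// Sum invariant: the group sums add up to the total of all values.
const totalOfGroupSums = sum([...sums.values()]);
```

The last line is the kind of relation a "sum invariant" property test can check: for any input, the per-group sums should add up to the sum of the ungrouped values (up to floating-point rounding).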
🤖 This PR is maintained by Autoloop. Each accepted iteration adds a commit to this branch.
Program Goal
Build tsb, a complete TypeScript port of pandas, one feature at a time.

Iteration 6 Summary: GroupBy
Implemented SeriesGroupBy and DataFrameGroupBy, the split-apply-combine engine at the heart of pandas' groupby API.

src/groupby/groupby.ts

SeriesGroupBy (from seriesGroupBy(series, by)):
- by: a label array, "index" (own index labels), or another Series
- sum, mean, min, max, count, first, last, std
- agg(func | builtinName): uniform aggregation
- transform(func): same-shape result (e.g., group demeaning)
- apply(func): one scalar per group
- getGroup(key), Symbol.iterator

DataFrameGroupBy (from dataFrameGroupBy(df, by)):
- agg(spec): per-column aggregations ({ score: "sum", weight: "mean" })
- transform(column, func): column-wise transform returning same-length Series
- apply(func), getGroup(key), Symbol.iterator

Built-in aggregations: sum, mean, min, max, count, first, last, std (sample)

tests/groupby/groupby.test.ts (35+ tests)
- fast-check: sum invariant, count invariant, ngroups bounds, cross-DataFrame sum invariant

playground/groupby.html
Interactive tutorial page covering the full GroupBy API.

playground/index.html
GroupBy marked as ✅ Complete in the roadmap.
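As a rough illustration of how getGroup, iteration over [key, group] pairs, and a same-shape transform fit together, here is a minimal sketch over plain arrays. The GroupedValues class is hypothetical and much simpler than the SeriesGroupBy described above; string-only keys and throwing on an unknown key are assumptions made for the example.

```typescript
// Illustrative sketch only; GroupedValues is a hypothetical stand-in,
// not the SeriesGroupBy class from this PR.
class GroupedValues {
  private readonly values: readonly number[];
  // Row positions per key, in first-seen key order.
  private readonly positions = new Map<string, number[]>();

  constructor(values: readonly number[], by: readonly string[]) {
    if (values.length !== by.length) {
      throw new RangeError("values and by must have the same length");
    }
    this.values = values;
    by.forEach((key, i) => {
      const rows = this.positions.get(key);
      if (rows === undefined) {
        this.positions.set(key, [i]);
      } else {
        rows.push(i);
      }
    });
  }

  /** Values belonging to one group. Assumption: unknown keys throw. */
  getGroup(key: string): number[] {
    const rows = this.positions.get(key);
    if (rows === undefined) {
      throw new Error(`unknown group key: ${key}`);
    }
    return rows.map((i) => this.values[i]);
  }

  /** Iterate over [key, group] pairs. */
  *[Symbol.iterator](): IterableIterator<[string, number[]]> {
    for (const key of this.positions.keys()) {
      yield [key, this.getGroup(key)];
    }
  }

  /** Apply func per group and scatter results back to the original row order. */
  transform(func: (group: number[]) => number[]): number[] {
    const out = new Array<number>(this.values.length);
    for (const [key, rows] of this.positions) {
      const transformed = func(this.getGroup(key));
      if (transformed.length !== rows.length) {
        throw new Error("transform must preserve group length");
      }
      rows.forEach((row, j) => {
        out[row] = transformed[j];
      });
    }
    return out;
  }
}

// Group demeaning: subtract each group's mean from its members.
const demean = (xs: number[]): number[] => {
  const m = xs.reduce((acc, x) => acc + x, 0) / xs.length;
  return xs.map((x) => x - m);
};

const grouped = new GroupedValues([1, 3, 10, 20, 5], ["a", "a", "b", "b", "a"]);
const groupA = grouped.getGroup("a");       // [1, 3, 5]
const pairs = [...grouped];                 // [["a", [1, 3, 5]], ["b", [10, 20]]]
const demeaned = grouped.transform(demean); // [-2, 0, -5, 5, 2]
```

Keeping row positions rather than copied values is what lets transform write each result back to the row it came from, so the output has the same length and order as the input.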
Metric: pandas_features_ported = 7 (was 6, +1)

Links