
[Autoloop] [Autoloop: build-tsb-pandas-typescript-migration] Iteration 72: value_counts + elem_ops exports #57

Merged: mrjf merged 20 commits into main from autoloop/build-tsb-pandas-typescript-migration-32862b917c67b1cd on Apr 7, 2026

@github-actions github-actions Bot commented Apr 5, 2026

🤖 This PR is maintained by Autoloop. Each accepted iteration adds a commit to this branch.

Program Goal

Build tsb, a complete TypeScript port of pandas, one feature at a time.

Metric: pandas_features_ported (higher is better)
Current best: 28 (iteration 72)
Branch: autoloop/build-tsb-pandas-typescript-migration

Iteration 72 Summary

src/stats/value_counts.ts — value_counts for Series and DataFrame

Mirrors pandas.Series.value_counts() and pandas.DataFrame.value_counts():

  • valueCounts(series, opts) — count (or proportion) of each unique value in a Series
  • dataFrameValueCounts(df, opts) — count unique row combinations across selected columns
  • Options: normalize, sort, ascending, dropna, subset (DataFrame only)
  • scalarKey() mapper produces stable Map string keys for any Scalar type
  • Composite "v1|v2|…" labels for DataFrame row combinations
  • 25 unit + property-based tests (fast-check); all pass
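
The counting approach described above can be sketched on a plain array. This is a minimal illustration under stated assumptions, not the code in src/stats/value_counts.ts: `SketchScalar`, `scalarKeySketch`, and `valueCountsSketch` are hypothetical names, and the type-prefixed key scheme is only one plausible way to get stable Map keys.

```typescript
// Hypothetical stand-in for tsb's Scalar type; the real definition may differ.
type SketchScalar = string | number | boolean | null | undefined;

// Assumed key scheme: prefix with the runtime type so 1 and "1" stay distinct.
function scalarKeySketch(v: SketchScalar): string {
  if (v === null || v === undefined) return "null:";
  if (typeof v === "number" && Number.isNaN(v)) return "nan:";
  return `${typeof v}:${String(v)}`;
}

interface CountOptionsSketch {
  normalize?: boolean; // return proportions instead of counts
  sort?: boolean;      // sort by count
  ascending?: boolean; // sort direction when sort is true
  dropna?: boolean;    // skip null / undefined / NaN
}

function valueCountsSketch(
  values: readonly SketchScalar[],
  opts: CountOptionsSketch = {},
): Array<[SketchScalar, number]> {
  const { normalize = false, sort = true, ascending = false, dropna = true } = opts;
  const counts = new Map<string, { value: SketchScalar; count: number }>();
  let total = 0;
  for (const v of values) {
    const missing =
      v === null || v === undefined || (typeof v === "number" && Number.isNaN(v));
    if (missing && dropna) continue;
    total += 1;
    const key = scalarKeySketch(v);
    const entry = counts.get(key);
    if (entry) entry.count += 1;
    else counts.set(key, { value: v, count: 1 });
  }
  const rows = [...counts.values()].map(
    (e): [SketchScalar, number] => [e.value, normalize ? e.count / total : e.count],
  );
  // Array.prototype.sort is stable, so ties keep first-seen order.
  if (sort) rows.sort((a, b) => (ascending ? a[1] - b[1] : b[1] - a[1]));
  return rows;
}
```

Keying the Map by a string rather than by the raw value is what lets NaN and null be counted as ordinary entries when `dropna` is false.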

Also: wire up elem_ops barrel exports (iter 71 gap)

src/stats/index.ts and src/index.ts now export clip, seriesAbs, seriesRound, dataFrameClip, dataFrameAbs, dataFrameRound (these were added in iter 71 but not wired to the barrels).

Metric: 27 → 28 (+1)


References: Run §24010521196

Generated by Autoloop · ● 7.9M ·

github-actions Bot and others added 20 commits April 5, 2026 21:27
Port pandas.core.groupby — DataFrameGroupBy and SeriesGroupBy with full
split-apply-combine engine:

- DataFrameGroupBy: sum, mean, min, max, count, std, first, last, size,
  agg (named/fn/per-column spec), transform, apply, filter, getGroup,
  ngroups, groupKeys, groups — single-key and multi-key support
- SeriesGroupBy: same aggregation API, transform, apply, filter, getGroup
- DataFrame.groupby(by) and Series.groupby(by) convenience methods
- 40+ unit tests + property-based tests (fast-check)
- Interactive playground page (playground/groupby.html)

Run: https://github.com/githubnext/tsessebe/actions/runs/23972003902

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implement pandas.concat() — combine Series and DataFrames along either axis.

## What's included

**src/merge/concat.ts** — full concat implementation:
- axis=0 (default): stack Series → Series; stack DataFrames → DataFrame
- axis=1: Series[] → DataFrame (each Series as a column); DataFrame[] → DataFrame (side-by-side)
- join='outer' (default): union of labels, null-fill missing values
- join='inner': intersection of labels only
- ignoreIndex: reset the result to a RangeIndex
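
The difference between join='outer' and join='inner' on axis=0 can be illustrated with a small column-oriented sketch. It assumes a simplified frame shape (`ConcatFrame`, a record of column arrays) and a hypothetical `concatSketch` function; neither is taken from src/merge/concat.ts.

```typescript
type ConcatCell = number | string | null;
// Simplified frame: column name -> column values (all columns the same length).
type ConcatFrame = Record<string, ConcatCell[]>;

function frameLengthSketch(f: ConcatFrame): number {
  const first = Object.values(f)[0];
  return first ? first.length : 0;
}

function concatSketch(
  frames: readonly ConcatFrame[],
  join: "outer" | "inner" = "outer",
): ConcatFrame {
  // Union of column labels, in first-seen order.
  const columns: string[] = [];
  for (const f of frames) {
    for (const c of Object.keys(f)) if (!columns.includes(c)) columns.push(c);
  }
  // Inner join keeps only labels present in every frame.
  const kept =
    join === "outer" ? columns : columns.filter((c) => frames.every((f) => c in f));
  const out: ConcatFrame = {};
  for (const c of kept) {
    out[c] = frames.flatMap((f) => {
      const col = f[c];
      // Outer join: a frame lacking this column contributes nulls.
      return col !== undefined ? col : new Array<ConcatCell>(frameLengthSketch(f)).fill(null);
    });
  }
  return out;
}
```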

**src/merge/index.ts** — barrel export

**src/index.ts** — exports concat + ConcatOptions

**tests/merge/concat.test.ts** — 25+ unit tests + 4 property-based tests (fast-check)

**playground/concat.html** — interactive tutorial page with 7 examples

**playground/index.html** — concat marked as ✅ Complete

Run: https://github.com/githubnext/tsessebe/actions/runs/23972580333

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Port pandas.merge — SQL-style join of two DataFrames.

- src/merge/merge.ts: full merge() implementation with inner/left/right/outer
  joins, on/left_on/right_on/left_index/right_index, suffix handling, sort
- tests/merge/merge.test.ts: 29 tests (unit + property-based with fast-check)
- playground/merge.html: interactive tutorial with all join types
- playground/index.html: mark merge as complete
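
As a rough illustration of the inner and left joins listed above, the following hash-join sketch works on arrays of plain records. `mergeSketch` and its types are hypothetical, and the sketch assumes a single key column and no overlapping non-key columns, so it does not show suffix handling, right/outer joins, or index-based joins.

```typescript
type MergeValue = string | number | null;
type MergeRecord = Record<string, MergeValue>;

function mergeSketch(
  left: readonly MergeRecord[],
  right: readonly MergeRecord[],
  on: string,
  how: "inner" | "left" = "inner",
): MergeRecord[] {
  // Index the right side by key so each left row is matched in O(1).
  const byKey = new Map<MergeValue, MergeRecord[]>();
  for (const r of right) {
    const k = r[on] ?? null;
    const bucket = byKey.get(k);
    if (bucket) bucket.push(r);
    else byKey.set(k, [r]);
  }
  const rightCols = [...new Set(right.flatMap((r) => Object.keys(r)))].filter(
    (c) => c !== on,
  );
  const out: MergeRecord[] = [];
  for (const l of left) {
    const matches = byKey.get(l[on] ?? null) ?? [];
    if (matches.length === 0 && how === "left") {
      // Left join: keep the unmatched row and null-fill the right columns.
      const row: MergeRecord = { ...l };
      for (const c of rightCols) row[c] = null;
      out.push(row);
    }
    // One output row per matching pair.
    for (const r of matches) out.push({ ...l, ...r });
  }
  return out;
}
```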

Metric: pandas_features_ported 8 → 9

Run: https://github.com/githubnext/tsessebe/actions/runs/24001239424

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Port pandas StringMethods as Series.str accessor.

- src/core/string_accessor.ts: StringAccessor class with 35+ methods:
  case (lower/upper/title/capitalize/swapcase), len, strip/lstrip/rstrip,
  pad/ljust/rjust/center/zfill, contains/startswith/endswith/match/fullmatch,
  find/rfind/count, replace/extract, split/rsplit/join/cat, slice/get/sliceReplace,
  repeat/wrap, encode, isalpha/isdigit/isalnum/islower/isupper/istitle/isspace.
  All methods propagate null/NaN unchanged.
- src/core/series.ts: added withValues() helper + Series.str getter
- tests/core/string_accessor.test.ts: 60 tests (unit + property-based with fast-check)
- playground/string_accessor.html: interactive tutorial with 9 sections
- playground/index.html: mark str accessor as complete

Metric: pandas_features_ported 9 → 10

Run: https://github.com/githubnext/tsessebe/actions/runs/24001823414

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Port pandas DatetimeProperties as Series.dt accessor.

- src/core/datetime_accessor.ts: DatetimeAccessor class with:
  Calendar components: year/month/day/hour/minute/second/millisecond/
    microsecond/nanosecond/dayofweek/weekday/dayofyear/quarter/
    isocalendar_week/days_in_month/daysinmonth
  Boolean boundaries: is_month_start/is_month_end/is_quarter_start/
    is_quarter_end/is_year_start/is_year_end/is_leap_year
  Formatting: strftime() with 25+ directives (%Y/%m/%d/%H/%M/%S/%A/%B etc.)
  Normalization: normalize() (floor to midnight), date()
  Rounding: floor/ceil/round with D/H/T/min/S/L/ms units
  Epoch: total_seconds()
  All methods propagate null/undefined/NaN unchanged
- src/core/series.ts: added Series.dt getter returning DatetimeAccessor
- src/core/index.ts: export DatetimeAccessor + DatetimeSeriesLike
- src/index.ts: re-export DatetimeAccessor + DatetimeSeriesLike
- tests/core/datetime_accessor.test.ts: 50+ tests (unit + property-based with fast-check)
- playground/datetime_accessor.html: interactive tutorial with 8 sections
- playground/index.html: mark dt accessor as complete

Metric: pandas_features_ported 10 → 11

Run: https://github.com/githubnext/tsessebe/actions/runs/24002454105

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implements src/stats/describe.ts — the pandas-style describe() function:
- quantile(sorted, q): linear interpolation (pandas method='linear')
- describe(series): count/mean/std/min/percentiles/max for numeric,
  count/unique/top/freq for categorical
- describe(dataFrame): per-column stats, include='number'|'object'|'all'
- DescribeOptions: custom percentiles array
- Series.quantile(q) method added to Series
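
Linear-interpolation quantiles, as in pandas method='linear', can be sketched as follows. `quantileSketch` is a hypothetical name; the actual `quantile(sorted, q)` in src/stats/describe.ts may differ in validation and edge cases.

```typescript
// Assumes `sorted` is already in ascending order and contains no NaN.
function quantileSketch(sorted: readonly number[], q: number): number {
  if (q < 0 || q > 1) throw new RangeError("q must be in [0, 1]");
  if (sorted.length === 0) return Number.NaN;
  // Fractional position of the quantile within the sorted data.
  const pos = q * (sorted.length - 1);
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  const a = sorted[lo] ?? Number.NaN;
  const b = sorted[hi] ?? Number.NaN;
  // Interpolate between the two neighbouring order statistics.
  return a + (b - a) * (pos - lo);
}
```

For example, on `[1, 2, 3, 4]` the median position is 1.5, so the result is halfway between 2 and 3.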

New files: src/stats/describe.ts, src/stats/index.ts
Tests: tests/stats/describe.test.ts (32 tests, unit + property-based)
Playground: playground/describe.html with 7 interactive tutorials
Metric: pandas_features_ported 11 → 12

Run: https://github.com/githubnext/tsessebe/actions/runs/24002845454

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add src/io/csv.ts — pandas-style CSV read/write:
- readCsv(text, options?): parse CSV string → DataFrame with automatic
  dtype inference (bool/int64/float64/string/object), NA handling,
  quoted fields (RFC 4180), custom separators, indexCol, skipRows, nRows
- toCsv(df, options?): serialize DataFrame → CSV string with header,
  index, custom sep, lineterminator, naRep
- Comprehensive tests: 35+ unit + property-based (fast-check) tests
- Interactive playground: playground/csv.html with 7 tutorial sections

Metric: 12 → 13 (+1)

Run: https://github.com/githubnext/tsessebe/actions/runs/24003267099

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Five orient formats (records, split, index, columns, values) mirroring
pandas read_json() and DataFrame.to_json(). Full null propagation,
dtype override support, and JSON auto-detection. 31 unit + property-based
tests. Interactive playground: playground/json.html.

Run: https://github.com/githubnext/tsessebe/actions/runs/24003815679

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add `src/stats/corr.ts` with:
- `pearsonCorr(a, b, options?)` — Pearson correlation for two Series, aligning
  on shared index labels and dropping missing pairs
- `dataFrameCorr(df, options?)` — symmetric N×N Pearson correlation matrix for
  all numeric columns of a DataFrame
- `dataFrameCov(df, options?)` — symmetric N×N sample covariance matrix
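
The align-then-drop behaviour described for `pearsonCorr` can be sketched with labeled values held in a Map. `pearsonSketch` and `LabeledSketch` are hypothetical names for illustration only, and the sketch ignores `minPeriods`.

```typescript
// Hypothetical labeled series: index label -> value (null means missing).
type LabeledSketch = Map<string, number | null>;

function pearsonSketch(a: LabeledSketch, b: LabeledSketch): number {
  // Keep only labels present in both inputs with non-missing values on both sides.
  const pairs: Array<[number, number]> = [];
  for (const [label, x] of a) {
    const y = b.get(label);
    if (x === null || y === null || y === undefined) continue;
    if (Number.isNaN(x) || Number.isNaN(y)) continue;
    pairs.push([x, y]);
  }
  const n = pairs.length;
  if (n < 2) return Number.NaN;
  let sumX = 0;
  let sumY = 0;
  for (const [x, y] of pairs) {
    sumX += x;
    sumY += y;
  }
  const meanX = sumX / n;
  const meanY = sumY / n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (const [x, y] of pairs) {
    sxy += (x - meanX) * (y - meanY);
    sxx += (x - meanX) ** 2;
    syy += (y - meanY) ** 2;
  }
  // A constant input gives 0 / 0, which is NaN.
  return sxy / Math.sqrt(sxx * syy);
}
```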

Also adds inline methods:
- `Series.corr(other, minPeriods?)`
- `DataFrame.corr(minPeriods?)`
- `DataFrame.cov(ddof?, minPeriods?)`

34 unit + property-based tests. Playground: playground/corr.html.

Run: https://github.com/githubnext/tsessebe/actions/runs/24004259683

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… DataFrame.rolling)

- Add `src/window/rolling.ts` — `Rolling` class with full pandas-compatible
  sliding-window API: mean(), sum(), std(), var(), min(), max(), count(),
  median(), apply().  Uses `RollingSeriesLike` interface to avoid circular
  imports.  Supports `minPeriods` and `center` options.
- Add `DataFrameRolling` to `src/core/frame.ts` — column-wise rolling
  aggregations for DataFrame (same methods as Rolling).
- Wire `Series.rolling()` and `DataFrame.rolling()` instance methods.
- Export `Rolling`, `RollingSeriesLike`, `RollingOptions` from `src/index.ts`;
  export `DataFrameRolling` from `src/core/index.ts` and `src/index.ts`.
- Add 40+ unit + property-based tests in `tests/window/rolling.test.ts`.
- Add interactive playground page `playground/rolling.html` (8 examples).
- Mark rolling as complete in `playground/index.html`.
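
To illustrate how `minPeriods` interacts with the window, here is a minimal trailing-window mean on a plain array. `rollingMeanSketch` is a hypothetical helper, it represents missing results as null (the real `Rolling` class may use NaN), and it does not cover the `center` option.

```typescript
function rollingMeanSketch(
  values: readonly (number | null)[],
  window: number,
  minPeriods: number = window,
): (number | null)[] {
  return values.map((_, i) => {
    // Trailing window ending at position i.
    const start = Math.max(0, i - window + 1);
    const valid = values
      .slice(start, i + 1)
      .filter((v): v is number => v !== null && !Number.isNaN(v));
    // Too few valid observations: the result is missing.
    if (valid.length < minPeriods) return null;
    return valid.reduce((sum, v) => sum + v, 0) / valid.length;
  });
}
```

This version recomputes each window from scratch for clarity, which costs O(n × window); a production implementation would more likely maintain a running sum.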

Run: https://github.com/githubnext/tsessebe/actions/runs/24004857590

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- src/window/expanding.ts: Expanding class with pandas-compatible growing-window
  API (mean/sum/std/var/min/max/count/median/apply). ExpandingSeriesLike interface
  avoids circular imports. DataFrameExpanding in frame.ts.
- src/core/cat_accessor.ts: CategoricalAccessor (Series.cat) — categories, codes,
  ordered, nCategories, addCategories, removeCategories, removeUnusedCategories,
  renameCategories, setCategories, reorderCategories, asOrdered, asUnordered,
  valueCounts. CatHolder preserves category metadata through chaining.
- Tests: 40+ unit + property tests for Expanding, 40+ unit + property tests for
  CategoricalAccessor.
- Playground: playground/cat_accessor.html (8 tutorial sections).
- Metric: 16 → 18 (+2)

Run: https://github.com/githubnext/tsessebe/actions/runs/24005927691

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- src/reshape/melt.ts: pandas-compatible melt() (wide→long unpivot) with id_vars,
  value_vars, var_name, value_name, ignore_index
- src/reshape/pivot.ts: pandas-compatible pivot() (unique reshape) and pivotTable()
  (aggregation — mean/sum/count/min/max/first/last) with fill_value, dropna
- tests/reshape/melt.test.ts: 12 unit tests + 2 property tests (fast-check)
- tests/reshape/pivot.test.ts: 14 unit tests + 2 property tests (fast-check)
- playground/melt.html, playground/pivot.html: interactive tutorials
- Metric: 18 → 20 (pandas_features_ported)
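
The wide→long reshaping that melt() performs can be sketched on an array of records. `meltSketch` and its parameter names are hypothetical; the default column names "variable" and "value" follow pandas.

```typescript
type MeltCell = string | number | null;
type MeltRow = Record<string, MeltCell>;

function meltSketch(
  rows: readonly MeltRow[],
  idVars: readonly string[],
  valueVars: readonly string[],
  varName = "variable",
  valueName = "value",
): MeltRow[] {
  const out: MeltRow[] = [];
  // Like pandas, emit all rows for the first value column, then the next, and so on.
  for (const col of valueVars) {
    for (const row of rows) {
      const long: MeltRow = {};
      for (const id of idVars) long[id] = row[id] ?? null;
      long[varName] = col;
      long[valueName] = row[col] ?? null;
      out.push(long);
    }
  }
  return out;
}
```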

Run: https://github.com/githubnext/tsessebe/actions/runs/24006370785

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implements src/window/ewm.ts with:
- EWM class: mean (adjust=true/false), std, var, cov, corr, apply
- Decay via span, com, halflife, or alpha
- ignoreNa option (false=decay-on-NaN, true=skip-NaN)
- minPeriods support
- DataFrameEwm in core/frame.ts for per-column EWM on DataFrames
- Series.ewm() and DataFrame.ewm() methods

Online weighted-sum formulation (S, W, W2 state) for O(n) computation.
Helper functions computeCov/computeCorr reduce cognitive complexity.
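
The online formulation for the mean can be sketched as below. This is an illustration only: `ewmMeanSketch` is a hypothetical name, it tracks just the weighted sum S and weight total W (not the W2 state mentioned above), and it assumes the input has no missing values.

```typescript
// alpha is the smoothing factor in (0, 1]; for a span s, pandas uses alpha = 2 / (s + 1).
function ewmMeanSketch(values: readonly number[], alpha: number, adjust = true): number[] {
  const decay = 1 - alpha;
  const out: number[] = [];
  let s = 0;    // running weighted sum of observations
  let w = 0;    // running sum of weights
  let prev = 0; // previous output, used by the recursive (adjust=false) form
  values.forEach((x, i) => {
    if (adjust) {
      // Older observations are down-weighted by `decay` each step.
      s = x + decay * s;
      w = 1 + decay * w;
      out.push(s / w);
    } else {
      // Recursive form: y_0 = x_0, y_t = (1 - alpha) * y_{t-1} + alpha * x_t.
      prev = i === 0 ? x : decay * prev + alpha * x;
      out.push(prev);
    }
  });
  return out;
}
```

Each step updates a constant amount of state, which is what makes the whole pass O(n).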

55+ tests in tests/window/ewm.test.ts (unit + property-based).
Playground: playground/ewm.html (9 sections).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implements stack/unstack as inverse operations:
- stack(df): rotates column labels into compound string index labels
  ("rowLabel|colName"), returning a Series<Scalar>. Supports dropna
  (default true, matching pandas) and custom separator.
- unstack(series): reverses stack by parsing compound labels, filling
  missing (row, col) combinations with fill_value (default null).

Both functions are standalone (no circular dep), following the melt/pivot
pattern. Round-trips correctly: unstack(stack(df, {dropna:false})) ≈ df.
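
The compound-label round trip can be sketched with nested records and a Map. `stackSketch`, `unstackSketch`, and `StackTable` are hypothetical, and splitting each label at the first separator is an assumption of this sketch rather than something the commit states.

```typescript
type StackCell = number | string | null;
// Simplified table: row label -> column name -> value.
type StackTable = Record<string, Record<string, StackCell>>;

function stackSketch(table: StackTable, sep = "|", dropna = true): Map<string, StackCell> {
  const out = new Map<string, StackCell>();
  for (const [row, cols] of Object.entries(table)) {
    for (const [col, v] of Object.entries(cols)) {
      if (dropna && v === null) continue;
      out.set(`${row}${sep}${col}`, v);
    }
  }
  return out;
}

function unstackSketch(
  series: Map<string, StackCell>,
  sep = "|",
  fillValue: StackCell = null,
): StackTable {
  const rows: string[] = [];
  const cols: string[] = [];
  const cells: Array<[string, string, StackCell]> = [];
  for (const [label, v] of series) {
    const cut = label.indexOf(sep); // assumption: split at the first separator
    if (cut < 0) continue; // not a compound label
    const row = label.slice(0, cut);
    const col = label.slice(cut + sep.length);
    if (!rows.includes(row)) rows.push(row);
    if (!cols.includes(col)) cols.push(col);
    cells.push([row, col, v]);
  }
  // Start from a full grid of fill values, then write the known cells.
  const out: StackTable = {};
  for (const r of rows) {
    const rec: Record<string, StackCell> = {};
    for (const c of cols) rec[c] = fillValue;
    out[r] = rec;
  }
  for (const [r, c, v] of cells) {
    const rec = out[r];
    if (rec) rec[c] = v;
  }
  return out;
}
```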

Files added:
- src/reshape/stack_unstack.ts — implementation
- tests/reshape/stack_unstack.test.ts — 25+ unit + 4 property tests
- playground/stack_unstack.html — 6-section interactive tutorial

Metric: 21 → 22 (+1)
Run: https://github.com/githubnext/tsessebe/actions/runs/24007420044

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implement pandas.MultiIndex — hierarchical multi-level index:
- src/core/multi_index.ts: MultiIndex class with fromTuples/fromArrays/fromProduct
- Properties: nlevels, size, names, levels, codes, isUnique, hasDuplicates
- Access: at(), getLoc(), contains(), isin()
- Level ops: droplevel(), swaplevel(), reorderLevels(), setNames()
- Missing values: isna(), notna(), dropna()
- Duplicates: duplicated(), dropDuplicates()
- Set ops: union(), intersection(), difference()
- sortValues(), equals(), Symbol.iterator, toString()
- 60+ unit tests + 7 property-based tests (fast-check)
- Playground: playground/multi_index.html (8 sections)

Run: https://github.com/githubnext/tsessebe/actions/runs/24008035023

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add src/stats/rank.ts implementing pandas.Series.rank() / DataFrame.rank():
- rankSeries(series, options): assign numerical ranks to Series values
- rankDataFrame(df, options): rank columns (axis=0) or rows (axis=1)
- Tie methods: average (default), min, max, first, dense
- NaN handling: keep (→ NaN), top (lowest ranks), bottom (highest ranks)
- Percentage rank: pct=true divides by nValid (keep) or n (top/bottom)
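
The five tie methods differ only in how ranks are assigned within a group of equal values, which a short sketch can make concrete. `rankSketch` is hypothetical; it ranks ascending, keeps NaN as NaN (the "keep" option), and omits `pct` and the top/bottom NaN options.

```typescript
type TieMethodSketch = "average" | "min" | "max" | "first" | "dense";

function rankSketch(values: readonly number[], method: TieMethodSketch = "average"): number[] {
  // Sort valid values ascending, breaking ties by original position.
  const sorted = values
    .map((v, i) => ({ v, i }))
    .filter((e) => !Number.isNaN(e.v))
    .sort((a, b) => a.v - b.v || a.i - b.i);

  // Group consecutive equal values; each group shares a tie.
  const groups: number[][] = [];
  let last: number | undefined;
  for (const e of sorted) {
    const current = groups[groups.length - 1];
    if (current !== undefined && e.v === last) current.push(e.i);
    else groups.push([e.i]);
    last = e.v;
  }

  const ranks = new Array<number>(values.length).fill(Number.NaN);
  let seen = 0; // how many values have been ranked so far
  groups.forEach((positions, groupIndex) => {
    const lo = seen + 1;
    const hi = seen + positions.length;
    positions.forEach((pos, k) => {
      if (method === "min") ranks[pos] = lo;
      else if (method === "max") ranks[pos] = hi;
      else if (method === "first") ranks[pos] = lo + k;
      else if (method === "dense") ranks[pos] = groupIndex + 1;
      else ranks[pos] = (lo + hi) / 2; // average
    });
    seen = hi;
  });
  return ranks;
}
```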

Tests: 40+ unit tests + 6 property-based tests (fast-check).
Playground: playground/rank.html (8 interactive sections).
Metric: 23 → 24 pandas_features_ported.

Run: https://github.com/githubnext/tsessebe/actions/runs/24008535770

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ection

- src/stats/nlargest.ts: nlargestSeries, nsmallestSeries, nlargestDataFrame, nsmallestDataFrame
- Mirror pandas Series.nlargest/nsmallest and DataFrame.nlargest/nsmallest
- keep='first'/'last'/'all' tie-handling at selection boundary
- NaN/null values excluded from result (same as pandas)
- Multi-column DataFrame sorting with lexicographic comparison
- 45+ unit tests + 5 property-based tests
- Playground: nlargest.html (8 sections)
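
Tie handling at the selection boundary can be sketched as follows. `nlargestPositionsSketch` is a hypothetical helper that returns positions rather than a Series, and the order in which it returns tied positions for keep='last' is an assumption that may not match pandas or tsb exactly.

```typescript
function nlargestPositionsSketch(
  values: readonly number[],
  n: number,
  keep: "first" | "last" | "all" = "first",
): number[] {
  // NaN values are excluded before selection.
  const entries = values
    .map((v, i) => ({ v, i }))
    .filter((e) => !Number.isNaN(e.v));
  // Largest first; among ties prefer earlier positions, or later ones for keep="last".
  entries.sort((a, b) => b.v - a.v || (keep === "last" ? b.i - a.i : a.i - b.i));
  const head = entries.slice(0, Math.max(0, n));
  if (keep === "all") {
    // Extend the selection with every value tied with the last selected one.
    const boundary = head[head.length - 1];
    for (const e of entries.slice(head.length)) {
      if (boundary !== undefined && e.v === boundary.v) head.push(e);
      else break;
    }
  }
  return head.map((e) => e.i);
}
```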

Run: https://github.com/githubnext/tsessebe/actions/runs/24009034419

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Added src/stats/cum_ops.ts — cumsum(), cumprod(), cummax(), cummin() for Series
- Added dataFrameCumsum(), dataFrameCumprod(), dataFrameCummax(), dataFrameCummin()
- skipna option (default true): NaN/null positions return NaN/null, accumulator skips them
- skipna: false: any missing value poisons all subsequent results
- axis=0 (default): column-wise; axis=1: row-wise across columns
- Non-numeric values treated as missing in cumsum/cumprod
- cummax/cummin work on any comparable scalar (numbers, strings, booleans)
- 50+ unit tests + 6 property-based tests (fast-check)
- Playground: playground/cum_ops.html (8 interactive sections)
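
The two skipna behaviours can be sketched on a plain array. `cumsumSketch` is a hypothetical helper that represents every missing result as null, whereas the commit describes returning NaN or null depending on the input.

```typescript
function cumsumSketch(values: readonly (number | null)[], skipna = true): (number | null)[] {
  let acc = 0;
  let poisoned = false; // set once a missing value is seen with skipna=false
  return values.map((v) => {
    if (poisoned) return null;
    if (v === null || Number.isNaN(v)) {
      // skipna=true: this position is missing but the accumulator carries on.
      // skipna=false: every later position becomes missing too.
      if (!skipna) poisoned = true;
      return null;
    }
    acc += v;
    return acc;
  });
}
```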

Run: https://github.com/githubnext/tsessebe/actions/runs/24009460051

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…() for Series and DataFrame

Run: https://github.com/githubnext/tsessebe/actions/runs/24010099827

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…taFrame

- `valueCounts(series, opts)`: count or normalize unique values, with sort/ascending/dropna
- `dataFrameValueCounts(df, opts)`: count unique row combinations across all or subset of columns
- 25 unit + property-based tests (fast-check); all pass
- Interactive playground page: playground/value_counts.html
- Wire up elem_ops exports (clip/seriesAbs/seriesRound) to stats barrel and main index

Run: https://github.com/githubnext/tsessebe/actions/runs/24010521196

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mrjf mrjf marked this pull request as ready for review April 7, 2026 13:35
@mrjf mrjf merged commit ba62da9 into main Apr 7, 2026