maxgfr/regressio

regressio

Zero-dependency TypeScript regression, classification & statistics library with full statistical outputs, diagnostics, and preprocessing. Ships with an optional Rust/WASM engine for accelerated linear algebra.

Install

bun add regressio
# or
npm install regressio
# or
pnpm add regressio

Quick Start

import { LinearRegression } from 'regressio';

const model = new LinearRegression();
model.fit([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1]);

console.log(model.coefficients);  // [1.99]
console.log(model.intercept);     // 0.05
console.log(model.predict([6]));  // [11.99]
console.log(model.summary());     // R-style formatted summary table

Models

Regression

| Model | Class | What it does |
| --- | --- | --- |
| OLS | LinearRegression | Fits a linear relationship between features and target using Ordinary Least Squares solved via QR decomposition. The foundational regression method. |
| Polynomial | PolynomialRegression | Fits non-linear curves by expanding a single feature into polynomial terms (x, x², x³, ...), then applying OLS. |
| Ridge (L2) | RidgeRegression | Adds an L2 penalty (sum of squared coefficients) to OLS to handle multicollinearity and prevent overfitting. Shrinks coefficients toward zero but never exactly to zero. |
| Lasso (L1) | LassoRegression | Adds an L1 penalty (sum of absolute coefficients), fitted via coordinate descent. Forces some coefficients to exactly zero, performing automatic feature selection. |
| Elastic Net | ElasticNet | Combines L1 and L2 penalties, balancing Lasso's feature selection with Ridge's stability on correlated features. |
| WLS | WeightedRegression | Weighted Least Squares: assigns a different importance to each observation. Useful when some data points are more reliable than others. |
| Robust | RobustRegression | Resistant to outliers. Uses Iteratively Reweighted Least Squares (IRLS) with Huber or Tukey bisquare M-estimators to downweight extreme values. |
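
For intuition on the QR approach mentioned above, here is a minimal standalone sketch of least squares via classical Gram-Schmidt QR plus back substitution, assuming full-rank columns. The qrSolve helper is hypothetical and illustrative only, not regressio's internal solver.

```typescript
// Solves min ||Xb - y||² by factoring X = QR (classical Gram-Schmidt),
// then back-solving the upper-triangular system R b = Qᵀy.
function qrSolve(X: number[][], y: number[]): number[] {
  const n = X.length, p = X[0].length;
  const Q: number[][] = Array.from({ length: n }, () => new Array(p).fill(0));
  const R: number[][] = Array.from({ length: p }, () => new Array(p).fill(0));
  for (let j = 0; j < p; j++) {
    // Orthogonalize column j against the already-built Q columns
    const v = X.map((row) => row[j]);
    for (let k = 0; k < j; k++) {
      R[k][j] = Q.reduce((s, row, i) => s + row[k] * X[i][j], 0);
      for (let i = 0; i < n; i++) v[i] -= R[k][j] * Q[i][k];
    }
    R[j][j] = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
    for (let i = 0; i < n; i++) Q[i][j] = v[i] / R[j][j];
  }
  // qty = Qᵀ y
  const qty = new Array(p).fill(0);
  for (let k = 0; k < p; k++)
    for (let i = 0; i < n; i++) qty[k] += Q[i][k] * y[i];
  // Back substitution on R b = qty
  const b = new Array(p).fill(0);
  for (let j = p - 1; j >= 0; j--) {
    let s = qty[j];
    for (let k = j + 1; k < p; k++) s -= R[j][k] * b[k];
    b[j] = s / R[j][j];
  }
  return b;
}

// Prepend an intercept column, as a linear model does internally
const xs = [1, 2, 3, 4, 5];
const ys = [2.1, 3.9, 6.2, 7.8, 10.1];
const design = xs.map((x) => [1, x]);
const [intercept, slope] = qrSolve(design, ys); // ≈ 0.05 and 1.99
```

On the Quick Start data this recovers slope 1.99 and intercept 0.05, the exact OLS solution.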

Classification

| Model | Class | What it does |
| --- | --- | --- |
| Logistic | LogisticRegression | Binary classification (0/1). Models the probability of class membership with a sigmoid function, fitted via Newton-Raphson/IRLS. |
| Multiclass Logistic | MulticlassLogisticRegression | Extends logistic regression to K classes using softmax, fitted via gradient descent on the cross-entropy loss. |
| K-Nearest Neighbors | KNearestNeighbors | Non-parametric method. Predicts by majority vote (classification) or mean (regression) of the k closest training points. Supports Euclidean and Manhattan distances. |
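
The majority-vote mechanic behind KNN can be sketched as follows. knnPredict is a hypothetical standalone function with Euclidean distance, not the library's KNearestNeighbors internals.

```typescript
// Classify a query point by majority vote among its k nearest neighbors.
function knnPredict(
  X: number[][], y: number[], query: number[], k: number
): number {
  const dist = (a: number[], b: number[]) =>
    Math.sqrt(a.reduce((s, ai, i) => s + (ai - b[i]) ** 2, 0));
  // Sort training points by distance to the query, keep the k closest
  const neighbors = X.map((row, i) => ({ d: dist(row, query), label: y[i] }))
    .sort((a, b) => a.d - b.d)
    .slice(0, k);
  // Count votes per label and return the most frequent one
  const votes = new Map<number, number>();
  for (const { label } of neighbors)
    votes.set(label, (votes.get(label) ?? 0) + 1);
  let best = neighbors[0].label, bestCount = -1;
  for (const [label, count] of votes)
    if (count > bestCount) { best = label; bestCount = count; }
  return best;
}
```

For regression mode, the vote would be replaced by the mean of the k neighbor targets.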

Neural Network

| Model | Class | What it does |
| --- | --- | --- |
| Feedforward NN | NeuralNetwork | Multi-layer perceptron with backpropagation. Configurable hidden layers, activations (relu, sigmoid, tanh, softmax), and learning rate. Supports both regression and classification tasks. |
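
The building block an MLP stacks is one forward pass through a dense layer with an activation. A minimal sketch with a ReLU layer; the weights here are made up for illustration, and training via backpropagation is omitted.

```typescript
// One dense layer: output_j = relu(b_j + Σ_i w_ji * input_i)
type Layer = { weights: number[][]; biases: number[] }; // weights[out][in]

function relu(x: number): number {
  return Math.max(0, x);
}

function dense(layer: Layer, input: number[]): number[] {
  return layer.weights.map((row, j) =>
    relu(row.reduce((s, w, i) => s + w * input[i], layer.biases[j]))
  );
}

// Two stacked layers: 2 inputs -> 2 hidden units -> 1 output
const hidden: Layer = { weights: [[1, -1], [0.5, 0.5]], biases: [0, 0] };
const output: Layer = { weights: [[1, 1]], biases: [0] };
const yHat = dense(output, dense(hidden, [2, 1])); // [2.5]
```

Chaining dense layers like this is the forward pass; backpropagation then adjusts each layer's weights against the loss gradient.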

Usage

import {
  LinearRegression,
  PolynomialRegression,
  RidgeRegression,
  LassoRegression,
  ElasticNet,
  WeightedRegression,
  RobustRegression,
  LogisticRegression,
  MulticlassLogisticRegression,
  KNearestNeighbors,
  NeuralNetwork,
} from 'regressio';

// --- Regression ---

// OLS: multiple regression
const ols = new LinearRegression();
ols.fit([[1, 2], [3, 4], [5, 6]], [10, 22, 34]);

// Polynomial: fit a cubic curve
const poly = new PolynomialRegression({ degree: 3 });
poly.fit([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]);

// Ridge: regularized regression for correlated features
const ridge = new RidgeRegression({ alpha: 0.5 });
ridge.fit(X, y);

// Lasso: automatic feature selection
const lasso = new LassoRegression({ alpha: 0.1 });
lasso.fit(X, y);
// Some coefficients will be exactly 0

// Elastic Net: mix of L1 and L2
const enet = new ElasticNet({ alpha: 0.1, l1Ratio: 0.5 });
enet.fit(X, y);

// Weighted Least Squares: different reliability per observation
const wls = new WeightedRegression();
wls.fit(X, y, weights);

// Robust: resistant to outliers
const robust = new RobustRegression({ method: 'huber' });
robust.fit(X, y);

// --- Classification ---

// Binary logistic regression
const logit = new LogisticRegression();
logit.fit(X, y); // y must be 0/1
logit.predictProbability(Xnew); // [0.12, 0.87, ...]

// Multiclass logistic regression (softmax)
const multi = new MulticlassLogisticRegression({ learningRate: 0.05 });
multi.fit(X, y); // y = 0, 1, 2, ...
multi.predictProbability(Xnew); // [[0.7, 0.2, 0.1], ...]

// K-Nearest Neighbors (classification or regression)
const knn = new KNearestNeighbors({ k: 5, mode: 'classification' });
knn.fit(X, y);
knn.predict(Xnew);

// --- Neural Network ---

// Regression with a neural network
const nn = new NeuralNetwork({
  layers: [
    { units: 16, activation: 'relu' },
    { units: 8, activation: 'relu' },
  ],
  learningRate: 0.01,
  epochs: 200,
  task: 'regression',
});
nn.fit(X, y);
nn.predict(Xnew);

// Classification with a neural network
const clf = new NeuralNetwork({
  layers: [{ units: 10, activation: 'sigmoid' }],
  learningRate: 0.1,
  epochs: 100,
  task: 'classification',
});
clf.fit(X, y); // y = 0, 1, 2, ...
clf.predict(Xnew);

Statistical Outputs

Every linear model (OLS, Ridge, Lasso, Elastic Net, WLS, Robust, Polynomial) provides statistics() and summary():

const stats = model.statistics();
// {
//   rSquared,              -- proportion of variance explained (0 to 1)
//   adjustedRSquared,      -- R² penalized for number of predictors
//   standardErrors,        -- uncertainty of each coefficient estimate
//   tStatistics,           -- coefficient / standard error for each predictor
//   pValues,               -- probability of observing the t-stat under H0 (no effect)
//   confidenceIntervals,   -- 95% confidence range for each coefficient
//   fStatistic,            -- overall model significance test
//   fPValue,               -- p-value for the F-test
//   residualStandardError, -- estimated standard deviation of residuals
//   aic,                   -- Akaike Information Criterion (lower = better fit/complexity trade-off)
//   bic,                   -- Bayesian Information Criterion (stronger complexity penalty than AIC)
//   degreesOfFreedom,      -- n - k (observations minus parameters)
//   nObservations,         -- number of data points
// }

console.log(model.summary());
// Coefficients:
//                 Estimate    Std. Error  t value   Pr(>|t|)
// (Intercept)     0.0500      0.1981      0.25      0.8169
// x1              1.9900      0.0597      33.32     0.0001 ***
// ---
// Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
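
The trailing stars follow R's conventional p-value thresholds, exactly as listed in the signif. codes line. A sketch of the mapping (signifCode is a hypothetical helper, not part of the library API):

```typescript
// Map a p-value to its R-style significance code.
function signifCode(p: number): string {
  if (p < 0.001) return "***";
  if (p < 0.01) return "**";
  if (p < 0.05) return "*";
  if (p < 0.1) return ".";
  return " ";
}
```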

Binary logistic regression provides classification metrics:

const stats = logit.statistics();
// { accuracy, precision, recall, f1Score, confusionMatrix,
//   pseudoRSquared, logLikelihood, aic, bic }

Multiclass logistic regression provides per-class metrics:

const stats = multi.statistics();
// { accuracy, precision (per class), recall (per class),
//   nClasses, logLikelihood }

Diagnostics

Functions to validate model assumptions and detect problems.

| Function | What it does |
| --- | --- |
| residualDiagnostics(X, y, yHat) | Returns raw residuals, studentized residuals, Cook's distance, and leverage for each observation. |
| studentizedResiduals(X, y, yHat) | Residuals scaled by their estimated standard deviation. Absolute values above 2–3 suggest outliers. |
| cooksDistance(X, y, yHat) | Measures how much each observation influences the fitted model. Values above 4/n flag influential points. |
| leverage(X) | Hat matrix diagonal. Measures how far each observation's features are from the center. High leverage = unusual feature values. |
| durbinWatson(residuals) | Tests for autocorrelation in residuals. Returns a statistic in [0, 4]: ~2 = no autocorrelation, <2 = positive, >2 = negative. Critical for time series. |
| breuschPagan(X, residuals) | Tests for heteroscedasticity (non-constant variance). A low p-value means the variance depends on X, so standard errors are unreliable. |
| shapiroWilk(data) | Tests whether data follows a normal distribution. A low p-value means non-normal. Important because p-values and CIs assume normal residuals. |
| vif(X) | Variance Inflation Factor for each feature. VIF > 10 signals multicollinearity (features are too correlated). |
| correlationMatrix(X) | Pairwise Pearson correlation matrix. Highly correlated pairs point to redundant features. |
| conditionNumber(X) | Ratio of the largest to smallest singular value of X. Values above 30 signal numerical instability from multicollinearity. |

import {
  residualDiagnostics, leverage, cooksDistance, studentizedResiduals,
  durbinWatson, breuschPagan, shapiroWilk,
  vif, correlationMatrix, conditionNumber,
} from 'regressio';

const diag = residualDiagnostics(X, y, yHat);
const dw = durbinWatson(model.residuals());
const bp = breuschPagan(X, model.residuals());
const sw = shapiroWilk(model.residuals());
const vifs = vif(X);
const corr = correlationMatrix(X);
const kappa = conditionNumber(X);
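
For intuition, the Durbin-Watson statistic is DW = Σₜ(eₜ − eₜ₋₁)² / Σₜ eₜ². A self-contained sketch of that formula (the function name is hypothetical; use the library's durbinWatson in practice):

```typescript
// DW near 2: no first-order autocorrelation; near 0: positive
// autocorrelation; near 4: negative (alternating) autocorrelation.
function durbinWatsonStat(residuals: number[]): number {
  let num = 0, den = 0;
  for (let t = 0; t < residuals.length; t++) {
    den += residuals[t] ** 2;
    if (t > 0) num += (residuals[t] - residuals[t - 1]) ** 2;
  }
  return num / den;
}
```

Constant-sign residuals push DW toward 0, while strictly alternating residuals push it toward 4.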

Preprocessing

Functions to prepare data before fitting models.

| Function | What it does |
| --- | --- |
| standardize(X) | Z-score normalization: transforms each feature to mean = 0, std = 1. Essential for Lasso/Ridge/Elastic Net and neural networks. |
| unstandardize(X, params) | Reverses standardization back to the original scale. |
| normalize(X) | Min-max scaling: transforms each feature to the [0, 1] range. |
| unnormalize(X, params) | Reverses normalization back to the original scale. |
| oneHotEncode(column, categories?, dropFirst?) | Converts categorical values to binary columns. Use dropFirst=true to avoid the multicollinearity trap. |
| polynomialFeatures(X, degree) | Generates polynomial terms (x, x², x³, ...) for each feature. Use with LinearRegression for polynomial fitting with multiple features. |
| interactionFeatures(X, pairs?) | Generates interaction terms (xi * xj) for all or specified feature pairs. |
| dropMissing(X, y?) | Removes rows containing NaN or null values. |
| imputeMean(X) | Replaces NaN values with the column mean. |
| imputeMedian(X) | Replaces NaN values with the column median. More robust to outliers than mean imputation. |

import {
  standardize, unstandardize, normalize, unnormalize,
  oneHotEncode, polynomialFeatures, interactionFeatures,
  dropMissing, imputeMean, imputeMedian,
} from 'regressio';

const { transformed, means, stds } = standardize(X);
const original = unstandardize(transformed, { means, stds });
const { transformed: normed, mins, maxs } = normalize(X);
const dummies = oneHotEncode(['cat', 'dog', 'cat'], undefined, true);
const polyX = polynomialFeatures(X, 3);
const interX = interactionFeatures(X);
const clean = dropMissing(X, y);
const imputed = imputeMean(X);
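
On the dropFirst flag: with all K dummy columns, the columns always sum to 1, which is perfectly collinear with the intercept (the "dummy variable trap"). A hypothetical standalone oneHot sketch showing the effect; its signature differs from the library's oneHotEncode.

```typescript
// Encode a string column as 0/1 dummy columns, optionally dropping
// the first category so the remaining columns are not collinear with
// an intercept term.
function oneHot(column: string[], dropFirst = false): number[][] {
  const categories = [...new Set(column)].sort();
  const kept = dropFirst ? categories.slice(1) : categories;
  return column.map((v) => kept.map((c) => (v === c ? 1 : 0)));
}

const full = oneHot(["cat", "dog", "cat"]);          // [[1,0],[0,1],[1,0]]
const dropped = oneHot(["cat", "dog", "cat"], true); // [[0],[1],[0]]
```

In the full encoding each row sums to 1; dropping the first category makes it the implicit baseline absorbed by the intercept.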

Prediction Intervals

Functions to quantify prediction uncertainty.

| Function | What it does |
| --- | --- |
| confidenceInterval(X, y, yHat, newX, newYHat) | Confidence interval on the mean prediction. Answers: "where is the true regression line?" Narrower near the center of the training data. |
| predictionInterval(X, y, yHat, newX, newYHat) | Prediction interval for a new individual observation. Always wider than the confidence interval because it includes observation noise. |
| bootstrapCoefficients(X, y, nBootstrap?) | Non-parametric bootstrap: resamples the data with replacement, refits the model many times, and returns empirical confidence intervals on the coefficients. No distributional assumptions. |

import { confidenceInterval, predictionInterval, bootstrapCoefficients } from 'regressio';

const ci = confidenceInterval(X, y, yHat, newX, newYHat);
// [{ predicted, lower, upper }, ...]

const pi = predictionInterval(X, y, yHat, newX, newYHat);
// Always wider than ci

const boot = bootstrapCoefficients(X, y, 1000);
// { coefficients, confidenceIntervals, standardErrors }
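
The idea behind bootstrapCoefficients, shown on the simplest possible statistic (the sample mean): resample with replacement, recompute, and take empirical quantiles. Everything here is an illustrative sketch, not the library's API; a small LCG stands in for Math.random so the example is reproducible.

```typescript
// Percentile bootstrap for a 95% CI on the mean of a small sample.
function bootstrapMeanCI(
  data: number[], nBoot: number, seed = 42
): { lower: number; upper: number } {
  let state = seed >>> 0;
  const rand = () => {
    // Numerical Recipes LCG: deterministic pseudo-random in [0, 1)
    state = (1664525 * state + 1013904223) >>> 0;
    return state / 2 ** 32;
  };
  const means: number[] = [];
  for (let b = 0; b < nBoot; b++) {
    // One bootstrap resample: draw n points with replacement
    let sum = 0;
    for (let i = 0; i < data.length; i++)
      sum += data[Math.floor(rand() * data.length)];
    means.push(sum / data.length);
  }
  means.sort((a, b) => a - b);
  // 2.5% and 97.5% empirical quantiles
  return {
    lower: means[Math.floor(0.025 * nBoot)],
    upper: means[Math.floor(0.975 * nBoot)],
  };
}

const ci95 = bootstrapMeanCI([2.1, 3.9, 6.2, 7.8, 10.1], 1000);
```

bootstrapCoefficients applies the same resample-and-requantile loop, but refits the regression on each resample instead of taking a mean.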

Advanced: Matrix Class

Low-level matrix operations for advanced users. Backed by Float64Array in row-major order.

import { Matrix } from 'regressio';

const A = Matrix.fromArray([[1, 2], [3, 4]]);
const B = Matrix.identity(2);
const C = A.multiply(B);
console.log(C.determinant());  // -2
console.log(C.trace());        // 5
console.log(C.transpose().toArray());
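
The row-major Float64Array layout mentioned above means element (i, j) of an r×c matrix lives at flat index i*c + j. A standalone sketch of that addressing, not the Matrix class internals:

```typescript
// [[1, 2], [3, 4]] flattened row by row into one typed array
const rows = 2, cols = 2;
const data = Float64Array.from([1, 2, 3, 4]);

// Element (i, j) sits at offset i*cols + j
function at(i: number, j: number): number {
  return data[i * cols + j];
}

// Walking the diagonal recovers the trace
let trace = 0;
for (let i = 0; i < rows; i++) trace += at(i, i); // 1 + 4 = 5
```

Row-major storage keeps each row contiguous in memory, which is cache-friendly for the row-wise dot products that dominate matrix multiplication.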

WASM Acceleration

regressio ships with a pre-compiled Rust/WASM engine that activates automatically — no configuration needed. When the WASM binary is available, heavy computations are dispatched to compiled Rust code for significantly faster execution.

Accelerated operations:

  • Matrix: multiply, transpose, add, subtract, scale, dot product, norm, determinant
  • Decompositions: QR, Cholesky, SVD, eigenvalues (tridiagonal QL)
  • Solvers: forward/back substitution
  • Models: Lasso/Elastic Net coordinate descent, logistic regression IRLS, softmax, KNN distance matrices
  • Diagnostics: correlation matrix, VIF (via correlation matrix inverse)
  • Predictions: bootstrap OLS (1000+ resamples in a single WASM call)

If WASM is unavailable (e.g. unsupported runtime), all operations fall back silently to pure TypeScript.

import { isWasmActive } from 'regressio';

console.log(isWasmActive()); // true if WASM loaded

// Everything just works — WASM is used transparently
const model = new LinearRegression();
model.fit(X, y); // QR decomposition runs in Rust

Rebuilding WASM

The pre-built WASM binary is included in the package. To rebuild it from the Rust source (requires Rust with the wasm32-unknown-unknown target):

bun run build:wasm

License

MIT
