Zero-dependency TypeScript regression, classification & statistics library with full statistical outputs, diagnostics, and preprocessing. Ships with an optional Rust/WASM engine for accelerated linear algebra.
```sh
bun add regressio
# or
npm install regressio
# or
pnpm add regressio
```

```ts
import { LinearRegression } from 'regressio';

const model = new LinearRegression();
model.fit([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1]);

console.log(model.coefficients); // [2.02]
console.log(model.intercept);    // 0.06
console.log(model.predict([6])); // [12.18]
console.log(model.summary());    // R-style formatted summary table
```

| Model | Class | What it does |
|---|---|---|
| OLS | LinearRegression | Fits a linear relationship between features and target using Ordinary Least Squares solved via QR decomposition. The foundational regression method. |
| Polynomial | PolynomialRegression | Fits non-linear curves by expanding a single feature into polynomial terms (x, x², x³, ...) then applying OLS. |
| Ridge (L2) | RidgeRegression | Adds an L2 penalty (sum of squared coefficients) to OLS to handle multicollinearity and prevent overfitting. Shrinks coefficients toward zero but never exactly to zero. |
| Lasso (L1) | LassoRegression | Adds an L1 penalty (sum of absolute coefficients) via coordinate descent. Forces some coefficients to exactly zero, performing automatic feature selection. |
| Elastic Net | ElasticNet | Combines L1 and L2 penalties. Balances Lasso's feature selection with Ridge's stability for correlated features. |
| WLS | WeightedRegression | Weighted Least Squares. Assigns different importance to each observation. Useful when some data points are more reliable than others. |
| Robust | RobustRegression | Resistant to outliers. Uses Iteratively Reweighted Least Squares (IRLS) with Huber or Tukey bisquare M-estimators to downweight extreme values. |
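The reason Lasso's coordinate descent produces exact zeros (while Ridge only shrinks) is the soft-thresholding operator at the heart of each coordinate update. A minimal sketch of that operator — illustrative only, not regressio's internal implementation:

```typescript
// Soft-thresholding: S(z, gamma) = sign(z) * max(|z| - gamma, 0).
// Each Lasso coordinate-descent step applies this to the unpenalized update.
function softThreshold(z: number, gamma: number): number {
  if (z > gamma) return z - gamma;
  if (z < -gamma) return z + gamma;
  return 0; // values inside [-gamma, gamma] snap to exactly zero
}

console.log(softThreshold(2, 0.5));    // 1.5  (shrunk toward zero)
console.log(softThreshold(0.05, 0.1)); // 0    (eliminated -> feature selection)
console.log(softThreshold(-2, 0.5));   // -1.5
```

Larger `alpha` means a larger threshold, so more coefficients land at exactly zero.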
Classification models:

| Model | Class | What it does |
|---|---|---|
| Logistic | LogisticRegression | Binary classification (0/1). Models the probability of class membership using a sigmoid function, fitted via Newton-Raphson/IRLS. |
| Multiclass Logistic | MulticlassLogisticRegression | Extends logistic regression to K classes using softmax. Fitted via gradient descent on the cross-entropy loss. |
| K-Nearest Neighbors | KNearestNeighbors | Non-parametric method. Predicts by majority vote (classification) or mean (regression) of the k closest training points. Supports Euclidean and Manhattan distances. |
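The sigmoid and softmax link functions the two logistic models are built on reduce to a few lines (a conceptual sketch, not the library's code):

```typescript
// Sigmoid: maps a real-valued score to a probability in (0, 1).
function sigmoid(z: number): number {
  return 1 / (1 + Math.exp(-z));
}

// Softmax: maps K scores to a probability distribution over K classes.
function softmax(scores: number[]): number[] {
  const max = Math.max(...scores); // subtract max for numerical stability
  const exps = scores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

console.log(sigmoid(0));         // 0.5
console.log(softmax([2, 1, 0])); // ~[0.665, 0.245, 0.090], sums to 1
```

Softmax with K = 2 is equivalent to the sigmoid, which is why multiclass logistic regression is a strict generalization of the binary model.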
Neural networks:

| Model | Class | What it does |
|---|---|---|
| Feedforward NN | NeuralNetwork | Multi-layer perceptron with backpropagation. Configurable hidden layers, activations (relu, sigmoid, tanh, softmax), and learning rate. Supports both regression and classification tasks. |
```ts
import {
  LinearRegression,
  PolynomialRegression,
  RidgeRegression,
  LassoRegression,
  ElasticNet,
  WeightedRegression,
  RobustRegression,
  LogisticRegression,
  MulticlassLogisticRegression,
  KNearestNeighbors,
  NeuralNetwork,
} from 'regressio';

// --- Regression ---

// OLS: multiple regression
const ols = new LinearRegression();
ols.fit([[1, 2], [3, 4], [5, 6]], [10, 22, 34]);

// Polynomial: fit a cubic curve
const poly = new PolynomialRegression({ degree: 3 });
poly.fit([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]);

// Ridge: regularized regression for correlated features
const ridge = new RidgeRegression({ alpha: 0.5 });
ridge.fit(X, y);

// Lasso: automatic feature selection
const lasso = new LassoRegression({ alpha: 0.1 });
lasso.fit(X, y);
// Some coefficients will be exactly 0

// Elastic Net: mix of L1 and L2
const enet = new ElasticNet({ alpha: 0.1, l1Ratio: 0.5 });
enet.fit(X, y);

// Weighted Least Squares: different reliability per observation
const wls = new WeightedRegression();
wls.fit(X, y, weights);

// Robust: resistant to outliers
const robust = new RobustRegression({ method: 'huber' });
robust.fit(X, y);

// --- Classification ---

// Binary logistic regression
const logit = new LogisticRegression();
logit.fit(X, y); // y must be 0/1
logit.predictProbability(Xnew); // [0.12, 0.87, ...]

// Multiclass logistic regression (softmax)
const multi = new MulticlassLogisticRegression({ learningRate: 0.05 });
multi.fit(X, y); // y = 0, 1, 2, ...
multi.predictProbability(Xnew); // [[0.7, 0.2, 0.1], ...]

// K-Nearest Neighbors (classification or regression)
const knn = new KNearestNeighbors({ k: 5, mode: 'classification' });
knn.fit(X, y);
knn.predict(Xnew);

// --- Neural Network ---

// Regression with a neural network
const nn = new NeuralNetwork({
  layers: [
    { units: 16, activation: 'relu' },
    { units: 8, activation: 'relu' },
  ],
  learningRate: 0.01,
  epochs: 200,
  task: 'regression',
});
nn.fit(X, y);
nn.predict(Xnew);

// Classification with a neural network
const clf = new NeuralNetwork({
  layers: [{ units: 10, activation: 'sigmoid' }],
  learningRate: 0.1,
  epochs: 100,
  task: 'classification',
});
clf.fit(X, y); // y = 0, 1, 2, ...
clf.predict(Xnew);
```

Every linear model (OLS, Ridge, Lasso, Elastic Net, WLS, Robust, Polynomial) provides statistics() and summary():
```ts
const stats = model.statistics();
// {
//   rSquared,              -- proportion of variance explained (0 to 1)
//   adjustedRSquared,      -- R² penalized for number of predictors
//   standardErrors,        -- uncertainty of each coefficient estimate
//   tStatistics,           -- coefficient / standard error for each predictor
//   pValues,               -- probability of observing the t-stat under H0 (no effect)
//   confidenceIntervals,   -- 95% confidence range for each coefficient
//   fStatistic,            -- overall model significance test
//   fPValue,               -- p-value for the F-test
//   residualStandardError, -- estimated standard deviation of residuals
//   aic,                   -- Akaike Information Criterion (lower = better fit/complexity trade-off)
//   bic,                   -- Bayesian Information Criterion (stronger complexity penalty than AIC)
//   degreesOfFreedom,      -- n - k (observations minus parameters)
//   nObservations,         -- number of data points
// }
```
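For intuition, rSquared and adjustedRSquared follow the standard textbook formulas and can be computed by hand from the residuals (a sketch, not regressio's internals):

```typescript
// R^2 = 1 - SSR/SST; adjusted R^2 penalizes for the number of predictors k.
function rSquared(y: number[], yHat: number[]): number {
  const mean = y.reduce((a, b) => a + b, 0) / y.length;
  const ssr = y.reduce((s, yi, i) => s + (yi - yHat[i]) ** 2, 0); // residual sum of squares
  const sst = y.reduce((s, yi) => s + (yi - mean) ** 2, 0);       // total sum of squares
  return 1 - ssr / sst;
}

function adjustedRSquared(r2: number, n: number, k: number): number {
  return 1 - ((1 - r2) * (n - 1)) / (n - k - 1);
}

const y = [2, 4, 6, 8];
const yHat = [2.2, 3.8, 6.1, 7.9];
const r2 = rSquared(y, yHat);
console.log(r2);                                // ~0.995
console.log(adjustedRSquared(r2, y.length, 1)); // ~0.9925, slightly lower
```

Adjusted R² is always at or below R², and the gap grows as you add predictors that do not pull their weight.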
```ts
console.log(model.summary());
// Coefficients:
//              Estimate  Std. Error  t value  Pr(>|t|)
// (Intercept)    0.0600      0.1200     0.50    0.6300
// x1             2.0200      0.0400    50.20    0.0000 ***
// ---
// Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

Binary logistic regression provides classification metrics:
```ts
const stats = logit.statistics();
// { accuracy, precision, recall, f1Score, confusionMatrix,
//   pseudoRSquared, logLikelihood, aic, bic }
```

Multiclass logistic regression provides per-class metrics:
```ts
const stats = multi.statistics();
// { accuracy, precision (per class), recall (per class),
//   nClasses, logLikelihood }
```

Functions to validate model assumptions and detect problems.
| Function | What it does |
|---|---|
| residualDiagnostics(X, y, yHat) | Returns raw residuals, studentized residuals, Cook's distance, and leverage for each observation. |
| studentizedResiduals(X, y, yHat) | Residuals scaled by their estimated standard deviation. Values > 2-3 suggest outliers. |
| cooksDistance(X, y, yHat) | Measures how much each observation influences the fitted model. Values > 4/n flag influential points. |
| leverage(X) | Hat matrix diagonal. Measures how far each observation's features are from the center. High leverage = unusual feature values. |
| durbinWatson(residuals) | Tests for autocorrelation in residuals. Returns statistic in [0, 4]: ~2 = no autocorrelation, <2 = positive, >2 = negative. Critical for time series. |
| breuschPagan(X, residuals) | Tests for heteroscedasticity (non-constant variance). Low p-value = variance depends on X, meaning standard errors are unreliable. |
| shapiroWilk(data) | Tests whether data follows a normal distribution. Low p-value = non-normal. Important because p-values and CIs assume normal residuals. |
| vif(X) | Variance Inflation Factor for each feature. VIF > 10 signals multicollinearity (features are too correlated). |
| correlationMatrix(X) | Pairwise Pearson correlation matrix. Pairs with high absolute correlation indicate redundant features. |
| conditionNumber(X) | Ratio of largest to smallest singular value of X. Values > 30 signal numerical instability from multicollinearity. |
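The Durbin-Watson statistic in the table above is simple enough to sketch directly, which makes its [0, 4] range concrete (illustrative only, not the library's implementation):

```typescript
// DW = sum((e[t] - e[t-1])^2) / sum(e[t]^2), over residuals e.
// ~2 means successive residuals are uncorrelated.
function durbinWatsonStat(residuals: number[]): number {
  let num = 0;
  for (let t = 1; t < residuals.length; t++) {
    num += (residuals[t] - residuals[t - 1]) ** 2;
  }
  const den = residuals.reduce((s, e) => s + e * e, 0);
  return num / den;
}

console.log(durbinWatsonStat([1, -1, 1, -1]));        // 3 — alternating signs: negative autocorrelation
console.log(durbinWatsonStat([1, 1, 1, -1, -1, -1])); // ~0.67 — long runs: positive autocorrelation
```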
```ts
import {
  residualDiagnostics, leverage, cooksDistance, studentizedResiduals,
  durbinWatson, breuschPagan, shapiroWilk,
  vif, correlationMatrix, conditionNumber,
} from 'regressio';

const diag = residualDiagnostics(X, y, yHat);
const dw = durbinWatson(model.residuals());
const bp = breuschPagan(X, model.residuals());
const sw = shapiroWilk(model.residuals());
const vifs = vif(X);
const corr = correlationMatrix(X);
const kappa = conditionNumber(X);
```

Functions to prepare data before fitting models.
| Function | What it does |
|---|---|
| standardize(X) | Z-score normalization: transforms each feature to mean=0, std=1. Essential for Lasso/Ridge/Elastic Net and neural networks. |
| unstandardize(X, params) | Reverses standardization back to the original scale. |
| normalize(X) | Min-max scaling: transforms each feature to the [0, 1] range. |
| unnormalize(X, params) | Reverses normalization back to the original scale. |
| oneHotEncode(column, categories?, dropFirst?) | Converts categorical values to binary columns. Use dropFirst=true to avoid the multicollinearity trap. |
| polynomialFeatures(X, degree) | Generates polynomial terms (x, x², x³, ...) for each feature. Use with LinearRegression for polynomial fitting with multiple features. |
| interactionFeatures(X, pairs?) | Generates interaction terms (xi * xj) for all or specified feature pairs. |
| dropMissing(X, y?) | Removes rows containing NaN or null values. |
| imputeMean(X) | Replaces NaN values with the column mean. |
| imputeMedian(X) | Replaces NaN values with the column median. More robust to outliers than mean imputation. |
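Conceptually, the two scalers reduce to a few lines per column. A sketch using the population standard deviation (the library may use the sample version; this is not its implementation):

```typescript
// Z-score: (x - mean) / std.  Min-max: (x - min) / (max - min).
function zScore(column: number[]): number[] {
  const mean = column.reduce((a, b) => a + b, 0) / column.length;
  const variance = column.reduce((s, x) => s + (x - mean) ** 2, 0) / column.length;
  const std = Math.sqrt(variance);
  return column.map((x) => (x - mean) / std);
}

function minMax(column: number[]): number[] {
  const min = Math.min(...column);
  const max = Math.max(...column);
  return column.map((x) => (x - min) / (max - min));
}

console.log(minMax([10, 20, 30])); // [0, 0.5, 1]
console.log(zScore([10, 20, 30])); // mean 0, std 1 after scaling
```

Scaling matters for the penalized models because L1/L2 penalties treat all coefficients on the same scale: an unscaled feature with large units gets a tiny coefficient and is under-penalized relative to the rest.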
```ts
import {
  standardize, unstandardize, normalize, unnormalize,
  oneHotEncode, polynomialFeatures, interactionFeatures,
  dropMissing, imputeMean, imputeMedian,
} from 'regressio';

const { transformed, means, stds } = standardize(X);
const original = unstandardize(transformed, { means, stds });
const { transformed: normed, mins, maxs } = normalize(X);
const dummies = oneHotEncode(['cat', 'dog', 'cat'], undefined, true);
const polyX = polynomialFeatures(X, 3);
const interX = interactionFeatures(X);
const clean = dropMissing(X, y);
const imputed = imputeMean(X);
```

Functions to quantify prediction uncertainty.
| Function | What it does |
|---|---|
| confidenceInterval(X, y, yHat, newX, newYHat) | Confidence interval on the mean prediction. Answers: "where is the true regression line?" Narrower near the center of the training data. |
| predictionInterval(X, y, yHat, newX, newYHat) | Prediction interval for a new individual observation. Always wider than the confidence interval because it includes observation noise. |
| bootstrapCoefficients(X, y, nBootstrap?) | Non-parametric bootstrap: resamples data with replacement, refits the model many times, and returns empirical confidence intervals on coefficients. No distributional assumptions. |
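The resample-with-replacement idea behind bootstrapCoefficients can be sketched for a simpler statistic, the mean. The seeded LCG here is a hypothetical stand-in for reproducibility; none of this is regressio's implementation:

```typescript
// Tiny seeded RNG so the sketch is deterministic (hypothetical helper).
function lcg(seed: number): () => number {
  let s = seed >>> 0;
  return () => {
    s = (1664525 * s + 1013904223) >>> 0;
    return s / 4294967296;
  };
}

// Bootstrap: resample with replacement, recompute the statistic,
// read the 95% CI off the empirical percentiles.
function bootstrapMean(data: number[], nBoot: number, seed = 42): { lower: number; upper: number } {
  const rand = lcg(seed);
  const means: number[] = [];
  for (let b = 0; b < nBoot; b++) {
    let sum = 0;
    for (let i = 0; i < data.length; i++) {
      sum += data[Math.floor(rand() * data.length)]; // draw with replacement
    }
    means.push(sum / data.length);
  }
  means.sort((a, b) => a - b);
  return { lower: means[Math.floor(0.025 * nBoot)], upper: means[Math.floor(0.975 * nBoot)] };
}

const ci95 = bootstrapMean([2.1, 3.9, 6.2, 7.8, 10.1], 1000);
console.log(ci95); // interval should bracket the sample mean (6.02)
```

bootstrapCoefficients does the same thing but refits the regression on each resample and reports percentile intervals per coefficient.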
```ts
import { confidenceInterval, predictionInterval, bootstrapCoefficients } from 'regressio';

const ci = confidenceInterval(X, y, yHat, newX, newYHat);
// [{ predicted, lower, upper }, ...]

const pi = predictionInterval(X, y, yHat, newX, newYHat);
// Always wider than ci

const boot = bootstrapCoefficients(X, y, 1000);
// { coefficients, confidenceIntervals, standardErrors }
```

Low-level matrix operations for advanced users. Backed by Float64Array in row-major order.
```ts
import { Matrix } from 'regressio';

const A = Matrix.fromArray([[1, 2], [3, 4]]);
const B = Matrix.identity(2);
const C = A.multiply(B);

console.log(C.determinant()); // -2
console.log(C.trace()); // 5
console.log(C.transpose().toArray());
```

regressio ships with a pre-compiled Rust/WASM engine that activates automatically — no configuration needed. When the WASM binary is available, heavy computations are dispatched to compiled Rust code for significantly faster execution.
Accelerated operations:
- Matrix: multiply, transpose, add, subtract, scale, dot product, norm, determinant
- Decompositions: QR, Cholesky, SVD, eigenvalues (tridiagonal QL)
- Solvers: forward/back substitution
- Models: Lasso/Elastic Net coordinate descent, logistic regression IRLS, softmax, KNN distance matrices
- Diagnostics: correlation matrix, VIF (via correlation matrix inverse)
- Predictions: bootstrap OLS (1000+ resamples in a single WASM call)
If WASM is unavailable (e.g. unsupported runtime), all operations fall back silently to pure TypeScript.
```ts
import { isWasmActive } from 'regressio';

console.log(isWasmActive()); // true if WASM loaded

// Everything just works — WASM is used transparently
const model = new LinearRegression();
model.fit(X, y); // QR decomposition runs in Rust
```

The pre-built WASM binary is included in the package. To rebuild from Rust source (requires Rust with the wasm32-unknown-unknown target):

```sh
bun run build:wasm
```

MIT