1.4.0

@waynewbishop waynewbishop released this 15 Jun 14:53
· 4 commits to main since this release

Overview

Quiver 1.4.0 expands what you can build and inspect by hand on the wrist, the phone, or the server. You can now train a binary classifier with LogisticRegression that returns a probability rather than just a label, so the confidence threshold for acting on a prediction is yours to set.

You can wrap any fitted regressor in a ResidualModel to read the gap between what it predicted and what actually happened, which is the part of a signal the model didn't explain. You can also interrogate a model's coefficients through the new Coefficients protocol and the Model Interpretation primer to see which inputs drive a result and whether the fit can be trusted at all.

On the search side, the Embedder protocol separates the source of your vectors from how they are ranked, so a hand-built table and an on-device model feed the same mostSimilar(to:k:) ranking surface unchanged. That separation is the foundation for retrieval pipelines that ground a language model's answer. Everything is additive over 1.3.0: no existing call changed and no migration is required.


Feature — Binary classification

Fitting a yes-or-no classifier on labeled data, reading the probability behind each call, and setting the decision threshold by hand.

  • LogisticRegression.fit(features:labels:learningRate:maxIterations:tolerance:intercept:) — trains a binary classifier by gradient descent on cross-entropy loss; throws on divergence. A single-feature [Double] overload mirrors the [[Double]] form.
  • LogisticRegression.predict(_:) — returns class labels (0 or 1).
  • LogisticRegression.predictProbabilities(_:) — returns the per-sample probability of class 1.
  • LogisticRegression.decisionFunction(_:) — returns the raw log-odds before the sigmoid; zero is the decision boundary. A scalar overload serves single-feature models.
  • LogisticRegression.Outcome — reports whether descent converged or reached the iteration cap; the fitted model also exposes coefficients, iterations, finalLoss, and lossHistory for inspection.
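
The bullets above describe gradient descent on cross-entropy loss, a probability output, and a caller-chosen threshold. The following is a minimal, self-contained sketch of that technique for a single feature. It does not call Quiver: SketchLogisticModel, fitSketchLogistic, the default learning rate, and the sample data are all hypothetical names and values chosen for illustration, and Quiver's own fit may differ in stopping rules and error handling.

```swift
import Foundation

/// Logistic sigmoid: maps log-odds to a probability in (0, 1).
func sigmoid(_ z: Double) -> Double {
    1.0 / (1.0 + exp(-z))
}

/// Hypothetical stand-in for a fitted single-feature classifier (not Quiver's type).
struct SketchLogisticModel {
    var intercept: Double
    var weight: Double
    var lossHistory: [Double]

    /// Raw log-odds before the sigmoid; zero is the decision boundary.
    func decisionFunction(_ x: Double) -> Double { intercept + weight * x }

    /// Probability of class 1.
    func probability(_ x: Double) -> Double { sigmoid(decisionFunction(x)) }

    /// Class label under a threshold the caller chooses.
    func label(_ x: Double, threshold: Double = 0.5) -> Int {
        probability(x) >= threshold ? 1 : 0
    }
}

/// Batch gradient descent on mean cross-entropy loss (fixed iteration count, no tolerance check).
func fitSketchLogistic(features: [Double], labels: [Int],
                       learningRate: Double = 0.2,
                       maxIterations: Int = 5000) -> SketchLogisticModel {
    precondition(features.count == labels.count && !features.isEmpty)
    let n = Double(features.count)
    var b = 0.0, w = 0.0
    var history: [Double] = []
    for _ in 0..<maxIterations {
        var gradB = 0.0, gradW = 0.0, loss = 0.0
        for (x, y) in zip(features, labels) {
            let p = sigmoid(b + w * x)
            let t = Double(y)
            gradB += p - t
            gradW += (p - t) * x
            // Clamp the probability so log never sees exactly 0.
            let q = min(max(p, 1e-12), 1 - 1e-12)
            loss -= t * log(q) + (1 - t) * log(1 - q)
        }
        history.append(loss / n)   // loss measured before this step's update
        b -= learningRate * gradB / n
        w -= learningRate * gradW / n
    }
    return SketchLogisticModel(intercept: b, weight: w, lossHistory: history)
}

// Illustrative data: one feature, labels 0 or 1.
let hours: [Double] = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
let passed: [Int] = [0, 0, 0, 0, 1, 1, 1, 1]
let sketchModel = fitSketchLogistic(features: hours, labels: passed)

print(sketchModel.probability(4.5))            // confidence behind the call
print(sketchModel.label(4.5, threshold: 0.9))  // act only on high confidence
```

Raising the threshold can only turn predicted 1s into 0s, which is the practical meaning of setting the decision threshold by hand.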

Logistic regression runs on the same descent loop as GradientDescent and Ridge, sharing one internal engine across all three. A Pipeline overload pairs the classifier with a StandardScaler so raw feature rows are scaled internally at both fit and predict time.
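
Scaling at both fit and predict time means the statistics learned from the training rows are reused on every later row. A minimal sketch of that idea follows, assuming z-score scaling with a population standard deviation. SketchScaler is a hypothetical type written for this note, not Quiver's StandardScaler, whose conventions may differ.

```swift
/// Hypothetical z-score scaler for this sketch (not Quiver's StandardScaler).
struct SketchScaler {
    let means: [Double]
    let deviations: [Double]

    /// Learns per-column mean and population standard deviation from training rows.
    init(fitting rows: [[Double]]) {
        precondition(!rows.isEmpty)
        let n = Double(rows.count)
        let width = rows[0].count
        var means = [Double](repeating: 0, count: width)
        for row in rows {
            for j in 0..<width { means[j] += row[j] / n }
        }
        var variances = [Double](repeating: 0, count: width)
        for row in rows {
            for j in 0..<width {
                variances[j] += (row[j] - means[j]) * (row[j] - means[j]) / n
            }
        }
        self.means = means
        // Guard constant columns so the division in transform stays finite.
        self.deviations = variances.map { $0 > 0 ? $0.squareRoot() : 1.0 }
    }

    /// Applies the stored training statistics; nothing is re-learned here.
    func transform(_ rows: [[Double]]) -> [[Double]] {
        rows.map { row in
            row.indices.map { (row[$0] - means[$0]) / deviations[$0] }
        }
    }
}

let trainingRows: [[Double]] = [[1, 100], [3, 300]]
let scaler = SketchScaler(fitting: trainingRows)
let scaledTraining = scaler.transform(trainingRows)  // used at fit time
let scaledNew = scaler.transform([[5, 500]])         // same statistics at predict time
```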

Feature — Reading a fitted model

Interrogating a model's coefficients honestly: which feature drives a prediction, and whether the fit can be trusted in the first place.

  • The Coefficients protocol exposes a model's fitted coefficient vector (intercept first, then one weight per feature). LinearRegression, Ridge, and GradientDescent conform.
  • The Model Interpretation primer covers reading coefficients, the condition number as a pre-fit trust check, and why a near-zero weight can be a measurement artifact rather than a finding.
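
One way a near-zero weight can be an artifact is feature scale: the same relationship fitted against a feature in different units yields a very different weight with identical predictions. The sketch below shows that with a closed-form one-feature least-squares fit. fitLine and the data are hypothetical and written for this note; the primer may discuss other causes as well.

```swift
/// Closed-form least squares for one feature.
/// Returns [intercept, weight], intercept first to match the ordering described above.
func fitLine(_ x: [Double], _ y: [Double]) -> [Double] {
    precondition(x.count == y.count && x.count >= 2)
    let n = Double(x.count)
    let meanX = x.reduce(0, +) / n
    let meanY = y.reduce(0, +) / n
    var sxy = 0.0, sxx = 0.0
    for (xi, yi) in zip(x, y) {
        sxy += (xi - meanX) * (yi - meanY)
        sxx += (xi - meanX) * (xi - meanX)
    }
    let weight = sxy / sxx
    return [meanY - weight * meanX, weight]
}

// The same distances in two units; minutes = 2 + 3 * km exactly.
let km: [Double] = [1, 2, 3, 4]
let meters = km.map { $0 * 1000 }
let minutes = km.map { 2 + 3 * $0 }

let perKm = fitLine(km, minutes)         // weight close to 3
let perMeter = fitLine(meters, minutes)  // weight close to 0.003

// Both fits make the same prediction for the same physical distance.
let predictedFromKm = perKm[0] + perKm[1] * 2.5
let predictedFromMeters = perMeter[0] + perMeter[1] * 2500
```

The per-meter weight looks negligible next to the per-kilometer one, yet the feature matters exactly as much in both fits, so a weight's size only carries meaning once the feature's scale is known.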

Feature — Residual analysis

Modeling the part of a signal we understand, then studying what is left over.

  • ResidualModel(model:) — wraps any fitted regressor; trains nothing of its own.
  • ResidualModel.expected(_:) — the baseline prediction a residual is measured against.
  • ResidualModel.residuals(features:targets:) — one observed − predicted value per sample; a scalar residual(features:observed:) serves a single sample.
  • When the wrapped model carries coefficients, ResidualModel forwards them through the Coefficients protocol.
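
The wrapper pattern described above can be sketched in a few lines. SketchRegressor, SketchLinear, and SketchResidualModel are hypothetical types defined here so the example runs on its own; they mirror the shape of the bullets but are not Quiver's declarations, and the coefficients and held-out values are made up.

```swift
/// Hypothetical regressor contract for this sketch (not Quiver's protocol).
protocol SketchRegressor {
    func predict(_ features: [[Double]]) -> [Double]
}

/// A fixed linear model standing in for any already-fitted regressor.
struct SketchLinear: SketchRegressor {
    let coefficients: [Double]  // intercept first, then one weight per feature

    func predict(_ features: [[Double]]) -> [Double] {
        features.map { row in
            zip(row, coefficients.dropFirst()).reduce(coefficients[0]) { sum, pair in
                sum + pair.0 * pair.1
            }
        }
    }
}

/// Wraps a fitted regressor and trains nothing of its own.
struct SketchResidualModel<Model: SketchRegressor> {
    let model: Model

    /// The baseline prediction a residual is measured against.
    func expected(_ features: [[Double]]) -> [Double] {
        model.predict(features)
    }

    /// One observed minus predicted value per sample.
    func residuals(features: [[Double]], targets: [Double]) -> [Double] {
        precondition(features.count == targets.count)
        return zip(targets, expected(features)).map { $0 - $1 }
    }
}

let baseline = SketchLinear(coefficients: [60, 0.5])
let residualModel = SketchResidualModel(model: baseline)

// Held-out samples the baseline never saw.
let heldOutFeatures: [[Double]] = [[10], [20]]
let heldOutTargets: [Double] = [66, 69]
let gaps = residualModel.residuals(features: heldOutFeatures, targets: heldOutTargets)
```

A positive residual means the observation came in above the baseline, a negative one below it.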

The accompanying article frames residuals as an out-of-sample diagnostic: read the gap on data the fit never saw, not on the training set that flatters it.

Feature — Pluggable embedding sources

Separating the source of the vectors from how they are ranked, so the source can change without touching the search code.

  • The Embedder protocol names a single operation: text in, vector out (or nil when nothing can be embedded). Quiver ships the contract, not an embedder.
  • Array.embedded(using:) (on [String]) — embeds each string and returns one (text, vector) pair per string it can embed, keeping each vector beside the text that produced it.
  • Array.mostSimilar(to:k:) (on the paired output) — ranks the pairs against a query vector by cosine similarity and returns the top matches as (rank, text, score).

A hand-built word-vector table and an on-device sentence model conform to the same one-method contract, so the ranking surface treats them alike. The averaging baseline is deliberately order-blind, which the worked example makes visible rather than hiding.
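
The split between vector source and ranking, and the order-blindness of averaging, can be illustrated with a small self-contained sketch. SketchEmbedder, TableEmbedder, embedAll, and topMatches are hypothetical names written for this note and do not reproduce Quiver's declarations. The three-dimensional word vectors are toy values.

```swift
/// Hypothetical one-method contract: text in, vector out, nil when nothing can be embedded.
protocol SketchEmbedder {
    func embed(_ text: String) -> [Double]?
}

/// Averages hand-built word vectors. Averaging ignores word order by construction.
struct TableEmbedder: SketchEmbedder {
    let table: [String: [Double]]

    func embed(_ text: String) -> [Double]? {
        let vectors = text.lowercased().split(separator: " ").compactMap { table[String($0)] }
        guard let first = vectors.first else { return nil }
        var mean = [Double](repeating: 0, count: first.count)
        for vector in vectors {
            for i in mean.indices { mean[i] += vector[i] }
        }
        return mean.map { $0 / Double(vectors.count) }
    }
}

/// Cosine similarity; returns 0 when either vector has zero length.
func cosine(_ a: [Double], _ b: [Double]) -> Double {
    var dot = 0.0, normA = 0.0, normB = 0.0
    for (x, y) in zip(a, b) {
        dot += x * y
        normA += x * x
        normB += y * y
    }
    let scale = (normA * normB).squareRoot()
    return scale > 0 ? dot / scale : 0
}

/// Keeps each vector beside its text and drops strings the embedder cannot handle.
func embedAll<E: SketchEmbedder>(_ texts: [String], using embedder: E)
    -> [(text: String, vector: [Double])] {
    var pairs: [(text: String, vector: [Double])] = []
    for text in texts {
        if let vector = embedder.embed(text) {
            pairs.append((text: text, vector: vector))
        }
    }
    return pairs
}

/// Ranks pairs against a query vector by cosine similarity. Knows nothing about the source.
func topMatches(_ pairs: [(text: String, vector: [Double])], to query: [Double], k: Int)
    -> [(rank: Int, text: String, score: Double)] {
    let scored = pairs
        .map { (text: $0.text, score: cosine($0.vector, query)) }
        .sorted { $0.score > $1.score }
    return scored.prefix(k).enumerated().map {
        (rank: $0.offset + 1, text: $0.element.text, score: $0.element.score)
    }
}

let wordVectors: [String: [Double]] = [
    "dog": [1, 0, 0], "cat": [1, 0, 0], "man": [0, 1, 0],
    "bites": [0, 0, 1], "sleeps": [0, 0, 1],
]
let tableEmbedder = TableEmbedder(table: wordVectors)
let corpus = ["dog bites man", "man bites dog", "cat sleeps", "xyzzy"]
let embeddedCorpus = embedAll(corpus, using: tableEmbedder)
let queryVector = tableEmbedder.embed("cat") ?? []
let matches = topMatches(embeddedCorpus, to: queryVector, k: 2)
```

"dog bites man" and "man bites dog" receive the same averaged vector, so no ranking built on this embedder can tell them apart. Swapping in a different conforming type changes the vectors while embedAll and topMatches stay as written.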

Feature — Retrieval-augmented context

Turning a document and a question into a cited block of text ready to hand a language model.

  • The retrieval worked example walks the full pipeline: chunk a document with provenance, embed once at ingest, rank a query against the chunks, and assemble an attributable context block, each step a plain Swift operation.

The pattern is commonly known as RAG. Quiver supplies the retrieval scaffolding; the generation model sits outside the library, fed by the context block the pipeline produces.
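
The four steps (chunk with provenance, embed once at ingest, rank, assemble) can be sketched without any library. Everything below is hypothetical and written for this note: Chunk, toyEmbed, ingest, and contextBlock are illustrative names, and the word-count "embedding" over a fixed vocabulary is a stand-in for a real embedder, not the approach the worked example takes.

```swift
/// A piece of a document that remembers where it came from.
struct Chunk {
    let source: String    // provenance: which document
    let index: Int        // provenance: position within it
    let text: String
    let vector: [Double]  // computed once at ingest
}

/// Toy embedder for this sketch only: counts of a few vocabulary words.
let vocabulary = ["battery", "charge", "strap", "water"]

func toyEmbed(_ text: String) -> [Double] {
    let words = text.lowercased().split { !$0.isLetter }.map { String($0) }
    return vocabulary.map { term in Double(words.filter { $0 == term }.count) }
}

/// Cosine similarity; returns 0 when either vector has zero length.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    var dot = 0.0, normA = 0.0, normB = 0.0
    for (x, y) in zip(a, b) {
        dot += x * y
        normA += x * x
        normB += y * y
    }
    let scale = (normA * normB).squareRoot()
    return scale > 0 ? dot / scale : 0
}

/// Steps 1 and 2: one chunk per line, each embedded once and tagged with its origin.
func ingest(document: String, source: String) -> [Chunk] {
    document.split(separator: "\n").enumerated().map { item in
        let text = String(item.element)
        return Chunk(source: source, index: item.offset, text: text, vector: toyEmbed(text))
    }
}

/// Steps 3 and 4: rank chunks against the question, then assemble a cited block.
func contextBlock(for question: String, chunks: [Chunk], k: Int) -> String {
    let query = toyEmbed(question)
    let ranked = chunks
        .map { (chunk: $0, score: cosineSimilarity($0.vector, query)) }
        .filter { $0.score > 0 }           // drop chunks with no overlap at all
        .sorted { $0.score > $1.score }
        .prefix(k)
    return ranked
        .map { "[\($0.chunk.source)#\($0.chunk.index)] \($0.chunk.text)" }
        .joined(separator: "\n")
}

let manual = [
    "The battery lasts two days on one charge.",
    "The strap is sold in three sizes.",
    "The case resists water to fifty meters.",
].joined(separator: "\n")

let manualChunks = ingest(document: manual, source: "manual")
let context = contextBlock(for: "How long does the battery last on a charge?",
                           chunks: manualChunks, k: 2)
print(context)
```

The bracketed tag on each line is what makes the block attributable: whatever model receives it can cite the chunk it drew from.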

Feature — Single-feature prediction convenience

  • predict(_:) now accepts a scalar Double on both the Regressor and Classifier protocols, returning one prediction for single-feature models. Every conforming model gains it from the protocol; no per-model code.
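
A protocol extension is one way every conforming model can gain a scalar overload with no per-model code. The sketch below assumes that mechanism; BatchRegressor and SlopeModel are hypothetical names for this note, and Quiver's actual protocol declarations are not reproduced here.

```swift
/// Hypothetical batch-prediction contract for this sketch.
protocol BatchRegressor {
    func predict(_ features: [[Double]]) -> [Double]
}

extension BatchRegressor {
    /// Scalar convenience: wraps the value as a one-row, one-column batch.
    func predict(_ feature: Double) -> Double {
        predict([[feature]])[0]
    }
}

/// A single-feature model that implements only the batch requirement.
struct SlopeModel: BatchRegressor {
    let intercept: Double
    let slope: Double

    func predict(_ features: [[Double]]) -> [Double] {
        features.map { intercept + slope * $0[0] }
    }
}

let slopeModel = SlopeModel(intercept: 1.0, slope: 2.0)
let batchPrediction = slopeModel.predict([[3.0], [4.0]])
let scalarPrediction = slopeModel.predict(3.0)  // provided by the extension
```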

Documentation

The DocC catalog gained three new pages, a new Topics group, and a corpus-wide editorial pass.

New conceptual guides:

  • Model Interpretation Primer — reading a fitted model's coefficients, the condition number as a trust check, and when a weight is meaningful.

New API and topic pages:

  • Logistic Regression (Models).
  • Residual Model (Models).

New Worked Examples section:

  • Semantic Search — ranking by meaning across a corpus.
  • Embedding Sources — the Embedder contract and swapping vector sources.
  • Retrieving Context for Generation — chunking, ranking, and assembling a cited context block.
  • Panel Workflows — labeled-table analysis end to end.
  • Building an Effort Model — composing StandardScaler, Ridge, ResidualModel, and a KNearestNeighbors classifier into a watchOS effort model.

Expanded pages: Basics, How It Works, Machine Learning Primer, Physics Primitives Primer, and Text Tokenization gained cross-links to the new models, retrieval, and embedding surfaces. The catalog also received a sentence-structure and punctuation pass across every article.