Six small text classifiers and a train() primitive, all of which
run in the browser or in Node and return a calibrated probability
you can compare and threshold like any other number.
```js
import { spam, intent, grade } from 'udecide'

await spam('BUY NOW!!! click here')       // 0.96
await intent('my card was charged twice') // 'billing'
await grade('positive', 'increase')       // 0.89
```

```sh
npm install udecide
```

The library runs in Node 22 and higher, in modern browsers, in Bun,
in Deno, and inside a Cloudflare Worker, with
@huggingface/transformers as the only runtime dependency.
Six pre-trained tools, each one a single named import.
| tool | what it does | first call |
|---|---|---|
| `spam` | comment-spam probability | 23 MB (shared) |
| `intent` | route into billing, support, sales, shipping, other | 23 MB (shared) |
| `sentiment` | positive, neutral, negative | 23 MB (shared) |
| `toxicity` | abuse probability | 23 MB (shared) |
| `pii` | personal-information detector | 23 MB (shared) |
| `grade` | "do these two answers mean the same thing" | 65 MB |
The first five share one sentence-encoder model that downloads about
23 MB on the first call and stays cached afterwards, while grade
loads its own cross-encoder of about 65 MB, because direction
questions and antonym discrimination need a model that scores the
pair jointly rather than computing a cosine between two embeddings.
An application that imports every tool and exercises all of them
downloads about 88 MB once and runs locally for the rest of the
session.
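One way to see the shared download in practice is to time two different tools back to back. This is an illustrative sketch, not library API; the absolute timings depend on the network and the cache state.

```js
import { spam, toxicity } from 'udecide'

// The first call on any of the five shared tools pays the ~23 MB encoder download.
console.time('first call (downloads encoder)')
await spam('limited time offer, act now!!!')
console.timeEnd('first call (downloads encoder)')

// A different tool reuses the same cached encoder, so only inference remains.
console.time('second call (cache hit)')
await toxicity('you people are the worst')
console.timeEnd('second call (cache hit)')
```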
```js
import { train, load } from 'udecide'

const classify = await train([
  { text: 'great product', label: 'positive' },
  { text: 'broke immediately', label: 'negative' },
  // around thirty more
])

await classify('works as expected') // 'positive'

const head = classify.export()
const reloaded = await load(head)
```

train() takes around thirty labeled examples for a binary task or
around fifty for a multiclass one, fits a head on top of the
sentence encoder using an 80/20 stratified split for the holdout,
calibrates the scores so that a 0.7 actually means around 70%
confidence on the held-out test set, and returns a callable closure
you can save to disk and reload later. When the
classes do not separate, the trainer throws a TrainingError that
lists the misclassified examples and the likely causes, which is
the only honest thing to do when the underlying signal is not
there.
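A minimal sketch of wiring both guarantees together in Node. Two details here go beyond what is stated above and are assumptions: that export() returns a JSON-serializable object, and that the thrown TrainingError is identifiable by its name property.

```js
import { writeFile } from 'node:fs/promises'
import { train } from 'udecide'

const examples = [
  { text: 'great product', label: 'positive' },
  { text: 'broke immediately', label: 'negative' },
  // around thirty more, as above
]

try {
  const classify = await train(examples)
  // Persist the trained head; assumes export() returns a JSON-serializable object.
  await writeFile('my-head.json', JSON.stringify(classify.export()))
} catch (err) {
  // Assumes the error carries name === 'TrainingError'; if the class is
  // exported, an instanceof check would be the sturdier test.
  if (err.name === 'TrainingError') {
    console.error('classes did not separate:\n' + err.message)
  } else {
    throw err
  }
}
```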
Every score this library returns is a real probability in [0, 1]
rather than a raw model output, which means a 0.7 from any tool can
be compared with a 0.7 from any other tool without having to
remember which one came out of which sigmoid, and the standard
pattern is a single threshold.
```js
const isSpam = (await spam(text)) > 0.7
```

The library is the right tool when the rule you are trying to
encode is "this looks like that" and the alternative you would
otherwise be writing is a regular expression, a switch on keyword
presence, or a string equality check that has started lying. It is
the wrong tool for problems that require parsing, like deciding
whether a string is valid Python, and for problems that require
step-by-step reasoning, like working out whether a proof is
correct, because a sentence encoder collapses both of those into a
single vector and loses the information that mattered. The default
models are also tuned for English, so any application that needs to
classify other languages should use setEmbedder to swap the encoder
for one of the multilingual variants documented under the embedders
concept page before expecting any of the calibration to hold on its
own corpus.
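A sketch of that swap, with two loud assumptions: the model id below is illustrative rather than taken from the embedders concept page, and whether setEmbedder returns a promise is not stated above, so it is awaited defensively.

```js
import { setEmbedder, sentiment } from 'udecide'

// Illustrative model id; substitute one of the documented multilingual variants.
await setEmbedder('Xenova/paraphrase-multilingual-MiniLM-L12-v2')

await sentiment('el producto llegó roto') // e.g. 'negative'
```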
The same train-and-test flow is available from the command line:

```sh
udecide train ./examples.jsonl --out ./my-head.json
udecide test ./my-head.json --input "..." --expected "..."
udecide info ./my-head.json
```

The full documentation lives under docs/ and the seven runnable
examples under examples/ cover the patterns most applications
actually need, including the screenshot grader at
examples/grader-screenshot/, which demonstrates the fix for the
specific bug this library was originally written to address.
MIT.