yasemineren/Typesentry

TypeSentry

🧪 TypeSentry: The LLM Torture-Test Harness for TypeScript

"Trust, but Verify." TypeSentry evaluates Large Language Models with adversarial TypeScript prompts and catches failures in security, async logic, and type safety before code reaches production.

Why this project exists

LLMs can produce convincing code that still fails in critical ways:

  • Concurrency bugs (forEach(async ...), race conditions)
  • Security footguns (SQL injection, leaking secrets)
  • Type hallucinations (as any, broken generic assumptions)
  • Operational gaps (weak error paths, no reproducible artifacts)

TypeSentry turns these into measurable test cases.
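As an illustration of the first failure class, the sketch below contrasts `forEach(async ...)` with `Promise.all`. The `delay` helper and both function names are hypothetical and exist only for this example; they are not taken from the TypeSentry suites.

```typescript
// Hypothetical helper standing in for any async side effect (DB write, HTTP call).
const delay = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Buggy: forEach ignores the promises returned by its callback,
// so this function returns before any item has been processed.
async function processWithForEach(ids: number[]): Promise<number[]> {
  const done: number[] = [];
  ids.forEach(async (id) => {
    await delay(5);
    done.push(id);
  });
  return [...done]; // snapshot taken before the callbacks finish
}

// Fixed: map each item to a promise and await all of them.
async function processWithPromiseAll(ids: number[]): Promise<number[]> {
  const done: number[] = [];
  await Promise.all(
    ids.map(async (id) => {
      await delay(5);
      done.push(id);
    }),
  );
  return done;
}
```

Both versions compile under strict TypeScript, which is why a pattern check in addition to `tsc` is useful for this class of bug.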

Architecture

  1. Suite definitions (src/suites/*.json) model real-world engineering tasks.
  2. Runner (src/core/runner.ts) executes each case against model output (mocked by default).
  3. Static evaluator (src/evaluators/static_analysis.ts) checks:
    • forbidden regex patterns
    • required regex patterns
    • strict TypeScript compilation
  4. Repro pack reporter (src/reporters/markdown.ts) stores prompt/code/errors per failure in examples/.
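The forbidden/required pattern checks in step 3 could look roughly like the sketch below. The `StaticCase` and `StaticResult` shapes, the field names, and the example patterns are assumptions made for illustration; the actual schema in `src/suites/*.json` and the code in `src/evaluators/static_analysis.ts` may differ. Strict compilation is left out of this sketch.

```typescript
// Hypothetical shape of a test case; the real suite schema may differ.
interface StaticCase {
  id: string;
  forbiddenPatterns: string[];
  requiredPatterns: string[];
}

interface StaticResult {
  passed: boolean;
  errors: string[];
}

// Checks generated code against forbidden and required regex patterns.
// The strict TypeScript compilation step is intentionally not covered here.
function evaluatePatterns(testCase: StaticCase, code: string): StaticResult {
  const errors: string[] = [];
  for (const pattern of testCase.forbiddenPatterns) {
    if (new RegExp(pattern).test(code)) {
      errors.push(`forbidden pattern matched: ${pattern}`);
    }
  }
  for (const pattern of testCase.requiredPatterns) {
    if (!new RegExp(pattern).test(code)) {
      errors.push(`required pattern missing: ${pattern}`);
    }
  }
  return { passed: errors.length === 0, errors };
}

// Illustrative case: reject forEach(async ...) and `as any`, require Promise.all.
const exampleCase: StaticCase = {
  id: "ASYNC_LOOP",
  forbiddenPatterns: ["\\.forEach\\(\\s*async", "\\bas any\\b"],
  requiredPatterns: ["Promise\\.all"],
};
```

Keeping patterns as strings rather than `RegExp` literals lets them live in JSON suite files, at the cost of double-escaping backslashes.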

Included suites

  • src/suites/security_suite.json
    • JWT handling
    • SQL query safety
    • password hashing hygiene
  • src/suites/engineering_suite.json
    • async concurrency and retry patterns
    • typed REST client expectations
    • event-driven idempotency workflows

Usage

npm install
npm start -- run suites/security_suite.json
npm start -- run suites/engineering_suite.json

You can pass either suites/... or src/suites/...; the CLI resolves both.
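One way such dual-path resolution could work is sketched below. `resolveSuitePath` and its `projectRoot` parameter are hypothetical; the actual CLI may resolve paths differently.

```typescript
import * as path from "node:path";

// Hypothetical resolver: accepts "suites/x.json" or "src/suites/x.json"
// and returns a path under <projectRoot>/src/suites.
function resolveSuitePath(input: string, projectRoot: string): string {
  // Normalise separators so Windows-style input is handled the same way.
  const normalized = path.posix.normalize(input.replace(/\\/g, "/"));
  const relative = normalized.startsWith("src/")
    ? normalized
    : path.posix.join("src", normalized);
  return path.join(projectRoot, relative);
}
```

With this approach both spellings of a suite path map to the same file on disk.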

Output example

On failure, TypeSentry creates:

examples/repro_pack_<CASE_ID>_<TIMESTAMP>/
  ├── prompt.txt
  ├── generated_code.ts
  └── analysis_report.md
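A minimal sketch of how a reporter might write this layout follows. The `FailureRecord` fields, the `writeReproPack` name, the timestamp format, and the report contents are all assumptions; the real implementation lives in `src/reporters/markdown.ts` and may differ.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical failure record; field names are assumptions.
interface FailureRecord {
  caseId: string;
  prompt: string;
  generatedCode: string;
  errors: string[];
}

// Writes prompt, code, and a markdown report into a per-failure directory
// and returns the directory path.
function writeReproPack(
  baseDir: string,
  failure: FailureRecord,
  now: Date = new Date(),
): string {
  // Assumed timestamp format: ISO 8601 with ":" and "." replaced,
  // since those characters are not safe in directory names everywhere.
  const timestamp = now.toISOString().replace(/[:.]/g, "-");
  const dir = path.join(baseDir, `repro_pack_${failure.caseId}_${timestamp}`);
  fs.mkdirSync(dir, { recursive: true });

  fs.writeFileSync(path.join(dir, "prompt.txt"), failure.prompt);
  fs.writeFileSync(path.join(dir, "generated_code.ts"), failure.generatedCode);

  const report = [
    `# Analysis report: ${failure.caseId}`,
    "",
    ...failure.errors.map((e) => `- ${e}`),
    "",
  ].join("\n");
  fs.writeFileSync(path.join(dir, "analysis_report.md"), report);
  return dir;
}
```

Passing `now` as a parameter keeps the directory name deterministic in tests.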

Scripts

  • npm start -- run <suite-path>: run a suite
  • npm run typecheck: TypeScript compile check

Next steps (recommended)

  • Plug in real model providers (OpenAI/Anthropic) behind a provider interface.
  • Add deterministic scoring weights per failure category.
  • Add CI job that uploads repro packs as artifacts.
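The provider interface mentioned in the first item might look something like the sketch below. `ModelProvider`, `MockProvider`, and `generateForCase` are hypothetical names that do not exist in the current codebase; a real OpenAI or Anthropic adapter would implement the same interface using that vendor's SDK.

```typescript
// Hypothetical provider contract; not part of the current codebase.
interface ModelProvider {
  readonly name: string;
  generate(prompt: string): Promise<string>;
}

// Deterministic stand-in, in the spirit of the "mocked by default" runner.
class MockProvider implements ModelProvider {
  readonly name = "mock";
  private readonly canned: Map<string, string>;

  constructor(canned: Map<string, string>) {
    this.canned = canned;
  }

  async generate(prompt: string): Promise<string> {
    // Fall back to a placeholder when no canned output is registered.
    return this.canned.get(prompt) ?? "// no canned response";
  }
}

// The runner only depends on the interface, so providers are swappable.
async function generateForCase(
  provider: ModelProvider,
  prompt: string,
): Promise<string> {
  return provider.generate(prompt);
}
```

Keeping the runner dependent only on the interface means the mock stays usable in CI, where calling a real model may be undesirable.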

About

LLM evaluation harness for TypeScript: adversarial suites, static checks, strict tsc, and reproducible failure packs.