case-verification

Deterministic citation verification for US legal writing. Catches fabricated cases, fabricated quotes, and pincite errors against CourtListener. Refuses to accuse what it cannot verify.

Author: David Leung · davidleung.co

I work in compliance at a single family office in Hong Kong. I'm not a lawyer. I've spent a long time in offshore structuring, and enough time with AI tools to know where they fail quietly.


Why this exists

In April 2026, a well-known US law firm filed a letter with a federal bankruptcy court disclosing that an emergency motion they had filed nine days earlier contained hallucinated citations and other AI-generated errors. The remedial letter included a schedule correcting the errors line by line. It is a careful, public, and generous piece of work — the kind of documentation most firms never publish — and it makes a concrete benchmark possible for the rest of us.

I built this skill to make that schedule's four error classes deterministically detectable before a similar filing reaches a court.

The idea was prompted by Anthropic's recent Claude Skills webinar and by Mark Pike's Q&A discussion of the growing legal community around Claude Skills. Most of the code was written in collaboration with Claude Code.

What it does

Given a brief, motion, memo, or any document containing case citations, the skill runs a deterministic pipeline against CourtListener — the Free Law Project's open US case-law database:

  1. Parses every citation with eyecite.
  2. Resolves each citation to a canonical record via three fallback strategies.
  3. Fetches the full opinion text.
  4. Verifies that any quoted language actually appears in the opinion — with normalization for smart quotes, whitespace, editorial bracketing, ellipses, and star-pagination.
  5. Cross-checks the pincite where star-pagination is available.
  6. Emits a layered evidence ledger plus a human-readable report.
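The normalization in step 4 is the part most likely to surprise readers, so here is a minimal sketch of what that kind of matching looks like. This is an illustration, not the skill's actual code; the function names and the exact normalization rules are assumptions.

```python
import re

def normalize_for_match(text: str) -> str:
    """Normalize brief or opinion text before quote matching.

    Illustrates the variations the pipeline tolerates: smart quotes,
    whitespace runs, editorial bracketing, ellipses, and star-pagination
    markers like *284 embedded in opinion text.
    """
    # Straighten smart quotes and apostrophes
    text = text.translate(str.maketrans({"\u201c": '"', "\u201d": '"',
                                         "\u2018": "'", "\u2019": "'"}))
    # Drop star-pagination markers (e.g. "*284") from opinion text
    text = re.sub(r"\*\d+\s*", "", text)
    # Unify ellipses: ". . ." or the single-character ellipsis -> "..."
    text = re.sub(r"\u2026|\.\s*\.\s*\.", "...", text)
    # Undo editorial bracketing of a single altered character: "[t]he" -> "the"
    text = re.sub(r"\[(\w)\]", r"\1", text)
    # Collapse whitespace
    return re.sub(r"\s+", " ", text).strip()

def quote_appears(quote: str, opinion_text: str) -> bool:
    """True if the quoted language survives in the opinion after normalization."""
    return normalize_for_match(quote) in normalize_for_match(opinion_text)
```

Note that this kind of matching runs on normalized copies of both sides; the raw opinion text is kept separately for pincite work, where the star-pagination markers are the signal rather than noise.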

Two modes:

  • Preflight — auditing your own draft before filing.
  • Adversary — auditing opposing counsel's filing for a response brief or 28(j) letter.

What it deliberately does not do

The tool reports findings in layered form across seven dimensions. Two of those layers are always reported as not_reviewed:

  • Proposition support. No judgment of whether the cited case stands for what the brief says it stands for. That requires legal reasoning; the tool retrieves the passage and leaves the call to the human.
  • Treatment / citator. No KeyCite or Shepard's integration.

It also does not verify non-US law, statutes, regulations, or secondary sources. Those are out of scope, honestly labeled, and flagged to the reviewer.

A ✅ badge from this tool never means "this case supports the proposition." It means the quote and metadata are internally consistent. Proposition support is human judgment. See GOVERNANCE.md for the full honesty model.

Known limitations

I'd rather you know these upfront than discover them in a filing:

  • CourtListener coverage gaps. Recent bankruptcy court opinions, unpublished decisions, and many Westlaw- or Lexis-only citations are not indexed. When that happens, the tool labels the result unresolved — coverage gap with a manual-verification note. It does not label it a potential fabrication. This matters: in adversary mode, miscalling a coverage gap would be defamatory. The tool is built not to do that.
  • Short-form back-references. Citations like 400 B.R. at 291 that reference an earlier full cite are resolved but currently show case_name: None, because eyecite doesn't automatically link short-form citations back to their antecedents. Not a correctness bug — the underlying citation is still verified — but the display is rough.
  • Reporter-page vs. star-page pincites. When a brief cites by reporter page (e.g., 283) and CourtListener's opinion text only marks star-pagination (*284), the tool cannot deterministically translate between the two numbering systems. It returns pincite: unresolved rather than guess.
  • Non-US law, statutes, regulations, treaties, secondary sources. Out of scope by design.
  • Proposition support and treatment. Out of scope by design, as above.
  • Not a lawyer. I built this as an engineer and compliance practitioner. The legal judgment in any output is yours. This tool is an evidence retrieval and matching pipeline, not substitute counsel.
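The star-pagination limitation above is easier to see with a sketch of the deterministic half of the problem: locating which star page a quote falls on in the raw opinion text. When the brief cites by reporter page instead, no such marker exists to match against, which is why the tool returns unresolved. Function and variable names here are hypothetical, not the skill's actual code.

```python
import re
from bisect import bisect_right

STAR_PAGE = re.compile(r"\*(\d+)")

def star_page_at(raw_opinion: str, char_offset: int):
    """Return the star page in effect at char_offset, or None.

    Star-pagination markers like *284 mark page boundaries in raw
    opinion text; the page at an offset is the last marker before it.
    This runs on raw text, since normalization would strip the markers.
    """
    marks = [(m.start(), int(m.group(1))) for m in STAR_PAGE.finditer(raw_opinion)]
    if not marks:
        return None  # no star pagination at all -> pincite unresolved
    positions = [pos for pos, _ in marks]
    i = bisect_right(positions, char_offset) - 1
    return marks[i][1] if i >= 0 else None
```

If the quote is found before the first marker, or the opinion carries no markers, the only honest answer is None — which surfaces as `pincite: unresolved` rather than a guess.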

The CHANGELOG.md documents the bugs I found during live testing and the fixes for each. I recommend reading it before relying on the tool — particularly the v1.3 and v1.6 entries, which concern false-positive patterns that live testing surfaced and mock testing would not have.

Quick start

git clone https://github.com/dave817/case-verification
cd case-verification
pip install eyecite

# Free token from courtlistener.com/profile/
export COURTLISTENER_API_TOKEN=your_token

# Audit your own draft
python3 scripts/verify.py your_brief.txt --mode preflight --output report.md

# Audit opposing counsel's filing
python3 scripts/verify.py their_brief.txt --mode adversary --output response_memo.md

What you get

For every citation, the tool emits a seven-layer status table:

Layer | What it checks
----- | --------------
A. Authority resolution | Does the citation resolve in CourtListener?
B. Source retrieval | Can we retrieve opinion text?
C. Metadata consistency | Do year and case name match the brief?
D. Quote verification | Does quoted language appear in the opinion?
E. Pincite verification | Is the *N page consistent with where the quote appears?
F. Proposition support | ALWAYS not_reviewed — engine does not judge
G. Treatment / citator | ALWAYS not_reviewed — no citator integration

No top-level status reads as bare verified. The most confident label is verified_quote_and_metadata — a deliberate reminder that proposition support remains human work.
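The shape of that ledger can be sketched as follows. Field names and the label vocabulary beyond `verified_quote_and_metadata` and `not_reviewed` are assumptions for illustration; the point is that layers F and G are hard-coded so no output can ever claim proposition support.

```python
# Layers the engine refuses to judge, stamped on every ledger entry.
NEVER_REVIEWED = {
    "F_proposition_support": "not_reviewed",
    "G_treatment_citator": "not_reviewed",
}

def overall_label(layers: dict) -> str:
    """Most confident label is verified_quote_and_metadata, never bare 'verified'."""
    checked = ["A_authority", "B_source", "C_metadata", "D_quote", "E_pincite"]
    if all(layers.get(k) == "pass" for k in checked):
        return "verified_quote_and_metadata"
    if layers.get("A_authority") == "coverage_gap":
        return "unresolved_coverage_gap"
    return "needs_review"

def make_ledger(citation: str, layers: dict) -> dict:
    """Combine checked layers, the never-reviewed layers, and a top-level status."""
    return {"citation": citation, **layers, **NEVER_REVIEWED,
            "status": overall_label(layers)}
```

Because `NEVER_REVIEWED` is merged into every entry, even a downstream consumer that only reads the dict sees an explicit `not_reviewed` for proposition support, rather than an absent field it might misread as a pass.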

In adversary mode, the tool produces draftable paragraphs you can use in a response. The language is calibrated to distinguish a genuine catch (fabricated quote, wrong pincite) from a CourtListener coverage gap that calls for manual verification rather than accusation.

Design philosophy

One sentence: the worst failure of a verification tool is not missing a hallucination — it is falsely accusing a real citation of being one.

Every iteration of this tool has been shaped by that sentence. The layered status model exists because a single verified badge overclaims. The distinction between reporter and WL citations exists because CourtListener is comprehensive on the former and spotty on the latter, and conflating them in adversary output is how you end up writing a response brief that damages your own credibility. The pincite path runs on raw text while quote matching runs on normalized text, because otherwise star-pagination markers contaminate the match positions.

Those decisions came from running the tool against real filed briefs and watching the early versions produce confident wrong answers. Each wrong answer pointed at a better design.

Licence and contributions

MIT. Pull requests welcome, especially:

  • Additional test fixtures drawn from real filed briefs.
  • Bug reports with live CourtListener responses attached.
  • Coverage for citation forms I haven't handled yet.

If you find this useful, please consider donating to Free Law Project. Their work building open US case-law infrastructure is what makes this tool possible.

Acknowledgments

  • Anthropic, whose Claude Skills webinar made this class of tool look buildable on a weekend rather than a quarter, and whose Claude Code product wrote most of the tests with me.
  • Mark Pike, whose Q&A discussion of the growing legal community around Claude Skills pointed at exactly this kind of use case.
  • Free Law Project, for a decade of work building CourtListener and eyecite and keeping them free. Commercial citators are good; free ones are civilisational.
  • The compliance practitioners I've learned from, who taught me over many years why verifiable evidence matters more than confident assertion.

David Leung · Hong Kong · April 2026 · davidleung.co
