Support `--continue` to validate all JSONL entries beyond the first error by jviotti · Pull Request #727 · sourcemeta/jsonschema

jviotti · 2026-04-23T13:42:59Z

Fixes: #726
Signed-off-by: Juan Cruz Viotti jv@jviotti.com

augmentcode · 2026-04-23T13:53:54Z

🤖 Augment PR Summary

Summary: This PR adds a new --continue/-c flag to the jsonschema validate command to keep validating JSONL inputs after the first failing entry.

Changes:

- Introduced a `continue_on_error` option and threaded it through the JSONL/streamed validation loop.
- Adjusted multi-document (JSONL) control flow so failures no longer stop iteration when `--continue` is set.
- Added spacing/separation in verbose and error outputs when reporting multiple JSONL failures.
- Registered the new flag in the CLI option parser for the `validate` subcommand.
- Updated validation documentation to describe the new behavior and provide an example.
- Added new Unix shell tests covering continue behavior across normal, verbose, and JSON output modes.

Technical Notes: Exit code behavior is unchanged (the expected-failure exit code is returned when any entry fails). JSONL validation remains fail-fast unless --continue is provided.
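
The control flow described in this summary can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the project's C++ implementation: `validate_jsonl`, `check_entry`, and `require_integer_id` are hypothetical names, and the toy check stands in for a real JSON Schema evaluator. As in the PR, each failing entry contributes only its first error.

```python
import json
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical validator signature: returns the first error message for an
# entry, or None if the entry is valid. It stands in for a real JSON Schema
# evaluator, which this sketch does not implement.
Checker = Callable[[object], Optional[str]]


def validate_jsonl(
    lines: Iterable[str],
    check_entry: Checker,
    continue_on_error: bool = False,
) -> Tuple[bool, List[Tuple[int, str]]]:
    """Validate each non-empty JSONL line; return (all_valid, failures)."""
    failures: List[Tuple[int, str]] = []
    for index, line in enumerate(lines):
        if not line.strip():
            continue  # skip blank lines
        error = check_entry(json.loads(line))
        if error is not None:
            failures.append((index, error))
            if not continue_on_error:
                break  # fail-fast: stop at the first failing entry
    # The overall result is a failure as soon as any entry failed,
    # regardless of whether iteration continued.
    return (not failures, failures)


def require_integer_id(entry: object) -> Optional[str]:
    """Toy check used only for the demonstration below."""
    if not isinstance(entry, dict) or not isinstance(entry.get("id"), int):
        return "expected an object with an integer 'id'"
    return None


dataset = ['{"id": 1}', '{"id": "x"}', '{"id": 3}', '{"name": "y"}']
print(validate_jsonl(dataset, require_integer_id))
print(validate_jsonl(dataset, require_integer_id, continue_on_error=True))
```

In fail-fast mode only the entry at index 1 is reported. With `continue_on_error=True` the entries at indexes 1 and 3 are both reported. The overall result is a failure in both cases, which mirrors the unchanged exit code behavior.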

augmentcode

Review completed. 1 suggestion posted.

cubic-dev-ai

No issues found across 12 files

TobiasNx · 2026-04-23T14:08:45Z

     jsonschema validate path/to/my/schema.json path/to/my/dataset.jsonl
     ```

    +### Validate a JSONL dataset reporting all failures


Does it report all failures or for each failing record the first validation error?

I'll clarify. Only the first failure per record. JSON Schema, as per the standard, doesn't really describe how to proceed past the first validation error. It could maybe be done, but it might be very complex in certain cases. Definitely a longer track of work.

If I remember correctly, ajv has this feature, at least as the option `--all-errors`. The main reason we are experimenting with jsonschema is the support for JSONL.

Yeah, sadly AJV is one of the worst ones in terms of JSON Schema standard compliance we know of (and pretty much abandoned at this point). See https://bowtie.report for the official ranking we maintain.

I'll take a note of it. In theory it is possible to keep going despite errors, but I think we would need to be careful with how we present errors and not spit out nonsensical stack traces. We'll see!

> The main reason we are experimenting with jsonschema is the support for jsonl.

Can you share more about the use case, out of curiosity?

Sure, I work for hbz which provides digital infrastructure for scientific libraries in the state of North Rhine-Westphalia, Germany.

One of our services is a search index of the union catalogue of the libraries called lobid-resources: https://lobid.org/resources

The index data is created by an ETL transformation from the library data format MARC21 to JSON-LD.

We also created a JSON Schema for our index data: https://github.com/hbz/lobid-resources/tree/master/src/test/resources/schemas

We already validate our individual test files (around 200) with ajv. But we also want to test larger portions of our index, e.g. current updates (several thousand records up to a hundred thousand) or the whole index (22 million) as JSONL dump files. The search for a validator that supports JSONL led us to your project, jsonschema.

So far our tests with the update files look promising. Maybe support for compressed JSONL files would be nice.

Support for reporting all errors would also be nice, since it would help us improve our transformation with regard to all errors. Our update files change daily, and if a record has multiple errors, we would spot the next error only when the record gets updated again.

On hbz, very nice! Let me know how we can help. You might find the schema linter interesting, and https://one.sourcemeta.com is a nice way to visualise and serve JSON Schema data models. Overall, we are trying hard to produce a next-level JSON Schema ecosystem, so any feedback you have is much appreciated.

> Maybe a support for compressed jsonl files would be nice.

That's probably not very hard to implement. Can you submit an issue? What kind of compression are you using?

> We already validate our single test files around 200 with ajv.

In general, we at the JSON Schema TSC advise against AJV given its compliance issues. We offer a test command in this project to set up JSON Schema unit tests. Would that be a match here?

> The support for reporting all errors would also be nice, since it would help us to improve our transformation with regard to all errors.

Right. Makes sense. Let me think more about it!

jviotti added 2 commits April 23, 2026 09:41

Support --continue to validate all JSONL entries beyond the first error

cb89e0a

Better spacing

4a4f2f7

jviotti marked this pull request as ready for review April 23, 2026 13:50

augmentcode Bot reviewed Apr 23, 2026

Comment thread src/main.cc

Fix USAGE_DETAILS

4b850ec

cubic-dev-ai Bot reviewed Apr 23, 2026

jviotti mentioned this pull request Apr 23, 2026

Report all validation failures in a jsonl dataset #726

Closed

TobiasNx reviewed Apr 23, 2026

Note on first error

30021c6

jviotti merged commit f18026d into main Apr 23, 2026
14 checks passed

jviotti deleted the jsonl-continue branch April 23, 2026 14:34

jviotti merged 4 commits into main from jsonl-continue
