json-schema

A Java library implementing the JSON Schema specification:

  • compatible with Java 8,
  • support for the newest specification versions:
    • Draft 2020-12,
    • Draft 2019-09,
  • support for custom keywords,
  • support for annotation collection,
  • support for format validation (for the price of one additional dependency 😉),
  • compatible with most of the JSON/YAML libraries (supported libraries),
  • and no additional dependencies on top of that.

Check how it compares with other implementations:

  • Bowtie - specification compliance (only mandatory behaviour),
  • Creek's benchmark - benchmark for JVM based implementations.

Demo

You can check out how it works here.

Installation

Please note that you will also need to include at least one of the supported JSON provider libraries (see JSON provider setup).

Maven

<dependency>
    <groupId>dev.harrel</groupId>
    <artifactId>json-schema</artifactId>
    <version>1.6.0</version>
</dependency>

Gradle

implementation 'dev.harrel:json-schema:1.6.0'

Usage

To validate JSON against a schema, you just need to invoke:

String schema = """
        {
          "type": "boolean"
        }""";
String instance = "true";
boolean valid = new ValidatorFactory().validate(schema, instance).isValid();

The validation result can be queried for more detailed output than a simple boolean flag:

Validator.Result result = new ValidatorFactory().validate(schema, instance);
boolean valid = result.isValid(); // Boolean flag indicating if validation succeeded
List<Error> errors = result.getErrors(); // Details of exactly where validation failed
List<Annotation> annotations = result.getAnnotations(); // Annotations collected during validation

The Error and Annotation classes contain information about where the event occurred, along with the error message or annotation value. For details of their structure, please refer to the documentation.

Reusing schema

The most common case is probably validating multiple JSON objects against one specific schema. The approach shown above parses the schema on every validation request. To avoid this performance hit, use the Validator class directly.

Validator validator = new ValidatorFactory().createValidator();
URI schemaUri = validator.registerSchema(schema); // Returns URI which should be used to refer to this schema
Validator.Result result1 = validator.validate(schemaUri, instance1);
Validator.Result result2 = validator.validate(schemaUri, instance2);

This way, the schema is parsed only once. You can also register multiple schemas this way and refer to them independently. Keep in mind that the "registration space" for schemas is shared within one Validator, which allows schemas to reference one another dynamically.

Thread safety

  • ValidatorFactory IS NOT thread-safe as it contains mutable configuration elements which may lead to memory visibility issues. validate(...) methods are however stateless, so if the factory is configured before it has been shared between threads, it can be used concurrently.
  • Validator IS thread-safe as its configuration is immutable. The internal schema registry is configured for multi-threaded usage.
  • All library-provided implementations (SchemaResolver, JsonNodeFactory, EvaluatorFactory, Evaluator) are thread-safe. Custom user implementations that are intended for use in a multi-threaded environment should ensure thread safety themselves.

JSON/YAML providers

Supported JSON providers:

  • com.fasterxml.jackson.core:jackson-databind (default),
  • com.google.code.gson:gson,
  • jakarta.json:jakarta.json-api,
  • org.json:json,
  • net.minidev:json-smart,
  • org.codehaus.jettison:jettison.

Supported YAML providers:

  • org.yaml:snakeyaml.

The default provider is com.fasterxml.jackson.core:jackson-databind, so if you are not planning on changing the ValidatorFactory configuration, you need to have this dependency present in your project.

The specific versions of the provider dependencies that were tested can be found in the project POM (uploaded to Maven Central), where they are listed as optional dependencies.

All adapter classes for JSON provider libs can be found in this package. Anyone is free to add new adapter classes for any JSON lib of their choice, but keep in mind that it is not trivial. If you do so, ensure that test suites for providers pass.

Changing JSON/YAML provider

| Provider | Factory class | Provider node class |
|---|---|---|
| com.fasterxml.jackson.core:jackson-databind | JacksonNode.Factory | com.fasterxml.jackson.databind.JsonNode |
| com.google.code.gson:gson | GsonNode.Factory | com.google.gson.JsonElement |
| jakarta.json:jakarta.json-api | JakartaJsonNode.Factory | jakarta.json.JsonValue |
| org.json:json | OrgJsonNode.Factory | org.json.JSONObject, org.json.JSONArray, literal types |
| net.minidev:json-smart | JsonSmartNode.Factory | net.minidev.json.JSONObject, net.minidev.json.JSONArray, literal types |
| org.codehaus.jettison:jettison | JettisonNode.Factory | org.codehaus.jettison.json.JSONObject, org.codehaus.jettison.json.JSONArray, literal types |
| org.yaml:snakeyaml | SnakeYamlNode.Factory | org.yaml.snakeyaml.nodes.Node |

com.fasterxml.jackson.core:jackson-databind

new ValidatorFactory().withJsonNodeFactory(new JacksonNode.Factory());

com.google.code.gson:gson

new ValidatorFactory().withJsonNodeFactory(new GsonNode.Factory());

jakarta.json:jakarta.json-api

Keep in mind that this library contains only interfaces without a concrete implementation, so you also need an implementation such as org.glassfish:jakarta.json on your classpath. Although it was tested with the newest jakarta.json-api version, it should be compatible down to version 1.1.

new ValidatorFactory().withJsonNodeFactory(new JakartaJsonNode.Factory());

org.json:json

new ValidatorFactory().withJsonNodeFactory(new OrgJsonNode.Factory());

net.minidev:json-smart

new ValidatorFactory().withJsonNodeFactory(new JsonSmartNode.Factory());

org.codehaus.jettison:jettison

new ValidatorFactory().withJsonNodeFactory(new JettisonNode.Factory());

org.yaml:snakeyaml

The library is compatible with the YAML 1.1 standard. However, there are a few constraints:

  • all object keys are treated as strings,
  • object keys cannot be duplicated,
  • anchors and aliases are supported while override syntax (<<) is not.

new ValidatorFactory().withJsonNodeFactory(new SnakeYamlNode.Factory());

If you expect schemas to be in JSON format and data in YAML format, you can use the following method:

new ValidatorFactory().withJsonNodeFactories(otherJsonFactory, new SnakeYamlNode.Factory());

Provider literal types

Some providers don't have a single wrapper class for their JSON node representation:

  • org.json:json,
  • net.minidev:json-smart,
  • org.codehaus.jettison:jettison,

and they represent literal nodes with these classes:

  • java.lang.String,
  • java.lang.Boolean,
  • java.lang.Character,
  • java.lang.Enum,
  • java.lang.Integer,
  • java.lang.Long,
  • java.lang.Double,
  • java.math.BigInteger,
  • java.math.BigDecimal.
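
How an adapter maps these literal classes to JSON types is not spelled out above. The following is only a rough sketch of one plausible mapping, not the library's actual code: the class and method names are hypothetical, and the numeric handling assumes the JSON Schema rule that any number with a zero fractional part counts as an integer.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/* Hypothetical helper, not part of the library: classifies a Java literal as a JSON type name. */
final class LiteralTypes {
    private LiteralTypes() {}

    static String jsonTypeOf(Object literal) {
        if (literal instanceof String || literal instanceof Character || literal instanceof Enum) {
            return "string";
        }
        if (literal instanceof Boolean) {
            return "boolean";
        }
        if (literal instanceof Integer || literal instanceof Long || literal instanceof BigInteger) {
            return "integer";
        }
        if (literal instanceof Double) {
            double d = (Double) literal;
            /* A finite double without a fractional part is treated as an integer (assumption). */
            boolean integral = !Double.isInfinite(d) && d == Math.rint(d);
            return integral ? "integer" : "number";
        }
        if (literal instanceof BigDecimal) {
            /* After stripping trailing zeros, a scale of zero or less means no fractional digits remain. */
            boolean integral = ((BigDecimal) literal).stripTrailingZeros().scale() <= 0;
            return integral ? "integer" : "number";
        }
        throw new IllegalArgumentException("Unsupported literal class: " + literal);
    }
}
```

With this mapping, both 1.0 and new BigDecimal("3.00") are classified as integers, while 1.5 is a number.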

Format validation

By default, the format keyword performs no validation (it only collects annotations, as mandated by the JSON Schema specification). If you want to use format validation, please add an explicit dependency on the jmail library:

<dependency>
  <groupId>com.sanctionco.jmail</groupId>
  <artifactId>jmail</artifactId>
  <version>1.6.2</version>
</dependency>

implementation 'com.sanctionco.jmail:jmail:1.6.2'

To enable format validation, attach FormatEvaluatorFactory to your ValidatorFactory instance:

new ValidatorFactory().withEvaluatorFactory(new FormatEvaluatorFactory());

If usage of another custom EvaluatorFactory is required, you can use EvaluatorFactory.compose() method:

new ValidatorFactory().withEvaluatorFactory(EvaluatorFactory.compose(customFactory, new FormatEvaluatorFactory()));

Supported formats

  • date, date-time, time - uses java.time.format.DateTimeFormatter with standard ISO formatters,
  • duration - uses regex validation as it may be a combination of java.time.Duration and java.time.Period,
  • email, idn-email - uses com.sanctionco.jmail.JMail,
  • hostname - uses regex validation,
  • idn-hostname - not supported - performs same validation as hostname,
  • ipv4, ipv6 - uses com.sanctionco.jmail.net.InternetProtocolAddress,
  • uri, uri-reference, iri, iri-reference - uses java.net.URI,
  • uuid - uses java.util.UUID,
  • uri-template - lenient checking of unclosed braces (should be compatible with Spring's implementation),
  • json-pointer, relative-json-pointer - uses manual validation,
  • regex - uses java.util.regex.Pattern.

Note that the provided format validation is not 100% specification compliant. Instead, it aims to be more "Java environment oriented". For example, when a value is validated as being in uri-reference format, it is guaranteed that a URI.create(value) call will succeed.
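
To make the "Java environment oriented" idea concrete, here is a minimal sketch of JDK-backed checks for three of the formats listed above. It is not the library's implementation: the class and method names are hypothetical, and the real validators may apply extra rules on top of the JDK classes.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.UUID;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/* Hypothetical helper, not part of the library: format checks delegated to JDK parsers. */
final class JdkFormatChecks {
    private JdkFormatChecks() {}

    /* Valid when java.net.URI can parse the value (absolute or relative reference). */
    static boolean isUriReference(String value) {
        try {
            new URI(value);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /* Valid when java.util.UUID can parse the value. */
    static boolean isUuid(String value) {
        try {
            UUID.fromString(value);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    /* Valid when java.util.regex.Pattern can compile the value. */
    static boolean isRegex(String value) {
        try {
            Pattern.compile(value);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }
}
```

Keep in mind that JDK parsers are sometimes more lenient than the specification: UUID.fromString, for instance, also accepts shortened groups such as 1-1-1-1-1, which a stricter check would have to reject separately.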

Advanced configuration

Resolving external schemas

By default, the only schemas that are resolved externally are the specification meta-schemas (e.g. https://json-schema.org/draft/2020-12/schema), which are used for validating schemas during the registration process. The meta-schema files are fetched from the classpath and are packaged with the JAR.

There is no mechanism to pull schemas via HTTP requests. If such behaviour is required, it should be implemented by the user.

Providing custom SchemaResolver would look like this:

SchemaResolver resolver = (String uri) -> {
    if ("urn:my-schema1".equals(uri)) {
        // Here goes the logic to retrieve this schema
        // This may be e.g. HTTP call
        String rawSchema = ...
        return SchemaResolver.Result.fromString(rawSchema);
    } else if ("urn:my-schema2".equals(uri)) {
        // Same thing here
        String rawSchema = ...
        return SchemaResolver.Result.fromString(rawSchema);
    } else {
        return SchemaResolver.Result.empty();
    }
};

Then it just needs to be attached to ValidatorFactory:

new ValidatorFactory().withSchemaResolver(resolver);

For more information about return type please refer to the documentation.
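
Since HTTP retrieval is left to the user, the following sketch shows one way the raw schema text could be fetched with JDK classes only (compatible with Java 8). The class name, method name, and timeout values are assumptions made for illustration; nothing here is provided by the library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

/* Hypothetical helper, not part of the library: downloads raw schema text over HTTP(S). */
final class HttpSchemaFetcher {
    private HttpSchemaFetcher() {}

    static Optional<String> fetchRawSchema(String uri) {
        URI parsed;
        try {
            parsed = new URI(uri);
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
        /* Only handle http and https; any other URI (e.g. urn:...) is left to other resolvers. */
        String scheme = parsed.getScheme();
        if (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme)) {
            return Optional.empty();
        }
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) parsed.toURL().openConnection();
            connection.setConnectTimeout(5_000); // illustrative timeout values
            connection.setReadTimeout(5_000);
            if (connection.getResponseCode() != HttpURLConnection.HTTP_OK) {
                return Optional.empty();
            }
            try (InputStream in = connection.getInputStream()) {
                /* Read the whole body manually to stay compatible with Java 8. */
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                return Optional.of(new String(out.toByteArray(), StandardCharsets.UTF_8));
            }
        } catch (IOException | IllegalArgumentException e) {
            return Optional.empty();
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }
}
```

Inside a SchemaResolver, a present value could be wrapped with SchemaResolver.Result.fromString(...) and an empty one mapped to SchemaResolver.Result.empty(), the two factory methods used in the resolver example above.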

Dialects

By default, the Draft 2020-12 dialect is used, but it can be changed with:

new ValidatorFactory().withDialect(new Dialects.Draft2019Dialect()); // or any other dialect

Custom dialects are also supported, see more here.

Meta-schemas

Dialects come with their meta-schemas. Each schema is validated against the meta-schema provided by the dialect in use. If validation fails, InvalidSchemaException is thrown.

For each specific schema this behaviour can be overridden by providing $schema keyword with desired meta-schema URI. Resolution of meta-schema follows the same rules as for a regular schema.

There is a configuration option that disables all schema validations (affects $schema and vocabularies semantics too):

new ValidatorFactory().withDisabledSchemaValidation(true);

Adding custom keywords

Customizing the behaviour of specific keywords can be achieved by providing a custom EvaluatorFactory implementation. Each dialect comes with its core EvaluatorFactory, which will always be used, but an additional EvaluatorFactory implementation can be provided on top of that. If you want to completely alter how schemas are validated, please refer to custom dialects.

The first step is to implement the Evaluator interface:

class ContainsStringEvaluator implements Evaluator {
    /* A value which should be contained in a validated string */
    private final String value;

    ContainsStringEvaluator(JsonNode node) {
        /* Other types are not supported - this exception will be handled appropriately by factory returned by the builder */
        if (!node.isString()) {
            throw new IllegalArgumentException();
        }
        this.value = node.asString();
    }
    
    @Override
    public Evaluator.Result evaluate(EvaluationContext ctx, JsonNode node) {
        /* To stay consistent with other keywords, types not applicable to this keyword should succeed */
        if (!node.isString()) {
            return Evaluator.Result.success();
        }
        
        /* Actual validation logic */
        if (node.asString().contains(value)) {
            return Evaluator.Result.success();
        } else {
            return Evaluator.Result.failure(String.format("\"%s\" does not contain required value [%s]", node.asString(), value));
        }
    }
}

For the simplest cases (like this one) it is recommended to use EvaluatorFactory.Builder. This example shows how to create an evaluator factory using builder:

EvaluatorFactory factory = new EvaluatorFactory.Builder()
    .withKeyword("containsString", ContainsStringEvaluator::new)
    .build();

For more complex cases when you need more control over creation of evaluators, you should provide your own factory implementation:

class CustomEvaluatorFactory implements EvaluatorFactory {
    @Override
    public Optional<Evaluator> create(SchemaParsingContext ctx, String fieldName, JsonNode schemaNode) {
        /* Check if fieldName equals the keyword value you want to support.
         * Additionally, check if the node type of this keyword field is a string - this is the only type which makes sense in this case.
         * It may be tempting to fail if the keyword field type is different than string,
         * but it's strongly recommended to just return Optional.empty() in such case. */
        if ("containsString".equals(fieldName) && schemaNode.isString()) {
            return Optional.of(new ContainsStringEvaluator(schemaNode));
        }
        return Optional.empty();
    }
}

Then the factory just needs to be attached to ValidatorFactory:

new ValidatorFactory().withEvaluatorFactory(factory);

If you have more than one custom evaluator factory, use the compose function:

new ValidatorFactory().withEvaluatorFactory(EvaluatorFactory.compose(new CustomEvaluatorFactory1(), new CustomEvaluatorFactory2()));

With the configuration above, the resulting list of evaluator factories is:

  1. CustomEvaluatorFactory1,
  2. CustomEvaluatorFactory2,
  3. Draft2020EvaluatorFactory (the core evaluator factory provided by Dialect object, might be different depending on which dialect is set).

The ordering of factories is important, as the next factory is queried only if the previous one returned Optional.empty(). This makes it easy to override keyword logic.

E.g. if CustomEvaluatorFactory1 returns an evaluator for type keyword, the type evaluator from Draft2020EvaluatorFactory would not be provided.
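
The "first non-empty result wins" lookup can be modelled with plain JDK types. This is only a sketch: KeywordFactory is a hypothetical stand-in for EvaluatorFactory that returns a label instead of an Evaluator, and FactoryChain is not a library class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/* Hypothetical stand-in for EvaluatorFactory: maps a keyword name to a label instead of an Evaluator. */
interface KeywordFactory {
    Optional<String> create(String keyword);
}

final class FactoryChain {
    private final List<KeywordFactory> factories;

    FactoryChain(KeywordFactory... factories) {
        this.factories = Arrays.asList(factories);
    }

    /* Queries factories in order and stops at the first one that returns a non-empty result. */
    Optional<String> firstMatch(String keyword) {
        for (KeywordFactory factory : factories) {
            Optional<String> result = factory.create(keyword);
            if (result.isPresent()) {
                return result;
            }
        }
        return Optional.empty();
    }

    /* Mirrors the list above: two custom factories followed by the dialect's core factory. */
    static FactoryChain example() {
        KeywordFactory custom1 = k -> "type".equals(k)
                ? Optional.of("custom1:type") : Optional.empty();
        KeywordFactory custom2 = k -> "containsString".equals(k)
                ? Optional.of("custom2:containsString") : Optional.empty();
        KeywordFactory core = k -> ("type".equals(k) || "minimum".equals(k))
                ? Optional.of("core:" + k) : Optional.empty();
        return new FactoryChain(custom1, custom2, core);
    }
}
```

In this model, looking up type yields the result of the first custom factory, so the core factory is never asked about it, while minimum falls through to the core factory.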

Each keyword & keyword-value pair can have 0 or 1 evaluators attached. If you want multiple evaluators attached to a single keyword & keyword-value pair, you would need to simulate such behavior by encapsulating multiple evaluators logic in just one evaluator instance.
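
One way to picture such encapsulation is a composite that runs several checks and reports the first failure. The sketch below uses a hypothetical StringCheck interface as a stand-in for Evaluator (an empty result means success, a present value carries the error message); with the library, the same idea would be an Evaluator whose evaluate(...) method delegates to several inner evaluators.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/* Hypothetical stand-in for Evaluator: returns an error message on failure, empty on success. */
interface StringCheck {
    Optional<String> check(String value);
}

/* Wraps several checks so that they can be attached as a single unit. */
final class CompositeCheck implements StringCheck {
    private final List<StringCheck> delegates;

    CompositeCheck(StringCheck... delegates) {
        this.delegates = Arrays.asList(delegates);
    }

    @Override
    public Optional<String> check(String value) {
        /* Run delegates in order and report the first failure. */
        for (StringCheck delegate : delegates) {
            Optional<String> error = delegate.check(value);
            if (error.isPresent()) {
                return error;
            }
        }
        return Optional.empty();
    }

    /* Example composite: the value must contain "hello" and be at most 10 characters long. */
    static CompositeCheck example() {
        StringCheck containsHello = v -> v.contains("hello")
                ? Optional.empty() : Optional.of("missing [hello]");
        StringCheck maxTenChars = v -> v.length() <= 10
                ? Optional.empty() : Optional.of("longer than 10 characters");
        return new CompositeCheck(containsHello, maxTenChars);
    }
}
```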

Difference between schema parsing and validation

The implementation logic consists of two parts:

  1. Schema parsing - running EvaluatorFactory.create(...) logic and constructing a concrete Evaluator instance. You probably want to do as much of the computation-heavy processing as possible in this part.
  2. Validation - running Evaluator.evaluate(...).
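
The split between the two phases can be illustrated with a small JDK-only sketch. PrecompiledRegexCheck is a hypothetical stand-in for an Evaluator backing some regex-based custom keyword: the constructor plays the role of the parsing phase and matches(...) the role of the validation phase.

```java
import java.util.regex.Pattern;

/* Hypothetical stand-in for an Evaluator of a regex-based custom keyword. */
final class PrecompiledRegexCheck {
    private final Pattern pattern;

    /* Parsing phase: runs once per schema keyword, so the costly compilation happens here. */
    PrecompiledRegexCheck(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    /* Validation phase: runs for every instance and only reuses the precompiled pattern. */
    boolean matches(String value) {
        return pattern.matcher(value).find();
    }
}
```

Because the pattern is compiled in the constructor, validating many instances against the same schema pays the compilation cost only once.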

Please ensure that your EvaluatorFactory and Evaluator implementations correctly handle cases when node types are different than expected. This is because EvaluatorFactory.create(...) will be called for all schema object properties (nested too). So for example this may be intended usage:

{
  "containsString": "hello"
}

but the EvaluatorFactory will be called also in this case:

{
  "properties": {
    "containsString": {
      "type": "null"
    }
  }
}

Custom dialects

If needed, you can provide your own custom dialect configuration:

Dialect customDialect = new Dialect() {
    @Override
    public SpecificationVersion getSpecificationVersion() {
        return SpecificationVersion.DRAFT2020_12;
    }
    
    @Override
    public String getMetaSchema() {
        return "https://example.com/custom/schema";
    }
    
    @Override
    public EvaluatorFactory getEvaluatorFactory() {
        return new Draft2020EvaluatorFactory();
    }
    
    @Override
    public Set<String> getSupportedVocabularies() {
        return Collections.singleton("custom-vocabulary");
    }
    
    @Override
    public Set<String> getRequiredVocabularies() {
        return Collections.emptySet();
    }
    
    @Override
    public Map<String, Boolean> getDefaultVocabularyObject() {
        return Collections.singletonMap("custom-vocabulary", true);
    }
};
new ValidatorFactory().withDialect(customDialect);

See the documentation for more details.