Data validation

Validation basics

JSON Schema validation keywords

Ajv supports all validation keywords from draft-07 of the JSON Schema standard - see JSON Schema validation keywords for more details.

The ajv-keywords package provides additional validation keywords that can be used with Ajv.
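
A minimal sketch of adding the plugin keywords (assuming ajv-keywords is installed; the selective form with a keyword list is also supported):

import Ajv from "ajv"
import addKeywords from "ajv-keywords"

const ajv = new Ajv()
addKeywords(ajv) // adds all keywords from the plugin
// or add only specific keywords: addKeywords(ajv, ["typeof", "instanceof"])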

Annotation keywords

The JSON Schema specification defines several annotation keywords that describe the schema itself but do not perform any validation.

  • title and description: information about the data represented by that schema
  • $comment (NEW in draft-07): information for developers. With the $comment option Ajv logs or passes the comment string to the user-supplied function (a short sketch follows this list). See Options.
  • default: a default value of the data instance, see Assigning defaults.
  • examples (NEW in draft-06): an array of data instances. Ajv does not check the validity of these instances against the schema.
  • readOnly and writeOnly (NEW in draft-07): mark the data instance as read-only or write-only in relation to the source of the data (database, API, etc.).
  • contentEncoding: the content encoding of a string, per RFC 2045, e.g., "base64".
  • contentMediaType: the media type of a string, per RFC 2046, e.g., "image/png".
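
A short sketch of the $comment option (the schema here is illustrative):

const ajv = new Ajv({$comment: true}) // logs comments during validation
// or pass a function: new Ajv({$comment: (comment, schemaPath) => console.log(schemaPath, comment)})

const schema = {$comment: "for development only", type: "number"}
ajv.validate(schema, 1) // true; the comment is logged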

Please note: Ajv does not implement validation of the keywords examples, contentEncoding and contentMediaType but it reserves them. If you want to create a plugin that implements any of them, it should remove these keywords from the instance.

Formats

From version 7, Ajv does not include the formats defined by the JSON Schema specification - these and several other formats are provided by the ajv-formats plugin.

To add all formats from this plugin:

import Ajv from "ajv"
import addFormats from "ajv-formats"

const ajv = new Ajv()
addFormats(ajv)

See ajv-formats documentation for further details.

It is recommended NOT to use "format" keyword implementations with untrusted data, as they use potentially unsafe regular expressions - see ReDoS attack.

Please note: if you need to use the "format" keyword to validate untrusted data, you MUST assess the suitability and safety of the format implementations for your validation scenarios.

The following formats are defined in ajv-formats for string validation with "format" keyword:

  • date: full-date according to RFC3339.
  • time: time with optional time-zone.
  • date-time: date-time from the same source (time-zone is mandatory).
  • uri: full URI.
  • uri-reference: URI reference, including full and relative URIs.
  • uri-template: URI template according to RFC6570
  • url (deprecated): URL record.
  • email: email address.
  • hostname: host name according to RFC1034.
  • ipv4: IP address v4.
  • ipv6: IP address v6.
  • regex: tests whether a string is a valid regular expression by passing it to RegExp constructor.
  • uuid: Universally Unique IDentifier according to RFC4122.
  • json-pointer: JSON-pointer according to RFC6901.
  • relative-json-pointer: relative JSON-pointer according to this draft.

Please note: JSON Schema draft-07 also defines formats iri, iri-reference, idn-hostname and idn-email for URLs, hostnames and emails with international characters. These formats are available in ajv-formats-draft2019 plugin.

You can add and replace any formats using the addFormat method.
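
For example, a minimal sketch of adding formats (the format names here are illustrative; a regular expression or a validation function can be used):

ajv.addFormat("identifier", /^[a-z_$][a-zA-Z0-9_$]*$/)

// or with a definition object
ajv.addFormat("byte", {
  type: "number",
  validate: (x) => x >= 0 && x <= 255,
})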

Modular schemas

Combining schemas with $ref

You can structure your validation logic across multiple schema files and have schemas reference each other using $ref keyword.

Example:

const schema = {
  $id: "http://example.com/schemas/schema.json",
  type: "object",
  properties: {
    foo: {$ref: "defs.json#/definitions/int"},
    bar: {$ref: "defs.json#/definitions/str"},
  },
}

const defsSchema = {
  $id: "http://example.com/schemas/defs.json",
  definitions: {
    int: {type: "integer"},
    str: {type: "string"},
  },
}

Now, to compile your schema, you can either pass all schemas to the Ajv instance:

const ajv = new Ajv({schemas: [schema, defsSchema]})
const validate = ajv.getSchema("http://example.com/schemas/schema.json")

or use the addSchema method:

const ajv = new Ajv()
const validate = ajv.addSchema(defsSchema).compile(schema)

See Options and addSchema method.
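
The returned validating function can then be called with data - a minimal sketch continuing the example above:

const valid = validate({foo: 1, bar: "abc"})
if (!valid) console.log(validate.errors)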

Please note:

  • $ref is resolved as the uri-reference using schema $id as the base URI (see the example).
  • References can be recursive (and mutually recursive) to implement the schemas for different data structures (such as linked lists, trees, graphs, etc.).
  • You don't have to host your schema files at the URIs that you use as schema $id. These URIs are only used to identify the schemas, and according to JSON Schema specification validators should not expect to be able to download the schemas from these URIs.
  • The actual location of the schema file in the file system is not used.
  • You can pass the identifier of the schema as the second parameter of addSchema method or as a property name in schemas option. This identifier can be used instead of (or in addition to) schema $id.
  • You cannot have the same $id (or the same schema identifier) used for more than one schema - an exception will be thrown.
  • You can implement dynamic resolution of the referenced schemas using compileAsync method. In this way you can store schemas in any system (files, web, database, etc.) and reference them without explicitly adding to Ajv instance. See Asynchronous schema compilation.

$data reference

With the $data option you can use values from the validated data as the values for the schema keywords. See the proposal for more information about how it works.

$data reference is supported in the keywords: const, enum, format, maximum/minimum, exclusiveMaximum / exclusiveMinimum, maxLength / minLength, maxItems / minItems, maxProperties / minProperties, formatMaximum / formatMinimum, formatExclusiveMaximum / formatExclusiveMinimum, multipleOf, pattern, required, uniqueItems.

The value of "$data" should be a JSON-pointer to the data (the root is always the top level data object, even if the $data reference is inside a referenced subschema) or a relative JSON-pointer (it is relative to the current point in data; if the $data reference is inside a referenced subschema it cannot point to the data outside of the root level for this subschema).

Examples.

This schema requires that the value in the property smaller is less than or equal to the value in the property larger:

const ajv = new Ajv({$data: true})

const schema = {
  properties: {
    smaller: {
      type: "number",
      maximum: {$data: "1/larger"},
    },
    larger: {type: "number"},
  },
}

const validData = {
  smaller: 5,
  larger: 7,
}

ajv.validate(schema, validData) // true

This schema requires that the properties have the same format as their field names:

const schema = {
  additionalProperties: {
    type: "string",
    format: {$data: "0#"},
  },
}

const validData = {
  "date-time": "1963-06-19T08:30:06.283185Z",
  email: "joe.bloggs@example.com",
}

$data reference is resolved safely - it won't throw even if some property is undefined. If $data resolves to undefined, the validation succeeds (with the exception of the const keyword). If $data resolves to an incorrect type (e.g. not "number" for the maximum keyword), the validation fails.
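
A minimal sketch of this resolution behavior, reusing the ajv instance and schema from the first example above:

// "larger" is missing - the $data reference resolves to undefined,
// so the maximum keyword succeeds
ajv.validate(schema, {smaller: 5}) // true

// "larger" is not a number - the $data reference resolves to an incorrect type,
// so validation fails
ajv.validate(schema, {smaller: 5, larger: "7"}) // false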

$merge and $patch keywords

With the package ajv-merge-patch you can use the keywords $merge and $patch that allow extending JSON Schemas with patches in the JSON Merge Patch (RFC 7396) and JSON Patch (RFC 6902) formats.

To add the keywords $merge and $patch to an Ajv instance use this code:

const ajv = new Ajv()
require("ajv-merge-patch")(ajv)

Examples.

Using $merge:

{
  $merge: {
    source: {
      type: "object",
      properties: {p: {type: "string"}},
      additionalProperties: false
    },
    with: {
      properties: {q: {type: "number"}}
    }
  }
}

Using $patch:

{
  $patch: {
    source: {
      type: "object",
      properties: {p: {type: "string"}},
      additionalProperties: false
    },
    with: [{op: "add", path: "/properties/q", value: {type: "number"}}]
  }
}

The schemas above are equivalent to this schema:

{
  type: "object",
  properties: {
    p: {type: "string"},
    q: {type: "number"}
  },
  additionalProperties: false
}

The properties source and with in the keywords $merge and $patch can use absolute or relative $ref to point to other schemas previously added to the Ajv instance or to the fragments of the current schema.

See the package ajv-merge-patch for more information.

User-defined keywords

The advantages of defining keywords are:

  • allow creating validation scenarios that cannot be expressed using pre-defined keywords
  • simplify your schemas
  • help bring a bigger part of the validation logic into your schemas
  • make your schemas more expressive, less verbose and closer to your application domain
  • implement data processors that modify your data (modifying option MUST be used in keyword definition) and/or create side effects while the data is being validated

If a keyword is used only for side effects and its validation result is pre-defined, use the option valid: true/false in the keyword definition to simplify both the generated code (no error handling in the case of valid: true) and your keyword functions (no need to return any validation result).
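
A minimal sketch of such a side-effect-only keyword (the keyword name and the logging are illustrative):

ajv.addKeyword({
  keyword: "logMatched",
  valid: true, // validation result is pre-defined - no error handling is generated
  validate: (schema, data) => {
    console.log("matched", schema, data) // side effect only
  },
})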

The concerns to be aware of when extending the JSON Schema standard with additional keywords are the portability and comprehensibility of your schemas: you will have to support these keywords on other platforms and document them properly so that everybody can understand and use your schemas.

You can define keywords with addKeyword method. Keywords are defined on the ajv instance level - new instances will not have previously defined keywords.

Ajv allows defining keywords with:

  • code generation function (used by all pre-defined keywords)
  • validation function
  • compilation function
  • macro function

Example. range and exclusiveRange keywords using compiled schema:

ajv.addKeyword({
  keyword: "range",
  type: "number",
  schemaType: "array",
  implements: "exclusiveRange",
  compile: ([min, max], parentSchema) =>
    parentSchema.exclusiveRange === true
      ? (data) => data > min && data < max
      : (data) => data >= min && data <= max,
})

const schema = {range: [2, 4], exclusiveRange: true}
const validate = ajv.compile(schema)
console.log(validate(2.01)) // true
console.log(validate(3.99)) // true
console.log(validate(2)) // false
console.log(validate(4)) // false
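
And a hedged sketch of a macro keyword - a hypothetical even keyword that expands into an equivalent schema using multipleOf:

ajv.addKeyword({
  keyword: "even",
  type: "number",
  schemaType: "boolean",
  macro: (schema) => (schema ? {multipleOf: 2} : {}), // expanded schema is applied in addition
})

ajv.compile({type: "number", even: true})(4) // true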

Several keywords (typeof, instanceof, range and propertyNames) are defined in the ajv-keywords package - they can be used in your schemas and as a starting point for your own keywords.

See User-defined keywords for more details.

Asynchronous schema compilation

During asynchronous compilation, remote references are loaded using the supplied function. See the compileAsync method and the loadSchema option.

Example:

const ajv = new Ajv({loadSchema: loadSchema})

ajv.compileAsync(schema).then(function (validate) {
  const valid = validate(data)
  // ...
})

function loadSchema(uri) {
  // "request" stands for any HTTP client that returns a promise
  return request.json(uri).then(function (res) {
    if (res.statusCode >= 400) throw new Error("Loading error: " + res.statusCode)
    return res.body
  })
}

Please note: Option missingRefs should NOT be set to "ignore" or "fail" for asynchronous compilation to work.

Asynchronous validation

Example in Node.js REPL: https://runkit.com/esp/ajv-asynchronous-validation

You can define formats and keywords that perform validation asynchronously by accessing a database or some other service. You should add async: true in the keyword or format definition (see addFormat, addKeyword and User-defined keywords).

If your schema uses asynchronous formats/keywords or refers to some schema that contains them, it should have the "$async": true keyword so that Ajv can compile it correctly. If an asynchronous format/keyword or a reference to an asynchronous schema is used in a schema without the $async keyword, Ajv will throw an exception during schema compilation.

Please note: all asynchronous subschemas that are referenced from the current or other schemas should have "$async": true keyword as well, otherwise the schema compilation will fail.

The validation function for an asynchronous format/keyword should return a promise that resolves with true or false (or rejects with new Ajv.ValidationError(errors) if you want to return errors from the keyword function).

Ajv compiles asynchronous schemas to async functions. Async functions are supported in Node.js 7+ and all modern browsers. You can supply a transpiler as a function via processCode option. See Options.

The compiled validation function has $async: true property (if the schema is asynchronous), so you can differentiate these functions if you are using both synchronous and asynchronous schemas.

Validation result will be a promise that resolves with validated data or rejects with an exception Ajv.ValidationError that contains the array of validation errors in errors property.

Example:

const ajv = new Ajv()

ajv.addKeyword({
  keyword: "idExists"
  async: true,
  type: "number",
  validate: checkIdExists,
})

function checkIdExists(schema, data) {
  return knex(schema.table)
    .select("id")
    .where("id", data)
    .then(function (rows) {
      return !!rows.length // true if record is found
    })
}

const schema = {
  $async: true,
  properties: {
    userId: {
      type: "integer",
      idExists: {table: "users"},
    },
    postId: {
      type: "integer",
      idExists: {table: "posts"},
    },
  },
}

const validate = ajv.compile(schema)

validate({userId: 1, postId: 19})
  .then(function (data) {
    console.log("Data is valid", data) // { userId: 1, postId: 19 }
  })
  .catch(function (err) {
    if (!(err instanceof Ajv.ValidationError)) throw err
    // data is invalid
    console.log("Validation errors:", err.errors)
  })
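
A similar approach works for formats - a hedged sketch of an asynchronous format definition (the format name, the users table and the knex query are illustrative):

ajv.addFormat("registeredEmail", {
  async: true,
  type: "string",
  validate: (email) =>
    knex("users")
      .select("id")
      .where("email", email)
      .then((rows) => rows.length > 0), // resolves with true if the email is found
})

// a schema using an async format must also have the $async keyword
const validateEmail = ajv.compile({$async: true, type: "string", format: "registeredEmail"})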

Using transpilers

const ajv = new Ajv({processCode: transpileFunc})
const validate = ajv.compile(schema) // transpiled es7 async function
validate(data).then(successFunc).catch(errorFunc)

See Options.

Modifying data during validation

Removing additional properties

With the option removeAdditional (added by andyscott) you can filter data during validation.

This option modifies original data.

Example:

const ajv = new Ajv({removeAdditional: true})
const schema = {
  additionalProperties: false,
  properties: {
    foo: {type: "number"},
    bar: {
      additionalProperties: {type: "number"},
      properties: {
        baz: {type: "string"},
      },
    },
  },
}

const data = {
  foo: 0,
  additional1: 1, // will be removed; `additionalProperties` == false
  bar: {
    baz: "abc",
    additional2: 2, // will NOT be removed; `additionalProperties` != false
  },
}

const validate = ajv.compile(schema)

console.log(validate(data)) // true
console.log(data) // { "foo": 0, "bar": { "baz": "abc", "additional2": 2 } }

If the removeAdditional option in the example above were "all", then both additional1 and additional2 properties would have been removed.

If the option were "failing", then the property additional1 would have been removed regardless of its value, and the property additional2 would have been removed only if its value were failing the schema in the inner additionalProperties (so in the example above it would have stayed because it passes the schema, but any non-number would have been removed).
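
A minimal sketch of the "failing" option with the schema above (the variable names and data values are illustrative):

const ajvFailing = new Ajv({removeAdditional: "failing"})
const validateFailing = ajvFailing.compile(schema) // the schema from the example above

const data2 = {
  foo: 0,
  additional1: 1, // removed regardless of its value - additionalProperties is false at the top level
  bar: {
    baz: "abc",
    additional2: "abc", // removed because it fails the inner {type: "number"} schema
  },
}

validateFailing(data2)
console.log(data2) // { "foo": 0, "bar": { "baz": "abc" } }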

Please note: if you use the removeAdditional option with the additionalProperties keyword inside anyOf/oneOf keywords, your validation can fail with a schema like this one:

{
  type: "object",
  oneOf: [
    {
      properties: {
        foo: {type: "string"}
      },
      required: ["foo"],
      additionalProperties: false
    },
    {
      properties: {
        bar: {type: "integer"}
      },
      required: ["bar"],
      additionalProperties: false
    }
  ]
}

The intention of the schema above is to allow objects with either the string property "foo" or the integer property "bar", but not with both and not with any other properties.

With the option removeAdditional: true the validation will pass for the object {"foo": "abc"} but will fail for the object {"bar": 1}. This happens because, while the first subschema in oneOf is validated, the property bar is removed - it is an additional property according to the standard, as it is not included in the properties keyword of the same subschema.

While this behaviour is unexpected (see issues #129, #134), it is correct. To get the expected behaviour (both objects are allowed and additional properties are removed) the schema has to be refactored in this way:

{
  type: "object",
  properties: {
    foo: {type: "string"},
    bar: {type: "integer"}
  },
  additionalProperties: false,
  oneOf: [{required: ["foo"]}, {required: ["bar"]}]
}

The schema above is also more efficient - it will compile into a faster function.

Assigning defaults

With the option useDefaults Ajv will assign values from the default keyword in the schemas of properties and items (when it is an array of schemas) to the missing properties and items.

With the option value "empty", properties and items equal to null or "" (an empty string) will be considered missing and assigned defaults (see Example 3 below).

This option modifies original data.

Please note: the default value is inserted in the generated validation code as a literal, so the value inserted in the data will be the deep clone of the default in the schema.

Example 1 (default in properties):

const ajv = new Ajv({useDefaults: true})
const schema = {
  type: "object",
  properties: {
    foo: {type: "number"},
    bar: {type: "string", default: "baz"},
  },
  required: ["foo", "bar"],
}

const data = {foo: 1}

const validate = ajv.compile(schema)

console.log(validate(data)) // true
console.log(data) // { "foo": 1, "bar": "baz" }

Example 2 (default in items):

const schema = {
  type: "array",
  items: [{type: "number"}, {type: "string", default: "foo"}],
}

const data = [1]

const validate = ajv.compile(schema)

console.log(validate(data)) // true
console.log(data) // [ 1, "foo" ]
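
Example 3 ("empty" option value) - a minimal sketch of the behavior described above (the property name is illustrative):

const ajv = new Ajv({useDefaults: "empty"})
const schema = {
  type: "object",
  properties: {
    name: {type: "string", default: "anonymous"},
  },
}

const data = {name: ""}

console.log(ajv.validate(schema, data)) // true
console.log(data) // { "name": "anonymous" }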

With the useDefaults option, the default keyword throws an exception during schema compilation when used:

  • outside of properties or items subschemas
  • in schemas inside anyOf, oneOf and not (see #42)
  • in the if schema
  • in schemas generated by user-defined macro keywords

The strict mode option can change the behavior for these unsupported defaults (strict: false to ignore them, "log" to log a warning).

See Strict mode.

Coercing data types

When you are validating user input, all your data properties are usually strings. The option coerceTypes allows you to have your data types coerced to the types specified in your schema type keywords, both to pass the validation and to use the correctly typed data afterwards.

This option modifies original data.

Please note: if you pass a scalar value to the validating function its type will be coerced and it will pass the validation, but the value of the variable you pass won't be updated because scalars are passed by value.

Example 1:

const ajv = new Ajv({coerceTypes: true})
const schema = {
  type: "object",
  properties: {
    foo: {type: "number"},
    bar: {type: "boolean"},
  },
  required: ["foo", "bar"],
}

const data = {foo: "1", bar: "false"}

const validate = ajv.compile(schema)

console.log(validate(data)) // true
console.log(data) // { "foo": 1, "bar": false }

Example 2 (array coercions):

const ajv = new Ajv({coerceTypes: "array"})
const schema = {
  properties: {
    foo: {type: "array", items: {type: "number"}},
    bar: {type: "boolean"},
  },
}

const data = {foo: "1", bar: ["false"]}

const validate = ajv.compile(schema)

console.log(validate(data)) // true
console.log(data) // { "foo": [1], "bar": false }

The coercion rules, as you can see from the example, are different from JavaScript's, both to validate user input as expected and to keep the coercion reversible (to correctly validate cases where different types are defined in subschemas of "anyOf" and other compound keywords).

See Coercion rules for details.