Avro to JSON Schema, and back
Latest commit 30cec8a Apr 14, 2014 @fge Announce 0.1.4

Read me first

The license of this project is LGPLv3 or later. See file src/main/resources/LICENSE for the full text.

The current version is 0.1.4.

What this is

This package contains two processors (see json-schema-core) to convert Avro schemas to JSON Schemas, and the reverse.

Status

Avro schemas to JSON Schema

This processor can transform any Avro schema, as long as the schema is self-contained. The generated JSON Schemas accurately validate JSON representations of Avro data, with two exceptions:

  • as JSON has no notion of order, the order property of Avro records is not enforced;
  • Avro's float and double are validated as JSON numbers with no minimum or maximum (see below for why). Note, however, that the limits of int and long are enforced.
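To make the int and long limits concrete, here is a minimal sketch of the range check that such limits amount to. The class and method names are hypothetical and are not part of this package. It only illustrates the bounds of Avro's 32-bit int and 64-bit long, and it assumes an integer is written with no fractional or exponent part.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration only; not part of json-schema-avro.
final class AvroIntegerRange {
    private static final BigInteger INT_MIN = BigInteger.valueOf(Integer.MIN_VALUE);
    private static final BigInteger INT_MAX = BigInteger.valueOf(Integer.MAX_VALUE);
    private static final BigInteger LONG_MIN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger LONG_MAX = BigInteger.valueOf(Long.MAX_VALUE);

    // Assumption: an integer literal has no fractional part and no exponent.
    static boolean isIntegerLiteral(String jsonNumber) {
        return jsonNumber.matches("-?(0|[1-9][0-9]*)");
    }

    // True if the literal is an integer within Avro's 32-bit int range.
    static boolean fitsInt(String jsonNumber) {
        return inRange(jsonNumber, INT_MIN, INT_MAX);
    }

    // True if the literal is an integer within Avro's 64-bit long range.
    static boolean fitsLong(String jsonNumber) {
        return inRange(jsonNumber, LONG_MIN, LONG_MAX);
    }

    private static boolean inRange(String jsonNumber, BigInteger min, BigInteger max) {
        if (!isIntegerLiteral(jsonNumber)) {
            return false;
        }
        BigInteger value = new BigInteger(jsonNumber);
        return value.compareTo(min) >= 0 && value.compareTo(max) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(fitsInt("2147483647"));  // largest int
        System.out.println(fitsInt("2147483648"));  // one past the largest int
        System.out.println(fitsLong("2147483648")); // still a valid long
    }
}
```

Because integers form a discrete domain, the comparison is exact: a value either lies within the bounds or it does not.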

Note that this processor is demoed online here.

JSON Schema to Avro schemas

This processor is not complete yet. It is _much_ more difficult to write than the reverse for two reasons:

  • JSON Schema can describe a much broader set of data than Avro (Avro can only have strings in enums, for instance, while enums in JSON Schema can have any JSON value);
  • but Avro has notions which are not available in JSON (property order in records, binary types).
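The enum mismatch can be sketched as a simple compatibility check. The helper below is hypothetical and does not show how this processor works internally. It assumes the JSON enum values have already been parsed into plain Java objects. It also relies on the Avro specification's rule that enum symbols must be unique and must match `[A-Za-z_][A-Za-z0-9_]*`, a rule this README does not state.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of json-schema-avro.
final class EnumCompatibility {
    // Avro names (including enum symbols) must match this pattern.
    private static final Pattern AVRO_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // True if every JSON enum value could be used as an Avro enum symbol:
    // each value must be a string, a valid Avro name, and unique.
    static boolean isRepresentableAsAvroEnum(List<?> jsonEnumValues) {
        Set<String> seen = new HashSet<String>();
        for (Object value : jsonEnumValues) {
            if (!(value instanceof String)) {
                return false; // numbers, booleans, null, objects, arrays
            }
            String symbol = (String) value;
            if (!AVRO_NAME.matcher(symbol).matches()) {
                return false; // a string, but not a legal Avro symbol
            }
            if (!seen.add(symbol)) {
                return false; // duplicate symbol
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isRepresentableAsAvroEnum(Arrays.asList("RED", "GREEN")));
        System.out.println(isRepresentableAsAvroEnum(Arrays.asList("RED", 42, null)));
    }
}
```

A JSON Schema enum such as `["RED", 42, null]` fails this check, so a converter has to fall back to some other Avro construct or reject the schema.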

The generated Avro schemas are however reasonably good, and cover a very large subset of JSON Schema usages.

This processor is not available online yet; it will soon be.

Why limits are not enforced on Avro's float and double

JSON Schema has minimum and maximum to constrain the value of a JSON number, but JSON itself (RFC 4627, section 2.4) places no limit on the scale or precision of numbers.

That is the first reason. One may then ask why there are limits for int and long. There are two reasons for this:

  • JSON Schema defines integer (as a number with no fractional and/or exponent part); integer being a discrete domain, such limits can therefore be defined without room for error;
  • but Avro's float and double are IEEE 754 floating point numbers; they do have minimums and maximums, but 0.1, for instance, cannot even be represented exactly in a double.

Defining limits would therefore not ensure that the JSON number being validated can indeed fit into the corresponding Avro type.
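The point about 0.1 can be checked with a few lines of Java. This is only a sketch: the class and method names are hypothetical, and it uses `java.math.BigDecimal` from the standard library to compare a number as written with the double it converts to.

```java
import java.math.BigDecimal;

// Hypothetical helper for illustration only; not part of json-schema-avro.
final class DoubleExactness {
    // True if the decimal literal converts to a double with no loss at all.
    static boolean isExactDouble(String jsonNumber) {
        BigDecimal written = new BigDecimal(jsonNumber);
        double converted = written.doubleValue();
        if (Double.isInfinite(converted)) {
            return false; // magnitude exceeds the range of a double
        }
        // new BigDecimal(double) yields the exact binary value of the double.
        return new BigDecimal(converted).compareTo(written) == 0;
    }

    public static void main(String[] args) {
        // Prints 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(new BigDecimal(0.1));
        System.out.println(isExactDouble("0.1")); // within range, yet inexact
        System.out.println(isExactDouble("0.5")); // a power of two, exact
    }
}
```

The value 0.1 sits well inside any minimum and maximum one could define for a double, yet it still cannot be stored exactly, which is why range limits alone would not guarantee a faithful conversion.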

Maven artifact

Replace your-version-here with the appropriate version:

<dependency>
    <groupId>com.github.fge</groupId>
    <artifactId>json-schema-avro</artifactId>
    <version>your-version-here</version>
</dependency>