Classy Hash

Keep Your Hashes Classy (a lightweight Hash validation gem)

Classy Hash is a lightweight RubyGem for validating Ruby hashes against a simple schema Hash that indicates what data types are expected. Classy Hash will make sure your data matches, providing helpful error messages if it doesn't.

Classy Hash is fantastic for helping developers become familiar with an API, by letting them know exactly what they did wrong. It also guards against mistakes by verifying that incoming data meets expectations, and can serve as a convenient data structure documentation format.

Why Classy Hash?

Classy Hash was created as a lightweight alternative to the other good validation gems available. By taking advantage of built-in Ruby language features, Classy Hash can validate common Hashes much faster than some of the other gems we tested, with a dead simple schema syntax.

Classy Hash doesn't modify your Hashes or patch any core classes, so it's safe to use just about anywhere.

Classy Hash is thoroughly tested (see the Testing section below).

Finally, Classy Hash is fast:

Valid hashes:

   Serializer    |        Validator         |   Ops    |  Ops/sec   |  Alloc/op  |   Ops/GC  
-----------------+--------------------------+----------+------------+------------+-----------
 msgpack         | no_op                    |   200000 |   109200.1 |       28.0 |     1886.8
 msgpack         | classy_hash_no_raise     |   200000 |    45786.7 |       32.0 |     1666.7
 msgpack         | classy_hash_full         |   200000 |    45744.4 |       32.0 |     1666.7
 msgpack         | classy_hash_errors_array |   200000 |    45577.6 |       31.0 |     1724.1
 msgpack         | classy_hash              |   200000 |    45421.5 |       31.0 |     1724.1
 msgpack         | classy_hash_strict       |   200000 |    37348.8 |       43.0 |     1257.9
 msgpack         | classy_hash_full_strict  |   200000 |    37024.5 |       44.0 |     1234.6
 msgpack         | hash_validator           |   100000 |    22241.5 |       73.0 |      757.6
 msgpack         | schema_hash              |    50000 |    18942.2 |      118.0 |      463.0
 msgpack         | json_schema              |     8000 |     1207.1 |     1000.1 |       56.3
 msgpack         | json_schema_strict       |     8000 |     1195.6 |     1009.0 |       55.9
 msgpack         | json_schema_full         |     8000 |     1189.5 |     1013.0 |       55.6


Invalid hashes:

   Serializer    |        Validator         |   Ops    |  Ops/sec   |  Alloc/op  |   Ops/GC  
-----------------+--------------------------+----------+------------+------------+-----------
 msgpack         | classy_hash              |   500000 |    55524.3 |       28.2 |     1865.7
 msgpack         | classy_hash_no_raise     |   500000 |    54028.7 |       30.6 |     1730.1
 msgpack         | classy_hash_strict       |   500000 |    45824.1 |       33.8 |     1577.3
 msgpack         | classy_hash_errors_array |   500000 |    40601.2 |       34.6 |     1533.7
 msgpack         | classy_hash_full         |   500000 |    39895.5 |       35.4 |     1506.0
 msgpack         | classy_hash_full_strict  |   500000 |    35024.4 |       41.0 |     1312.3
 msgpack         | hash_validator           |   250000 |    20873.9 |       69.0 |      793.7
 msgpack         | json_schema_strict       |    20000 |     1399.6 |      887.4 |       63.3
 msgpack         | json_schema              |    20000 |     1398.0 |      891.6 |       63.1
 msgpack         | json_schema_full         |    20000 |     1281.2 |      956.4 |       58.8

Examples

A Classy Hash schema can be as simple or as complex as you like. At the most basic level, you list each key your Hash is required to contain, with the expected Ruby class of the value.

For more examples, see benchmark.rb and the RSpec files under spec/. For complete documentation of all parameters, see lib/classy_hash.rb.

Simple example

Let's look at a simple schema for a Hash with three members:

schema = {
  key1: String,
  key2: Integer,
  key3: TrueClass
}

This specifies a Hash with a String, an Integer, and a boolean value (both TrueClass and FalseClass will accept true and false). Here's how we validate a Hash against our schema:

hash = {
  key1: 'A Hash with class',
  key2: 0,
  key3: false
}

ClassyHash.validate(hash, schema) # Returns true

Here's what happens if we try to validate an invalid Hash:

hash = {
  key1: 'A less classy Hash',
  key2: 1.25,
  key3: 'Also wrong, but not checked'
}

ClassyHash.validate(hash, schema) # Raises ":key2 is not a/an Integer"

The validate method raises an exception if validation fails (this can be changed by passing raise_errors: false; see below). Validation proceeds until the first invalid value is found, then ClassyHash::SchemaViolationError is raised for that value. Later values are not checked unless you run a full validation with full: true.

Controlling validation

The ClassyHash.validate method accepts several named parameters for controlling validation. For complete details, see lib/classy_hash.rb.

Strict validation

You can pass strict: true as a keyword argument to validate to raise an error if the input hash contains any members not specified in the schema. Passing verbose: true will include the names of the unexpected hash keys in the generated error message (a potential security risk in some settings). See the inline documentation in the source code for more details. As of version 0.2.0, all nested schemas will also be checked for unexpected members.

Example:

# Raises "Top level is not valid: contains members not specified in schema"
ClassyHash.validate({a: 1, b: 2}, {c: Integer}, strict: true)

# Raises "Top level is not valid: contains members :a, :b not specified in schema"
ClassyHash.validate({a: 1, b: 2}, {c: Integer}, strict: true, verbose: true)

# Raises ":a is not valid: contains members :b, :c not specified in schema"
ClassyHash.validate({a: {b: 1, c: 2}}, {a: {a: Integer}}, strict: true, verbose: true)

Full validation

If you'd like to capture all errors, you can pass full: true. If you don't also pass raise_errors: false, full validation will simply raise an error that includes all the violations in the message:

schema = {
  key1: String,
  key2: Integer,
  key3: TrueClass
}

hash = {
  key1: 'A less classy Hash',
  key2: 1.25,
  key3: 'Also wrong'
}

begin
  ClassyHash.validate(hash, schema, full: true) # Raises ":key2 is not a/an Integer, :key3 is not true or false"
rescue => e
  puts e.message

  # Individual errors are in the .entries array from the exception, just like
  # the :errors option described below.
  puts e.entries.inspect
end

Errors array, exceptionless validation

If you pass an empty array into :errors, your application code can handle the validation errors directly. If you pass raise_errors: false, .validate will return false for failed validations. Only the first error will be added to the :errors array, unless you pass full: true. These options can be used independently.

# Using schema and hash from the previous example

errors = []
ClassyHash.validate(hash, schema, errors: errors, raise_errors: false, full: true) # Returns false

# Now, errors is [{full_path: ":key2", message: "a/an Integer"}, {full_path: ":key3", message: "true or false"}]

Whether you use exceptions or a false return (with or without an :errors array) is up to your preferences. Note that if you use raise_errors: false, there is no way to obtain error messages without passing an :errors array.
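
As a minimal sketch of handling the :errors array in application code, the snippet below turns entries shaped like the ones shown above into readable messages. The format_errors helper is hypothetical (it is not part of Classy Hash), and the sample array is copied from the comment in the previous example rather than produced by a live call.

```ruby
# Hypothetical helper (not part of Classy Hash): turns entries shaped like
# { full_path: ":key2", message: "a/an Integer" } into readable strings.
def format_errors(errors)
  errors.map { |e| "#{e[:full_path]} is not #{e[:message]}" }
end

# Sample data copied from the comment in the example above.
errors = [
  { full_path: ':key2', message: 'a/an Integer' },
  { full_path: ':key3', message: 'true or false' }
]

puts format_errors(errors)
# :key2 is not a/an Integer
# :key3 is not true or false
```

Keeping the path and the message separate like this makes it straightforward to build per-field responses, for example in an API error payload.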

Multiple choice constraints

It's possible to specify more than one option for a key, allowing multiple types and/or nil to be used as the value for a key:

schema = {
  key1: [ NilClass, String, FalseClass ]
}

ClassyHash.validate({ key1: nil }, schema) # Returns true
ClassyHash.validate({ key1: 'Hi' }, schema) # Returns true
ClassyHash.validate({ key1: true }, schema) # Returns true
ClassyHash.validate({ key1: 1337 }, schema) # Raises ":key1 is not one of a/an NilClass, a/an String, true or false"

Optional keys

Classy Hash will raise an error if a key from the schema is missing from the hash being validated:

schema = {
  key1: TrueClass
}

ClassyHash.validate({}, schema) # Raises ":key1 is not present"

If we want to allow a key to be omitted, we can mark it as optional by adding the :optional symbol as the first element of a multiple choice array:

schema = {
  key1: [:optional, TrueClass]
}

ClassyHash.validate({}, schema) # Returns true
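
A key that is absent and a key that is present with a nil value are distinct in Ruby, which is why :optional and NilClass are separate schema options. The plain-Ruby sketch below shows that distinction only; it makes no claim about how Classy Hash treats a present-but-nil value under an :optional constraint.

```ruby
absent  = {}
present = { key1: nil }

# Indexing cannot tell the two cases apart...
p absent[:key1]         # => nil
p present[:key1]        # => nil

# ...but Hash#key? can.
p absent.key?(:key1)    # => false
p present.key?(:key1)   # => true
```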

Regular expressions

Regexp constraints, added in 0.1.4, will require values to be Strings that match a regular expression:

schema = {
  key1: /Re.*quired/i
}

ClassyHash.validate({ key1: /required/ }, schema) # Raises ":key1 is not a String matching /Re.*quired/i"
ClassyHash.validate({ key1: 'invalid' }, schema) # Raises ":key1 is not a String matching /Re.*quired/i"
ClassyHash.validate({ key1: 'The regional manager inquired about ClassyHash' }, schema) # Returns true

As with Ruby's =~ operator, Regexps can match anywhere in the String. To require the entire String to match, use the standard \A and \z anchors:

schema = {
  key1: /\AStart.*end\z/
}

ClassyHash.validate({ key1: 'One must Start to end' }, schema) # Raises ":key1 is not a String matching /\\AStart.*end\\z/"
ClassyHash.validate({ key1: 'Start now, continue to the end' }, schema) # Returns true
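
The difference between unanchored and anchored patterns can be checked in plain Ruby, without Classy Hash. This sketch also shows why \A and \z are preferable to ^ and $, which match at line boundaries rather than at the ends of the String. The sample strings and variable names are made up for illustration.

```ruby
unanchored = /Start.*end/
anchored   = /\AStart.*end\z/
line_based = /^Start.*end$/

text = 'One must Start to end'

# =~ returns the index of the match, or nil when there is none.
p unanchored =~ text                    # => 9
p anchored =~ text                      # => nil
p anchored =~ 'Start now, to the end'   # => 0

# ^ and $ match at line boundaries, so a multi-line String can slip through.
multiline = "junk\nStart to end\nmore junk"
p line_based.match?(multiline)          # => true
p anchored.match?(multiline)            # => false
```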

Ranges

If you want to check more than just the type of a value, you can specify a Range as a constraint. If your Range endpoints are Integers, Numerics, or Strings, then Classy Hash will also restrict the type of the value to Integer, Numeric, or String.

# An Integer range
schema = {
  key1: 1..10
}

ClassyHash.validate({ key1: 5 }, schema) # Returns true
ClassyHash.validate({ key1: -5 }, schema) # Raises ":key1 is not an Integer in range 1..10"
ClassyHash.validate({ key1: 2.5 }, schema) # Raises ":key1 is not an Integer in range 1..10"

# A more interesting range -- this use is not recommended :-)
schema = {
  key1: [1]..[5]
}

ClassyHash.validate({ key1: [2, 3, 4] }, schema) # Returns true
ClassyHash.validate({ key1: [5, 0] }, schema) # Raises ":key1 is not in range [1]..[5]"
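
A bare Ruby Range compares values without regard to type, which is why the type restriction described above matters. The sketch below contrasts Range#cover? with a hypothetical in_typed_range? helper that mimics the documented behaviour for Integer, Numeric, and String ranges. It is an illustration under that assumption, not Classy Hash's implementation.

```ruby
range = 1..10

# A bare Range only compares magnitudes, so a Float falls inside 1..10.
p range.cover?(5)     # => true
p range.cover?(2.5)   # => true
p range.cover?(-5)    # => false

# Hypothetical helper: require the value's type to match the kind of the
# Range endpoints (Integer, Numeric or String) before comparing.
def in_typed_range?(range, value)
  type = [Integer, Numeric, String].find do |t|
    range.begin.is_a?(t) && range.end.is_a?(t)
  end
  (type.nil? || value.is_a?(type)) && range.cover?(value)
end

p in_typed_range?(1..10, 5)        # => true
p in_typed_range?(1..10, 2.5)      # => false
p in_typed_range?(1.0..10.0, 5)    # => true
p in_typed_range?('a'..'c', 'b')   # => true
```

A Float range therefore accepts any Numeric, including Integers, while an Integer range rejects Floats.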

Procs

If nothing else will do, you can pass a Proc/lambda. The Proc should have no side effects, as it may be called more than once per value. When using a Proc, you should accept exactly one parameter and return true if validation succeeds. Any other value will be treated as a validation failure. If the Proc returns a String, that string will be used in the error message.

# A lambda without an error message
schema = {
  key1: ->(v){ v.is_a?(Integer) && v.odd? }
}

ClassyHash.validate({ key1: 1 }, schema) # Returns true
ClassyHash.validate({ key1: 2 }, schema) # Raises ":key1 is not accepted by Proc"

# A lambda with an error message
schema = {
  key1: ->(v){ (v.is_a?(Integer) && v.odd?) || 'an odd integer' }
}

ClassyHash.validate({ key1: 1 }, schema) # Returns true
ClassyHash.validate({ key1: 2 }, schema) # Raises ":key1 is not an odd integer"
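
Because a constraint Proc is just a callable that returns true or an error description, reusable checks can be built with ordinary methods that return lambdas. The multiple_of helper below is hypothetical and only follows the return convention described above. The lambda is called directly here so the sketch runs without Classy Hash.

```ruby
# Hypothetical constraint builder: returns a lambda that follows the
# documented convention (true on success, a String describing the
# expected value on failure).
def multiple_of(n)
  ->(v) { (v.is_a?(Integer) && (v % n).zero?) || "a multiple of #{n}" }
end

check = multiple_of(3)

p check.call(9)      # => true
p check.call(10)     # => "a multiple of 3"
p check.call('9')    # => "a multiple of 3"

# The lambda could then be used as a schema value, e.g.
#   schema = { key1: multiple_of(3) }
```

The lambda has no side effects, so it is safe even if it is called more than once per value.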

Sets

Added in version 0.2.0, Sets constrain a value to one of a list of values. The Set constraint replaces the enum generator. Note that a Set requires an exact value match, unlike the Multiple Choice constraint or Composite generator.

schema = {
  key1: Set.new([1, 2, 3, 'see?'])
}

ClassyHash.validate({ key1: 1 }, schema) # Returns true
ClassyHash.validate({ key1: 'see?' }, schema) # Returns true
ClassyHash.validate({ key1: 4 }, schema) # Raises ':key1 is not an element of [1, 2, 3, "see?"]'
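
To illustrate what "exact value match" means, the sketch below uses Ruby's own Set#include? (an assumption about how membership is decided, not a statement about Classy Hash internals). Set membership is based on eql? and hash, so a Float never matches an Integer element, and the class of a value plays no role on its own.

```ruby
require 'set'

allowed = Set.new([1, 2, 3, 'see?'])

p allowed.include?(1)        # => true
p allowed.include?('see?')   # => true

# 1 == 1.0 is true, but Set membership uses eql?, which also compares type.
p 1 == 1.0                   # => true
p allowed.include?(1.0)      # => false

# Another Integer is not accepted just because it shares a class with a member.
p allowed.include?(4)        # => false
```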

Nested schemas

Classy Hash accepts nested schemas. You can also use a schema as one of the options in a multiple choice key.

schema = {
  key1: {
    msg: String
  },
  key2: {
    n1: [Integer, { y: Numeric }]
  }
}

hash1 = {
  key1: { msg: 'Valid' },
  key2: { n1: { y: 1.0 } }
}

hash2 = {
  key1: { msg: 'Also valid' },
  key2: { n1: -1 }
}

hash3 = {
  key1: { msg: false },
  key2: { n1: 1 }
}

hash4 = {
  key1: { msg: 'Not valid' },
  key2: { n1: { y: false } }
}

ClassyHash.validate(hash1, schema) # Returns true
ClassyHash.validate(hash2, schema) # Returns true
ClassyHash.validate(hash3, schema) # Raises ":key1[:msg] is not a/an String"
ClassyHash.validate(hash4, schema) # Raises ":key2[:n1][:y] is not a/an Numeric, :key2[:n1] is not one of a/an Integer, a Hash matching {schema with keys [:y]}"
ClassyHash.validate({ key1: false }, schema) # Raises ":key1 is not a Hash matching {schema with keys [:msg]}"

Complex nested multiple choice constraints can lead to confusing error messages, slower performance, and increased memory consumption. For best results, try to push multiple choice options as deep into the schema as possible, or use your own code to decide which schema to pass to ClassyHash. For example:

hash = { data: { key: 1.0 }}

# Not recommended (confusing errors, slower validation)
bad_schema = {
  data: [ { key: String }, { key: Integer } ]
}
# Raises :data[:key] is not a/an String, :data is not one of a Hash matching {schema with keys [:key]}, a Hash matching {schema with keys [:key]}
CH.validate(hash, bad_schema)

# Recommended
good_schema = {
  data: { key: [ String, Integer ] }
}
# Raises :data[:key] is not a/an String, :data[:key] is not one of a/an String, a/an Integer
CH.validate(hash, good_schema)

# Alternative
schema1 = {
  data: { key: String }
}
schema2 = {
  data: { key: Integer }
}
# Raises ":data[:key] is not a/an Integer" or ":data[:key] is not a/an String"
CH.validate(hash, api_v2 ? schema2 : schema1)

Arrays

You can use Classy Hash to validate the members of an array. Array constraints are specified by double-array-wrapping a multiple choice list. Array constraints can also themselves be part of a multiple choice list or array constraint. Empty arrays are always accepted by array constraints.

If the error messages are too verbose, you can pass in an :errors array or retrieve the first entry from the exception's .entries. Typically the first entry will be the most useful, but this is not guaranteed.

# Simple array of integers
schema = {
  key1: [[Integer]]
}

ClassyHash.validate({ key1: [] }, schema) # Returns true
ClassyHash.validate({ key1: [1, 2, 3, 4, 5] }, schema) # Returns true
ClassyHash.validate({ key1: [1, 2, 3, 0.5] }, schema) # Raises ":key1[3] is not a/an Integer, :key1[3] is not one of a/an Integer"
ClassyHash.validate({ key1: false }, schema) # Raises ":key1 is not an Array of one of a/an Integer"

# An integer, or an array of arrays of strings
schema = {
  key1: [Integer, [[ [[ String ]] ]]]
}

ClassyHash.validate({ key1: 1 }, schema) # Returns true
ClassyHash.validate({ key1: [ [], ['a'], ['b', 'c'] ] }, schema) # Returns true

# Raises :key1[0] is not an Array of one of a/an String, :key1[0] is not an
# Array of an Array of one of a/an String, :key1 is not one of a/an Integer, an
# Array of an Array of an Array of one of a/an String
ClassyHash.validate({ key1: ['bad'] }, schema)

# Raises :key1[2][1] is not a/an String, :key1[2][1] is not one of a/an String,
# :key1[2] is not an Array of an Array of one of a/an String, :key1 is not one
# of a/an Integer, an Array of an Array of an Array of one of a/an String
ClassyHash.validate({ key1: [ [], ['a'], ['b', false] ] }, schema)

If you want to check the length of an array, you can use a Proc (also see CH::G.array_length in the Generators section below):

# An array of two integers
schema = {
  key1: ->(v){
    if v.is_a?(Array) && v.length == 2
      begin
        ClassyHash.validate({k: v}, {k: [[Integer]]})
        true
      rescue => e
        "valid: #{e}"
      end
    else
      "an array of length 2"
    end
  }
}

ClassyHash.validate({ key1: [1, 2] }, schema) # Returns true
ClassyHash.validate({ key1: [1, false] }, schema) # Raises ":key1 is not valid: :k[1] is not one of a/an Integer"
ClassyHash.validate({ key1: [1] }, schema) # Raises ":key1 is not an array of length 2"

Generators

Version 0.1.1 of Classy Hash introduced some helper methods in ClassyHash::Generate (or the CH::G alias introduced in 0.1.2) that generate constraints for common tasks that are difficult to represent in the base Classy Hash syntax.

Composite and negated constraints

You can combine multiple constraints in an AND or NOR fashion using the Composite generators, .all and .not. Because composite constraints can be complex and confusing, they should be used only when other approaches would be more complex and confusing. Composite constraints were added in version 0.2.0.

The .all generator requires all constraints to pass.

schema = {
  key1: CH::G.all(Integer, 1.0..100.0)
}
ClassyHash.validate({ key1: 5 }, schema) # Returns true
ClassyHash.validate({ key1: BigDecimal(5) }, schema) # Raises ":key1 is not all of [one of a/an Integer, a Numeric in range 1.0..100.0]"

The .not generator requires all constraints to fail.

schema = {
  key1: CH::G.not(Rational, BigDecimal, 'a'..'c', 10..15)
}
ClassyHash.validate({ key1: 5 }, schema) # Returns true
ClassyHash.validate({ key1: 10 }, schema) # Raises ':key1 is not none of [one of a/an Rational, a/an BigDecimal, a String in range "a".."c", an Integer in range 10..15]'
ClassyHash.validate({ key1: Rational(3, 5) }, schema) # Raises ':key1 is not none of [one of a/an Rational, a/an BigDecimal, a String in range "a".."c", an Integer in range 10..15]'
ClassyHash.validate({ key1: 'Good' }, schema) # Returns true
ClassyHash.validate({ key1: 'broken' }, schema) # Raises ':key1 is not none of [one of a/an Rational, a/an BigDecimal, a String in range "a".."c", an Integer in range 10..15]'

The .all and .not generators become more useful when combined:

schema = {
  # Note: this case could also be represented as key1: [1..9, 21..100]
  # Also note that Float ranges are used because ClassyHash only accepts
  # Integer values for Integer ranges; this is important for .not().
  key1: CH::G.all(Integer, 1.0..100.0, CH::G.not(10.0..20.0))
}
ClassyHash.validate({ key1: 9 }, schema) # Returns true
ClassyHash.validate({ key1: 10 }, schema) # Raises :key1 is not all of [one of a/an Integer, a Numeric in range 1.0..100.0, none of [one of a Numeric in range 10.0..20.0]]
ClassyHash.validate({ key1: 25.0 }, schema) # Raises :key1 is not all of [one of a/an Integer, a Numeric in range 1.0..100.0, none of [one of a Numeric in range 10.0..20.0]]

Note that .not may accept a value for reasons you don't expect: its parameters are treated as ordinary ClassyHash constraints, and it only requires that each of them fails for some reason. For example, CH::G.not(5..10) will allow 6.0 (which fails the Integer type restriction of an Integer range) but not 6.

Enumeration

As of version 0.2.0, the enum generator is a deprecated compatibility method that generates a Set. See the above documentation for Set constraints.

# Enumerator -- value must be one of the elements provided
schema = {
  key1: CH::G.enum(1, 2, 3, 4)
}

ClassyHash.validate({ key1: 1 }, schema) # Returns true
ClassyHash.validate({ key1: -1 }, schema) # Raises ":key1 is not an element of [1, 2, 3, 4]"

Arbitrary length

The arbitrary length generator checks the length of any type that responds to :length.

# Simple length generator -- length of value must be equal to a value, or
# within a range
schema = {
  key1: CH::G.length(5..6)
}

ClassyHash.validate({ key1: '123456' }, schema) # Returns true
ClassyHash.validate({ key1: {a: 1, b: 2, c: 3, d: 4, e: 5} }, schema) # Returns true
ClassyHash.validate({ key1: [1, 2] }, schema) # Raises ":key1 is not of length 5..6"
ClassyHash.validate({ key1: 5 }, schema) # Raises ":key1 is not a type that responds to :length"
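
The same idea can be expressed in a few lines of plain Ruby. The length_matches? helper is hypothetical and only illustrates the check described above (an exact length or a Range of lengths, for any object that responds to :length). It is not the generator's actual code.

```ruby
# Hypothetical helper: spec may be an Integer (exact length) or a Range.
# Both Integer#=== and Range#=== accept an Integer length, so one
# comparison covers both cases.
def length_matches?(value, spec)
  value.respond_to?(:length) && spec === value.length
end

p length_matches?('123456', 5..6)             # => true
p length_matches?({ a: 1, b: 2, c: 3 }, 3)     # => true
p length_matches?([1, 2], 5..6)               # => false
p length_matches?(5, 5..6)                    # => false (Integer has no #length)
```
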

String or Array length

Since checking the length of a String or an Array is very common, there are generators that will verify a value is the correct type and the correct length.

# String length generator
schema = {
  key1: CH::G.string_length(0..15)
}

ClassyHash.validate({ key1: 'x' * 15 }, schema) # Returns true
ClassyHash.validate({ key1: 'x' * 16 }, schema) # Raises ":key1 is not a String of length 0..15"
ClassyHash.validate({ key1: false }, schema) # Raises ":key1 is not a String of length 0..15"

The Array length constraint generator also checks the values of the array.

# Array length generator
schema = {
  key1: CH::G.array_length(4, Integer, String)
}

ClassyHash.validate({ key1: [1, 'two', 3, 4] }, schema) # Returns true
ClassyHash.validate({ key1: [1, 2, false, 4] }, schema) # Raises ":key1 is not valid: :array[2] is not a/an Integer, :array[2] is not one of a/an Integer, a/an String"
ClassyHash.validate({ key1: false }, schema) # Raises ":key1 is not an Array of length 4"

A practical example (user and address)

Here's a more practical application of Classy Hash. Suppose you have an API that accepts POSTs containing JSON user data, which you convert to a Ruby Hash with something like JSON.parse(data, symbolize_names: true). The user should have some basic fields, an array of addresses, and no extra fields.

Here's how you might use Classy Hash to validate your user objects and generate helpful error messages:

# Note: this is not guaranteed to be a useful address checking schema.
address_schema = {
  street1: /[0-9]+/,
  street2: [NilClass, String],
  city: String,
  state: [NilClass, String],
  country: String,
  postcode: [NilClass, Integer, String]
}
user_schema = {
  id: Integer,
  name: String,
  email: ->(v){ (v.is_a?(String) && v.include?('@')) || 'an e-mail address' },
  addresses: [[ address_schema ]]
}

data = <<JSON
{
  "id": 1,
  "name": "",
  "email": "@",
  "addresses": [
    {
      "street1": "123 Fake Street",
      "street2": null,
      "city": "",
      "state": "",
      "country": "",
      "postcode": ""
    },
    {
      "street1": "Building 53",
      "street2": "",
      "city": 5
    }
  ]
}
JSON

# Raises "Top level is not valid: contains members not specified in schema"
ClassyHash.validate({ :extra_key => 0 }, user_schema, strict: true)

# Raises :addresses[1][:city] is not a/an String, :addresses[1] is not one of a
# Hash matching {schema with keys [:street1, :street2, :city, :state, :country,
# :postcode]}
ClassyHash.validate(JSON.parse(data, symbolize_names: true), user_schema, strict: true)
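
The schemas above use Symbol keys and NilClass because of how the JSON is parsed. As a quick check using only Ruby's standard json library, symbolize_names: true converts keys at every nesting level to Symbols, and JSON null becomes nil. The payload here is a shortened, made-up variant of the one above.

```ruby
require 'json'

payload = '{"id": 1, "addresses": [{"street1": "123 Fake Street", "street2": null}]}'

parsed = JSON.parse(payload, symbolize_names: true)

p parsed.keys                       # => [:id, :addresses]
p parsed[:addresses][0].keys        # => [:street1, :street2]
p parsed[:addresses][0][:street2]   # => nil

# Without symbolize_names the keys are Strings, which would not match a
# schema written with Symbol keys.
p JSON.parse(payload).keys          # => ["id", "addresses"]
```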

Testing

Classy Hash includes extremely thorough RSpec tests:

# Execute within a clone of the classy_hash Git repository:
bundle install
rspec

Who wrote it?

Classy Hash was written by Mike Bourgeous for API validation and documentation in internal DeseretBook.com systems, and subsequently enhanced by inside and outside contributors. See the Git history for details.

Alternatives

If you decide Classy Hash isn't for you, here are some of the other options we considered before deciding to roll our own:

License

Classy Hash is released under the MIT license (see the LICENSE file for the license text and copyright notice, and the git history for more contributors).