Add Field class #11

Merged 14 commits on Oct 12, 2016
30 changes: 22 additions & 8 deletions README.md
@@ -164,8 +164,6 @@ schema.primary_keys
#=> ["id"]
schema.foreign_keys
#=> [{"fields" => "state", "reference" => { "datapackage" => "http://data.okfn.org/data/mydatapackage/", "resource" => "the-resource", "fields" => "state_id" } } ]
schema.cast('height', '10')
#=> 10.0
schema.get_field('id')
#=> {"name"=>"id", "constraints"=>{"required"=>true}, "type"=>"string", "format"=>"default"}
schema.has_field?('foo')
@@ -176,13 +174,13 @@ schema.get_fields_by_type('string')
#=> [{"name"=>"id", "constraints"=>{"required"=>true}, "type"=>"string", "format"=>"default"}, {"name"=>"height", "type"=>"string", "format"=>"default"}]
schema.get_constraints('id')
#=> {"required" => true}
schema.convert_row(['string', '10.0'])
schema.cast_row(['string', '10.0'])
#=> ['string', 10.0]
schema.convert([['foo', '12.0'],['bar', '10.0']])
schema.cast([['foo', '12.0'],['bar', '10.0']])
#=> [['foo', 12.0],['bar', 10.0]]
```

When converting a row (using `convert_row`), or a number of rows (using `convert`), by default the converter will fail on the first error it finds. If you pass `false` as the second argument, the errors will be collected into a `errors` attribute for you to review later. For example:
When casting a row (using `cast_row`), or a number of rows (using `cast`), by default the converter will fail on the first error it finds. If you pass `false` as the second argument, the errors will be collected into an `errors` attribute for you to review later. For example:

```ruby
schema_hash = {
@@ -209,14 +207,31 @@ rows = [
['wrong column count']
]

schema.convert(rows)
schema.cast(rows)
@roll (Member) commented on Oct 10, 2016:

From my experience, almost all casting-related interactions with a schema go through `schema.cast_row`, so I'm not sure this `cast` method is needed at all. The name could also be confusing (`schema.cast`: cast what? The schema? The data?). WDYT?

Contributor Author replied:

Ah, this was carried over from the old Python API. Calling it `cast_rows` would probably make more sense.

#=> JsonTableSchema::InvalidCast: notanumber is not a number
schema.convert(rows, false)
schema.cast(rows, false)
#=> JsonTableSchema::MultipleInvalid
schema.errors
#=> [#<JsonTableSchema::InvalidCast: notanumber is not a number>, #<JsonTableSchema::InvalidCast: notanumber is not a number>, #<JsonTableSchema::ConversionError: The number of items to convert (1) does not match the number of headers in the schema (2)>]
```
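
The fail-fast versus collect-errors behaviour described above can be sketched in isolation. The classes below (`SketchCaster`, `SketchInvalidCast`, `SketchMultipleInvalid`) are hypothetical stand-ins, not the gem's implementation, and every cell is assumed to be a number to keep the example short:

```ruby
# Hypothetical stand-ins; these are not the gem's classes.
class SketchInvalidCast < StandardError; end
class SketchMultipleInvalid < StandardError; end

class SketchCaster
  attr_reader :errors

  def initialize
    @errors = []
  end

  # Cast every cell of every row to a Float.
  # fail_fast = true:  raise on the first cell that cannot be cast.
  # fail_fast = false: record each failure, keep going, then raise once at the end.
  def cast_rows(rows, fail_fast = true)
    @errors = []
    result = rows.map do |row|
      row.map do |cell|
        begin
          Float(cell)
        rescue ArgumentError, TypeError
          error = SketchInvalidCast.new("#{cell} is not a number")
          raise error if fail_fast
          @errors << error
          cell # leave the original value in place
        end
      end
    end
    raise SketchMultipleInvalid, "There were errors parsing the data" unless @errors.empty?
    result
  end
end

caster = SketchCaster.new
begin
  caster.cast_rows([['1.5', 'notanumber'], ['oops', '2']], false)
rescue SketchMultipleInvalid
  # All failures are available for review after the single exception.
  puts caster.errors.map(&:message).inspect
end
```

Collecting errors this way trades an early exit for a complete report, which tends to be more useful when validating a whole file in one pass.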

## Field

```ruby
# Init field
field = JsonTableSchema::Field.new({'type' => 'number'})

# Cast a value
field.cast_value('12345')
#=> 12345.0
```

Data values can be cast to native Ruby objects with a Field instance. Field instances can be initialized with [field descriptors](http://dataprotocols.org/json-table-schema/#field-descriptors), which allows formats and constraints to be defined.

Casting a value checks that it is of the expected type, is in the correct format, and complies with any constraints imposed by the schema. For example, a date value (in ISO 8601 format) can be cast with a DateType instance. Values that can't be cast raise an `InvalidCast` exception.

Casting a value that doesn't meet the constraints will raise a `ConstraintError` exception.
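
One Ruby detail is worth keeping in mind when building a descriptor by hand: `Field` reads its entries with string keys (`self['type']`), while a hash literal written as `{'type': 'number'}` produces the symbol key `:type`. A minimal sketch, using a hypothetical `SketchField` class rather than the gem's own `Field`, shows the effect:

```ruby
# Hypothetical stand-in for a Hash-backed field; not the gem's Field class.
class SketchField < Hash
  def initialize(descriptor)
    super()
    merge!(descriptor)
  end

  # Falls back to 'string' when there is no string-keyed 'type' entry.
  def type
    self['type'] || 'string'
  end
end

string_keyed = SketchField.new({ 'type' => 'number' })
symbol_keyed = SketchField.new({ 'type': 'number' }) # the key here is :type, not 'type'

puts string_keyed.type
puts symbol_keyed.type
```

With the symbol-keyed descriptor the lookup misses and the sketch silently falls back to the default type, so descriptors are safest written with `=>` and string keys, or parsed from JSON.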

## Development

After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
@@ -227,7 +242,6 @@ To install this gem onto your local machine, run `bundle exec rake install`. To

Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/jsontableschema. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.


## License

The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
1 change: 1 addition & 0 deletions lib/jsontableschema.rb
@@ -28,6 +28,7 @@
require "jsontableschema/types/string"
Member commented:

I suppose it's something like `__init__.py` in Python? I'm not sure how it works in Ruby, but in Python we use this to declare the API: we add only the public interface and leave out internal pieces (like the types, now that Field has been introduced), since an internal API could change at any time without notice.

Contributor Author replied:

Yeah, that's it: this is the main file that requires everything needed by the gem. There are other patterns (e.g. we could modularise, so you could require only jsonschema/table or jsonschema/schema), but this is the generally accepted pattern, especially for small, focussed gems.

require "jsontableschema/types/time"

require "jsontableschema/field"
require "jsontableschema/validate"
require "jsontableschema/model"
require "jsontableschema/data"
22 changes: 11 additions & 11 deletions lib/jsontableschema/data.rb
@@ -3,11 +3,11 @@ module Data

attr_reader :errors

def convert(rows, fail_fast = true)
def cast_rows(rows, fail_fast = true)
@errors ||= []
rows.map! do |r|
begin
convert_row(r, fail_fast)
cast_row(r, fail_fast)
rescue MultipleInvalid, ConversionError => e
raise e if fail_fast == true
@errors << e if e.is_a?(ConversionError)
@@ -17,19 +17,19 @@ def convert(rows, fail_fast = true)
rows
end

def convert_row(row, fail_fast = true)
alias_method :convert, :cast_rows

def cast_row(row, fail_fast = true)
@errors ||= []
raise_header_error(row) if row.count != fields.count
fields.each_with_index do |field,i|
row[i] = convert_column(row[i], field, fail_fast)
row[i] = cast_column(field, row[i], fail_fast)
end
check_for_errors
row
end

def cast(field_name, value)
convert_column(value, get_field(field_name), true)
end
alias_method :convert_row, :cast_row

private

@@ -41,10 +41,8 @@ def check_for_errors
raise(JsonTableSchema::MultipleInvalid.new("There were errors parsing the data")) if @errors.count > 0
end

def convert_column(col, field, fail_fast)
klass = get_class_for_type(field['type'] || 'string')
converter = Kernel.const_get(klass).new(field)
converter.cast(col)
def cast_column(field, col, fail_fast)
field.cast_value(col)
rescue Exception => e
if fail_fast == true
raise e
@@ -53,5 +51,7 @@ def convert_column(col, field, fail_fast)
end
end

alias_method :convert_column, :cast_column

end
end
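
The rename in this file keeps the old method names available through `alias_method`. A minimal sketch of that rename-plus-alias pattern follows; the `SketchCasting` module, the `SketchSchema` class, and the integer casting are hypothetical and only illustrate how old and new names resolve to the same behaviour:

```ruby
# Hypothetical module; it mirrors only the rename-plus-alias pattern.
module SketchCasting
  def cast_rows(rows)
    rows.map { |row| cast_row(row) }
  end
  # Old name kept so existing callers continue to work.
  alias_method :convert, :cast_rows

  def cast_row(row)
    row.map { |cell| Integer(cell) }
  end
  alias_method :convert_row, :cast_row
end

class SketchSchema
  include SketchCasting
end

schema = SketchSchema.new
p schema.cast_rows([['1', '2']]) # new name
p schema.convert([['3']])        # old name, same behaviour
```

Note that `alias_method` copies the method as it exists at that point, so the alias has to come after the definition, and redefining `cast_rows` later would not change what `convert` does.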
41 changes: 41 additions & 0 deletions lib/jsontableschema/field.rb
@@ -0,0 +1,41 @@
module JsonTableSchema
class Field < Hash
include JsonTableSchema::Helpers

attr_reader :type_class

def initialize(descriptor)
self.merge! descriptor
@type_class = get_type
end

def name
self['name']
end

def type
self['type'] || 'string'
end

def format
self['format'] || 'default'
end

def constraints
self['constraints'] || {}
end

def cast_value(col)
klass = get_class_for_type(type)
converter = Kernel.const_get(klass).new(self)
converter.cast(col)
end

private

def get_type
Object.const_get get_class_for_type(type)
end

end
end
2 changes: 1 addition & 1 deletion lib/jsontableschema/helpers.rb
@@ -22,7 +22,7 @@ def false_values
end

def get_class_for_type(type)
"JsonTableSchema::Types::#{type_class_lookup[type]}"
"JsonTableSchema::Types::#{type_class_lookup[type] || 'String'}"
end

def type_class_lookup
8 changes: 4 additions & 4 deletions lib/jsontableschema/model.rb
@@ -53,10 +53,6 @@ def get_fields_by_type(type)

private

def fields
self['fields']
end

def transform(name)
name.downcase! if @opts[:case_insensitive_headers]
name
@@ -69,5 +65,9 @@ def expand!
end
end

def load_fields!
self['fields'] = (self['fields'] || []).map { |f| JsonTableSchema::Field.new(f) }
end

end
end
21 changes: 11 additions & 10 deletions lib/jsontableschema/schema.rb
@@ -5,26 +5,27 @@ class Schema < Hash
include JsonTableSchema::Data
include JsonTableSchema::Helpers

def initialize(schema, opts = {})
self.merge! parse_schema(schema)
def initialize(descriptor, opts = {})
self.merge! parse_schema(descriptor)
@messages = []
@opts = opts
load_fields!
load_validator!
expand!
end

def parse_schema(schema)
if schema.class == Hash
schema
elsif schema.class == String
def parse_schema(descriptor)
if descriptor.class == Hash
descriptor
elsif descriptor.class == String
begin
JSON.parse open(schema).read
JSON.parse open(descriptor).read
rescue Errno::ENOENT
raise SchemaException.new("File not found at `#{schema}`")
raise SchemaException.new("File not found at `#{descriptor}`")
rescue OpenURI::HTTPError => e
raise SchemaException.new("URL `#{schema}` returned #{e.message}")
raise SchemaException.new("URL `#{descriptor}` returned #{e.message}")
rescue JSON::ParserError
raise SchemaException.new("File at `#{schema}` is not valid JSON")
raise SchemaException.new("File at `#{descriptor}` is not valid JSON")
end
else
raise SchemaException.new("A schema must be a hash, path or URL")
6 changes: 3 additions & 3 deletions lib/jsontableschema/table.rb
@@ -7,10 +7,10 @@ def self.infer_schema(csv, opts = {})
JsonTableSchema::Table.new(csv, nil, opts)
end

def initialize(csv, schema, opts = {})
def initialize(csv, descriptor, opts = {})
@opts = opts
@csv = parse_csv(csv)
@schema = schema.nil? ? infer_schema(@csv) : JsonTableSchema::Schema.new(schema)
@schema = descriptor.nil? ? infer_schema(@csv) : JsonTableSchema::Schema.new(descriptor)
end

def parse_csv(csv)
@@ -25,7 +25,7 @@ def csv_options
def rows(opts = {})
fail_fast = opts[:fail_fast] || opts[:fail_fast].nil?
rows = opts[:limit] ? @csv.to_a.drop(1).take(opts[:limit]) : @csv.to_a.drop(1)
converted = @schema.convert(rows, fail_fast)
converted = @schema.cast_rows(rows, fail_fast)
opts[:keyed] ? coverted_to_hash(@csv.headers, converted) : converted
end

2 changes: 1 addition & 1 deletion lib/jsontableschema/version.rb
@@ -1,3 +1,3 @@
module JsonTableSchema
VERSION = "0.1.0"
VERSION = "0.2.0"
end