From 1aaa113888da7b5cb79b6db1a42861a9ac4b1e2a Mon Sep 17 00:00:00 2001
From: kleanthisk10
Date: Mon, 19 Nov 2018 15:22:59 +0200
Subject: [PATCH] add cran badges and change description

---
 DESCRIPTION |   6 +-
 README.Rmd  |   5 +-
 README.md   | 632 ++++++++++++++++++++++++++++------------------------
 3 files changed, 351 insertions(+), 292 deletions(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index c8231b8..77e1618 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,14 +1,14 @@
 Package: tableschema.r
 Type: Package
-Title: A Library for Working with 'Table Schema'
+Title: Frictionless Data Table Schema
 Version: 1.1.0
-Date: 2018-10-25
+Date: 2018-11-5
 Authors@R: c(person("Kleanthis", "Koupidis", email = "koupidis@okfn.gr", role = c("aut", "cre")),
     person("Lazaros", "Ioannidis", email = "larjohn@gmail.com", role = "aut"),
     person("Charalampos", "Bratsas", email = "cbratsas@math.auth.gr", role = "aut"),
     person("Open Knowledge International", email = "info@okfn.org", role = "cph"))
 Maintainer: Kleanthis Koupidis
-Description: A library for working with 'Table Schema' ().
+Description: Allows to work with 'Table Schema' (). 'Table Schema' is well suited for use cases around handling and validating tabular data in text formats such as 'csv', but its utility extends well beyond this core usage, towards a range of applications where data benefits from a portable schema format. The 'tableschema.r' package can load and validate any table schema descriptor, allow the creation and modification of descriptors, expose methods for reading and streaming data that conforms to a 'Table Schema' via the 'Tabular Data Resource' abstraction.
 URL: https://github.com/frictionlessdata/tableschema-r
 BugReports: https://github.com/frictionlessdata/tableschema-r/issues
 License: MIT + file LICENSE
diff --git a/README.Rmd b/README.Rmd
index 47a342e..b86ee9d 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -5,7 +5,7 @@ output:
     html_preview: no
     number_sections: yes
 ---
-
+[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/tableschema.r)](https://cran.r-project.org/package=tableschema.r)
 [![Build Status](https://travis-ci.org/frictionlessdata/tableschema-r.svg?branch=master)](https://travis-ci.org/frictionlessdata/tableschema-r)
 [![Coverage status](https://coveralls.io/repos/github/frictionlessdata/tableschema-r/badge.svg)](https://coveralls.io/r/frictionlessdata/tableschema-r?branch=master)
 [![Github Issues](http://githubbadges.herokuapp.com/frictionlessdata/tableschema-r/issues.svg)](https://github.com/frictionlessdata/tableschema-r/issues)
@@ -13,6 +13,7 @@ output:
 [![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
 [![packageversion](https://img.shields.io/badge/Package%20version-1.1.0-orange.svg?style=flat-square)](commits/master)
 [![minimal R version](https://img.shields.io/badge/R%3E%3D-3.5-6666ff.svg)](https://cran.r-project.org/)
+[![](http://cranlogs.r-pkg.org/badges/grand-total/tableschema.r)](http://cran.rstudio.com/web/packages/tableschema.r/index.html)
 [![Licence](https://img.shields.io/badge/licence-MIT-blue.svg)](https://opensource.org/licenses/MIT)
 [![Gitter](https://img.shields.io/gitter/room/nwjs/nw.js.svg)](https://gitter.im/frictionlessdata/chat)
 
@@ -701,4 +702,4 @@ More detailed information about how to create and run tests you can find in [tes
 [coding_standards]: https://github.com/okfn/coding-standards
 [tableschema]: http://specs.frictionlessdata.io/table-schema/
 [news]: https://github.com/frictionlessdata/tableschema-r/blob/master/NEWS.md
-[commits]: https://github.com/frictionlessdata/tableschema-r/commits/master
+[commits]: https://github.com/frictionlessdata/tableschema-r/commits/master
\ No newline at end of file
diff --git a/README.md b/README.md
index e7264cc..4ff4b91 100644
--- a/README.md
+++ b/README.md
@@ -1,29 +1,48 @@
-

Frictionless Data -
Table Schema +

Frictionless +Data -
Table Schema ================ -[![Build Status](https://travis-ci.org/frictionlessdata/tableschema-r.svg?branch=master)](https://travis-ci.org/frictionlessdata/tableschema-r) [![Coverage status](https://coveralls.io/repos/github/frictionlessdata/tableschema-r/badge.svg)](https://coveralls.io/r/frictionlessdata/tableschema-r?branch=master) [![Github Issues](http://githubbadges.herokuapp.com/frictionlessdata/tableschema-r/issues.svg)](https://github.com/frictionlessdata/tableschema-r/issues) [![Pending Pull-Requests](http://githubbadges.herokuapp.com/frictionlessdata/tableschema-r/pulls.svg)](https://github.com/frictionlessdata/tableschema-r/pulls) [![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active) [![packageversion](https://img.shields.io/badge/Package%20version-1.1.0-orange.svg?style=flat-square)](commits/master) [![minimal R version](https://img.shields.io/badge/R%3E%3D-3.5-6666ff.svg)](https://cran.r-project.org/) [![Licence](https://img.shields.io/badge/licence-MIT-blue.svg)](https://opensource.org/licenses/MIT) [![Gitter](https://img.shields.io/gitter/room/nwjs/nw.js.svg)](https://gitter.im/frictionlessdata/chat) - -Description -=========== - -R library for working with [Table Schema](http://specs.frictionlessdata.io/table-schema/). 
- -Features -======== - -- `Table` class for working with data and schema -- `Schema` class for working with schemas -- `Field` class for working with schema fields -- `validate` function for validating schema descriptors -- `infer` function that creates a schema based on a data sample - -Getting started -=============== - -Installation ------------- - -In order to install the latest distribution of [R software](https://www.r-project.org/) to your computer you have to select one of the mirror sites of the [Comprehensive R Archive Network](https://cran.r-project.org/), select the appropriate link for your operating system and follow the wizard instructions. +[![CRAN\_Status\_Badge](https://www.r-pkg.org/badges/version/tableschema.r)](https://cran.r-project.org/package=tableschema.r) +[![Build +Status](https://travis-ci.org/frictionlessdata/tableschema-r.svg?branch=master)](https://travis-ci.org/frictionlessdata/tableschema-r) +[![Coverage +status](https://coveralls.io/repos/github/frictionlessdata/tableschema-r/badge.svg)](https://coveralls.io/r/frictionlessdata/tableschema-r?branch=master) +[![Github +Issues](http://githubbadges.herokuapp.com/frictionlessdata/tableschema-r/issues.svg)](https://github.com/frictionlessdata/tableschema-r/issues) +[![Pending +Pull-Requests](http://githubbadges.herokuapp.com/frictionlessdata/tableschema-r/pulls.svg)](https://github.com/frictionlessdata/tableschema-r/pulls) +[![Project Status: Active – The project has reached a stable, usable +state and is being actively +developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active) +[![packageversion](https://img.shields.io/badge/Package%20version-1.1.0-orange.svg?style=flat-square)](commits/master) +[![minimal R +version](https://img.shields.io/badge/R%3E%3D-3.5-6666ff.svg)](https://cran.r-project.org/) +[![](http://cranlogs.r-pkg.org/badges/grand-total/tableschema.r)](http://cran.rstudio.com/web/packages/tableschema.r/index.html) 
+[![Licence](https://img.shields.io/badge/licence-MIT-blue.svg)](https://opensource.org/licenses/MIT) +[![Gitter](https://img.shields.io/gitter/room/nwjs/nw.js.svg)](https://gitter.im/frictionlessdata/chat) + +# Description + +R library for working with [Table +Schema](http://specs.frictionlessdata.io/table-schema/). + +# Features + + - `Table` class for working with data and schema + - `Schema` class for working with schemas + - `Field` class for working with schema fields + - `validate` function for validating schema descriptors + - `infer` function that creates a schema based on a data sample + +# Getting started + +## Installation + +In order to install the latest distribution of [R +software](https://www.r-project.org/) to your computer you have to +select one of the mirror sites of the [Comprehensive R Archive +Network](https://cran.r-project.org/), select the appropriate link for +your operating system and follow the wizard instructions. For windows users you can: @@ -33,11 +52,17 @@ For windows users you can: 4. Download the latest R version 5. Run installation file and follow the instrustions of the installer. -(Mac) OS X and Linux users may need to follow different steps depending on their system version to install R successfully and it is recommended to read the instructions on CRAN site carefully. +(Mac) OS X and Linux users may need to follow different steps depending +on their system version to install R successfully and it is recommended +to read the instructions on CRAN site carefully. -Even more detailed installation instructions can be found in [R Installation and Administration manual](https://cran.r-project.org/doc/manuals/R-admin.html). +Even more detailed installation instructions can be found in [R +Installation and Administration +manual](https://cran.r-project.org/doc/manuals/R-admin.html). 
-To install [RStudio](https://www.rstudio.com/), you can download [RStudio Desktop](https://www.rstudio.com/products/rstudio/download/) with Open Source License and follow the wizard instructions: +To install [RStudio](https://www.rstudio.com/), you can download +[RStudio Desktop](https://www.rstudio.com/products/rstudio/download/) +with Open Source License and follow the wizard instructions: 1. Go to [RStudio](https://www.rstudio.com/products/rstudio/) 2. Click download on RStudio Desktop @@ -45,7 +70,8 @@ To install [RStudio](https://www.rstudio.com/), you can download [RStudio Deskto 4. Select the appropriate file for your system 5. Run installation file -To install the `tableschema` library it is necessary to install first `devtools` library to make installation of github libraries available. +To install the `tableschema` library it is necessary to install first +`devtools` library to make installation of github libraries available. ``` r # Install devtools package if not already @@ -59,8 +85,7 @@ Install `tableschema.r` devtools::install_github("frictionlessdata/tableschema-r") ``` -Load library ------------- +## Load library ``` r # Install devtools package if not already @@ -73,19 +98,34 @@ library(future) library(tableschema.r) ``` -Documentation -============= +# Documentation -[Jsonlite package](https://CRAN.R-project.org/package=jsonlite) is internally used to convert json data to list objects. The input parameters of functions could be json strings, files or lists and the outputs are in list format to easily further process your data in R environment and exported as desired. The examples below show how to use jsonlite package to convert the output back to json adding indentation whitespace. More details about handling json you can see jsonlite documentation or vignettes [here](https://CRAN.R-project.org/package=jsonlite). +[Jsonlite package](https://CRAN.R-project.org/package=jsonlite) is +internally used to convert json data to list objects. 
The input +parameters of functions could be json strings, files or lists and the +outputs are in list format to easily further process your data in R +environment and exported as desired. The examples below show how to use +jsonlite package to convert the output back to json adding indentation +whitespace. More details about handling json you can see jsonlite +documentation or vignettes +[here](https://CRAN.R-project.org/package=jsonlite). -Moreover [future package](https://CRAN.R-project.org/package=future) is also used to load and create Table and Schema classes asynchronously. To retrieve the actual result of the loaded Table or Schema you have to use `value(...)` to the variable you stored the loaded Table/Schema. More details about future package and sequential and parallel processing you can find [here](https://CRAN.R-project.org/package=future). +Moreover [future package](https://CRAN.R-project.org/package=future) is +also used to load and create Table and Schema classes asynchronously. To +retrieve the actual result of the loaded Table or Schema you have to use +`value(...)` to the variable you stored the loaded Table/Schema. More +details about future package and sequential and parallel processing you +can find [here](https://CRAN.R-project.org/package=future). -Table ------ +## Table -A table is a core concept in a tabular data world. It represents a data with a metadata (Table Schema). Let's see how we could use it in practice. +A table is a core concept in a tabular data world. It represents a data +with a metadata (Table Schema). Let’s see how we could use it in +practice. -Consider we have some local csv file. It could be inline data or remote link - all supported by `Table` class (except local files for in-brower usage of course). But say it's `data.csv` for now: +Consider we have some local csv file. It could be inline data or remote +link - all supported by `Table` class (except local files for in-brower +usage of course). 
But say it’s `data.csv` for now: > data/cities.csv @@ -96,7 +136,8 @@ paris,"48.85,2.30" rome,N/A ``` -Let's create and read a table. We use static `Table.load` method and `table.read` method with a `keyed` option to get array of keyed rows: +Let’s create and read a table. We use static `Table.load` method and +`table.read` method with a `keyed` option to get array of keyed rows: ``` r def = Table.load('inst/extdata/data.csv') @@ -131,7 +172,9 @@ table.headers ## [[2]] ## [1] "location" -As we could see our locations are just a strings. But it should be geopoints. Also Rome's location is not available but it's also just a `N/A` string instead of `null`. First we have to infer Table Schema: +As we could see our locations are just a strings. But it should be +geopoints. Also Rome’s location is not available but it’s also just a +`N/A` string instead of `null`. First we have to infer Table Schema: ``` r # add indentation whitespace to JSON output with jsonlite package @@ -182,7 +225,10 @@ toJSON(table$schema$descriptor, pretty = TRUE) # function from jsonlite package table$read(keyed = TRUE) # Fails ``` -Let's fix not available location. There is a `missingValues` property in Table Schema specification. As a first try we set `missingValues` to `N/A` in `table$schema$descriptor`. Schema descriptor could be changed in-place but all changes should be commited by `table$schema$commit()`: +Let’s fix not available location. There is a `missingValues` property in +Table Schema specification. As a first try we set `missingValues` to +`N/A` in `table$schema$descriptor`. Schema descriptor could be changed +in-place but all changes should be commited by `table$schema$commit()`: ``` r table$schema$descriptor['missingValues'] = 'N/A' @@ -204,7 +250,9 @@ table$schema$errors ## [[1]] ## [1] "Descriptor validation error:\n data.missingValues - is the wrong type" -As a good citiziens we've decided to check out schema descriptor validity. And it's not valid! 
We sould use an array for `missingValues` property. Also don't forget to have an empty string as a missing value: +As a good citiziens we’ve decided to check out schema descriptor +validity. And it’s not valid\! We sould use an array for `missingValues` +property. Also don’t forget to have an empty string as a missing value: ``` r table$schema$descriptor[['missingValues']] = list("", 'N/A') @@ -219,7 +267,7 @@ table$schema$valid # true ## [1] TRUE -All good. It looks like we're ready to read our data again: +All good. It looks like we’re ready to read our data again: ``` r table$read() # or @@ -231,18 +279,20 @@ toJSON(table$read(), pretty = TRUE) # function from jsonlite package Now we see that: -- locations are arrays with numeric lattide and longitude + - locations are arrays with numeric lattide and longitude -- Rome's location is `null` + - Rome’s location is `null` -And because there are no errors on data reading we could be sure that our data is valid againt our schema. Let's save it: +And because there are no errors on data reading we could be sure that +our data is valid againt our schema. Let’s save it: ``` r table$schema$save('schema.json') table$save('data.csv') ``` -Our `data.csv` looks the same because it has been stringified back to `csv` format. But now we have `schema.json`: +Our `data.csv` looks the same because it has been stringified back to +`csv` format. But now we have `schema.json`: ``` json { @@ -265,7 +315,9 @@ Our `data.csv` looks the same because it has been stringified back to `csv` form } ``` -If we decide to improve it even more we could update the schema file and then open it again. But now providing a schema path. +If we decide to improve it even more we could update the schema file and +then open it again. But now providing a schema +path. 
``` r def = Table.load('inst/extdata/data.csv', schema = 'inst/extdata/schema.json') @@ -295,85 +347,114 @@ table ## strict_: FALSE ## uniqueFieldsCache_: list -It was one basic introduction to the `Table` class. To learn more let's take a look on `Table` class API reference. +It was one basic introduction to the `Table` class. To learn more let’s +take a look on `Table` class API reference. #### `Table.load(source, schema, strict=FALSE, headers=1, ...)` -Factory method to instantiate `Table` class. This method is async and it should be used with `value(...)` keyword or as a `Promise`. If references argument is provided foreign keys will be checked on any reading operation. - -- `source (String/list()/Stream/Function)` - data source (one of): -- local CSV file (path) -- remote CSV file (url) -- list of lists representing the rows -- readable stream with CSV file contents -- function returning readable stream with CSV file contents -- `schema (Object)` - data schema in all forms supported by `Schema` class -- `strict (Boolean)` - strictness option to pass to `Schema` constructor -- `headers (Integer/String[])` - data source headers (one of): -- row number containing headers (`source` should contain headers rows) -- array of headers (`source` should NOT contain headers rows) -- `... (Object)` - options to be used by CSV parser. All options listed at . By default `ltrim` is true according to the CSV Dialect spec. -- `(errors.TableSchemaError)` - raises any error occured in table creation process -- `(Table)` - returns data table class instance +Factory method to instantiate `Table` class. This method is async and it +should be used with `value(...)` keyword or as a `Promise`. If +references argument is provided foreign keys will be checked on any +reading operation. 
+ + - `source (String/list()/Stream/Function)` - data source (one of): + - local CSV file (path) + - remote CSV file (url) + - list of lists representing the rows + - readable stream with CSV file contents + - function returning readable stream with CSV file contents + - `schema (Object)` - data schema in all forms supported by `Schema` + class + - `strict (Boolean)` - strictness option to pass to `Schema` + constructor + - `headers (Integer/String[])` - data source headers (one of): + - row number containing headers (`source` should contain headers rows) + - array of headers (`source` should NOT contain headers rows) + - `... (Object)` - options to be used by CSV parser. All options + listed at . By default + `ltrim` is true according to the CSV Dialect spec. + - `(errors.TableSchemaError)` - raises any error occured in table + creation process + - `(Table)` - returns data table class instance #### `table$headers` -- `(String[])` - returns data source headers + - `(String[])` - returns data source headers #### `table$schema` -- `(Schema)` - returns schema class instance + - `(Schema)` - returns schema class +instance #### `table$iter(keyed, extended, cast=TRUE, relations=FALSE, stream=FALSE)` -Iter through the table data and emits rows cast based on table schema. Data casting could be disabled. - -- `keyed (Boolean)` - iter keyed rows -- `extended (Boolean)` - iter extended rows -- `cast (Boolean)` - disable data casting if false -- `relations (Object)` - list object of foreign key references from a form of JSON `{resource1: [{field1: value1, field2: value2}, ...], ...}`. 
If provided foreign key fields will checked and resolved to its references -- `stream (Boolean)` - return Readable Stream of table rows -- `(errors$TableSchemaError)` - raises any error occured in this process -- `(Iterator/Stream)` - iterator/stream of rows: -- `[value1, value2]` - base -- `{header1: value1, header2: value2}` - keyed -- `[rowNumber, [header1, header2], [value1, value2]]` - extended +Iter through the table data and emits rows cast based on table schema. +Data casting could be disabled. + + - `keyed (Boolean)` - iter keyed rows + - `extended (Boolean)` - iter extended rows + - `cast (Boolean)` - disable data casting if false + - `relations (Object)` - list object of foreign key references from a + form of JSON `{resource1: [{field1: value1, field2: value2}, ...], + ...}`. If provided foreign key fields will checked and resolved to + its references + - `stream (Boolean)` - return Readable Stream of table rows + - `(errors$TableSchemaError)` - raises any error occured in this + process + - `(Iterator/Stream)` - iterator/stream of rows: + - `[value1, value2]` - base + - `{header1: value1, header2: value2}` - keyed + - `[rowNumber, [header1, header2], [value1, value2]]` - extended #### `table$read(keyed, extended, cast=TRUE, relations=FALSE, limit)` -Read the whole table and returns as array of rows. Count of rows could be limited. - -- `keyed (Boolean)` - flag to emit keyed rows -- `extended (Boolean)` - flag to emit extended rows -- `cast (Boolean)` - disable data casting if false -- `relations (Object)` - list object of foreign key references from a form of JSON `{resource1: [{field1: value1, field2: value2}, ...], ...}`. If provided foreign key fields will checked and resolved to its references -- `limit (Number)` - integer limit of rows to return -- `(errors$TableSchemaError)` - raises any error occured in this process -- `(List[])` - returns list of rows (see `table$iter`) +Read the whole table and returns as array of rows. 
Count of rows could +be limited. + + - `keyed (Boolean)` - flag to emit keyed rows + - `extended (Boolean)` - flag to emit extended rows + - `cast (Boolean)` - disable data casting if false + - `relations (Object)` - list object of foreign key references from a + form of JSON `{resource1: [{field1: value1, field2: value2}, ...], + ...}`. If provided foreign key fields will checked and resolved to + its references + - `limit (Number)` - integer limit of rows to return + - `(errors$TableSchemaError)` - raises any error occured in this + process + - `(List[])` - returns list of rows (see `table$iter`) #### `table$infer(limit=100)` -Infer a schema for the table. It will infer and set Table Schema to `table$schema` based on table data. +Infer a schema for the table. It will infer and set Table Schema to +`table$schema` based on table data. -- `limit (Number)` - limit rows samle size -- `(Object)` - returns Table Schema descriptor + - `limit (Number)` - limit rows samle size + - `(Object)` - returns Table Schema descriptor #### `table$save(target)` -Save data source to file locally in CSV format with `,` (comma) delimiter +Save data source to file locally in CSV format with `,` (comma) +delimiter -- `target (String)` - path where to save a table data -- `(errors$TableSchemaError)` - raises an error if there is saving problem -- `(Boolean)` - returns true on success + - `target (String)` - path where to save a table data + - `(errors$TableSchemaError)` - raises an error if there is saving + problem + - `(Boolean)` - returns true on success -Schema ------- +## Schema ### Schema -A model of a schema with helpful methods for working with the schema and supported data. Schema instances can be initialized with a schema source as a url to a JSON file or a JSON object. The schema is initially validated (see [validate](#validate) below). By default validation errors will be stored in `schema$errors` but in a strict mode it will be instantly raised. 
+A model of a schema with helpful methods for working with the schema and +supported data. Schema instances can be initialized with a schema source +as a url to a JSON file or a JSON object. The schema is initially +validated (see [validate](#validate) below). By default validation +errors will be stored in `schema$errors` but in a strict mode it will be +instantly raised. -Let's create a blank schema. It's not valid because `descriptor$fields` property is required by the [Table Schema](http://specs.frictionlessdata.io/table-schema/) specification: +Let’s create a blank schema. It’s not valid because `descriptor$fields` +property is required by the [Table +Schema](http://specs.frictionlessdata.io/table-schema/) specification: ``` r def = Schema.load({}) @@ -390,7 +471,8 @@ schema$errors ## [[1]] ## [1] "Descriptor validation error:\n data.fields - is required" -To do not create a schema descriptor by hands we will use a `schema$infer` method to infer the descriptor from given data: +To do not create a schema descriptor by hands we will use a +`schema$infer` method to infer the descriptor from given data: ``` r toJSON( @@ -455,7 +537,9 @@ toJSON( ## ] ## } -Now we have an inferred schema and it's valid. We could cast data row against our schema. We provide a string input by an output will be cast correspondingly: +Now we have an inferred schema and it’s valid. We could cast data row +against our schema. We provide a string input by an output will be cast +correspondingly: ``` r toJSON( @@ -469,7 +553,10 @@ toJSON( ## "Sam" ## ] -But if we try provide some missing value to `age` field cast will fail because for now only one possible missing value is an empty string. Let's update our schema: +But if we try provide some missing value to `age` field cast will fail +because for now only one possible missing value is an empty string. 
+Let’s update our + schema: ``` r schema$castRow(helpers.from.json.to.list('["6", "N/A", "Walt"]')) @@ -501,104 +588,119 @@ schema$castRow(helpers.from.json.to.list('["6", "", "Walt"]')) ## [[3]] ## [1] "Walt" -We could save the schema to a local file. And we could continue the work in any time just loading it from the local file: +We could save the schema to a local file. And we could continue the work +in any time just loading it from the local file: ``` r schema$save('schema.json') schema = Schema.load('schema.json') ``` -It was onle basic introduction to the `Schema` class. To learn more let's take a look on `Schema` class API reference. +It was onle basic introduction to the `Schema` class. To learn more +let’s take a look on `Schema` class API reference. #### `Schema.load(descriptor, strict=FALSE)` -Factory method to instantiate `Schema` class. This method is async and it should be used with `value(...)` keyword. - -- `descriptor (String/Object)` - schema descriptor: -- local path -- remote url -- object -- `strict (Boolean)` - flag to alter validation behaviour: -- if false error will not be raised and all error will be collected in `schema$errors` -- if strict is true any validation error will be raised immediately -- `(errors$TableSchemaError)` - raises any error occured in the process -- `(Schema)` - returns schema class instance +Factory method to instantiate `Schema` class. This method is async and +it should be used with `value(...)` keyword. + + - `descriptor (String/Object)` - schema descriptor: + - local path + - remote url + - object + - `strict (Boolean)` - flag to alter validation behaviour: + - if false error will not be raised and all error will be collected in + `schema$errors` + - if strict is true any validation error will be raised immediately + - `(errors$TableSchemaError)` - raises any error occured in the + process + - `(Schema)` - returns schema class instance #### `schema$valid` -- `(Boolean)` - returns validation status. 
It always true in strict mode. + - `(Boolean)` - returns validation status. It always true in strict + mode. #### `schema$errors` -- `(Error[])` - returns validation errors. It always empty in strict mode. + - `(Error[])` - returns validation errors. It always empty in strict + mode. #### `schema$descriptor` -- `(Object)` - returns schema descriptor + - `(Object)` - returns schema descriptor #### `schema$primaryKey` -- `(str[])` - returns schema primary key + - `(str[])` - returns schema primary key #### `schema$foreignKeys` -- `(Object[])` - returns schema foreign keys + - `(Object[])` - returns schema foreign keys #### `schema$fields` -- `(Field[])` - returns an array of `Field` instances. + - `(Field[])` - returns an array of `Field` instances. #### `schema$fieldNames` -- `(String[])` - returns an array of field names. + - `(String[])` - returns an array of field names. #### `schema$getField(name)` Get schema field by name. -- `name (String)` - schema field name -- `(Field/null)` - returns `Field` instance or null if not found + - `name (String)` - schema field name + - `(Field/null)` - returns `Field` instance or null if not found #### `schema$addField(descriptor)` -Add new field to schema. The schema descriptor will be validated with newly added field descriptor. +Add new field to schema. The schema descriptor will be validated with +newly added field descriptor. -- `descriptor (Object)` - field descriptor -- `(errors.TableSchemaError)` - raises any error occured in the process -- `(Field/null)` - returns added `Field` instance or null if not added + - `descriptor (Object)` - field descriptor + - `(errors.TableSchemaError)` - raises any error occured in the + process + - `(Field/null)` - returns added `Field` instance or null if not added #### `schema$removeField(name)` -Remove field resource by name. The schema descriptor will be validated after field descriptor removal. +Remove field resource by name. 
The schema descriptor will be validated +after field descriptor removal. -- `name (String)` - schema field name -- `(errors.TableSchemaError)` - raises any error occured in the process -- `(Field/null)` - returns removed `Field` instances or null if not found + - `name (String)` - schema field name + - `(errors.TableSchemaError)` - raises any error occured in the + process + - `(Field/null)` - returns removed `Field` instances or null if not + found #### `schema$castRow(row)` Cast row based on field types and formats. -- `row (any())` - data row as an list of values -- `(any[])` - returns cast data row + - `row (any())` - data row as an list of values + - `(any[])` - returns cast data row #### `schema$infer(rows, headers=1)` Infer and set `schema$descriptor` based on data sample. -- `rows (List())` - list of lists representing rows. -- `headers (Integer/String[])` - data sample headers (one of): -- row number containing headers (`rows` should contain headers rows) -- list of headers (`rows` should NOT contain headers rows) -- `{Object}` - returns Table Schema descriptor + - `rows (List())` - list of lists representing rows. + - `headers (Integer/String[])` - data sample headers (one of): + - row number containing headers (`rows` should contain headers rows) + - list of headers (`rows` should NOT contain headers rows) + - `{Object}` - returns Table Schema descriptor #### `schema$commit(strict)` Update schema instance if there are in-place changes in the descriptor. 
-- `strict (Boolean)` - alter `strict` mode for further work -- `(errors.TableSchemaError)` - raises any error occured in the process -- `(Boolean)` - returns true on success and false if not modified + - `strict (Boolean)` - alter `strict` mode for further work + - `(errors.TableSchemaError)` - raises any error occured in the + process + - `(Boolean)` - returns true on success and false if not modified + + ``` r descriptor = '{"fields": [{"name": "field", "type": "string"}]}' @@ -638,16 +740,18 @@ schema$getField("field")$type Save schema descriptor to target destination. -- `target (String)` - path where to save a descriptor -- `(errors$TableSchemaError)` - raises any error occured in the process -- `(Boolean)` - returns true on success + - `target (String)` - path where to save a descriptor + - `(errors$TableSchemaError)` - raises any error occured in the + process + - `(Boolean)` - returns true on success -Field ------ +## Field Class represents field in the schema. -Data values can be cast to native R types. Casting a value will check the value is of the expected type, is in the correct format, and complies with any constraints imposed by a schema. +Data values can be cast to native R types. Casting a value will check +the value is of the expected type, is in the correct format, and +complies with any constraints imposed by a schema. ``` json { @@ -661,7 +765,10 @@ Data values can be cast to native R types. 
Casting a value will check the value } ``` -Following code will not raise the exception, despite the fact our date is less than minimum constraints in the field, because we do not check constraints of the field descriptor +Following code will not raise the exception, despite the fact our date +is less than minimum constraints in the field, because we do not check +constraints of the field +descriptor ``` r field = Field$new(helpers.from.json.to.list('{"name": "name", "type": "number"}')) @@ -671,7 +778,10 @@ dateType # print the result ## [1] 12345 -And following example will raise exception, because we set flag 'skip constraints' to `false`, and our date is less than allowed by `minimum` constraints of the field. Exception will be raised as well in situation of trying to cast non-date format values, or empty values +And following example will raise exception, because we set flag ‘skip +constraints’ to `false`, and our date is less than allowed by `minimum` +constraints of the field. Exception will be raised as well in situation +of trying to cast non-date format values, or empty values ``` r tryCatch( @@ -682,101 +792,29 @@ tryCatch( ## Error in private$castValue(...): Field character(0) can't cast value 2014-05-29 for type number with format default -Values that can't be cast will raise an `Error` exception. Casting a value that doesn't meet the constraints will raise an `Error` exception. - -Table below shows the available types, formats and resultant value of the cast: - - ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Type      | Formats                     | Casting result   |
| :-------- | :-------------------------- | :--------------- |
| any       | default                     | Any              |
| array     | default                     | Array            |
| boolean   | default                     | Boolean          |
| date      | default, any                | Date             |
| datetime  | default, any                | Date             |
| duration  | default                     | Duration         |
| geojson   | default, topojson           | Object           |
| geopoint  | default, list, object       | [Number, Number] |
| integer   | default                     | Number           |
| number    | default                     | Number           |
| object    | default                     | Object           |
| string    | default, uri, email, binary | String           |
| time      | default, any                | Date             |
| year      | default                     | Number           |
| yearmonth | default                     | [Number, Number] |

+Values that can’t be cast will raise an `Error` exception. Casting a +value that doesn’t meet the constraints will raise an `Error` exception. + +Table below shows the available types, formats and resultant value of +the cast: + +| Type | Formats | Casting result | +| :-------- | :-------------------------- | :----------------- | +| any | default | Any | +| array | default | Array | +| boolean | default | Boolean | +| date | default, any | Date | +| datetime | default, any | Date | +| duration | default | Duration | +| geojson | default, topojson | Object | +| geopoint | default, list, object | \[Number, Number\] | +| integer | default | Number | +| number | default | Number | +| object | default | Object | +| string | default, uri, email, binary | String | +| time | default, any | Date | +| year | default | Number | +| yearmonth | default | \[Number, Number\] | #### `Field$new(descriptor, missingValues=[''])` @@ -784,57 +822,64 @@ Table below shows the available types, formats and resultant value of the cast: Constructor to instantiate `Field` class. 
-- `descriptor (Object)` - schema field descriptor -- `missingValues (String[])` - an array with string representing missing values -- `(errors.TableSchemaError)` - raises any error occured in the process -- `(Field)` - returns field class instance + - `descriptor (Object)` - schema field descriptor + - `missingValues (String[])` - an array with string representing + missing values + - `(errors.TableSchemaError)` - raises any error occured in the + process + - `(Field)` - returns field class instance #### `field$name` -- `(String)` - returns field name + - `(String)` - returns field name #### `field$type` -- `(String)` - returns field type + - `(String)` - returns field type #### `field$format` -- `(String)` - returns field format + - `(String)` - returns field format #### `field$required` -- `(Boolean)` - returns true if field is required + - `(Boolean)` - returns true if field is required #### `field$constraints` -- `(Object)` - returns an object with field constraints + - `(Object)` - returns an object with field constraints #### `field$descriptor` -- `(Object)` - returns field descriptor + - `(Object)` - returns field descriptor #### `field$cast_value(value, constraints=TRUE)` Cast given value according to the field type and format. -- `value (any)` - value to cast against field -- `constraints (Boolean/String[])` - gets constraints configuration -- it could be set to true to disable constraint checks -- it could be a List of constraints to check e.g. \['minimum', 'maximum'\] -- `(errors$TableSchemaError)` - raises any error occured in the process -- `(any)` - returns cast value + - `value (any)` - value to cast against field + - `constraints (Boolean/String[])` - gets constraints configuration + - it could be set to true to disable constraint checks + - it could be a List of constraints to check e.g. 
\[‘minimum’, + ‘maximum’\] + - `(errors$TableSchemaError)` - raises any error occured in the + process + - `(any)` - returns cast value #### `field$testValue(value, constraints=TRUE)` Test if value is compliant to the field. -- `value (any)` - value to cast against field -- `constraints (Boolean/String[])` - constraints configuration -- `(Boolean)` - returns if value is compliant to the field + - `value (any)` - value to cast against field + - `constraints (Boolean/String[])` - constraints configuration + - `(Boolean)` - returns if value is compliant to the field ### Validate -> `validate()` validates whether a **schema** is a validate Table Schema accordingly to the [specifications](https://frictionlessdata.io/schemas/table-schema.json). It does **not** validate data against a schema. +> `validate()` validates whether a **schema** is a validate Table Schema +> accordingly to the +> [specifications](https://frictionlessdata.io/schemas/table-schema.json). +> It does **not** validate data against a schema. Given a schema descriptor `validate` returns a validation object: @@ -853,15 +898,16 @@ valid_errors Validate a Table Schema descriptor. -- `descriptor (String/Object)` - schema descriptor (one of): -- local path -- remote url -- object -- `(Object)` - returns `{valid, errors}` object + - `descriptor (String/Object)` - schema descriptor (one of): + - local path + - remote url + - object + - `(Object)` - returns `{valid, errors}` object ### Infer -Given data source and headers `infer` will return a Table Schema as a JSON object based on the data values. +Given data source and headers `infer` will return a Table Schema as a +JSON object based on the data values. 
Given the data file, example.csv: @@ -879,7 +925,8 @@ Call `infer` with headers and values from the datafile: descriptor = infer('inst/extdata/data_infer.csv') ``` -The `descriptor` variable is now a list object that can easily converted to JSON: +The `descriptor` variable is now a list object that can easily converted +to JSON: ``` r toJSON( @@ -915,21 +962,31 @@ toJSON( Infer source schema.. -- `source (String/List()/Stream/Function)` - source as path, url or inline data -- `headers (String[])` - array of headers -- `options (Object)` - any `Table.load` options -- `(errors.TableSchemaError)` - raises any error occured in the process -- `(Object)` - returns schema descriptor + - `source (String/List()/Stream/Function)` - source as path, url or + inline data + - `headers (String[])` - array of headers + - `options (Object)` - any `Table.load` options + - `(errors.TableSchemaError)` - raises any error occured in the + process + - `(Object)` - returns schema descriptor -Changelog - News ----------------- +## Changelog - News -In [NEWS.md](https://github.com/frictionlessdata/tableschema-r/blob/master/NEWS.md) described only breaking and the most important changes. The full changelog could be found in nicely formatted [commit](https://github.com/frictionlessdata/tableschema-r/commits/master) history. +In +[NEWS.md](https://github.com/frictionlessdata/tableschema-r/blob/master/NEWS.md) +described only breaking and the most important changes. The full +changelog could be found in nicely formatted +[commit](https://github.com/frictionlessdata/tableschema-r/commits/master) +history. -Contributing -============ +# Contributing -The project follows the [Open Knowledge International coding standards](https://github.com/okfn/coding-standards). There are common commands to work with the project.Recommended way to get started is to create, activate and load the library environment. 
To install package and development dependencies into active environment: +The project follows the [Open Knowledge International coding +standards](https://github.com/okfn/coding-standards). There are common +commands to work with the project.Recommended way to get started is to +create, activate and load the library environment. To install package +and development dependencies into active +environment: ``` r devtools::install_github("frictionlessdata/tableschema-r", dependencies = TRUE) @@ -949,11 +1006,12 @@ To run tests: devtools::test() ``` -More detailed information about how to create and run tests you can find in [testthat package](https://github.com/hadley/testthat). +More detailed information about how to create and run tests you can find +in [testthat +package](https://github.com/hadley/testthat). -Github -====== +# Github -- + -