Table Schema

A utility library for working with Table Schema in PHP.

Features summary and Usage guide

Installation

$ composer require frictionlessdata/tableschema

Table

The Table class lets you iterate over data that conforms to a table schema.

Instantiate a Table object based on a data source and a table schema.

use frictionlessdata\tableschema\Table;

$table = new Table("tests/fixtures/data.csv", ["fields" => [
    ["name" => "first_name"],
    ["name" => "last_name"],
    ["name" => "order"]
]]);

The schema can be any parameter valid for the Schema object (see below), so you can also pass a URL or filename that contains the schema:

$table = new Table("tests/fixtures/data.csv", "tests/fixtures/data.json");
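For illustration, a schema file like the one passed above could be produced with plain PHP. This is only a minimal sketch: the file name example-schema.json is hypothetical, and the field types are assumptions based on the field names used in this guide, not the actual contents of tests/fixtures/data.json.

```php
<?php
// Hypothetical descriptor; the field types are assumed for illustration.
$descriptor = [
    "fields" => [
        ["name" => "first_name", "type" => "string"],
        ["name" => "last_name", "type" => "string"],
        ["name" => "order", "type" => "integer"],
    ],
];

// Write the descriptor as pretty-printed JSON to a temporary file.
$schemaPath = sys_get_temp_dir() . "/example-schema.json";
file_put_contents($schemaPath, json_encode($descriptor, JSON_PRETTY_PRINT));

// Read it back to confirm that the file holds a valid JSON descriptor.
$decoded = json_decode(file_get_contents($schemaPath), true);
```

The resulting path could then be passed as the second argument to the Table constructor.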

Iterate over the data; all values are cast and validated according to the schema:

foreach ($table as $row) {
    print($row["order"]." ".$row["first_name"]." ".$row["last_name"]."\n");
}

The validate function validates the schema and also reads a sample of the data to validate it:

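// The namespace of CsvDataSource is assumed here; check it against the library source.
use frictionlessdata\tableschema\DataSources\CsvDataSource;

Table::validate(new CsvDataSource("http://invalid.data.source/"), $schema);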

You can instantiate a Table object without a schema; in that case the schema is inferred automatically from the data:

$table = new Table("tests/fixtures/data.csv");
$table->schema()->fields();  // ["first_name" => StringField, "last_name" => StringField, "order" => IntegerField]

Optionally, specify a CSV Dialect:

$table = new Table("tests/fixtures/data.csv", null, ["delimiter" => ";"]);
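To make the delimiter option concrete, the sketch below parses a semicolon-delimited line with PHP's built-in str_getcsv. It only illustrates what the delimiter setting means; it is not how the library reads files, and the sample line is made up.

```php
<?php
// Sample semicolon-delimited line (made-up data for illustration).
$line = "Ada;Lovelace;1";

// Parse with ";" as the delimiter; enclosure and escape are passed explicitly.
$values = str_getcsv($line, ";", '"', "\\");

// Parsing the same line with a "," delimiter yields a single unsplit value.
$unsplit = str_getcsv($line, ",", '"', "\\");
```

A file whose lines look like the sample would need the ";" delimiter to be split into three columns.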

The read method returns all the data as an array; it also accepts options that modify reader behavior:

$table->read();  // returns all the data as an array

read accepts an options parameter, for example:

$table->read(["cast" => false, "limit" => 5]);

The following options are available (the values are the default values):

$table->read([
    "keyed" => true,  // flag to emit keyed rows
    "extended" => false,  // flag to emit extended rows
    "cast" => true,  // flag to disable data casting if false
    "limit" => null,  // integer limit of rows to return
]);
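As a rough illustration of what the keyed and limit options mean, the following plain-PHP sketch builds keyed rows from headers and raw values. The helper name keyedRows is hypothetical and the data is made up; this is not the library's implementation, and the extended option is not covered.

```php
<?php
/**
 * Hypothetical helper: combine headers with raw rows into keyed rows,
 * optionally returning only the first $limit rows.
 */
function keyedRows(array $headers, array $rows, ?int $limit = null): array
{
    if ($limit !== null) {
        $rows = array_slice($rows, 0, $limit);
    }
    return array_map(
        function (array $row) use ($headers) {
            return array_combine($headers, $row);
        },
        $rows
    );
}

$headers = ["first_name", "last_name", "order"];
$rows = [["Ada", "Lovelace", "1"], ["Alan", "Turing", "2"]];
$limited = keyedRows($headers, $rows, 1);
```

The values stay as strings here because the sketch performs no casting, which roughly mirrors what disabling the cast option is described as doing.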

Additional methods and functionality

$table->headers();  // ["first_name", "last_name", "order"]
$table->save("output.csv");  // iterate over all the rows and save them to a CSV file
$table->schema();  // get the Schema object
$table->read();  // returns all the data as an array

Schema

The Schema class provides helpful methods for working with a table schema and related data.

use frictionlessdata\tableschema\Schema;

Schema objects can be constructed using any of the following:

  • PHP array (or object)
$schema = new Schema([
    'fields' => [
        [
            'name' => 'id', 'title' => 'Identifier', 'type' => 'integer', 
            'constraints' => [
                "required" => true,
                "minimum" => 1,
                "maximum" => 500
            ]
        ],
        ['name' => 'name', 'title' => 'Name', 'type' => 'string'],
    ],
    'primaryKey' => 'id'
]);
  • string containing JSON
$schema = new Schema("{
    \"fields\": [
        {\"name\": \"id\"},
        {\"name\": \"height\", \"type\": \"integer\"}
    ]
}");
  • string containing a URL or filename that points to the schema
$schema = new Schema("https://raw.githubusercontent.com/frictionlessdata/testsuite-extended/ecf1b2504332852cca1351657279901eca6fdbb5/datasets/synthetic/schema.json");

The schema is loaded, parsed, and validated; an exception is raised if any problem is found.

Access the schema data, which is guaranteed to conform to the specs:

$schema->missingValues(); // [""]
$schema->primaryKey();  // ["id"]
$schema->foreignKeys();  // []
$schema->fields(); // ["id" => IntegerField, "name" => StringField]
$field = $schema->field("id");  // Field object (See Field reference below)

The validate function accepts the same arguments as the Schema constructor, but returns a list of errors instead of raising exceptions:

// validate accepts the same arguments as the Schema constructor
$validationErrors = Schema::validate("http://invalid.schema.json");
foreach ($validationErrors as $validationError) {
    print($validationError->getMessage());
}

Validate and cast a row of data according to the schema:

$row = $schema->castRow(["id" => "1", "name" => "First Name"]);

castRow raises an exception if the row fails validation; otherwise it returns the row with all values cast to native types:

$row;  // ["id" => 1, "name" => "First Name"]

Validate the row to get a list of errors:

$schema->validateRow(["id" => "foobar"]);  // ["id is not numeric", "name is required" .. ]

Infer schema based on source data:

$schema = Schema::infer("tests/fixtures/data.csv");
$schema->fields();  // ["first_name" => StringField, "last_name" => StringField, "order" => IntegerField]

You can also create a new, empty schema for editing:

$schema = new Schema();

Set fields:

$schema->fields([
    "id" => (object)["type" => "integer"],
    "name" => (object)["type" => "string"],
]);

An appropriate Field object is created according to the given descriptor (see below for the Field class reference):

$schema->field("id");  // IntegerField object

Add, update, or remove fields:

$schema->field("email", ["type" => "string", "format" => "email"]);
$schema->field("name", ["type" => "string"]);
$schema->removeField("name");

Set or update other table schema attributes:

$schema->primaryKey(["id"]);

After every change the schema is validated, and an exception is raised if validation fails.

Finally, you can get the full validated descriptor:

$schema->fullDescriptor();

And save it to a JSON file:

$schema->save("my-schema.json");

Field

The Field class represents a single table schema field descriptor.

Create a field from a descriptor

use frictionlessdata\tableschema\Fields\FieldsFactory;
$field = FieldsFactory::field([
    "name" => "id", "type" => "integer",
    "constraints" => ["required" => true, "minimum" => 5]
]);

Cast and validate values using the field

$field->castValue("3");  // exception: value is below minimum
$field->castValue("7");  // 7
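The behavior above can be pictured with a small plain-PHP sketch that casts a string to an integer and then checks a minimum constraint. The function castIntegerWithMinimum is hypothetical and far simpler than the library's Field classes; it only shows the cast-then-validate idea.

```php
<?php
/**
 * Hypothetical helper: cast a raw value to an integer and enforce a minimum.
 * Throws InvalidArgumentException when the value is not an integer string
 * or falls below the minimum.
 */
function castIntegerWithMinimum(string $value, int $minimum): int
{
    // filter_var returns false when the string is not a valid integer.
    $number = filter_var($value, FILTER_VALIDATE_INT);
    if ($number === false) {
        throw new InvalidArgumentException("value is not an integer");
    }
    if ($number < $minimum) {
        throw new InvalidArgumentException("value is below minimum");
    }
    return $number;
}

$ok = castIntegerWithMinimum("7", 5);

try {
    castIntegerWithMinimum("3", 5);
    $error = null;
} catch (InvalidArgumentException $e) {
    $error = $e->getMessage();
}
```

Raising on the first failure matches the exception-style behavior described for casting; a validation-style API would instead collect the messages into a list.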

Additional methods to access field data:

$field->format();  // "default"
$field->name();  // "id"
$field->type(); // "integer"
$field->constraints();  // (object)["required"=>true, "minimum"=>5]
$field->enum();  // []
$field->required();  // true
$field->unique();  // false
$field->title();  // e.g. "Id" (or null if not provided in descriptor)
$field->description();  // e.g. "The ID" (or null if not provided in descriptor)
$field->rdfType();  // e.g. "http://schema.org/Thing" (or null if not provided in descriptor)

Contributing

Please read the contribution guidelines: How to Contribute