Table API feedback #25

Closed
roll opened this issue Jul 6, 2017 · 2 comments

Comments

@roll
Member

roll commented Jul 6, 2017

Overview

Below is feedback on the following readme listing, based on the existing implementations and on the expected skills of the library's users (many of the users we target are close to non-technical: publishers, data wranglers, etc.).

// iterate over a remote data source conforming to a table schema
$table = new tableschema\Table(
    new tableschema\DataSources\CsvDataSource("http://www.example.com/data.csv"), 
    new tableschema\Schema("http://www.example.com/data-schema.json")
);
foreach ($table as $person) {
    print($person["first_name"]." ".$person["last_name"]);
}

// infer schema of a remote data source
$dataSource = new tableschema\DataSources\CsvDataSource("http://www.example.com/data.csv");
$schema = new tableschema\InferSchema();
$table = new tableschema\Table($dataSource, $schema);
foreach ($table as $row) {
    var_dump($row); // row will be in inferred native values
    var_dump($schema->descriptor()); // will contain the inferred schema descriptor
    // the more iterations you make, the more accurate the inferred schema might be
    // once you are satisfied with the schema, lock it
    $rows = $schema->lock();
    // it returns all the rows received until the lock, cast to the final inferred schema
    // you may now continue to iterate over the rest of the rows
}

Is it possible to hide under Table class data source and schema creation?

As a {USER} I'd rather write just $table = new tableschema\Table('data-path.csv', 'schema-path.json'); instead of creating the data source and schema myself. This matters especially when you don't know before runtime what kind of data source you have, e.g. new tableschema\Table('data-path.csv-or-xls') (we don't support Excel here; it's only an example). In this case $table.schema should be exposed.
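
A minimal sketch of how a constructor could hide data source and schema creation. Every name here (SketchTable, sourceType, detectSourceType) is hypothetical and is not the library's actual API; the sketch only detects the source type and loads the descriptor, it does not open the data itself.

```php
<?php
// Hypothetical sketch: SketchTable and its members are illustrative names,
// not classes from the tableschema library.

final class SketchTable
{
    /** @var string kind of data source detected from the path, e.g. "csv" */
    public $sourceType;

    /** @var array|null schema descriptor, or null when none was given */
    public $schema;

    /**
     * @param string            $source path or URL of the data
     * @param string|array|null $schema descriptor array, or path to a JSON descriptor
     */
    public function __construct($source, $schema = null)
    {
        $this->sourceType = self::detectSourceType($source);

        if (is_string($schema)) {
            // A string is treated as the location of a JSON descriptor.
            $json = file_get_contents($schema);
            if ($json === false) {
                throw new RuntimeException("Cannot read schema from $schema");
            }
            $descriptor = json_decode($json, true);
            if (!is_array($descriptor)) {
                throw new InvalidArgumentException("Invalid schema descriptor in $schema");
            }
            $schema = $descriptor;
        }
        $this->schema = $schema;
    }

    /** Pick a data source type from the file extension, ignoring any URL query string. */
    private static function detectSourceType($source)
    {
        $path = parse_url($source, PHP_URL_PATH);
        if (!is_string($path)) {
            $path = $source;
        }
        $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if ($extension === 'csv') {
            return 'csv';
        }
        throw new InvalidArgumentException("Unsupported data source: $source");
    }
}
```

With something like this, the caller only passes two strings and reads the descriptor back from the exposed schema property.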

Infer schema if schema argument is just omitted?

As a {USER} I'd rather just write $table = new tableschema\Table('data.csv'); without a schema argument and have the schema inferred, instead of dealing with the additional tableschema\InferSchema() class.
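
A rough sketch of falling back to inference when the schema argument is omitted. The function names (guessFieldType, inferDescriptor, resolveSchema) are hypothetical, and the inference is deliberately naive; it is not the algorithm used by the library or by other implementations.

```php
<?php
// Hypothetical sketch: a deliberately naive inference, for illustration only.

/** Guess a Table Schema type ("integer", "number" or "string") from one column's raw strings. */
function guessFieldType(array $values): string
{
    $type = null;
    foreach ($values as $value) {
        if ($value === '') {
            continue; // treat empty strings as missing values
        }
        if (preg_match('/^-?\d+$/', $value)) {
            // An integer never widens the guess made so far.
            $type = $type === null ? 'integer' : $type;
            continue;
        }
        if (is_numeric($value)) {
            $type = 'number'; // widen integer to number
            continue;
        }
        return 'string'; // anything else makes the whole column a string
    }
    return $type === null ? 'string' : $type;
}

/** Build a descriptor array from headers and a sample of rows given as lists of strings. */
function inferDescriptor(array $headers, array $rows): array
{
    $fields = [];
    foreach ($headers as $index => $name) {
        $fields[] = [
            'name' => $name,
            'type' => guessFieldType(array_column($rows, $index)),
        ];
    }
    return ['fields' => $fields];
}

/** Use the given descriptor when there is one, otherwise infer it from the sample. */
function resolveSchema($schema, array $headers, array $sampleRows): array
{
    return $schema === null ? inferDescriptor($headers, $sampleRows) : $schema;
}
```

A Table constructor could call something like resolveSchema() internally, so that omitting the second argument is all the user has to do.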

Provide headers?

As a {USER} I'd like to have a $table.headers property (it's a new but useful property in the https://github.com/frictionlessdata/implementations reference).

Provide save method?

$table.save('data.csv') is a useful method in addition to $table.schema.save('schema.json'), and it could be reused at the data package level to save a data package (e.g. as a zip).
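
As an illustration, a save method could boil down to something like the following. The function name saveRowsAsCsv is hypothetical, and the sketch assumes keyed rows such as those the Table iterator emits in the listing above.

```php
<?php
// Hypothetical sketch: saveRowsAsCsv is an illustrative name, not a library function.

/**
 * Write a header line plus keyed rows to a CSV file.
 *
 * @param string   $path    destination file
 * @param array    $headers column names, in output order
 * @param iterable $rows    keyed rows, e.g. ["first_name" => "Ada", ...]
 * @return int number of data rows written
 */
function saveRowsAsCsv(string $path, array $headers, iterable $rows): int
{
    $handle = fopen($path, 'w');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }
    $written = 0;
    try {
        // Delimiter, enclosure and escape are passed explicitly for clarity.
        fputcsv($handle, $headers, ',', '"', '\\');
        foreach ($rows as $row) {
            $line = [];
            foreach ($headers as $header) {
                // Missing keys become empty cells so columns stay aligned.
                $line[] = $row[$header] ?? '';
            }
            fputcsv($handle, $line, ',', '"', '\\');
            $written++;
        }
    } finally {
        fclose($handle);
    }
    return $written;
}
```

Because it accepts any iterable, the same routine could be fed directly from a table iterator without loading all rows into memory.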

Option not to cast data?

This has been a frequently requested feature:

table.read(cast=false) // list of strings

It allows working with malformed data sources and validating them, e.g. field by field with custom error handling.
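
A sketch of what such a flag could enable, written in PHP. The names readRows, castValue and collectErrors are hypothetical, and only the integer type is cast here to keep the example short.

```php
<?php
// Hypothetical sketch: these functions are illustrative, not the library's API.

/** Cast one raw string to the given field type; only "integer" is handled here. */
function castValue(string $value, string $type)
{
    if ($type === 'integer') {
        if (!preg_match('/^-?\d+$/', $value)) {
            throw new UnexpectedValueException("\"$value\" is not an integer");
        }
        return (int) $value;
    }
    return $value; // every other type is left as a string in this sketch
}

/** Return rows as raw strings when $cast is false, otherwise cast each value. */
function readRows(array $rawRows, array $fieldTypes, bool $cast = true): array
{
    if (!$cast) {
        return $rawRows;
    }
    $result = [];
    foreach ($rawRows as $row) {
        $castRow = [];
        foreach ($row as $index => $value) {
            $castRow[] = castValue($value, $fieldTypes[$index] ?? 'string');
        }
        $result[] = $castRow;
    }
    return $result;
}

/** Field-by-field validation over uncast rows, collecting errors instead of stopping. */
function collectErrors(array $rawRows, array $fieldTypes): array
{
    $errors = [];
    foreach ($rawRows as $rowIndex => $row) {
        foreach ($row as $index => $value) {
            try {
                castValue($value, $fieldTypes[$index] ?? 'string');
            } catch (UnexpectedValueException $e) {
                $errors[] = sprintf('row %d, field %d: %s', $rowIndex + 1, $index + 1, $e->getMessage());
            }
        }
    }
    return $errors;
}
```

With casting switched off, a caller gets every row back untouched and can decide per field whether to report, skip, or repair a bad value.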


Regarding the use of the Iterator interface as the Table core:

  • It seems cool, but I have some comments
  • I think we need to document how e.g. read(limit=10) could be achieved
  • Based on the readme, only keyed rows are emitted. Python and JavaScript also support:
    • default rows ['value1', 'value2', ...] - esp. useful with malformed data and cast=false (because the header-values map doesn't work in this case)
    • extended rows [1, ['header1', 'header2'], ['value1', 'value2']] - to get the row number
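
For illustration, here is one way the points above could look with plain SPL classes. An ArrayIterator stands in for the Table, and the helper names (readLimited, toKeyed, toExtended) are hypothetical; LimitIterator and iterator_to_array are standard PHP.

```php
<?php
// Hypothetical sketch: an ArrayIterator stands in for a Table that implements Iterator.

/** Equivalent of read(limit=N): take at most $limit rows from any Iterator. */
function readLimited(Iterator $table, int $limit): array
{
    // The second argument drops the iterator keys so the result is a plain list.
    return iterator_to_array(new LimitIterator($table, 0, $limit), false);
}

/** Keyed row: header => value. Requires as many values as headers. */
function toKeyed(array $headers, array $row): array
{
    return array_combine($headers, $row);
}

/** Extended row: [row number, headers, values]. */
function toExtended(int $rowNumber, array $headers, array $row): array
{
    return [$rowNumber, $headers, $row];
}

$headers = ['id', 'name'];
$table = new ArrayIterator([
    ['1', 'english'],
    ['2', 'chinese'],
    ['3', 'spanish'],
]);

$firstTwo = readLimited($table, 2); // default rows: lists of values
```

Default rows stay usable when a row has the wrong number of values, which is exactly the case where building a keyed row with array_combine breaks down.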
@OriHoch
Collaborator

OriHoch commented Jul 13, 2017

Most of the feedback was addressed in #28.

The additional read options (cast=false, extended / keyed rows) were moved to a separate issue, #29.

@OriHoch
Collaborator

OriHoch commented Jul 13, 2017

merged, available in latest draft release (v0.1.6)

@OriHoch OriHoch closed this as completed Jul 13, 2017