-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Column-to-Field mapping strategy #123
Comments
@roll i get this is tricky. For example, we'd generally like to be able to validate a stream of rows where each row is either an array or a dict (basically either what you get from python csv.reader or csv.DictReader ...). My point in the original issue is not that we must accomodate order stuff but just that we should not ignore something if not there - if we expected a column called name and not there i think we should warn or error. |
So
Also |
* fixed column producing for body context * removed column from schema checks only if name slugs are different * implemented order_fields algo * improved comments
* Rebased on goodtables.next codebase (#118) * removed current codebase * added updated codebase * fixed linting * updated readme * updated readme * updated readme * fixed source checks * added dataset checks * min style change * removed ecode filter from filter_checks * renamed cells back to columns + row_number * added dataset check stubs * implemented dataset checks * fixed linting * moved __inspect_table next to inspect for better reading * fixed list.clear for python2 * added error limit to dataset errors * updated readme * updated readme * added breaking note to readme * added custom checks support * implemented custom profiles * fixed linting * added options order_fields and infer_fields * fixed extra_header * added comments * updated spec * renamed unordered_headers to non_matching_header * renamed col-number to column-number * min * updated added dataset errors to readme * splitted error and check concpets * fixed linting * min * fixed readme * updated readme * updated readme * fixed readme * fixed readme * added guard assertion to checks * updated custom checks API * fixed linting, readme * fixed linting, readme * updated readme * updated readme * typo * moved table errors to Inspector, deleted checks * added ability to profilies to return errors * fixed head checks not columns break * rebased check on in-place erorrs update * rebased profiles on in-place errors * fixed readme * fixed readme * fixed readme * fixed readme * fixed readme * no extra-header error if infer_fields is True * implemented proper non-matching-header without ordering * added custom_profiles, custom_checks arguments instead of global registry * moved default args to signatures * removed make release (use github releases instead!) * added return code to cli * improved cli error formatting * improved error messages, tests * updated examples * added custom examples * added inspector tests * added limit tests * fixed tests * fixed spec link * added checks options to cli * fixed profiles * added description to setup * added entry_points, keywords * moved ckan profile to examples * removed report from spec * updated version to v1 * updated install instruction for now * Fixed jsontableschema-error message (#133) * Rebased on granular tabulator exceptions (#115) * updated dependencies * rebases on new tabulator exceptions * Renamed profile to preset with simplified API (#124) * renamed profile to preset * removed errors, tables arguments from preset * Added infer_schema option, updated preset API (#128) * minor improvements * added infer_schema option false by default * Added support for schema constraints (#55) * updated jsontableschema version * implemented all constraints except unique * implemented unique constraint check * updated to jsontableschema-v0.8.2 * fixed linting * Added tables preset (#125) * added tables preset * fixed linting * added tables test * Implemented order_fields option (#123) * fixed column producing for body context * removed column from schema checks only if name slugs are different * implemented order_fields algo * improved comments * Rebased on external spec (#131) Rebased on external spec * Improved tests (#127) * added prev version data examples * implemented feature tests * moved files to data * updated verson * added features * marked spec test as xfail * Rebased on spec-v1.0.0-alpha1 (#131) * updated spec, added spec to API * added config with checks order * rebased in inspector on updated spec * updated @check API * rebased on spec message templates * fixed cutom checks * fixed linting * Updated readme note
Overview
We have a list of columns and list of fields. There are two ways to link them:
There is a check for it - https://github.com/frictionlessdata/goodtables-py/blob/next/goodtables/checks/head/non_matching_header.py - any logic could be implemented.
For now it uses current goodtables approach checking names using order but removing column from inspection if there is non matching name. Also there is not implemented
order_fields
option. But this approach could lead to problems described below. It could be solved removing whitespaces from names.But also may be instead of
order_fileds
to use option to set mapping strategy:Problems
@rgrp has wrote:
Tasks
non_matching_headers(order_fields=True)
raisesNotImplementedError
but it should re-order fields iforder_fields
option is True. Needed some tricky algorithm because headers and fields could be duplicated etc. [IF NEEDED]The text was updated successfully, but these errors were encountered: