Registry of published datasets in the Datasets Project, as a datapackage
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
test
.env.template
.gitignore
README.md
core-list-testing.csv
core-list.csv
datapackage.json
index.js
package.json

README.md

Core data registry and tooling.

Registry

Registry is maintained as Tabular Data Package with list of datasets in core-list.csv.

To add a dataset add it to the core-list.csv - we recommend fork and pull.

Discussion of proposals for new datasets and for incorporation of prepared datasets takes place in the issues.

To propose a new dataset for inclusion, please create a new issue.

Core Dataset Tools

Installation

$ npm install

Usage

  • Environmental variables

DOMAIN - testing or production environment. For example: https://datahub.io TYPE - type of dataset. For example: examples or core

node index.js [COMMAND] [PATH]

# PATH - path to csv file

Clone datasets

To clone all core datasets run the following command:

node index.js clone [PATH]

It will clone all core datasets into following directory: data/${pkg_name}

Check datasets

To check all core datasets run the following command:

node index.js check [PATH]

It will validate metadata and data according to the latest spec.

Normalize datasets

To normalize all core datasets run the following command:

node index.js norm [PATH]

It will normalize all core datasets into following directory: data/${pkg_name}

Push datasets

To publish all core data packages run the following command:

node index.js push [PATH]

Running tests

We use Ava for our tests. For running tests use:

$ [sudo] npm test

To run tests in watch mode:

$ [sudo] npm run watch:test