CSV File Field Value Counter

John Wieczorek edited this page Nov 8, 2016 · 2 revisions

This workflow:

  • creates a given directory as a workspace
  • takes a CSV or TXT file as input
  • takes an output file name as input
  • takes an output file format ('csv' or 'txt') as input
  • downloads vocabulary lookup files from https://github.com/kurator-org/kurator-validation/tree/master/packages/kurator_dwca/data/vocabularies
  • extracts the core file of a Darwin Core Archive to a tab-separated text file
  • for each field in the list of Darwin Core Controlled Value fields (see below), creates a report of counts of distinct values
  • for each field in the list of Darwin Core Controlled Value fields (see below), creates a report of recommended values for values that are not standard
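
The last two steps can be illustrated with a minimal Python sketch. It is not the kurator_dwca implementation: the function names, the sample rows, and the shape of the lookup (a plain mapping from an observed value to a recommended standard value) are all assumptions made for illustration, and the real vocabulary lookup files may be structured differently.

```python
import csv
import io
from collections import Counter


def count_distinct_values(rows, field):
    """Count how often each distinct value of `field` occurs in `rows`."""
    return Counter(row.get(field, "") for row in rows)


def recommend_values(counts, lookup):
    """Map each non-standard value to its recommended replacement.

    `lookup` is a hypothetical dict of observed value -> standard value.
    Values with no entry in the lookup map to None.
    """
    standard = set(lookup.values())
    return {
        value: lookup.get(value)
        for value in counts
        if value not in standard
    }


# Made-up sample data standing in for the input CSV file.
sample = io.StringIO(
    "catalogNumber,basisOfRecord\n"
    "1,PreservedSpecimen\n"
    "2,preserved specimen\n"
    "3,PreservedSpecimen\n"
    "4,fossil\n"
)
rows = list(csv.DictReader(sample))

# Hypothetical vocabulary lookup for the basisOfRecord field.
lookup = {
    "preserved specimen": "PreservedSpecimen",
    "fossil": "FossilSpecimen",
}

counts = count_distinct_values(rows, "basisOfRecord")
recommendations = recommend_values(counts, lookup)
```

Here `counts` holds the tally of distinct values for one field, and `recommendations` pairs each value that is not already standard with its suggested replacement. The workflow itself would repeat this for every controlled value field and write the results to report files.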

The files produced by this workflow are:

References

Workflow configuration file: https://github.com/kurator-org/kurator-validation/blob/master/packages/kurator_dwca/workflows/file_controlled_term_assessor.yaml

Darwin Core Controlled Value lookup files: https://github.com/kurator-org/kurator-validation/tree/master/packages/kurator_dwca/data/vocabularies

Darwin Core Controlled Value fields (from http://rs.tdwg.org/dwc/terms/index.htm):
