DataHarmonizer Getting Started

Rhiannon Cameron edited this page Jul 10, 2023 · 2 revisions

Basic operations

The DataHarmonizer contextual data collection template and validator application lets users enter data as they would in an Excel spreadsheet: type values, pick values from dropdowns, copy/paste, add rows, etc. You can edit cells manually, or upload .xlsx, .xls, .tsv, .csv and .json files via File > Open.

edit copy paste delete

Change template

The default template loaded is the "CanCOGeN Covid-19" template. To change the spreadsheet template, select the white text box to the right of Template (it always contains the name of the currently active template), or navigate to File > Change Template. An in-app window will appear that allows you to select from the available templates in a drop-down menu. After selecting the desired template, click Open to activate it.

change template

Toggle required columns

Fields are color-coded: required fields (minimal metadata) are yellow, strongly recommended fields are purple, and optional fields are white/gray. You can toggle between showing all fields, only the required fields, and the required + recommended fields by using the options under Settings (Show required columns, etc.).

toggle required columns

Field header information

Double-click a field header to see the definition of the field, guidance on filling it in, and examples of how data should be structured to satisfy the constraints of the validator.

double click headers

Selecting picklist values

Picklists of controlled vocabulary are available for many fields. There are dropdown menus for some, and multi-select options for others (e.g. Signs and Symptoms). You can also see the list of terms in multi-select columns by clicking once in the multi-select box.

selecting values

Validating

When you’re done entering your data, you can validate values by clicking on Validate. Errors will be colored light red, and missing information in required fields will be colored dark red. You can navigate to and between errors by clicking the Next Error button that appears when errors are present. After resolving these errors, revalidate to see if any remain. If no errors remain, the Next Error button will change to No Errors and fade away.

validating
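
The two highlight categories can be illustrated with a minimal Python sketch. This is not DataHarmonizer's actual validation code: the `REQUIRED` set, the `PICKLISTS` mapping, the field names, and the functions `classify_cell` and `validate_row` are all hypothetical, and the only rule checked besides emptiness is picklist membership.

```python
# Hypothetical illustration only; not DataHarmonizer's real validator.
REQUIRED = {"sample ID", "collection date"}        # assumed required fields
PICKLISTS = {"host": {"Human", "Bat", "Mink"}}     # assumed controlled vocabulary


def classify_cell(field, value):
    """Return 'missing' (dark red), 'error' (light red) or 'ok'."""
    text = (value or "").strip()
    if not text:
        # An empty cell is only a problem when the field is required.
        return "missing" if field in REQUIRED else "ok"
    allowed = PICKLISTS.get(field)
    if allowed is not None and text not in allowed:
        return "error"
    return "ok"


def validate_row(row):
    """Classify every field of a row and keep only the problem cells."""
    fields = set(row) | REQUIRED  # required fields count even if absent
    results = {f: classify_cell(f, row.get(f)) for f in fields}
    return {f: r for f, r in results.items() if r != "ok"}
```

Under these assumptions, a row with an empty collection date and a host value outside the picklist would report one "missing" cell and one "error" cell, which correspond to the dark red and light red highlights.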

Toggle rows

To facilitate editing data post-validation, you can change your view to only Show valid rows or only Show invalid rows by selecting these options under the Settings menu. You can select Show all rows to return to the original view.

show rows

Show section

You can quickly navigate to a section of fields by selecting Settings and then choosing one of the options listed under Show section.

show section

Jump to column

You can quickly navigate to a column by selecting Settings > Jump to…. An in-app window will appear; select the desired column header from the drop-down list, or begin typing its name to narrow down the list options. Selecting the column header from the drop-down list will immediately move you to that column in the spreadsheet.

jump to column

Fill column with specified value

You can also automatically fill a column with a specified value, but only in rows that have a value in the first “sample ID” column. To use this feature, select Settings > Fill column.... Select the desired column header from the drop-down list, or begin typing its name to narrow down the list options, then specify the value to fill with and click “Ok” to apply.

fill column
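
The row-selection rule can be sketched in a few lines of Python. The function `fill_column` and its parameters are hypothetical, and the sketch assumes that existing values in the target column are overwritten, which this guide does not state.

```python
def fill_column(rows, column, value, id_column="sample ID"):
    """Set `column` to `value` in every row whose ID cell is non-empty.

    Hypothetical sketch: rows are plain dicts, and existing values in
    `column` are overwritten (an assumption, not documented behaviour).
    Returns the number of rows that were filled.
    """
    filled = 0
    for row in rows:
        if (row.get(id_column) or "").strip():
            row[column] = value
            filled += 1
    return filled
```

Rows with an empty “sample ID” cell are skipped, so blank rows at the bottom of the sheet would stay blank.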

Exporting

When you’ve entered and validated your data, you can save it as an .xlsx, .xls, .tsv, .csv, or .json file by clicking File > Save as. You can also format your data for IRIDA upload (you will need to perform the upload yourself), or use your entered data to create a GISAID submission form (you will need to add additional information and perform the upload yourself), by clicking File > Export to.

exporting files
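
If you want to process a saved .tsv file outside the application, a minimal Python sketch using only the standard library might look like the following. The function `read_saved_tsv` is hypothetical, and the number of header lines in a saved file is an assumption exposed as the `header_rows` parameter; check your own file before relying on it.

```python
import csv
import io


def read_saved_tsv(text, header_rows=1):
    """Parse tab-separated text into a list of dicts.

    The last of the `header_rows` leading lines is taken as the field
    names (an assumption about the file layout); empty records are skipped.
    """
    lines = list(csv.reader(io.StringIO(text), delimiter="\t"))
    fields = lines[header_rows - 1]
    return [dict(zip(fields, record)) for record in lines[header_rows:] if any(record)]
```

For a file on disk, the text can be obtained by opening the file with `encoding="utf-8"` and `newline=""` and calling `read()`.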

Provenance

If your template uses the DataHarmonizer provenance field, provenance information will automatically be added upon validation to all rows containing data. Provenance information includes the DataHarmonizer application version number and the template schema version number (if available).

provenance
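
As a rough illustration of this behaviour, the Python sketch below stamps a provenance value on every row that contains data. The function `add_provenance`, the field name, and the format of the stamp are hypothetical; the string DataHarmonizer actually writes may look different.

```python
def add_provenance(rows, app_version, schema_version=None,
                   field="DataHarmonizer provenance"):
    """Write a provenance stamp into every row that contains data.

    Hypothetical sketch: the stamp format is an assumption. The schema
    version is included only when one is available.
    """
    parts = [f"DataHarmonizer v{app_version}"]
    if schema_version:
        parts.append(f"schema v{schema_version}")
    stamp = "; ".join(parts)
    for row in rows:
        # A row "contains data" if any cell other than the provenance
        # field itself holds non-blank text.
        has_data = any((v or "").strip() for k, v in row.items() if k != field)
        if has_data:
            row[field] = stamp
    return rows
```

Empty rows are left untouched, which matches the description that only rows containing data receive provenance information.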

More information

You can find more information about a template under the Help button. The Reference Guide includes additional information about fields, and the SOP (Standard Operating Procedure) contains additional information about curating data.

For a reference guide that includes picklist values, please contact us.

more info

Make custom templates

For information on how to build a custom DataHarmonizer template, please refer to the application development repository: https://github.com/cidgoh/DataHarmonizer/wiki/DataHarmonizer-Templates

Version info & updates

Make sure you are using the latest version of the DataHarmonizer (and your template). You can check your current DataHarmonizer template version by looking at the bottom of the Settings menu. To update your application and/or template, select Help > Get latest release.

If you have questions, or require assistance, contact emma_griffiths@sfu.ca.

version update