title | tags | slideOptions
---|---|---
Excel | presentation |
Questions:
- How can we carry out basic quality control and quality assurance in spreadsheets?
Objectives:
- Apply quality control techniques to identify errors in spreadsheets and limit incorrect data entry.
- Don't forget to keep the original data pristine and create a `readme.txt` file
- Helpful resource: Cornell University's guidance on creating `readme.txt` files
- Invalid values often sort to the bottom or top of a column
- Sort the data on each column one at a time to root out poor data
- Use with caution
- Can be a good strategy for flagging outliers
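
The sorting tip above can be illustrated outside a spreadsheet as well. The sketch below is only an illustration, assuming a hypothetical `weights` column held as text; the values and the `sort_key` helper are made up for the example. It shows how sorting pushes placeholder values such as `-999` to one end and non-numeric entries to the other, where they are easy to spot.

```python
# Hypothetical column of weights (grams) as typed into a spreadsheet.
# The values are invented for illustration only.
weights = ["42", "37", "", "-999", "51", "N/A", "4O", "39"]


def sort_key(value):
    """Order numeric entries by value and push non-numeric entries to the end."""
    try:
        return (0, float(value))
    except ValueError:
        # Blank cells, text such as "N/A", and typos such as "4O" land here.
        return (1, 0.0)


sorted_weights = sorted(weights, key=sort_key)
print(sorted_weights)
# ['-999', '37', '39', '42', '51', '', 'N/A', '4O']
```

The suspicious `-999` placeholder ends up at the top and the blank, `N/A`, and mistyped `4O` entries collect at the bottom, which mirrors what sorting a column does in a spreadsheet.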
Questions:
- How can we export data from spreadsheets in a way that is useful for downstream applications?
Objectives:
- Store spreadsheet data in universal file formats.
- Export data from a spreadsheet to a `.csv` file.
- We do not recommend using `.xls`, `.xlsx` (Excel), or `.ods` (LibreOffice) file formats
- More and more journals and grant agencies are requiring researchers to deposit data in a data repository (most of them do not accept Excel formats)
- Storing data in universal, open, and static formats keeps data consistent and reduces the possibility of error
- Tab-delimited (`.tsv`) files
- Comma-delimited (`.csv`) files
- Most software recognizes these formats
- The encoding is simple, so they will continue to be interpretable into the future
- These formats work well with version-control software like Git
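
To show how simple these formats are, here is a minimal sketch that writes the same small table as comma-delimited and tab-delimited text using Python's standard `csv` module. The table contents and the `to_delimited` helper are hypothetical and exist only for this example; in practice the data would come from a spreadsheet export.

```python
import csv
import io

# Hypothetical table: a header row followed by two records.
rows = [
    ["plot_id", "species", "notes"],
    ["1", "DM", "healthy"],
    ["2", "DO", "found dead, tagged"],
]


def to_delimited(table, delimiter):
    """Return the table as delimited text using the given separator."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(table)
    return buffer.getvalue()


csv_text = to_delimited(rows, ",")   # comma-delimited (.csv)
tsv_text = to_delimited(rows, "\t")  # tab-delimited (.tsv)

print(csv_text)
print(tsv_text)
```

In the comma-delimited output the note containing a comma is wrapped in double quotes so it still reads back as a single field, while the tab-delimited output needs no quoting for it. Both results are plain text, which is why they work well with tools like Git.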