Pinsky Lab Data Management Plan
The Pinsky Lab is an open science lab that believes in reproducible science. To that end:
- We keep lab notebooks to record what we did, learned, or produced each day. These can be physical notebooks, text files, Evernote, Jupyter notebooks, etc.
- We do our data analysis in GitHub repositories in the pinskylab organization to facilitate collaboration and sharing within the lab.
- We keep our raw data in the GitHub repository related to the project, unless the data files are too large.
- We generally use a folder called "data" within the repository.
- We store our raw data with metadata describing what’s in the file and what the columns mean.
- If we clean the data, we often use a folder called something like "data-raw" and a folder called "data-clean" to differentiate data in its original form from data that has been manipulated.
- If we are using data downloaded from another data source, we often have a folder called "data_dl" for the downloaded data, which is not tracked in GitHub. We include the data source in a README file for reproducibility.
- If our data are too large to store on GitHub, we store them on the lab server in /local/shared/pinskylab/.
- We back up our data in at least two places (beyond our computers). The cloud can be one site. An external hard drive can be another. Amphiprion is backed up off-site daily.
- We use scripts to process data, make models, do analyses, etc.
- We publish git repositories through Zenodo upon publication of a manuscript.
- Collaborative manuscripts are written in Google Drive.
- We generally make presentations in Google Slides so that other lab members can easily access useful graphics.
- When leaving the lab (for example, upon graduating), make sure all projects, code, papers, data, etc. are in the Pinsky Lab organization on GitHub or Google Drive.
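
The folder conventions above ("data-raw", "data-clean", and an untracked "data_dl" with a README naming the data source) could be set up in a new repository with a short script. What follows is only a minimal sketch, not an official lab tool. The function name `scaffold_data_dirs`, the placement of the folders at the repository root, the README file name, and the `.gitignore` patterns are all assumptions made for illustration.

```python
from pathlib import Path

# Folder names come from the conventions above. Placing them at the
# repository root is an assumption of this sketch.
TRACKED_DIRS = ["data-raw", "data-clean"]
DOWNLOAD_DIR = "data_dl"


def scaffold_data_dirs(repo_root, source_note="TODO: record where the downloaded data came from"):
    """Create the data folders and keep downloaded files out of version control.

    Hypothetical helper: safe to run more than once on the same repository.
    """
    root = Path(repo_root)
    for name in TRACKED_DIRS + [DOWNLOAD_DIR]:
        (root / name).mkdir(parents=True, exist_ok=True)

    # A README that names the data source, so the download can be repeated.
    readme = root / DOWNLOAD_DIR / "README.md"
    if not readme.exists():
        readme.write_text(f"# Downloaded data\n\nSource: {source_note}\n")

    # Ignore everything inside data_dl except its README (assumed patterns).
    gitignore = root / ".gitignore"
    existing = gitignore.read_text().splitlines() if gitignore.exists() else []
    wanted = [f"{DOWNLOAD_DIR}/*", f"!{DOWNLOAD_DIR}/README.md"]
    missing = [line for line in wanted if line not in existing]
    if missing:
        gitignore.write_text("\n".join(existing + missing) + "\n")
    return root
```

Calling `scaffold_data_dirs(".")` from the top of a repository would create any missing folders and leave existing files untouched. The README inside "data_dl" stays tracked because the ignore pattern excludes the folder's contents rather than the folder itself.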