four week intro to data management: spreadsheets, project organization, metadata, reproducibility

Data for Data Science


Researchers face a growing data management challenge that starts with data collection and continues through data analysis, publication, and archival. Research labs may struggle to scale their data management methods to many or very large data files, to fully document data and its organization, and to meet grant and publication requirements for data sharing. This four-week course introduces attendees to best practices in data organization and management. Each one-hour session will include lecture, discussion, and practice exercises. The course assumes no prior training in data science. By the end of the course, you will be able to identify resources at Fred Hutch for data management and apply best practices in data organization to your own research projects.

Software requirements for this course can be found on the Software page.


  • Week 1: Data entry and creating spreadsheets
  • Week 2: Organizing data and project files
  • Week 3: Documenting data with metadata
  • Week 4: Data manipulation and reproducibility


  • Each week of class has a directory containing relevant materials, including:
    • WeekX_Topic.pdf: PDF of the presentation slides, where X indicates the week of class
    • Notes to guide the instructor's presentation and the class activities

Further reading
