Avoiding data disasters

Outline

It has been said that 80% of the time in data analysis is spent cleaning and preparing the data. Not only does this represent a significant time investment for the data analyst, but it is also often a hurdle for the non-specialist trying to get to grips with analysing their own data after attending an R or Python course. Despite the best intentions, a spreadsheet that is intuitive and easily understood by human eyes can lead to disaster when we try to process it computationally.

This workshop covers the basic principles that we can all adopt to work with data more effectively and "think like a computer". We will also discuss best practices for data management and organisation, so that our research is auditable and reproducible by ourselves and others in the future.

Timetable

12:30 - 13:00 Andy - Philosophical introduction
13:00 - 14:30 Sergio / Valeria - Typical problems talk + practical
14:30 - 15:00 Anne - File management
15:00 - 15:30 Peter - Strategies for backup

Folders and files

images
20161205_FileManagement.pdf
BackupforDataDisasters.pdf
PRE_RDMCRUK_V1_20161205.pptx
Presentation20161205.pdf
Presentation20161205.pptx
README.md
description.md
fake-data.Rmd
index.md
patient-data-cleaned.tsv
patient-data.txt
practise1.xlsx
practise1_correct.csv
practise2.xlsx
practise2_correct.csv
principles.Rmd
principles.html
refine-demo.Rmd
refine-demo.html

bioinformatics-core-shared-training/avoid-data-disaster
