Course materials for a series of lectures that give an introduction to working with data and computation.

be-green/data-class

Data class

These course materials are designed as an introduction to computation and working with data. They will include organized data examples, a set of slides with worked examples, notebooks, and similar materials.

Since I'm most comfortable programming in R, this class will mostly be taught using R.

This is definitely a work in progress; if you have thoughts, suggestions, or additions, please open a GitHub issue.

Topics

The topics below are in no particular order and are grouped by theme. As lectures come together, I'll regroup them by slide deck. Each group of topics will have a set of slides with associated example scripts and data.

Data

  • Organizing and structuring datasets
  • Tidy data frameworks and key-value pairs
  • Web and other non-tabular formats
  • Visual checks and identifying mistakes
  • Unit tests and assertions for data
  • Safe joining and merging of datasets

Visualization

  • A layered grammar of graphics, ggplot2, etc.
  • Interactive plotting tools (plotly, highcharts)
  • JavaScript libraries, htmlwidgets (DT, leaflet)
  • Simple interactive web applications with shiny

Reproducible Research

  • Structuring your project as a pipeline
  • Notebook tools for exploration and writing (e.g. Jupyter, R Markdown)
  • Documentation tools, managing requirements
  • Version control and collaboration with git

Programming

  • Writing and documenting code
  • Unit testing
  • Functional programming
  • Continuous integration tools

Web stuff

  • Working with APIs
  • Web data structures (JSON & XML)

Maybe other stuff

  • Developing packages?

Other resources

Here are other good guides and tools people have put together. Most of my material is either learned from or conceptually indebted to these resources. If I've missed something, please submit an issue or pull request!

Learning R:

Specific tools:

Reproducible Research Frameworks:

Coding resources:

Misc:

License

The content of this project itself is licensed under the Creative Commons Attribution 4.0 license, and the underlying source code used to generate that content is licensed under the MIT license.
