mdlama/Data_Manipulation

Tutorial: Data Manipulation with dplyr

🚀 Launch Tutorial

Course Description

Figure 1: The data life cycle. Image from “R for Data Science”

Figure 1 shows a model of the data life cycle from the free online book “R for Data Science”. The whirlpool in the middle, besides making you dizzy, I’m sure, is one of the most rewarding parts of the process: data exploration! There are some important take-home points from the “Explore” portion of the diagram:

  1. Data might require transformation before proper visualization.
  2. It is best to visualize your data before you do statistical analysis (also known as statistical modeling).
  3. Rarely, if ever, will you complete this cycle only once.
  4. Iterate towards perfection. In other words, don’t worry about perfectly formatted, publication-quality graphs and results. Exploration should be full of fun and curiosity. Save the publication-quality results for the “Communicate” step.

In this tutorial, we will introduce you to the tidyverse and start with the “Transform” part of the life cycle by exploring dplyr, the tidyverse package for data manipulation (also known as data wrangling).
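To give a flavour of what a “Transform” step can look like, here is a minimal sketch using a few common dplyr verbs. It is not taken from the tutorial itself: the `mtcars` data set (bundled with R), the `hp > 100` cut-off, and the column names `kpl` and `mean_kpl` are assumptions chosen only for illustration.

```r
library(dplyr)

# mtcars ships with base R; it is used here only as stand-in data.
cars_summary <- mtcars %>%
  filter(hp > 100) %>%                 # keep only rows with more than 100 horsepower
  mutate(kpl = mpg * 0.425144) %>%     # add a column: miles per US gallon -> km per litre
  group_by(cyl) %>%                    # one group per cylinder count
  summarise(
    n = n(),                           # number of cars in each group
    mean_kpl = mean(kpl),              # average fuel economy in each group
    .groups = "drop"
  ) %>%
  arrange(desc(mean_kpl))              # most economical group first

print(cars_summary)
```

Each verb takes a data frame and returns a new one, so steps can be added, removed, or reordered freely while exploring.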

Built with

Quarto Live - WebAssembly-powered code blocks and exercises for Quarto HTML documents. Quarto is the “next generation” of R Markdown. Quarto Live uses WebR, which runs R code directly in the browser with no need for a server.

Acknowledgments

  • This course is based on R for Data Science: Data Transformation

License

This project is licensed under the MIT License - see the LICENSE file for details.
