This course introduces tools for working with data, focusing on R and the R ecosystem. It begins by showing you how to set up R and RStudio, then covers R packages, functions, data structures, control flow, and loops. Once you are comfortable with the basics, the course moves on to data visualization and graphics, where you learn how to build statistical and advanced plots using the ggplot2 library. You also learn data management concepts, such as factors, pivots, aggregation, merging, and dealing with missing values. By the end of this course, you will be ready to complete an entire data science project of your own for your portfolio or blog.
- Use basic R programming concepts, such as loading packages, arithmetic functions, data structures, and flow control
- Import data into R from various sources, such as CSV files, Excel workbooks, and SQL databases
- Clean data by handling missing values and standardizing fields
- Perform univariate and bivariate analysis using ggplot2
- Create statistical summary plots and advanced plots, such as histograms, scatter plots, box plots, and interaction plots
- Apply data management techniques, such as factors, pivots, aggregation, merging, and dealing with missing values, to the example data sets
For an optimal student experience, we recommend the following hardware configuration:
- Processor: 2.6 GHz or higher, preferably multi-core
- Memory: 4 GB RAM
- Hard disk: 10 GB or more
- An Internet connection
You’ll also need the following software installed in advance:
- Operating system: Windows 8 or higher
- R and RStudio