Ask the right questions, manipulate data sets, and create visualizations to communicate results - Coursera

extwiii/DataScience-JHU

DataScience-Johns.Hopkins.University ✅

Course 1 - The Data Scientist’s Toolbox

  • Welcome
    • Introduction to basic tools
  • Installing the Toolbox
    • R, Git, GitHub
  • Conceptual Issues
    • Steps in a data analysis, Putting the science in data science
  • Course Project Submission & Evaluation

Course 2 - R Programming

  • Background, Getting Started, and Nuts & Bolts
  • Programming with R
    • Derek Franks has written a very nice tutorial to help you get up to speed
  • Loop Functions and Debugging
    • lapply, apply, tapply, split, mapply
  • Simulation & Profiling
    • Simulate a random normal variable with an arbitrary mean and standard deviation
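As a rough illustration of the last two topics, the sketch below simulates a normal variable with a chosen mean and standard deviation and then summarises it with loop functions. It is not taken from the course materials; the mean of 10, the standard deviation of 3, the seed, and the two group labels are all assumptions picked for the example.

```r
# Fix the seed so the simulated draws are reproducible.
set.seed(42)

# Draw 1,000 values from a normal distribution with mean 10 and sd 3
# (both values are arbitrary choices for this sketch).
x <- rnorm(1000, mean = 10, sd = 3)

# The sample mean and standard deviation should land near 10 and 3.
sample_mean <- mean(x)
sample_sd <- sd(x)

# Hypothetical grouping: label the first 500 draws "a" and the rest "b".
group <- rep(c("a", "b"), each = 500)

# tapply applies a function to x within each level of group.
group_means <- tapply(x, group, mean)

# split breaks x into a list by group; lapply then applies sd to each piece.
group_sds <- lapply(split(x, group), sd)
```

Printing `sample_mean`, `sample_sd`, `group_means`, and `group_sds` shows how close the simulated values come to the parameters passed to `rnorm`.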

Course 3 - Getting and Cleaning Data

  • Data collection & Data formats
    • Raw files (.csv, .xlsx), Databases (MySQL), APIs, Flat files (.csv, .txt), XML, JSON
  • Making data tidy
  • Distributing data
  • Scripting for data cleaning

Course 4 - Exploratory Data Analysis

  • Making exploratory graphs
  • Principles of analytic graphics
  • Clustering methods
  • Dimension reduction techniques

Course 5 - Reproducible Research

  • Concepts, Ideas, & Structure
  • Markdown & knitr
  • Reproducible Research Checklist & Evidence-based Data Analysis
  • Case Studies & Commentaries

Course 6 - Statistical Inference

  • Probability & Expected Values
  • Variability, Distribution, & Asymptotics
  • Intervals, Testing, & P-values
  • Power, Bootstrapping, & Permutation Tests

Course 7 - Regression Models

  • Least Squares and Linear Regression
  • Linear Regression & Multivariable Regression
  • Multivariable Regression, Residuals, & Diagnostics
  • Logistic Regression and Poisson Regression

Course 8 - Practical Machine Learning

  • Prediction, Errors, and Cross Validation
  • The caret Package
  • Predicting with trees, Random Forests, & Model Based Predictions
  • Regularized Regression and Combining Predictors

Course 9 - Developing Data Products

  • Shiny, googleVis, and Plotly
  • R Markdown and Leaflet
  • R Packages
  • swirl and Course Project

The most up-to-date information on the course lecture notes will always be in the course GitHub repository.

Taught by:

Roger D. Peng, PhD - Associate Professor, Biostatistics

Brian Caffo, PhD - Professor, Biostatistics

Jeff Leek, PhD - Associate Professor, Biostatistics

Rating 🌕🌕🌕🌕🌕🌕🌕🌑🌑🌑

Difficulty 🌕🌕🌕🌕🌕🌕🌑🌑🌑🌑

Created by Bilal Cagiran | E-Mail | GitHub | LinkedIn | CodePen | Blog/Site | FreeCodeCamp
