# Add infer to Chapters 9-11 and DataCamp courses. Big additions to Chapters 8 and 12.

ismayc released this 22 Jul 21:36
· 1809 commits to master since this release
`afba032`

# ModernDive 0.4.0

## Highlights

1. The `infer` package is ready for prime time! We made a first pass at incorporating it into Chapters 9 and 10 on confidence intervals and hypothesis testing.
2. Chapter 12 on "Thinking with Data" now includes a case study using the Seattle house prices dataset on Kaggle.com. Chapters 3 and 4 from the new "Modeling with Data in the Tidyverse" DataCamp course by Albert Y. Kim are based on this analysis!
3. Speaking of DataCamp, we point readers to various DataCamp courses that directly align with various chapters in the book!
4. We significantly cleaned up Chapter 8 on sampling! In particular, we added a 2013 Obama approval rating poll example to tie in with our tactile and virtual sampling bowl simulations, and made it very clear that ultimately we are performing statistical inference via sampling.
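
For readers who have not yet seen `infer`, the kind of pipeline the first highlight refers to can be sketched roughly as follows. This is a minimal sketch, not code taken from the book: it assumes a recent version of `infer`, uses the built-in `mtcars` data as a stand-in dataset, and the object names `boot_means` and `ci` are purely illustrative.

```r
library(dplyr)  # loaded here for the %>% pipe
library(infer)  # assumes a recent version of infer is installed

set.seed(2018)  # make the resampling reproducible

# Bootstrap distribution of the sample mean of mpg
# (mtcars is a stand-in dataset, not one used in the release notes)
boot_means <- mtcars %>%
  specify(response = mpg) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")

# 95% percentile-based confidence interval for the mean
ci <- boot_means %>%
  get_confidence_interval(level = 0.95, type = "percentile")

ci
```

The same `specify()`, `generate()`, `calculate()` sequence carries over to hypothesis testing by adding a `hypothesize()` step and switching the `generate()` type, which is presumably why the package fits both Chapters 9 and 10.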

## All content changes

• Introduction: Added a section mapping chapters to the corresponding DataCamp courses. Links to the relevant DataCamp courses are also included at the outset of each chapter.
• Chapter 3 - Data visualization:
  • Added simplified `geom_jitter()` example
  • Expanded the explanation of how whiskers and outliers are constructed in `geom_boxplot()`
  • Added summary table of all 5 named graphs
• Chapter 4 - Tidy data:
  • Added section on importing Excel data via RStudio
  • Added example of tidy vs non-tidy: `fivethirtyeight::drinks`
• Chapter 5 - Data wrangling:
  • Added a data wrangling case study on computing available seat miles
  • Abandoned the "5 Main Verbs" (5MV) notion
  • Added `_join()` and `group_by()` with multiple variables
• Chapter 6 - Basic regression:
  • Clarified explanations of indicator/dummy variables when using a categorical variable in regression
  • Expanded the "Correlation is not necessarily causation" subsection with the example "does sleeping with shoes on cause headaches?", including a causal diagram
  • Introduced the concept of a "wrapper function" alongside the `moderndive::get_regression_table()` function
  • Replaced all `base::summary()` with `skimr::skim()` for quick numerical summaries
• Chapter 7 - Multiple regression:
  • Replaced all "everything else being equal" interpretation statements with "taking into account/controlling for all other variables in our model"
• Chapter 8 - Sampling:
  • Significantly cleaned up sampling terminology and definitions, and made it clearer that we are sampling for inference
  • Reorganized the section and subsection structure into:
    1. Tactile sampling simulation
    2. Virtual sampling simulation
    3. Real-life sampling: introduced an example of a 2013 Obama approval rating poll and tied everything back to the sampling bowl
• Major overhaul: Chapter 9 - Confidence intervals
• Major overhaul: Chapter 10 - Hypothesis testing
• Chapter 11 - Inference for Regression:
  • Added a simple linear regression example using the `infer` package
• Major overhaul: Chapter 12 - Thinking with data:
  • Added a case study of the Seattle house prices dataset from Kaggle, which is now available as the `house_prices` data frame in the `moderndive` package
    1. Chapters 3 and 4 from the new "Modeling with Data in the Tidyverse" DataCamp course are based on this analysis
    2. Includes a discussion on the importance of `log10`-transformations
    3. Introduces modeling/regression for prediction: predicting house prices
  • Laid out an outline for "effective data storytelling" using `fivethirtyeight` data and added one small example using US births data
  • At the beginning of the chapter, we now come full circle and revisit the discussion of the ModernDive flowchart from the introduction
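
As a concrete companion to the Chapter 3 item on boxplot whiskers and outliers, the rule documented for `ggplot2`'s boxplot statistic (whiskers reach the most extreme points within 1.5 × IQR of the hinges, and anything beyond is drawn as an outlier) can be sketched in base R. This is an illustrative sketch with made-up numbers, not an excerpt from the book; the vector `x` and all variable names are hypothetical.

```r
# Hypothetical data with one unusually large value
x <- c(2, 4, 5, 6, 7, 8, 9, 30)

# Hinges: first and third quartiles (quantile() default, type 7)
q <- unname(quantile(x, c(0.25, 0.75)))
iqr <- q[2] - q[1]

# Fences sit 1.5 * IQR beyond each hinge
lower_fence <- q[1] - 1.5 * iqr
upper_fence <- q[2] + 1.5 * iqr

# Whiskers end at the most extreme observations inside the fences
lower_whisker <- min(x[x >= lower_fence])
upper_whisker <- max(x[x <= upper_fence])

# Observations outside the fences are plotted individually as outliers
outliers <- x[x < lower_fence | x > upper_fence]

c(lower_whisker = lower_whisker, upper_whisker = upper_whisker)
outliers
```

Here only the value 30 falls beyond the upper fence, so the upper whisker stops at 9 and 30 would be drawn as a separate point.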