A useful overview of lesser-used dplyr functions. The first post in a series of four.
A video of Hadley Wickham showing how he does exploratory data analysis.
The official pandas docs sometimes leave a bit to be desired, but the tutorials and exercises here are very useful.
The excellent book by Hadley Wickham about all things R related.
This package allows you to create graphs declaratively, similar to ggplot. However, it is made for Python and sits on top of D3.js, so interactive plots can be made easily.
I haven't tried this, but it looks like an interesting tool for interactively building plots without code.
A great resource not only for learning what makes a good visualization, but also for checking your own figures.
This is a transcript of an excellent talk from Jonathan Corum, the science graphics editor at the New York Times.
An interactive ggplot creator, useful for ggplot beginners.
Storytelling With Data is a blog all about improving your visualizations. There are often figure fix-ups and reader competitions.
Learning to Program
In my opinion, it's the best intro-to-Python book that I've come across.
Codecademy is sometimes criticized because it leaves beginners without an understanding of how to actually write a program. However, as a bare-bones intro or syntax refresher I think it does a great job. Just pair it with a book afterwards.
SQL is great if you are working with larger datasets, or even just for understanding the philosophy underneath many data analysis tools. Mode's tutorial is all interactive, so you get to actually practice with real data. If you have experience analyzing data already, SQL is relatively quick to pick up for how valuable it is.
Sentdex has amazing video tutorials for almost everything you could want to do with data in Python.
Fast.ai has both a deep learning course and a basic machine learning course. It differs from other resources in that it builds top-down and explains concepts through code rather than math.
A great, practical introduction to machine learning. Large focus on immediately useful tools like regression, random forests and XGBoost.
Kaggle hosts machine learning competitions. The Titanic dataset is a good place to start.
This is a series of excellent math tutorials. 3Blue1Brown uses beautiful visualizations to try to make complex topics clear, and focuses on teaching at a conceptual level. I found the linear algebra course to be particularly useful.
The old version of this tutorial was incredibly helpful for making my website. It's comprehensive and now up-to-date.
A thorough description of R Markdown and everything you can do with it.
Really useful for slower computers.
This is a really useful intro to brms, a package that makes working with Bayesian hierarchical mixed models in Stan really easy.
A great visual demo of what is happening in a hierarchical linear model (AKA a random or mixed effects model).
A useful article describing how to calculate effect sizes in a number of different situations, including spreadsheets to do the calculations.
Gaussian processes are really cool, but they never clicked for me until I saw this paper. Distill is also just an awesome model for what scientific publishing could look like in the future.
If you wish your GLMs were wigglier you should start here.
A great introduction to how MCMC methods sample distributions and why NUTS is useful.
Interested in Bayes stats for psychology? This special issue in PBR has a ton of excellent papers on the topic.
This is a great intro to Stan using a practical example workflow.
Bayesian stats specifically for psychologists.
Richard McElreath's excellent course on Bayesian statistics. Useful even for people who feel like they need some help understanding the foundation of frequentist analyses.