For neural networks - Deep Learning - Ian Goodfellow, Yoshua Bengio and Aaron Courville - read chapters 1-3, 5-6, 11-12
For everything else (linear models, random forests, etc.):
- Elements of Statistical Learning - Trevor Hastie, Robert Tibshirani and Jerome Friedman - read chapters 1-4 and 7-9
- Pattern Recognition and Machine Learning - Christopher M. Bishop
There is also An Introduction to Statistical Learning - James et al., which covers the same topics as Elements of Statistical Learning but concentrates more on applications and less on the math.
floodsung/Deep-Learning-Papers-Reading-Roadmap - answers the question 'Which paper should I start reading from?'
One thing that should be learned from Rich Sutton's essay "The Bitter Lesson" is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.
The second general point to be learned from the bitter lesson is that the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds, such as simple ways to think about space, objects, multiple agents, or symmetries.
Papers With Code - a free and open resource with Machine Learning papers, code and evaluation tables
Distill - a research journal with a focus on clear communication of machine learning ideas.