If you are starting out with machine learning:
For neural networks - Deep Learning - Ian Goodfellow, Yoshua Bengio and Aaron Courville
For everything else (linear models, random forests, etc.):
- The Elements of Statistical Learning - Trevor Hastie, Robert Tibshirani and Jerome Friedman
- Pattern Recognition and Machine Learning - Christopher M. Bishop
One thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.
The second general point to be learned from the bitter lesson is that the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds, such as simple ways to think about space, objects, multiple agents, or symmetries.
The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World - Pedro Domingos