Master Algorithm References

"The Master Algorithm" by Pedro Domingos is one of my favourite books. It inspired me early on to get into machine learning. I especially love that he put together a list of resources at the end of the book, which is invaluable to anyone who wants to understand the various paradigms of machine learning as he describes them. Here is a compilation of those resources. I will convert them to links in the near future, but for now enjoy :)

“Behind-the-scenes data mining,” by George John (SIGKDD Explorations, 1999)
Eric Siegel’s book Predictive Analytics (Wiley, 2013)
McKinsey Global Institute’s 2011 report Big Data: The Next Frontier for Innovation, Competition, and Productivity
Big Data: A Revolution That Will Transform How We Live, Work, and Think, by Viktor Mayer-Schönberger and Kenneth Cukier (Houghton Mifflin Harcourt, 2013)
Artificial Intelligence,* by Elaine Rich (McGraw-Hill, 1983)
Artificial Intelligence: A Modern Approach, by Stuart Russell and Peter Norvig (3rd ed., Prentice Hall, 2010)
Nils Nilsson’s The Quest for Artificial Intelligence (Cambridge University Press, 2010)
Nine Algorithms that Changed the Future, by John MacCormick (Princeton University Press, 2012)
Algorithms,* by Sanjoy Dasgupta, Christos Papadimitriou, and Umesh Vazirani (McGraw-Hill, 2008)
The Pattern on the Stone, by Danny Hillis (Basic Books, 1998)
The Innovators (Simon & Schuster, 2014)
“Spreadsheet data manipulation using examples,”* by Sumit Gulwani, William Harris, and Rishabh Singh (Communications of the ACM, 2012)
Competing on Analytics, by Tom Davenport and Jeanne Harris (HBS Press, 2007)
In the Plex, by Steven Levy (Simon & Schuster, 2011)
Information Rules (HBS Press, 1999)
The Long Tail (Hyperion, 2006)
The Fourth Paradigm, edited by Tony Hey, Stewart Tansley, and Kristin Tolle (Microsoft Research, 2009)
“Machine science,” by James Evans and Andrey Rzhetsky (Science, 2010)
Scientific Discovery: Computational Explorations of the Creative Processes,* by Pat Langley et al. (MIT Press, 1987)
“From digitized images to online catalogs,” by Usama Fayyad, George Djorgovski, and Nicholas Weir (AI Magazine, 1996)
“Machine learning in drug discovery and development,”* by Niki Wale (Drug Development Research, 2001)
“The automation of science,” by Ross King et al. (Science, 2009)
Sasha Issenberg’s The Victory Lab (Broadway Books, 2012)
“How President Obama’s campaign used big data to rally individual votes,” by the same author (MIT Technology Review, 2013)
P. W. Singer’s Wired for War (Penguin, 2009)
Cyber War, by Richard Clarke and Robert Knake (Ecco, 2012)
“Adversarial classification,”* by Nilesh Dalvi et al. (Proceedings of the Tenth International Conference on Knowledge Discovery and Data Mining, 2004)
Predictive Policing, by Walter Perry et al. (Rand, 2013)
“Visual behaviour mediated by retinal projections directed to the auditory pathway,” by Laurie von Melchner, Sarah Pallas, and Mriganka Sur (Nature, 2000)
Ben Underwood’s story is told in “Seeing with sound,” by Joanna Moorhead (Guardian, 2007), and at www.benunderwood.com
“Generality of the functional structure of the neocortex” (Naturwissenschaften, 1977)
“An organizing principle for cerebral function: The unit model and the distributed system,” in The Mindful Brain, edited by Gerald Edelman and Vernon Mountcastle (MIT Press, 1978)
Gary Marcus, Adam Marblestone, and Tom Dean make the case against in “The atoms of neural computation” (Science, 2014).
“The unreasonable effectiveness of data,” by Alon Halevy, Peter Norvig, and Fernando Pereira (IEEE Intelligent Systems, 2009)
Benoît Mandelbrot explores the fractal geometry of nature in the eponymous book* (Freeman, 1982)
James Gleick’s Chaos (Viking, 1987)
Love and Math, by Edward Frenkel (Basic Books, 2014)
The Golden Ticket, by Lance Fortnow (Princeton University Press, 2013)
The Annotated Turing,* by Charles Petzold (Wiley, 2008)
“Cyc: Toward programs with common sense,”* by Douglas Lenat et al. (Communications of the ACM, 1990)
“On Chomsky and the two cultures of statistical learning” (http://norvig.com/chomsky.html)
Jerry Fodor’s The Modularity of Mind (MIT Press, 1983)
“What big data will never explain,” by Leon Wieseltier (New Republic, 2013)
“Pundits, stop sounding ignorant about data,” by Andrew McAfee (Harvard Business Review, 2013)
Thinking, Fast and Slow (Farrar, Straus and Giroux, 2011)
“Computer scientists may have what it takes to help cure cancer” (New York Times, 2011)
A Treatise of Human Nature (1739)
“The lack of a priori distinctions between learning algorithms”* (Neural Computation, 1996)
“Toward knowledge-rich data mining”* (Data Mining and Knowledge Discovery, 2007)
“The role of Occam’s razor in knowledge discovery”* (Data Mining and Knowledge Discovery, 1999)
The Signal and the Noise, by Nate Silver (Penguin Press, 2012)
“Why most published research findings are false,”* by John Ioannidis (PLoS Medicine, 2005)
“Controlling the false discovery rate: A practical and powerful approach to multiple testing”* (Journal of the Royal Statistical Society, Series B, 1995)
“Neural networks and the bias/variance dilemma,” by Stuart Geman, Elie Bienenstock, and René Doursat (Neural Computation, 1992)
“Machine learning as an experimental science,” by Pat Langley (Machine Learning, 1988)
The Principles of Science (1874)
“Machine learning of first-order predicates by inverting resolution,”* by Steve Muggleton and Wray Buntine (Proceedings of the Fifth International Conference on Machine Learning, 1988)
Relational Data Mining,* edited by Sašo Džeroski and Nada Lavrač (Springer, 2001)
“The CN2 Induction Algorithm,”* by Peter Clark and Tim Niblett (Machine Learning, 1989)
“Fast algorithms for mining association rules,”* by Rakesh Agrawal and Ramakrishnan Srikant (Proceedings of the Twentieth International Conference on Very Large Databases, 1994)
“Carcinogenesis predictions using inductive logic programming,” by Ashwin Srinivasan, Ross King, Stephen Muggleton, and Michael Sternberg (Intelligent Data Analysis in Medicine and Pharmacology, 1997)
C4.5: Programs for Machine Learning,* by J. Ross Quinlan (Morgan Kaufmann, 1992)
Classification and Regression Trees,* by Leo Breiman, Jerome Friedman, Richard Olshen, and Charles Stone (Chapman and Hall, 1984)
“Real-time human pose recognition in parts from single depth images,”* by Jamie Shotton et al. (Communications of the ACM, 2013)
“Competing approaches to predicting Supreme Court decision making,” by Andrew Martin et al. (Perspectives on Politics, 2004)
“Computer science as empirical enquiry: Symbols and search” (Communications of the ACM, 1976)
Vision* (Freeman, 1982)
Machine Learning: An Artificial Intelligence Approach,* edited by Ryszard Michalski, Jaime Carbonell, and Tom Mitchell (Tioga, 1983)
“Connectionist AI, symbolic AI, and the brain,”* by Paul Smolensky (Artificial Intelligence Review, 1987)
Sebastian Seung’s Connectome (Houghton Mifflin Harcourt, 2012)
Parallel Distributed Processing,* edited by David Rumelhart, James McClelland, and the PDP research group (MIT Press, 1986)
Neurocomputing,* edited by James Anderson and Edward Rosenfeld (MIT Press, 1988)
Neural Networks: Tricks of the Trade, edited by Genevieve Orr and Klaus-Robert Müller (Springer, 1998)
“Life in the fast lane: The evolution of an adaptive vehicle control system,” by Todd Jochem and Dean Pomerleau (AI Magazine, 1996)
Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences* (Harvard University, 1974)
Applied Optimal Control* (Blaisdell, 1969)
Learning Deep Architectures for AI,* by Yoshua Bengio (Now, 2009)
“Learning long-term dependencies with gradient descent is difficult,”* by Yoshua Bengio, Patrice Simard, and Paolo Frasconi (IEEE Transactions on Neural Networks, 1994)
“How many computers to identify a cat? 16,000,” by John Markoff (New York Times, 2012)
“Gradient-based learning applied to document recognition,”* by Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner (Proceedings of the IEEE, 1998)
“The $1.3B quest to build a supercomputer replica of a human brain,” by Jonathon Keats (Wired, 2013)
“The NIH BRAIN Initiative,” by Thomas Insel, Story Landis, and Francis Collins (Science, 2013)
Chapter 2 of How the Mind Works (Norton, 1997)
“One AI or Many?” (Daedalus, 1988)
The Birth of the Mind, by Gary Marcus (Basic Books, 2004)
“Evolutionary robotics,” by Josh Bongard (Communications of the ACM, 2013)
Artificial Life, by Steven Levy (Vintage, 1993)
Chapter 5 of Complexity, by Mitch Waldrop (Touchstone, 1992)
Genetic Algorithms in Search, Optimization, and Machine Learning,* by David Goldberg (Addison-Wesley, 1989)
“Punctuated equilibria: An alternative to phyletic gradualism,” in Models in Paleobiology, edited by T. J. M. Schopf (Freeman, 1972)
Chapter 9 of The Blind Watchmaker (Norton, 1986)
Chapter 2 of Reinforcement Learning,* by Richard Sutton and Andrew Barto (MIT Press, 1998)
Adaptation in Natural and Artificial Systems* (University of Michigan Press, 1975)
John Koza’s Genetic Programming* (MIT Press, 1992)
“Evolving team Darwin United,”* by David Andre and Astro Teller, in RoboCup-98: Robot Soccer World Cup II, edited by Minoru Asada and Hiroaki Kitano (Springer, 1999)
Genetic Programming III,* by John Koza, Forrest Bennett III, David Andre, and Martin Keane (Morgan Kaufmann, 1999)
“Co-evolving parasites improve simulated evolution as an optimization procedure”* (Physica D, 1990). Adi Livnat, Christos Papadimitriou, Jonathan Dushoff, and Marcus Feldman propose that sex optimizes mixability in “A mixability theory of the role of sex in evolution”* (Proceedings of the National Academy of Sciences, 2008)
“A response to the ML-95 paper entitled . . . ”* (unpublished; online at www.genetic-programming.com/jktahoe24page.html)
“A new factor in evolution” (American Naturalist, 1896)
“How learning can guide evolution”* (Complex Systems, 1987)
1996 special issue* of the journal Evolutionary Computation edited by Peter Turney, Darrell Whitley, and Russell Anderson
The Scope and Method of Political Economy (Macmillan, 1891)
A First Course in Bayesian Statistical Methods,* by Peter Hoff (Springer, 2009)
Pattern Classification and Scene Analysis,* by Richard Duda and Peter Hart (Wiley, 1973)
“The methodology of positive economics,” which appears in Essays in Positive Economics (University of Chicago Press, 1966)
“Stopping spam,” by Joshua Goodman, David Heckerman, and Robert Rounthwaite (Scientific American, 2005)
“Relevance weighting of search terms,”* by Stephen Robertson and Karen Sparck Jones (Journal of the American Society for Information Science, 1976)
“First links in the Markov chain,” by Brian Hayes (American Scientist, 2013)
“Large language models in machine translation,”* by Thorsten Brants et al. (Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2007)
“The PageRank citation ranking: Bringing order to the Web,”* by Larry Page, Sergey Brin, Rajeev Motwani, and Terry Winograd (Stanford University technical report, 1998)
Statistical Language Learning,* by Eugene Charniak (MIT Press, 1996)
Statistical Methods for Speech Recognition,* by Fred Jelinek (MIT Press, 1997)
“The Viterbi algorithm: A personal history,” by David Forney (unpublished; online at arxiv.org/pdf/cs/0504020v2.pdf)
Bioinformatics: The Machine Learning Approach,* by Pierre Baldi and Søren Brunak (2nd ed., MIT Press, 2001)
“Engineers look to Kalman Filtering for guidance,” by Barry Cipra (SIAM News, 1993)
Probabilistic Reasoning in Intelligent Systems* (Morgan Kaufmann, 1988)
“Bayesian networks without tears,”* by Eugene Charniak (AI Magazine, 1991)
“Probabilistic interpretation for MYCIN’s certainty factors,”* by David Heckerman (Proceedings of the Second Conference on Uncertainty in Artificial Intelligence, 1986)
“Module networks: Identifying regulatory modules and their condition-specific regulators from gene expression data,” by Eran Segal et al. (Nature Genetics, 2003)
“Microsoft virus fighter: Spam may be more difficult to stop than HIV,” by Ben Paynter (Fast Company, 2012)
“Probabilistic diagnosis using a reformulation of the INTERNIST-1/QMR knowledge base,” by M. A. Shwe et al. (Parts I and II, Methods of Information in Medicine, 1991)
Section 26.5.4 of Kevin Murphy’s Machine Learning* (MIT Press, 2012)
“TrueSkill™: A Bayesian skill rating system,”* by Ralf Herbrich, Tom Minka, and Thore Graepel (Advances in Neural Information Processing Systems 19, 2007)
Modeling and Reasoning with Bayesian Networks,* by Adnan Darwiche (Cambridge University Press, 2009)
The January/February 2000 issue* of Computing in Science and Engineering, edited by Jack Dongarra and Francis Sullivan
“Stanley: The robot that won the DARPA Grand Challenge,” by Sebastian Thrun et al. (Journal of Field Robotics, 2006)
“Bayesian networks for data mining,”* by David Heckerman (Data Mining and Knowledge Discovery, 1997)
“Gaussian processes: A replacement for supervised neural networks?,”* by David MacKay (NIPS tutorial notes, 1997; online at www.inference.eng.cam.ac.uk/mackay/gp.pdf)
Section 9.6 of Speech and Language Processing,* by Dan Jurafsky and James Martin (2nd ed., Prentice Hall, 2009)
“On the optimality of the simple Bayesian classifier under zero-one loss”* (Machine Learning, 1997; expanded journal version of the 1996 conference paper)
Markov Random Fields for Vision and Image Processing,* edited by Andrew Blake, Pushmeet Kohli, and Carsten Rother (MIT Press, 2011)
“Conditional random fields: Probabilistic models for segmenting and labeling sequence data,”* by John Lafferty, Andrew McCallum, and Fernando Pereira (International Conference on Machine Learning, 2001)
“From knowledge bases to decision models,”* by Michael Wellman, John Breese, and Robert Goldman (Knowledge Engineering Review, 1992)
Catch Me If You Can, cowritten with Stan Redding (Grosset & Dunlap, 1980)
“Discriminatory analysis: Nonparametric discrimination: Consistency properties”* (USAF School of Aviation Medicine, 1951)
Nearest Neighbor (NN) Norms,* edited by Belur Dasarathy (IEEE Computer Society Press, 1991)
“Locally weighted learning,”* by Chris Atkeson, Andrew Moore, and Stefan Schaal (Artificial Intelligence Review, 1997)
“GroupLens: An open architecture for collaborative filtering of netnews,”* by Paul Resnick et al. (Proceedings of the 1994 ACM Conference on Computer-Supported Cooperative Work, 1994)
“Amazon.com recommendations: Item-to-item collaborative filtering,”* by Greg Linden, Brent Smith, and Jeremy York (IEEE Internet Computing, 2003)
“Nearest neighbor pattern classification”* (IEEE Transactions on Information Theory)
Section 2.5 of The Elements of Statistical Learning,* by Trevor Hastie, Rob Tibshirani, and Jerry Friedman (2nd ed., Springer, 2009)
“Wrappers for feature subset selection,”* by Ron Kohavi and George John (Artificial Intelligence, 1997)
“Similarity metric learning for a variable-kernel classifier,”* by David Lowe (Neural Computation, 1995)
“Support vector machines and kernel methods: The new generation of learning machines,”* by Nello Cristianini and Bernhard Schölkopf (AI Magazine, 2002)
“A training algorithm for optimal margin classifiers,”* by Bernhard Boser, Isabel Guyon, and Vladimir Vapnik (Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 1992)
“Text categorization with support vector machines,”* by Thorsten Joachims (Proceedings of the Tenth European Conference on Machine Learning, 1998)
Chapter 5 of An Introduction to Support Vector Machines,* by Nello Cristianini and John Shawe-Taylor (Cambridge University Press, 2000)
Case-Based Reasoning,* by Janet Kolodner (Morgan Kaufmann, 1993)
“Using case-based retrieval for customer technical support,”* by Evangelos Simoudis (IEEE Expert, 1992)
“Rise of the software machines” (Economist, 2013)
Modeling Legal Arguments* (MIT Press, 1991)
“Recombinant music: Using the computer to explore musical style” (IEEE Computer, 1991)
“Structure mapping: A theoretical framework for analogy”* (Cognitive Science, 1983)
“The man who would teach machines to think,” by James Somers (Atlantic, 2013)
“Unifying instance-based and rule-based induction”* (Machine Learning, 1996)
The Scientist in the Crib, by Alison Gopnik, Andy Meltzoff, and Pat Kuhl (Harper, 1999)
“Least squares quantization in PCM”* (which later appeared as a paper in the IEEE Transactions on Information Theory in 1982)
“Maximum likelihood from incomplete data via the EM algorithm,”* by Arthur Dempster, Nan Laird, and Donald Rubin (Journal of the Royal Statistical Society B, 1977)
Finding Groups in Data: An Introduction to Cluster Analysis,* by Leonard Kaufman and Peter Rousseeuw (Wiley, 1990)
“On lines and planes of closest fit to systems of points in space”* (Philosophical Magazine)
“Indexing by latent semantic analysis”* (Journal of the American Society for Information Science, 1990)
“Matrix factorization techniques for recommender systems”* (IEEE Computer, 2009)
“A global geometric framework for nonlinear dimensionality reduction,”* by Josh Tenenbaum, Vin de Silva, and John Langford (Science, 2000)
Reinforcement Learning: An Introduction,* by Rich Sutton and Andy Barto (MIT Press, 1998)
Universal Artificial Intelligence,* by Marcus Hutter (Springer, 2005)
“Some studies in machine learning using the game of checkers”* (IBM Journal of Research and Development, 1959)
Learning from Delayed Rewards* (Cambridge University, 1989)
“Human-level control through deep reinforcement learning,”* by Volodymyr Mnih et al. (Nature, 2015)
“A cognitive odyssey: From the power law of practice to a general learning mechanism and beyond” (Tutorials in Quantitative Methods for Psychology, 2006)
“Practical guide to controlled experiments on the Web: Listen to your customers not to the HiPPO,”* by Ron Kohavi, Randal Henne, and Dan Sommerfield (Proceedings of the Thirteenth International Conference on Knowledge Discovery and Data Mining, 2007)
Introduction to Statistical Relational Learning,* edited by Lise Getoor and Ben Taskar (MIT Press, 2007)
“Mining social networks for viral marketing” (IEEE Intelligent Systems, 2005)
Ensemble Methods: Foundations and Algorithms,* by Zhi-Hua Zhou (Chapman and Hall, 2012)
“Stacked generalization,”* by David Wolpert (Neural Networks, 1992)
“Bagging predictors”* (Machine Learning, 1996)
“Random forests”* (Machine Learning, 2001)
“Experiments with a new boosting algorithm,” by Yoav Freund and Rob Schapire (Proceedings of the Thirteenth International Conference on Machine Learning, 1996)
“I, Algorithm,” by Anil Ananthaswamy (New Scientist, 2011)
Markov Logic: An Interface Layer for Artificial Intelligence,* which I cowrote with Daniel Lowd (Morgan & Claypool, 2009)
Alchemy website, http://alchemy.cs.washington.edu, also includes tutorials, videos, MLNs, data sets, publications, pointers to other systems
“Hybrid Markov logic networks,”* by Jue Wang and Pedro Domingos (Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008)
“Integrating multiple learning components through Markov logic”* (Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008)
“Extracting semantic networks from text via relational clustering,”* by Stanley Kok and Pedro Domingos (Proceedings of the Nineteenth European Conference on Machine Learning, 2008)
“Large-scale distributed deep networks,”* by Jeff Dean et al. (Advances in Neural Information Processing Systems 25, 2012)
“A general framework for mining massive data streams,”* by Pedro Domingos and Geoff Hulten (Journal of Computational and Graphical Statistics, 2003)
“The machine that would predict the future,” by David Weinberger (Scientific American, 2011)
“Cancer: The march on malignancy” (Nature supplement, 2014)
“Using patient data for personalized cancer treatments,” by Chris Edwards (Communications of the ACM, 2014)
“Simulating a living cell,” by Markus Covert (Scientific American, 2014)
“Breakthrough Technologies 2015: Internet of DNA,” by Antonio Regalado (MIT Technology Review, 2015)
“Cancer: A Computational Disease that AI Can Cure,” by Jay Tenenbaum and Jeff Shrager (AI Magazine, 2011)
“Love, actuarially,” by Kevin Poulsen (Wired, 2014)
Dataclysm, by Christian Rudder (Crown, 2014)
Total Recall, by Gordon Bell and Jim Gemmell (Dutton, 2009)
The Naked Future, by Patrick Tucker (Current, 2014)
“Privacy pragmatism” (Foreign Affairs, 2014)
The Second Machine Age, by Erik Brynjolfsson and Andrew McAfee (Norton, 2014)
“World War R,” by Chris Baraniuk (New Scientist, 2014)
“Transcending complacency on superintelligent machines,” by Stephen Hawking et al. (Huffington Post, 2014)
Nick Bostrom’s Superintelligence (Oxford University Press, 2014)
Radical Evolution (Broadway Books, 2005)
What Technology Wants (Penguin, 2010)
Darwin Among the Machines, by George Dyson (Basic Books, 1997)
Life at the Speed of Light (Viking, 2013)
