# Chapter 1 The Golem of Prague


## Section 1.1 Statistical golems

Scientists make golems.

- McElreath draws this metaphor from Collins and Pinch (1998), _The Golem: What You Should Know about Science_. It closely resembles E. T. Jaynes’ (2003) metaphor of statistical models as robots, though with a less precise and more monstrous implication.
- Jaynes, E. T. (2003). _Probability Theory: The Logic of Science._ Cambridge University Press.

As George Box put it, "all models are wrong, but some are useful." Statistical models are engineered constructs for some purpose.

The golem is a prosthesis that does the heavy-duty work for us, but there is no wisdom in it. It does not discern whether the context is appropriate for its answers. It just knows its own procedure. Viewed this way, statistics is neither mathematics nor a science, but rather a branch of engineering.

And like engineering, a common set of design principles and constraints produces a great diversity of specialized applications. This is what confuses students in introductory statistics courses. All the golems are preconstructed for them, each with a particular purpose, so it is hard to grasp the unity of these procedures.

Moreover, these preconstructed golems are not enough for research. They can fail in unpredictable ways, and they are not flexible enough to adapt to new contexts. Furthermore, statistical golems do not understand cause and effect.

What we need is some unified theory of golem engineering, a set of principles for designing, building, and refining special-purpose golems. But this theory is never taught anywhere at universities.

![golem-decision-tree](../images/golem-decision-tree.png)


## Section 1.2 Statistical rethinking

Learning Outcomes: to understand why doing science is not doing logical falsification.

We need to understand the computational inner workings of the golems. We also need to get past the conceptual obstacles of defining statistical objectives and interpreting statistical results. We need some statistical epistemology, an appreciation of how statistical models relate to hypotheses and the natural mechanisms of interest.

Karl Popper argued that science advances by falsifying hypotheses. So maybe the gold standard of statistical procedures is to falsify hypotheses. But that's a kind of folk Popperism. In fact, deductive falsification is impossible in nearly every scientific context. 

1. Hypotheses are not models. Many hypotheses correspond to one model, and many models correspond to one hypothesis. 
2. Measurement matters. Sometimes another observer will debate our methods and measures. Sometimes they are right.

If you still believe that science often works, then knowing that it does not work via falsification should help you do better science.

**Rethinking: is NHST falsificationist? (and is Bayesian inference inductive?)**
- [Confirmationist and falsificationist paradigms of science](https://statmodeling.stat.columbia.edu/2014/09/05/confirmationist-falsificationist-paradigms-science/) by Andrew Gelman (2014)
- [Philosophy and the practice of Bayesian statistics](http://www.stat.columbia.edu/~gelman/research/published/philosophy.pdf) by Gelman and Shalizi (2013)


**1.2.1. Hypotheses are not models**

One hypothesis can give rise to several process (causal) models. You can use such models to probe their causal implications. Sometimes these probes reveal, before we turn to statistical inference, that a model cannot explain the phenomenon of interest. To challenge process models with data, we have to turn them into statistical models, but a single statistical model can correspond to many process models.

**Rethinking: Entropy and model identification.** Nature loves entropy. More in Chapter 10. "The practical implication is
that one can no more infer evolutionary process from a power law than one can infer developmental
process from the fact that height is normally distributed. This fact should make us humble about what
typical regression models—the meat of this book—can teach us about mechanistic process. On the
other hand, the maximum entropy nature of these distributions means we can use them to do useful
statistical work, even when we can’t identify the underlying process. Not only can we not identify it,
but we don’t have to."
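
The identification problem in the quote can be made concrete with a small simulation. The sketch below is not from the book: it assumes two hypothetical generating processes (skewed exponential steps and symmetric uniform steps, both with mean 0 and variance 1) and uses the familiar additive route to the normal distribution rather than the maximum entropy argument deferred to Chapter 10. The names `sum_of_steps`, `exponential_step`, and `uniform_step` are illustrative only.

```python
import random
import statistics

def sum_of_steps(step, n_steps, rng):
    """Add up n_steps small independent fluctuations drawn by step(rng)."""
    return sum(step(rng) for _ in range(n_steps))

def exponential_step(rng):
    # Process A (hypothetical): skewed steps, exponential shifted to mean 0, variance 1.
    return rng.expovariate(1.0) - 1.0

def uniform_step(rng):
    # Process B (hypothetical): symmetric steps, uniform with mean 0, variance 1.
    half_width = 3 ** 0.5
    return rng.uniform(-half_width, half_width)

def fraction_within_one_sd(values):
    """Share of values within one standard deviation of their mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(abs(v - mean) <= sd for v in values) / len(values)

rng = random.Random(1)  # fixed seed so the run is repeatable
n_steps, n_sims = 100, 5000
sums_a = [sum_of_steps(exponential_step, n_steps, rng) for _ in range(n_sims)]
sums_b = [sum_of_steps(uniform_step, n_steps, rng) for _ in range(n_sims)]

# A normal distribution puts roughly 68% of its mass within one standard deviation.
for name, sums in (("exponential steps", sums_a), ("uniform steps", sums_b)):
    print(name, round(statistics.pstdev(sums), 2), round(fraction_within_one_sd(sums), 3))
```

Both processes should produce sums with nearly the same spread and the same bell-shaped summary, so the shape of the outcome alone gives little basis for deciding which process generated it.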

**1.2.2. Measurement matters.**

We once had a hypothesis $H_0: \text{All swans are white}$. It took only one observation of a black swan, made after we discovered Australia, to reject it. In general, though, seeking disconfirming evidence is not as powerful as the swan story makes it appear.

&ensp;&thinsp;&ensp;&thinsp;_1.2.2.1. Observation error._ Our ability to measure a hypothesized phenomenon is often as much in question as the phenomenon itself. If only one black swan exists on the whole earth, $H_0$ is hard to disprove in practice, even though a single black swan would suffice: we may never encounter it, or may not trust the sighting when we do.
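
To get a feel for how rarity and imperfect detection combine, here is a minimal sketch. It assumes independent encounters, and every number in it is hypothetical; `prob_black_swan_recorded`, `black_fraction`, and `detection_prob` are names invented for this illustration.

```python
def prob_black_swan_recorded(n_swans_examined, black_fraction, detection_prob):
    """Probability that at least one black swan is both encountered and
    correctly recorded, assuming independent encounters."""
    p_hit = black_fraction * detection_prob  # one swan is black AND recorded as black
    return 1.0 - (1.0 - p_hit) ** n_swans_examined

# All numbers below are hypothetical, chosen only for illustration.
for n in (100, 10_000, 1_000_000):
    p = prob_black_swan_recorded(n, black_fraction=1e-6, detection_prob=0.5)
    print(f"{n:>9} swans examined -> P(at least one recorded) = {p:.4f}")
```

Under these assumptions, even a very large survey can easily miss the one observation that would logically settle the question.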

&ensp;&thinsp;&ensp;&thinsp;_1.2.2.2. Continuous hypotheses._ Another problem for the swan story: most interesting scientific hypotheses are instead of the kind $H_0: 80\% \text{ of swans are white}$. No single observation can refute a claim like this.
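
A small calculation shows why a hypothesis about a proportion resists logical falsification. The sketch assumes a simple binomial model for a hypothetical sample of 100 independently observed swans; `binomial_pmf` is a helper written for this note, not something taken from the book.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_swans = 100         # hypothetical sample size
p_white_h0 = 0.8      # the hypothesized proportion of white swans

for n_white in (60, 70, 80, 90):
    prob = binomial_pmf(n_white, n_swans, p_white_h0)
    print(f"{n_white} white out of {n_swans}: probability under H0 = {prob:.2e}")
```

Every possible count has a nonzero probability under the hypothesis, so the data can make it more or less plausible but cannot contradict it outright.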

**1.2.3. Falsification is consensual.** Scientific communities argue towards consensus about the meaning of evidence. Falsification is therefore not a purely logical act, but some textbooks misrepresent the history of these arguments so that it looks like logical falsification. Such revisionism may make science an easy target by promoting an easily attacked model of scientific epistemology, and it may hurt the public by exaggerating the definitiveness of scientific knowledge.

## Section 1.3 Tools for golem engineering

The goal here is to reduce the chances of wrecking Prague. Make no mistake, you will wreck Prague eventually. But if you're a good golem engineer, you'll notice the destruction and figure out why, so your next golem won't be as bad.

We want our models for several purposes: 
- designing inquiry, 
- extracting information from data, and 
- making predictions. 

We have tools including Bayesian data analysis, model comparison, multilevel models, and graphical causal models.

**1.3.1. Bayesian data analysis.** How should we use data to learn about the world? Bayesian data analysis takes a question in the form of a model and uses logic to produce an answer in the form of probability distributions. Why does the Bayesian way make more sense? When Galileo first observed the rings of Saturn through a primitive telescope, he saw a blob rather than a clear shape, which left him uncertain about what he was looking at. The measurement here is deterministic: pointing the same telescope at Saturn again gives the same blob, and it stayed a blob until better telescopes arrived. Reasoning about imagined resampling therefore does not make sense here, but we can still use probability to model our uncertainty about Saturn's true shape.
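
As a rough illustration of "a question in the form of a model, an answer in the form of a probability distribution", the sketch below computes a posterior for an unknown proportion on a grid of candidate values. The data (6 successes in 9 trials), the flat prior, and the function name `grid_posterior` are all assumptions made for this example, not details taken from the section.

```python
from math import comb

def grid_posterior(successes, trials, grid_size=101):
    """Posterior over a proportion p, evaluated on an evenly spaced grid in [0, 1]."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size  # flat prior: an assumption, not a recommendation
    likelihood = [
        comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)
        for p in grid
    ]
    unnormalized = [pr * lk for pr, lk in zip(prior, likelihood)]
    total = sum(unnormalized)
    return grid, [u / total for u in unnormalized]  # normalize so the values sum to 1

# Hypothetical data: 6 successes in 9 trials.
grid, posterior = grid_posterior(successes=6, trials=9)
most_plausible = grid[max(range(len(grid)), key=posterior.__getitem__)]
print("most plausible proportion on the grid:", most_plausible)
```

The answer is the whole list `posterior`, not a single number: it records how plausible each candidate proportion is given the model and the data.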

Note that there is a tradition of using Bayesian tools as a normative description of rational belief, a tradition called **Bayesianism**. This book neither describes nor advocates it. 

- Further reading: [Rational Decisions](https://press.princeton.edu/books/paperback/9780691149899/rational-decisions) by Ken Binmore, 2009.