# Not All Models Are Wrong
## Peter Dresslar, Arizona State University
- CAS 520 Bergin
- CAS 598 Damerow
- CAS 598 Wei

This week, two foundational master's-level Complexity Sciences courses began, and each shared the same quotation as the basis of its opening discussion of model design and emergent phenomena.

"All models are wrong. Some are useful."[^1]

This famous saying, promulgated by the prominent twentieth-century statistician George Box, is one of the more widely known commentaries on the discipline of modeling, so much so that it has its own Wikipedia page.[^2] However, speaking at least from a contemporary perspective, one wonders how helpful it is as a welcome for new students and prospective practitioners beginning an exploration of complexity science.

In other words, is this quotation useful?

<p align="center">. . .</p>

Before we get ahead of ourselves, it seems appropriate to discuss the context in which the aphorism[^3] was originally published.[^4] Here it is quoted from Box's article, *Robustness in the Strategy of Scientific Model Building*, published in 1979. In the following quote, the capitalization and lack of internal punctuation are both taken from the source text.

>ALL MODELS ARE WRONG BUT SOME ARE USEFUL ... Now it would be very remarkable if any system existing in the real world could be exactly represented by any simple model. However, cunningly chosen parsimonious models often do provide remarkably useful approximations. For example, the law PV = RT relating pressure P, volume V and temperature T of an "ideal" gas via a constant R is not exactly true for any real gas, but it frequently provides a useful approximation and furthermore its structure is informative since it springs from a physical view of the behavior of gas molecules. For such a model there is no need to ask the question "Is the model true?". If "truth" is to be the "whole truth" the answer must be "No". The only question of interest is "Is the model illuminating and useful?".[^5] 

Interestingly, there are actually two separate ideas hinted at in the discussion here. The first is the difficulty in replicating real-world conditions in the form of model math with absolute fidelity. The second is the idea that the value of a model is independent of such fidelity. Each of these two ideas is developed at length in chapters of the book, though the latter one is tempered through rigorous discussion of methods for model "Robustification."[^6] As Box was a statistician, those Robustification methods are unsurprisingly rooted deeply in analytical methods from that field.
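Box's gas example can be made concrete with a small sketch. The cell below compares the ideal gas law, written for one mole so that V is the molar volume, against the van der Waals equation, which is a different and only somewhat less idealized model that this sketch swaps in as a stand-in for a "real" gas. The constants for carbon dioxide are approximate textbook values and, like the function names, are assumptions of this illustration rather than anything taken from Box.

```python
# Ideal gas law versus the van der Waals equation for one mole of CO2.
# All constants are approximate textbook values (assumptions of this sketch).

R = 0.083145     # gas constant, L·bar/(mol·K)
A_CO2 = 3.640    # van der Waals attraction term for CO2, L^2·bar/mol^2
B_CO2 = 0.04267  # van der Waals excluded volume for CO2, L/mol


def ideal_pressure(molar_volume, temperature):
    """Pressure in bar from PV = RT, with V as the volume of one mole."""
    return R * temperature / molar_volume


def van_der_waals_pressure(molar_volume, temperature, a=A_CO2, b=B_CO2):
    """Pressure in bar from (P + a/V^2)(V - b) = RT for one mole."""
    return R * temperature / (molar_volume - b) - a / molar_volume**2


def relative_gap(molar_volume, temperature):
    """Fractional difference between the two models' pressures."""
    ideal = ideal_pressure(molar_volume, temperature)
    real = van_der_waals_pressure(molar_volume, temperature)
    return abs(ideal - real) / real


# Dilute gas first, then progressively denser gas at 300 K.
for v in (25.0, 5.0, 1.0, 0.5):
    print(f"V = {v:5.1f} L/mol  gap = {relative_gap(v, 300.0):.1%}")
```

At large molar volumes the two models nearly agree, and the gap widens as the gas is compressed. That is one reading of Box's first idea: the simple model's fidelity depends on the regime in which it is used.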

Box would go on to utter the phrase in various forms at conferences and other fora, seemingly as a sort of "catch phrase" that caught on. From his writings, and from the discussion of his life in his Wikipedia entry and by other authors, it seems very clear that Box was both an iconoclast and (to leverage this author's lived experience) very much a man of his place and time. The pithiness of "all models are wrong" suited those places and times quite well.

But places and times have changed.

Some academics have commented on the quotation over the years, with at least a few of them seeming to take a bit of issue with it.

>"Finally it does not seem helpful just to say that all models are wrong. The very word model implies simplification and idealization. The idea that complex physical, biological or sociological systems can be exactly described by a few formulae is patently absurd. The construction of idealized representations that capture important stable aspects of such systems is, however, a vital part of general scientific analysis..." [^7]

This remark, by David Cox, appears in the published discussion of a well-cited paper by Chris Chatfield called *Model Uncertainty, Data Mining and Statistical Inference*. Intriguingly, the comment somewhat mixes up elements of the Box discussion, since Box actually refers to an "idealized system" in his prose. It would seem, perhaps, that Cox did not actually have access to the Box.[^8]

Still, his point is well-taken. What else would models exist to do, except to translate the systems of the world around us into "logical devices" through which we can generate understanding and predictive ability? And if such models are providing such utility, to the degree that we are able to trust them, how could we call them wrong?

<p align="center">. . .</p>

Roughly fifty years after Box's first utterance of the saying, we are unquestionably living in a new regime of thinking; or, by the estimate of some, a new regime of *un*-thinking. Readers will need little reminder of society's changing attitudes toward basic sciences and even the idea of facts themselves. 

Whether or not Box's quotation is useful... it is certainly wrong.

Let's demonstrate this. To start, we'll use a model style that is a bit more familiar to those practicing simulations along the lines of agent-based models.


In [3]:
import datetime
import time

however_many_years = 99
todays_year = datetime.datetime.now().year
start_year = todays_year  # or feel free to choose your own
dt = 7  # years to step

for year in range(start_year, start_year + however_many_years, dt):
    if year % 4 == 0 and (year % 100 != 0 or year % 400 == 0):
        print(f"\r{year} is a leap year", end="", flush=True) # no \n
    else:
        print(f"\r{year} is not a leap year", end="", flush=True) # no \n
    time.sleep(1)  #  Seconds. For human-utility purposes.

2123 is not a leap year

This model is a flawless representation of the Gregorian calendar, a real-world, everyday system used by billions of people. We might estimate that this model has faint but non-zero usefulness.

It is not wrong.
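One way to probe that claim is to compare the rule used in the loop above with an independent implementation. The sketch below wraps the same condition in a function (the name `is_leap` and the range of years are choices made here for illustration) and checks it against `calendar.isleap` from the Python standard library.

```python
import calendar


def is_leap(year):
    """Gregorian leap-year rule, as written in the loop above."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# An arbitrary illustrative range, starting after the calendar's 1582 introduction.
mismatches = [y for y in range(1583, 3000) if is_leap(y) != calendar.isleap(y)]
print(f"mismatches: {len(mismatches)}")
```

Agreement here shows only that two implementations encode the same rule. The rule itself is a human convention, which is part of why a model can represent it exactly.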

But what about more complex, real-world models?

<p align="center">. . .</p>

In the article, *The adaptive value of morphological, behavioural and life-history traits in reproductive female wolves*, Daniel Stahler *et al* use a comprehensive dataset of wolf behaviors, gathered from over a decade of observations at Yellowstone National Park, to build a detailed understanding of wolf family and social structures.[^9]

The article describes a model tailored to these data: a Generalized Linear Mixed Model, or GLMM. A GLMM combines fixed and random effects in a linear predictor and uses a link function to relate that predictor to the expected value of the response. Bolker *et al* helpfully point out the advantage of the approach, communicating that "(n)onnormal data such as counts or proportions often defy classical statistical procedures. Generalized linear mixed models (GLMMs) provide a more flexible approach for analyzing nonnormal data when random effects are present."[^10] This is not only a helpful description of the model class itself, but also a useful view into the kinds of concrete challenges facing modelers, which is where we started in the first place!
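To make the role of the link function concrete, here is a minimal sketch of how a Poisson GLMM with a log link turns fixed and random effects into an expected count. Every number and name below is hypothetical and chosen only for illustration. None of them come from the Stahler paper, and nothing is being fitted here.

```python
import math

# Hypothetical coefficients, for illustration only (not estimates from any paper).
INTERCEPT = 0.5                         # fixed effect: baseline on the log scale
BODY_MASS_SLOPE = 0.03                  # fixed effect: change in log(mean) per kg
PACK_EFFECTS = {"A": 0.10, "B": -0.05}  # random effects: per-pack offsets


def linear_predictor(body_mass, pack_id):
    """Fixed effects plus the random intercept for the pack (0 if unknown)."""
    return INTERCEPT + BODY_MASS_SLOPE * body_mass + PACK_EFFECTS.get(pack_id, 0.0)


def expected_pups(body_mass, pack_id):
    """Inverse of the log link: maps the linear predictor to a positive mean."""
    return math.exp(linear_predictor(body_mass, pack_id))


# Same body mass, different packs: only the random offset differs.
for pack in ("A", "B"):
    print(pack, round(expected_pups(40, pack), 2))
```

The log link keeps the expected count positive however the effects combine, which is one reason it suits count data such as litter sizes.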

Since this is a Python notebook, we can easily outline the GLMM from the Stahler article as follows:


In [4]:
# GLMM Simplified
# See, for instance, https://github.com/junpenglao/GLMM-in-Python/blob/master/GLMM_in_python.ipynb

import statsmodels.formula.api as smf
import pandas as pd

# Example wolf data, intended to mimic the Stahler data

data = pd.DataFrame({
    'pups_born': [5, 4, 6, 7, 3, 2, 6, 4, 5, 3],
    'body_mass': [34, 37, 42, 47, 29, 44, 39, 36, 41, 33],
    'pack_id': ['A', 'A', 'B', 'B', 'C', 'C', 'D', 'D', 'E', 'F']
})

# Fit a linear mixed model as a Gaussian stand-in for the paper's GLMM.
# Stahler2013 uses a Poisson GLMM; the Gaussian case here uses an identity link.
model = smf.mixedlm('pups_born ~ body_mass',  # fixed effect: body mass
                    data=data,
                    groups=data['pack_id'])  # random intercept for each pack

result = model.fit()
print(result.summary())

         Mixed Linear Model Regression Results
Model:            MixedLM Dependent Variable: pups_born
No. Observations: 10      Method:             REML     
No. Groups:       6       Scale:              1.3098   
Min. group size:  1       Log-Likelihood:     -18.1904 
Max. group size:  2       Converged:          Yes      
Mean group size:  1.7                                  
--------------------------------------------------------
           Coef.  Std.Err.    z    P>|z|  [0.025  0.975]
--------------------------------------------------------
Intercept  1.660     4.515  0.368  0.713  -7.188  10.509
body_mass  0.073     0.118  0.621  0.535  -0.158   0.305
Group Var  0.981     1.925                              



While the model (or, perhaps more aptly, model of a model) above is extremely simplified and departs from the statistical controls presented in the paper's analysis, even this version is *still* somewhat useful for conceptualizing how the authors came to their conclusions. The paper acknowledges the possibility of observational errors and uncertainties, and it goes to great lengths in understanding, documenting, and controlling those errors.

And those conclusions seem, to the wolf-layperson, quite striking:

> (O)ur study clarifies how life history, sociality and ecological conditions interact in cooperative breeders and ranks the adaptive value of traits in promoting individual fitness in competitive and stochastic environments... In wolves, it appears that individual performance is influenced more by phenotypes than environmental conditions, and it would be valuable to know if this were true in other taxa. [^11]

The 2013 work has since been cited 130 times, according to Google Scholar, with at least a few citations that confirm the original findings. [^12]

So we have a model that is definitely useful in applying ecological controls to a species of animal that needs special management to survive in anthropocenic times. It matches actually observed phenomena of the world to a degree that it can precisely measure. And it even has the power of extensive peer review behind it: peer review not being flawless, of course, but also being a system through which we human beings leverage our collective understanding to improve the verification and contextualization of new knowledge.

How could we call this model wrong?




"All models are wrong but some are useful" is still evocative, pithy as ever, but not particularly useful in 2025. There are other, clearer ways to say the things that Box was trying to communicate, and those other ways would have the added advantage of avoiding the discrediting of science at a time when disinformation is, to put it mildly, problematic.

Not all models are wrong, many models are helpful, and making even better models is a great idea (that should be funded).

<p align="center">. . .</p>

### Notes

[^1] George Box. Various sources and transcriptions. ca. 1976-1979.

[^2] [Wikipedia](https://en.wikipedia.org/wiki/All_models_are_wrong).

[^3] Wikipedia calls the saying an aphorism, which seems appropriate for a statement so widely used and so poorly sourced.

[^4] Sourcing the print origin of the quotation is complicated by the fact that Box first printed *part* of it in his earlier article, *Science and Statistics*, in 1976 (emphasis mine): 

> Parsimony ... Since **all models are wrong** the scientist cannot obtain a "correct" one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so overelaboration and overparameterization is often the mark of mediocrity.

One could argue that this is the superior introduction of the concept! But, alas, it does not contain the entirety of the famous quotation.

[^5] It is perhaps worth noting that the model that Box uses in his example is itself not complete. The Ideal Gas Law includes an important additional factor, $n$, which hails from Avogadro's law and represents the number of moles of the substance being measured (see: Libretexts). So, the quote "a physical view of the behavior of gas molecules" is somewhat undercut by their complete absence from the formula Box retells. Perhaps by providing a wrong model, Box was cleverly reinforcing his point.

[^6] Box 1979, Page 204.

[^7]

[^8] A turn of phrase that might be seen as fortunate, or unfortunate, depending upon the perspective of the reader. Nonetheless, I believe it to be true.

[^9] Stahler 2013.

[^10] Bolker *et al* 2009.

[^11] Stahler 2013. Page 232.

[^12] Google Scholar. The confirming works were retrieved from the website [scite.ai](scite.ai) in 2025. They are Clement 2024 and Cassidy 2017.



### References

Clement, Oakleaf, Heffelfinger et al. 2024. An evaluation of potential inbreeding depression in wild Mexican wolves. *J Wildl Manag*.

Cassidy, Mech, MacNulty et al. 2017. Sexually dimorphic aggression indicates male gray wolves specialize in pack defense against conspecific groups. *Behavioural Processes*.