In [1]:
import pandas as pd
from math import exp, floor, log, log10
import numpy as np

In [2]:
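def format_integer(x):
    """Round a nonzero number to two significant digits and format it
    with thousands separators"""
    # Power of ten of the second significant digit
    cut = int(floor(log10(abs(x))))-1
    # Note: for x < 10, int() truncates the decimal left by round()
    ret = int(round(x, -cut))
    return format(ret, ",d")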

def print_odds(d):
    """Convert d, the difference in log-likelihood (or equivalently, the
    log of the odds ratio), into a string giving the conventional
    odds ratio"""
    marker = ' '
    # Cap the reported odds at 10^15 and flag capped values with '>'
    if abs(d) > log(1.0e15):
        d = d/abs(d) * log(1.0e15)
        marker = '>'
    if d>0:
        return "{0}{1:s}:1".format(marker, format_integer(exp(d)))
    else:
        return "{0}1:{1:s}".format(marker, format_integer(exp(-d)))

In [3]:
def final_results():
    df = pd.DataFrame({'name' : [ 'Federalist #{0:d}'.format(n) for n in (18, 19, 20, 49, 50, 51, 52, 53, 54, 55, 56, 57, 62, 63) ],
                       'Naive Bayes Poisson'    : [ 47, 31, 10, 22, 33, 70, 42, 45, 38, 14, 46, 44, 34, 49 ],
                       '(Bayes)^2 Poisson'      : [ 61, 39, 19, 17, 28, 55, 40, 40, 33, 14, 40, 38, 33, 49 ],
                       'Decoupled MCMC'         : [ 19, 15,  7, 11, 11, 23, 14, 19, 12,  3, 10,  8, 11, 12 ],
                       'Coupled MCMC'           : [ 14, 13,  9,  5,  4, 11,  6,  8, 5, 0.5,  7,  6,  7,  7 ],
                       'Coupled MCMC (juiced)'  : np.array([ 22, 19, 13,  7,  6, 16, 10, 14,  8,  2, 10,  8, 11, 12 ])/1.75,
                       'M&W Poisson'            : [ 20, 19,  7, 18, 18, 33, 23, 22, 23,  7, 11, 26, 27, 32 ],
                       'M&W N.B. (depreciated)' : [  7,  7,  3,  8,  9, 13, 10, 10,  9,  4,  5, 10, 10, 11 ]
                      })

    def my_fmt(x):
        if isinstance(x, str):
            return x
        return print_odds(x)
    
    return df[['name', 'Naive Bayes Poisson', '(Bayes)^2 Poisson', 'Decoupled MCMC',
               'Coupled MCMC', 'Coupled MCMC (juiced)', 'M&W N.B. (depreciated)']].style.format(my_fmt).set_properties(**{'text-align': 'right'})

# Comparison and Summary

We can appraise the different models by comparing their quoted odds that Madison wrote each of the Federalist papers:

In [4]:
final_results()

| | name | Naive Bayes Poisson | (Bayes)^2 Poisson | Decoupled MCMC | Coupled MCMC | Coupled MCMC (juiced) | M&W N.B. (depreciated) |
|---|---|---:|---:|---:|---:|---:|---:|
| 0 | Federalist #18 | >1,000,000,000,000,000:1 | >1,000,000,000,000,000:1 | 180,000,000:1 | 1,200,000:1 | 290,000:1 | 1,100:1 |
| 1 | Federalist #19 | 29,000,000,000,000:1 | >1,000,000,000,000,000:1 | 3,300,000:1 | 440,000:1 | 52,000:1 | 1,100:1 |
| 2 | Federalist #20 | 22,000:1 | 180,000,000:1 | 1,100:1 | 8,100:1 | 1,700:1 | 20:1 |
| 3 | Federalist #49 | 3,600,000,000:1 | 24,000,000:1 | 60,000:1 | 150:1 | 55:1 | 3,000:1 |
| 4 | Federalist #50 | 210,000,000,000,000:1 | 1,400,000,000,000:1 | 60,000:1 | 55:1 | 31:1 | 8,100:1 |
| 5 | Federalist #51 | >1,000,000,000,000,000:1 | >1,000,000,000,000,000:1 | 9,700,000,000:1 | 60,000:1 | 9,300:1 | 440,000:1 |
| 6 | Federalist #52 | >1,000,000,000,000,000:1 | >1,000,000,000,000,000:1 | 1,200,000:1 | 400:1 | 300:1 | 22,000:1 |
| 7 | Federalist #53 | >1,000,000,000,000,000:1 | >1,000,000,000,000,000:1 | 180,000,000:1 | 3,000:1 | 3,000:1 | 22,000:1 |
| 8 | Federalist #54 | >1,000,000,000,000,000:1 | 210,000,000,000,000:1 | 160,000:1 | 150:1 | 97:1 | 8,100:1 |
| 9 | Federalist #55 | 1,200,000:1 | 1,200,000:1 | 20:1 | 1:1 | 3:1 | 55:1 |


Interestingly, every single model (save the support vector machine) predicts that Madison wrote each and every one of the disputed papers, including the joint ones.  This is precisely as Madison claimed.  I find relatively weak support for Federalist #55, but Madison's notes actually suggest that he borrowed much of the material for that essay from Sir William Temple, so this conclusion is reasonable.  Federalist numbers 62 and 63 are the ones historians doubt most; Mosteller and Wallace concluded that the evidence for Madison having written these is "tremendous."  I find substantially weaker evidence -- probably beyond a reasonable doubt, but certainly not iron-clad.
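
To put the weaker cases in perspective, a log-odds value d corresponds to odds of exp(d):1 and to a probability of exp(d)/(1 + exp(d)). The sketch below is only an illustration: `log_odds_to_probability` is a hypothetical helper that is not part of the analysis code, and the inputs are the "Coupled MCMC" log-odds copied from `final_results()` for Federalist #55, #62, and #63.

```python
from math import exp

def log_odds_to_probability(d):
    """Convert natural-log odds d into the implied probability (hypothetical helper)."""
    odds = exp(d)
    return odds / (1.0 + odds)

# "Coupled MCMC" log-odds copied from final_results()
coupled_log_odds = {55: 0.5, 62: 7, 63: 7}

for paper, d in coupled_log_odds.items():
    print("Federalist #{0:d}: odds {1:,.1f}:1, probability {2:.4f}".format(
        paper, exp(d), log_odds_to_probability(d)))
```

Note that the unrounded odds for Federalist #55 are about 1.6:1; they show up as 1:1 in the table because `format_integer` truncates values below 10 to a whole number.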

In general, it's hard to compare my results to those from Mosteller & Wallace, since I used 243 stop words while they used only 30 words in their final analysis.  Most of the evidence comes not from rare, highly distinctive words, but instead from the more mundane high-frequency words.  This is a true data-mining calculation in the sense that it is the accumulation of lots of tiny pieces of evidence that gives us such strong odds at the end of our calculations.  Since I used roughly eight times as many words as Mosteller and Wallace, with all else held equal I'd naively expect my calculations to yield *substantially* higher certainties; at a minimum, I'd expect to see something like their odds raised to the power 1.5 or 2.

In fact, the odds I find are typically much smaller than even Mosteller & Wallace's most conservative estimate, their depreciated odds.  Considering my much larger sample of words, I think it's unavoidable to conclude that, even after such a historic and careful study, Mosteller and Wallace significantly over-stated their accuracy.  (This is not a criticism... accomplishing what they did with typewriters, mechanical adding machines, and slide rules was quite a feat!  And the MCMC integration method I used to solve the problem wasn't even invented until 1970.)
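
The comparison in the two paragraphs above can be made concrete with a short sketch. Raising odds to a power p multiplies the log-odds by p, so the naive expectation is 1.5 to 2 times the M&W depreciated log-odds. The lists are copied from `final_results()`; the variable names, and the choice to compare against the "Coupled MCMC" column, are assumptions made for illustration.

```python
# Natural-log odds copied from final_results()
papers  = [18, 19, 20, 49, 50, 51, 52, 53, 54, 55, 56, 57, 62, 63]
mw_nb   = [7, 7, 3, 8, 9, 13, 10, 10, 9, 4, 5, 10, 10, 11]
coupled = [14, 13, 9, 5, 4, 11, 6, 8, 5, 0.5, 7, 6, 7, 7]

# Odds raised to the power p correspond to log-odds multiplied by p
naive_low  = [1.5 * d for d in mw_nb]
naive_high = [2.0 * d for d in mw_nb]

for n, mine, lo, hi in zip(papers, coupled, naive_low, naive_high):
    print("Federalist #{0:d}: expected log-odds {1:.1f} to {2:.1f}, found {3:.1f}".format(
        n, lo, hi, mine))

# Papers where the Coupled MCMC log-odds fall below each benchmark
below_mw    = [n for n, mine, mw in zip(papers, coupled, mw_nb) if mine < mw]
below_naive = [n for n, mine, lo in zip(papers, coupled, naive_low) if mine < lo]

print("Below M&W depreciated odds:", len(below_mw), "of", len(papers))
print("Below the naive 1.5x expectation:", len(below_naive), "of", len(papers))
```

Working in log-odds keeps the comparison additive, which is easier to read than ratios spanning many orders of magnitude.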

Among my own results, I think it's interesting and very telling that the quoted certainty decreased appreciably with every improvement to the model, often by orders of magnitude.  In this sense, statistics has something in common with the Dunning-Kruger effect: it's easy to be certain, but much more difficult to be correct.  I do believe the accuracy of the "Coupled MCMC" model, and I think the tests I've shown confirm its trustworthiness.  Of course it is possible that I've blundered somehow, and that a future improvement to the calculation will decrease the odds even further.
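
One rough way to see this trend is to summarize each model by its median log-odds across the 14 papers. This is only a sketch: the median is an assumed choice of summary rather than part of the original analysis, the models are assumed to be ordered by refinement following the table's column order, and the values are copied from `final_results()`.

```python
from math import log
from statistics import median

# Natural-log odds copied from final_results(), in the table's column order
models = {
    'Naive Bayes Poisson': [47, 31, 10, 22, 33, 70, 42, 45, 38, 14, 46, 44, 34, 49],
    '(Bayes)^2 Poisson':   [61, 39, 19, 17, 28, 55, 40, 40, 33, 14, 40, 38, 33, 49],
    'Decoupled MCMC':      [19, 15, 7, 11, 11, 23, 14, 19, 12, 3, 10, 8, 11, 12],
    'Coupled MCMC':        [14, 13, 9, 5, 4, 11, 6, 8, 5, 0.5, 7, 6, 7, 7],
}

# Median log-odds per model
medians = {name: median(values) for name, values in models.items()}

for name, m in medians.items():
    # Dividing by ln(10) expresses the log-odds in orders of magnitude
    print("{0:20s} median log-odds {1:5.1f} (about 10^{2:.1f}:1)".format(
        name, m, m / log(10)))
```

The median is used here because a few capped or extreme values would dominate a mean.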

I also think it's interesting to note that it didn't help at all to try to juice the model to squeeze more information out of it.  The careful Bayesian analysis we did already performed quite well.  I suppose there's a lesson here not to be too greedy!  Nonetheless, there may be opportunities to experiment with different likelihood functions, priors, and stopwords to find even more precision.  There are certainly opportunities to make the code run faster.

Even so, this is the best determination of the authorship of the disputed Federalist papers that I'm aware of, and I'll leave it here for now!