Neil Saunders compiled 2021-03-19 08:11:52
A brief exploration of Florence Nightingale’s Crimean War dataset, inspired by “Florence Nightingale: Data Viz Pioneer”, an episode of Cautionary Tales presented by 99% Invisible.
We can get the dataset `Nightingale` from the R package `HistData`.
It’s quite small (24 rows) and looks like this:
Date | Month | Year | Army | Disease | Wounds | Other | Disease.rate | Wounds.rate | Other.rate |
---|---|---|---|---|---|---|---|---|---|
1854-04-01 | Apr | 1854 | 8571 | 1 | 0 | 5 | 1.4 | 0.0 | 7.0 |
1854-05-01 | May | 1854 | 23333 | 12 | 0 | 9 | 6.2 | 0.0 | 4.6 |
1854-06-01 | Jun | 1854 | 28333 | 11 | 0 | 6 | 4.7 | 0.0 | 2.5 |
1854-07-01 | Jul | 1854 | 28722 | 359 | 0 | 23 | 150.0 | 0.0 | 9.6 |
1854-08-01 | Aug | 1854 | 30246 | 828 | 1 | 30 | 328.5 | 0.4 | 11.9 |
1854-09-01 | Sep | 1854 | 30290 | 788 | 81 | 70 | 312.2 | 32.1 | 27.7 |
1854-10-01 | Oct | 1854 | 30643 | 503 | 132 | 128 | 197.0 | 51.7 | 50.1 |
1854-11-01 | Nov | 1854 | 29736 | 844 | 287 | 106 | 340.6 | 115.8 | 42.8 |
1854-12-01 | Dec | 1854 | 32779 | 1725 | 114 | 131 | 631.5 | 41.7 | 48.0 |
1855-01-01 | Jan | 1855 | 32393 | 2761 | 83 | 324 | 1022.8 | 30.7 | 120.0 |
1855-02-01 | Feb | 1855 | 30919 | 2120 | 42 | 361 | 822.8 | 16.3 | 140.1 |
1855-03-01 | Mar | 1855 | 30107 | 1205 | 32 | 172 | 480.3 | 12.8 | 68.6 |
1855-04-01 | Apr | 1855 | 32252 | 477 | 48 | 57 | 177.5 | 17.9 | 21.2 |
1855-05-01 | May | 1855 | 35473 | 508 | 49 | 37 | 171.8 | 16.6 | 12.5 |
1855-06-01 | Jun | 1855 | 38863 | 802 | 209 | 31 | 247.6 | 64.5 | 9.6 |
1855-07-01 | Jul | 1855 | 42647 | 382 | 134 | 33 | 107.5 | 37.7 | 9.3 |
1855-08-01 | Aug | 1855 | 44614 | 483 | 164 | 25 | 129.9 | 44.1 | 6.7 |
1855-09-01 | Sep | 1855 | 47751 | 189 | 276 | 20 | 47.5 | 69.4 | 5.0 |
1855-10-01 | Oct | 1855 | 46852 | 128 | 53 | 18 | 32.8 | 13.6 | 4.6 |
1855-11-01 | Nov | 1855 | 37853 | 178 | 33 | 32 | 56.4 | 10.5 | 10.1 |
1855-12-01 | Dec | 1855 | 43217 | 91 | 18 | 28 | 25.3 | 5.0 | 7.8 |
1856-01-01 | Jan | 1856 | 44212 | 42 | 2 | 48 | 11.4 | 0.5 | 13.0 |
1856-02-01 | Feb | 1856 | 43485 | 24 | 0 | 19 | 6.6 | 0.0 | 5.2 |
1856-03-01 | Mar | 1856 | 46140 | 15 | 0 | 35 | 3.9 | 0.0 | 9.1 |
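The table above can be reproduced with a few lines of R. This is a minimal sketch that assumes the `HistData` package is already installed.

```r
# Assumes HistData is installed: install.packages("HistData")
library(HistData)

data(Nightingale)   # load the data frame into the workspace

dim(Nightingale)    # number of rows and columns
head(Nightingale)   # first six rows
```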
The dataset is not tidy.
- each cause has its own column, rather than columns for cause + value
- columns are a mixture of rates and absolute values
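As a quick check on how the two kinds of column relate: the rates appear to be annualised deaths per 1000 soldiers, that is 12 × 1000 × deaths / Army. This is inferred from the values in the table, not stated in the text above. The sketch below uses two rows copied from the table (April 1854 and January 1855) and needs only base R; the variable names are illustrative.

```r
# Two rows copied from the table: April 1854 and January 1855
army    <- c(8571, 32393)
disease <- c(1, 2761)

# Assumed formula: monthly deaths scaled to a yearly rate per 1000 soldiers
disease_rate <- round(12 * 1000 * disease / army, 1)

disease_rate   # compare with the Disease.rate column for those months
```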
We can select the rate columns and use `pivot_longer` to convert to long format.
Date | Month | Year | Cause | Rate |
---|---|---|---|---|
1854-04-01 | Apr | 1854 | Disease | 1.4 |
1854-04-01 | Apr | 1854 | Wounds | 0.0 |
1854-04-01 | Apr | 1854 | Other | 7.0 |
1854-05-01 | May | 1854 | Disease | 6.2 |
1854-05-01 | May | 1854 | Wounds | 0.0 |
1854-05-01 | May | 1854 | Other | 4.6 |
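The reshaping described above could be done along these lines. It is a sketch, not necessarily the code used to produce the table: the object name `nightingale_long` and the use of `ends_with()` to pick the rate columns are choices made here.

```r
library(HistData)
library(dplyr)
library(tidyr)

data(Nightingale)

nightingale_long <- Nightingale %>%
  # keep the date fields and the three rate columns only
  select(Date, Month, Year, ends_with(".rate")) %>%
  # one row per date and cause, with the rate in a single column
  pivot_longer(
    cols      = ends_with(".rate"),
    names_to  = "Cause",
    values_to = "Rate"
  ) %>%
  # "Disease.rate" -> "Disease", and so on
  mutate(Cause = sub("\\.rate$", "", Cause))

head(nightingale_long)
```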
The help page, `?Nightingale`, provides some R code to generate polar area and line charts, but it’s somewhat dated and cumbersome. Let’s give it the tidyverse treatment.
The “rose chart”, also called a polar area chart or (incorrectly) a Coxcomb chart, is a bar chart projected onto polar coordinates.
We can generate something very similar to Nightingale’s original chart like this:
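What follows is one possible sketch, not necessarily the original code. It assumes two panels split at April 1855, a month order starting in July (an arbitrary choice that only rotates the chart), and a square-root y scale so that wedge area, not radius, tracks the rate. The names `rose_data`, `rose` and `month_order` are illustrative.

```r
library(HistData)
library(dplyr)
library(tidyr)
library(ggplot2)

data(Nightingale)

# Assumed month order; changing the first month only rotates the wedges
month_order <- c("Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
                 "Jan", "Feb", "Mar", "Apr", "May", "Jun")

rose_data <- Nightingale %>%
  select(Date, Month, ends_with(".rate")) %>%
  pivot_longer(ends_with(".rate"), names_to = "Cause", values_to = "Rate") %>%
  mutate(
    Cause  = sub("\\.rate$", "", Cause),
    Month  = factor(as.character(Month), levels = month_order),
    # assumed split into two 12-month panels
    Period = ifelse(Date < as.Date("1855-04-01"),
                    "April 1854 to March 1855",
                    "April 1855 to March 1856")
  )

rose <- ggplot(rose_data, aes(Month, Rate, fill = Cause)) +
  # overlapping (not stacked) wedges, each measured from the centre;
  # some transparency so smaller wedges are not completely hidden
  geom_col(width = 1, position = "identity", alpha = 0.7) +
  scale_y_sqrt() +   # radius = sqrt(rate), so area is proportional to rate
  coord_polar() +
  facet_wrap(~Period) +
  labs(x = NULL, y = NULL, title = "Causes of mortality in the army in the East")

rose
```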
We can’t simply remove the polar coordinates, as this will place some months in the wrong position on the basic column chart. So now we use `Date` on the x-axis.
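A column chart with `Date` on the x-axis might look like the sketch below. Stacking the causes (the `geom_col()` default) is a choice made here; the object names are illustrative.

```r
library(HistData)
library(dplyr)
library(tidyr)
library(ggplot2)

data(Nightingale)

nightingale_long <- Nightingale %>%
  select(Date, ends_with(".rate")) %>%
  pivot_longer(ends_with(".rate"), names_to = "Cause", values_to = "Rate") %>%
  mutate(Cause = sub("\\.rate$", "", Cause))

# Date is a true date column, so the months fall in chronological order
columns <- ggplot(nightingale_long, aes(Date, Rate, fill = Cause)) +
  geom_col() +   # stacked by default
  scale_x_date(date_breaks = "3 months", date_labels = "%b %Y") +
  labs(x = NULL, y = "Annual rate per 1000")

columns
```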
We can also indicate the period before the arrival of the Sanitary Commission using grey shading.
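One way to add the shading is a rectangle drawn with `annotate()` underneath the columns. The arrival date is an assumption: the text gives none, so March 1855 is used here and stored in the illustrative variable `commission_arrival`.

```r
library(HistData)
library(dplyr)
library(tidyr)
library(ggplot2)

data(Nightingale)

nightingale_long <- Nightingale %>%
  select(Date, ends_with(".rate")) %>%
  pivot_longer(ends_with(".rate"), names_to = "Cause", values_to = "Rate") %>%
  mutate(Cause = sub("\\.rate$", "", Cause))

# Assumed arrival date of the Sanitary Commission (not given in the text)
commission_arrival <- as.Date("1855-03-01")

shaded <- ggplot(nightingale_long, aes(Date, Rate)) +
  # grey band from the start of the data to the assumed arrival date;
  # added first so that the columns are drawn on top of it
  annotate("rect",
           xmin = min(nightingale_long$Date), xmax = commission_arrival,
           ymin = -Inf, ymax = Inf,
           fill = "grey80", alpha = 0.5) +
  geom_col(aes(fill = Cause)) +
  labs(x = NULL, y = "Annual rate per 1000")

shaded
```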
We can also show the data as a line chart.
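A line chart needs only a change of geom. Again this is a sketch with illustrative object names, using one line per cause.

```r
library(HistData)
library(dplyr)
library(tidyr)
library(ggplot2)

data(Nightingale)

nightingale_long <- Nightingale %>%
  select(Date, ends_with(".rate")) %>%
  pivot_longer(ends_with(".rate"), names_to = "Cause", values_to = "Rate") %>%
  mutate(Cause = sub("\\.rate$", "", Cause))

# One line per cause; colour replaces fill for line geoms
lines <- ggplot(nightingale_long, aes(Date, Rate, colour = Cause)) +
  geom_line() +
  labs(x = NULL, y = "Annual rate per 1000")

lines
```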
The Cautionary Tales podcast episode concludes that deaths from disease were falling before the arrival of the Sanitary Commission, and that this is obscured - perhaps deliberately - by the choice of the polar area chart.
It’s a fair point. However, what we can’t know is what would have happened through 1855 in the absence of the Sanitary Commission. Is there a hint of the same “double peak”, with a seasonal cycle, but smaller? Is that evidence for the effect of sanitation improvement?