This is definitely the data on the Gapminder website at the time of download. This repo is transparent about where the data came from and traces how the current data frames arise from Excel spreadsheets. See the data-raw directory for detail.
However, I don't doubt there could be data quality problems! It should definitely NOT be used as an authoritative source for life expectancy. Others have pointed out similar problems in other issues.
The package is offered as a dataset for teaching and demonstrating data wrangling and visualization. I, for one, have a lot of resources built around it. Altering a couple of data points would cause huge diffs in many web resources, with questionable pedagogical gain.
With some hand-wringing, I've concluded that package stability does more good than updating it whenever someone finds a better or different estimate for specific data points.
Hello! I was pointed to this repo by this Twitter exchange:
https://twitter.com/R_Graph_Gallery/status/920074231269941248
I'm working on a lesson for my students, and took some inspiration from
https://python-graph-gallery.com/341-python-gapminder-animation/
... which uses your data.
A line plot of all life expectancies shows a dramatic drop for one country in 1977 and another in 1992. The first corresponds to Cambodia, but the value (31.2) is not consistent with the actual life expectancy in Cambodia during the crisis in the 70s, which was around 20 years!
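For anyone who wants to find such dips without eyeballing the plot, here is a minimal sketch. It assumes the data has already been read into `(country, year, life_exp)` tuples; the function name `largest_drops` and that row layout are my own illustration, not something the package provides.

```python
from collections import defaultdict


def largest_drops(rows, top=2):
    """Return the `top` steepest falls in life expectancy between
    consecutive observations of the same country.

    rows: iterable of (country, year, life_exp) tuples (assumed layout).
    Result: list of (change, country, year), most negative change first.
    """
    by_country = defaultdict(list)
    for country, year, life_exp in rows:
        by_country[country].append((int(year), float(life_exp)))

    drops = []
    for country, series in by_country.items():
        series.sort()  # order each country's observations by year
        for (_, prev), (year, cur) in zip(series, series[1:]):
            drops.append((round(cur - prev, 2), country, year))

    drops.sort()  # ascending, so the steepest falls come first
    return drops[:top]
```

Calling it on the full table should surface the same two country/year pairs that stand out in the line plot, if the plot and the text data agree.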
Have a look at my draft:
http://go.gwu.edu/engcomp2lesson4
It's an unexecuted Jupyter notebook (as we push with outputs only when finalized to avoid diff bloat).
When I look at the text data in this repo, I find the same: Cambodia in 1977 = 31.2
However, various sources report a life expectancy there in 1977 that was < 20.
For example: https://data.worldbank.org/country/cambodia
The other dip is Rwanda in 1992 = 23.6
But the World Bank gives 28.1
https://data.worldbank.org/country/rwanda
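To put the two comparisons side by side, here is a small sketch using only the numbers quoted above. For Cambodia I only have an upper bound ("< 20"), so 20.0 is a conservative stand-in and the real gap would be larger. The names `comparisons` and `discrepancies` are just for illustration.

```python
# Values quoted in this thread: (value in this repo, reference value).
# The Cambodia reference is an upper bound ("< 20"), so 20.0 is an
# assumed stand-in, not an exact figure.
comparisons = {
    ("Cambodia", 1977): (31.2, 20.0),
    ("Rwanda", 1992): (23.6, 28.1),
}


def discrepancies(pairs):
    """Map each (country, year) to dataset value minus reference value."""
    return {key: round(ours - ref, 1) for key, (ours, ref) in pairs.items()}


for (country, year), diff in discrepancies(comparisons).items():
    print(f"{country} {year}: {diff:+.1f} years")
```

Note that the two differences have opposite signs: the Cambodia value here is higher than the reference, while the Rwanda value is lower.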
So I wonder: did something go awry when preparing this data set?