Map of Contemporaries
Check it out: https://ybogdanov.github.io/history-timeline/
To stay updated, like us on Facebook.
It is inspired by Wait But Why's blog post about Horizontal History: the idea of taking a "horizontal" slice of time and tracing the lifetimes of all the famous people living at that moment. It certainly gives you a fresh perspective on a particular era (a feel of that time, so to say), unlike the conventional "vertical" approach of learning who came after whom and what happened after what. I can imagine how much fun the blog's author had drawing all of those lifetime rectangles in a Numbers spreadsheet, but simple graphics have their limitations, and a lot of famous people simply didn't make the cut. I wanted to play with the concept at a bigger scale. The motivation of this project is to make the idea expandable, interactive, and crowd-sourced by leveraging modern software engineering tools and approaches.
The timeline uses the data set from Pantheon, a project from the Macro Connections group at the MIT Media Lab. It provides an excellent list of 15,343 historical figures, each marked with a Historical Popularity Index (HPI) that lets us show the most famous people at the top of the timeline. I scraped an offline Wikipedia dump to get the death dates.
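To give a feel for how the HPI is used, here is a minimal sketch of the ranking step; the record fields and the HPI values below are made-up illustrations, not the actual Pantheon schema:

```python
import json

# Hypothetical records in a simplified internal format; the real Pantheon
# data has many more fields, and these names/values are assumptions.
people = [
    {"name": "Aristotle", "birth": -384, "death": -322, "hpi": 31.9},
    {"name": "Leonardo da Vinci", "birth": 1452, "death": 1519, "hpi": 30.9},
    {"name": "Isaac Newton", "birth": 1643, "death": 1727, "hpi": 30.1},
]

def rank_by_hpi(people):
    """Most famous first, so the renderer can place them at the top."""
    return sorted(people, key=lambda p: p["hpi"], reverse=True)

ranked = rank_by_hpi(people)
print(json.dumps([p["name"] for p in ranked]))
```

The renderer then only has to take the first N records to fill the top rows of the timeline.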
What you see is an open-source MVP that illustrates what is possible with relatively little software engineering effort. It is fun to use as is, but there are tons of ideas for how the project can be improved further. I will gradually commit to the project in proportion to the community's interest.
I hope to engage other people in contributing and helping to evolve the visualization and data mining aspects of the tool. Here's what you can help with:
- Visualizations. Since the data is already delivered to the browser, we can improve the way it is displayed.
- Data mining. There is a common data-mining pipeline already set up, designed with adding new sources in mind. The scraping quality of the existing sources can also be improved.
- Content. Besides the programmatic part of data collection, the data can also be tweaked manually: making corrections, adding new historical figures, etc.
There are a lot of glitches right now, such as zero-aged people and some obviously popular figures missing from the timeline. The whole point is that you can fix all of this yourself!
Ideas for improvement
You can follow the issue links and subscribe to notifications, so you'll know when a feature is ready or a discussion is happening.
- Issue #3 Plot the most impactful historical events from here, so you can relate the events to the people who lived at that time
- Issue #4 Possibly, we can also display some personal events right on the people's rectangles
- Issue #5 Select a particular person so that their lifespan is zoomed to 90% of the timeline horizontally. That way we can better inspect a life and its intersections with other people's
- Issue #6 An interactive overlay dialog that shows quick info about the person
- Issue #7 Hovering over the ruler can display a vertical cursor that highlights the people alive at that particular time; it may also show each person's age at that point
- Issue #8 We can show interactively how people relate to each other in terms of different generations
Things that have to be improved:
- Issue #10 Country filter is awful. There are too many countries and coloring them or filtering by the desired country is very inconvenient at the moment. Any ideas?
Other ideas I've heard that I'm not really into, but would be curious to see implemented:
- Have an interactive map that shows people who are visible in the viewport
- iPad etc. native implementation
- Displaying avatars over the timeline rectangles
Note: at this early stage of the project's development, do not expect to see fancy stuff like Gulp, Webpack, ReactJS, Babel, Elm, etc. I made it simple and stupid, focusing on the functionality first. The infrastructure may be improved if the project grows.
How it works - data pipeline
First, there is a data mining pipeline: a set of Python scripts that manipulate files (mostly JSON ones) through multiple steps. Here is a diagram that illustrates the process:
- import_pantheon.py transforms the Pantheon data format into the internal one. It also attempts to normalize names using the large map of redirects extracted from Wikipedia
- txt_to_json.py converts the list of people maintained manually in manual.txt into JSON
- union.py combines multiple lists; in our case, the data from Pantheon and the manual list of people
- intersect.py merges the list of people we've got from the sources with the data scraped from Wikipedia (currently we fill in death dates, and birth dates when they are missing)
- final.py does the final sorting by popularity and a last normalization pass; it can also optionally limit the number of entries
- wrap_jsonp.py wraps the data so it can be safely delivered to a web browser
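As an illustration of what one of these steps does, here is a minimal sketch of an intersect.py-style merge that fills in missing dates from the Wikipedia-scraped data. The field names ("name", "birth", "death") and the dictionary shapes are assumptions for illustration, not the project's actual schema:

```python
def intersect(people, wiki_dates):
    """Fill missing birth/death dates in the people records,
    using a lookup of scraped Wikipedia dates keyed by name."""
    for person in people:
        scraped = wiki_dates.get(person["name"])
        if not scraped:
            continue  # no Wikipedia match for this person
        for field in ("birth", "death"):
            if person.get(field) is None and scraped.get(field) is not None:
                person[field] = scraped[field]
    return people

# Tiny hypothetical inputs: one record missing a death date.
people = [{"name": "Galileo Galilei", "birth": 1564, "death": None}]
wiki = {"Galileo Galilei": {"birth": 1564, "death": 1642}}
print(intersect(people, wiki))
```

The real script works the same way in spirit: records that have no Wikipedia match pass through unchanged, and scraped values never overwrite values already present in the source data.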
I will describe the process in more detail once there are people interested in contributing.
For development, I serve the files with this one-liner:

```shell
ruby -run -ehttpd . -p8000
```
- There is a single index.html file that renders the main page
- Less is used to generate the CSS
- d3 for data visualization
- The data is loaded using JSONP and is then available to the page's scripts
In case you notice that some important figure is missing, or that birth/death dates are incorrect, you can edit the data/sources/manual.txt file to fix it and then open a pull request.
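For a sense of how such a manual list can be turned into the internal JSON format, here is a hedged sketch of a txt_to_json.py-style converter. The pipe-separated "Name | birth | death" line format used below is purely hypothetical; check the actual manual.txt for the real syntax before editing it:

```python
import json

def txt_to_json(lines):
    """Parse hypothetical 'Name | birth | death' lines into records.
    The separator and fields are assumptions for illustration only."""
    people = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name, birth, death = (part.strip() for part in line.split("|"))
        people.append({"name": name, "birth": int(birth), "death": int(death)})
    return people

sample = ["Johann Sebastian Bach | 1685 | 1750"]
print(json.dumps(txt_to_json(sample)))
```

Keeping the manual source as plain text like this makes pull-request diffs easy to review: one person per line.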
The project uses a few data files that are too large to store directly in Git, so I use git-lfs for them.
If you want to work with the data and see something like this, it means you don't have the git-lfs plugin installed:
```
$ make data
bzip2 -dk data/sources/pantheon.json.bz2
bzip2: data/sources/pantheon.json.bz2 is not a bzip2 file.
make: *** [data/sources/pantheon.json] Error 2
```
If you have git-lfs installed, the data files will be downloaded when the repository is cloned. Otherwise, there will be text "pointers" in place of the data files, and you have to fetch the real files with the git-lfs tool. If you installed git-lfs after cloning the repository, run the git lfs pull command to replace the pointers with the actual files.
```
$ git lfs ls-files
1cc7f8e11e - data/redirects_wiki.json.bz2
a7003eb432 - data/sources/pantheon.json.bz2
097d0890e6 - data/wiki.json.bz2
```
The Wikipedia data is downloaded from the official Wikipedia database dumps (dumps.wikimedia.org).
The dump is about 52G, so you will most likely want to store it on an external hard drive. You can safely symlink that large file into the project directory, since data/sources/wikipedia.xml is gitignored:

```shell
ln -sf YOUR_STORAGE/enwiki-20151201-pages-articles.xml `pwd`/data/sources/wikipedia.xml
```
Licensed under GPLv2
Map of Contemporaries is a visualization of the world in famous people's lifespans. Copyright (C) 2016 Yuriy Bogdanov

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.