wikipedia-degrees-of-sep-philosophy

Degrees of separation from 'philosophy' for Wikipedia articles. The code looks at random articles, the top 100 articles, and the top articles in selected categories. See https://en.wikipedia.org/wiki/Wikipedia:Getting_to_Philosophy for more info on the concept.

I used the code in this repository to create this blog post: https://petermattia.com/articles/2017/10/14/kevin-bacon-and-wikipedia.html.

There are two main Python files:

  • wiki-philosophy.py: Scrapes Wikipedia using bs4 to create CSVs of 500 random Wikipedia pages, the top 100 pages, and the top 30 pages in selected categories (top pages taken from this list).
  • wiki-philosophy-analysis.py: Reads the CSVs generated by wiki-philosophy.py and plots the results with matplotlib and bokeh.
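
To make the idea concrete, here is a minimal sketch of the hop-counting step behind "Getting to Philosophy": follow the first link of each page until you reach the target, and count the hops. This is not the code in wiki-philosophy.py. The function name `degrees_to_target`, its parameters, and the toy link graph are all hypothetical, and the graph stands in for live scraping so the example runs offline.

```python
from typing import Callable, Optional


def degrees_to_target(
    start: str,
    first_link: Callable[[str], Optional[str]],
    target: str = "Philosophy",
    max_hops: int = 100,
) -> Optional[int]:
    """Count first-link hops from `start` to `target`.

    `first_link` maps a page title to the title of its first link,
    or None if the page has no usable link (hypothetical interface).
    Returns None if the chain dead-ends, loops, or needs more than
    `max_hops` hops.
    """
    seen = set()
    page = start
    hops = 0
    while page != target:
        # Stop on a cycle or when the hop budget is used up.
        if page in seen or hops >= max_hops:
            return None
        seen.add(page)
        next_page = first_link(page)
        if next_page is None:
            return None  # dead end: no outgoing link
        page = next_page
        hops += 1
    return hops


# Made-up link graph for illustration only; not real Wikipedia data.
toy_links = {
    "Kevin Bacon": "Actor",
    "Actor": "Person",
    "Person": "Philosophy",
    "Loop A": "Loop B",
    "Loop B": "Loop A",
}

print(degrees_to_target("Kevin Bacon", toy_links.get))  # 3
print(degrees_to_target("Loop A", toy_links.get))       # None (cycle)
```

In a real run, `first_link` would fetch a page and parse out its first article link (for example with bs4). Keeping that step behind a callable lets the counting logic be tested without network access.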

Sample results:

Degrees from "philosophy" by category

Top pages by category

Oct 16, 2017 update: As an exercise, I created a new branch in which I refactored the code to be more "pythonic". After re-running the script to test it, I found that my results were quite different from when I last ran the script a week earlier. The Wikipedia edit history shows that the pages change constantly, so the CSVs and graphical results differ between branches.