Pantheon biography data

Wang Cheng-Jun edited this page Dec 19, 2016 · 1 revision

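Computational communication is an important branch of computational social science. It focuses on the computable foundations of human communication behavior. Using communication network analysis, communication text mining, and data science as its main analytical tools, it collects and analyzes human communication data at scale (in a non-intrusive way), uncovers the patterns and regularities behind that behavior, and examines the generative mechanisms and basic principles behind those patterns. It can be applied widely in settings such as data journalism and computational advertising, and it emphasizes programming training, mathematical modeling, and computational thinking.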

Amy Zhao Yu, Shahar Ronen, Kevin Hu, Tiffany Lu & César A. Hidalgo. Pantheon 1.0, a manually verified dataset of globally famous biographies. Scientific Data 3, Article number: 150075 (2016). doi:10.1038/sdata.2015.75

method

http://pantheon.media.mit.edu/methods

download data:

https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/28201

pageviews_2008-2013.tsv

Plain Text - 142.3 MB - Jan 4, 2016 - 18 Downloads - MD5: 1fa693514cdefd9d71315604f69c4d8d. Monthly pageviews for all individuals, across all languages, 2008-2013 (TSV).

pantheon.tsv

Tab-Delimited - 2.0 MB - Jan 4, 2016 - 23 Downloads - MD5: c5ba6e1e5e5352f5469801f883e1559a. Flattened data file with all individuals in Pantheon.

wikilangs.tsv

Plain Text - 13.3 MB - Jan 4, 2016 - 18 Downloads - MD5: a2af0f1c125ba3d0202393e1ecaf2444. Language table linking Pantheon biographies to language editions of Wikipedia (TSV).
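Once the three files are downloaded, it may be worth checking them against the MD5 values listed above before any analysis. The following is a minimal sketch that uses only the Python standard library. The function names (`md5_of_file`, `verify_downloads`, `read_tsv_header`) and the `data_dir` parameter are hypothetical helpers introduced here, not part of the Pantheon release, and the sketch assumes the files are saved under their original names and encoded as UTF-8. No column names are assumed; the last helper simply returns whatever header row the file contains.

```python
import csv
import hashlib
from pathlib import Path

# MD5 checksums copied from the Dataverse file listing above.
EXPECTED_MD5 = {
    "pageviews_2008-2013.tsv": "1fa693514cdefd9d71315604f69c4d8d",
    "pantheon.tsv": "c5ba6e1e5e5352f5469801f883e1559a",
    "wikilangs.tsv": "a2af0f1c125ba3d0202393e1ecaf2444",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(data_dir="."):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(data_dir) / name
        results[name] = path.exists() and md5_of_file(path) == expected
    return results


def read_tsv_header(path, encoding="utf-8"):
    """Return the first row of a tab-separated file (assumed to be the header)."""
    with open(path, newline="", encoding=encoding) as fh:
        return next(csv.reader(fh, delimiter="\t"), [])


if __name__ == "__main__":
    for name, ok in verify_downloads().items():
        print(name, "OK" if ok else "missing or checksum mismatch")
```

Reading in chunks matters mostly for pageviews_2008-2013.tsv, which is 142.3 MB; the other two files are small enough to hash in one pass. If the files on Dataverse have been updated since Jan 4, 2016, the checksums above would no longer match.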