Datasets to include #1

Open
leeper opened this Issue Jun 16, 2017 · 8 comments

6 participants

leeper (Owner) commented Jun 16, 2017

Original twitter thread: https://twitter.com/thosjleeper/status/875668146358714368

leeper added the help wanted label Jun 16, 2017

adamlauretig commented Jun 16, 2017

For event data, you could do a yearly slice from ICEWS (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/28075), which has nice monthly (or daily) slices in it, and it seems like there's an R package for dealing with it: https://github.com/ahalterman/phoxy.

Additionally, since ICEWS is both intra- and interstate, you could do some neat network modeling of the interstate data.

andrewheiss commented Jun 16, 2017

V-Dem is excellent for country-level data. There's an R package for accessing the WDI API too, which makes it super easy to get World Bank data.

andrewheiss commented Jun 16, 2017

Will Lowe plays around with SOTU addresses for teaching text analysis.

adamlauretig commented Jun 16, 2017

Also, spurred by the above, in addition to the SOTU stuff (in quanteda), the Comparative Manifestos Project has an R package with an API, manifestoR.

briatte commented Jun 17, 2017

adam3smith commented Jun 17, 2017

Qualitative: Elizabeth Saunders's JFK chapter with active citations: http://doi.org/10.5064/F68G8HMM. There's a write-up of teaching with it here: https://qdr.syr.edu/qdr-blog/teaching-qualitative-data-example

briatte commented Jun 17, 2017

Oh, you seem to like cosponsorship data.

There's a lot of cosponsorship data for European parliaments in this repo:

https://github.com/briatte/parlnet

… but it's probably not the kind of (messy, complex, non-standard) data that you want to use with e.g. students in a teaching setting.

conjugateprior commented Jun 17, 2017

On the text side:

  • UK parliamentary speeches and various other bits of Hansard are available via twfy.
  • New York Times leads and subjects, and also Congressional Bill titles for text classification examples, are available inside the maxent package or directly here.
  • I also teach with this Commons debate on abortion bara-data, analyzed by Bara et al. 2007.