Two-day workshop on scraping legislative data, organised by URFIST Bordeaux in 2018.
1-lagasafn
2-jorf
3-cop21
4-qosd
README.md
dependencies.r


R code examples to teach basic Web scraping with rvest and related packages.

Used at a two-day workshop in November 2018: refer to the introductory slides, in French, for details.

Please report any bugs or errors in the issues of this repository, or email me.

DEMOS

  1. lagasafn · legal cross-references in Icelandic law
  2. jorf · XML field extraction from the French Official Journal
  3. cop21 · word extraction from the UNFCCC Paris Agreement
  4. qosd · keyword co-occurrence in French parliamentary questions
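
As a rough illustration of the kind of extraction the demos build on, here is a minimal rvest sketch that pulls links out of a page, loosely in the spirit of the cross-reference demo. It is not taken from the demo folders: the HTML fragment, the `/law/...` paths and the `refs` data frame are all made up for the example, and a real script would pass a URL to `read_html()` instead of an inline string.

```r
library(rvest)

# Inline HTML standing in for a downloaded page (hypothetical content)
page <- read_html('<html><body>
  <h1>Act no. 1</h1>
  <p>See <a href="/law/2">Act no. 2</a> and <a href="/law/3">Act no. 3</a>.</p>
</body></html>')

# Select every link in the page with a CSS selector
# (html_nodes() works across rvest versions; recent ones also offer html_elements())
links <- html_nodes(page, "a")

# Keep the link text and its target side by side
refs <- data.frame(
  label  = html_text(links),
  target = html_attr(links, "href"),
  stringsAsFactors = FALSE
)

print(refs)
```

The same three steps (read the page, select nodes, extract text or attributes) cover most basic scraping tasks; only the selector changes from one site to the next.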

Projects mentioned but not included in the repository:

Slides shown but not included in the repository (available on request):

  • "Large-scale legislative data collection from online sources" (2016)
  • "Web scraping et APIs avec R" (2017)

HOWTO

  1. Run the dependencies.r script to install all required packages.
  2. Run each demo folder separately: each has its own .Rproj file.
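
For readers who want to check their setup before step 1, the sketch below shows one common way to find and install missing packages. It is an assumption about how such a script can be written, not the contents of dependencies.r: the helper names `missing_packages()` and `install_missing()`, the example package names, and the CRAN mirror URL are all hypothetical choices.

```r
# Return the packages from `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(utils::installed.packages()))
}

# Install only what is missing; `repos` is an assumed CRAN mirror
install_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    utils::install.packages(to_install, repos = repos)
  }
  invisible(to_install)
}

# Report only, without installing anything.
# "stats" and "utils" ship with R, so nothing should be listed as missing.
print(missing_packages(c("stats", "utils")))
```

Calling `install_missing(c("rvest", "xml2"))` would then install those two packages only if they are absent; the authoritative package list remains the one in dependencies.r.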

THANKS

  • Sabrina Granger and Isabelle Scarpat-Bouvet for excellent logistics.
  • Thomas J. Leeper for his word_count function, used in the cop21 example.
  • Emiliano Grossman for inspiring the qosd example.