jjelosua/DOGA_scraper

Galician Official journal (DOGA) Scraper

Introduction

The easiest way to access the DOGA dispositions appears to be the DOGA search page, filtered by the desired dates.

This scraper was created and used as a source for this data journalism story.

Script description

The script expects a year as an input parameter and scrapes all the available documents into the data folder (created automatically). Inside it, the script creates a folder named after the year passed as an argument and stores each document in two formats: PDF and HTML.
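The folder handling described above could look roughly like the following sketch. The helper names `document_paths` and `store_document`, the `doc_id` parameter, and the flat `data/<year>/<id>.pdf` naming are assumptions made for illustration; the actual script may organise its files differently.

```ruby
require 'fileutils'

# Hypothetical helper: makes sure <base>/<year>/ exists and returns the
# target paths for one document in both formats (naming is an assumption).
def document_paths(year, doc_id, base = 'data')
  year_dir = File.join(base, year.to_s)
  FileUtils.mkdir_p(year_dir) # creates the data folder and the year folder if missing
  {
    pdf:  File.join(year_dir, "#{doc_id}.pdf"),
    html: File.join(year_dir, "#{doc_id}.html")
  }
end

# Hypothetical helper: writes already-downloaded bodies to disk.
def store_document(year, doc_id, pdf_body, html_body, base = 'data')
  paths = document_paths(year, doc_id, base)
  File.binwrite(paths[:pdf], pdf_body) # PDFs are binary, so write them untranslated
  File.write(paths[:html], html_body)
  paths
end
```

`FileUtils.mkdir_p` is idempotent, so calling it for every document is harmless and removes the need for a separate setup step.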

If the script encounters unexpected behaviour, it logs the details in the logs folder (created automatically).
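One way to get this behaviour is with Ruby's standard `Logger`, as in the minimal sketch below. The helper names `build_logger` and `with_error_logging`, the one-log-file-per-year naming, and the choice to continue after a failure are assumptions, not a description of the original script.

```ruby
require 'fileutils'
require 'logger'

# Hypothetical helper: creates the logs folder if needed and opens logs/<year>.log.
def build_logger(year, log_dir = 'logs')
  FileUtils.mkdir_p(log_dir)
  Logger.new(File.join(log_dir, "#{year}.log"))
end

# Hypothetical wrapper: runs the block for one document and logs any error
# instead of aborting the whole run. Returns nil when the block fails.
def with_error_logging(logger, doc_id)
  yield
rescue StandardError => e
  logger.error("document #{doc_id}: #{e.class}: #{e.message}")
  nil
end
```

Wrapping each document separately means a single malformed page does not stop a scrape that covers a whole year.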

Script requirements

Ruby script

  • require 'mechanize'
  • require 'fileutils'

Rake file

  • require 'pty' # To buffer out the stdout
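Running a child process through `PTY` makes it see a terminal, so its output typically arrives line by line rather than in large buffered chunks. The sketch below shows the general pattern; the helper name `stream_command` is hypothetical, and how the Rake file actually uses `pty` is an assumption.

```ruby
require 'pty'

# Hypothetical helper: runs a command attached to a pseudo-terminal, echoes
# each line as soon as it arrives, and returns the collected lines.
def stream_command(cmd, out = $stdout)
  lines = []
  PTY.spawn(cmd) do |reader, _writer, _pid|
    begin
      reader.each_line do |line|
        out.puts line
        lines << line.chomp # pty output ends in "\r\n"; chomp strips both
      end
    rescue Errno::EIO
      # On Linux, reading after the child closes the pty raises EIO;
      # treat it as the end of the output.
    end
  end
  lines
end
```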

Execution of the script

  • To run the script

    $ rake scrape:DOGA[2014]
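For reference, a Rake task that accepts the year in brackets can be declared as in the sketch below. The `parse_year` helper and the placeholder task body are assumptions made for illustration; the repository's Rake file may differ.

```ruby
require 'rake'

# Hypothetical helper: validates the task argument and converts it to an Integer.
def parse_year(raw)
  raise ArgumentError, 'usage: rake scrape:DOGA[YEAR]' unless raw.to_s =~ /\A\d{4}\z/
  raw.to_i
end

namespace :scrape do
  desc 'Scrape all DOGA documents for a year'
  task :DOGA, [:year] do |_task, args|
    year = parse_year(args[:year])
    # Placeholder: the real task would launch the Ruby scraper for this year.
    puts "Scraping DOGA for #{year}"
  end
end
```

In shells that treat square brackets as glob patterns, such as zsh, the task name needs quoting: `rake 'scrape:DOGA[2014]'`.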
