Add first scraper which scrapes the archive for test purposes
milafrerichs committed Jan 28, 2015
1 parent 2e7b0a3 commit 4567251
Showing 1 changed file with 12 additions and 22 deletions.
34 changes: 12 additions & 22 deletions scraper.rb
@@ -1,24 +1,14 @@
 # This is a template for a Ruby scraper on Morph (https://morph.io)
 # including some code snippets below that you should find helpful
-
-# require 'scraperwiki'
-# require 'mechanize'
-#
-# agent = Mechanize.new
-#
-# # Read in a page
-# page = agent.get("http://foo.com")
-#
-# # Find something on the page using css selectors
-# p page.at('div.content')
-#
-# # Write out to the sqlite database using scraperwiki library
-# ScraperWiki.save_sqlite(["name"], {"name" => "susan", "occupation" => "software developer"})
-#
-# # An arbitrary query against the database
-# ScraperWiki.select("* from data where 'name'='peter'")
-
-# You don't have to do things with the Mechanize or ScraperWiki libraries. You can use whatever gems are installed
-# on Morph for Ruby (https://github.com/openaustralia/morph-docker-ruby/blob/master/Gemfile) and all that matters
-# is that your final data is written to an Sqlite database called data.sqlite in the current working directory which
-# has at least a table called data.
+require 'wombat'
+require 'scraperwiki'
+
+class BuergerbueroScraper
+  include Wombat::Crawler
+  base_url "http://web.archive.org"
+  path "/web/20131226080706/http://www.muenster.de/stadt/buergeramt/mobil-wartezeit.shtml"
+  wartezeit 'xpath=//*[@id="seite"]/div[2]/p[2]/strong'
+  wartende 'xpath=//*[@id="seite"]/div[2]/p[1]/strong'
+  naechste_nummer 'xpath=//*[@id="seite"]/div[2]/p[3]/strong'
+end
+ScraperWiki.save_sqlite(["naechste_nummer"], BuergerbueroScraper.new.crawl)
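
For context, a minimal sketch of what the final line does, assuming Wombat's crawl returns the scraped properties as a plain Hash keyed by the DSL property names; the scraped values shown are hypothetical:

    # Wombat turns the class-level DSL (base_url, path, and the three
    # property declarations) into one page fetch plus XPath lookups;
    # crawl returns the results as a Hash:
    result = BuergerbueroScraper.new.crawl
    # e.g. { "wartezeit" => "30 Minuten", "wartende" => "5", "naechste_nummer" => "345" }
    # (hypothetical values)

    # ScraperWiki.save_sqlite(unique_keys, data) upserts that Hash as one row
    # into the "data" table of data.sqlite (the layout Morph expects),
    # using naechste_nummer as the unique key:
    ScraperWiki.save_sqlite(["naechste_nummer"], result)

Pointing the scraper at a fixed Wayback Machine capture (the 20131226080706 snapshot) rather than the live muenster.de page gives the XPath selectors a stable target while the scraper is being tested, which matches the "for test purposes" in the commit message.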
