Commit

Merge pull request #1 from everypolitician-scrapers/initial-scraper
Initial scraper
tmtmtmtm committed Jan 3, 2017
2 parents 64306ac + b7fcf7b commit b9bf2ef
Showing 6 changed files with 55 additions and 0 deletions.
9 changes: 9 additions & 0 deletions .rubocop.yml
@@ -0,0 +1,9 @@
AllCops:
  Exclude:
    - 'Vagrantfile'
    - 'vendor/**/*'
  TargetRubyVersion: 2.3

inherit_from:
  - https://raw.githubusercontent.com/everypolitician/everypolitician-data/master/.rubocop_base.yml
  - .rubocop_todo.yml
Empty file added .rubocop_todo.yml
5 changes: 5 additions & 0 deletions .travis.yml
@@ -0,0 +1,5 @@
language: ruby
rvm:
  - 2.3.3
sudo: false
cache: bundler
15 changes: 15 additions & 0 deletions Gemfile
@@ -0,0 +1,15 @@
# frozen_string_literal: true

source 'https://rubygems.org'

ruby '2.3.3'

git_source(:github) { |repo_name| "https://github.com/#{repo_name}.git" }

gem 'pry'
gem 'rake'
gem 'rubocop'
gem 'scraped', github: 'everypolitician/scraped'
gem 'scraped_page_archive', github: 'everypolitician/scraped_page_archive'
gem 'scraperwiki', github: 'openaustralia/scraperwiki-ruby',
                   branch: 'morph_defaults'
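
Note on the `git_source(:github)` line in the Gemfile above: Bundler passes the value of each `github:` option to that block and uses the returned string as the git URL. A minimal sketch of the expansion in plain Ruby follows; `github_url` is a hypothetical helper name used only for illustration and is not part of Bundler.

```ruby
# frozen_string_literal: true

# Hypothetical stand-in for the block given to git_source(:github).
# It builds the same HTTPS clone URL the Gemfile block returns.
def github_url(repo_name)
  "https://github.com/#{repo_name}.git"
end

# Repository names taken from the Gemfile in this commit.
%w[everypolitician/scraped openaustralia/scraperwiki-ruby].each do |repo|
  puts github_url(repo)
end
```

Defining the source explicitly this way means the GitHub gems are fetched over HTTPS.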
6 changes: 6 additions & 0 deletions Rakefile
@@ -0,0 +1,6 @@
# frozen_string_literal: true
require 'rubocop/rake_task'

RuboCop::RakeTask.new

task default: %w(rubocop)
20 changes: 20 additions & 0 deletions scraper.rb
@@ -0,0 +1,20 @@
#!/bin/env ruby
# encoding: utf-8
# frozen_string_literal: true

require 'pry'
require 'scraped_page_archive/open-uri'
require 'scraped'

class String
  def tidy
    gsub(/[[:space:]]+/, ' ').strip
  end
end

def scrape_list(url)
  puts "Opening #{url}"
  Scraped::HTML.new(response: Scraped::Request.new(url: url).response)
end

scrape_list('http://www.palemene.ws/new/members-of-parliament/members-of-the-xvi-parliament/')
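
Note on `String#tidy` in scraper.rb above: it collapses every run of whitespace into a single space and then trims both ends. Because `[[:space:]]` is Unicode-aware in Ruby, this also covers non-breaking spaces, which often appear in scraped HTML. A minimal sketch of the behaviour, using a made-up input string rather than data from the target site:

```ruby
# frozen_string_literal: true

# Same helper as defined in scraper.rb in this commit.
class String
  def tidy
    gsub(/[[:space:]]+/, ' ').strip
  end
end

# Hypothetical scraped text containing a non-breaking space (U+00A0),
# a newline, a tab, and padding at both ends.
raw = "  Hon.\u00A0Jane \n\t Doe  "

puts raw.tidy # prints "Hon. Jane Doe"
```

The `gsub` runs before `strip` so that any leading or trailing non-breaking spaces are first turned into ordinary spaces, which `strip` then removes.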
