Scrapula

Scrapula is a library for scraping web pages that simplifies some of the common actions that are involved.

It has a simple API that can be used in several ways and contexts, plus a second, shorter one that makes processing pages easier where keystrokes are scarce, such as irb / pry sessions or quick and dirty scripts.

Requirements

It uses Mechanize and Nokogiri to obtain and extract the information, and RSpec for testing.

Configuration

If you want to show the output of some steps:

Scrapula.verbose = true
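
As an illustration of how a module-level flag like this can gate output, here is a minimal sketch. It is not Scrapula's actual code: the `VerboseSketch` module and its `log` helper are hypothetical names, and only the `verbose = true` style of assignment comes from this README.

```ruby
# Hypothetical sketch, NOT Scrapula's implementation.
# `VerboseSketch` and `log` are made-up names for illustration.
module VerboseSketch
  class << self
    # Module-level flag, toggled the same way as `Scrapula.verbose = true`.
    attr_accessor :verbose

    # Writes the message only when the flag is on.
    # Returns true if something was written, false otherwise.
    def log(message, out = $stdout)
      return false unless verbose

      out.puts message
      true
    end
  end
end

VerboseSketch.log('GET example.net') # silent: the flag is unset by default
VerboseSketch.verbose = true
VerboseSketch.log('GET example.net') # now written to $stdout
```

Keeping the flag on the module itself means a single assignment affects every later call, which suits short scripts and console sessions.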

API

Perform requests:

page = Scrapula.get 'example.net' #=> Scrapula::Page object

page = Scrapula.post 'example.net', { q: 'a query' }   #=> Scrapula::Page object

Extract information from the page:

# Using a CSS selector (all elements)
page.search! 'a'

# Using a CSS selector (first element)
page.at! 'h1'

# Using XPath (first element)
page.at! '//'

Perform a GET request:

Scrapula.get 'example.net'

S interface

This API is not required by default, so it is up to you to use it:

require 'scrapula/s'

It provides a method and a one-letter shortcut for each HTTP verb:

S.get 'example.net'
S.g 'example.net'

S.post 'example.net'
S.p 'example.net'

S.put 'example.net'
S.u 'example.net'

S.patch 'example.net'
S.a 'example.net'

S.delete 'example.net'
S.d 'example.net'

S.head 'example.net'
S.h 'example.net'

Additionally, GET requests can be performed through the shortest invocation:

S 'example.net'
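
For readers curious how an interface of this shape can be built in plain Ruby, the sketch below shows one possible approach. It is an assumption-laden illustration, not the code in `scrapula/s`: it performs no HTTP requests and simply returns a hash describing the request that would be made. The verb-to-shortcut pairs are the ones listed above; the `SHORTCUTS` constant, the `params` argument, and the returned hash are hypothetical.

```ruby
# Hypothetical sketch, NOT the real scrapula/s implementation.
# No network access: each call only describes the request it stands for.
module S
  # Verb => one-letter shortcut, as listed in this README.
  SHORTCUTS = { get: :g, post: :p, put: :u, patch: :a, delete: :d, head: :h }.freeze

  class << self
    SHORTCUTS.each do |verb, shortcut|
      # Long form, e.g. S.get 'example.net'
      define_method(verb) do |url, params = {}|
        { verb: verb, url: url, params: params }
      end

      # Short form, e.g. S.g 'example.net'
      alias_method shortcut, verb
    end
  end
end

# A method may share its name with a constant (as Kernel#Integer does),
# which is what allows the bare `S 'example.net'` call.
def S(url, params = {})
  S.get(url, params)
end

S.g 'example.net' # same result as S.get 'example.net'
S 'example.net'   # shortest form, also a GET
```

Defining the shortcuts with `alias_method` keeps each short name bound to exactly the same behavior as its long form, so the two cannot drift apart.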

Examples

There are more examples in the examples folder.

Changelog

You can read previous changes in CHANGELOG.md.

Contributing

Authors

Juan A. Martín Lucas (https://github.com/j-a-m-l)

License

This project is licensed under the MIT license. See LICENSE for details.
