jikkujose/xp


xp

A Ruby gem that adds methods to the String class for intuitive HTML/XML scraping.

Installation

$ gem install xp

Usage

On the command line, xp filters HTML/XML documents read from STDIN:

$ curl -s 'https://news.ycombinator.com' | xp --text '//td[@class="title"]/a'

OR

$ curl -s 'https://news.ycombinator.com' | xp --text 'td.title > a'
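To make concrete what such a selector matches, here is a minimal sketch that runs the XPath form against a small, made-up document. It uses REXML, which ships with Ruby, as a stand-in; xp itself builds on Nokogiri, so this only illustrates the selector and is not how xp is implemented. The sample markup and URLs are assumptions.

```ruby
require 'rexml/document'

# Made-up, well-formed markup standing in for a fetched page.
sample = <<~HTML
  <table>
    <tr><td class="title"><a href="https://example.com/a">First story</a></td></tr>
    <tr><td class="subtext"><a href="https://example.com/u">someuser</a></td></tr>
    <tr><td class="title"><a href="https://example.com/b">Second story</a></td></tr>
  </table>
HTML

doc = REXML::Document.new(sample)

# Select <a> elements that are direct children of <td class="title">;
# @class tests the attribute value.
titles = REXML::XPath.match(doc, '//td[@class="title"]/a').map(&:text)

puts titles
```

The CSS form `td.title > a` expresses the same parent/child relationship, but REXML has no CSS selector support, so only the XPath variant is shown here.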

Require the gem (require 'xp') to use it in Ruby scripts. The following one-liner downloads all Dribbble shots on the Dribbble home page:

'https://dribbble.com/'.css('.dribbble-link img').xpath('//img/@src').map(&:text).map(&:download)

API

xp adds the following methods to the String class:

| Method | Return type | Remarks |
| --- | --- | --- |
| to_nokogiri | Nokogiri::XML::Document | Converts a URL or page source to a Nokogiri object |
| css(selector) | String | Filters a URL or HTML string based on the selector |
| xpath(selector) | String | Filters a URL or HTML string based on the selector |
| download(location: 'downloads', name: nil) | String | Downloads the URL in the string (can be customized via the optional parameters) |
| page_source(user_agent_alias: :mac_firefox, user_agent: nil) | String | Gets the page source of a URL (can be customized via the optional parameters) |
| url? | Boolean | Checks whether the current string is a URL |
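As a rough illustration of two of these methods, the sketch below shows one way a URL check and a download destination could be computed using only Ruby's standard library. The helper names `looks_like_url?` and `download_path` are hypothetical and are not part of xp. The behaviour shown (http(s)-only URLs, file name taken from the URL path) is an assumption, not the gem's documented implementation.

```ruby
require 'uri'

# Hypothetical stand-in for the idea behind `url?` (not the gem's code):
# accept only absolute http(s) URIs that have a host.
def looks_like_url?(string)
  uri = URI.parse(string)
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

# Hypothetical stand-in for how `download` might pick a destination path:
# `location` is the target directory, and `name` overrides the file name
# that would otherwise be taken from the URL path.
def download_path(url, location: 'downloads', name: nil)
  File.join(location, name || File.basename(URI.parse(url).path))
end

puts looks_like_url?('https://dribbble.com/')
puts looks_like_url?('<html><body></body></html>')
puts download_path('https://example.com/shots/cat.png')
puts download_path('https://example.com/shots/cat.png', location: 'tmp', name: 'c.png')
```

Telling a URL apart from page source matters because, per the table, css, xpath and to_nokogiri accept either kind of string.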
