Initial commit for a simple Ruby on Rails scraper app

carlosconnected/scraper

Simple Ruby on Rails web tag scraper

Description

The scraper provides an input box where you enter a URL, and it returns the list of tags found in the response from that URL. The implementation relies on the Nokogiri library to parse the HTML content, so the order of the tags follows Nokogiri's parser behavior, which is post-order (you can check its implementation): a parent tag is registered only when its closing tag is visited, that is, after all of its children.
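
To make the post-order claim concrete, here is a minimal sketch in plain Ruby that does not depend on Nokogiri. `TagNode`, `tag`, and `tags_in_post_order` are hypothetical names introduced only for this illustration; they are not taken from the app's code. The `traverse` method is modeled on the children-first behavior described above, not copied from the library.

```ruby
# Hypothetical stand-in for a parsed HTML element: a tag name plus its child elements.
TagNode = Struct.new(:name, :children) do
  # Post-order traversal: yield every descendant first, then the node itself.
  def traverse(&block)
    children.each { |child| child.traverse(&block) }
    block.call(self)
  end
end

# Small helper (assumption, for brevity) to build a tree of tags.
def tag(name, *children)
  TagNode.new(name, children)
end

# Collects tag names in the order the traversal visits them.
def tags_in_post_order(root)
  tags = []
  root.traverse { |node| tags << node.name }
  tags
end

# Tree for: <html><head><title></title></head><body><p><b></b></p></body></html>
document = tag("html",
               tag("head", tag("title")),
               tag("body", tag("p", tag("b"))))

puts tags_in_post_order(document).join(", ")
# prints: title, head, b, p, body, html
```

Each parent appears only after all of its children, which is why `html` comes last in the list even though it is the first tag in the markup.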

Content

The files that contain the core of the app are, in order of relevance:

  1. app/controllers/url_tag_lists_controller.rb
  2. app/views/url_tag_lists/search.html.erb
  3. config/routes.rb
  4. db/migrate/20170111230240_create_url_tag_lists.rb

Run it locally

  1. git clone git@github.com:carlos-peru/scraper.git
  2. cd scraper
  3. bundle install --without production
  4. rails s

Demo

https://rails-html-tags-scraper.herokuapp.com/

License

This project is licensed under the MIT license, Copyright (c) 2017 Carlos Castro.
