sathia27/crawler

This is a web crawler written in Ruby. Nokogiri is used as the scraping tool. The crawler collects all links reachable from a given URL and saves them to a file.

Prerequisites

  • Ruby
  • RubyGems
  • Nokogiri

Installation

Nokogiri

gem install nokogiri

How to run?

git clone https://github.com/sathia27/crawler.git
cd crawler
ruby bin/crawl.rb

Enter link to crawl: https://python.org

Allow external site to crawl? (y/n): n
Spider on https://python.org
You have already crawled this site. Do you want to continue? y/n: y
You can look in crawled links in data dir: data/python.org
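The session above suggests two behaviours: external sites can be excluded from the crawl, and results are written to a file named after the host under `data`. A minimal sketch of how that could work with Ruby's standard `uri` library follows. The helper names `same_site?`, `filter_links`, and `output_path` are assumptions for illustration, not this project's actual code, and the exact-host comparison is also an assumption (it treats `www.python.org` and `python.org` as different sites).

```ruby
require "uri"

# Hypothetical helper: true when link has the same host as the start URL.
def same_site?(link, start_url)
  URI.parse(link).host == URI.parse(start_url).host
rescue URI::InvalidURIError
  false
end

# Hypothetical helper: drops external links unless the user allowed them.
def filter_links(links, start_url, allow_external:)
  return links if allow_external

  links.select { |link| same_site?(link, start_url) }
end

# Hypothetical helper: output file named after the host, inside data_dir.
def output_path(start_url, data_dir = "data")
  File.join(data_dir, URI.parse(start_url).host)
end

start_url = "https://python.org"
found = ["https://python.org/about", "https://example.com/"]

kept = filter_links(found, start_url, allow_external: false)
puts kept
puts output_path(start_url)
```

With `allow_external: false`, only links on the start URL's host are kept, which corresponds to answering `n` at the external-site prompt.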

Check output

Once the program terminates, or you terminate it manually, you can view the output with:

cd crawler/data
tail -f python.org
