Collect 'disallow' entries from robots.txt
asimov: Scanning the robots of the universe

asimov is a utility that fetches the robots.txt at a given URL and stores every Disallow location it lists in a database named 'sites.sql3'.
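To make the "Disallow locations" idea concrete, here is a minimal sketch of how such entries could be pulled out of the text of a robots.txt file. It is not taken from Main.hs: the names trim, disallowPath, and disallowEntries are hypothetical, and the rules it applies (case-insensitive field name, "#" comments dropped, empty Disallow values skipped) are assumptions about reasonable behaviour rather than a description of what asimov does.

```haskell
import Data.Char (isSpace, toLower)
import Data.Maybe (mapMaybe)

-- | Strip leading and trailing whitespace (including a stray '\r').
trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- | Return the path of a "Disallow:" line, or Nothing for any other line.
-- Hypothetical helper: the field name is matched case-insensitively,
-- anything after '#' is treated as a comment, and empty values are skipped.
disallowPath :: String -> Maybe String
disallowPath line =
  let noComment     = takeWhile (/= '#') line
      (field, rest) = break (== ':') noComment
  in case rest of
       (':' : value)
         | map toLower (trim field) == "disallow"
         , path <- trim value
         , not (null path) -> Just path
       _ -> Nothing

-- | Collect every non-empty Disallow path from the text of a robots.txt file.
disallowEntries :: String -> [String]
disallowEntries = mapMaybe disallowPath . lines
```

Loaded in GHCi, disallowEntries "User-agent: *\nDisallow: /search\nDisallow:\n" evaluates to ["/search"]. Fetching the file and writing the results to the database are left out of the sketch on purpose.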

Setup

The setup and install steps below create a sandbox environment that provides every dependency needed to build and run asimov. I have wrapped the setup, install, and build steps in a Makefile. Run the following before trying to build anything:

  • make setup
  • make install

The setup target will create a sandbox, and install will download all of the dependencies required to build asimov.

Usage

Run asimov with a single URL as its only argument. Since we are rocking a sandbox, you can run the binary straight from that environment. For example:

./cabal-sandbox/bin/asimov www.google.com

Alternatively, you can use cabal to execute:

cabal run www.google.com
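Both examples pass a bare host name rather than a full URL. As a sketch of how a host could be turned into the location of its robots.txt, the hypothetical function robotsUrl below defaults to http:// when no scheme is given and appends /robots.txt. This is an assumption for illustration only; the actual handling in Main.hs may differ, and the sketch expects a host with an optional scheme, not a URL that already carries a path.

```haskell
import Data.List (isPrefixOf)

-- | Build the robots.txt location for a site given on the command line.
-- Assumption (not from Main.hs): a bare host such as "www.google.com"
-- defaults to http://, and trailing slashes are dropped before appending.
robotsUrl :: String -> String
robotsUrl site = withScheme ++ "/robots.txt"
  where
    -- Drop any trailing '/' so the result never contains "//robots.txt".
    stripped   = reverse (dropWhile (== '/') (reverse site))
    hasScheme  = any (`isPrefixOf` stripped) ["http://", "https://"]
    withScheme = if hasScheme then stripped else "http://" ++ stripped
```

Under these assumptions, robotsUrl "www.google.com" evaluates to "http://www.google.com/robots.txt".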

Notes

  • Main.hs contains all of the source for asimov.
  • It might be easier to build Main.hs directly with ghc instead of running make build.

Contact

enferex: https://github.com/enferex
