Rotto-Links-Scraper

A web crawler/scraper that starts from a target seed URL and finds broken links, selected by keywords matched on the page that contains each broken link.
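
As a rough illustration of the idea (not the project's actual code), the sketch below collects the links on one page and returns the broken ones only when the page mentions at least one keyword. The names `LinkCollector`, `broken_links_matching`, and the `is_broken` callback are hypothetical, and the keyword match is a plain substring test over the raw HTML, which is an assumption about how matching might work.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def broken_links_matching(page_html, keywords, is_broken):
    """Return the broken links on a page if the page mentions any keyword.

    `is_broken` is a caller-supplied function (hypothetical) that takes a
    URL and returns True when the link is dead, e.g. after an HTTP check.
    """
    text = page_html.lower()
    # Simplification: case-insensitive substring match over the raw HTML.
    if not any(keyword.lower() in text for keyword in keywords):
        return []
    collector = LinkCollector()
    collector.feed(page_html)
    return [url for url in collector.links if is_broken(url)]
```

Passing the link check in as a function keeps the sketch free of network calls; a real crawler would plug in an HTTP request there and repeat the process for each page reachable from the seed URL.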

##Installation

  1. Redis
  2. Fabric
  3. Python 2.7+

##Instructions

  1. Install all dependencies listed in requirements.txt with pip:
    $ pip install -r requirements.txt
  1. Set the DATABASE_PATH environment variable in your shell config file (e.g. .bashrc or .zshrc):
    # your shell config file
    export DATABASE_PATH='/path/to/database/'
  1. Set the two environment variables the SMTP server needs to send email to users, also in your shell config file:
    # your shell config file
    export SMTP_USER='smtp-username'
    export SMTP_PASSWORD='smtp-password'
  1. Set one more environment variable so the app saves its logs in a location you choose:
    # your shell config file
    export LOGS_DIR='path/to/logs'
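
To check that the four variables above are visible to Python before starting the app, a minimal sketch like the following can help. The `load_settings` helper and `REQUIRED_VARS` tuple are hypothetical names for illustration and are not part of this project; only the variable names come from the steps above.

```python
import os

# Variable names taken from the instructions above.
REQUIRED_VARS = ("DATABASE_PATH", "SMTP_USER", "SMTP_PASSWORD", "LOGS_DIR")


def load_settings(environ=None):
    """Return the required settings as a dict, or raise if any are unset."""
    environ = os.environ if environ is None else environ
    # Treat empty strings the same as missing variables.
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` in a fresh shell reads `os.environ`; if it raises, re-source your shell config file (or open a new terminal) so the exports take effect.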

##Commands

Note: Install Fabric first to run the commands below.

To run the GUI app:

    $ fab app

To run a dispatcher:

    $ fab dispatcher

To run a worker:

    $ fab worker

##Developers

  1. Akshay Pratap Singh
  2. Sunny Gupta
