kodekracker/Rotto-Links-Scraper


Rotto-Links-Scraper

A web crawler/scraper that finds broken links starting from a target seed URL, based on keywords matched in the pages that contain those broken links.
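To make the idea concrete, here is a minimal sketch of keyword-gated broken-link detection for a single page, written against the Python 3 standard library. It is not the project's actual implementation: the names `LinkCollector`, `find_broken_links`, and the injected `is_broken` callback are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_broken_links(page_html, keywords, is_broken):
    """Return the links in page_html that is_broken() flags.

    The page is only inspected when it mentions at least one keyword.
    For simplicity the keywords are matched against the raw HTML,
    markup included; a real crawler would likely match visible text.
    is_broken is a hypothetical callback (url -> bool), for example a
    function that issues an HTTP request and checks the status code.
    """
    text = page_html.lower()
    if not any(keyword.lower() in text for keyword in keywords):
        return []
    collector = LinkCollector()
    collector.feed(page_html)
    return [link for link in collector.links if is_broken(link)]
```

Passing the link check in as a callback keeps the sketch free of network access, so the keyword filter and link extraction can be tried on a static HTML string.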

##Installation

  1. Redis
  2. Fabric
  3. Python 2.7+

##Instructions

  1. First, install all dependencies listed in requirements.txt using the pip package manager:
    $ pip install -r requirements.txt
  2. Set the DATABASE_PATH environment variable in your shell config file (e.g. .bashrc or .zshrc):
    # your shell config file
    export DATABASE_PATH='/path/to/database/'
  3. Set the two environment variables required by the SMTP server, which is used to send email to users, in your shell config file:
    # your shell config file
    export SMTP_USER='smtp-username'
    export SMTP_PASSWORD='smtp-password'
  4. Set one more environment variable, LOGS_DIR, to choose where the app saves its logs:
    # your shell config file
    export LOGS_DIR='path/to/logs'
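Before starting the app, it can help to confirm that all four variables are visible to Python. The following is a small sketch of such a check, not code from this repository; the function name `load_settings` and the `REQUIRED_VARS` tuple are assumptions made for illustration.

```python
import os

# Variable names taken from the instructions above.
REQUIRED_VARS = ("DATABASE_PATH", "SMTP_USER", "SMTP_PASSWORD", "LOGS_DIR")


def load_settings(environ=None):
    """Return the required settings as a dict, or raise if any are unset.

    environ defaults to os.environ; a plain dict can be passed instead,
    which makes the check easy to try without touching the real shell.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` in a fresh shell shows right away whether the exports in the shell config file have taken effect.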

##Commands

Note: Install Fabric first to run the commands below.

To run the GUI app:

    $ fab app

To run a dispatcher:

    $ fab dispatcher

To run a worker:

    $ fab worker

##Developers

  1. Akshay Pratap Singh
  2. Sunny Gupta
