Checks for links to a sandbox site that have made it onto the live site.
.gitattributes
.gitignore
DownloadWorker.py
FirstNames.txt
LICENSE
LastNames.txt
PythonWebCheck.py
README.md
Setup.py
WorkerService.py
handleAPI.py
requirements.txt
spellcheck.py
words_en.txt
words_en_evenless.txt
words_en_less.txt

README.md

Python Web Check

Multithreaded Python web crawler that checks every page in a site for various issues.
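The multithreaded crawl can be pictured with a minimal sketch like the one below. It is not the code in PythonWebCheck.py: the `crawl` function, the `fetch_links` callable, and the in-memory `FAKE_SITE` are all hypothetical names chosen for illustration, and a thread pool from the standard library stands in for whatever worker scheme the project actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl(start_url, fetch_links, max_workers=4):
    """Visit every page reachable from start_url, one link level at a time.

    fetch_links is a hypothetical callable (url -> iterable of linked urls);
    a real one would download the page and extract its links.
    """
    seen = {start_url}
    frontier = [start_url]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            # Fetch all pages of the current level in parallel.
            results = pool.map(fetch_links, frontier)
            next_frontier = []
            # Only this (main) thread touches `seen`, so no lock is needed.
            for links in results:
                for link in links:
                    if link not in seen:
                        seen.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
    return seen


# Stand-in for a real site so the sketch runs without network access.
FAKE_SITE = {
    "/": ["/about", "/contact"],
    "/about": ["/", "/team"],
    "/contact": [],
    "/team": ["/about"],
}


def fake_fetch_links(url):
    return FAKE_SITE.get(url, [])


visited = crawl("/", fake_fetch_links)
print(sorted(visited))  # ['/', '/about', '/contact', '/team']
```

Keeping the `seen` set on the main thread and handing only the downloads to the pool avoids shared-state locking, at the cost of waiting for each level to finish before starting the next.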

Checks for:

  • Links to the dev environment
  • Links that return a 404
  • Spelling errors in the page text
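The three checks listed above could be applied to a single page roughly as follows. This is a sketch under assumptions, not the project's implementation: `PageParser`, `check_page`, the `status_of` callable, and the example hosts are hypothetical, and the word set is a tiny stand-in for a real dictionary such as one loaded from a word-list file.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageParser(HTMLParser):
    """Collects href targets and visible text from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_parts.append(data)


def check_page(html, dev_host, known_words, status_of):
    """Return dev links, dead links, and unknown words found in html.

    status_of is a hypothetical callable (url -> HTTP status code);
    a real one would issue an HTTP request.
    """
    parser = PageParser()
    parser.feed(html)
    parser.close()
    dev_links = [u for u in parser.links if urlparse(u).hostname == dev_host]
    dead_links = [u for u in parser.links if status_of(u) == 404]
    words = re.findall(r"[A-Za-z']+", " ".join(parser.text_parts))
    misspelled = [w for w in words if w.lower() not in known_words]
    return {"dev_links": dev_links, "dead_links": dead_links, "misspelled": misspelled}


# Example data: hosts, statuses, and dictionary are made up for illustration.
SAMPLE = (
    '<p>Welcome to our websiet.</p>'
    '<a href="https://dev.example.com/draft">Draft</a>'
    '<a href="https://www.example.com/missing">Missing</a>'
)
KNOWN = {"welcome", "to", "our", "draft", "missing"}
STATUSES = {"https://www.example.com/missing": 404}

report = check_page(SAMPLE, "dev.example.com", KNOWN, lambda u: STATUSES.get(u, 200))
print(report)
```

Passing the status lookup in as a function keeps the checks testable offline; in real use it would wrap a network request and the word set would come from a much larger list.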

How-Tos

  • Set up on a new machine
    • Install Python 3
    • Open a command window in the folder with the bot
    • Install the requirements using pip: "pip install -r requirements.txt"
  • Run the program
    • Enter the command "python PythonWebCheck.py"
  • Change how it operates
    • Open PythonWebCheck.py in a text editor
    • Edit the environment variables (the settings defined in that file)
  • View logs
    • Open the log files in the logs/ directory
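For orientation, a settings block and a file logger of the kind these steps refer to might look like the sketch below. Every name here (`START_URL`, `DEV_HOST`, `THREAD_COUNT`, `LOG_DIR`, `make_logger`) is hypothetical and may not match what PythonWebCheck.py actually defines; only the logs/ directory name comes from the steps above.

```python
import logging
import os

# Hypothetical settings; the real variable names in PythonWebCheck.py may differ.
START_URL = "https://www.example.com/"  # site to crawl
DEV_HOST = "dev.example.com"            # sandbox host that must not be linked
THREAD_COUNT = 8                        # number of worker threads
LOG_DIR = "logs"                        # directory the log files are written to


def make_logger(log_dir=LOG_DIR, name="webcheck"):
    """Return a logger that appends to <log_dir>/<name>.log, creating the directory."""
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching a second handler on repeated calls
        path = os.path.join(log_dir, name + ".log")
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

Under these assumptions, calling `make_logger().warning("dev link found on %s", START_URL)` would append one timestamped line to logs/webcheck.log.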