check links in web documents or full websites

LinkChecker

Check for broken links in web sites.

Features

  • recursive and multithreaded checking and site crawling
  • output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats
  • HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and local file links support
  • restrict link checking with regular expression filters for URLs
  • proxy support
  • username/password authorization for HTTP, FTP and Telnet
  • honors robots.txt exclusion protocol
  • Cookie support
  • HTML5 support
  • a command line and web interface
  • various check plugins available, e.g. HTML syntax and antivirus checks

Installation

See doc/install.txt in the source code archive for general information. In addition to the information given there, please note the following:

Python 2.7.2 or later is needed. LinkChecker does not work with Python 3 yet; see #40 for details.

The version in the pip repository is outdated. Instead, install the current git master version with ` pip install git+https://github.com/linkchecker/linkchecker.git ` (see #4).

Windows builds are seriously lagging behind the Linux releases; see #53 for details. For now, the only two options are to install from source or to use Docker for Windows.

Usage

Run ` linkchecker http://www.example.com ` to check a site. For other options, see ` linkchecker --help `.
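
Several of the features listed above map onto command-line options. The following is a minimal sketch rather than an official recipe: the function name `check_site`, the two regular expressions, and the chosen limits are assumptions for illustration only, and the option spellings should be confirmed against `linkchecker --help` for the installed version.

```shell
# Hypothetical helper (not part of LinkChecker): crawl a site with a few
# restrictions. Adjust the patterns and limits to suit your own site.
check_site() {
    # $1 is the start URL.
    # --recursion-level limits how deep the crawl follows links.
    # --threads sets the number of checking threads.
    # --ignore-url excludes matching URLs (here: mailto: links) from checking.
    # --no-follow-url checks matching URLs but does not recurse into them.
    # --file-output=csv additionally writes a CSV report to a file.
    linkchecker \
        --recursion-level=2 \
        --threads=5 \
        --ignore-url='^mailto:' \
        --no-follow-url='/logout' \
        --file-output=csv \
        "$1"
}
```

After sourcing the function, `check_site http://www.example.com` runs the check. Keeping the options in one function makes it easier to reuse the same settings across several sites.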

Docker usage

If you do not want to install any additional libraries or dependencies, you can use the Docker image.

Example for external web site check: ` docker run --rm -it -u $(id -u):$(id -g) linkchecker/linkchecker --verbose https://google.com `

Local HTML file check: ` docker run --rm -it -u $(id -u):$(id -g) -v "$PWD":/mnt linkchecker/linkchecker --verbose index.html `
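
If you run the Docker image often, the two commands above can be folded into a small shell function. This is only a sketch under stated assumptions: the name `linkchecker_docker` is made up for illustration, and the `-it` flags are dropped on the assumption that the function may be called from scripts or CI jobs, where no terminal is attached and `docker run -t` would fail.

```shell
# Hypothetical wrapper (the name is an assumption, not part of the project).
# Runs the linkchecker/linkchecker image as the current user and mounts the
# current directory at /mnt, as in the local file example above.
linkchecker_docker() {
    docker run --rm \
        -u "$(id -u):$(id -g)" \
        -v "$PWD":/mnt \
        linkchecker/linkchecker "$@"
}
```

With the function defined, `linkchecker_docker --verbose index.html` checks a local file and `linkchecker_docker --verbose https://google.com` checks an external site. All arguments are passed through to the container unchanged.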