Link checker for BBC News & World Services sites.
The idea is to quickly check a page for broken links by running a status check on every relative URL on the page.
There are four parts to this tool: the URL, the base URL, the regex, and the filename.
- URL is the page that you want to check for broken links, e.g. http://www.bbc.co.uk/arabic
- Base URL is joined with each relative URL matched by the regex to create a full URL, e.g. http://www.bbc.co.uk
- Regex is the part of the URL that you want to match and keep, e.g. /arabic
- Filename is the Markdown (.md) file where all the page links are stored, which can be useful for manual checks, e.g. arabic.md
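How the base URL and the regex combine can be sketched in Ruby. This is an illustration only, not linkey's actual implementation: the helper name `full_urls`, the sample HTML, and the simple href-scanning pattern are all assumptions.

```ruby
# Hypothetical helper, not part of linkey: collect relative links that
# start with the given prefix and join each one onto the base URL.
def full_urls(html, base, prefix)
  # Grab the value of every double-quoted href attribute.
  hrefs = html.scan(/href="([^"]+)"/).flatten
  hrefs
    .select { |href| href.start_with?(prefix) } # keep only matching relative URLs
    .uniq                                       # check each link once
    .map { |href| base + href }                 # relative path -> full URL
end

# Sample markup, made up for this example.
sample_html = <<~HTML
  <a href="/arabic/middleeast">Middle East</a>
  <a href="/arabic/sports">Sports</a>
  <a href="/news">News</a>
HTML

puts full_urls(sample_html, 'http://www.bbc.co.uk', '/arabic')
```

With the sample markup, only the two links under /arabic are kept and prefixed with the base URL; the /news link is dropped.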
gem install linkey
linkey check <url> <base_url> <regex> <filename>
linkey check http://www.bbc.co.uk/arabic http://www.bbc.co.uk /arabic arabic.md
linkey check http://www.theguardian.com/technology/2014/feb/15/year-of-code-needs-reboot-teachers http://theguardian.com /technology news.md
Once running, you'll see one of two results for each link, either
Status is 200 for <URL>
or
Status is NOT GOOD for <URL>
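A status check of this kind might look roughly like the sketch below. It assumes that anything other than a 200 counts as NOT GOOD and that redirects are not followed. The helper names `status_line` and `check` are hypothetical, and linkey's own internals may differ.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: format a result line in the style shown above.
def status_line(code, url)
  code.to_i == 200 ? "Status is 200 for #{url}" : "Status is NOT GOOD for #{url}"
end

# Hypothetical helper: fetch a URL and report its status.
# Redirects are not followed, so a 301 counts as NOT GOOD in this sketch.
def check(url)
  response = Net::HTTP.get_response(URI(url))
  status_line(response.code, url)
end

# Formatting only; call check(url) to make a real request.
puts status_line(200, 'http://www.bbc.co.uk/arabic')
puts status_line(404, 'http://www.bbc.co.uk/arabic/missing')
```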
Via a Ruby script:
require 'linkey'
url = 'http://www.live.bbc.co.uk/arabic'
base = 'http://www.live.bbc.co.uk'
reg = '/arabic'
filename = 'arabic.md'
page = Linkey::SaveLinks.new(url, filename)
status = Linkey::CheckResponse.new(url, base, reg, filename)
page.capture_links
status.check_links
From a File
If you have a lot of URLs that you want to check regularly, reading them from a file is an alternative. This uses the smoke option, pointed at a YAML file (include the file extension). In some situations we deploy applications that should not be public facing, so ensuring they return a 404 is essential. The status code option sets the expected status code for a group of URLs, so that builds fail when the expected code is not returned.
linkey smoke test.yaml
Example YAML Config:
base: 'http://www.bbc.co.uk'
concurrency: 100
headers:
  - X-content-override: 'https://example.com'
status_code: 200
paths:
  - /news
  - /news/uk
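To show how a config like this could be read, here is a minimal Ruby sketch using the standard library's YAML module. It is not linkey's own loader: the helper name `load_smoke_config` and the shape of the hash it returns are assumptions made for illustration.

```ruby
require 'yaml'

# Hypothetical helper: parse a smoke config and work out what to request.
def load_smoke_config(text)
  config = YAML.safe_load(text)
  {
    # Join each path onto the base to get the full URL to request.
    urls: config['paths'].map { |path| config['base'] + path },
    # Merge the list of single-key hashes into one headers hash.
    headers: (config['headers'] || []).reduce({}, :merge),
    expected_status: config['status_code'],
    concurrency: config['concurrency']
  }
end

settings = load_smoke_config(<<~CONFIG)
  base: 'http://www.bbc.co.uk'
  concurrency: 100
  headers:
    - X-content-override: 'https://example.com'
  status_code: 200
  paths:
    - /news
    - /news/uk
CONFIG

settings[:urls].each do |url|
  puts "expect #{settings[:expected_status]} for #{url}"
end
```

A check for non-public applications would use the same layout with status_code set to 404.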
Via a Ruby script:
require 'linkey'
tests = Linkey::Checker.new("path/to.yaml")
tests.smoke