Linkey


Link checker for BBC News & World Services sites.

The idea is to quickly check a page for broken links by doing a status check on all the relative URLs on the page.

There are four parts to this tool: the URL, the base URL, the regex and the filename.

  • URL is the page that you want to check for broken links, e.g. www.bbc.co.uk/news/uk-29928282
  • Base URL is combined with each relative URL matched by the regex to create a full URL, e.g. www.bbc.co.uk
  • Regex is the part of the URL path that you want to match and keep, e.g. for bbc.co.uk/news/uk, specifying /news would produce /news/uk.
  • Filename is the Markdown (.md) file where all the page links are stored; this can be useful for manual checks, e.g. file.md
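
How the base URL and the regex combine can be sketched in a few lines of Ruby. This is an illustration only, not Linkey's actual implementation: the helper name full_urls, the sample HTML and the href-matching pattern are all assumptions made for the example.

```ruby
# Hypothetical helper, not part of Linkey: shows how a base URL and a
# path prefix (the "regex" argument) could turn relative links into full URLs.
def full_urls(html, base, prefix)
  # Collect every href value that starts with the given prefix, e.g. /news
  pattern = /href="(#{Regexp.escape(prefix)}[^"]*)"/
  paths = html.scan(pattern).flatten.uniq
  # Prepend the base URL so each path becomes a URL that can be requested
  paths.map { |path| "#{base}#{path}" }
end

sample_html = <<~HTML
  <a href="/news/uk">UK</a>
  <a href="/sport/football">Football</a>
  <a href="/news/world">World</a>
HTML

puts full_urls(sample_html, 'http://www.bbc.co.uk', '/news')
```

With the sample HTML above, only the two /news links are kept and the /sport link is ignored.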

Installation

gem install linkey

Usage

Command Line

linkey check <url> <base_url> <regex> <filename>

Examples

linkey check http://www.bbc.co.uk/arabic http://www.bbc.co.uk /arabic arabic.md
linkey check http://www.theguardian.com/technology/2014/feb/15/year-of-code-needs-reboot-teachers http://theguardian.com /technology news.md

Output

Once running, you'll see one line per checked link: either Status is 200 for <URL> or Status is NOT GOOD for <URL>.
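
The kind of check behind those two messages could look roughly like the sketch below. It assumes a plain GET request via Ruby's standard Net::HTTP; the helper names status_line and check are hypothetical and are not taken from Linkey's source.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: formats a result line in the style described above.
def status_line(code, url)
  code.to_i == 200 ? "Status is 200 for #{url}" : "Status is NOT GOOD for #{url}"
end

# Hypothetical helper: requests a URL and returns the formatted line.
# It performs a real HTTP request, so it is defined here but not called.
def check(url)
  response = Net::HTTP.get_response(URI(url))
  status_line(response.code, url)
end

puts status_line(200, 'http://www.bbc.co.uk/arabic')
puts status_line(404, 'http://www.bbc.co.uk/arabic/missing')
```

Keeping the message formatting separate from the request makes the output easy to test without network access.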

Script It

require 'linkey'

url = 'http://www.live.bbc.co.uk/arabic'
base = 'http://www.live.bbc.co.uk'
reg = '/arabic'
filename = 'arabic.md'

# SaveLinks stores the page's links in the Markdown file
page = Linkey::SaveLinks.new(url, filename)
# CheckResponse runs the status check on the stored links
status = Linkey::CheckResponse.new(url, base, reg, filename)

page.capture_links
status.check_links

From a File

If you have a lot of URLs that you want to check regularly, reading them from a file is an alternative option. Use the smoke command and pass it the path to a YAML file, including the extension. In some situations we deploy applications that should not be public facing, so ensuring they return a 404 is essential. The status code option sets the expected status code for a group of URLs, so that a build fails when a URL does not return that code.

linkey smoke test.yaml

Example YAML Config:

base: 'http://www.bbc.co.uk'

concurrency: 100

headers:
 -
   X-content-override: 'https://example.com'

status_code: 200

paths:
  - /news
  - /news/uk

Via a Ruby script:

require 'linkey'

tests = Linkey::Checker.new("path/to.yaml")
tests.smoke
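
To make the config format more concrete, the sketch below shows one way such a file could be expanded into a list of URLs, a header hash and an expected status code. It uses only Ruby's standard YAML library; the helper name expand_config and the shape of its return value are assumptions for illustration, not Linkey's internals.

```ruby
require 'yaml'

# Hypothetical helper, not part of Linkey: turns the YAML config into the
# values a smoke check would need.
def expand_config(yaml_text)
  config = YAML.safe_load(yaml_text)
  {
    # Each path is appended to the base to form a full URL
    urls: config.fetch('paths', []).map { |path| "#{config['base']}#{path}" },
    # The headers list of single-entry mappings is merged into one hash
    headers: config.fetch('headers', []).reduce({}, :merge),
    # The status code every URL is expected to return
    expected_status: config['status_code']
  }
end

example = <<~YAML
  base: 'http://www.bbc.co.uk'
  concurrency: 100
  headers:
   -
     X-content-override: 'https://example.com'
  status_code: 200
  paths:
    - /news
    - /news/uk
YAML

p expand_config(example)
```

For a group of URLs that must not be public, the same structure would apply with status_code set to 404.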