Commits on May 1, 2012
  1. More informative NSURLErrorCancelled handling (see #3)

    This is more targeted and logs better than the previous change
    committed May 1, 2012
  2. wk_bench: better handling when frames fail to load

    This happens often with e.g. http://googleads.g.doubleclick.net
    
    See #3
    committed May 1, 2012
Commits on Jan 21, 2012
  1. Updated http_bench to use requests

    committed Jan 21, 2012
Commits on Jan 20, 2012
  1. refactored log_replay

    committed Jan 20, 2012
Commits on Jan 17, 2012
  1. Initial refactor

    * Removed Retriever: requests is easy enough to use by default that a
      separate class offers little benefit
    * Refactored Spider class to use requests.async
    committed Jan 16, 2012
Commits on Jun 2, 2010
  1. Clients: better response error handling

    * Won't run response processors at all on error pages (arguably there should be a separate error processor for people who care)
    * Spider requests will include a referer header, which can be included in errors
    committed Jun 2, 2010
  2. check_site: expand user paths in report files

    This allows you to get the expected result if you use e.g. ~/Desktop/mysite.html
    committed Jun 2, 2010
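
The user-path expansion described in the check_site commit above can be sketched with the standard library. This is a minimal illustration rather than the project's actual code; the function name `resolve_report_path` is hypothetical.

```python
import os.path


def resolve_report_path(path):
    """Expand a leading ~ (or ~user) and return an absolute path.

    Hypothetical helper: without the expansion, open("~/Desktop/mysite.html")
    would look for a directory literally named "~" in the working directory.
    """
    return os.path.abspath(os.path.expanduser(path))
```

Paths that do not start with `~` pass through `os.path.expanduser` unchanged, so the helper is safe to apply to every report filename given on the command line.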
Commits on May 5, 2010
  1. Added a default timeout to Retriever, Spider

    This makes it easy for things like check_site.py to have a command-line option to change the request timeout.
    committed May 5, 2010
Commits on Apr 3, 2010
  1. Clients: spider now has minimal circular redirect handling

    This should become nicer than an assertion error at some point
    committed Apr 3, 2010
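
One way circular redirect handling could be made nicer than an assertion error is to track visited URLs and raise a dedicated exception. The sketch below is an assumption about the approach, not the Spider's real implementation; `RedirectLoopError`, `follow_redirects`, and the caller-supplied `get_redirect_target` are hypothetical names.

```python
class RedirectLoopError(Exception):
    """Raised when a redirect chain loops or grows too long (hypothetical)."""


def follow_redirects(start_url, get_redirect_target, max_hops=10):
    """Follow redirects from start_url and return the final URL.

    get_redirect_target(url) is supplied by the caller (an assumption for
    this sketch) and returns the next URL, or None when url is final.
    """
    seen = set()
    url = start_url
    # max_hops redirects means visiting at most max_hops + 1 URLs.
    for _ in range(max_hops + 1):
        if url in seen:
            raise RedirectLoopError("circular redirect at %s" % url)
        seen.add(url)
        next_url = get_redirect_target(url)
        if next_url is None:
            return url
        url = next_url
    raise RedirectLoopError(
        "more than %d redirects starting from %s" % (max_hops, start_url)
    )
```

A caller can catch `RedirectLoopError` and record the URL as broken instead of aborting the whole crawl.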
  2. clients: Spider.queue now accepts kwargs

    This allows callers to pass in options for the underlying client
    committed Apr 3, 2010
  3. Bugfix: now possible to save page/resource lists

    Missed during the last refactor
    committed Apr 3, 2010
  4. check_site: better HTML report errors

    Now we give a friendlier error if the report filename contains ".htm" but the format wasn't set to HTML.
    committed Apr 3, 2010
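
The friendlier report error described above might look roughly like the following check. It is a sketch under assumptions: the function name `check_report_format` and the format string `"html"` are hypothetical and not taken from check_site itself.

```python
def check_report_format(filename, report_format):
    """Fail early when the filename looks like HTML but the format is not.

    Hypothetical helper: both the name and the "html" format value are
    assumptions made for this illustration.
    """
    if ".htm" in filename.lower() and report_format.lower() != "html":
        raise ValueError(
            "Report file %r looks like HTML, but the report format is %r; "
            "did you mean to select the HTML format?" % (filename, report_format)
        )
```

Raising before the crawl starts avoids discovering the mismatch only after a long run has finished.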
Commits on Feb 25, 2010
  1. Better handling of external redirects

    We no longer follow redirects blindly, but we still need an option to follow them so that broken off-site links can be reported.
    committed Feb 25, 2010
Commits on Feb 18, 2010
  1. Minor cleanup

    committed Feb 18, 2010
Commits on Feb 17, 2010
  1. Initial setup.py for use on PyPI

    committed Feb 17, 2010
  2. Module docstring

    committed Feb 17, 2010
  3. Track response time per-request

    committed Feb 17, 2010
Commits on Feb 7, 2010
Commits on Feb 5, 2010
  1. Made it easier to configure HTTP Request options

    Now Fetcher/Spider.queue accepts kwargs which are passed directly to the HTTPRequest object, allowing you to configure things like request timeouts.
    committed Feb 5, 2010
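
The kwargs pass-through described above can be illustrated as follows. This is a simplified sketch, not the project's code: the `HTTPRequest` class here is a stand-in defined only for the example, and the `request_timeout` and `headers` parameters are assumptions.

```python
class HTTPRequest(object):
    """Stand-in request object for this sketch, not a real library class."""

    def __init__(self, url, request_timeout=20.0, headers=None):
        self.url = url
        self.request_timeout = request_timeout
        self.headers = headers or {}


class Spider(object):
    """Minimal queue that forwards keyword arguments to HTTPRequest."""

    def __init__(self, **default_request_kwargs):
        # Defaults applied to every queued request (assumed design).
        self.default_request_kwargs = default_request_kwargs
        self.pending = []

    def queue(self, url, **kwargs):
        # Per-call kwargs override the spider-wide defaults.
        options = dict(self.default_request_kwargs)
        options.update(kwargs)
        self.pending.append(HTTPRequest(url, **options))
```

With this shape, `Spider(request_timeout=5.0)` sets a default for every request, while `spider.queue(url, request_timeout=60.0)` overrides it for a single slow page.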
Commits on Jan 30, 2010
  1. Updated requirements.pip

    committed Jan 30, 2010
Commits on Jan 28, 2010
Commits on Jan 24, 2010
  1. check_site: major reporting overhaul

    * Switched to Jinja2 templating for HTML report, with substantial cleanup for everything related
    * Enabled better reporting for features needed in the report
    * Code & doc maintenance
    
    --HG--
    rename : lib/red_spider_template.html => webtoolbox/templates/red_spider_template.html
    committed Jan 24, 2010
Commits on Jan 23, 2010
  1. Doc cleanup

    Started some docs using Sphinx and renamed the tools for consistency
    committed Jan 23, 2010