Commits on Feb 6, 2012
Commits on Jan 24, 2012
  1. Adding the ability to relax the baseURL sitemap rules. Flipkart does not follow it currently.

    committed Jan 24, 2012
  2. Fixing maven issues. Adding a build plugin configuration. Adding a dummy test to TestUtils.java for maven to work (else it throws an initializeError)

    committed Jan 24, 2012
Commits on Jul 25, 2011
  1. added CHANGES.txt + refactoring of SiteMap objects (thanks to Hannes Schwarz)

    digitalpebble committed Jul 25, 2011
Commits on Jul 21, 2011
  1. Added simple support for the file: protocol.

    Cleaned up packaging.
    
    Added "install" target.
    kkrugler_lists@transpac.com committed Jul 21, 2011
Commits on Jul 12, 2011
  1. package : copy build files to dist dir

    digitalpebble committed Jul 12, 2011
Commits on Jul 6, 2011
  1. changing version to 0.2-SNAPSHOT

    digitalpebble@googlemail.com committed Jul 6, 2011
  2. Changed year to 2011 + distribute jar containing resources + copy license to root of distributed package

    digitalpebble committed Jul 6, 2011
  3. Added Apache License 2.0

    digitalpebble committed Jul 6, 2011
  4. reformat pom.xml + added stage task to build.xml

    digitalpebble committed Jul 6, 2011
  5. pre-initial release : added dev info to pom.xml + ANT tasks for deployment to Maven public repository

    digitalpebble@googlemail.com committed Jul 6, 2011
Commits on Jul 1, 2011
  1. Add jar that's only in (currently unavailable) 101tec Nexus repo, so at least users can manually install it

    kkrugler_lists@transpac.com committed Jul 1, 2011
  2. Remove unneeded dependency on 101tec and Apache snapshot repositories

    kkrugler_lists@transpac.com committed Jul 1, 2011
Commits on Jun 4, 2011
  1. Added missing license headers

    digitalpebble@googlemail.com committed Jun 4, 2011
Commits on Jun 3, 2011
  1. Test code for robots.txt processing code, HTTP fetcher

    kkrugler_lists@transpac.com committed Jun 3, 2011
  2. Test code for robots.txt processing code, HTTP fetcher

    kkrugler_lists@transpac.com committed Jun 3, 2011
  3. Preliminary versions of robots.txt processing code, HTTP fetcher

    kkrugler_lists@transpac.com committed Jun 3, 2011
  4. Preliminary versions of robots.txt processing code, HTTP fetcher

    kkrugler_lists@transpac.com committed Jun 3, 2011
Commits on Jun 4, 2010
  1. unified logging with slf4j

    digitalpebble committed Jun 4, 2010
Commits on Apr 26, 2010
Commits on Feb 9, 2010
  1. improved list of compound tlds - see NUTCH-786

    digitalpebble committed Feb 9, 2010
Commits on Dec 12, 2009
  1. Rolled in Ian's patches to pom.xml and build.xml

    Rolled in Ian's EffectiveTldFinder code & test cases.
    
    Fixed "dist" target for build.
    kkrugler_lists@transpac.com committed Dec 12, 2009
Commits on Dec 4, 2009
  1. Change name of format from "Bixo" to "Crawler-commons"

    kkrugler_lists@transpac.com committed Dec 4, 2009
  2. Initial commit of build system, plus some paid-level domain extraction code from Bixo.

    kkrugler_lists@transpac.com committed Dec 4, 2009