Commit
Added rethrows for better error handling from hadoop.
Corrected path for using cached data.
Added settings for hadoop queues
Code cleanup and javadoc
Tidies up the UX for domain matching
Experiment with url-parsing for domain matching on source tag
Experiment with url-parsing for domain matching
Experiment with re2j pattern-matching for crawl logs.
Added a comment on a needed improvement in the QA.
Found additional places where hadoop code was trying to read from netarkivet settings.
Optimised local regex'ing of crawl log
Ensure hdfs caching for all interactive jobs.
Fixed configuration logic + minor optimisation
Mapping metadata now uses caching
First attempt at caching for MetadataCDXMapper
Make sure cache dir is created
Removed hard-codes and enabled caching utility
Optimised wrt crawl log extraction
Fixed npe
Fixed cache path to be writable.
Experimental hdfs caching processor for crawl logs only
Showing 13 changed files with 325 additions and 39 deletions.