Release Notes - Heritrix 3.4.0-20190418

Andy Jackson edited this page Apr 18, 2019 · 1 revision

Summary of changes since Release Notes - Heritrix 3.4.0-20190207. See the full changelog for more details.

Bugfixes

  • Invalid format exception in scanJobLog #239
  • Allow failed DNS lookups to expire, for #234. #235 (anjackson)
  • Fix some trough dedup bugs #251 (nlevitt)

Additions

  • Set of frontier management changes to support CrawlHQ module #253 (dvanduzer)
  • Remove suffix from warcWriter since it is no longer used. #249 (ruebot)
  • Handle commas more compliantly when parsing srcset #243 (ato)
  • Trough dedup #242 (nlevitt)
