NUTCH-1341 NotModified time set to now but page not modified

git-svn-id: https://svn.apache.org/repos/asf/nutch/trunk@1401288 13f79535-47bb-0310-9956-ffa450edef68
1 parent cf2b782 commit a6b4acc182d01b9bc186ff93ff7cf9d440ce46f6 Markus Jelsma committed Oct 23, 2012
Showing with 6 additions and 0 deletions.
  1. +2 −0 CHANGES.txt
  2. +4 −0 src/java/org/apache/nutch/crawl/CrawlDbReducer.java
@@ -2,6 +2,8 @@ Nutch Change Log
 (trunk) Current Development:
+* NUTCH-1341 NotModified time set to now but page not modified (markus)
+
 * NUTCH-1215 UpdateDB should not require segment as input (markus)
 * NUTCH-1383 IndexingFiltersChecker to show error message instead of null pointer exception (snagel)
@@ -221,6 +221,10 @@ public void reduce(Text key, Iterator<CrawlDatum> values,
     // set the result status and signature
     if (modified == FetchSchedule.STATUS_NOTMODIFIED) {
       result.setStatus(CrawlDatum.STATUS_DB_NOTMODIFIED);
+
+      // NUTCH-1341 The page is not modified according to its signature, let's reset lastModified as well
+      result.setModifiedTime(prevModifiedTime);
+
       if (oldSet) result.setSignature(old.getSignature());
     } else {
       switch (fetch.getStatus()) {
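
The effect of the added lines can be illustrated with a minimal, self-contained sketch. It does not use the Nutch classes: `NotModifiedSketch` and `resolveModifiedTime` are hypothetical names, and the sketch assumes that a page whose content changed takes the fetch time as its modification time. Only the not-modified branch mirrors what the diff above does, which is to carry the previous modification time forward rather than leaving it at "now".

```java
public class NotModifiedSketch {

    /**
     * Hypothetical helper, not part of Nutch. Decides which modification
     * time a record keeps after a fetch.
     *
     * @param notModified      true when the new signature matches the old one
     * @param prevModifiedTime modification time stored before this fetch
     * @param fetchTime        time of the current fetch ("now")
     */
    public static long resolveModifiedTime(boolean notModified,
                                           long prevModifiedTime,
                                           long fetchTime) {
        if (notModified) {
            // Mirrors the NUTCH-1341 change: the content is unchanged,
            // so keep the previous modification time.
            return prevModifiedTime;
        }
        // Assumption for this sketch: a changed page is stamped with the fetch time.
        return fetchTime;
    }

    public static void main(String[] args) {
        long prev = 1_000L;  // illustrative timestamps only
        long now = 5_000L;

        System.out.println("unchanged page keeps: " + resolveModifiedTime(true, prev, now));
        System.out.println("changed page gets:    " + resolveModifiedTime(false, prev, now));
    }
}
```

Before the change, an unchanged page could end up with its modification time set to the fetch time, which is the behaviour the issue title describes.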