Some tinkering with values

commit 38d7a870e213e101a9c9d101ff5bd7a37cc090c1
parent 82db8cd
Dusko Jordanovski authored March 15, 2012

Showing 1 changed file with 9 additions and 2 deletions.

extractor.js (11 lines changed)
@@ -592,7 +592,7 @@ function analyze(dom, options){
         node.textNodes += 1;
       }
     }
-    node.avgScore = (total + node.score) / ((node.nodes || 0) + 1);
+    node.avgScore = (total + (node.score || 0)) / ((node.nodes || 0) + 1);
 
     return total;
   })(bodyNode);
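
Note on the hunk above: the `|| 0` fallback matters when a node has no `score` yet, because `total + undefined` evaluates to `NaN` and the average would silently become `NaN` too. The sketch below illustrates that effect in isolation. The `avgScore` helper and the sample node are hypothetical and are not part of extractor.js.

```javascript
// Hypothetical helper that mirrors the patched expression; not from extractor.js.
function avgScore(total, node) {
  // `node.score || 0` maps undefined (or NaN) to 0 so the sum stays numeric.
  return (total + (node.score || 0)) / ((node.nodes || 0) + 1);
}

// Sample node without a `score` property (an assumption for illustration).
const unscored = { nodes: 3 };

// Expression as it was before the patch: undefined poisons the sum.
const before = (10 + unscored.score) / ((unscored.nodes || 0) + 1);
// Expression after the patch.
const after = avgScore(10, unscored);

console.log(Number.isNaN(before)); // true
console.log(after);                // 2.5
```

One side effect worth keeping in mind: `|| 0` also replaces any other falsy score, but since the only falsy numbers are `0` and `NaN`, the result is the same as treating a missing score as zero.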
@@ -691,7 +691,9 @@ function analyze(dom, options){
    * If the winner contains relatively few direct children, the content is probably inside one of them.
    * We check this by looking at the directChildren / totalTextNodes ratio.
   **/
+
   var wnode, rnode;
+  printTree(winner)
   while(winner.children.length / winner.textNodes < 0.1){
     wnode = rnode = null;
     winner.children.forEach(function(child){
@@ -714,7 +716,12 @@ function analyze(dom, options){
         }
       });
     }
-    winner = wnode;
+    if(wnode && wnode.words/winner.words > 0.1){
+      winner = wnode;
+    }
+    else {
+      break;
+    }
   }
   
   /**
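
Note on the last two hunks: the loop descends from the winner into one of its children while the directChildren / textNodes ratio stays below 0.1, and the patch stops the descent when no child was selected or the selected child holds 10% or less of the parent's words. The sketch below reproduces only that control flow. The selection logic inside `forEach` is not visible in this diff, so `pickChild` (choose the child with the most words) is a hypothetical stand-in, as are `descend` and the sample tree.

```javascript
// Hypothetical stand-in for the selection logic elided from the diff:
// pick the direct child with the most words, or null if there are none.
function pickChild(node) {
  let best = null;
  node.children.forEach(function (child) {
    if (!best || child.words > best.words) best = child;
  });
  return best;
}

// Descent loop with the guard introduced by the patch.
function descend(winner) {
  while (winner.children.length / winner.textNodes < 0.1) {
    const wnode = pickChild(winner);
    // Only descend if a child was found and it holds more than 10% of the words.
    if (wnode && wnode.words / winner.words > 0.1) {
      winner = wnode;
    } else {
      break; // keep the current winner instead of assigning null or a tiny child
    }
  }
  return winner;
}

// Sample tree (made-up numbers): the wrapper has one child holding most of the text.
const article = { name: "article", words: 90, textNodes: 10,
                  children: [{}, {}, {}, {}, {}] };
const wrapper = { name: "wrapper", words: 100, textNodes: 20, children: [article] };

// Sample tree where the only child is too small to be the content.
const sidebar = { name: "sidebar", words: 5, textNodes: 2, children: [] };
const page = { name: "page", words: 100, textNodes: 20, children: [sidebar] };

console.log(descend(wrapper).name); // article
console.log(descend(page).name);    // page
```

Before the patch the assignment `winner = wnode` was unconditional, so an iteration that selected no child would leave `winner` as `null`, and the next evaluation of `winner.children` would throw a `TypeError`.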
