jsdom cannot analyse html file with size larger than 5M ? #528

Closed
anic opened this Issue Nov 6, 2012 · 1 comment

anic commented Nov 6, 2012

I have a local HTML file (about 5 MB in size) to parse. When I use the following code, the program hangs and doesn't show any exception or error.

function extractContent(path, filename) {
    console.log("EXTRACT:" + path);
    var jsdom = require("jsdom");
    var content = [];

    jsdom.env(path, [], jsdom.level(1, "core"), function(errors, window) {
        console.log("DONE:" + path);

        if(errors) {
            console.log(errors);
            return;
        }

        var document = window.document;

        // analyse here...
    });
}

var path = "e:\\sample.html";
var file = "sample.html";
extractContent(path, file);

domenic (Collaborator) commented May 11, 2013

Might be worth retrying this with the latest version now that we have a new parser. But yeah, not sure where the limitation is here: it could be somewhere in Node, or it could be that we have some O(n^2) operation that makes it take forever with large documents.
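
One way to tell a hard size limit apart from an O(n^2) slowdown would be to time the parser on synthetic documents of increasing size. The following is only a rough sketch, not something from the jsdom codebase: `makeHtml` and `measureScaling` are hypothetical helper names, and `parse` stands for whatever synchronous parsing call you want to measure (you would have to wrap your jsdom call yourself).

```javascript
// Build a synthetic HTML document containing `paragraphs` <p> elements.
function makeHtml(paragraphs) {
    var parts = ["<html><body>"];
    for (var i = 0; i < paragraphs; i++) {
        parts.push("<p>paragraph " + i + "</p>");
    }
    parts.push("</body></html>");
    return parts.join("");
}

// Time parse(html) once for each requested size.
// `parse` is an assumption: any synchronous function that takes an HTML string.
// Returns one row per size: { paragraphs, bytes, ms }.
function measureScaling(parse, sizes) {
    return sizes.map(function (n) {
        var html = makeHtml(n);
        var start = process.hrtime();
        parse(html);
        var diff = process.hrtime(start); // [seconds, nanoseconds]
        return {
            paragraphs: n,
            bytes: Buffer.byteLength(html),
            ms: diff[0] * 1e3 + diff[1] / 1e6
        };
    });
}
```

Calling `measureScaling(parse, [1000, 2000, 4000, 8000])` and comparing the `ms` column gives a first hint: if the time roughly doubles when the size doubles, the parser is scaling linearly; if it roughly quadruples, something quadratic is likely involved. Single runs are noisy, so repeating each size a few times would make the comparison more trustworthy.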

domenic closed this Aug 25, 2013