
generating 100k rows in excel crashes the browser #18

Closed
rivlinehud opened this issue Feb 28, 2017 · 15 comments
@rivlinehud

Hi :-) I love this project!

Is it possible to stream the file, instead of inflating the browser's cache?

I attached some code below.

thanks a lot :-)

```js
function doExport(page, options) {
  return generate(page, options)
    .then(function (blob) {
      if (window.navigator && window.navigator.msSaveOrOpenBlob) {
        // IE
        window.navigator.msSaveOrOpenBlob(blob, page + ".xlsx");
      } else {
        // Not IE
        var url = window.URL.createObjectURL(blob);
        var a = document.createElement("a");
        document.body.appendChild(a);
        a.href = url;
        a.download = page + ".xlsx";
        a.click();
        window.URL.revokeObjectURL(url);
        document.body.removeChild(a);
      }
    })
    .catch(function (err) {
      alert(err.message || err);
      throw err;
    });
}

function generate(page, options) {
  options = options || {};

  options.fileName = options.fileName || page;
  options.headers = BTB.pages[page].exportData.headers;
  options.fields = BTB.pages[page].exportData.fields;
  options.data = BTB.pages[page].exportData.rows;
  options.lang = options.lang || BTB.language.get();

  // Prepare headers
  var translatedHeaders = [];
  options.headers.forEach(function (item) {
    var header = BTB.language.dictionary[item] ? BTB.language.dictionary[item][options.lang] : "";
    translatedHeaders.push(header);
  });
  options.headers = [translatedHeaders];

  // Prepare data structure for Excel
  var newData = [];
  var newDataLine = [];

  options.data.forEach(function (item) {
    newDataLine = [];
    options.fields.forEach(function (field) {
      var newItem = item[field] || "";
      newDataLine.push(newItem);
    });
    newData.push(newDataLine);
  });

  // Duplicate the first row to reach 100k rows
  for (var i = 0; i <= 100000; i++) {
    newData.push(newData[0]);
  }

  options.newData = newData;

  return getWorkbook(options.lang)
    .then(function (workbook) {
      workbook.sheet(0).range("A1:Z1").value(options.headers);
      workbook.sheet(0).range("A2:Z100000").value(options.newData);
      return workbook.outputAsync();
    });
}

function getWorkbook(lang) {
  lang = lang || "En";
  return new Promise(function (resolve, reject) {
    var req = new XMLHttpRequest();
    var url = "/templates/template" + lang + ".xlsx";
    req.open("GET", url, true);
    req.responseType = "arraybuffer";
    req.onreadystatechange = function () {
      if (req.readyState === 4) {
        if (req.status === 200) {
          resolve(XlsxPopulate.fromDataAsync(req.response));
        } else {
          reject("Received a " + req.status + " HTTP code.");
        }
      }
    };
    req.send();
  });
}
```

@dtjohnson
Owner

At which point is it crashing? Which browser(s)?

Could you provide a full working code example as an attachment that I could run?

@rivlinehud
Author

Hi ... here is a repository with the project.
Hope it's helpful.

I'm developing on Chrome, and the browser memory inflates at line 55 of export.js.

thanks :-)

@dtjohnson
Owner

Thanks. The problem is that in line 55 you are setting the value for 2 million cells. Whenever you set the value for a cell, an object gets created for it that looks something like this:

```js
{
    attributes: { r: "A1", s: "1", t: "s" },
    children: [{
        name: "v",
        children: [5]
    }]
}
```

Dumping a heap snapshot, it looks like that object takes around 600 bytes. Multiply that by 2 million and you run into the V8 heap limit. While I'm sure it's possible to find ways to minimize the memory, it's not easy and is unlikely to happen. The good news is that the limit in Chrome/Node.js does seem to be right about 2 million cells. That seems to be a healthy amount for most use cases.
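The arithmetic behind that limit can be sketched quickly. A back-of-envelope estimate, assuming the ~600 bytes/cell heap-snapshot figure above and the 26-column A:Z range from the example code:

```javascript
// Back-of-envelope memory estimate for the export above.
// Assumptions: ~600 bytes per cell object (the heap-snapshot figure
// quoted in this thread) and 26 columns (the A:Z range in the code).
var BYTES_PER_CELL = 600;
var ROWS = 100000;
var COLS = 26;

var cells = ROWS * COLS;                                 // 2.6 million cells
var estimatedMB = (cells * BYTES_PER_CELL) / (1024 * 1024);

console.log(cells + " cells, ~" + Math.round(estimatedMB) + " MB of cell objects");
```

That lands close to the default V8 heap limit before the file-writing steps even start, which matches the crash described here.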

@rivlinehud
Author

I see ...
Is there a way to stream the file in order to free the browser memory?

@dtjohnson
Owner

Nope. The file has to be held in memory in the browser. Do you need more than 2 million cells?

@reviewher

@dtjohnson @rivlinehud there's actually a lower limit to be aware of: V8 strings cannot be larger than 256 MB nodejs/node#3175 (comment)

@rivlinehud
Author

@dtjohnson, yep ... I need the capability to export large stock files.
If I can't do it in the browser I will have to do it on the server, which is less convenient
because it splits the code between client and server.

@dtjohnson
Owner

@reviewher, thanks for that. I wasn't aware of the string limit. That's not the issue in this particular problem, but I'm sure I'll run into it at some point.

@rivlinehud, how many cells do you need? We should be able to get to 4 million cells without too much difficulty...

@rivlinehud
Author

@dtjohnson, 4 million cells would be amazing!!! :-)

@dtjohnson
Owner

OK. So this is going to be harder than I thought. With the current setup you are actually limited to about 650k cells if you want to output a file. While you can certainly get 2M cells in memory, the downstream steps to turn it into a file consume additional memory. I do think 3-4 million cells should be possible, but it will require a lot more effort.

Here is the basic process the xlsx-populate uses:

  1. Opens the file using JSZip. A file with 4 million cells takes only 120kB uncompressed, so I don't think there are any memory concerns at this stage.
  2. Uses sax js to stream the XML text in the files into a corresponding object model. Since sax js already streams and the final object model must be held entirely in memory anyway, I don't think there are any memory savings to find here.
  3. As you manipulate the workbook, xlsx-populate manipulates the object model. Since the point of xlsx-populate is to be minimally invasive and fairly ignorant of the full Open XML spec, there is a limit to what we can do here to cut memory. One obvious big saving is to eliminate child arrays when there is only a single child. That's pretty easy to do and seems to cut memory usage by 40% or so.
  4. To write the file, it must first convert the object model back to XML. We currently use xmlbuilder-js to create an XML document and then convert it to a string. The issue is that the intermediate XML document is itself pretty large and needs to be held in memory at the same time as the object model. To save memory, we'd need to eliminate this step and go straight from the object model to an XML string. It's definitely doable, but non-trivial.
  5. Uses JSZip to zip all of the strings into an XLSX file. Again, the size of the strings/zips is not significant, so I don't think there is a concern here.

So to summarize, I do think it's possible to get to 3-4 million cells, but there's a decent amount of work to get there. You might be better off looking at another library like Sheet JS for the time being.
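For illustration, the step-4 optimization amounts to serializing the node tree straight to a string instead of building an intermediate XML document. A minimal sketch, assuming the `{ name, attributes, children }` node shape shown earlier in this thread; this is not xlsx-populate's actual implementation, and real-world concerns (namespaces, full escaping rules) are omitted:

```javascript
// Minimal sketch: serialize a { name, attributes, children } node tree
// directly to an XML string, with no intermediate XML document.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function toXmlString(node) {
  // Text/number leaves are emitted as escaped character data.
  if (typeof node !== "object") return escapeXml(node);

  var xml = "<" + node.name;
  var attributes = node.attributes || {};
  Object.keys(attributes).forEach(function (key) {
    xml += " " + key + '="' + escapeXml(attributes[key]) + '"';
  });

  var children = node.children || [];
  if (!children.length) return xml + "/>"; // self-closing element

  xml += ">";
  children.forEach(function (child) {
    xml += toXmlString(child);
  });
  return xml + "</" + node.name + ">";
}

// Example: the cell node from the earlier comment
var cell = {
  name: "c",
  attributes: { r: "A1", s: "1", t: "s" },
  children: [{ name: "v", children: [5] }]
};
console.log(toXmlString(cell)); // <c r="A1" s="1" t="s"><v>5</v></c>
```

The saving comes from the output being a single growing string: at no point do the object model and a second full document tree coexist in memory.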

@dtjohnson
Owner

OK. So I just published v1.2.0, which is as far as I'm going to take this for now. I shaved the memory of the data structure a bit and rewrote the XML builder from scratch, which saved a huge amount of memory. Right now you can get just over 2 million cells in Chrome. Now CPU is the big bottleneck. Also, performance in IE is terrible in general, and it doesn't fare well here either. I'm hesitant to push performance much further at this stage; I'd rather wait for more completeness of the API first. So hopefully this is good enough for you.

@rivlinehud
Author

Thanks so much!! :-)

I will do some tests soon and let you know my experience with the upgrade.

@dtjohnson
Owner

I just published v1.6.0 that has much better performance. In tests in V8 I was able to generate 9.5 million cells.

@unchewyglees

Amazing! I've tried a few libraries that would all run out of memory and crash the browser before finding this one. This was able to do approximately 3.75 million cells in a file I just generated. Great work!

@arthanariramesh

Is there a way to overcome this memory crash issue? I am trying to load a 60 MB xlsx.
