Improve documentation regarding !ref #82

Closed
lszhu opened this Issue Jul 9, 2014 · 8 comments

lszhu commented Jul 9, 2014

When I create a sheet with 49 columns, only the first 20000 rows are written to the output file.
This is the code:

var xlsx = require('xlsx');
var workbook = xlsx.readFile('template.xlsx');

var alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
var sheet = workbook.Sheets[workbook.SheetNames[0]];
//console.log(sheet.A1.v);
console.log(new Date());
for (var i = 2; i < 100000; i++) {
    for (var j = 0; j < 26; j++) {
        var str = Math.floor(Math.random() * 1E+10).toString();
        str += Math.floor(Math.random() * 1E+10).toString();
        sheet[alpha[j] + i] = {v: str, t: 's'};
    }
    for (; j < 49; j++) {
        var str = Math.floor(Math.random() * 1E+10).toString();
        str += Math.floor(Math.random() * 1E+10).toString();
        sheet['A' + alpha[j - 26] + i] = {v: str, t: 's'};
    }
}
xlsx.writeFile(workbook, 'rand.xlsx');
console.log(new Date());
Owner

I ran your script against a really dumb template and it took roughly 2 minutes:

$ node --version
v0.10.29
$ xlsx --version
0.7.7
$ cat t.js
var xlsx = require('xlsx');
var workbook = xlsx.readFile('template.xlsx');

var alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
var sheet = workbook.Sheets[workbook.SheetNames[0]];
//console.log(sheet.A1.v);
console.log(new Date());
for (var i = 2; i < 100000; i++) {
    for (var j = 0; j < 26; j++) {
        var str = Math.floor(Math.random() * 1E+10).toString();
        str += Math.floor(Math.random() * 1E+10).toString();
        sheet[alpha[j] + i] = {v: str, t: 's'};
    }
    for (; j < 49; j++) {
        var str = Math.floor(Math.random() * 1E+10).toString();
        str += Math.floor(Math.random() * 1E+10).toString();
        sheet['A' + alpha[j - 26] + i] = {v: str, t: 's'};
    }
}
console.log(new Date());
sheet['!ref'] = "A1:AW100000"; // <-- I added this line to force it to write all of the cells
xlsx.writeFile(workbook, 'rand.xlsx');
console.log(new Date());
$ node t.js
Tue Jul 08 2014 22:24:55 GMT-0400 (EDT)
Tue Jul 08 2014 22:25:10 GMT-0400 (EDT)
Tue Jul 08 2014 22:26:47 GMT-0400 (EDT)

This is the generated file -- I warn you, it is big! The file is 262.5MB, which may explain why you are seeing issues. On a side note, it took more time to upload to Dropbox than to generate the file.

Can you test the following:

  1. Add the line with the comment to your script and run

  2. Change the template so that it only has the header row, then run the script from above

  3. Generate CSV instead of XLSX

Also, can you share the template?

lszhu commented Jul 9, 2014

Thank you, it's OK now with the line you added.
But can you explain the line you added in detail? I can't find any documentation for it yet.

Owner

If this explanation makes sense, I'll add it to the readme:

The keys of a worksheet object are A-1 style references to the cell. For example, to get the cell object for B2, you would just access worksheet.B2 or worksheet['B2'].

There are special keys (all starting with !) that correspond to sheet metadata. In particular, !ref is the A-1 style cell range for the worksheet. For example, if the worksheet's range is A1:B2, there are 4 cells (A1, B1, A2, B2). The output functions iterate through the rows and columns as specified in the range. The relevant code segment is in the write_ws_xml_data function: https://github.com/SheetJS/js-xlsx/blob/master/bits/67_wsxml.js#L233

    ... range = safe_decode_range(ws['!ref']) ... // <-- this decodes the range into a range object
    for(var R = range.s.r; R <= range.e.r; ++R) { // <-- walk through each row.  range.s is the start cell (upper-left corner) and range.e is the end cell (lower-right corner)
...
        for(var C = range.s.c; C <= range.e.c; ++C) { // <-- walk through each column

I suspect your template had 20000 rows (or at least, that was the internal range storage). Your original script added cells but didn't update the range, which is why the output functions ignored them. If you want to see the range from your template, print it in the script:

var xlsx = require('xlsx');
var workbook = xlsx.readFile('template.xlsx');
var sheet = workbook.Sheets[workbook.SheetNames[0]];
console.log(sheet["!ref"]);
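
To make the range explanation above concrete, here is a minimal sketch of how a script could recompute the range after adding cells, instead of hard-coding a string like "A1:AW100000". The helpers encodeCol, decodeCol and computeRef are hypothetical names made up for this illustration; they are not part of the xlsx library, and the sketch assumes every cell key is a plain A-1 style reference such as B2 or AW5.

```javascript
// Hypothetical helpers for illustration only; not part of the xlsx library.

// Convert a 0-based column index to A-1 style letters (0 -> "A", 26 -> "AA").
function encodeCol(index) {
    var letters = '';
    for (var n = index + 1; n > 0; n = Math.floor((n - 1) / 26)) {
        letters = String.fromCharCode(65 + (n - 1) % 26) + letters;
    }
    return letters;
}

// Convert A-1 style column letters back to a 0-based index ("A" -> 0, "AA" -> 26).
function decodeCol(letters) {
    var n = 0;
    for (var i = 0; i < letters.length; i++) {
        n = n * 26 + (letters.charCodeAt(i) - 64);
    }
    return n - 1;
}

// Scan the cell keys of a worksheet object and return the smallest A-1 style
// range that covers all of them, or null if the sheet has no cells.
function computeRef(sheet) {
    var minR = Infinity, maxR = -1, minC = Infinity, maxC = -1;
    Object.keys(sheet).forEach(function (key) {
        var m = /^([A-Z]+)([0-9]+)$/.exec(key);
        if (!m) return; // skips "!ref" and any other non-cell key
        var c = decodeCol(m[1]);
        var r = parseInt(m[2], 10);
        if (r < minR) minR = r;
        if (r > maxR) maxR = r;
        if (c < minC) minC = c;
        if (c > maxC) maxC = c;
    });
    if (maxR < 0) return null;
    return encodeCol(minC) + minR + ':' + encodeCol(maxC) + maxR;
}

// Usage: a sheet whose stored range is stale after cells were added.
var demo = {
    '!ref': 'A1:B2',
    A1: {v: 'header', t: 's'},
    C3: {v: 'x', t: 's'},
    AW5: {v: 'y', t: 's'}
};
demo['!ref'] = computeRef(demo);
console.log(demo['!ref']); // prints "A1:AW5"
```

In the script from this thread, the same idea would be to assign sheet['!ref'] = computeRef(sheet) just before calling xlsx.writeFile. Scanning every key costs some time on a sheet with millions of cells, so when the bounds are already known, setting the range string by hand (as in the line I added) is cheaper.
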
lszhu commented Jul 9, 2014

Awesome!
My template was saved from an XLS file which had 20000 rows.
The code is really long and complex, so a more detailed guide would be appreciated. Thanks again.

Owner

@lszhu heh if you think this is complex, the XLS parsing is much more intense

SheetJSDev changed the title from "can not process more than 20000 rows" to "Improve documentation regarding !ref" on Jul 9, 2014
Owner

@lszhu added a short explanation of !ref in the README: https://github.com/SheetJS/js-xlsx#worksheet-object

vikash52 commented Apr 8, 2015

I am trying to convert an HTML table to XLS, and the conversion itself succeeds. However, I am having trouble with formulas: a formula in the HTML table does not show up in the XLS file; I just see a value in place of the formula.

Referring to this example, https://github.com/SheetJS/js-xlsx/blob/master/tests/write.js, it doesn't have any implementation of cell formulas in an XLSX spreadsheet. I tried using cell.f = "=SUM(A1+B1)" for cell C1, with cell.v set to the summed value, which was 3, but I didn't succeed. When the exported file is opened in MS Excel, the cell contains just the data and, when selected, doesn't show the formula I assigned in the f(x) field.

Can someone post an example that actually uses the '.f' property and 'cellFormula'?
It would be very helpful. I just need a working example with static values.

@vikash52 did you solve the issue or find a way around it?
