Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Examples and documentation improvements #64

Closed
nukulb opened this issue May 21, 2014 · 24 comments
Closed

Examples and documentation improvements #64

nukulb opened this issue May 21, 2014 · 24 comments

Comments

@nukulb
Copy link

nukulb commented May 21, 2014

I am looking for an example on writing a JSON into xlxs file.

The read works great but I unable to find a working example. Can someone help me with a link or simply point me in the right direction.

@SheetJSDev
Copy link
Contributor

@nukulb the write process takes a JSON object as described in the Cell Object Description section, which isn't particularly convenient ATM. I am working on a way to take a CSV or row object array and convert to the internal format.

As a simple example using the intermediate format, suppose your data was in a 2d grid:

[1,2,3]
[true, false, <nothing>, "sheetjs"]
["foo","bar","0.3"]
["baz", <nothing>, "qux"]]

http://i.imgur.com/wfijIov.png

Suppose you wanted to write that to a sheet called "SheetJS". Then the code would be:

/* original data */
var data = [[1,2,3],[true, false, null, "sheetjs"],["foo","bar","0.3"], ["baz", null, "qux"]]
var ws_name = "SheetJS";

/* require XLSX */
var XLSX = require('xlsx')

/* set up workbook objects -- some of these will not be required in the future */
var wb = {}
wb.Sheets = {};
wb.Props = {};
wb.SSF = {};
wb.SheetNames = [];

/* create worksheet: */
var ws = {}

/* the range object is used to keep track of the range of the sheet */
var range = {s: {c:0, r:0}, e: {c:0, r:0 }};

/* Iterate through each element in the structure */
for(var R = 0; R != data.length; ++R) {
  if(range.e.r < R) range.e.r = R;
  for(var C = 0; C != data[R].length; ++C) {
    if(range.e.c < C) range.e.c = C;

    /* create cell object: .v is the actual data */
    var cell = { v: data[R][C] };
    if(cell.v == null) continue;

    /* create the correct cell reference */
    var cell_ref = XLSX.utils.encode_cell({c:C,r:R});

    /* determine the cell type */
    if(typeof cell.v === 'number') cell.t = 'n';
    else if(typeof cell.v === 'boolean') cell.t = 'b';
    else cell.t = 's';

    /* add to structure */
    ws[cell_ref] = cell;
  }
}
ws['!ref'] = XLSX.utils.encode_range(range);

/* add worksheet to workbook */
wb.SheetNames.push(ws_name);
wb.Sheets[ws_name] = ws;

/* write file */
XLSX.writeFile(wb, 'test.xlsx');

In the future, there will be utility functions to handle most of the logistics for you.

@SheetJSDev
Copy link
Contributor

Preemptively closed this -- will deem complete once the documentation is up to snuff

@SheetJSDev SheetJSDev reopened this May 21, 2014
@nukulb
Copy link
Author

nukulb commented May 21, 2014

@SheetJSDev thanks a lot, I will try it out today but looks straightforward.

@SheetJSDev SheetJSDev mentioned this issue May 21, 2014
@SheetJSDev
Copy link
Contributor

@nukulb how is it going?

@nukulb
Copy link
Author

nukulb commented May 28, 2014

I am starting this work tomorrow, I got side tracked with other stuff.

Thanks for checking in.

On Wed, May 28, 2014 at 6:24 PM, SheetJSDev notifications@github.comwrote:

@nukulb https://github.com/nukulb how is it going?


Reply to this email directly or view it on GitHubhttps://github.com//issues/64#issuecomment-44472600
.

Nukul Bhasin

@nukulb
Copy link
Author

nukulb commented May 29, 2014

that seems to be failing on the last line when I got to write the file.

TypeError: Object #<Object> has no method 'writeFile'
    at exports.exportProducts (/Users/nukulb/src/hubba-node/lib/controllers/exportProducts.js:65:10)
    at callbacks (/Users/nukulb/src/hubba-node/node_modules/express/lib/router/index.js:164:37)
    at auth (/Users/nukulb/src/hubba-node/lib/middleware.js:16:20)
    at callbacks (/Users/nukulb/src/hubba-node/node_modules/express/lib/router/index.js:164:37)
    at param (/Users/nukulb/src/hubba-node/node_modules/express/lib/router/index.js:138:11)
    at pass (/Users/nukulb/src/hubba-node/node_modules/express/lib/router/index.js:145:5)
    at Router._dispatch (/Users/nukulb/src/hubba-node/node_modules/express/lib/router/index.js:173:5)
    at Object.router (/Users/nukulb/src/hubba-node/node_modules/express/lib/router/index.js:33:10)
    at next (/Users/nukulb/src/hubba-node/node_modules/express/node_modules/connect/lib/proto.js:193:15)
    at Object.handle (/Users/nukulb/src/hubba-node/lib/config/express.js:101:13)

Output XLSX

{ read: [Function: readSync],
  readFile: [Function: readFileSync],
  utils:
   { encode_col: [Function: encode_col],
     encode_row: [Function: encode_row],
     encode_cell: [Function: encode_cell],
     encode_range: [Function: encode_range],
     decode_col: [Function: decode_col],
     decode_row: [Function: decode_row],
     split_cell: [Function: split_cell],
     decode_cell: [Function: decode_cell],
     decode_range: [Function: decode_range],
     sheet_to_csv: [Function: sheet_to_csv],
     make_csv: [Function: sheet_to_csv],
     get_formulae: [Function: get_formulae],
     sheet_to_row_object_array: [Function: sheet_to_row_object_array] },
  version: '0.5.17' }

Do I need a different version for the 'writeFile' method?

@nukulb
Copy link
Author

nukulb commented May 29, 2014

seems like npm has a very old version of this.

I ran npm pack myself to get 0.7.3 version and this example works.

@SheetJSDev any chance you publish the latest to npm?

@SheetJSDev
Copy link
Contributor

@nukulb https://www.npmjs.org/package/xlsx shows that the latest version is 0.7.3 and has been there for the last 4 days. What version of node/npm are you using? Also, can you show the line for xlsx in your package.json?

@nukulb
Copy link
Author

nukulb commented May 29, 2014

you are correct, npm has the right version problem seems to be on my end. I will figure it out on my end.

Your code seems to be work fine. I will start doing some more complicated stuff tomorrow and I might ask more questions here. I really appreciate your help. Thanks.

@SheetJSDev
Copy link
Contributor

The README has a few browser and nodejs examples and the test suite also includes ad-hoc and round trip tests. Closing for now. If you encounter an issue with the examples please raise a new issue.

@nukulb
Copy link
Author

nukulb commented Jun 6, 2014

So far I have only used to write to a new file. And it works great for this use case. Thanks for all the support.

The next thing I am looking at is opening an existing file and then write to that file. Is there an example of this available?

@SheetJSDev
Copy link
Contributor

var wb = XLSX.readFile(file)
// make changes
XLSX.writeFile(wb, 'test.xlsx')

Since they use the same internal format, you can also use https://github.com/SheetJS/js-xls :

var wb = XLS.readFile(file)
// make changes
XLSX.writeFile(wb, 'test.xlsx')    

The test suite has a round trip section just for this: https://github.com/SheetJS/js-xlsx/blob/master/test.js#L630

@SheetJSDev
Copy link
Contributor

Reopening since there is no complete example of editing an existing file in the documentation.

@SheetJSDev SheetJSDev reopened this Jun 6, 2014
@SheetJSDev SheetJSDev changed the title Looking for a write example - JSON to excel Examples and documentation improvements Jun 6, 2014
@elad
Copy link

elad commented Jun 9, 2014

Hello,

Simplifying the read/write interfaces is something I'd like to see as well. I feel the current forms of traversal are too cumbersome.

Any thoughts on what cleaner interfaces should look like?

Are there "common structures" to Excel files the library should support? for example, simple tables?

@SheetJSDev
Copy link
Contributor

@eladxxx I agree that there should be a layer between the raw representation and applications. The interface is a tricky problem for a few reasons:

  1. you've read a file with three cells. A1=1, A2=2, A3=A1+A2. If you make a change to A1 (let's say A1=2 now) do you update the value of A3? You could also just mark the value as dirty.

  2. you delete a cell. How do you update the other cells to be consistent? It's possible (via cross-sheet references) that you would have to look at every sheet. Also, do you shift left or up?

There should be a function interface so that we could easily change the model later (that probably should have been built at the beginning :/ c'est la vie), including:

  • worksheet.get_cell(address) -> single cell object
  • worksheet.get_range(range) -> range object
  • worksheet.set_cell(address, value)
  • worksheet.set_formula(range, formula)
  • worksheet.insert_row(...) and insert_col
  • delete_XXX (cell, range, row, col)
  • set_format(...)

... and a separate function for each fundamental operation.

I'm inclined to keep this in a separate library and make the parsers expect a Workbook object -- there would be a dumb workbook implementation in this and in js-xls and they would use something more intelligent if you included the other script in the browser.

NB: To support opening and editing files, we would probably need a real engine that can parse excel formulae (and, if we are ambitious, evaluate). Since that formula engine is also needed for shared formulae (see #67) that might be done sooner rather than later.

@elad
Copy link

elad commented Jun 9, 2014

You raise a lot of good points. However, what I meant by simple - and it could very well be that I'm looking at this too subjectively - is that we agree on some "common" input format(s) and work with that.

For example, I deal with a lot of files that are simple tables. Some of them have the first row as headers, some don't. Think CSV data imported into Excel. It would be nice if the library had a built-in function to read this type of data so that I could do:

// This is a function I write that receives an array of row data and returns whatever I want.
function parse_row(arr) {
   ...
}

// After this call, rows is an array of whatever I return from parse_row
var rows = xlsx.read_rows(wb, parse_row);

The idea of course isn't to provide read_rows as suggested above but to give something super simple to use in just a couple of lines of code to achieve common goals.

Of course if we can't agree on common goals and/or "common" means formulae-heavy documents and such, then it might be that more thought should be given to how this simplification happens.

That said, the interface functions you suggest seem useful, though I'm not sure they help in iterating over a document in order to parse it for some internal representation.

@SheetJSDev
Copy link
Contributor

@eladxxx As part of this issue from js-xls the JSON output function was overhauled. If you are specifically interested in the data, this should work:

function read_rows(wb, cb) {
  XLSX.utils.sheet_to_json(wb, {header:1}).forEach(cb);
}
var rows = xlsx.read_rows(wb, parse_row);

If need be, sheet_to_json can be modified to optionally return the full object as well as row-level metadata.

@elad
Copy link

elad commented Jun 10, 2014

Is there any documentation for sheet_to_json? Specifically, a way to tell it "there are no headers in the data"? passing no options, or explicitly passing { header: 0 } (or -1) doesn't do the trick...

@SheetJSDev
Copy link
Contributor

@elad this is how i described the header option in the other thread:

  • opts.header = 1 generates an array of arrays (data starts in first row)
  • opts.header = 'A' generates a structure using the column headers as keys (data starts in first row)
  • opts.header = ['header1', 'header2', ...] will use the specified labels (data starts in first row)

(default behavior remains: use the first row as headers)

this clearly needs to be documented properly

@hiwanz
Copy link

hiwanz commented Sep 25, 2014

Is there an API to initiate a workbook now?

@comerc
Copy link

comerc commented Jun 20, 2016

#365

clarketm pushed a commit to clarketm/js-xlsx that referenced this issue Nov 27, 2016
- parse_XFExtGradient stubbed (fixes SheetJS#62, h/t @oakuraape)
- default to UTF16LE (fixes SheetJS#64, h/t @ajuhos)
- ensure that date1904 is propagated to SSF (fixes SheetJS#66, h/t @Martin-Pitt)
- pin CFB version in anticipation of future changes
@blakek
Copy link

blakek commented Dec 7, 2016

It would be great if the documentation listed accepted "type" values for the read() function (and what those options mean). For example: buffer, base64, file, binary, etc.

Are pull requests being looked at? If so, I'd be up for helping out on this.

@SheetJSDev
Copy link
Contributor

We're slowly improving the documentation. The underlying structure is stabilizing, and for most supported features all of the parsers should generate the same fields consistent with the README.

At the moment we're trying to go feature by feature and hammer out documentation while we ensure the main supported parsers (XLS/XLSX/XLSB/SpreadsheetML) produce similar parsed objects. If there is anything unclear in the README please let us know where we can improve.

@blakek if you are interested in contributing documentation, we'd gladly accept a PR! Documentation is the one area that doesn't involve hours of testing :D If you have any demos or projects, please add a link to the wiki page so others can find it!

@SheetJSDev
Copy link
Contributor

We set up a gitbook integration: https://sheetjs.gitbooks.io/docs/

soldemuth pushed a commit to soldemuth/js-xlsx that referenced this issue Nov 23, 2017
Fixed Style Table in README.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants