Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid cell coordinate 1 during IOFactory::load #2501

Closed
petruchek opened this issue Jan 14, 2022 · 1 comment · Fixed by #2504
Closed

Invalid cell coordinate 1 during IOFactory::load #2501

petruchek opened this issue Jan 14, 2022 · 1 comment · Fixed by #2504

Comments

@petruchek
Copy link

This is:

- [X] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

$spreadsheet object successfully created

What is the current behavior?

Error: PhpOffice\PhpSpreadsheet\Exception: Invalid cell coordinate 1

What are the steps to reproduce?

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

<?php

require __DIR__ . '/vendor/autoload.php';

$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load("./example.xlsx");

Which versions of PhpSpreadsheet and PHP are affected?

Latest (1.21.0) is still affected, tested on PHP 7.4.

File to reproduce

I'm attaching the file - I don't know how it was generated, but I can open it with Excel 2013 without any warnings. When I re-save the file without applying any changes, the problem disappears (but the file is obviously not the same one - even file size differs). It looks like the original file some weird structure that PhpSpreadsheet is unable to handle correctly.
example.xlsx

@oleibman
Copy link
Collaborator

Thank you for supplying your example file. It has a merge range stored as 1:1, i.e. the entire first row is merged as cell A1. PhpSpreadsheet is not expecting this. When you resave the file with Excel, the merge range is now stored as A1:XFD1, and PhpSpreadsheet does understand that.

This is a relatively easy oversight to fix. But, in doing so, it appears that the logic for setting up the merge range within PhpSpreadsheet is very costly in terms of memory and speed, a problem which isn't really exposed until you deal with large ranges like that. So, the fix, which I am working on, will be a little more complicated than merely fixing the reported problem.

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Jan 17, 2022
Fix PHPOffice#2501. Merge range can be supplied as entire rows or columns, e.g. `1:1` or `A:C`. PhpSpreadsheet is expecting a row and a column to be specified for both parts of the range, and fails when the unexpected format shows up.

The code to clear cells within the merge range is very inefficient in terms of both memory and time, especially when the range is large (e.g. for an entire row or column). More efficient code is substituted. It is possible that we can get even more efficient by deleting the cleared cells rather than setting them to null. However, that needs more research, and there is no reason to delay this fix while I am researching.

When Xlsx Writer encounters a null cell, it writes it to the output file. For cell merges (especially involving whole rows or columns), this results in a lot of useless output. It is changed to skip the output of null cells when (a) the cell style matches its row's style, or (b) the row style is not specified and the cell style matches its column's style.
oleibman added a commit that referenced this issue Jan 23, 2022
* Xlsx Reader Merge Range For Entire Column(s) or Row(s)

Fix #2501. Merge range can be supplied as entire rows or columns, e.g. `1:1` or `A:C`. PhpSpreadsheet is expecting a row and a column to be specified for both parts of the range, and fails when the unexpected format shows up.

The code to clear cells within the merge range is very inefficient in terms of both memory and time, especially when the range is large (e.g. for an entire row or column). More efficient code is substituted. It is possible that we can get even more efficient by deleting the cleared cells rather than setting them to null. However, that needs more research, and there is no reason to delay this fix while I am researching.

When Xlsx Writer encounters a null cell, it writes it to the output file. For cell merges (especially involving whole rows or columns), this results in a lot of useless output. It is changed to skip the output of null cells when (a) the cell style matches its row's style, or (b) the row style is not specified and the cell style matches its column's style.

* Scrutinizer

See if these changes appease it.

* Improved CellIterators

Finally figured out how to improve efficiency here, meaning that there is no longer a reason to change Writer/Xlsx, so restore that.

* No Change for CellIterator

I had thought a change was needed for CellIterator, but it isn't.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

2 participants