Error gets thrown when Xlsx zip uses Windows directory separator #3472

Closed
1 of 8 tasks
cameronjohnson-mz opened this issue Mar 21, 2023 · 5 comments
Labels: reader/xlsx (Reader for MS OfficeOpenXML-format (xlsx) spreadsheet files)

@cameronjohnson-mz

This is:

- [x] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

I should be able to parse cells with line breaks.

What is the current behavior?

The package throws an exception because there's a row with multiple cells containing line breaks. When I remove the row with line breaks, it starts working again.

[2023-03-20 19:19:18] local.ERROR: Could not find zip member zip:///tmp/phpANFbmg#_rels/.rels {"userId":3,"exception":"[object] (PhpOffice\PhpSpreadsheet\Reader\Exception(code: 0): Could not find zip member zip:///tmp/phpANFbmg#_rels/.rels at /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Shared/File.php:159)
[stacktrace]
#0 /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php(408): PhpOffice\PhpSpreadsheet\Shared\File::assertFile()
#1 /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/BaseReader.php(166): PhpOffice\PhpSpreadsheet\Reader\Xlsx->loadSpreadsheetFromFile()
#2 /var/www/html/app/Utils/Util.php(382): PhpOffice\PhpSpreadsheet\Reader\BaseReader->load()

What are the steps to reproduce?

Try uploading the attached XLSX file. I don't believe there's a problem with the file or the code; the package just can't parse the cell with line breaks.

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

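<?php

require __DIR__ . '/vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\IOFactory;

// Path to the sample file (assumed here to be the attached test.xlsx)
$file = __DIR__ . '/test.xlsx';

// Create an Xlsx reader and load the file into a Spreadsheet object
$reader = IOFactory::createReader('Xlsx');
$spreadsheet = $reader->load($file);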

test.xlsx

If this is an issue with reading a specific spreadsheet file, then it may be appropriate to provide a sample file that demonstrates the problem; but please keep it as small as possible, and sanitize any confidential information before uploading.

What features do you think are causing the issue

  • Reader
  • Writer
  • Styles
  • Data Validations
  • Formula Calculations
  • Charts
  • AutoFilter
  • Form Elements

Does an issue affect all spreadsheet file formats? If not, which formats are affected?

Which versions of PhpSpreadsheet and PHP are affected?

PHP 8.1.16
phpoffice/phpspreadsheet * 1.25.2

@MarkBaker
Member

There should be no problems with cells containing line breaks; a line break is just a normal character.

But your sample file loads without any issues:

$ php testing/readerTest.php
File LineBreaks.xlsx Identified as Xlsx
Loading File LineBreaks.xlsx using Xlsx Reader

Call time to load spreadsheet file was 0.0360 seconds
Max Row: 1000, Max Column: Z
Max Data Row: 3, Max Data Column: U
 Current memory usage: 8192 KB
    Peak memory usage: 8192 KB

@cameronjohnson-mz
Author

cameronjohnson-mz commented Mar 23, 2023

@MarkBaker that's odd, it worked for me too. Here's the original XLSX, which fails to load; I think that's because it's not a valid XLSX file.
ffane_email_jpn.xlsx

@MarkBaker
Member

It doesn't appear to have anything to do with line breaks.

And if Excel can read it, then it's technically valid; although in this case, highly unusual.

An Xlsx file is a zip file containing a collection of XML files; and normally it contains a relationship file _rels/.rels for each main component file, identifying all its related files. If I unzip the Xlsx, then I can see those relationship files where I'd expect; but for some reason the Zip Reader can't access them.

However, a closer look revealed the reason why PhpSpreadsheet can't read them.
A standard xlsx file uses the unix / as a folder separator inside the zip archive (no matter how or where it was created); this file is using the Windows \ instead. So the relationship file isn't _rels/.rels, it's _rels\.rels; the workbook file isn't xl/workbook.xml, it's xl\workbook.xml; etc.
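To illustrate the mismatch, here is a minimal sketch of a separator-tolerant lookup. It is not the fix that went into PhpSpreadsheet; `listEntryNames` and `resolveMemberName` are hypothetical helper names invented for this example, and the sample entry names are assumptions modelled on the file described above. The only library calls used are `ZipArchive::getNameIndex()` and the `numFiles` property from PHP's zip extension.

```php
<?php

/**
 * Collect entry names exactly as they are stored in the archive.
 * Requires PHP's zip extension; not called in the demo below.
 */
function listEntryNames(ZipArchive $zip): array
{
    $names = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        if ($name !== false) {
            $names[] = $name;
        }
    }
    return $names;
}

/**
 * Hypothetical helper: return the stored entry name that matches $wanted,
 * treating "\" and "/" as the same separator, or null if none matches.
 */
function resolveMemberName(array $entryNames, string $wanted): ?string
{
    $target = str_replace('\\', '/', $wanted);
    foreach ($entryNames as $stored) {
        if (str_replace('\\', '/', $stored) === $target) {
            return $stored;
        }
    }
    return null;
}

// Entry names as they might be stored in a file that uses "\" (assumed sample).
$names = ['[Content_Types].xml', '_rels\\.rels', 'xl\\workbook.xml'];

// Looking up the conventional name finds the backslash-separated entry.
echo resolveMemberName($names, '_rels/.rels'), PHP_EOL; // prints _rels\.rels
```

With an opened `ZipArchive`, the result of `resolveMemberName(listEntryNames($zip), '_rels/.rels')` could then be passed to `$zip->getFromName()` to read the entry under whichever name it was actually stored.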

@MarkBaker MarkBaker self-assigned this Mar 24, 2023
@MarkBaker MarkBaker added the reader/xlsx Reader for MS OfficeOpenXML-format (xlsx) spreadsheet files label Mar 24, 2023
@MarkBaker
Member

I have a potential fix for this, but it's going to need a lot of testing to ensure that I don't break anything for normal Xlsx files.

@MarkBaker MarkBaker changed the title from "Error gets thrown when some cells contain line-breaks" to "Error gets thrown when Xlsx zip uses Windows directory separator" Mar 24, 2023
@oleibman
Collaborator

Closing, issue appears to have been resolved five months ago.
