file.parseWorksheet(at: path) dies #94

Closed
leuski opened this issue Mar 10, 2020 · 12 comments · Fixed by #120
@leuski

leuski commented Mar 10, 2020

Version
0.9.1

Describe the bug
file.parseWorksheet(at: path) throws CoreXLSXError.archiveEntryNotFound when trying to access a single sheet spreadsheet.

parseWorksheetPaths() returns [ "xl//xl/worksheets/sheet1.xml" ], which looks wrong. I'd guess, because parseWorksheetPaths() does not check that a worksheet.target ("/xl/worksheets/sheet1.xml" in this case) contains a root path and adds the directory prefix ("xl") on top of that.

File for reproduction
Unfortunately, I cannot provide the file in question. Try generating one in the Windows version of Excel.


@MaxDesiatov MaxDesiatov added the more info needed Not enough details available to proceed label Apr 5, 2020
@MaxDesiatov
Collaborator

MaxDesiatov commented Apr 5, 2020

Hi @leuski, many thanks for this bug report, and I'm sorry for the delayed reply. Unfortunately, I'm not sure I'll be able to fix it without a test file, and without one it would be especially hard to add a test case to the test suite. As this seems to be an issue with worksheets, not cell data, I wonder if you could delete all the data from the original file, keep the worksheets, and attach the result? I hope there is no confidential data in the worksheet names, and even if there is, those could be renamed as long as the worksheets themselves are kept as is. I hope that makes sense, looking forward to your reply!

@leuski
Author

leuski commented Apr 7, 2020

Right. Unfortunately, as soon as I re-save the file in Mac Excel 16.35, the issue is gone. I have several Excel files that exhibit this behavior (worksheet.target = "/xl/worksheets/sheet1.xml"), but a few others from the same source have the default content (worksheet.target = "worksheets/sheet1.xml"). I'll try to track down the source of the files to see if I can reproduce this with an empty file.
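
The two target forms mentioned above can be illustrated with a small sketch. This is not the CoreXLSX implementation, only a guess at the kind of check that would avoid the "xl//xl/worksheets/sheet1.xml" result; `resolveWorksheetPath` and its parameter names are hypothetical.

```swift
import Foundation

/// Hypothetical helper, not part of CoreXLSX: turns a relationship target
/// into an archive entry path, given the directory of the workbook part.
func resolveWorksheetPath(target: String, workbookDirectory: String) -> String {
    if target.hasPrefix("/") {
        // Absolute target: it is already rooted at the archive,
        // so only the leading slash is dropped and no prefix is added.
        return String(target.dropFirst())
    }
    // Relative target: resolve it against the workbook directory.
    return workbookDirectory.isEmpty ? target : "\(workbookDirectory)/\(target)"
}

let absolute = resolveWorksheetPath(target: "/xl/worksheets/sheet1.xml", workbookDirectory: "xl")
let relative = resolveWorksheetPath(target: "worksheets/sheet1.xml", workbookDirectory: "xl")
print(absolute, relative)
```

Under these assumptions both forms resolve to the same entry, "xl/worksheets/sheet1.xml", which is the path reported for files that parse correctly.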

@MaxDesiatov
Collaborator

Thank you, much appreciated!

@robgtsoftware

robgtsoftware commented Jul 7, 2020

I have run into the same (or a similar) issue and was curious whether there had been any movement here. At the moment I can't share my files, but I am seeing if I can get permission. I have found that my files open in tools like Numbers or Excel without issue, and as soon as they are saved by those tools, the CoreXLSX library can open them just fine. However, before that save, parsing the workbook fails:

for wbk in try file.parseWorkbooks() {

For the resaved version, the worksheet paths are ["xl/worksheets/sheet1.xml"].
For the problematic file, the worksheet paths are the same, ["xl/worksheets/sheet1.xml"] (so this is probably not the same issue), and the error looks like this:

▿ DecodingError
  ▿ valueNotFound : 2 elements
    - .0 : CoreXLSX.Workbook.Views
    ▿ .1 : Context
      ▿ codingPath : 1 element
        - 0 : CodingKeys(stringValue: "bookViews", intValue: nil)
      - debugDescription : "Expected Views value but found null instead."
      - underlyingError : nil

If you try to explicitly parse the worksheet at the path:

let paths = try file.parseWorksheetPaths()
try file.parseWorksheet(at: paths.first!)

You get this error:

▿ DecodingError
  ▿ keyNotFound : 2 elements
    - .0 : CodingKeys(stringValue: "count", intValue: nil)
    ▿ .1 : Context
      ▿ codingPath : 2 elements
        - 0 : CodingKeys(stringValue: "mergeCells", intValue: nil)
        - 1 : CodingKeys(stringValue: "count", intValue: nil)
      - debugDescription : "No attribute or element found for key CodingKeys(stringValue: \"count\", intValue: nil) (\"count\")."
      - underlyingError : nil

I have also found that on the problematic file, you can successfully parse the SharedStrings, so it is just something in the structure of the workbook.

Happy to refile this as a new issue if you prefer, and to provide anything else I can in terms of information. Hopefully, I can figure out a way to share the actual files. If I can get permission, would there be a way to share them just with you, without posting them here?

@MaxDesiatov
Collaborator

Thanks for the detailed report @robgtsoftware, #119 may fix this, but it's going to be hard to verify on my side without having access to the file. Feel free to send it to hello@corexlsx.org. Otherwise, could you try with the optional-mergecells-count branch I used in that PR?

MaxDesiatov added a commit that referenced this issue Jul 7, 2020
Attempt to fix some of the issues reported in #94.
@MaxDesiatov
Collaborator

That PR is now merged, so please test with the main branch if you don't mind.

@robgtsoftware

robgtsoftware commented Jul 7, 2020

Thanks for the quick turnaround. The PR fixes things such that this now works:

let paths = try file.parseWorksheetPaths()
let worksheet = try file.parseWorksheet(at: paths.first!)
let sharedStrings = try file.parseSharedStrings()
for row in worksheet.data?.rows ?? [] {
    for c in row.cells {
        print(c.stringValue(sharedStrings))
    }
}

However, the following still throws the same error as before (the bookViews one):

try file.parseWorkbooks()

I have been using this via Carthage, but I will try to get the source building so I can help track down where the issue lies. Thanks for the help so far, appreciate it.

UPDATE

So the error is occurring in the XMLCoder framework. If I put a breakpoint on Swift errors, here is the stack trace:
[image: stack trace screenshot]

The path of keys being decoded is as follows: views -> items -> xWindow

On this last key, we hit this block of code in XMLKeyedDecodingContainer.swift, and since elements is an empty array, it throws an error (see below):

case .elementOrAttribute:
    guard
        let anyBox = elements.isEmpty ? attributes.first : elements as Box?
    else {
        throw DecodingError.keyNotFound(key, DecodingError.Context(
            codingPath: decoder.codingPath,
            debugDescription:
            """
            No attribute or element found for key \
            \(_errorDescription(of: key)).
            """
        ))
    }

Below is the series of errors that are thrown:

▿ DecodingError
  ▿ keyNotFound : 2 elements
    - .0 : CodingKeys(stringValue: "xWindow", intValue: nil)
    ▿ .1 : Context
      ▿ codingPath : 5 elements
        - 0 : CodingKeys(stringValue: "bookViews", intValue: nil)
        - 1 : CodingKeys(stringValue: "workbookView", intValue: nil)
        ▿ 2 : XMLKey(stringValue: "0", intValue: 0)
          - stringValue : "0"
          ▿ intValue : Optional<Int>
            - some : 0
        ▿ 3 : XMLKey(stringValue: "0", intValue: 0)
          - stringValue : "0"
          ▿ intValue : Optional<Int>
            - some : 0
        - 4 : CodingKeys(stringValue: "xWindow", intValue: nil)
      - debugDescription : "No attribute or element found for key CodingKeys(stringValue: \"xWindow\", intValue: nil) (\"xWindow\")."
      - underlyingError : nil

▿ DecodingError
  ▿ valueNotFound : 2 elements
    - .0 : CoreXLSX.Workbook.View
    ▿ .1 : Context
      ▿ codingPath : 5 elements
        - 0 : CodingKeys(stringValue: "bookViews", intValue: nil)
        - 1 : CodingKeys(stringValue: "workbookView", intValue: nil)
        ▿ 2 : XMLKey(stringValue: "0", intValue: 0)
          - stringValue : "0"
          ▿ intValue : Optional<Int>
            - some : 0
        ▿ 3 : XMLKey(stringValue: "0", intValue: 0)
          - stringValue : "0"
          ▿ intValue : Optional<Int>
            - some : 0
        ▿ 4 : XMLKey(stringValue: "0", intValue: 0)
          - stringValue : "0"
          ▿ intValue : Optional<Int>
            - some : 0
      - debugDescription : "Expected View but found null instead."
      - underlyingError : nil

▿ DecodingError
  ▿ valueNotFound : 2 elements
    - .0 : Swift.Array<CoreXLSX.Workbook.View>
    ▿ .1 : Context
      ▿ codingPath : 2 elements
        - 0 : CodingKeys(stringValue: "bookViews", intValue: nil)
        - 1 : CodingKeys(stringValue: "workbookView", intValue: nil)
      - debugDescription : "Expected Array<View> value but found null instead."
      - underlyingError : nil

▿ DecodingError
  ▿ valueNotFound : 2 elements
    - .0 : CoreXLSX.Workbook.Views
    ▿ .1 : Context
      ▿ codingPath : 1 element
        - 0 : CodingKeys(stringValue: "bookViews", intValue: nil)
      - debugDescription : "Expected Views value but found null instead."
      - underlyingError : nil

Not really knowing what to expect here, I just wanted to pass on everything I saw to see if that helps.
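
For anyone collecting similar diagnostics, the dumps above can be condensed into one line per error. The sketch below uses only Foundation; `describe` is a hypothetical helper name, not something CoreXLSX or XMLCoder provides.

```swift
import Foundation

/// Hypothetical helper: renders a DecodingError as
/// "<kind> at [<coding path>]: <debug description>".
func describe(_ error: DecodingError) -> String {
    // Joins the coding path the same way it is written in this thread.
    func path(_ context: DecodingError.Context) -> String {
        context.codingPath.map { $0.stringValue }.joined(separator: " -> ")
    }
    switch error {
    case let .keyNotFound(key, context):
        return "keyNotFound(\(key.stringValue)) at [\(path(context))]: \(context.debugDescription)"
    case let .valueNotFound(type, context):
        return "valueNotFound(\(type)) at [\(path(context))]: \(context.debugDescription)"
    case let .typeMismatch(type, context):
        return "typeMismatch(\(type)) at [\(path(context))]: \(context.debugDescription)"
    case let .dataCorrupted(context):
        return "dataCorrupted at [\(path(context))]: \(context.debugDescription)"
    @unknown default:
        return "unknown decoding error: \(error)"
    }
}
```

It could be used by wrapping a call such as try file.parseWorkbooks() in a do block and handling `catch let error as DecodingError` with `print(describe(error))`.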

@MaxDesiatov
Collaborator

Thank you, this is super helpful! I'll have a look now.

MaxDesiatov added a commit that referenced this issue Jul 7, 2020
Another attempt to resolve #94
@MaxDesiatov
Collaborator

I've made yet another stab in the dark in #120, the branch is view-optional-properties. @robgtsoftware if you could test with this one that would be amazing! Thanks.
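
The branch name suggests that properties of the view model were made optional; the actual diff is not shown in this thread. As a rough illustration of why that kind of change avoids a keyNotFound error, here is a sketch that uses Foundation's JSONDecoder as a stand-in for XMLCoder. `StrictView`, `LenientView`, and `canDecode` are hypothetical names, not CoreXLSX types.

```swift
import Foundation

// Hypothetical models, not CoreXLSX.Workbook.View.
struct StrictView: Decodable {
    let xWindow: Int      // synthesized decoding requires the key to be present
}

struct LenientView: Decodable {
    let xWindow: Int?     // synthesized decoding uses decodeIfPresent, so a missing key becomes nil
}

/// Returns true when `json` decodes into `type` without throwing.
func canDecode<T: Decodable>(_ type: T.Type, from json: String) -> Bool {
    (try? JSONDecoder().decode(type, from: Data(json.utf8))) != nil
}

print(canDecode(StrictView.self, from: "{}"))   // the required key is missing
print(canDecode(LenientView.self, from: "{}"))  // the optional key may be absent
```

The trade-off is that callers then have to handle nil for values that Excel normally writes but other producers may omit.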

@robgtsoftware

Your stab in the dark was spot on: try file.parseWorkbooks() now works as it should for my file. Thanks so much.

MaxDesiatov added a commit that referenced this issue Jul 7, 2020
@MaxDesiatov
Collaborator

MaxDesiatov commented Jul 7, 2020

Splendid! The fix will be included in the next 0.13.0 version of the library 👍

@MaxDesiatov
Collaborator

MaxDesiatov commented Jul 7, 2020

Thanks again for providing very detailed error messages, these were crucial for coming up with the fix!
