Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complex Excel Structure -> Cells location jumbled upon saving #259

Closed
sergio-rivas opened this issue Feb 24, 2021 · 13 comments
Closed

Complex Excel Structure -> Cells location jumbled upon saving #259

sergio-rivas opened this issue Feb 24, 2021 · 13 comments

Comments

@sergio-rivas
Copy link

When parsing a complex structure excel file (i.e. lots of merges, etc), then modifying a few specific cells, the output file changes the cell locations of many unrelated cells around the sheet (both row/column position are being shifted for other cells).

Is there anything I should provide to help you be able to replicate the issue?

@zdavatz
Copy link
Owner

zdavatz commented Feb 26, 2021

what software created the original file?

@sergio-rivas
Copy link
Author

sergio-rivas commented Feb 27, 2021

The file was created with Windows version of Microsoft Excel, it's a .xls file too. I think it might have to do with the fact that the file has many use of merged cells.

I tried two simple tests to create an easy replication:

  1. open the file and use spreadsheet to write to new file -> no problem
  2. open the file and use spreadsheet to edit only one cell -> issue replicated

Would I be able to privately send the XLS file so you can confirm the issue?
I don't mind helping out by sponsoring the bugfix too. Let me know how much.

@zdavatz
Copy link
Owner

zdavatz commented Feb 27, 2021

Please try to open and save the file with LibreOffice and then try to edit it again using spreadsheet. Does that work?

@sergio-rivas
Copy link
Author

sergio-rivas commented Mar 1, 2021

After saving w/ LibreOffice, now spreadsheet gem cannot read the excel.
Error:

Spreadsheet::Errors::UnknownCodepage: Unknown Codepage 0xac8d
from .../ruby/gems/2.6.0/gems/spreadsheet-1.2.7/lib/spreadsheet/excel/reader.rb:75:in `block in encoding'

@zdavatz
Copy link
Owner

zdavatz commented Mar 2, 2021

Did you save the file as UTF-8? Which version of LibreOffice and which OS are you using?

@sergio-rivas
Copy link
Author

sergio-rivas commented Mar 2, 2021

LibreOffice 7.1.0.3, Mac OSX.

There is no option to choose UTF-8 when saving as xls. The resulting file can be opened in LibreOffice, but is not readable by spreadsheet gem. Can I email you the original file and you try to open/save with your libre office version, then read it in spreadsheet gem? This way we can eliminate the possibility of environmental factors?

Screen Shot 2021-03-02 at 1 23 08 AM

@zdavatz
Copy link
Owner

zdavatz commented Mar 2, 2021

Please share the Ruby script with which you want to open the file. Also look at this commit here: https://github.com/zdavatz/spreadsheet/blob/master/History.md#111--03012016 where we had a similar error. Maybe you can send a Pull Request. This is the commit: e59b337

@sergio-rivas
Copy link
Author

sergio-rivas commented Mar 13, 2021

I tried digging into the code myself, but was unable to get anywhere. Is there a way to know exactly what encoding LibreOffice is using when saving?

I tried setting the unknown codepage to MACROMAN, but would get invalid byte sequence errors, I tried UTF-8, and didn't get invalid byte sequence errors, but in BIFF8, it kept encountering a lot of errors due to nil values here and there.

@zdavatz
Copy link
Owner

zdavatz commented Mar 13, 2021

For me the question is which language/character uses that "strange" characterset in your XLS?

@sergio-rivas
Copy link
Author

sergio-rivas commented Mar 13, 2021

It's all alphanumeric characters. Language is set to English (US), I changed all items to font "Times New Roman"

I even tried running a macros to enforce all alphanumeric characters only.

REM  *****  BASIC  *****
Function AlphaNumericOnly(strSource As String) As String
    Dim i As Integer
    Dim strResult As String

    For i = 1 To Len(strSource)
        Select Case Asc(Mid(strSource, i, 1))
            Case 32 To 32, 48 To 57, 65 To 90, 97 To 122: 'include 32 if you want to include space
                strResult = strResult & Mid(strSource, i, 1)
        End Select
    Next
    AlphaNumericOnly = strResult
End Function

Function CleanSheet(sheetName As String) As Boolean
	Dim sh As Object, z As Object, c As Object
    Dim qCells As Object, enuCells As Object
    sh = ThisComponent.Sheets.getByName(sheetName)
    z = sh.getCellRangeByName("A1:U300")
    qCells = z.queryContentCells(-1)
    enuCells = qCells.Cells.createEnumeration
    Do While enuCells.hasMoreElements
      c = enuCells.nextElement
      If Left(c.Formula, 1) = "=" Then 
		
      Else
		c.string = AlphaNumericOnly(c.String)
	  End If  
    Loop
    CleanSheet = true
End Function


Sub CleanAll()
	CleanSheet("GRID")
	CleanSheet("RESULTS")
	CleanSheet("PICTURES")
End Sub

@zdavatz
Copy link
Owner

zdavatz commented Mar 13, 2021

Which Windows Version and which MS Excel Version created the orginial file? Can these people try to create the same file using LibreOffice on Windows?

@sergio-rivas
Copy link
Author

After some messing around with different settings, the following combination fixed my issue:

  1. Uncheck "Protect Sheet"
  2. Uncheck "Protect Spreadsheet Structure"

Screen Shot 2021-03-13 at 1 43 52 PM

@zdavatz
Copy link
Owner

zdavatz commented Mar 15, 2021

Ok, good to know, thank you for reporting!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants