New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

free() on invalid pointer #163

Closed
wrathematics opened this Issue Mar 24, 2016 · 2 comments

Comments

Projects
None yet
3 participants
@wrathematics

wrathematics commented Mar 24, 2016

Trying to read this file with readxl produces the following:

> readxl::read_excel("exp.xlsx")
*** Error in `/usr/lib/R/bin/exec/R': free(): invalid pointer: 0x00000000028ccaa8 ***
Aborted

The valgrind report doesn't look very useful to me, but I haven't taken a close look at the codebase.

The data is apparently just some random exponential, but was generated by and exported from JMP (it's for a homework problem, so blame my professor, not me). The exported file might be mal-formed somehow, but maybe not. As I mentioned on twitter, the clang static analyzer scan-build (which I use to compile everything) throws several significant warnings when building readxl, though they appear to be coming from libxls.

@nacnudus

This comment has been minimized.

Contributor

nacnudus commented Jun 14, 2016

The cells don't have the 'r' attribute that gives the cell address. At the moment, readxl depends on that address, and doesn't guess by the cell's position in the xml.

@jennybc

This comment has been minimized.

Member

jennybc commented Jan 20, 2017

@nacnudus Agree.

https://github.com/hadley/readxl/blob/55760cae7dfe9625cab3b5fcecc1893555cea574/src/XlsxCell.h#L40

Note to self: indicative xml in @wrathematics 's example:

<worksheet xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<dimension ref="A1:B31"/>
<sheetViews>...</sheetViews>
<sheetFormatPr defaultRowHeight="12.75"/>
<sheetData>
<row>
<c t="s">
<v>0</v>
</c>
<c t="s">
<v>1</v>
</c>
</row>
MORE ROWS
</sheetData>
<pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
</worksheet>

More typical and what readxl implicitly expects:

<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mx="http://schemas.microsoft.com/office/mac/excel/2008/main" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:mv="urn:schemas-microsoft-com:mac:vml" xmlns:x14="http://schemas.microsoft.com/office/spreadsheetml/2009/9/main" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:xm="http://schemas.microsoft.com/office/excel/2006/main">
<sheetViews>
<sheetView workbookViewId="0"/>
</sheetViews>
<sheetFormatPr customHeight="1" defaultColWidth="14.43" defaultRowHeight="15.75"/>
<sheetData>
<row r="1">
<c r="A1" s="1" t="s">
<v>0</v>
</c>
<c r="B1" s="1" t="s">
<v>1</v>
</c>
<c r="C1" s="1" t="s">
<v>2</v>
</c>
<c r="D1" s="1" t="s">
<v>3</v>
</c>
<c r="E1" s="1" t="s">
<v>4</v>
</c>
<c r="F1" s="1" t="s">
<v>5</v>
</c>
</row>
MORE ROWS
</sheetData>
<drawing r:id="rId1"/>
</worksheet>

nacnudus added a commit to nacnudus/tidyxl that referenced this issue Jan 22, 2017

Test for graceful JMP import failure
File from tidyverse/readxl#163 (comment)
Lacks 'r' (ref/address) attribute of cells

jennybc added a commit to jennybc/excelgesis that referenced this issue Jan 26, 2017

@jennybc jennybc closed this in #240 Jan 31, 2017

jennybc added a commit that referenced this issue Jan 31, 2017

Load cells at xlsx worksheet ingest; handle skipping and/or blank row… (
#240)

* Load cells at xlsx worksheet ingest; handle skipping and/or blank rows; fixes #224

* Actually these *should* be the same

* Make skipping tests more challenging (blank row btwn col names and data, plus another embedded blank row)

* Simplify return of 0x0 tibble for completely empty worksheet

* Mark cells to start reading from at the time of worksheet construction

* Remove vestigial, internal cell printing function

* Be quiet about empty worksheet

* More tests of nothingness

* Inform worksheet about its own name

* Tighten up the NEWs bullet for this PR

* Make some accessor member functions const

* Use skip() to issue note-to-future-self

Expedient place to park a summary of what this entire PR does.

* Improved parsing of sheet geometry for xlsx. (#240, @jennybc).

    - Better handling of leading and embedded blank rows and explicit row skipping. (#224, #194, #178, #156, #101)
    - Worksheets that are completely empty or that contain only column names no longer error, but return a tibble with zero rows. (#222, #144, #65)
    - Location is inferred for cells that do not declare their location (e.g. xlsx written by JMP). (#163, #102)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment