Stream does not support seeking. #17
Comments
Because of the way Excel binary files are organised, the data is not necessarily in sequential order; this is why the stream has to be seekable. It would be good, however, to remove the need for a seekable stream, as it has affected a few people (for example, those processing HTTP streams, who have to save to a file first). So I am open to a discussion on how to achieve this.
I thought that data in Excel files was organized sequentially. For example, if I unzip an .xlsx file and look at an individual XML worksheet, it is clearly structured. If there are strings, those are stored in sharedStrings.xml and aren't necessarily in order, but they can be looked up, and the data can still be retrieved sequentially from the original worksheet.
Yes, but that is xlsx. The binary format is not stored sequentially; in fact, it's quite an (overly) complicated file format.
XLS compound document streams are stored in 512-byte sectors, which can be physically located anywhere in the file. The order of the sectors is dictated by the compound document FAT tables, which themselves are not always stored sequentially (slightly simplified :) ). This makes it really hard to parse anything but specially crafted XLS files as forward-only streams without loading everything into memory.
XLSX is essentially a .zip file with a file system inside; there is no guarantee that the compressed files are organized in the order ExcelDataReader expects. Additionally, System.IO.Compression.ZipArchive does not support forward-only unzipping well. Likewise, this makes it really hard to parse anything but specially crafted XLSX files as forward-only streams.
For these reasons, I think supporting forward-only streams is not within the scope of ExcelDataReader. ExcelDataReader should continue to focus on a low memory footprint and thus require seekable streams as input. (It might be worth considering an option to support forward-only streams by loading everything into memory, but that was explicitly not requested here.)
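For readers who can afford the memory, the "load everything into memory" option mentioned above can be approximated on the caller's side. The following is only a rough sketch: `SeekableStreams` and `EnsureSeekable` are hypothetical names, not part of ExcelDataReader, and the approach does not help with the large-file constraint from the original request.

```csharp
using System;
using System.IO;

public static class SeekableStreams
{
    // Hypothetical helper (not an ExcelDataReader API): returns the input
    // unchanged if it is already seekable; otherwise copies it into memory.
    public static Stream EnsureSeekable(Stream input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (input.CanSeek) return input;

        // Assumption: the whole payload fits in memory.
        var buffer = new MemoryStream();
        input.CopyTo(buffer);   // reads the source stream to its end
        buffer.Position = 0;    // rewind so a reader starts at the first byte
        return buffer;
    }
}
```

With such a helper, the call from the original report would presumably become `ExcelReaderFactory.CreateBinaryReader(SeekableStreams.EnsureSeekable(process.StandardOutput.BaseStream))`. The data never touches disk, but memory use grows with the size of the file.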
https://exceldatareader.codeplex.com/workitem/11988
"I am using a shell that calls a program that decrypts a file to StandardOutput.
Process process = new Process();
//... setup to run decrypter
process.Start();
IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(process.StandardOutput.BaseStream);
Can there be a constructor that will handle non-seekable (StreamReader) Streams as well?
My constraints...
I don't want to write the sensitive data to a file, and the solution must handle large files (reading into memory first will not work)."