Reading multiple files #97

surtamikalai · 2018-11-29T08:59:58Z

This is not an issue, but I don't know where the best place is to discuss project-related questions. Do you plan to add a feature for reading multiple xlsx files in a directory, rather than only one at a time? Spark supports this for .csv files by specifying wildcards (glob patterns) in the path. You can then use the "input_file_name()" function, which returns a column with the source filename of every record.
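To make the requested behavior concrete, here is a minimal sketch in plain Python (standard library only, not Spark and not spark-excel) of reading every file that matches a wildcard and tagging each record with its source file. The helper names `read_with_source` and `demo`, and the sample file contents, are hypothetical and exist only for illustration. The extra column is named after Spark's `input_file_name()` to mirror the idea, but nothing here calls Spark.

```python
import csv
import glob
import os
import tempfile


def read_with_source(pattern):
    """Yield each CSV row as a dict, plus the path of the file it came from."""
    # sorted() gives a stable file order; glob.glob itself makes no promise.
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                # Hypothetical column name, chosen to echo Spark's function.
                row["input_file_name"] = path
                yield row


def demo():
    # Build two small, made-up CSV files in a temporary directory.
    with tempfile.TemporaryDirectory() as folder:
        for name, value in [("a.csv", "1"), ("b.csv", "2")]:
            with open(os.path.join(folder, name), "w", newline="") as handle:
                handle.write("id\n" + value + "\n")
        # Read both files through one wildcard and keep (id, file name) pairs.
        return [
            (row["id"], os.path.basename(row["input_file_name"]))
            for row in read_with_source(os.path.join(folder, "*.csv"))
        ]
```

Calling `demo()` returns one `(id, file name)` pair per record, so rows from different files stay distinguishable after they are combined. That is the property the question asks for with xlsx input.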

nightscape · 2018-11-29T10:04:32Z

Duplicate of #74

* register data source for .format("excel")
* ignore .vscode
* V2 with new Spark Data Source API, uses FileDataSourceV2
* set header default to true, got 1st test passed
* ExcelHelper become options awareness
* handle string type for error-formula
* PlainNumberReadSuite is good now. Also fixed the issue in #285. This introduces a breaking change (good, I think)
* test-case for issue_285
* Handling Error Cells and Undefined Rows
* Test cases for #52 #74 #97 issues
* format & test cases for column pruning (projection)
* Added more test-cases for numerical types
* Stricter numerical types (Integer, Long and Double) in schema inferring. Issue #162
* preparing for final push on writing
* Apply format & Writing is working
* Added excel-row-number column for issues #40 #59 #115 and refactoring
* refactoring unit-tests
* preparing for MR
* Update all test-cases with ScalaTest 3.x
* Writing aware about dataAddress
* writing with dataAddress; No change on dependencies nor build script
* Schema Infering Improvement: {Iterator instead of Seq; Use both samplingRatio and excerptSize}
* added more recent spark version to CI/CD
* support from spark 2.4.1 up
* Fix scalastyle check & enable non-ascii character due to native of unit-tests
* Update src/main/2.4/scala/com/crealytics/spark/v2/excel/ExcelDataSource.scala
  Co-authored-by: Martin Mauch <martin.mauch@gmail.com>
* Update src/main/2.4/scala/com/crealytics/spark/v2/excel/ExcelDataSource.scala
  Co-authored-by: Martin Mauch <martin.mauch@gmail.com>
* spark-excel examples in Jupyter Notebook
  Co-authored-by: Martin Mauch <martin.mauch@gmail.com>

nightscape marked this as a duplicate of #74 Nov 29, 2018

nightscape closed this as completed Nov 29, 2018

quanghgx mentioned this issue May 2, 2021

Co-maintainers wanted #191

Open

quanghgx added a commit to quanghgx/spark-excel that referenced this issue Jun 20, 2021

Test cases for crealytics#52 crealytics#74 crealytics#97 issues

4d17c34

quanghgx mentioned this issue Jun 27, 2021

#210 File format v2 #389

Merged

quanghgx added a commit to quanghgx/spark-excel that referenced this issue Aug 12, 2021

Test cases for crealytics#52 crealytics#74 crealytics#97 issues

6e71b76
