-
Thanks for the message. You read the file in simple mode, which requires that all sheets have the same number of columns and the same column types, because Spark expects the data in tabular format. https://github.com/ZuInnoTe/hadoopoffice/blob/main/examples/scala-spark3-excel-in-ds/src/main/scala/org/zuinnote/spark/office/example/excel/SparkScalaExcelInDataSource.scala shows how you can process such files without simple mode. An alternative would be to ensure that your Excel file has the same columns in all sheets.
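A minimal, untested sketch of the non-simple-mode approach along the lines of the linked example: without `read.spark.simpleMode`, the datasource returns generic cell rows rather than one flat table, and you can split them by sheet yourself. The `rows` column layout (an array of cell structs with `formattedValue`, `address`, `sheetName` fields), the file path, and the app name are assumptions here, so check them against your version of HadoopOffice:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode}

val spark = SparkSession.builder().appName("excel-per-sheet").getOrCreate()

// Read WITHOUT simpleMode: each row holds an array of generic cell structs,
// so sheets with different layouts are not forced into one tabular schema.
val df = spark.read
  .format("org.zuinnote.spark.office.excel")
  .option("read.locale.bcp47", "en")
  .load("/path/to/file.xlsx") // hypothetical path

// Flatten the cell arrays and keep the fields needed to regroup by sheet.
val cells = df
  .select(explode(col("rows")).as("cell"))
  .select(col("cell.sheetName"), col("cell.address"), col("cell.formattedValue"))

// One DataFrame per sheet, keyed by sheet name.
val sheetNames = cells.select("sheetName").distinct().collect().map(_.getString(0))
val perSheet = sheetNames.map(name => name -> cells.filter(col("sheetName") === name)).toMap
```

Each entry of `perSheet` can then be processed with a schema appropriate to that sheet, instead of relying on simple mode to unify all three sheets.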
-
Here we are using `format("org.zuinnote.spark.office.excel").option("hadoopoffice.read.header.read", "true").option("read.locale.bcp47", "en").option("read.spark.simpleMode", true).load(path)`.
We need to read data from multiple sheets of an Excel file. With the approach above, the result only contains part of the information from the sheets, merged together incorrectly.
The .xlsx file we are reading has three sheets. The result contains the cell information of the first sheet, but the cell information of the other two sheets is clubbed together.