Add sheetNames to WorkbookReader #196
Lgtm 😄 |
Seems to work okay for me, just waiting for an official merge and bump in version number. |
@qfjp forgot to ask that before merging: Could you add a little documentation/example to |
Just requested to merge my additions in the readme and changelog. Let me know if it needs any changes. |
Released as |
No problem, thanks for having such a quick turn-around |
@qfjp, Could you share the pyspark syntax to return the sheet names? |
Hi, is there a way to do the same (get a list of sheetNames) in Python? |
Hi @E-HO, you would probably need to use an approach similar to this (on a phone, so can't test):
reader = spark._jvm.com.crealytics.spark.excel.WorkbookReader(
    {"path": "Worktime.xlsx"},
    spark.sparkContext.hadoopConfiguration
)
sheetnames = reader.sheetNames()
Alternatively, you could try reading the Excel file with a Python-based Excel reader to get the sheet names and use spark-excel to read the contents. |
@Fingolfin123 a quick Google search yielded this: |
Hello! I'm trying to use the WorkbookReader from Python to get the sheet names of an Excel file, so that I can programmatically read each sheet into a DataFrame.
Below is a snippet of the code.
Thanks in advance! |
Ah, a Python dict does not get converted into a Scala Map, but a Java one... |
That's unfortunate; thanks for the clarification! |
Added a PR for this - @nightscape can you verify? |
Merged 👍 |
I am still getting
May I know whether the merge is available in |
Hmm, it should definitely be in 0.18.5: https://github.com/crealytics/spark-excel/blob/main/src/main/scala/com/crealytics/spark/excel/WorkbookReader.scala#L58 |
Hello @nightscape, I am using https://repo1.maven.org/maven2/com/crealytics/spark-excel_2.12/3.3.1_0.18.5/spark-excel_2.12-3.3.1_0.18.5.jar. Has anyone else faced a similar issue? |
Has anyone tested this? |
Hi, I am trying to use the WorkbookReader to dynamically obtain multiple sheet names from the same Excel file, but ran into this error:
An error occurred while calling None.com.crealytics.spark.excel.WorkbookReader. Trace:
py4j.Py4JException: Constructor com.crealytics.spark.excel.WorkbookReader([class java.util.HashMap, class org.apache.hadoop.conf.Configuration]) does not exist
	at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:179)
	at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:196)
	at py4j.Gateway.invoke(Gateway.java:237)
	at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
	at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
	at java.lang.Thread.run(Thread.java:750)
Any suggestion on how to resolve this? |
Ah, I think I know what the issue is. The method you need to call is actually not a constructor, but a static method. You can also check the required signature like this:
javap jar:file:///path/to/downloaded/spark-excel_2.12-3.3.1_0.18.5.jar!/com/crealytics/spark/excel/WorkbookReader.class |
Thanks nightscape! The static method works! |
Great! Would you mind creating a PR to enhance the documentation? |
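For anyone landing here later, the static-method approach confirmed above might look roughly like the sketch below in PySpark. It has not been run against a real cluster and rests on assumptions. The thread never names the static method, so `apply` (the usual forwarder for a Scala companion object) is a guess that should be checked with the `javap` command above. `list_sheet_names` is a hypothetical helper name. The Hadoop configuration is assumed to be reachable through `spark.sparkContext._jsc.hadoopConfiguration()`. Since `sheetNames()` appears to return a Scala `Seq`, which py4j does not turn into a Python list, the sketch copies it element by element via `size()` and `apply(i)`.

```python
def list_sheet_names(spark, path):
    """Return the sheet names of the workbook at `path` as a Python list.

    Hypothetical helper, not part of spark-excel itself.
    """
    jvm = spark._jvm
    # Assumption: the static method is the companion object's `apply`, taking a
    # java.util.HashMap (py4j builds one from the dict, as the error trace in
    # this thread suggests) and a Hadoop Configuration.
    reader = jvm.com.crealytics.spark.excel.WorkbookReader.apply(
        {"path": path},
        spark.sparkContext._jsc.hadoopConfiguration(),
    )
    # Assumption: sheetNames() yields a Scala Seq proxy, so read it by index.
    names = reader.sheetNames()
    return [names.apply(i) for i in range(names.size())]
```

With a running session and the spark-excel jar on the classpath, calling `list_sheet_names(spark, "Worktime.xlsx")` should then give a plain list that can be looped over to read each sheet separately.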
Targeting #42