[SEDONA-50] Removing logging configuration as it causes errors on databricks. #530

Merged
merged 1 commit into from Jun 13, 2021

skorski
Contributor

@skorski skorski commented Jun 8, 2021

Is this PR related to a proposed Issue?

https://issues.apache.org/jira/browse/SEDONA-50

What changes were proposed in this PR?

Importing this file causes logging errors within Databricks.

Full error:

--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/usr/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/usr/lib/python3.7/logging/__init__.py", line 611, in format
    s = self.formatMessage(record)
  File "/usr/lib/python3.7/logging/__init__.py", line 580, in formatMessage
    return self._style.format(record)
  File "/usr/lib/python3.7/logging/__init__.py", line 422, in format
    return self._fmt % record.__dict__
ValueError: unsupported format character '%' (0x25) at index 11
Call stack:
  File "/local_disk0/tmp/1623158547967-0/PythonShell.py", line 30, in <module>
    launch_process()
  File "/local_disk0/tmp/1623158547967-0/PythonShellImpl.py", line 1907, in launch_process
    shell.executor.run()
  File "/local_disk0/tmp/1623158547967-0/PythonShellImpl.py", line 285, in run
    self.shell.shell.run_cell(command_id, cmd, store_history=True)
  File "/local_disk0/tmp/1623158547967-0/PythonShellImpl.py", line 1233, in run_cell
    super(IPythonShell, self).run_cell(raw_cell, store_history, silent, shell_futures)
  File "/databricks/python/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2858, in run_cell
    raw_cell, store_history, silent, shell_futures)
  File "/databricks/python/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2886, in _run_cell
    return runner(coro)
  File "/databricks/python/lib/python3.7/site-packages/IPython/core/async_helpers.py", line 68, in _pseudo_sync_runner
    coro.send(None)
  File "/databricks/python/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3063, in run_cell_async
    interactivity=interactivity, compiler=compiler, result=result)
  File "/databricks/python/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3254, in run_ast_nodes
    if (await self.run_code(code, result,  async_=asy)):
  File "/databricks/python/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<command-3221151447083464>", line 3, in <module>
    logging.warning("Test2")
Message: 'Test2'
Arguments: ()
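
The traceback above shows the failure class: a %-style format string that cannot be applied to the log record, which `logging` reports on stderr as "--- Logging error ---" instead of raising. The following is a minimal sketch of that failure class only. The exact format string active on Databricks is not shown in this PR, so `BAD_FORMAT`, the logger name, and the helper `emit_with_format` are hypothetical and chosen just to trigger a `ValueError` during formatting.

```python
import contextlib
import io
import logging
from typing import Tuple

# Hypothetical broken format: "%(message)z" uses "z", which is not a valid
# %-style conversion character, so formatting a record raises ValueError.
BAD_FORMAT = "%(levelname)s:%(message)z"


def emit_with_format(fmt: str, message: str) -> Tuple[str, str]:
    """Log one warning through a handler using `fmt`.

    Returns (handler output, text written to stderr).
    """
    out = io.StringIO()
    handler = logging.StreamHandler(out)
    handler.setFormatter(logging.Formatter(fmt))

    logger = logging.getLogger("format_demo")  # hypothetical logger name
    logger.propagate = False  # keep the demo isolated from the root logger
    logger.addHandler(handler)

    err = io.StringIO()
    try:
        # logging reports handler failures on sys.stderr rather than raising.
        with contextlib.redirect_stderr(err):
            logger.warning(message)
    finally:
        logger.removeHandler(handler)
    return out.getvalue(), err.getvalue()


good_out, good_err = emit_with_format("%(levelname)s:%(message)s", "Test2")
bad_out, bad_err = emit_with_format(BAD_FORMAT, "Test2")
```

With the valid format the message reaches the handler and nothing is written to stderr. With the broken format the handler writes nothing, and stderr receives a "--- Logging error ---" report with a `ValueError` traceback, similar in shape to the one above.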

How was this patch tested?

After I manually removed this line, logging worked again. It is unclear why the library configures logging here.
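
For context, the Python logging documentation recommends that libraries leave logging configuration to the application and attach at most a `NullHandler` to their own logger. The sketch below illustrates that pattern; it is not the code from this PR, and the logger name `my_library` and the helper `get_library_logger` are hypothetical.

```python
import logging

# Hypothetical package name used for the library's logger.
LIBRARY_LOGGER_NAME = "my_library"


def get_library_logger(name: str = LIBRARY_LOGGER_NAME) -> logging.Logger:
    """Return a named logger without touching the root logger's configuration."""
    logger = logging.getLogger(name)
    # A NullHandler keeps the library quiet unless the application
    # (for example, a notebook environment) configures logging itself.
    if not any(isinstance(h, logging.NullHandler) for h in logger.handlers):
        logger.addHandler(logging.NullHandler())
    return logger


root_handlers_before = list(logging.getLogger().handlers)
log = get_library_logger()
log.warning("library message")  # handled by whatever the application set up
root_handlers_after = list(logging.getLogger().handlers)
```

Importing a module written this way leaves the root logger's handlers and format unchanged, so the host environment's own logging setup stays in effect.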

@skorski
Contributor Author

skorski commented Jun 10, 2021

@jiayuasu Should all of these tests pass? It seems odd that this change would cause some of the other failures.

@jiayuasu
Member

@yitao-li Hi Yitao, could you please check why the R build failed at Spark 3.1.1?

@yitao-li
Contributor

@jiayuasu I don't see how this change would break the R build.

There may have been a transient failure in fetching some MVN dependencies for Sedona (e.g., org.wololo:jts2geojson:0.14.31 or similar), in which case re-running the workflow might help.

@yitao-li
Contributor

yitao-li commented Jun 11, 2021

I guess I'll replicate this change in my own fork of the sedona repo, create a PR (for debugging purposes only), and then see what happens in the R build.

The Spark-3.1.1-specific errors were the following:

Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
    at sparklyr.Shell$.main(shell.scala:9)
    at sparklyr.Shell.main(shell.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

which is weird.

However, everything still works as expected with Spark 3.0.0, which is built with Scala 2.12. Only Spark 3.1.1 (also built with Scala 2.12) was problematic, so a Scala version mismatch is unlikely to be the issue here.

@jiayuasu jiayuasu changed the title removing logging configuration as it causes errors on databricks. [SEDONA-50] Removing logging configuration as it causes errors on databricks. Jun 13, 2021
@jiayuasu
Member

jiayuasu commented Jun 13, 2021

@skorski @yitao-li I'm not sure why R fails at Spark 3.1.1, but I think this PR itself has nothing to do with that issue. I will merge this PR first.

I also re-ran the CI several times, and it failed every time.

@jiayuasu jiayuasu added this to the sedona-1.1.0 milestone Jun 13, 2021
@jiayuasu jiayuasu merged commit d8c2aae into apache:master Jun 13, 2021
@yitao-li
Contributor

@jiayuasu The errors with Spark 3.1.1 turned out to be a minor issue on the sparklyr side. The fix has been merged to the main branch of sparklyr, and all the R builds should be succeeding now.
