bug: pyspark.connect fails when using databricks-connect #9060

Closed
1 task done
jstammers opened this issue Apr 26, 2024 · 1 comment · Fixed by #9061
Labels
bug Incorrect behavior inside of ibis pyspark The Apache PySpark backend
Milestone

Comments

@jstammers
Contributor

What happened?

I am unable to create a connection with ibis.pyspark.connect when passing a remote Spark session created with DatabricksSession.

from databricks.connect import DatabricksSession
import ibis

spark = DatabricksSession.builder.getOrCreate()
con = ibis.pyspark.connect(spark)  # raises PySparkAttributeError

From what I can see, this is because pyspark.Backend.do_connect accesses the session.sparkContext attribute, which Spark Connect sessions do not support.

I tested this by commenting out that assignment, after which I could create a connection successfully. I can't find anywhere that the context is actually used, so I'm not sure it is strictly required.
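To illustrate the failure mode, here is a minimal sketch that does not import pyspark or ibis. `FakeConnectSession`, `FakeClassicSession`, and `context_or_none` are hypothetical names made up for this example, and the sketch assumes the error raised on attribute access behaves like a standard `AttributeError`. It only shows how an unconditional `session.sparkContext` access breaks on a Spark Connect style session and how a guarded access would not; it is not the change made in ibis.

```python
class FakeConnectSession:
    """Hypothetical stand-in for a Spark Connect session (not a real pyspark class)."""

    def __getattr__(self, name):
        # Mimic the traceback: JVM-backed attributes are rejected on access.
        raise AttributeError(f"Attribute `{name}` is not supported in Spark Connect")


class FakeClassicSession:
    """Hypothetical stand-in for a regular, JVM-backed session."""

    sparkContext = "classic-context"


def context_or_none(session):
    """Return session.sparkContext when it is available, otherwise None."""
    try:
        return session.sparkContext
    except AttributeError:
        # Assumption: the real error is (or subclasses) AttributeError.
        return None


print(context_or_none(FakeConnectSession()))
print(context_or_none(FakeClassicSession()))
```

Since nothing appears to read the stored context, simply dropping the assignment is the simpler option; the guard above is only useful if the attribute is needed for non-Connect sessions.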

What version of ibis are you using?

9.0.0

What backend(s) are you using, if any?

PySpark

Relevant log output

File "/home/****/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3550, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-6ab88d163e5c>", line 8, in <module>
    con = ibis.pyspark.connect(spark)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/****/lib/python3.11/site-packages/ibis/__init__.py", line 108, in connect
    return backend.connect(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/****/lib/python3.11/site-packages/ibis/backends/__init__.py", line 848, in connect
    new_backend.reconnect()
  File "/home/****/lib/python3.11/site-packages/ibis/backends/__init__.py", line 862, in reconnect
    self.do_connect(*self._con_args, **self._con_kwargs)
  File "/home/****/lib/python3.11/site-packages/ibis/backends/pyspark/__init__.py", line 197, in do_connect
    self._context = session.sparkContext
                    ^^^^^^^^^^^^^^^^^^^^
  File "/home/****/lib/python3.11/site-packages/pyspark/sql/connect/session.py", line 772, in __getattr__
    raise PySparkAttributeError(
pyspark.errors.exceptions.base.PySparkAttributeError: [JVM_ATTRIBUTE_NOT_SUPPORTED] Attribute `sparkContext` is not supported in Spark Connect as it depends on the JVM. If you need to use this attribute, do not use Spark Connect when creating your session. Visit https://spark.apache.org/docs/latest/sql-getting-started.html#starting-point-sparksession for creating regular Spark Session in detail.

Code of Conduct

  • I agree to follow this project's Code of Conduct
@jstammers jstammers added the bug Incorrect behavior inside of ibis label Apr 26, 2024
@cpcloud
Member

cpcloud commented Apr 26, 2024

Yep, this property looks entirely unused. PR incoming.

@cpcloud cpcloud added this to the 9.0 milestone Apr 26, 2024
@cpcloud cpcloud added the pyspark The Apache PySpark backend label Apr 26, 2024
cpcloud added a commit that referenced this issue Apr 26, 2024
…ct (#9061)

Remove the unused `sparkContext` attribute, use of which prevents using
Spark Connect.

Closes #9060.