Hidden requirement for geopandas in apache-sedona[spark] 1.5.2 #1404

Closed
joonaspessi opened this issue May 8, 2024 · 4 comments · Fixed by #1405

Comments

@joonaspessi

Expected behavior

When Sedona for PySpark is installed with the package apache-sedona[spark], all required dependencies should be installed, and geopandas should not be needed unless Kepler or pydeck is used.

Actual behavior

After installing apache-sedona[spark], running from sedona.spark import * fails with ModuleNotFoundError: No module named 'geopandas'

$ pip install "apache-sedona[spark]"==1.5.2
$ python
Python 3.8.18 (default, Feb 13 2024, 15:47:05)
[Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from sedona.spark import *
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/python/versions/3.8.18/lib/python3.8/site-packages/sedona/spark/__init__.py", line 44, in <module>
    from sedona.maps.SedonaKepler import SedonaKepler
  File "/python/versions/3.8.18/lib/python3.8/site-packages/sedona/maps/SedonaKepler.py", line 18, in <module>
    from sedona.maps.SedonaMapUtils import SedonaMapUtils
  File "/python/versions/3.8.18/lib/python3.8/site-packages/sedona/maps/SedonaMapUtils.py", line 19, in <module>
    import geopandas as gpd
ModuleNotFoundError: No module named 'geopandas'

Steps to reproduce the problem

Create a clean Python environment and run these commands:

$ pip install "apache-sedona[spark]"==1.5.2
$ python
Python 3.8.18 (default, Feb 13 2024, 15:47:05)
[Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from sedona.spark import *
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/python/versions/3.8.18/lib/python3.8/site-packages/sedona/spark/__init__.py", line 44, in <module>
    from sedona.maps.SedonaKepler import SedonaKepler
  File "/python/versions/3.8.18/lib/python3.8/site-packages/sedona/maps/SedonaKepler.py", line 18, in <module>
    from sedona.maps.SedonaMapUtils import SedonaMapUtils
  File "/python/versions/3.8.18/lib/python3.8/site-packages/sedona/maps/SedonaMapUtils.py", line 19, in <module>
    import geopandas as gpd
ModuleNotFoundError: No module named 'geopandas'

Settings

Sedona version = 1.5.2

Apache Spark version = 3.5.1

Apache Flink version = ?

API type = Python

Scala version = N/A

JRE version = 1.8

Python version = 3.8

Environment = Standalone

@jiayuasu
Member

jiayuasu commented May 8, 2024

@joonaspessi Sorry, this PR accidentally introduced this issue: #1229

To bypass this problem, instead of using from sedona.spark import *, please use from sedona.spark.SedonaContext import SedonaContext

@joonaspessi
Author

joonaspessi commented May 8, 2024

Hello, thanks for the fast response!

I think Python will still load the __init__.py file of the sedona.spark package even when importing a submodule such as sedona.spark.SedonaContext, so the suggested import would fail in the same way.
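
To illustrate the point about __init__.py, here is a minimal sketch that does not use Sedona at all. It builds a throwaway package named demo_pkg (a hypothetical name) whose __init__.py imports a module that is not installed, then tries to import only the submodule demo_pkg.context. The names demo_pkg, context and missing_optional_dep_xyz are assumptions made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create demo_pkg/ whose __init__.py imports a module that is not installed."""
    pkg = root / "demo_pkg"
    pkg.mkdir()
    # Stands in for a package __init__.py that eagerly imports an optional dependency.
    (pkg / "__init__.py").write_text("import missing_optional_dep_xyz\n")
    # Stands in for a submodule that has no optional dependencies of its own.
    (pkg / "context.py").write_text("VALUE = 42\n")


def try_submodule_import() -> str:
    """Import demo_pkg.context and report which module was missing, if any."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        try:
            # Python executes demo_pkg/__init__.py before it reaches context.py.
            importlib.import_module("demo_pkg.context")
            return "import succeeded"
        except ModuleNotFoundError as exc:
            return f"failed on: {exc.name}"
        finally:
            sys.path.remove(tmp)
            # Drop any partially imported entries so the demo can be rerun.
            for name in [n for n in sys.modules if n.startswith("demo_pkg")]:
                del sys.modules[name]


print(try_submodule_import())
```

The submodule import fails on the missing module named in the package __init__.py, even though context.py never references it. The same mechanism would apply to any submodule import under a package whose __init__.py has an unconditional import of an uninstalled dependency.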

@jiayuasu
Member

jiayuasu commented May 8, 2024

This is very sad. We will make a follow-up release to fix this bug.
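
For readers wondering how an optional dependency can be kept out of the import path in general, here is a minimal sketch of a deferred import. It is not taken from #1405 or from the 1.5.3 release and may differ from the actual fix; load_optional and MapHelper are hypothetical names used only for illustration.

```python
import importlib


def load_optional(module_name: str):
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError:
        return None


class MapHelper:
    """Hypothetical helper that needs an optional dependency only when used."""

    def __init__(self, dependency: str = "geopandas"):
        # Nothing is imported here, so constructing the helper never fails.
        self._dependency = dependency

    def render(self):
        # The optional module is looked up only when the feature is called.
        module = load_optional(self._dependency)
        if module is None:
            raise ImportError(
                f"{self._dependency} is required for map rendering; "
                "install it to use this feature"
            )
        return module
```

With this pattern, importing the package succeeds without the optional module, and the error appears only when the map feature is actually called.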

@jiayuasu
Member

We have released 1.5.3 to fix this bug! @joonaspessi
