[SPARK-31382][BUILD] Show a better error message for different python and pip installation mistake #28152

Closed
wants to merge 2 commits into from
18 changes: 16 additions & 2 deletions python/pyspark/find_spark_home.py
@@ -40,6 +40,7 @@ def is_spark_home(path):
     paths = ["../", os.path.dirname(os.path.realpath(__file__))]

     # Add the path of the PySpark module if it exists
+    import_error_raised = False
     if sys.version < "3":
         import imp
         try:
@@ -49,7 +50,7 @@ def is_spark_home(path):
             paths.append(os.path.join(module_home, "../../"))
         except ImportError:
             # Not pip installed no worries
-            pass
+            import_error_raised = True
     else:
         from importlib.util import find_spec
         try:
@@ -59,7 +60,7 @@ def is_spark_home(path):
             paths.append(os.path.join(module_home, "../../"))
         except ImportError:
             # Not pip installed no worries
-            pass
+            import_error_raised = True

     # Normalize the paths
     paths = [os.path.abspath(p) for p in paths]
@@ -68,6 +69,19 @@ def is_spark_home(path):
         return next(path for path in paths if is_spark_home(path))
     except StopIteration:
         print("Could not find valid SPARK_HOME while searching {0}".format(paths), file=sys.stderr)
+        if import_error_raised:
+            print(
+                "\nDid you install PySpark via a package manager such as pip or Conda? If so,\n"
+                "PySpark was not found in your Python environment. It is possible your\n"
+                "Python environment does not properly bind with your package manager.\n"
+                "\nPlease check your default 'python' and if you set PYSPARK_PYTHON and/or\n"
+                "PYSPARK_DRIVER_PYTHON environment variables, and see if you can import\n"
+                "PySpark, for example, 'python -c 'import pyspark'.\n"
+                "\nIf you cannot import, you can install by using the Python executable directly,\n"
+                "for example, 'python -m pip install pyspark [--user]'. Otherwise, you can also\n"
+                "explicitly set the Python executable, that has PySpark installed, to\n"
+                "PYSPARK_PYTHON or PYSPARK_DRIVER_PYTHON environment variables, for example,\n"
+                "'PYSPARK_PYTHON=python3 pyspark'.\n", file=sys.stderr)
         sys.exit(-1)

 if __name__ == "__main__":
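
The new message asks the user to check whether PySpark is importable from the Python executable they are actually running. A minimal sketch of that check, separate from the patch itself, might look like the following. The helper name `describe_module` is hypothetical and is not part of `find_spark_home.py`; it only reports what the current interpreter sees.

```python
import sys
from importlib.util import find_spec


def describe_module(name):
    """Report whether `name` is importable by the running interpreter.

    Hypothetical helper for illustration; not part of the PR.
    """
    # find_spec returns None (rather than raising) when a top-level
    # module is not installed for this interpreter.
    spec = find_spec(name)
    if spec is None:
        return "{0}: not importable by {1}".format(name, sys.executable)
    return "{0}: found at {1}".format(name, spec.origin)


if __name__ == "__main__":
    # Run this with the same executable that PYSPARK_PYTHON points to.
    print(describe_module("pyspark"))
```

Running the script with each candidate interpreter (for example the default `python` and the one set in `PYSPARK_PYTHON`) should show which of them, if any, has PySpark installed.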