
Master and Worker Node on different Python Versions #723

Closed
manthanthakker opened this issue Jun 18, 2020 · 1 comment

Comments


manthanthakker commented Jun 18, 2020

Docker Image: aztk/spark:v0.1.0-spark2.2.0-miniconda-base

(I tried different Docker images as well; the issue persists.)

My cluster has 2 dedicated nodes and 3 low-priority nodes. I have enabled the jupyterlab and jupyter plugins as described in the documentation:

plugins:
  - name: jupyterlab
  - name: jupyter

When I try to execute the sample Jupyter Calculate PI notebook, I get the following error:

Exception: Python in worker has different version 3.6 than that in driver 3.7, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
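
The error message points at PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON, so a quick first check is to echo them on every node. A minimal sketch, assuming aztk runs the command through a shell on each node (an empty value means the variable is unset there):

# Print the interpreter-related environment variables on every node
aztk spark cluster run --id myclustername 'echo "PYSPARK_PYTHON=$PYSPARK_PYTHON PYSPARK_DRIVER_PYTHON=$PYSPARK_DRIVER_PYTHON"'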

I checked the Python version on each node:


aztk spark cluster run --id myclustername "python --version" 

and found that the master node is on Python 3.7.0 while the worker nodes are on Python 3.6.4 :: Anaconda, Inc.

This looks like a bug in the plugin installation. How can I fix this?

manthanthakker (Author) commented

This issue is caused by the jupyterlab plugin.

When a customized Docker image is deployed, the jupyterlab installation script somehow upgrades the Python version on the master node only. Removing jupyterlab from the plugin list resolves the issue.
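
If jupyterlab needs to stay enabled, a possible workaround is to pin every node back to the same interpreter after the cluster comes up. A rough sketch, assuming conda is on the PATH on all nodes and that 3.6.4 (the workers' version) is the target:

# Align the Python version across master and workers, then verify
aztk spark cluster run --id myclustername "conda install -y python=3.6.4"
aztk spark cluster run --id myclustername "python --version"

Both commands run on every node in the cluster; the second one just confirms that the versions now match.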
