Please fill out the form below.
System Information
- Spark or PySpark: PySpark
- SDK Version: 1.4.2
- Spark Version: 2.4.0
- Algorithm (e.g. KMeans): N/A
Describe the problem
Following the code snippet from https://github.com/aws/sagemaker-spark/tree/master/sagemaker-pyspark-sdk#local-spark-on-sagemaker-notebook-instances to run local Spark on a SageMaker notebook instance (platform identifier notebook-al2-v1) with the conda_python3 kernel, import sagemaker_pyspark fails. I started another SageMaker notebook instance (platform identifier notebook-al1-v1) and it works well there.
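For reference, a quick way to compare the two environments from the conda_python3 kernel (a minimal sketch; the Python 3.8 path comes from the traceback below, and it is an assumption that notebook-al1-v1 ships an older interpreter):

import platform
import sys

# On notebook-al2-v1 this kernel runs Python 3.8 (the traceback paths
# below show ~/anaconda3/envs/python3/lib/python3.8/...). Assumption:
# notebook-al1-v1 ships an older Python, which would explain why the
# same snippet works there.
print(platform.python_version())
print(sys.executable)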
Minimal repro / logs
Please provide any logs and a bare minimum reproducible test case, as this will be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.
- Exact command to reproduce: import sagemaker_pyspark
Traceback:
import sagemaker_pyspark
from pyspark.sql import SparkSession
classpath = ":".join(sagemaker_pyspark.classpath_jars())
spark = SparkSession.builder.config("spark.driver.extraClassPath", classpath).getOrCreate()
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/tmp/ipykernel_18565/3057191668.py in <cell line: 1>()
----> 1 import sagemaker_pyspark
2 from pyspark.sql import SparkSession
3
4 classpath = ":".join(sagemaker_pyspark.classpath_jars())
5 spark = SparkSession.builder.config("spark.driver.extraClassPath", classpath).getOrCreate()
~/anaconda3/envs/python3/lib/python3.8/site-packages/sagemaker_pyspark/__init__.py in <module>
17 """
18
---> 19 from .wrapper import SageMakerJavaWrapper, Option
20 from .IAMRoleResource import IAMRole, IAMRoleFromConfig
21 from .SageMakerClients import SageMakerClients
~/anaconda3/envs/python3/lib/python3.8/site-packages/sagemaker_pyspark/wrapper.py in <module>
16 from abc import ABCMeta
17
---> 18 from pyspark import SparkContext
19 from pyspark.ml.common import _java2py
20 from pyspark.ml.wrapper import JavaWrapper
~/anaconda3/envs/python3/lib/python3.8/site-packages/pyspark/__init__.py in <module>
49
50 from pyspark.conf import SparkConf
---> 51 from pyspark.context import SparkContext
52 from pyspark.rdd import RDD, RDDBarrier
53 from pyspark.files import SparkFiles
~/anaconda3/envs/python3/lib/python3.8/site-packages/pyspark/context.py in <module>
29 from py4j.protocol import Py4JError
30
---> 31 from pyspark import accumulators
32 from pyspark.accumulators import Accumulator
33 from pyspark.broadcast import Broadcast, BroadcastPickleRegistry
~/anaconda3/envs/python3/lib/python3.8/site-packages/pyspark/accumulators.py in <module>
95 import socketserver as SocketServer
96 import threading
---> 97 from pyspark.serializers import read_int, PickleSerializer
98
99
~/anaconda3/envs/python3/lib/python3.8/site-packages/pyspark/serializers.py in <module>
69 xrange = range
70
---> 71 from pyspark import cloudpickle
72 from pyspark.util import _exception_message
73
~/anaconda3/envs/python3/lib/python3.8/site-packages/pyspark/cloudpickle.py in <module>
143
144
--> 145 _cell_set_template_code = _make_cell_set_template_code()
146
147
~/anaconda3/envs/python3/lib/python3.8/site-packages/pyspark/cloudpickle.py in _make_cell_set_template_code()
124 )
125 else:
--> 126 return types.CodeType(
127 co.co_argcount,
128 co.co_kwonlyargcount,
TypeError: an integer is required (got type bytes)
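For what it's worth, this TypeError matches the known incompatibility between Python 3.8 and the cloudpickle copy bundled with pyspark 2.4: Python 3.8 added a posonlyargcount parameter to types.CodeType, so the Python-3.7-era positional call in pyspark/cloudpickle.py shifts every later argument and passes co_code (bytes) into a slot that now expects an int. A minimal check (a sketch to confirm the interpreter mismatch, not a fix):

import sys
import types

# code objects grew a co_posonlyargcount slot in Python 3.8; the
# positional types.CodeType(...) call in pyspark 2.4's bundled
# cloudpickle.py predates it, which is what produces
# "TypeError: an integer is required (got type bytes)".
print(sys.version_info[:2])
print(hasattr(types.CodeType, "co_posonlyargcount"))  # True on 3.8+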