
import sagemaker_spark failed on sagemaker notebook instance (platform identifier notebook-al2-v1) #144

@ohfloydo

Description


Please fill out the form below.

System Information

  • Spark or PySpark: PySpark
  • SDK Version: 1.4.2
  • Spark Version: 2.4.0
  • Algorithm (e.g. KMeans): N/A

Describe the problem

Following the code snippet from https://github.com/aws/sagemaker-spark/tree/master/sagemaker-pyspark-sdk#local-spark-on-sagemaker-notebook-instances to run local Spark on a SageMaker notebook instance (platform identifier notebook-al2-v1) with the conda_python3 kernel, `import sagemaker_pyspark` fails. On another notebook instance (platform identifier notebook-al1-v1), the same code works fine.

Minimal repro / logs

Please provide any logs and a bare minimum reproducible test case, as this will be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

  • Exact command to reproduce: import sagemaker_pyspark

Traceback

import sagemaker_pyspark
from pyspark.sql import SparkSession

classpath = ":".join(sagemaker_pyspark.classpath_jars())
spark = SparkSession.builder.config("spark.driver.extraClassPath", classpath).getOrCreate()

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_18565/3057191668.py in <cell line: 1>()
----> 1 import sagemaker_pyspark
      2 from pyspark.sql import SparkSession
      3 
      4 classpath = ":".join(sagemaker_pyspark.classpath_jars())
      5 spark = SparkSession.builder.config("spark.driver.extraClassPath", classpath).getOrCreate()

~/anaconda3/envs/python3/lib/python3.8/site-packages/sagemaker_pyspark/__init__.py in <module>
     17 """
     18 
---> 19 from .wrapper import SageMakerJavaWrapper, Option
     20 from .IAMRoleResource import IAMRole, IAMRoleFromConfig
     21 from .SageMakerClients import SageMakerClients

~/anaconda3/envs/python3/lib/python3.8/site-packages/sagemaker_pyspark/wrapper.py in <module>
     16 from abc import ABCMeta
     17 
---> 18 from pyspark import SparkContext
     19 from pyspark.ml.common import _java2py
     20 from pyspark.ml.wrapper import JavaWrapper

~/anaconda3/envs/python3/lib/python3.8/site-packages/pyspark/__init__.py in <module>
     49 
     50 from pyspark.conf import SparkConf
---> 51 from pyspark.context import SparkContext
     52 from pyspark.rdd import RDD, RDDBarrier
     53 from pyspark.files import SparkFiles

~/anaconda3/envs/python3/lib/python3.8/site-packages/pyspark/context.py in <module>
     29 from py4j.protocol import Py4JError
     30 
---> 31 from pyspark import accumulators
     32 from pyspark.accumulators import Accumulator
     33 from pyspark.broadcast import Broadcast, BroadcastPickleRegistry

~/anaconda3/envs/python3/lib/python3.8/site-packages/pyspark/accumulators.py in <module>
     95     import socketserver as SocketServer
     96 import threading
---> 97 from pyspark.serializers import read_int, PickleSerializer
     98 
     99 

~/anaconda3/envs/python3/lib/python3.8/site-packages/pyspark/serializers.py in <module>
     69     xrange = range
     70 
---> 71 from pyspark import cloudpickle
     72 from pyspark.util import _exception_message
     73 

~/anaconda3/envs/python3/lib/python3.8/site-packages/pyspark/cloudpickle.py in <module>
    143 
    144 
--> 145 _cell_set_template_code = _make_cell_set_template_code()
    146 
    147 

~/anaconda3/envs/python3/lib/python3.8/site-packages/pyspark/cloudpickle.py in _make_cell_set_template_code()
    124         )
    125     else:
--> 126         return types.CodeType(
    127             co.co_argcount,
    128             co.co_kwonlyargcount,

TypeError: an integer is required (got type bytes)
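Editor's note: this TypeError is most likely the known incompatibility between the cloudpickle bundled with Spark 2.4 and Python 3.8. The traceback paths show the al2 image's conda_python3 env runs Python 3.8 (`.../envs/python3/lib/python3.8/...`), whereas the al1 image shipped an older Python. Python 3.8 inserted a new leading `posonlyargcount` parameter into `types.CodeType`, so the pre-3.8 positional call in `cloudpickle._make_cell_set_template_code` shifts every argument by one slot and eventually passes the bytecode (`bytes`) where an integer is expected. A minimal diagnostic sketch (the helper name `spark24_cloudpickle_compatible` is illustrative, not part of any library):

```python
import sys

# Spark 2.4's bundled cloudpickle builds code objects with the pre-3.8
# types.CodeType argument list. Python 3.8 added co_posonlyargcount as a
# new leading parameter, so the old positional call mis-aligns its
# arguments and fails with "an integer is required (got type bytes)".
def spark24_cloudpickle_compatible(version_info=sys.version_info):
    """Spark 2.4's bundled cloudpickle only supports Python < 3.8."""
    return version_info < (3, 8)

# The new code-object field exists exactly on the interpreters where the
# old positional call breaks.
code = compile("pass", "<check>", "exec")
has_new_field = hasattr(code, "co_posonlyargcount")
print(sys.version_info[:2], has_new_field, spark24_cloudpickle_compatible())
```

If this check reports an incompatible pair, likely workarounds are running the notebook in an environment with Python < 3.8 (as on notebook-al1-v1) or moving to a Spark/PySpark release whose cloudpickle supports Python 3.8.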
