UnicodeEncodeError in zip.write() #2

Open
athewsey opened this issue Sep 16, 2020 · 0 comments

I'm attempting to get this sample (which builds a container image from a notebook, in the "Define a SageMaker Model Monitor schedule" section) running in SageMaker Studio, using the new CLI.

Essentially there is a ./docker/ folder next to my notebook containing just a Dockerfile and evaluation.py script.

However when I run:

!sm-docker build ./docker --file ./docker/Dockerfile --repository sagemaker-processing-container:latest

I get the following error (the same happens without the --file or --repository options, or without the :latest tag):

Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/zipfile.py", line 432, in _encodeFilenameFlags
    return self.filename.encode('ascii'), self.flag_bits
UnicodeEncodeError: 'ascii' codec can't encode characters in position 11-31: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/bin/sm-docker", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_studio_image_build/cli.py", line 92, in main
    args.func(args, unknown)
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_studio_image_build/cli.py", line 53, in build_image
    args.repository, get_role(args), args.bucket, extra_args, log=not args.no_logs
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_studio_image_build/builder.py", line 68, in build_image
    bucket, key = upload_zip_file(repository, bucket, " ".join(extra_args))
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_studio_image_build/builder.py", line 39, in upload_zip_file
    zip.write(f"{dirname}/{file}")
  File "/opt/conda/lib/python3.6/zipfile.py", line 1622, in write
    with open(filename, "rb") as src, self.open(zinfo, 'w') as dest:
  File "/opt/conda/lib/python3.6/zipfile.py", line 1355, in open
    return self._open_to_write(zinfo, force_zip64=force_zip64)
  File "/opt/conda/lib/python3.6/zipfile.py", line 1468, in _open_to_write
    self.fp.write(zinfo.FileHeader(zip64))
  File "/opt/conda/lib/python3.6/zipfile.py", line 422, in FileHeader
    filename, flag_bits = self._encodeFilenameFlags()
  File "/opt/conda/lib/python3.6/zipfile.py", line 434, in _encodeFilenameFlags
    return self.filename.encode('utf-8'), self.flag_bits | 0x800
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 11-31: surrogates not allowed

It's a weird error, so I could well be doing something stupid, but I'm wondering whether an implicitly assumed encoding somewhere is clashing with this kernel's environment.
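
For reference, here is a minimal sketch of one way this exact pair of errors can arise. It is not a confirmed diagnosis of the problem above. It assumes a file name containing a byte that is invalid in the active filesystem encoding, which Python decodes to a lone surrogate via the "surrogateescape" error handler. The name `docker/caf\xe9.py` and the helper `try_encode` are hypothetical and used only for illustration.

```python
# Hypothetical file name containing a Latin-1 byte (0xE9) that is not valid UTF-8.
raw = b"docker/caf\xe9.py"

# Python decodes undecodable file name bytes to lone surrogates
# using the "surrogateescape" error handler.
name = raw.decode("utf-8", "surrogateescape")


def try_encode(text, codec):
    """Return None if text encodes cleanly, else the codec's error reason."""
    try:
        text.encode(codec)
        return None
    except UnicodeEncodeError as exc:
        return exc.reason


# zipfile first tries ASCII, then falls back to UTF-8; both attempts fail here.
ascii_reason = try_encode(name, "ascii")
utf8_reason = try_encode(name, "utf-8")
print(ascii_reason)
print(utf8_reason)

# The original bytes are only recoverable with the same error handler.
roundtrip = name.encode("utf-8", "surrogateescape")
```

If this is the mechanism, the offending name need not be visibly unusual in a file browser; it only has to contain bytes that the kernel's filesystem encoding cannot decode.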

I don't have any special characters in my filenames, and I'm running the Studio kernel Python 3 (PyTorch CPU Optimized).
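
One way to check that assumption is to scan for paths that cannot be encoded as UTF-8. This is a sketch only: the traceback does not show which directory the CLI walks, so it assumes the current working directory (".") is the relevant root, and `find_unencodable` is a hypothetical helper, not part of the CLI.

```python
import os
import sys


def find_unencodable(root):
    """Return paths under root whose names cannot be encoded as UTF-8."""
    bad = []
    for dirname, _subdirs, files in os.walk(root):
        for file in files:
            path = f"{dirname}/{file}"
            try:
                path.encode("utf-8")
            except UnicodeEncodeError:
                bad.append(path)
    return bad


if __name__ == "__main__":
    # An ASCII filesystem encoding here would suggest a locale issue in the kernel.
    print("filesystem encoding:", sys.getfilesystemencoding())
    for path in find_unencodable("."):
        # ascii() escapes surrogates so the path can be printed safely.
        print(ascii(path))
```

An empty result would point away from the file names themselves; any printed path is a candidate for the name that `zip.write()` fails on.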

Any ideas or insights greatly appreciated!

Full steps to reproduce

(From the referenced public sample above)

  • Add this package to the set of pip installs at the top
  • Replace the ! unzip ... command with something like the following (since Studio kernels don't have unzip installed by default)
import zipfile
with zipfile.ZipFile('GTSRB_Final_Test_Images.zip', 'r') as zip_ref:
    print('Unzipping...')
    zip_ref.extractall()
  • Split the cell containing # Create ECR repository and push docker image: execute just the first (Python) half, then run the above sm-docker command instead of the sample's !docker build ... line.