This repository has been archived by the owner on May 23, 2024. It is now read-only.

Out-of-sync Python version (code/requirements.txt fails to install matplotlib) #133

Closed
athewsey opened this issue Apr 19, 2020 · 3 comments

Comments

@athewsey

I'm creating and deploying a TensorFlow v1.12 estimator via the SageMaker SDK, which creates a code/ folder containing inference.py and requirements.txt inside the model.tar.gz archive. A simplified setup is below:

from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(
    entry_point="train.py",
    source_dir="src",
    framework_version="1.12",
    py_version="py3",
    input_mode="Pipe",  # (PipeModeDataset only supports up to TFv1.12 atm)
)
estimator.fit()
predictor = estimator.deploy(
    endpoint_type="tensorflow-serving",
)

My requirements include matplotlib (using their RGB/HSV conversion utilities to pre-process model inputs), and I get the following error when calling deploy():

Collecting matplotlib (from -r /opt/ml/model/code/requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/4a/30/eb8e7dd8e3609f05c6920fa82f189302c832e5a0f6667aa96f952056bc0c/matplotlib-3.2.1.tar.gz (40.3MB)

Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-u7sdl9if/matplotlib/setup.py", line 139
        raise IOError(f"Failed to download jquery-ui.  Please download "
                                                                       ^
    SyntaxError: invalid syntax
    
    ----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-u7sdl9if/matplotlib/
You are using pip version 8.1.1, however version 20.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
ERROR:__main__:failed to install required packages, exiting.

I understand from matplotlib/matplotlib#17075 that this is due to matplotlib bumping its required Python version up to 3.6 (the f-string in setup.py is a syntax error on older interpreters). But I'm using the same requirements.txt at training time and it's working fine there.
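As a stopgap on the Python 3.5 serving image, the requirement could be capped below the release that dropped 3.5. The sketch below is hypothetical: the function name and the idea of choosing the pin at runtime are not from this thread. It assumes Matplotlib 3.0.x is the newest series that still installs on Python 3.5, since 3.1 raised the minimum to 3.6. It avoids f-strings so that it parses on 3.5 as well.

```python
import sys

# Assumption: Matplotlib 3.1 requires Python 3.6+, so 3.0.x is the
# newest series usable on a Python 3.5 interpreter.
LEGACY_PIN = "matplotlib<3.1"
CURRENT_PIN = "matplotlib"


def matplotlib_requirement(version_info=None):
    """Return a pip requirement string suited to the given interpreter.

    version_info defaults to the running interpreter's sys.version_info;
    only the (major, minor) pair is inspected.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    if major_minor < (3, 6):
        return LEGACY_PIN
    return CURRENT_PIN


# Example: print the line that would go into requirements.txt here.
print(matplotlib_requirement())
```

Hard-coding the capped pin in code/requirements.txt would have the same effect for the 1.12 serving container, at the cost of holding the training container back to the older release too.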

Since users don't get fine-grained control over the Python version (just py3), IMHO the training and serving containers should at least sync to the same version to prevent weird compatibility issues like this?

It looks like the training container takes some specific steps to install a particular Python version, whereas this inference container just takes whatever it inherits from the tensorflow/serving and nvidia/cuda base images.
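If the only thing needed from matplotlib at inference time is the RGB/HSV conversion, another possible workaround is to drop the dependency from the serving side and use the standard library's colorsys module. This is a minimal sketch, assuming pixel values are floats in [0, 1] and that per-pixel conversion in pure Python is fast enough for the payload sizes involved; the helper names are made up for illustration.

```python
import colorsys


def rgb_image_to_hsv(pixels):
    """Convert rows of (r, g, b) floats in [0, 1] to rows of (h, s, v)."""
    return [
        [colorsys.rgb_to_hsv(r, g, b) for (r, g, b) in row]
        for row in pixels
    ]


def hsv_image_to_rgb(pixels):
    """Convert rows of (h, s, v) floats in [0, 1] back to rows of (r, g, b)."""
    return [
        [colorsys.hsv_to_rgb(h, s, v) for (h, s, v) in row]
        for row in pixels
    ]


# A 1x2 "image": pure red and pure green.
image = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]]
print(rgb_image_to_hsv(image))
```

Unlike matplotlib's array-based helpers, this works on nested lists of tuples, so inputs arriving as NumPy arrays would need converting first.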

@laurenyu
Contributor

Sorry for the delayed response here. https://github.com/aws/sagemaker-tensorflow-extensions supports more recent versions of TF, including 1.15 and 2.1. Does the issue persist with those versions?

@athewsey
Author

I tested the other TFv1.x framework containers today. It's a little trickier to adapt my script for 2.x, so I haven't looked at those yet:

  • TF 1.12 trains with Python 3.6.8 (SMTF 1.12.0.1.0.1); deploys with Python 3.5.2
  • TF 1.13 trains with Python 3.6.6 (SMTF 1.13.1.1.0.0); deploys with Python 3.6.8
  • TF 1.14 trains with Python 3.6.6 (SMTF 1.14.0.1.0.0); deploys with Python 3.6.8
  • TF 1.15 trains with Python 3.6.9 (SMTF 1.15.0.1.1.0); deploys with Python 3.6.9

(...also I think I noticed that tensorflow is importable on the TFv1.12 and 1.15 inference containers, but not 1.13 or 1.14? Kinda weird...)
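For anyone wanting to repeat the comparison, a check along these lines could be dropped into both train.py and inference.py and its output read from the respective logs. It's only a sketch: the function name and the default module list are illustrative, and it avoids f-strings so that it still parses on the Python 3.5 image.

```python
import importlib.util
import platform
import sys


def environment_report(modules=("tensorflow", "matplotlib")):
    """Return the interpreter version and whether each module can be found.

    Only top-level module names are expected; nothing is imported, the
    import system is merely asked whether it can locate each name.
    """
    report = {"python": platform.python_version()}
    for name in modules:
        report[name] = importlib.util.find_spec(name) is not None
    return report


# Write to stderr so the line shows up in the container logs.
sys.stderr.write("environment: {}\n".format(environment_report()))
```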

So 1.15 looks in sync, and while 1.13/1.14 have some discrepancies, they're only in the patch version (3.6.6 vs 3.6.8), so less likely to be breaking.

Would happily upgrade to 1.15, but seem to be blocked by aws/sagemaker-tensorflow-extensions#46

@ajaykarpur
Contributor

Hi @athewsey, the TF 1.12 image is no longer being maintained. We'll look at addressing the issue blocking you from upgrading to 1.15.
