tfx pipeline compile --engine=vertex fails in 1.14.0 #6335

Closed
IzakMaraisTAL opened this issue Sep 28, 2023 · 3 comments

@IzakMaraisTAL
Contributor

IzakMaraisTAL commented Sep 28, 2023

System information

  • Environment in which the code is executed: Local Linux.
  • TFX Version: 1.14.0
  • Python version: 3.8.16
  • Python dependencies:
    I started with a clean virtual env and ran pip install tfx==1.14.0 followed by pip install kfp.
pip list --format=freeze

absl-py==1.4.0
anyio==4.0.0
apache-beam==2.50.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.2.3
astunparse==1.6.3
attrs==21.4.0
backcall==0.2.0
beautifulsoup4==4.12.2
bleach==6.0.0
cachetools==5.3.1
certifi==2023.7.22
cffi==1.15.1
charset-normalizer==3.2.0
click==8.1.7
cloudpickle==2.2.1
comm==0.1.4
crcmod==1.7
debugpy==1.8.0
decorator==5.1.1
defusedxml==0.7.1
dill==0.3.1.1
dm-tree==0.1.8
dnspython==2.4.2
docker==4.4.4
docopt==0.6.2
docstring-parser==0.15
entrypoints==0.4
exceptiongroup==1.1.3
fastavro==1.8.3
fasteners==0.19
fastjsonschema==2.18.0
flatbuffers==23.5.26
fqdn==1.5.1
gast==0.4.0
google-api-core==2.12.0
google-api-python-client==1.12.11
google-apitools==0.5.31
google-auth==2.23.0
google-auth-httplib2==0.1.1
google-auth-oauthlib==1.0.0
google-cloud-aiplatform==1.33.1
google-cloud-bigquery==2.34.4
google-cloud-bigquery-storage==2.22.0
google-cloud-bigtable==2.21.0
google-cloud-core==2.3.3
google-cloud-datastore==2.18.0
google-cloud-dlp==3.12.3
google-cloud-language==2.11.1
google-cloud-pubsub==2.18.4
google-cloud-pubsublite==1.8.3
google-cloud-recommendations-ai==0.10.5
google-cloud-resource-manager==1.10.4
google-cloud-spanner==3.40.1
google-cloud-storage==2.11.0
google-cloud-videointelligence==2.11.4
google-cloud-vision==3.4.4
google-crc32c==1.5.0
google-pasta==0.2.0
google-resumable-media==2.6.0
googleapis-common-protos==1.60.0
grpc-google-iam-v1==0.12.6
grpcio==1.58.0
grpcio-status==1.48.2
h5py==3.9.0
hdfs==2.7.2
httplib2==0.22.0
idna==3.4
importlib-metadata==6.8.0
importlib-resources==6.1.0
ipykernel==6.25.2
ipython==7.34.0
ipython-genutils==0.2.0
ipywidgets==7.8.1
isoduration==20.11.0
jedi==0.19.0
Jinja2==3.1.2
joblib==1.3.2
jsonpointer==2.4
jsonschema==4.17.3
jupyter_client==7.4.9
jupyter_core==5.3.2
jupyter-events==0.6.3
jupyter_server==2.7.3
jupyter_server_terminals==0.4.4
jupyterlab-pygments==0.2.2
jupyterlab-widgets==1.1.7
keras==2.13.1
keras-core==0.1.5
keras-tuner==1.4.2
kfp==2.3.0
kfp-pipeline-spec==0.2.2
kfp-server-api==2.0.1
kt-legacy==1.0.5
kubernetes==12.0.1
libclang==16.0.6
Markdown==3.4.4
markdown-it-py==3.0.0
MarkupSafe==2.1.3
matplotlib-inline==0.1.6
mdurl==0.1.2
mistune==3.0.1
ml-metadata==1.14.0
ml-pipelines-sdk==1.14.0
namex==0.0.7
nbclassic==1.0.0
nbclient==0.8.0
nbconvert==7.8.0
nbformat==5.9.2
nest-asyncio==1.5.8
notebook==6.5.6
notebook_shim==0.2.3
numpy==1.24.3
oauth2client==4.1.3
oauthlib==3.2.2
objsize==0.6.1
opt-einsum==3.3.0
orjson==3.9.7
overrides==6.5.0
packaging==20.9
pandas==1.5.3
pandocfilters==1.5.0
parso==0.8.3
pexpect==4.8.0
pickleshare==0.7.5
Pillow==10.0.1
pip==22.0.4
pkgutil_resolve_name==1.3.10
platformdirs==3.10.0
portpicker==1.6.0
prometheus-client==0.17.1
prompt-toolkit==3.0.39
proto-plus==1.22.3
protobuf==3.20.3
psutil==5.9.5
ptyprocess==0.7.0
pyarrow==10.0.1
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycparser==2.21
pydot==1.4.2
pyfarmhash==0.3.2
Pygments==2.16.1
pymongo==4.5.0
pyparsing==3.1.1
pyrsistent==0.19.3
python-dateutil==2.8.2
python-json-logger==2.0.7
pytz==2023.3.post1
PyYAML==6.0.1
pyzmq==24.0.1
regex==2023.8.8
requests==2.31.0
requests-oauthlib==1.3.1
requests-toolbelt==0.10.1
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.5.3
rsa==4.9
scipy==1.10.1
Send2Trash==1.8.2
setuptools==56.0.0
Shapely==1.8.5.post1
six==1.16.0
sniffio==1.3.0
soupsieve==2.5
sqlparse==0.4.4
tabulate==0.9.0
tensorboard==2.13.0
tensorboard-data-server==0.7.1
tensorflow==2.13.1
tensorflow-data-validation==1.14.0
tensorflow-estimator==2.13.0
tensorflow-hub==0.13.0
tensorflow-io-gcs-filesystem==0.34.0
tensorflow-metadata==1.14.0
tensorflow-model-analysis==0.45.0
tensorflow-serving-api==2.13.0
tensorflow-transform==1.14.0
termcolor==2.3.0
terminado==0.17.1
tfx==1.14.0
tfx-bsl==1.14.0
tinycss2==1.2.1
tornado==6.3.3
traitlets==5.10.1
typing_extensions==4.5.0
uri-template==1.3.0
uritemplate==3.0.1
urllib3==1.26.16
wcwidth==0.2.6
webcolors==1.13
webencodings==0.5.1
websocket-client==1.6.3
Werkzeug==2.3.7
wheel==0.41.2
widgetsnbextension==3.6.6
wrapt==1.15.0
zipp==3.17.0
zstandard==0.21.0
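
Most of the listing above is incidental; the entries that matter for this report are tfx, kfp and tensorflow. As a rough sketch, the freeze output can be filtered down to those packages. The helper name `parse_freeze` is hypothetical, and the sample text is a short excerpt copied from the listing.

```python
from typing import Dict


def parse_freeze(text: str) -> Dict[str, str]:
    """Map package name to version for each `name==version` line."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" not in line:
            continue  # skip blank lines and anything that is not a pin
        name, _, version = line.partition("==")
        versions[name.lower()] = version
    return versions


# Excerpt of the `pip list --format=freeze` output above.
sample = """
kfp==2.3.0
kfp-pipeline-spec==0.2.2
tensorflow==2.13.1
tfx==1.14.0
"""

parsed = parse_freeze(sample)
relevant = {name: parsed[name] for name in ("tfx", "kfp", "tensorflow")}
print(relevant)
```

In a real environment the same function could be fed the full output of `pip list --format=freeze` instead of the excerpt.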

Describe the current behavior
tfx compile fails:

tfx pipeline compile --engine=vertex --pipeline_path=minimal_vertex_runner.py
2023-09-28 14:00:05.243828: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-28 14:00:06.779336: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
CLI
Compiling pipeline
Traceback (most recent call last):
  File "/home/izakmarais/.pyenv/versions/minimaltfx/bin/tfx", line 8, in <module>
    sys.exit(cli_group())
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/click/decorators.py", line 92, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/tfx/tools/cli/commands/pipeline.py", line 315, in compile_pipeline
    handler_factory.create_handler(ctx.flags_dict).compile_pipeline()
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/tfx/tools/cli/handler/handler_factory.py", line 103, in create_handler
    from tfx.tools.cli.handler import vertex_handler  # pylint: disable=g-import-not-at-top
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/tfx/tools/cli/handler/vertex_handler.py", line 27, in <module>
    from tfx.tools.cli.handler import kubeflow_handler
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/tfx/tools/cli/handler/kubeflow_handler.py", line 26, in <module>
    from tfx.orchestration.kubeflow import kubeflow_dag_runner
  File "/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/tfx/orchestration/kubeflow/kubeflow_dag_runner.py", line 24, in <module>
    from kfp import gcp
ImportError: cannot import name 'gcp' from 'kfp' (/home/izakmarais/.pyenv/versions/3.8.16/envs/minimaltfx/lib/python3.8/site-packages/kfp/__init__.py)
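
The last frame of the traceback shows `from kfp import gcp` failing against the installed kfp 2.3.0, which suggests that this kfp release does not expose a `gcp` submodule. A minimal sketch for checking this in any environment follows; the helper name `can_import_from` is hypothetical and is not part of TFX or KFP.

```python
import importlib


def can_import_from(package: str, name: str) -> bool:
    """Return True if `from <package> import <name>` is expected to succeed."""
    try:
        module = importlib.import_module(package)
    except ImportError:
        return False  # the package itself is not installed
    if hasattr(module, name):
        return True  # name is already an attribute of the package
    try:
        # Fall back to importing it as a submodule, as `from ... import` does.
        importlib.import_module(f"{package}.{name}")
        return True
    except ImportError:
        return False


# Going by the traceback, this should print False in the environment above.
print(can_import_from("kfp", "gcp"))
```

The check only reports whether the import would work; it does not say which kfp version is installed.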

Describe the expected behavior
The pipeline should compile without errors.

Standalone code to reproduce the issue

  1. Create minimal_vertex_runner.py:
import tfx
from tfx.orchestration.kubeflow.v2 import kubeflow_v2_dag_runner


def create_pipeline():
    components = [
        tfx.extensions.google_cloud_big_query.BigQueryExampleGen(query="SELECT 1 as x")
    ]

    return tfx.dsl.Pipeline(
        pipeline_name="pipeline_name",
        pipeline_root="<PIPELINE_ROOT>",
        components=components,
        enable_cache=False,
        metadata_connection_config=None,
        beam_pipeline_args=[
            "--project=<GOOGLE_CLOUD_PROJECT>",
            "--temp_location=<TEMP_LOCATION>",
        ],
        ai_platform_training_args={
            "project": "<GOOGLE_CLOUD_PROJECT>",
            "region": "<GOOGLE_CLOUD_REGION>",
            "masterConfig": {
                "imageUri": "<PIPELINE_IMAGE>",
                "acceleratorConfig": {
                    "count": "1",
                    "type": "<GPU_TYPE>",
                },
            },
            "scaleTier": "CUSTOM",
            "masterType": "<MASTERTYPE>",
        },
    )


runner_config = kubeflow_v2_dag_runner.KubeflowV2DagRunnerConfig("<PIPELINE_IMAGE>")

PIPELINE_DEFINITION_FILE = "pipeline_name.json"
runner = kubeflow_v2_dag_runner.KubeflowV2DagRunner(
    config=runner_config, output_filename=PIPELINE_DEFINITION_FILE
)

runner.run(create_pipeline())
  2. Execute tfx pipeline compile --engine=vertex --pipeline_path=minimal_vertex_runner.py

Name of your Organization (Optional)
Takealot

@singhniraj08
Contributor

@IzakMaraisTAL,

I tried to reproduce this with an example notebook, and the tfx compile command runs successfully there.
Ref: Gist

Below is the setup I used when running the notebook. Can you try installing TFX and KFP with
!pip install --upgrade "tfx[kfp]<2" and then run the tfx compile command again? Please let us know if you face any issues. Thank you!

TensorFlow version: 2.13.1
TFX version: 1.14.0
KFP version: 1.8.22

@IzakMaraisTAL
Contributor Author

IzakMaraisTAL commented Oct 5, 2023

Thank you very much, that resolves the issue.

The key was constraining the version of kfp < 2.

(We build upon the official tensorflow/tfx image. It has tfx already installed but not kfp, so we need to install kfp separately to run tfx compile inside the image.)
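
The fix described here amounts to keeping kfp below major version 2 wherever tfx pipeline compile runs. As a rough sketch, a guard like the following could be run before compiling, for example as a step in an image build. The function names are hypothetical, and the parsing assumes a plain `MAJOR.MINOR.PATCH` version string.

```python
from importlib import metadata
from typing import Optional


def installed_kfp_version() -> Optional[str]:
    """Return the installed kfp version, or None if kfp is not installed."""
    try:
        return metadata.version("kfp")
    except metadata.PackageNotFoundError:
        return None


def kfp_is_compatible(version: Optional[str]) -> bool:
    """True when kfp is installed and its major version is below 2."""
    if version is None:
        return False
    major = int(version.split(".")[0])
    return major < 2


version = installed_kfp_version()
if kfp_is_compatible(version):
    print(f"kfp {version} satisfies kfp<2")
else:
    print(f"kfp {version} does not satisfy kfp<2; TFX 1.14.0 compile may fail")
```

With the versions from this thread, `kfp_is_compatible("1.8.22")` is true and `kfp_is_compatible("2.3.0")` is false.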

@github-actions

github-actions bot commented Oct 5, 2023

Are you satisfied with the resolution of your issue?
Yes
No
