Executor Attribute Error when trying to run custom python function-based components with Kubeflow Runner #3628

Closed
bkhuong opened this issue Apr 27, 2021 · 6 comments

Comments

@bkhuong

bkhuong commented Apr 27, 2021

Environment: Google Cloud (Uploading pipeline yaml to Kubeflow Pipelines instance)
TFX Version: 0.29
Python version: 3.7.3

Describe the current behavior
I get the following error:


2021-04-27 23:41:30.195332: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/kubeflow/container_entrypoint.py", line 364, in <module>
    main()
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/kubeflow/container_entrypoint.py", line 319, in main
    component = json_utils.loads(args.serialized_component)
  File "/opt/conda/lib/python3.7/site-packages/tfx/utils/json_utils.py", line 193, in loads
    return json.loads(s, cls=_DefaultDecoder)
  File "/opt/conda/lib/python3.7/json/__init__.py", line 361, in loads
    return cls(**kw).decode(s)
  File "/opt/conda/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/conda/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "/opt/conda/lib/python3.7/site-packages/tfx/utils/json_utils.py", line 173, in _dict_to_object
    return _extract_class(dict_data)
  File "/opt/conda/lib/python3.7/site-packages/tfx/utils/json_utils.py", line 163, in _extract_class
    return getattr(importlib.import_module(module_name), class_name)
AttributeError: module '__main__' has no attribute 'PathGeneratorComponent_Executor'

Describe the expected behavior
The PathGeneratorComponent successfully outputs the string artifact "some string".

Standalone code to reproduce the issue

from tfx.dsl.component.experimental.annotations import OutputArtifact
from tfx.dsl.component.experimental.decorators import component
from tfx.types.standard_artifacts import String
from tfx.orchestration import pipeline
from tfx.orchestration.kubeflow import kubeflow_dag_runner

@component
def PathGeneratorComponent(
        output_path: OutputArtifact[String]
):
    output_path.value = "some string"


def create_pipeline():

    path_generator = PathGeneratorComponent()

    return pipeline.Pipeline(
        pipeline_name="test",
        pipeline_root="test",
        components=[path_generator]
    )

path_pipeline = create_pipeline()

config = kubeflow_dag_runner.KubeflowDagRunnerConfig(
    kubeflow_metadata_config=kubeflow_dag_runner.
        get_default_kubeflow_metadata_config(),
    tfx_image='tensorflow/tfx:0.29.0'
)

kfp_runner = kubeflow_dag_runner.KubeflowDagRunner(
    output_filename='test.yaml', config=config
)

kfp_runner.run(path_pipeline)
@ruoyu90
Contributor

ruoyu90 commented May 3, 2021

@charlesccychen any thoughts about the code packaging solution?

@charlesccychen
Contributor

Hi @bkhuong, we have an issue where it is not possible to use function-based components if they are defined in the same file as the pipeline, which we will clarify. Can you refactor your code so that the component is imported from a different module instead? See https://www.tensorflow.org/tfx/guide/custom_function_component for a working notebook example.
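
The failure mode can be reproduced without TFX. The sketch below is a simplified stand-in, not TFX's actual code: `extract_class` copies only the lookup line visible in the traceback (`getattr(importlib.import_module(module_name), class_name)`), and the module names `fake_entrypoint_main` and `my_components` are hypothetical. It assumes the serialized component records the executor's module name, which for a component defined in the pipeline script is `__main__`. Inside the container, `__main__` is `container_entrypoint.py`, which has no such attribute.

```python
import importlib
import sys
import types


def extract_class(module_name, class_name):
    """Resolve a class by module and name, as in the traceback's lookup."""
    return getattr(importlib.import_module(module_name), class_name)


class PathGeneratorComponent_Executor:
    """Stand-in for the executor class generated by @component."""


# Hypothetical module playing the role of the container's __main__
# (container_entrypoint.py): it does not define the executor class.
entrypoint_main = types.ModuleType("fake_entrypoint_main")
sys.modules["fake_entrypoint_main"] = entrypoint_main

# Hypothetical importable module that does define the executor class,
# as it would if the component lived in its own file.
components = types.ModuleType("my_components")
components.PathGeneratorComponent_Executor = PathGeneratorComponent_Executor
sys.modules["my_components"] = components


def can_resolve(module_name):
    """Return True if the executor class can be found in the module."""
    try:
        extract_class(module_name, "PathGeneratorComponent_Executor")
        return True
    except AttributeError:
        return False


if __name__ == "__main__":
    print(can_resolve("fake_entrypoint_main"))
    print(can_resolve("my_components"))
```

The first lookup fails with the same kind of `AttributeError` as in the report, and the second succeeds, which is why moving the component into its own importable module avoids the error.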

@codesue
Contributor

codesue commented Sep 9, 2021

I ran into the same issue and tried the workaround in the notebook example. It worked for InteractiveContext, but my pipeline failed in Kubeflow Pipelines with a ModuleNotFoundError. I fixed that by building a Docker image that includes my module and tfx deps and setting this image as the tfx_image in my config. Would the packaging solution mean not having to write the component in a different module and not having to build a new image?
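
As a rough analogy for the `ModuleNotFoundError` described above, the following sketch uses only the standard library. It assumes the error arises because the component's module is not present in the environment where the lookup runs; adding the module's directory to `sys.path` stands in for baking the module into the image. The module name `my_path_components` and the temporary directory are hypothetical.

```python
import importlib
import os
import sys
import tempfile

MODULE_NAME = "my_path_components"  # hypothetical module name


def try_import(name):
    """Return 'ok' if the module imports, else the error name."""
    try:
        importlib.import_module(name)
        return "ok"
    except ModuleNotFoundError:
        return "ModuleNotFoundError"


# Environment that does not contain the module (like the stock image).
before = try_import(MODULE_NAME)

# "Install" the module: write it to a directory and make that directory
# importable, analogous to including it in a custom image.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, MODULE_NAME + ".py"), "w") as f:
    f.write("class PathGeneratorComponent_Executor:\n    pass\n")
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

after = try_import(MODULE_NAME)

if __name__ == "__main__":
    print(before, after)
```

The import only succeeds once the module exists in the environment doing the import, which matches the observation that a custom image containing the module was needed.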

@gaikwadrahul8

Hi, @bkhuong

Apologies for the delay. I would suggest following the Python Function-Based Components section in this article. We always recommend using the latest versions of TFX and KFP; you can refer to the Compatibility Matrix here for TFX and KFP. You can also refer to these resources for custom components in TFX: [1], [2], [3], [4].

If you're looking for a complete end-to-end example of TFX with Kubeflow, you can look into this article.

I was able to run the same code without any error. For your reference, I have added a gist file here. Please use the latest versions of TFX and KFP with the commands below in Google Colab:

!pip install tfx
!pip3 install kfp --upgrade

If the above workaround works for you, could you please close this issue? If the issue still persists, please let us know.

Thank you!

@gaikwadrahul8

Hi, @bkhuong

Closing this issue due to lack of recent activity for a couple of weeks. Please feel free to reopen the issue or post comments if you need any further assistance or updates. Thank you!
