Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SDK/Components - Reworked the component model structures. #642

Conversation

Ark-kun
Copy link
Contributor

@Ark-kun Ark-kun commented Jan 7, 2019

  • Rewrote parsing, type checking and serialization code.
  • Improved the graph component structures.
  • Added most of the needed k8s structures.
  • Added model validation (input/output existence etc).
  • Added task cycle detection and topological sorting to GraphSpec.
  • All container component tests now work.
  • Added some graph component tests.

This change is Reviewable

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Jan 7, 2019

/assign @qimingj @gaoning777

@Ark-kun Ark-kun force-pushed the SDK/Components---Refactored-component-model-structures branch from 478a22c to a23ad59 Compare January 7, 2019 08:52
Copy link
Contributor

@qimingj qimingj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sending a few comments now. Will finish reviewing the rest later.

sdk/python/kfp/components/_structures.py Show resolved Hide resolved
sdk/python/kfp/components/_components.py Outdated Show resolved Hide resolved
sdk/python/kfp/components/_structures.py Outdated Show resolved Hide resolved
Ark-kun and others added 6 commits January 7, 2019 18:49
Rewrote parsing, type checking and serialization code.
Improved the graph component structures.
Added most of the needed k8s structures.
Added model validation (input/output existence etc).
Added task cycle detection and topological sorting to GraphSpec.
All container component tests now work.
Added some graph component tests.
@Ark-kun Ark-kun force-pushed the SDK/Components---Refactored-component-model-structures branch from d622f60 to d4c7ccb Compare January 8, 2019 02:55
Copy link
Contributor

@qimingj qimingj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ark-kun, I have a few high level questions and need your help to clarify.

Also, the change is fairly big. It includes some separable features such as K8s wrapper, ModelBase class, etc. It would be easier for reviewers if it is separated into multiple changes.

sdk/python/kfp/components/_structures.py Show resolved Hide resolved
sdk/python/kfp/components/modelbase.py Outdated Show resolved Hide resolved
sdk/python/kfp/components/modelbase.py Show resolved Hide resolved
sdk/python/kfp/components/_structures.py Show resolved Hide resolved
@Ark-kun
Copy link
Contributor Author

Ark-kun commented Jan 8, 2019

Also, the change is fairly big. It includes some separable features such as K8s wrapper, ModelBase class, etc. It would be easier for reviewers if it is separated into multiple changes.

ModelBase

I've refactored/extracted the boilerplate serialization/deserialization code into that base class. All object model structures depend on it, so it must be included.

K8s wrapper

The k8s structures are used by the Task class. I got lots of comments from our teams about the need for full kubernetes support, so it made sense to add those structures too. I could separate this one, but I did not want the tasks to lack volumeMount support that is essential for all our sample pipelines.

etc

There are also code changes to use the new structures and unit tests. Without those small code changes in component_builder the code will just be broken.

As for the unit tests, I consider them to be a positive addition and they also demonstrate the usage for the classes that were not properly used and tested before.

I might have omitted the validation code that fixed the previously skipped component tests though.

I'll try making the PRs smaller in future.

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Jan 9, 2019

/test kubeflow-pipeline-e2e-test

sdk/python/kfp/components/modelbase.py Outdated Show resolved Hide resolved
sdk/python/kfp/components/modelbase.py Show resolved Hide resolved
sdk/python/kfp/components/modelbase.py Outdated Show resolved Hide resolved
sdk/python/kfp/components/modelbase.py Show resolved Hide resolved
@qimingj
Copy link
Contributor

qimingj commented Jan 9, 2019

A few high level comments:

  1. I am concerned about doing Python arguments type checking, serialization and deserialization. It feels like a low level Python work and by itself can be a big project. I am not sure we need to own those code in the long run. I think we might be able to leverage an existing libraries at least for part of the work:
  • Schema validation against serialized yaml/json instead of doing type checking in code.
  • Once type checking is moved out, it might be possible to simplify the serialization and deserialization and even use an existing library which supports custom serialization.

But it seems you have explored all options and nothing works to your satisfaction (e.g. existing serialization/deserialization doesn't work well with Union types).

  1. I hope we can still reuse kubernetes package for the objects since it is a long list. But if not, given we already exposed parameters as Kubernetes objects in kubernetes python package, we need to continue supporting them.

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Jan 9, 2019

I am concerned about doing Python arguments type checking, serialization and deserialization. It feels like a low level Python work and by itself can be a big project.

Good thing I managed to make this big project a reality. Now we can benefit from a good serialization and deserialization library for Python which we control. We do not need to wait years for fixes to the existing libraries and we get all the features that we need. In future we can make this a separate OSS library.

Once type checking is moved out, it might be possible to simplify the serialization and deserialization

As I showed you offline, removing the few if ...: raise TypeError lines wont dramatically check a function that already fits a single screen. Serialization function won't shrink by a single line.

I hope we can still reuse kubernetes package for the objects since it is a long list.

Same thing. I hope that in future the kubernetes package becomes useful for this kind of job. I'm adding some fixes to the python-swagger code generator so maybe in some months there will be a slightly fixed kubernetes python package.

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Jan 9, 2019

Gentle ping.
I think I resolved all concerns about this data classes refactoring.

Copy link
Contributor

@qimingj qimingj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm


class InputSpec(ModelBase):
'''Describes the component input specification'''
def __init__(self,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add TODO for making it a data class in the future.

if output_name not in self._outputs_dict:
raise TypeError('Unconfigurable output entry "{}" references non-existing output.'.format({output_name: path}))

def verify_arg(arg):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: functions inside a function and an if block is harder to read.

obj41 = TestModel1.from_struct(struct41)
self.assertEqual(obj41.prop_4['val 4'], val4)
self.assertDictEqual(obj41.to_struct(), struct41)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: one space

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Jan 9, 2019

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ark-kun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

1 similar comment
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ark-kun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 83e9ffe into kubeflow:master Jan 9, 2019
@Ark-kun Ark-kun deleted the SDK/Components---Refactored-component-model-structures branch January 21, 2019 11:16
Linchin pushed a commit to Linchin/pipelines that referenced this pull request Apr 11, 2023
…ces (kubeflow#637)" (kubeflow#642)

Partial revert of commit afd9f95.

The change in the self-signed logic breaks presubmits with the following error

E           kubernetes.client.rest.ApiException: (409)
E           Reason: Conflict
E           HTTP response headers: HTTPHeaderDict({'Audit-Id': '85305103-054e-4795-a45e-f802de1ae4ea', 'Content-Type': 'application/json', 'Date': 'Wed, 15 Apr 2020 00:31:03 GMT', 'Content-Length': '352'})
E           HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Operation cannot be fulfilled on ingresses.extensions \"envoy-ingress\": the object has been modified; please apply your changes to the latest version and try again","reason":"Conflict","details":{"name":"envoy-ingress","group":"extensions","kind":"ingresses"},"code":409}

* Only changes to util.py in the original PR are reverted.
HumairAK pushed a commit to red-hat-data-services/data-science-pipelines that referenced this pull request Mar 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants