Restructure pipeline JSON to prevent custom components from breaking #1882
Based on some offline discussions, we're updating the pipeline JSON sent from the frontend to the backend in the following way. The first example is a generic component and the second example is a runtime-specific component.
{
"id": "4fcb7b56-7450-454f-9f4a-be454847c6cf",
"type": "execution_node",
"op": "execute-notebook-node",
"app_data": {
"label": "",
- "filename": "elyra/my_pipeline/node1.ipynb",
- "runtime_image": "continuumio/anaconda3:2020.07",
- "cpu": "",
- "gpu": "",
- "memory": "",
- "outputs": [],
- "env_vars": [],
- "dependencies": [],
- "include_subdirectories": false
+ "component_parameters": {
+ "filename": "elyra/my_pipeline/node1.ipynb",
+ "runtime_image": "continuumio/anaconda3:2020.07",
+ "cpu": "",
+ "gpu": "",
+ "memory": "",
+ "outputs": [],
+ "env_vars": [],
+ "dependencies": [],
+ "include_subdirectories": false
+ },
"ui_data": {
"label": "node1.ipynb",
"description": "Notebook"
}
},
{
"id": "ca95fd8a-776b-43ae-a73e-e99906b80935",
"type": "execution_node",
"op": "serve-pytorch-model-seldon-core",
"app_data": {
"label": "",
- "model_id": "dddd",
- "deployment_name": "dddd",
- "model_class_name": "ModelClass",
- "model_class_file": "model_class.py",
- "serving_image": "aipipeline/seldon-pytorch:0.1",
+ "component_parameters": {
+ "model_id": "dddd",
+ "deployment_name": "dddd",
+ "model_class_name": "ModelClass",
+ "model_class_file": "model_class.py",
+ "serving_image": "aipipeline/seldon-pytorch:0.1"
+ },
"ui_data": {
"label": "Serve PyTorch Model - Seldon Core",
"description": "Serve PyTorch Models remotely as web service using Seldon Core"
}
}
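The restructuring shown in the two diffs above can be sketched as a small transformation over a single node. This is only an illustration, not the migration code from this PR: the function name `nest_component_parameters` and the set of keys that stay at the top level of `app_data` are assumptions read off the examples.

```python
from copy import deepcopy
from typing import Any, Dict

# Assumption: these keys stay directly under app_data (taken from the examples above).
RESERVED_APP_DATA_KEYS = {"label", "ui_data", "component_parameters"}


def nest_component_parameters(node: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `node` with loose app_data entries moved under component_parameters."""
    migrated = deepcopy(node)  # leave the caller's node untouched
    app_data = migrated.setdefault("app_data", {})
    params = app_data.setdefault("component_parameters", {})
    # Iterate over a snapshot of the keys because entries are popped inside the loop.
    for key in list(app_data.keys()):
        if key not in RESERVED_APP_DATA_KEYS:
            params[key] = app_data.pop(key)
    return migrated


# Usage with a trimmed version of the runtime-specific node above.
old_node = {
    "id": "ca95fd8a-776b-43ae-a73e-e99906b80935",
    "type": "execution_node",
    "op": "serve-pytorch-model-seldon-core",
    "app_data": {
        "label": "",
        "model_id": "dddd",
        "deployment_name": "dddd",
        "ui_data": {"label": "Serve PyTorch Model - Seldon Core"},
    },
}
new_node = nest_component_parameters(old_node)
```

Keeping the component's own parameters in a separate dictionary means a custom component can define a parameter named, say, `label` without colliding with the node-level field of the same name.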
Additionally, the properties JSON sent from the backend to the frontend for display on the canvas is changing in the following way. The This is to prevent collisions between our 'system-wide properties' and the properties given in the runtime-specific component. When displaying these properties and when sending them to the backend on pipeline submit/export, the |
@kiersten-stokes oh is "operation name" supposed to be the label? If so, you should check |
@kiersten-stokes and I just had a sidebar and have discovered that the pipeline being submitted in this particular test is invalid. Its version is listed as '4', yet it still holds an
"nodes": [
{
"id": "a6d8506d-2faf-46f1-8b9e-bee82895fc29",
"type": "execution_node",
"op": "execute-notebook-node",
"app_data": {
"component_parameters": {
"filename": "examples/data 2.ipynb",
"runtime_image": "amancevice/pandas:1.1.1",
"outputs": [],
"env_vars": [],
"dependencies": [],
"include_subdirectories": false
},
"label": "",
"ui_data": {
"label": "data 2.ipynb",
"image": "data:image/svg+xml;utf8,%3Csvg%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%20width%3D%2216%22%20viewBox%3D%220%200%2022%2022%22%3E%0A%20%20%3Cg%20class%3D%22jp-icon-warn0%20jp-icon-selectable%22%20fill%3D%22%23EF6C00%22%3E%0A%20%20%20%20%3Cpath%20d%3D%22M18.7%203.3v15.4H3.3V3.3h15.4m1.5-1.5H1.8v18.3h18.3l.1-18.3z%22%2F%3E%0A%20%20%20%20%3Cpath%20d%3D%22M16.5%2016.5l-5.4-4.3-5.6%204.3v-11h11z%22%2F%3E%0A%20%20%3C%2Fg%3E%0A%3C%2Fsvg%3E%0A",
"x_pos": 102,
"y_pos": 105.5,
"description": "Run notebook file"
}
},
We also discussed having the server go and dereference the Since the server must assume that all pipeline payloads are at the current version and all current versions have a value for There's also the distinct possibility that we're simply caught in the middle of evolving software and may need to ignore this until all the dust has settled. cc: @ajbozarth
value=default_value,
description=description,
control_id=control_id))
properties.append(ComponentParameter(ref=arg,
why are we calling this ref? and not id? I know this might be overriding things, etc but we should be consistent...
I have a local change to rename this back to id, but will not push before discussing it.
@@ -62,22 +79,22 @@ def __init__(self, component_registry_location: str, parser: ComponentParser):
def registry_location(self) -> str:
    return self._component_registry_location

- def get_all_components(self) -> List[Component]:
+ def get_all_components(self) -> Tuple[List[Component], List[Dict]]:
What is being returned here? Is this really the best design to implement?
Is this really the best design to implement?
This seems a little harsh - no? The reason this was changed to a tuple is the late-breaking discovery of a malformed palette that has existed since custom components were introduced into master a few weeks ago: #1882 (comment)
However, I didn't fully convey how best to implement this workaround. Now, after looking closer, the solution will be more involved due to the additional parsing that is required for components and the "adjusted_id", which appears to apply only to airflow components. That said, since only the CachedComponentRegistry is used in this "design", the callers get the correct result despite the tuple. I think what should happen is that the parser code necessary only in get_all_components() but not get_all_categories() should be moved into the read method so that it returns "final results". Then the base ComponentRegistry will just continue reading the file, once for each call to get_all_components() and get_all_categories(), while the CachedComponentRegistry will call the read method directly so that the file only gets read once (every 60 seconds). What is causing trouble with that "workaround" is the way get_component() works with its need to handle adjusted_id.
I do have a question: do we also need to handle adjusted_id in get_all_components()?
This is worth further discussion, preferably via a virtual call.
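The split proposed in this comment could look roughly like the sketch below. It is a simplified illustration, not Elyra's implementation: the `loader` callable stands in for reading the registry file, `_finalize` stands in for the component-only parsing, and the `_read` and `_cached_read` helper names, the `clock` parameter, and the dictionary shapes are all assumptions. The handling of adjusted_id is left out because the comment leaves that question open.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple

ReadResult = Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]


class ComponentRegistry:
    """Base registry: re-reads the catalog on every call (hypothetical simplification)."""

    def __init__(self, loader: Callable[[], Dict[str, Any]]):
        self._loader = loader  # stands in for opening and parsing the registry file

    def _read(self) -> ReadResult:
        """Read once and return "final results": (components, categories)."""
        raw = self._loader()
        components = [self._finalize(c) for c in raw.get("components", [])]
        categories = list(raw.get("categories", []))
        return components, categories

    @staticmethod
    def _finalize(component: Dict[str, Any]) -> Dict[str, Any]:
        # Placeholder for the parsing that only components need.
        return {**component, "parsed": True}

    def get_all_components(self) -> List[Dict[str, Any]]:
        return self._read()[0]

    def get_all_categories(self) -> List[Dict[str, Any]]:
        return self._read()[1]


class CachedComponentRegistry(ComponentRegistry):
    """Calls the read method directly and keeps its result for ttl_seconds."""

    def __init__(self, loader: Callable[[], Dict[str, Any]],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        super().__init__(loader)
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Optional[ReadResult] = None
        self._expires_at = 0.0

    def _cached_read(self) -> ReadResult:
        now = self._clock()
        if self._cache is None or now >= self._expires_at:
            self._cache = self._read()
            self._expires_at = now + self._ttl
        return self._cache

    def get_all_components(self) -> List[Dict[str, Any]]:
        return self._cached_read()[0]

    def get_all_categories(self) -> List[Dict[str, Any]]:
        return self._cached_read()[1]
```

With this shape, callers of either class receive a plain list and never have to unpack a tuple; the tuple stays an internal detail of the read method.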
Sorry, I didn't mean to be harsh, but the API seemed a little strange, and the unpacking requirement... wanted basically to start a discussion and agree we could handle it on a call (maybe as an item on the scrum one).
Left some comments, but want to understand the choice of design with the support for categories
Overall functionality looks good. I have tested some scenarios with the CLI and fixed an issue there, and have tested a few pipelines created in 2.2.4, migrated them, and run them locally and on kfp. I had a few questions above around some implementation details.
if not name:
    raise ValueError("Invalid component: Missing field 'name'.")

- self._ref = ref
+ self._ref = id
Shouldn't the attribute and property names be updated to reflect id rather than ref?
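For reference, consistent naming along the lines this comment asks for might look like the following. The class is a stripped-down, hypothetical stand-in for the component class under review: only the `name` check comes from the diff, while the `id` check and the two properties are assumptions.

```python
class Component:
    """Hypothetical, reduced component holding only an id and a name."""

    def __init__(self, id: str, name: str):
        if not id:
            # Assumed check, mirroring the 'name' validation shown in the diff.
            raise ValueError("Invalid component: Missing field 'id'.")
        if not name:
            raise ValueError("Invalid component: Missing field 'name'.")

        # Constructor argument, private attribute, and property all use "id".
        self._id = id
        self._name = name

    @property
    def id(self) -> str:
        return self._id

    @property
    def name(self) -> str:
        return self._name
```

The parameter shadows Python's built-in `id()` inside `__init__`, which is harmless here but is one reason some codebases prefer a name such as `component_id`.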
…1882) Co-authored-by: Kevin Bates <kevin.bates@us.ibm.com> Co-authored-by: Alex Bozarth <ajbozart@us.ibm.com> Co-authored-by: Nick Bourdakos <nicholas.bourdakos@ibm.com> Co-authored-by: Luciano Resende <lresende@us.ibm.com>
Fixes #1797
Fixes #1748
Fixes #1773
Fixes #1907
Fixes #1912
Fixes #1892
Fixes #1932
What changes were proposed in this pull request?
This PR will restructure the format of the pipeline JSON and how it is parsed on the backend. Frontend changes and changes to the palette/properties JSON that is sent to the frontend may also be required. Will definitely require changes to tests.
Removes the runtime-specific parameters component_source_type and runtime image from the UI, and changes the component_source parameter to a relative path if the component specification is filesystem-based.
Changes the component id of notebooks from notebooks to notebook for consistency.
Adds a GenericOperation subclass to the Operation class for generic components, which all have the same attributes/properties (filename, runtime image, env_vars, etc.).
How was this pull request tested?
Tests will have to be rewritten.
Acceptance criteria
Please validate by migrating and properly running:
Developer's Certificate of Origin 1.1