Add indication of runtime into operation's environment #1668

Merged
merged 4 commits into from May 19, 2021

@kevin-bates kevin-bates commented May 13, 2021

This pull request adds an indication of the current runtime into the operation's environment via the variable ELYRA_RUNTIME_ENV. Although the discussion on the issue suggested this value be obtained from the runtime configuration schema, the value is actually a function of the processor implementation corresponding to the runtime and reflected in the abstract type property. I believe we should document that the value of PipelineProcessor.type should match the schema name of the runtime (as is the case so far with the exception of the special case "local runtime") and is the suggested value for the name of the entrypoint under which the processor implementation is registered. This way the three entities (runtime schema, pipeline processor implementation and corresponding entrypoint entry) are logically associated.

The current set of valid values is:

  • local: indicating the operation (or node) is being processed in the local runtime context
  • kfp: indicating the operation (or node) is being processed in the Kubeflow Pipelines context
  • airflow: indicating the operation (or node) is being processed in the Airflow context
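
As a minimal sketch of how a script executing in a node might consume this value: the helper name `detect_runtime` and the `"unknown"` fallback are illustrative assumptions, not part of Elyra; only the variable name `ELYRA_RUNTIME_ENV` and the three values come from this pull request.

```python
import os

# Values listed in this pull request; other processors may define their own.
KNOWN_RUNTIMES = {"local", "kfp", "airflow"}


def detect_runtime(environ=None) -> str:
    """Return the value of ELYRA_RUNTIME_ENV, or "unknown" if it is not set.

    Both this helper and the "unknown" fallback are hypothetical.
    """
    env = os.environ if environ is None else environ
    return env.get("ELYRA_RUNTIME_ENV", "unknown")


if __name__ == "__main__":
    runtime = detect_runtime()
    if runtime == "local":
        print("Executing in the local runtime context")
    elif runtime in KNOWN_RUNTIMES:
        print(f"Executing on a remote runtime: {runtime}")
    else:
        print(f"Runtime not recognized: {runtime}")
```

Passing a mapping to `detect_runtime()` instead of relying on `os.environ` keeps the helper easy to exercise in tests.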

These changes also consolidate the collection of environment variables, which helps with testing.

  • Extend the docs for users like Elyra component and processor authors. This addition should, at a minimum, associate a PipelineProcessor's implementation to its runtime schema name (via the type property) and document the "built-in" values which ELYRA_RUNTIME_ENV can have.

How was this pull request tested?

The approach for testing the 'local' case differs from that for the 'kfp' and 'airflow' cases.

For the 'local' test, the generated pipeline files were amended to assert that the environment variable (ELYRA_RUNTIME_ENV) is present and has a value of 'local' during the actual execution of the generated pipeline.

For the 'kfp' and 'airflow' scenarios, a test_collect_envs() test was added to each set of tests where the new _collect_envs() method was called on the respective processor, followed by appropriate assertions for the presence of other runtime-specific environment variables.
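
The shape of such a test can be sketched as follows. `StubProcessor`, the signature and body of its `_collect_envs()`, and the `RUNTIME_SPECIFIC_VAR` variable are stand-ins invented for illustration; the tests in this pull request call `_collect_envs()` on the actual kfp and airflow processors.

```python
from typing import Dict


class StubProcessor:
    """Hypothetical stand-in for a runtime pipeline processor."""

    type = "kfp"  # assumed to match the runtime schema name

    def _collect_envs(self, operation_env: Dict[str, str]) -> Dict[str, str]:
        envs = dict(operation_env)  # start from the operation's own variables
        envs["ELYRA_RUNTIME_ENV"] = self.type  # indicate the current runtime
        envs["RUNTIME_SPECIFIC_VAR"] = "value"  # placeholder for runtime-specific entries
        return envs


def test_collect_envs():
    envs = StubProcessor()._collect_envs({"USER_VAR": "1"})
    assert envs["ELYRA_RUNTIME_ENV"] == "kfp"
    assert envs["USER_VAR"] == "1"  # operation variables are preserved
    assert "RUNTIME_SPECIFIC_VAR" in envs
```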

Resolves #1663

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

@kevin-bates kevin-bates added the status:Work in Progress Development in progress. A PR tagged with this label is not review ready unless stated otherwise. label May 13, 2021

elyra-bot bot commented May 13, 2021

Thanks for making a pull request to Elyra!

To try out this branch on binder, follow this link: Binder

@kevin-bates kevin-bates marked this pull request as draft May 13, 2021 00:29
@kevin-bates kevin-bates removed the status:Work in Progress Development in progress. A PR tagged with this label is not review ready unless stated otherwise. label May 13, 2021
@kevin-bates kevin-bates marked this pull request as ready for review May 13, 2021 23:00
@kevin-bates
Member Author

Note: we need a better story for bringing your own schema than placing the custom schema into elyra's installation area under metadata/schemas.

Member

@lresende lresende left a comment

LGTM, there is still a question to @akchinSTC but otherwise good to go.

@kevin-bates
Member Author

Thanks @lresende.

Also, note that I did not remove the previous class-hierarchy image that is now unused. Since it is incomplete, I think it might be best to remove it - assuming you can generate an updated copy if needed. If agreed, I'll submit a commit to remove it.

@akchinSTC akchinSTC added this to the 2.3.0.beta2 milestone May 14, 2021
@akchinSTC akchinSTC self-requested a review May 14, 2021 16:56
@akchinSTC akchinSTC requested a review from ptitzler May 17, 2021 15:38
@@ -104,7 +104,17 @@ elyra-pipeline submit elyra-pipelines/demo-heterogeneous.pipeline \

The `runtime-config` should be a valid [runtime configuration](/user_guide/runtime-conf.md).

#### Detecting the runtime from a component
Member

We need to introduce the term component because afaik we haven't used it prior to 2.3.x in the context of pipeline artifacts. This is also needed because the Pipelines documentation topic starts with "Elyra utilizes its canvas component ", which is semantically completely different and can lead to confusion.

Member Author

Since we reference "nodes" I'll use that. Also changing 'running' to 'executing'. Thanks.

@staticmethod
def _collect_envs(operation: Operation) -> Dict:
    envs = os.environ.copy()  # Make sure this process's env is "available" in the kernel subprocess
    envs['ELYRA_RUNTIME_ENV'] = "local"  # Special case
Member

Wouldn't it be safer to reference the _type class variable

instead of hard-coding the value?

Member Author

_type is not on an OperationProcessor, which is where "local" runtimes perform env collection. I could pass this as a parameter, but it (at least today) will never be used outside of a local "runtime". Had we treated "local" as a full-fledged runtime (eh um), then we'd be able to use "type", as I do in RuntimePipelineProcessor._collect_envs().
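
The distinction described here can be sketched roughly as below. All class names prefixed with `Sketch` are hypothetical simplifications, not Elyra's actual classes, and the method signatures are reduced for illustration: a runtime processor can derive the value from its own `type` property, while the local operation processor has no such property and falls back to a literal.

```python
import os
from abc import ABC, abstractmethod
from typing import Dict


class SketchPipelineProcessor(ABC):
    """Hypothetical base class exposing an abstract `type` property."""

    @property
    @abstractmethod
    def type(self) -> str:
        ...


class SketchRuntimeProcessor(SketchPipelineProcessor):
    def _collect_envs(self) -> Dict[str, str]:
        # The runtime indicator comes from the processor's own type.
        return {"ELYRA_RUNTIME_ENV": self.type}


class SketchKfpProcessor(SketchRuntimeProcessor):
    _type = "kfp"  # assumed to match the runtime schema name

    @property
    def type(self) -> str:
        return self._type


class SketchLocalOperationProcessor:
    """Hypothetical local operation processor; it has no `type` to consult."""

    @staticmethod
    def _collect_envs() -> Dict[str, str]:
        envs = os.environ.copy()  # expose this process's env to the subprocess
        envs["ELYRA_RUNTIME_ENV"] = "local"  # special case, hard-coded
        return envs
```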

@ptitzler
Member

ptitzler commented May 17, 2021

User test for kfp/run, kfp/export/yaml, kfp/export/dsl, local/run, airflow/export, airflow/run yielded the expected results. Waiting for airflow test env to come back up before airflow/run can be tested.

@akchinSTC akchinSTC requested a review from ptitzler May 18, 2021 21:37
Member

@ptitzler ptitzler left a comment

LGTM
