CRD for scheduling Argo workflows (API spec) #2

vicaire · 2018-05-24T01:21:21Z

We named this CRD a "schedule": It schedules workflows for now but we will add support for other K8 resources (e.g., jobs). It supports:

executing pipelines using a cron schedule
executing pipelines using a periodic interval
An optional start date and an end date.
Backfilling (by specifying a start date in the past).
A maximum number of concurrent workflows.

Most of the files are automatically generated except for the files in:

./hack/custom-boilerplate.go.txt
./hack/update-codegen.sh
./hack/verify-codegen.sh
./pkg/apis/schedule/register.go
./pkg/apis/schedule/v1alpha1/doc.go
./pkg/apis/schedule/v1alpha1/register.go
./pkg/apis/schedule/v1alpha1/types.go

This change is

We named this CRD a "schedule": It schedules workflows for now but we will add support for other K8 resources (e.g., jobs). It supports: - executing pipelines using a cron schedule - executing pipelines using a periodic interval - An optional start date and an end date. - Backfilling (by specifying a start date in the past). - A maximum number of concurrent workflows.

jlewi · 2018-05-24T03:46:12Z

/cc @jessesuen

k8s-ci-robot · 2018-05-24T03:46:13Z

@jlewi: GitHub didn't allow me to request PR reviews from the following users: jessesuen.

Note that only kubeflow members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @jessesuen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

jlewi · 2018-05-24T04:09:47Z

Review status: 0 of 33 files reviewed at latest revision, all discussions resolved.

pkg/apis/schedule/register.go, line 18 at r1 (raw file):

const (
	Kind      string = "Schedule"

"Schedule" seems like an incomplete name? Why not WorkflowSchedule since you use Workflow in other places (e.g. the Type).

pkg/apis/schedule/v1alpha1/types.go, line 27 at r1 (raw file):

// Schedule phases
const (

I believe phases have been dropped in favor of conditions.
kubeflow/training-operator#223

pkg/apis/schedule/v1alpha1/types.go, line 62 at r1 (raw file):

type

What's meant by human readable? Does this mean its not used for anything?

pkg/apis/schedule/v1alpha1/types.go, line 65 at r1 (raw file):

	// If the schedule is disabled, it does not create any new workflow.
	Enabled bool `json:"enabled,omitempty"`

Why would you enable/disable the schedule as opposed to just deleting it to disable it and creating it to enable it?

pkg/apis/schedule/v1alpha1/types.go, line 79 at r1 (raw file):

	// MaxHistory cannot be larger than 100.
	// +optional
	MaxHistory *int64 `json:"maxHistory,omitempty"`

Why is there a limit on history? Whose keeping track of history?

pkg/apis/schedule/v1alpha1/types.go, line 97 at r1 (raw file):

	// [[ScheduledTime]] is substituted by the scheduled time of the workflow (default format)
	// [[CurrentTime]] is substituted by the current time (default format)
	// [[Index]] is substituted by the index of the workflow (e.g. 3 mins that it was the 3rd workflow created)

Why are you exposing Index? Why would operation/parameterization of a workflow depend on this?

pkg/apis/schedule/v1alpha1/types.go, line 124 at r1 (raw file):

	// Create workflows periodically.
	PeriodicSchedule *PeriodicSchedule `json:"periodicSchedule,omitempty"`

Do you really need PeriodicSchedule? Is CronSchedule not sufficient?

pkg/apis/schedule/v1alpha1/types.go, line 134 at r1 (raw file):

not

no end time

pkg/apis/schedule/v1alpha1/types.go, line 207 at r1 (raw file):

	// URL representing this object.
	SelfLink string `json:"selfLink,omitempty"`

What is this for?

pkg/apis/schedule/v1alpha1/types.go, line 213 at r1 (raw file):

	// Phase is a high level summary of the status of the workflow.
	Phase v1alpha1.NodePhase `json:phase,omitempty`

Why do you have phase? Why not just conditions as noted above?

Comments from Reviewable

vicaire · 2018-05-25T02:56:48Z

Review status: 0 of 45 files reviewed at latest revision, 10 unresolved discussions, all commit checks successful.

pkg/apis/schedule/register.go, line 18 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

"Schedule" seems like an incomplete name? Why not WorkflowSchedule since you use Workflow in other places (e.g. the Type).

Done. Renamed to ScheduledWorkflow as we discussed.

pkg/apis/schedule/v1alpha1/types.go, line 27 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

I believe phases have been dropped in favor of conditions.
kubeflow/training-operator#223

Done. Switched to using conditions.

pkg/apis/schedule/v1alpha1/types.go, line 62 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

type
What's meant by human readable? Does this mean its not used for anything?

Removing it. I don't think it belongs in a CRD spec.

pkg/apis/schedule/v1alpha1/types.go, line 65 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

Why would you enable/disable the schedule as opposed to just deleting it to disable it and creating it to enable it?

This is needed for the following use case:

Say you create a ScheduledWorkflow to run from now to the end of time, every hour.
If you delete the workflow and re-create it, it will restart scheduling based on the time at which the ScheduledWorkflow was re-created.
If you disable the workflow and re-enable it, it will continue starting from the time at which it was disabled.
Said another way, the enable/disable option backfills the missing hours.

Here is another example:

Say you are backfilling. You need to run a workflow for every hour of the previous day (the Argo workflow is parameterized by the scheduled time).
If you delete/re-create, it will restart from scratch.
If you disable it will temporarily stop creating new workflows. When you re-enable, it will continue where it left off.

This is similar to the backfill functionality of Airflow.

pkg/apis/schedule/v1alpha1/types.go, line 79 at r1 (raw file):

Why is there a limit on history? Whose keeping track of history?
The ScheduledWorkflowStatus.WorkflowHistory structure is providing a list of recently executed workflow (both active and completed)

This is very convenient when using kubectl so that the user does not have to query the list of workflows separately with the correct labels.

This is similar to the way the Argo resource returns the graph of PODs so that users can easily investigate the ones that failed.

We need to limit the size of the history returned by the "kubectl get" call since ScheduledWorkflows may run forever

In the future, we may also decide to garbage collect the workflows that are completed, and only keep the X most recent ones in the K8 key value store. This is what the CronJob does: https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/batch/types.go#L276

pkg/apis/schedule/v1alpha1/types.go, line 97 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

Why are you exposing Index? Why would operation/parameterization of a workflow depend on this?

This is one feature explicitly requested by users working on ML workflows.

They want a value, substituted in the parameters of the workflow, that is unique, short, and indicating the order in which the workflows executed.

A unique time can be very verbose: 20180504121645

A UUID is way too long.

This index (suffixed to the workflow name in the Argo workflow spec), is ideal for these users. They use it to generate unique directories to store data in GCS for instance.

pkg/apis/schedule/v1alpha1/types.go, line 124 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

Do you really need PeriodicSchedule? Is CronSchedule not sufficient?

We only provided Cron at first, but discovered an issue.

Cron does not let you schedule a job, for instance, every 5 hours.

It can handle every 6 hours: 0, 6, 12, 18, 0, 6, 12

But if you try for every 5 hours, you end up with: 0, 5, 10, 15, 20, <--- NOT 5 HOURS --> 0, 5, 10

This may be why Airflow supports both Cron and an interval as well: https://airflow.apache.org/scheduler.html#scheduling-triggers

pkg/apis/schedule/v1alpha1/types.go, line 134 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

not
no end time

Fixed.

pkg/apis/schedule/v1alpha1/types.go, line 207 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

What is this for?

These are fields copied from the Argo status for convenience (and so that we can have a fancy looking CLI like Argo).

The ScheduledWorkflowStatus provides a WorkflowHistory object.

The WorkflowHistory object has a list of WorkflowStatuses for active Argo Workflows.

The Workflow Status contains fields copied directly from the Argo status (except for the scheduledTime).

pkg/apis/schedule/v1alpha1/types.go, line 213 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

Why do you have phase? Why not just conditions as noted above?

The field is copied from Argo so we cannot change the type. It is very convenient for the user to have "kubectl get ScheduledWorkflow" return a list of recently executed workflows and their status.

I changed to a Condition above for our own CRD.

Comments from Reviewable

jlewi · 2018-05-25T06:10:16Z

Review status: 0 of 45 files reviewed at latest revision, 4 unresolved discussions.

pkg/apis/schedule/v1alpha1/types.go, line 27 at r1 (raw file):

Previously, vicaire (Pascal Vicaire) wrote…

Done. Switched to using conditions.

Done.

pkg/apis/schedule/v1alpha1/types.go, line 62 at r1 (raw file):

Previously, vicaire (Pascal Vicaire) wrote…

Removing it. I don't think it belongs in a CRD spec.

Done.

pkg/apis/schedule/v1alpha1/types.go, line 65 at r1 (raw file):

Previously, vicaire (Pascal Vicaire) wrote…

This is needed for the following use case:

Say you create a ScheduledWorkflow to run from now to the end of time, every hour.

If you delete the workflow and re-create it, it will restart scheduling based on the time at which the ScheduledWorkflow was re-created.

If you disable the workflow and re-enable it, it will continue starting from the time at which it was disabled.
Said another way, the enable/disable option backfills the missing hours.

Here is another example:

Say you are backfilling. You need to run a workflow for every hour of the previous day (the Argo workflow is parameterized by the scheduled time).

If you delete/re-create, it will restart from scratch.

If you disable it will temporarily stop creating new workflows. When you re-enable, it will continue where it left off.

This is similar to the backfill functionality of Airflow.

But I thought the whole point of the CRD and the database was that you are persisting actual job runs to a database.

So when you recreate it you should check the database for existing runs.

I think this goes to the point about the CRD being edge triggered vs. level triggered
https://hackernoon.com/level-triggering-and-reconciliation-in-kubernetes-1f17fe30333d

Suppose a ScheduledWorkflow has a schedule of
Daily at 7am
Starting on May1

If you're level triggered then on May 10 we know we want the following Workflows to exist
May 1 7am
May 2 7am
....
May 10 7am

It doesn't matter who/when/or how those runs were created. If any of them are missing we know that we should go and start them.

Thus, it doesn't matter whether the schedule was deleted and then recreated.

I think your implementation is currently edge triggered. You are relying on observing transitions from one time to another.

I think this is going to be more brittle.

If you with a level design then I don't think you need the on/off bit.

pkg/apis/schedule/v1alpha1/types.go, line 97 at r1 (raw file):

Previously, vicaire (Pascal Vicaire) wrote…

This is one feature explicitly requested by users working on ML workflows.

They want a value, substituted in the parameters of the workflow, that is unique, short, and indicating the order in which the workflows executed.

A unique time can be very verbose: 20180504121645

A UUID is way too long.

This index (suffixed to the workflow name in the Argo workflow spec), is ideal for these users. They use it to generate unique directories to store data in GCS for instance.

Why not use a random number? Or it least make it look random?

If you return a monotonically increasing number users will likely start to make inferences about what that index means and that will limit your implementation.

e.g. users will likely make assumptions like the numbers should be continuous and that they should match the trigger times of the runs.

Comments from Reviewable

vicaire · 2018-05-25T16:26:20Z

Review status: 0 of 45 files reviewed at latest revision, 4 unresolved discussions.

pkg/apis/schedule/v1alpha1/types.go, line 65 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

But I thought the whole point of the CRD and the database was that you are persisting actual job runs to a database.

So when you recreate it you should check the database for existing runs.

I think this goes to the point about the CRD being edge triggered vs. level triggered
https://hackernoon.com/level-triggering-and-reconciliation-in-kubernetes-1f17fe30333d

Suppose a ScheduledWorkflow has a schedule of
Daily at 7am
Starting on May1

If you're level triggered then on May 10 we know we want the following Workflows to exist
May 1 7am
May 2 7am
....
May 10 7am

It doesn't matter who/when/or how those runs were created. If any of them are missing we know that we should go and start them.

Thus, it doesn't matter whether the schedule was deleted and then recreated.

I think your implementation is currently edge triggered. You are relying on observing transitions from one time to another.

I think this is going to be more brittle.

If you with a level design then I don't think you need the on/off bit.

We want this implementation to be independent of the DB.

Additionally, we need it to scale. The user could have a ScheduledWorkflow running for 3 years for every hour. As time progresses, verifying the complete list would become less and less efficient and slow down the controller (particularly if many instances of ScheduledWorkflow are running).

Finally, the workflow resources may be garbage collected (they should be). If the ScheduledWorkflow runs for 3 years, I don't think the correct functioning of this CRD should require that all the workflow objects be preserved in the K8 key-value store.

This is why we implemented this in a way similar to CronJob, keeping track only of scheduledTime of the latest workflow.

We can implement something to address your use case later on, but that does not seem to fit this CRD very well. I would see it more as a CRD that executes workflows based on a list of possible values for the parameters of the workflow, where these values are times. The number of items in the list would have to be limited.

pkg/apis/schedule/v1alpha1/types.go, line 97 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

Why not use a random number? Or it least make it look random?

If you return a monotonically increasing number users will likely start to make inferences about what that index means and that will limit your implementation.

e.g. users will likely make assumptions like the numbers should be continuous and that they should match the trigger times of the runs.

Returning a monotonically increasing number is here the requirement.

Say the user creates a unique directory for each run of the workflow. They want the directories to be:

my-directory-1
my-directory-2
my-directory-3
etc.

We can easily add an substitution for a predictable random looking number (based on a hash). But we do need the monotically increasing, guaranteed to be unique index.

Comments from Reviewable

jlewi · 2018-05-25T22:33:10Z

Review status: 0 of 45 files reviewed at latest revision, 4 unresolved discussions.

pkg/apis/schedule/v1alpha1/types.go, line 65 at r1 (raw file):

Previously, vicaire (Pascal Vicaire) wrote…

We want this implementation to be independent of the DB.

Additionally, we need it to scale. The user could have a ScheduledWorkflow running for 3 years for every hour. As time progresses, verifying the complete list would become less and less efficient and slow down the controller (particularly if many instances of ScheduledWorkflow are running).

Finally, the workflow resources may be garbage collected (they should be). If the ScheduledWorkflow runs for 3 years, I don't think the correct functioning of this CRD should require that all the workflow objects be preserved in the K8 key-value store.

This is why we implemented this in a way similar to CronJob, keeping track only of scheduledTime of the latest workflow.

We can implement something to address your use case later on, but that does not seem to fit this CRD very well. I would see it more as a CRD that executes workflows based on a list of possible values for the parameters of the workflow, where these values are times. The number of items in the list would have to be limited.

As discussed offline the real issue here is making backfilling reliable. Right now we update last scheduled time as soon as a new workflow is started. But we shouldn't update it until that workflow completed (successfully or failed) so that controller can ensure work was actually completed.

As discussed offline please file an issue address this in a future PR.

pkg/apis/schedule/v1alpha1/types.go, line 97 at r1 (raw file):

Previously, vicaire (Pascal Vicaire) wrote…

Returning a monotonically increasing number is here the requirement.

Say the user creates a unique directory for each run of the workflow. They want the directories to be:

my-directory-1

my-directory-2

my-directory-3

etc.

We can easily add an substitution for a predictable random looking number (based on a hash). But we do need the monotically increasing, guaranteed to be unique index.

SGTM

Comments from Reviewable

jlewi · 2018-05-25T22:33:38Z

/lgtm
/approve

Please file an issue to track making backfilling reliable.

k8s-ci-robot · 2018-05-25T22:33:39Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jlewi]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

vicaire · 2018-05-26T00:31:40Z

Issue filed to make backfills reliable:

#5

* add metric sample code in roc component * Change mlpipeline artifacts output path to tmp folder * Revert /tmp path change and add output_dir in notebook sample * Revert /tmp path #2

* add metric sample code in roc component * Change mlpipeline artifacts output path to tmp folder * Revert /tmp path change and add output_dir in notebook sample * Revert /tmp path kubeflow#2

* initial watson pipeline * updates * updates * Update creds * updated dockerhub base files * update watson pipeline (#2) Removing references to wml-base image, and recreating it with python-slim as base image * address review comment (#3) * change s3 to cos (#4)

* initial watson pipeline * updates * updates * Update creds * updated dockerhub base files * update watson pipeline (kubeflow#2) Removing references to wml-base image, and recreating it with python-slim as base image * address review comment (kubeflow#3) * change s3 to cos (kubeflow#4)

� This is the 1st commit message: Move argo installation to docker file. � This is the commit message #2: Remove TODO � This is the commit message #3: Fix test.

� This is the 1st commit message: Move argo installation to docker file. � This is the commit message #2: Remove TODO � This is the commit message #3: Fix test. � This is the commit message #4: Fix test.

# This is the 1st commit message: Add initial scripts # This is the commit message kubeflow#2: Add working pytest script # This is the commit message kubeflow#3: Add initial scripts # This is the commit message kubeflow#4: Add environment variable files # This is the commit message kubeflow#5: Remove old cluster script

* # This is a combination of 5 commits. # This is the 1st commit message: Add initial scripts # This is the commit message #2: Add working pytest script # This is the commit message #3: Add initial scripts # This is the commit message #4: Add environment variable files # This is the commit message #5: Remove old cluster script * Add initial scripts Add working pytest script Add initial scripts Add environment variable files Remove old cluster script Update pipeline credentials to OIDC Add initial scripts Add working pytest script Add initial scripts Add working pytest script * Remove debugging mark * Update example EKS cluster name * Remove quiet from Docker build * Manually pass env * Update env list vars as string * Update use array directly * Update variable array to export * Update to using read for splitting * Move to helper script * Update export from CodeBuild * Add wait for minio * Update kubectl wait timeout * Update minor changes for PR * Update integration test buildspec to quiet build * Add region to delete EKS * Add wait for pods * Updated README * Add fixed interval wait * Fix CodeBuild step order * Add file lock for experiment ID * Fix missing pytest parameter * Update run create only once * Add filelock to conda env * Update experiment name ensuring creation each time * Add try/catch with create experiment * Remove caching from KFP deployment * Remove disable KFP caching * Move .gitignore changes to inside component * Add blank line to default .gitignore

# This is the 1st commit message: parent 1551e34 author Dustin Luong <dusluong@amazon.com> 1593625846 -0700 committer Dustin Luong <dusluong@amazon.com> 1595961236 -0700 Unit test passes Updated unit tests, working on integration tests Updated READMEs, Logged rules status and errors to UI squash a94f809 Unfinished sample pipeline for debugger squash 8128572 Edge case: empty rules config Edge case: empty rules config Cleaned up example pipeline and fixed empty case of debug rules Finished integration test, debug rule logs only print after training has completed, refactored various code Unhardcoded rules registry New sample pipeline, minor changes to utils Refactored wait_for_debug_rules, added unit tests, updated readme for debugger demo, fixed typos and small errors rm .gz Changed defaults for train.template.yaml, updated example pipeline, removed exceptions from utils which are handled by boto3 removed trust.json minor clean ups Minor fixes Refactored code to incorporate changes from design review, notably removing collection_config # This is the commit message kubeflow#2: custom_rules.py # This is the commit message kubeflow#3: Added tensorboard # This is the commit message kubeflow#4: restored run_integration_tests # This is the commit message kubeflow#5: removed custom rule

* # This is a combination of 5 commits. # This is the 1st commit message: Add initial scripts # This is the commit message kubeflow#2: Add working pytest script # This is the commit message kubeflow#3: Add initial scripts # This is the commit message kubeflow#4: Add environment variable files # This is the commit message kubeflow#5: Remove old cluster script * Add initial scripts Add working pytest script Add initial scripts Add environment variable files Remove old cluster script Update pipeline credentials to OIDC Add initial scripts Add working pytest script Add initial scripts Add working pytest script * Remove debugging mark * Update example EKS cluster name * Remove quiet from Docker build * Manually pass env * Update env list vars as string * Update use array directly * Update variable array to export * Update to using read for splitting * Move to helper script * Update export from CodeBuild * Add wait for minio * Update kubectl wait timeout * Update minor changes for PR * Update integration test buildspec to quiet build * Add region to delete EKS * Add wait for pods * Updated README * Add fixed interval wait * Fix CodeBuild step order * Add file lock for experiment ID * Fix missing pytest parameter * Update run create only once * Add filelock to conda env * Update experiment name ensuring creation each time * Add try/catch with create experiment * Remove caching from KFP deployment * Remove disable KFP caching * Move .gitignore changes to inside component * Add blank line to default .gitignore

Added namespaced pipelines, with UI and API changes, as well as the ability to share pipelines. Fixing API Tests. npm formatting end 2 end tests fixes No Shared Pipelines Created name_namespace_index This solves for quite a big bug ... dropping index on pipelines.name there will be a composite index on name and namespace Kubeflow PIpelines Multi-User Backend fixing ui tests fixed formatting issues formatting code Adding KFP SDK changes correcting test cases Fixes for integration tests (single-user) npm formatting kubeflow#2 corrected more test cases new test cases & code quality correcting nits npm format go mod tidy added update to view-edit role

change argocompiler dependency to my own repo

* # This is a combination of 6 commits. # This is the 1st commit message: Modify agent test case for code coverage (kubeflow#1849) * Modifies the test case for sync models config Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> # This is the commit message kubeflow#2: Add test cases for agent storage utils (kubeflow#1849) * Add test case for FileExists function * Add test case for RemoveDir function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> # This is the commit message kubeflow#3: Add test case for agent storage utils * Add test case for GetProvider function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> # This is the commit message kubeflow#4: Add test case for gcs model downloader (kubeflow#1849) * Add test case for gcs model downloader in agent Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> # This is the commit message kubeflow#5: Add test cases for agent downloader Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> # This is the commit message kubeflow#6: Add test cases for configmap (kubeflow#1849) * Add test cases for v1beta1 configmap Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * Add test cases for inference service defaults (kubeflow#1849) * Add test cases for all model runtimes * Add test cases for all runtime defaults Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * fmt (kubeflow#1849) Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * Modify agent test case for code coverage (kubeflow#1849) * Modifies the test case for sync models config Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for agent storage utils (kubeflow#1849) * Add test case for FileExists function * Add test case for RemoveDir function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test case for agent storage utils * Add test case for GetProvider function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test case for gcs model downloader (kubeflow#1849) * Add test case for gcs model downloader in agent Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for agent downloader Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for configmap (kubeflow#1849) * Add test cases for v1beta1 configmap Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for inference service defaults (kubeflow#1849) * Add test cases for all model runtimes * Add test cases for all runtime defaults Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for predictor model (kubeflow#1849) * Add test cases for isFrameworkSupported function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for sklearn predictor (kubeflow#1849) * Add test cases for GetProtocol function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for configmap (kubeflow#1849) * Add test case for creating empty model config Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for utils (kubeflow#1849) * Add test cases for IncludesArg function * Add test cases for IsGPUEnabled function * Add test cases for FirstNonNilError function * Add test cases for RemoveString function * Add test cases for IsPrefixSupported function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * Add test cases for creds_utils (kubeflow#1849) * Add test cases for set_gcs_credentials function * Add test cases for create_secret function * Add test cases for set_service_account function * Add test cases for create_service_account function * Add test cases for patch_service_account function * Add test cases for get_creds_name_from_config_map function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * Add test cases for creds_utils (kubeflow#1849) * Add test cases for set_gcs_credentials function * Add test cases for create_secret function * Add test cases for set_service_account function * Add test cases for create_service_account function * Add test cases for patch_service_account function * Add test cases for get_creds_name_from_config_map function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for creds_utils (kubeflow#1849) * Add test cases for set_s3_credentials function * Add test cases for set_azure_credentials function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for v1beta1 component (kubeflow#1849) * Add test cases for validateStorageSpec function * Add test cases for validateLogger function * Add test cases for FirstNonNilComponent function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for inference service status (kubeflow#1849) * Add test cases for PropagateRawStatus function * Add test cases for PropagateStatus function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for predictor (kubeflow#1849) * Add test cases for GetPredictorImplementations function * Add test cases for GetPredictorImplementation function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for agent_injector (kubeflow#1849) * Add test cases for getLoggerConfigs function * Add test cases for getAgentConfigs function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for batcher_injector (kubeflow#1849) * Add test cases for getBatcherConfigs function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * Add test cases for storage initializer injector (kubeflow#1849) * Add test cases for getStorageInitializerConfigs function * Add test cases for parsePvcUri function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for storage initializer injector (kubeflow#1849) * Add test cases for parsePvcUri function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> fmt (kubeflow#1849) Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Add test cases for controller utils (kubeflow#1849) * Add test cases for GetDeploymentMode function Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Remove double import of same package (kubeflow#1849) Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> temp commit Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Updated coverage for inference_service_default_test Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Added scripts for code coverage Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Updated make to track coverage including subpackages Added more coverage Added ignore to client package - generated code Added coverage script to workflow Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * Temporarily commenting couple of test cases Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Temporary changes to debug e2e Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Commented configmap test Reverted accidental commit of generated code. Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Added -v to debug failing tests Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Updated tests to remove dependency on k8s cluster Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * Updated readme to show coverage

k8s-ci-robot added the size/XXL label May 24, 2018

vicaire requested review from jlewi and ankushagarwal May 24, 2018 01:21

vicaire changed the title ~~CRD for scheduling Argo workflows~~ CRD for scheduling Argo workflows (API spec) May 24, 2018

Addressing code review comments.

a66811e

k8s-ci-robot assigned jlewi May 25, 2018

k8s-ci-robot added the lgtm label May 25, 2018

k8s-ci-robot added the approved label May 25, 2018

vicaire merged commit 9add1bb into master May 26, 2018

hongye-sun added a commit to hongye-sun/pipelines that referenced this pull request Dec 12, 2018

Revert /tmp path kubeflow#2

6331f70

paveldournov mentioned this pull request Apr 22, 2019

Late binding aspects on Kubeflow Pipeline #1187

Closed

strangemonad-faire mentioned this pull request May 29, 2019

FR: Pipelines UI - Dag tasks should also show the task name #1397

Closed

IronPan deleted the vicaire/scheduling branch June 4, 2019 07:31

numerology referenced this pull request in numerology/pipelines Aug 13, 2019

� This is a combination of 3 commits.

d435ca0

� This is the 1st commit message: Move argo installation to docker file. � This is the commit message #2: Remove TODO � This is the commit message #3: Fix test.

dldaisy mentioned this pull request Feb 7, 2020

[Sample] CI Sample: mnist #3013

Merged

Bobgy mentioned this pull request Aug 9, 2020

fix(frontend): fix parsing large workflow graph. Fixes #4179 #4180

Merged

3 tasks

numerology mentioned this pull request Sep 18, 2020

test: fix TFX related test failure due to protobuf version. #4516

Merged

2 tasks

maganaluis added a commit to arllanos/pipelines that referenced this pull request Feb 1, 2021

npm formatting kubeflow#2

8191a6f

maganaluis added a commit to arllanos/pipelines that referenced this pull request Feb 8, 2021

npm formatting kubeflow#2

5cd84c5

maganaluis added a commit to arllanos/pipelines that referenced this pull request Feb 15, 2021

npm formatting kubeflow#2

065492e

maganaluis added a commit to arllanos/pipelines that referenced this pull request Feb 21, 2021

npm formatting kubeflow#2

f09e44b

capri-xiyue mentioned this pull request Aug 12, 2021

[release] 1.7.0 tracker #5779

Closed

23 tasks

zijianjoy mentioned this pull request Aug 13, 2021

[backend]: Determine cache scope for pipeline-root and namespace #6324

Closed

3 tasks

zacharymostowsky mentioned this pull request Sep 28, 2021

Support for non-docker based deployments #1654

Closed

Linchin added a commit that referenced this pull request Jun 10, 2022

Merge pull request #2 from Linchin/ling-exit-handler

e95bae7

change argocompiler dependency to my own repo

IronPan mentioned this pull request Aug 4, 2022

feat(backend): Implement RuntimeConfig in backend #8085

Merged

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CRD for scheduling Argo workflows (API spec) #2

CRD for scheduling Argo workflows (API spec) #2

vicaire commented May 24, 2018 •

edited by jlewi

Loading

jlewi commented May 24, 2018

k8s-ci-robot commented May 24, 2018

jlewi commented May 24, 2018

vicaire commented May 25, 2018

jlewi commented May 25, 2018

vicaire commented May 25, 2018

jlewi commented May 25, 2018

jlewi commented May 25, 2018

k8s-ci-robot commented May 25, 2018

vicaire commented May 26, 2018

CRD for scheduling Argo workflows (API spec) #2

CRD for scheduling Argo workflows (API spec) #2

Conversation

vicaire commented May 24, 2018 • edited by jlewi Loading

jlewi commented May 24, 2018

k8s-ci-robot commented May 24, 2018

jlewi commented May 24, 2018

vicaire commented May 25, 2018

jlewi commented May 25, 2018

vicaire commented May 25, 2018

jlewi commented May 25, 2018

jlewi commented May 25, 2018

k8s-ci-robot commented May 25, 2018

vicaire commented May 26, 2018

vicaire commented May 24, 2018 •

edited by jlewi

Loading