Release: add industry relevant examples along with refining the code (#198)

* init branch

* wip data exploration

* data exploration region/silo

* basic model

* regions

* basic network and finished data processing

* training

* Implement generic FedAvg without model object (#167)

* generic fedavg pytorch
* support model classes
* add docstrings

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
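
The generic FedAvg work in #167 above averages silo checkpoints without ever instantiating a model class. A minimal sketch of that idea, assuming each silo writes a plain PyTorch `state_dict` checkpoint (file paths and weighting below are illustrative, not the repository's actual component code):

```python
import torch

def fedavg_state_dicts(state_dicts, weights=None):
    """Weighted average of PyTorch state dicts; no model object required."""
    if weights is None:
        weights = [1.0] * len(state_dicts)
    total = float(sum(weights))
    averaged = {}
    for key in state_dicts[0]:
        # scale each silo's tensor by its weight, sum, then cast back to the original dtype
        stacked = torch.stack(
            [sd[key].float() * (w / total) for sd, w in zip(state_dicts, weights)]
        )
        averaged[key] = stacked.sum(dim=0).to(state_dicts[0][key].dtype)
    return averaged

# usage sketch (hypothetical checkpoint paths)
checkpoints = ["silo_1/model.pt", "silo_2/model.pt", "silo_3/model.pt"]
state_dicts = [torch.load(path, map_location="cpu") for path in checkpoints]
torch.save(fedavg_state_dicts(state_dicts), "aggregated_model.pt")
```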

* add README

* update normalization

* update exploration

* Thomas/small improvements (#171)

* remove unused local MNIST data

* add link to provisioning cookbook in docs readme

* recommend creating a conda env in the quickstart

Co-authored-by: thomasp-ms <XXX@me.com>

* update example for finance with multiple models

* successful training through lstm

* revert unneeded changes

* remove local exploration ipynb

* fix test metric

* different param value for AKS (#179)

Co-authored-by: thomasp-ms <XXX@me.com>

* Pneumonia xray example (#164)

* refactor components to use dpv2 + remove unnecessary environments

* working dpv2 pipeline

* refactor scripts with right inputs and outputs

* fix code path

* implement fake outputs

* fix paths

* fix imports

* fix args of aggregation script

* add note, fix component args

* add checkpoint arg

* linting

* linting

* remove sdkv2 folder

* add argparse to submit script

* add docstring

* add docstring

* linting

* linting

* add staging branch to build

* rollback changes to build, leave it for another PR

* remove logging line

* remove custom uuid

* linting

* add docstring to custom path function

* polish docstring

* rename model_silo_X to input_silo_X

* rename output

* rename agg output

* Improve auto-provisioning resources (#35) (#36)

* docker file stub

* move docker file, implement feedback

* login before setting subscription

* login before setting subscription

* use default k8s version

* pin latest version since default won't work

* remove executionpolicy part, other small updates

* clarify to change job file _in docker filesystem_

* login before setting subscription

* formatting

* \ -> /

* install azureml-core in docker file

* propagate changes to section 7

* fix dataset creation command

Co-authored-by: thomasp-ms <XXX@me.com>

Co-authored-by: thomasp-ms <XXX@me.com>

* Refactor folder structure (#37)

* `plan` -> `docs`

* 'plan' -> 'docs'

* 'automated_provisioning' -> 'mlops'

* 'fl_arc_k8s' -> 'examples'

Co-authored-by: thomasp-ms <XXX@me.com>

* auto provisioning - vanilla internal silos (#41)

* split internal and external provisioning

* adjust directories after internal/external split

* introduce overall mlops readme

* first stab

* remove useless comment and my alias

Co-authored-by: thomasp-ms <XXX@me.com>

* Perform real FL training on the MNIST dataset

Added component files customized for the MNIST dataset. Set up 3 silos, each with its own compute and datastore.

* refine components and add logs

* maintain consistency b/w config files

* add requirement and env files

* add requirement and env files

* rmv redundant dependencies, rename conda envs

* Correct epoch default value

* point data asset instead of underlying URI

* beef up orchestrator cluster (#46)

Co-authored-by: thomasp-ms <XXX@me.com>

* Provision CPUs for silos (instead of GPUs) (#47)

* beef up orchestrator cluster

* gpu -> cpu

Co-authored-by: thomasp-ms <XXX@me.com>

* add preprocessing comp description, fix typo and correct default datastore name

* add integration validation test - build

* update readme file

* Move logger to the main if block, add pytorch channel in the conda env yaml and move readme to the docs folder

* code reformatting using black

* add documentation to run an FL experiment

* add more intuitive path for aggr output dir

* Merge changes

* add more intuitive agg output dir path

* reformat using black

* add iteration2 branch for PR build testing

* reformat date and pass kwargs instead in the getUniqueIdentifier fn

* working submit

* working factory submit

* linting

* move component path

* add soft validation

* add soft validation

* Add basic tests on config

* linting

* working bicep deployment for vanilla demo

* proper orchestrator script, double containers

* fix name

* docstring

* docstring

* rollback to using only 1 container

* align naming convention

* instructions

* working submit

* set up permission model

* working orch perms

* wonky perms assignment

* working role assignments

* remove old perm model

* working except silo2orch

* fix typo

* working submit with config

* add sku as param

* use R/W for now

* fix submit to align with bicep provisioning demo

* linting

* remove dataset files

* fix docstring on permission model

* write draft docs with homepage, align structure, remove requirements, ensure demo documented

* rollback change to req

* change factory to use custom model type during validation

* linting

* Display metrics at the pipeline level (#68)

* Fix optional input yaml and mlflow log bugs (#59)

* Accommodate optional input changes and switch from mlflow autologging to manual logging

* code style

* change optional inputs syntax

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
Co-authored-by: Jeff Omhover <jf.omhover@gmail.com>
Co-authored-by: Thomas <7998422+thomasp-ms@users.noreply.github.com>
Co-authored-by: thomasp-ms <XXX@me.com>

* Make changes to display all metrics at the pipeline level

* Log preprocessing metadata in mlflow

* linting

* Pass client as an arg

* Fix typo, remove name from silo config, fix metric naming convention, and add metric identifier in the preprocessing component

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
Co-authored-by: Jeff Omhover <jf.omhover@gmail.com>
Co-authored-by: Thomas <7998422+thomasp-ms@users.noreply.github.com>
Co-authored-by: thomasp-ms <XXX@me.com>
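
The pipeline-level metric display described in #68 above hinges on logging from each silo's script onto the parent (pipeline) MLflow run with a per-silo prefix. A hedged sketch of that pattern; the `mlflow.parentRunId` tag lookup and the `silo/metric` naming convention are assumptions, not necessarily the repository's exact implementation:

```python
import mlflow
from mlflow.tracking import MlflowClient

def log_silo_metric(client, run_id, silo_name, name, value, step=0):
    """Log a metric onto the pipeline-level run, prefixed per silo."""
    client.log_metric(run_id, f"{silo_name}/{name}", value, step=step)

with mlflow.start_run() as run:
    client = MlflowClient()
    # assumption: the parent pipeline run id is exposed as a tag on the step run;
    # fall back to the current run so the sketch stays self-contained
    pipeline_run_id = run.data.tags.get("mlflow.parentRunId", run.info.run_id)
    log_silo_metric(client, pipeline_run_id, "silo1", "train_loss", 0.42, step=1)
```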

* Remove redundant files from the mlops directory (#69)

* Remove internal & external dir as provisioning is taken care of by bicep

* keep mnist data files

* rename demo script (#71)

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Unified documentation (#72)

* WIP: unifying docs

* Remove redundant doc file. We can always revisit if needed

* FL concepts will be covered in the glossary doc

* Remove internal and external silos docs as the code will be re-written in bicep

* provide comprehensive documentation

* rename file

* refine docs

* refine docs and rename fl_cross_silo_basic to fl_cross_silo_native

* simplify sandbox script

* simplify script, ensure it works

* align config of native submit

* align naming conventions between scripts, reinject rbac role

* create test job for quickly debugging provisioning issues

* fix tests

* linting

* move permissions to storage

* align config with bicep scripts

* Document the metrics panel of the pipeline overview in the quickstart (#76)

* document the metrics/pipeline panel in the quickstart

* linting

* add docstrings and disclaimers

* Add instructions on how to create a custom graph  (#78)

* add instructions on how to create a custom graph

* do better comments

* Refine native code (#82)

* fix silo name

* log only one datapoint per iteration for aggregated metrics

* Align terminology for iteration/round/num_rounds

* linting

* use storage blob data contributor

* add demoBaseName to guid name of role deployment (#85)

Co-authored-by: thomasp-ms <XXX@me.com>

* use id list, add listkeys builtin

* rename and dissociate orchestrator in resource + orchestrator

* separate orchestrator script

* draft sandbox setup

* make silo script distinct

* Update orchestrator_open.bicep

* Update internal_blob_open.bicep

* remove comments

* align hello world example with new naming conventions

* ensure uai assignments are created AFTER storage is created

* linting

* enforce precedence

* merge from secure branch

* use different regions, limit size of account

* reduce to 3 regions, add keys to guid

* substring

* align config

* do not use model

* Add msi version of scripts

* sandbox main can switch between uai and msi

* fix name

* linting

* linting

* implement ignore param, hotfix model with startswith

* Address my own comments on Jeff's PR (#96)

* remove magic number

* little improvements on some comments

* remove unused files

* put dash replacement next to length check

* don't necessarily assume USER AI

* UAI -> XAI

* revert previous UAI -> XAI changes

* move length check next to dash replacement

* typo

* try moving the dependsOn's

* RAGRS -> LRS

* revert dependsON changes

* revert another small change in a comment

Co-authored-by: thomasp-ms <XXX@me.com>

* align config of both submit scripts

* Make distinction between on-off and repeatable provisioning scripts (#99)

* clarify the role needed

* remove "custom role" line

* adjust locations

* use existing rg if not Owner of the sub

* clarify "Secure" setup

* add usage instructions in docstring

* explain what scripts are one-off (vs repeatable)

Co-authored-by: thomasp-ms <XXX@me.com>

* Align round/iteration terminology with the native code (#103)

* rename parameter in config file

* keep iterations instead of rounds

* round -> iteration

Co-authored-by: thomasp-ms <XXX@me.com>

* get all goodies from secureprovisioning branch wip

* get all goodies from secureprovisioning branch wip

* get all goodies from secureprovisioning branch wip

* align both submits to work

* add optional test

* rename native to literal

* add getting started in readme, introduce emojis

* change person

* remove emojs

* Propose rewriting of readme to highlight motivation first (#110)

* propose rewriting of readme to highlight motivation first

* minor edit

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Update README.md

* Update quickstart to mention rg clean-up

* Update quickstart.md

* Update quickstart.md

* Update quickstart.md

* Build bicep scripts as ARM template, add Azure Buttons to quickstart (#120)

* Update quickstart to lower header (hotfix) (#117)
* add arm templates, add button in quickstart
* switch to releasebranchlink

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Add subscription id, resource group and workspace name as CLI args (#122)

* add more cli args

* code style

* code style

* update quickstart doc

* update readme
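
The CLI arguments added in #122 above let the submit script target a workspace without editing the checked-in config. A minimal sketch with `azure-ai-ml` (the config fallback and argument defaults are assumptions):

```python
import argparse
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

parser = argparse.ArgumentParser()
parser.add_argument("--subscription_id", default=None)
parser.add_argument("--resource_group", default=None)
parser.add_argument("--workspace_name", default=None)
args = parser.parse_args()

credential = DefaultAzureCredential()
if args.subscription_id and args.resource_group and args.workspace_name:
    # explicit CLI arguments win
    ml_client = MLClient(
        credential,
        subscription_id=args.subscription_id,
        resource_group_name=args.resource_group,
        workspace_name=args.workspace_name,
    )
else:
    # otherwise fall back to a local workspace config file if one is present
    ml_client = MLClient.from_config(credential=credential)
```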

* Initiate provisioning "cookbook" with list of provisioning scenarios + example (#123)

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Continuous Integration Tests (#119)

* take values of subscription id, rs grp, ws name, etc from github secrets and submit a native pipeline

* change path

* Test azure creds in the github workflow

* reformatting

* add pipeline validation and testing workflow

* add permissions

* add permissions

* check only certain dir to trigger workflows

* add soft validation for any iteration branch PR

* add provisioning script test

* testing

* create rg

* create rg

* change compute for testing

* change demoname

* delete old rg

* change demoname

* add demobasename and aml ws name as github secrets

* random demo base name

* auto generate random base name

* random demo base name

* adjust random num length

* add vnet sandbox test

* rmv dependency b/w jobs

* submit various pipelines

* change execution graph path

* add cli args in the factory code

* change compute for testing

* ignore validation - factory

* create custom action

* correct path

* correct path

* add shell in the github action

* create github actions and take required values as input params

* add shell

* add wait condition

* add logs

* linting

* correct rg name

* add azure ml extension

* handle ml extension installation error.

* add release branch test cases

* add script to delete run history

* cronjob test

* cronjob test

* checkout branch

* test run history deletion script

* test run history deletion script

* test run history deletion script

* azure login

* date format change

* remove double quotes

* date format change

* archive run history script tested

* Add vnet-based provisioning options to cookbook (#128)

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Make deployment name unique in our github actions (#135)

* set unique name for deployments
* add attempt to deployment name

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Refactor compute/storage scripts to be independent (#132)

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Provide motivation in provisioning docs for using service endpoints (#136)

* add motivation for service endpoints
* add link

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Refresh provisioning arm buttons with latest from bicep (#139)

* align names of directories
* rebuild all arm

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Update silo_vnet_newstorage.md (#141)

* Add Bicep build vs ARM template diff test  (#140)

* Add diff test for bicep vs arm

* Debug

* Debug

* fix syntax error

* redirect build output to stdout

* correct path

* trigger arm template test when pushing changes to main branch from release* branch

* remove redundant logs

* Add "open aks with cc" provision tutorial and bicep scripts (#138)

* implement bicep scripts to provision open aks with cc
* add aks cc tutorial
* build arm and add in branch
* add button

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Provide script + tutorial to attach pair with an existing storage (#142)

* provision datastore with existing storage
* add arm for existing storage, add docs
* add link in readme

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* add latest arm templates to diff build (#145)

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Implements provisioning script for a confidential compute VM jumpbox inside a vnet (debug) (#146)

* add jumpbox script with tutorial
* add template to diff build

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Update jumpbox_cc.md (#147)

* update tutorials for silos to integrate feedback (#149)

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Implement option to turn orchestrator storage fully private (behind PLE) (#150)


Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Tutorial on how to adapt native and factory code to write FL experiments.  (#100)

* WIP: add general information about the factory code

* moving factory-tutorial to another file

* add scenarios

* add instructions on how to adapt literal code

* rename file

* add general info and fix typos

* Jeff's feedback

* Apply code clean-up to provision scripts before bug bash (#148)

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Instructions for provisioning external silos (#101)

* very first stab, far from done

* non-secure native job using the on-prem k8s

* use on-prem silos in example factory job

* Revert "very first stab, far from done"

This reverts commit e00d882dfee6a348eb89cd63e339a051b85ce0ca.

* Revert "use on-prem silos in example factory job"

This reverts commit e2ef8841c6be25a6c84b57ae079cca8f361323fe.

* Revert "non-secure native job using the on-prem k8s"

This reverts commit 923e5f321d28b30d8cd9759c47a7ffe5457e3284.

* restore doc stub

* Make Git ignore resources for test jobs

* fix gitignore

* typo in comment

* steps A through D

* 2 typos

* move to subdir

* fix workspace creation

* add orchestrator part, role, and timeline

* last commit before PR

* adjust to new open_azureml_workspace.bicep

* first wave after Jeff's comments

* address jeff's comments

* typo

* light trims

Co-authored-by: thomasp-ms <XXX@me.com>

* bump up every title

* skeleton

* first attempt at data prep like Harmke

* change secret name

* wrong secret name

* remove separate unzip

* change clients, create silo data assets

* different names for silo data assets, duh

* cleanup

* adjust secret name in doc

* .

* use latest literal code

* align environment with literal

* base on latest component

* one dataset, comment out 2 unused args (for now)

* introduce new arguments

* reflect modified args in component spec

* remove unused arg from config

* start hooking up to Harmke's trainer

* initialize PTLearner and include in run.py

* use same values as Harmke for epochs and lr

* attributes with _, start implementing local_train

* add loggings, add test(), fix device_

* train_loader_

* align _'s

* fix transform bug

* remove unused constants

* use proper model in aggregation code

* removed unused file

* remove unused code and arguments, logging to DEBUG

* restore `metrics_prefix` parameter

* finish restoring `metrics_prefix`

* do not duplicate model code

* revert dedup attempt

* improve docstrings and descriptions

* change experiment name

* change pipeline name and docstring

* cite sources, remove wrongly added licenses

* italics

* black

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
Co-authored-by: Jeff Omhover <jf.omhover@gmail.com>
Co-authored-by: thomasp-ms <XXX@me.com>
Co-authored-by: unknown <Mitgarg17495@gmail.com>
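
The `PTLearner` commits above ("initialize PTLearner", "attributes with _", "add test()", "train_loader_") suggest a silo-side trainer with `local_train`/`test` methods and trailing-underscore attributes. The skeleton below is only a reconstruction of that shape: the model, loss and hyperparameters are placeholders, not the ccfraud example's actual code.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader

class PTLearner:
    """Sketch of a silo-side PyTorch trainer (shape inferred from the commits above)."""

    def __init__(self, model: nn.Module, train_ds, test_ds, lr=1e-3, epochs=3, batch_size=64):
        self.model_ = model
        self.device_ = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.train_loader_ = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
        self.test_loader_ = DataLoader(test_ds, batch_size=batch_size)
        self.optimizer_ = torch.optim.SGD(model.parameters(), lr=lr)
        self.criterion_ = nn.BCELoss()  # placeholder loss for a binary fraud label
        self.epochs_ = epochs

    def local_train(self):
        self.model_.to(self.device_).train()
        for _ in range(self.epochs_):
            for x, y in self.train_loader_:
                x, y = x.to(self.device_), y.to(self.device_)
                self.optimizer_.zero_grad()
                loss = self.criterion_(self.model_(x), y)
                loss.backward()
                self.optimizer_.step()

    def test(self):
        self.model_.eval()
        total = 0.0
        with torch.no_grad():
            for x, y in self.test_loader_:
                x, y = x.to(self.device_), y.to(self.device_)
                total += self.criterion_(self.model_(x), y).item()
        return total / max(len(self.test_loader_), 1)
```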

* update formatting

* add readme section

* rename training to traininsilo for consistency

* add more comments and update docs

* include urgency in PR template (#184)

Co-authored-by: thomasp-ms <XXX@me.com>

* Share resources and standardize component names (#182)

* use shared agg component across all examples

* only keep a single {reqs/env}

* use more recent pip version

* standardize component spec name

* support dummy HELLOWORLD example in agg

* black

Co-authored-by: thomasp-ms <XXX@me.com>

* Thopo/share component and environment (#185)

* SHARED -> utils, rename agg env

Co-authored-by: thomasp-ms <XXX@me.com>

* rename config to spec and add upload data step

* upload data script

* use util aggregateweights

* add data splitting pipeline

* docs update

* log pipeline level only once per silo training

* do categorical encoding ahead of splitting

* nit updates

* update comment

* update formatting
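
The "categorical encoding ahead of splitting" step above matters because every silo must end up with the same encoded feature columns; encoding after the split could yield mismatched one-hot columns across silos. A hedged pandas sketch (file names, the number of silos and the random shard are illustrative; the actual pipeline partitions the data its own way):

```python
import os
import numpy as np
import pandas as pd

df = pd.read_csv("transactions.csv")  # hypothetical raw dataset

# one-hot encode categoricals ONCE, on the full dataset, so every silo
# sees an identical feature space
categorical_cols = list(df.select_dtypes(include="object").columns)
df = pd.get_dummies(df, columns=categorical_cols)

# then split across silos
n_silos = 3
for i, shard in enumerate(np.array_split(df.sample(frac=1, random_state=0), n_silos), start=1):
    os.makedirs(f"silo_{i}", exist_ok=True)
    shard.to_csv(f"silo_{i}/data.csv", index=False)
```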

* Hotfix: grant `az login` permissions to the 'clear run history' script  (#166)

* simplify the job wait condition's code

* add comments

* trigger mnist pipeline check

* test token validity

* grant `az login` permissions to the clear-history script

* revert to sleep wait code

* test access token validity

* nit

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
Co-authored-by: Jeff Omhover <jf.omhover@gmail.com>
Co-authored-by: Thomas <7998422+thomasp-ms@users.noreply.github.com>
Co-authored-by: thomasp-ms <XXX@me.com>

* fix readme

* aggregate weights on whichever device is available
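
"Aggregate weights on whichever device is available" comes down to resolving the device at runtime and loading checkpoints onto it with `map_location`; a small sketch (the path is hypothetical):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# map_location relocates tensors that were saved on another device (e.g. a silo GPU)
state_dict = torch.load("silo_1/model.pt", map_location=device)
```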

* update docstrings

* update formatting

* reduce upload pipeline file

* fix datastore

* add info about data upload step

* fix typo

* steps for changing access policies

* update docs
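
The data upload step documented above ends with each silo's data registered for training. As a point of reference, registering a local folder as a data asset with `azure-ai-ml` looks roughly like the sketch below (names and paths are placeholders; the repository's upload_data pipeline does this through a pipeline job rather than a one-off local script):

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())  # assumes a local workspace config

data_asset = Data(
    name="ccfraud-silo1-raw",   # hypothetical asset name
    path="./data/silo1",        # hypothetical local folder to upload
    type=AssetTypes.URI_FOLDER,
    description="Raw training data for silo 1",
)
ml_client.data.create_or_update(data_asset)
```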

* Named Entity Recognition example (#177)

* add multinerd template files

* NER components

* re-structure

* partition data + log metrics

* add readme

* add readme

* restructuring

* restructuring

* add doc strings

* train on gpus

* create a separate component to upload data on silos

* docs

* rename

* add assert statement

* change upload-data job compute to orchestrator compute

* remove ner from literal example choices

* fix doc

* make model-name and tokenizer configurable

* pip version upgrade

* reformatting

* use shared aggregated component

* rename script file

* add note

* create a compute that has access to silos' storage accs

* change data uploading approach

* update doc

* incorporate Thomas's feedback

* fix typo

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
Co-authored-by: Jeff Omhover <jf.omhover@gmail.com>
Co-authored-by: Thomas <7998422+thomasp-ms@users.noreply.github.com>
Co-authored-by: thomasp-ms <XXX@me.com>
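
For the NER example, "make model-name and tokenizer configurable" maps naturally onto Hugging Face's Auto classes: one string drives both the tokenizer and the model. A hedged sketch; the argument names, default checkpoint and label count are illustrative, not necessarily what the component exposes:

```python
import argparse
from transformers import AutoModelForTokenClassification, AutoTokenizer

parser = argparse.ArgumentParser()
parser.add_argument("--model_name", default="bert-base-cased")  # illustrative default
parser.add_argument("--num_labels", type=int, default=9)        # depends on the tag set
args = parser.parse_args()

tokenizer = AutoTokenizer.from_pretrained(args.model_name)
model = AutoModelForTokenClassification.from_pretrained(
    args.model_name, num_labels=args.num_labels
)
```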

* Create nice-looking homepage for the examples in readme+docs (#190)

* add homepage for industry examples

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Align the medical imaging data provisioning process with other examples (#191)

* adjust paths in config file

* support component with 1 output for Pneumonia

* formatting

* adjust doc to new provisioning

* remove GH action for dataprep

* custom component for provisioning pneumonia data

* black

Co-authored-by: thomasp-ms <XXX@me.com>

* hot fix (#192)

Co-authored-by: thomasp-ms <XXX@me.com>

* Lots of micro-fixes after bug bashing all 3 industry examples (#194)

related to components:
* create distinct names for all components of each scenario
* polish component descriptions
* remove unused mnist datatransfer and postprocessing components
* upgrade all MCR images to a more recent OS
* cut some unnecessary dependencies
* use curated environments whenever possible (to speed up job build time)

related to pipelines:
* fix issues with ccfraud submit script (path to shared folder)
* remove unnecessary json+azure imports in submit scripts
* align all 3 submissions scripts
* in upload data pipeline, make --example required without default value to force intentional decision
* in upload data pipeline, use scenario name in the output path to avoid collision
* give each submit pipeline a distinct experiment and run name for readability

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
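
The "--example required without default value" item in the list above is a one-line argparse change: with no default, submitting the upload-data pipeline without naming a scenario fails fast instead of silently uploading the wrong dataset. Sketch (the choices shown are an assumption):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--example",
    required=True,  # no default: the caller must state which scenario's data to upload
    choices=["MNIST", "PNEUMONIA", "NER", "CCFRAUD"],  # illustrative list
)
args = parser.parse_args()
```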

* Standardize all 3 real world example tutorials (#193)

* standardize documentation on all 3 examples
* change titles
* fix spaces
* add pip instructions
* upgrade azure-ai-ml version

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* provide correct link to Kaggle dataset (#196)

* provide correct link

* .

* .

Co-authored-by: thomasp-ms <XXX@me.com>

* Add CI tests for industry-relevant examples  (#186)

* add pneumonia and ner examples tests

* add ccfraud test in the CI/CD pipeline

* add data upload test

* trigger workflow

* CI testing1

* CI testing1

* test kv kaggle creds

* fix creds

* fix creds

* set kaggle creds

* test pneumonia data-upload

* test all industry relevant examples

* upload data test for 3 examples

* add main tests

* remove redundant chars

* fix typo

* avoid running industry-relevant examples tests on the vnet setup as it is already covered by the open setup

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
Co-authored-by: Jeff Omhover <jf.omhover@gmail.com>
Co-authored-by: Thomas <7998422+thomasp-ms@users.noreply.github.com>
Co-authored-by: thomasp-ms <XXX@me.com>

* CLI commands to add credentials in the workspace keyvault  (#199)

* add cli cmds to set a kv secret

* Jeff's feedback

* Implement Thomas's feedback

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
Co-authored-by: Jeff Omhover <jf.omhover@gmail.com>
Co-authored-by: Thomas <7998422+thomasp-ms@users.noreply.github.com>
Co-authored-by: thomasp-ms <XXX@me.com>
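
Once #199's CLI commands have placed the Kaggle credentials in the workspace key vault, a script can read them back with the Key Vault SDK. A hedged sketch: the vault naming and secret names mirror the `kv-<demo-base-name>`, `kaggleusername` and `kagglekey` values used in this PR's provisioning action, but how the examples actually consume them may differ:

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

vault_url = "https://kv-<demoBaseName>.vault.azure.net"  # replace with the real vault name
client = SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())

kaggle_username = client.get_secret("kaggleusername").value
kaggle_key = client.get_secret("kagglekey").value
```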

* Thomas/bug bash feedback 04 (#203)

* no need to navigate to a specific directory

* keyvault -> key vault

* improve Kaggle sections

* GPU's for NER example

* ARM templates with latest bicep version

* bold

* GPU instructions in quickstart

Co-authored-by: thomasp-ms <XXX@me.com>

* fix test to align with new sdk (#204)

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Hotfix: DataAccessError (orchestrator access) (#205)

* fix bug

* update arm template

* fix a problem encountered while resolving merge conflicts

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
Co-authored-by: Jeff Omhover <jf.omhover@gmail.com>
Co-authored-by: Thomas <7998422+thomasp-ms@users.noreply.github.com>
Co-authored-by: thomasp-ms <XXX@me.com>

* Add GitHub Action workflow concurrency and implement token expiration policy workaround (#200)

* add GitHub workflow concurrency

* test 1

* test 1

* test 1

* test 2

* test 3

* test 2

* test 3

* implement token expiry workaround

* test 1

* workaround to handle token expiry error

* fix typo

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
Co-authored-by: Jeff Omhover <jf.omhover@gmail.com>
Co-authored-by: Thomas <7998422+thomasp-ms@users.noreply.github.com>
Co-authored-by: thomasp-ms <XXX@me.com>

* Implement troubleshooting guide with first typical issues (#208)

* write troubleshooting guide

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* Fix order of precedence for AML workspace references in submit.py (#209)

* fix order of precedence

* fix build

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
Co-authored-by: Jeff Omhover <jf.omhover@gmail.com>
Co-authored-by: Thomas <7998422+thomasp-ms@users.noreply.github.com>
Co-authored-by: thomasp-ms <XXX@me.com>
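
"Fix order of precedence" in #209 above concerns which workspace reference wins when both CLI arguments and a config entry are present: explicit CLI arguments should override the checked-in config. A small sketch of that rule (the YAML layout and file name are assumptions):

```python
import argparse
import yaml  # assumption: PyYAML; the actual submit script may load its config differently

def resolve(cli_value, config_value, name):
    """CLI argument wins over the config file; fail loudly if neither is set."""
    if cli_value is not None:
        return cli_value
    if config_value is not None:
        return config_value
    raise ValueError(f"{name} must be provided on the CLI or in the config file")

parser = argparse.ArgumentParser()
parser.add_argument("--subscription_id", default=None)
parser.add_argument("--resource_group", default=None)
parser.add_argument("--workspace_name", default=None)
args = parser.parse_args()

with open("config.yaml") as f:  # hypothetical config path
    aml_cfg = yaml.safe_load(f).get("aml", {})

subscription_id = resolve(args.subscription_id, aml_cfg.get("subscription_id"), "subscription_id")
resource_group = resolve(args.resource_group, aml_cfg.get("resource_group"), "resource_group")
workspace_name = resolve(args.workspace_name, aml_cfg.get("workspace_name"), "workspace_name")
```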

* Add data permissions issue to TSG (#210)

* add permissions issue to TSG

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

* November notes (#211)

Co-authored-by: thomasp-ms <XXX@me.com>

* upgrade all pip dependencies (#212)

Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>

Co-authored-by: David Majercak <damajercak@microsoft.com>
Co-authored-by: Jeff Omhover <jf.omhover@gmail.com>
Co-authored-by: Jeff Omhover <jeomhove@microsoft.com>
Co-authored-by: Thomas <7998422+thomasp-ms@users.noreply.github.com>
Co-authored-by: thomasp-ms <XXX@me.com>
Co-authored-by: Dávid Majerčák <david1majercak@gmail.com>
7 people committed Dec 10, 2022
1 parent 0e94d77 commit 5d510c4
Showing 94 changed files with 5,962 additions and 500 deletions.
7 changes: 7 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -2,6 +2,13 @@
<!-- Describe the intention of the changes being proposed. What problem does it solve or functionality does it add? -->
* ...

## What is the expected review turnaround time?
<!-- Feel free to add more precisions about the urgency of this PR. -->
Urgency:
- [ ] High (needs review today)
- [ ] Medium (needs review within a few days - most common)
- [ ] Low (can wait a week)

## Does this introduce a breaking change?
<!-- Mark one with an "x". -->
```
16 changes: 16 additions & 0 deletions .github/actions/provision-resources/action.yaml
@@ -22,6 +22,15 @@ inputs:
demo-base-name:
description: Demo base name (it should be unique)
required: true
github-sp-object-id:
description: GitHub Service Principal Object ID
required: true
kaggle-username:
description: Kaggle Username
required: true
kaggle-api-token:
description: Kaggle API Token
required: true
region:
description: Region of the resource group
required: false
@@ -48,3 +57,10 @@ runs:
- name: Deploy the demo resources in your resource group
shell: bash
run: az deployment group create --template-file ${{ inputs.deployment-file-path }} --resource-group ${{ inputs.resource-group }} --parameters demoBaseName=${{ inputs.demo-base-name }} --name ${{ inputs.deployment-name }}

- name: Set kaggle credentials in the Azure key vault
shell: bash
run: |
az keyvault set-policy -n kv-${{ inputs.demo-base-name }} --secret-permissions list set delete --object-id ${{ inputs.github-sp-object-id }}
az keyvault secret set --name kaggleusername --vault-name kv-${{ inputs.demo-base-name }} --value ${{ inputs.kaggle-username }}
az keyvault secret set --name kagglekey --vault-name kv-${{ inputs.demo-base-name }} --value ${{ inputs.kaggle-api-token }}
5 changes: 2 additions & 3 deletions .github/actions/submit-aml-factory-pipeline/action.yaml
@@ -44,9 +44,8 @@ runs:
- name: Install python dependencies
shell: bash
run: pip install -r examples/pipelines/fl_cross_silo_factory/requirements.txt
run: pip install -r examples/pipelines/requirements.txt

- name: Submit fl_cross_silo_factory pipeline
shell: bash
run: python examples/pipelines/fl_cross_silo_factory/submit.py --subscription_id ${{ inputs.subscription-id }} --resource_group ${{ inputs.resource-group }} --workspace_name ${{ inputs.workspace-name }} --example ${{ inputs.example }} --submit --ignore_validation --wait

run: python examples/pipelines/fl_cross_silo_factory/submit.py --subscription_id ${{ inputs.subscription-id }} --resource_group ${{ inputs.resource-group }} --workspace_name ${{ inputs.workspace-name }} --example ${{ inputs.example }} --submit --ignore_validation --wait || [ $? == 5 ]
4 changes: 2 additions & 2 deletions .github/actions/submit-aml-literal-pipeline/action.yaml
@@ -44,9 +44,9 @@ runs:
- name: Install python dependencies
shell: bash
run: pip install -r examples/pipelines/fl_cross_silo_literal/requirements.txt
run: pip install -r examples/pipelines/requirements.txt

- name: Submit fl_cross_silo_literal pipeline
shell: bash
run: python examples/pipelines/fl_cross_silo_literal/submit.py --subscription_id ${{ inputs.subscription-id }} --resource_group ${{ inputs.resource-group }} --workspace_name ${{ inputs.workspace-name }} --example ${{ inputs.example }} --submit --wait
run: python examples/pipelines/fl_cross_silo_literal/submit.py --subscription_id ${{ inputs.subscription-id }} --resource_group ${{ inputs.resource-group }} --workspace_name ${{ inputs.workspace-name }} --example ${{ inputs.example }} --submit --wait || [ $? == 5 ]

52 changes: 52 additions & 0 deletions .github/actions/submit-example-pipeline/action.yaml
@@ -0,0 +1,52 @@
name: Submit example pipeline
description: Submit example pipeline in AML
inputs:
client-id:
description: Client ID of the service principal
required: true
tenant-id:
description: Tenant ID of the service principal
required: true
subscription-id:
description: Subscription to use for resources
required: true
resource-group:
description: Resource group of the AML workspace
required: true
workspace-name:
description: Workspace name
required: true
example:
description: Example pipeline to run
required: true

runs:
using: composite
steps:
- name: Setup python
uses: actions/setup-python@v2.2.1
with:
python-version: 3.8

- name: Azure login
uses: azure/login@v1
with:
client-id: ${{ inputs.client-id }}
tenant-id: ${{ inputs.tenant-id }}
subscription-id: ${{ inputs.subscription-id }}

- name: Install azure ml latest extension
shell: bash
run: |
az extension remove -n azure-cli-ml || echo "azure-cli-ml extension is not installed."
az extension remove -n ml || echo "ml extension is not installed."
az extension add -n ml -y
- name: Install python dependencies
shell: bash
run: pip install -r examples/pipelines/requirements.txt

- name: Submit example pipeline
shell: bash
run: python examples/pipelines/${{ inputs.example }}/submit.py --subscription_id ${{ inputs.subscription-id }} --resource_group ${{ inputs.resource-group }} --workspace_name ${{ inputs.workspace-name }} --submit --wait || [ $? == 5 ]

52 changes: 52 additions & 0 deletions .github/actions/submit-upload-data-pipeline/action.yaml
@@ -0,0 +1,52 @@
name: Submit upload data pipeline
description: Submit upload data pipeline in AML
inputs:
client-id:
description: Client ID of the service principal
required: true
tenant-id:
description: Tenant ID of the service principal
required: true
subscription-id:
description: Subscription to use for resources
required: true
resource-group:
description: Resource group of the AML workspace
required: true
workspace-name:
description: Workspace name
required: true
example:
description: Example upload data pipeline to run
required: true

runs:
using: composite
steps:
- name: Setup python
uses: actions/setup-python@v2.2.1
with:
python-version: 3.8

- name: Azure login
uses: azure/login@v1
with:
client-id: ${{ inputs.client-id }}
tenant-id: ${{ inputs.tenant-id }}
subscription-id: ${{ inputs.subscription-id }}

- name: Install azure ml latest extension
shell: bash
run: |
az extension remove -n azure-cli-ml || echo "azure-cli-ml extension is not installed."
az extension remove -n ml || echo "ml extension is not installed."
az extension add -n ml -y
- name: Install python dependencies
shell: bash
run: pip install -r examples/pipelines/requirements.txt

- name: Submit example pipeline
shell: bash
run: python examples/pipelines/utils/upload_data/submit.py --subscription_id ${{ inputs.subscription-id }} --resource_group ${{ inputs.resource-group }} --workspace_name ${{ inputs.workspace-name }} --submit --wait --example ${{ inputs.example }} || [ $? == 5 ]

2 changes: 1 addition & 1 deletion .github/workflows/build-test.yaml
@@ -24,7 +24,7 @@ jobs:
if: github.base_ref

- name: Intall python dependencies
run: pip install -r examples/pipelines/fl_cross_silo_literal/requirements.txt
run: pip install -r examples/pipelines/requirements.txt

- name: Validate fl_cross_silo_literal pipeline
run: python examples/pipelines/fl_cross_silo_literal/submit.py
2 changes: 2 additions & 0 deletions .github/workflows/clear-pipeline-run-history.yaml
@@ -6,6 +6,8 @@ on:
jobs:
delete-history:
runs-on: ubuntu-latest
permissions:
id-token: write
steps:
- uses: actions/checkout@v2
