diff --git a/mlops-multi-account-cdk/mlops-infra/ADVANCED_TOPICS.md b/mlops-multi-account-cdk/mlops-infra/ADVANCED_TOPICS.md new file mode 100644 index 00000000..13154a85 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-infra/ADVANCED_TOPICS.md @@ -0,0 +1,39 @@ +# Advanced topics +The topics defined here assume you have already deployed the solution once following the steps in the main [README](README.md). + +- [Advanced topics](#advanced-topics) + - [Setup CodeCommit with this repository](#setup-codecommit-with-this-repository) + + +## Setup CodeCommit with this repository +After cloning this repository and deploying the solution, you may wonder how to start interacting with your deployed CodeCommit repository, use it as your main repository, and push changes to it. You have 2 options for this: +1. Clone the created CodeCommit repository and start treating it separately from this repository +2. Just use this folder as a repository + +For the second option, you can do the following (while you are in the folder `mlops-infra`): +``` +git init +``` +This will create a local git repository for this folder, separate from the main repository, so you can treat it as any git repo without impacting the main repository's git. Next, add the CodeCommit repository as a remote: +``` +git remote add origin https://git-codecommit.eu-west-1.amazonaws.com/v1/repos/mlops-infra +``` +Ensure you have configured your machine to connect to CodeCommit and run `git push` or `git pull` commands against it; follow [Step 3 from the AWS documentation](https://docs.aws.amazon.com/codecommit/latest/userguide/setting-up-https-unixes.html). + +Now you can interact with the CodeCommit repository as normal. 
You will need to do the following for the first commit: +``` +git add -A +git commit -m "first commit" +export AWS_PROFILE=mlops-governance +git push origin main +make init # this will enable precommit which will now block any further pushes to the main branch +``` + +Ensure that your git uses the branch name **main** by default; otherwise the push command might fail, and you will need to create a main branch and push your changes through it. + +If you want to push the changes you made back to the main repository this folder belongs to, you can run this command: +``` +rm -fr .git +``` +This removes the git settings from this folder so that it falls back to the main repository's settings. You can then raise a PR to include your changes in the main repository in GitHub. + diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/Dockerfile b/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/Dockerfile deleted file mode 100644 index 7057bb4f..00000000 --- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/Dockerfile +++ /dev/null @@ -1,40 +0,0 @@ -FROM public.ecr.aws/docker/library/python:3.7-buster as base - -RUN apt-get -y update && apt-get install -y \ - nginx \ - ca-certificates \ - policycoreutils \ - && rm -rf /var/lib/apt/lists/* - -ENV PATH="/usr/sbin/:${PATH}" - -COPY helpers/requirements.txt /requirements.txt - -RUN pip install --upgrade pip && pip install --no-cache -r /requirements.txt && \ - rm /requirements.txt -# Set up the program in the image -COPY helpers /opt/program - - -### start of TRAINING container -FROM base as xgboost -COPY training/xgboost/requirements.txt /requirements.txt -RUN pip install --no-cache -r /requirements.txt && \ - rm /requirements.txt - -# sm vars -ENV SAGEMAKER_MODEL_SERVER_TIMEOUT="300" -ENV MODEL_SERVER_TIMEOUT="300" -ENV PYTHONUNBUFFERED=TRUE -ENV PYTHONDONTWRITEBYTECODE=TRUE -ENV 
PATH="/opt/program:${PATH}" - -# env vars - -# Set up the program in the image -COPY training/xgboost /opt/program - -# set permissions of entrypoint -RUN chmod +x /opt/program/__main__.py - -WORKDIR /opt/program diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/.githooks/pre-commit b/mlops-multi-account-cdk/mlops-sm-project-template/.githooks/pre-commit similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/.githooks/pre-commit rename to mlops-multi-account-cdk/mlops-sm-project-template/.githooks/pre-commit diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/.gitignore b/mlops-multi-account-cdk/mlops-sm-project-template/.gitignore similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/.gitignore rename to mlops-multi-account-cdk/mlops-sm-project-template/.gitignore diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/.pre-commit-config.yaml b/mlops-multi-account-cdk/mlops-sm-project-template/.pre-commit-config.yaml new file mode 100644 index 00000000..288e4718 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/.pre-commit-config.yaml @@ -0,0 +1,75 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +repos: +# General +- repo: https://github.com/pre-commit/pre-commit-hooks + rev: v4.3.0 + hooks: + - id: check-case-conflict + - id: detect-private-key + - id: trailing-whitespace + - id: end-of-file-fixer + - id: mixed-line-ending + args: + - --fix=lf + exclude: /package-lock\.json$ + - id: check-added-large-files + args: + - --maxkb=1000 + - id: check-merge-conflict + - id: no-commit-to-branch + args: + - --branch + - main + - id: pretty-format-json + args: + - --autofix + - --indent=2 + - --no-sort-keys + exclude: /package-lock\.json$ +# Secrets +- repo: https://github.com/awslabs/git-secrets + rev: b9e96b3212fa06aea65964ff0d5cda84ce935f38 + hooks: + - id: git-secrets + entry: git-secrets --scan + files: . +- repo: https://github.com/psf/black + rev: 22.6.0 + hooks: + - id: black + args: ["--line-length=120"] +- repo: https://gitlab.com/PyCQA/flake8 + rev: 3.9.2 + hooks: + - id: flake8 + args: ["--ignore=E231,E501,F841,W503,F403,E266,W605,F541,F401,E302", "--exclude=app.py", "--max-line-length=120"] +- repo: https://github.com/Lucas-C/pre-commit-hooks + rev: v1.2.0 + hooks: + - id: forbid-crlf + - id: remove-crlf + - id: insert-license + files: \.(py|yaml)$ +- repo: local + hooks: + - id: clear-jupyter-notebooks + name: clear-jupyter-notebooks + entry: bash -c 'find . -type f -name "*.ipynb" -exec jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace "{}" \; && git add . 
&& exit 0' + language: system + pass_filenames: false diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/ADVANCED_TOPICS.md b/mlops-multi-account-cdk/mlops-sm-project-template/ADVANCED_TOPICS.md new file mode 100644 index 00000000..b007106e --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/ADVANCED_TOPICS.md @@ -0,0 +1,94 @@ +# Advanced topics +The topics defined here assume you have already deployed the solution once following the steps in the main [README](README.md). + +- [Advanced topics](#advanced-topics) + - [Setup CodeCommit with this repository](#setup-codecommit-with-this-repository) + - [Test the created sagemaker templates](#test-the-created-sagemaker-templates) + + +## Setup CodeCommit with this repository +After cloning this repository and deploying the solution, you may wonder how to start interacting with your deployed CodeCommit repository, use it as your main repository, and push changes to it. You have 2 options for this: +1. Clone the created CodeCommit repository and start treating it separately from this repository +2. Just use this folder as a repository + +For the second option, you can do the following (while you are in the folder `mlops-sm-project-template`): +``` +git init +``` +This will create a local git repository for this folder, separate from the main repository, so you can treat it as any git repo without impacting the main repository's git. Next, add the CodeCommit repository as a remote: +``` +git remote add origin https://git-codecommit.eu-west-1.amazonaws.com/v1/repos/mlops-sm-project-template +``` +Ensure you have configured your machine to connect to CodeCommit and run `git push` or `git pull` commands against it; follow [Step 3 from the AWS documentation](https://docs.aws.amazon.com/codecommit/latest/userguide/setting-up-https-unixes.html). + +Now you can interact with the CodeCommit repository as normal. 
You will need to do the following for the first commit: +``` +git add -A +git commit -m "first commit" +export AWS_PROFILE=mlops-governance +git push origin main +make init # this will enable precommit which will now block any further pushes to the main branch +``` + +Ensure that your git uses the branch name **main** by default; otherwise the push command might fail, and you will need to create a main branch and push your changes through it. + +If you want to push the changes you made back to the main repository this folder belongs to, you can run this command: +``` +rm -fr .git +``` +This removes the git settings from this folder so that it falls back to the main repository's settings. You can then raise a PR to include your changes in the main repository in GitHub. + + +## Test the created sagemaker templates +***NOTE:** make sure to run `cdk synth` before running any of the commands defined below.* + +You will need to deploy the `service catalog stack`, as that sets up your account with the required resources and SSM parameters before you can start testing your templates directly. 
If you don't have the service catalog stack already deployed in your account, you can achieve this by running the following command: +``` +cdk --app ./cdk.out/assembly-Personal deploy --all --profile mlops-dev +``` + +Otherwise, make sure you have these SSM parameters defined: +- in the dev account: + - /mlops/dev/account_id + - /mlops/code/seed_bucket + - /mlops/code/build + - /mlops/code/build/byoc + - /mlops/code/deploy +- in the preprod account: + - /mlops/preprod/account_id + - /mlops/preprod/region +- in the prod account: + - /mlops/prod/account_id + - /mlops/prod/region + +**OPTION 1** For quick testing of the sagemaker templates, you could deploy the JSON generated by CDK directly in your account by running the following command: +``` +aws cloudformation deploy \ + --template-file ./cdk.out/byoc-project-stack-dev.template.json \ + --stack-name byoc-project-stack-dev \ + --region eu-west-1 \ + --capabilities CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND \ + --disable-rollback \ + --s3-bucket \ + --profile mlops-dev \ + --parameter-overrides \ + SageMakerProjectName=mlops-test-0 \ + SageMakerProjectId=sm12340 +``` +This command deploys the byoc project stack. To deploy other templates, change `--template-file`; to create a new stack, change the other fields as well. 
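As a rough illustration of the SSM parameter list and the OPTION 1 command above, the sketch below checks which required parameter names are missing and assembles the `aws cloudformation deploy` call as an argument list. The helper names `missing_parameters` and `build_deploy_command` are hypothetical and not part of this repository; how the existing parameter names are fetched is left out, and the bucket value is a placeholder because the guide leaves `--s3-bucket` blank.

```python
import shlex

# SSM parameter names listed in the guide, grouped by account.
REQUIRED_SSM_PARAMETERS = {
    "dev": [
        "/mlops/dev/account_id",
        "/mlops/code/seed_bucket",
        "/mlops/code/build",
        "/mlops/code/build/byoc",
        "/mlops/code/deploy",
    ],
    "preprod": ["/mlops/preprod/account_id", "/mlops/preprod/region"],
    "prod": ["/mlops/prod/account_id", "/mlops/prod/region"],
}


def missing_parameters(account, existing_names):
    """Return the required names for `account` that are absent from `existing_names`."""
    existing = set(existing_names)
    return [name for name in REQUIRED_SSM_PARAMETERS[account] if name not in existing]


def build_deploy_command(template_file, stack_name, s3_bucket, project_name, project_id,
                         region="eu-west-1", profile="mlops-dev"):
    """Assemble the OPTION 1 `aws cloudformation deploy` call as an argument list."""
    return [
        "aws", "cloudformation", "deploy",
        "--template-file", template_file,
        "--stack-name", stack_name,
        "--region", region,
        "--capabilities", "CAPABILITY_NAMED_IAM", "CAPABILITY_AUTO_EXPAND",
        "--disable-rollback",
        "--s3-bucket", s3_bucket,
        "--profile", profile,
        "--parameter-overrides",
        f"SageMakerProjectName={project_name}",
        f"SageMakerProjectId={project_id}",
    ]


if __name__ == "__main__":
    command = build_deploy_command(
        template_file="./cdk.out/byoc-project-stack-dev.template.json",
        stack_name="byoc-project-stack-dev",
        s3_bucket="<your-bucket>",  # placeholder: the guide does not give a bucket name
        project_name="mlops-test-0",
        project_id="sm12340",
    )
    # Print a shell-quoted version that can be reviewed before running it.
    print(shlex.join(command))
```

Building the command as a list keeps each flag next to its value, which makes it easier to swap the template file or stack name when testing another template.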
+ +**OPTION 2** It is also possible to use the CDK CLI for this purpose, but this requires you to add the following to the `app.py` file: +``` +from mlops_sm_project_template.templates.byoc_project_stack import MLOpsStack + +MLOpsStack( + app, + "test", + env=deployment_env, +) +``` +Then run `cdk synth`, followed by this command to deploy: +``` +cdk deploy test --parameters SageMakerProjectName=mlops-test \ + --parameters SageMakerProjectId=sm1234 --profile mlops-dev +``` \ No newline at end of file diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/LICENSE.txt b/mlops-multi-account-cdk/mlops-sm-project-template/LICENSE.txt similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/LICENSE.txt rename to mlops-multi-account-cdk/mlops-sm-project-template/LICENSE.txt diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/Makefile b/mlops-multi-account-cdk/mlops-sm-project-template/Makefile similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/Makefile rename to mlops-multi-account-cdk/mlops-sm-project-template/Makefile diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/README.md similarity index 95% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/README.md rename to mlops-multi-account-cdk/mlops-sm-project-template/README.md index 2192cc07..8925f9d7 100644 --- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/README.md +++ b/mlops-multi-account-cdk/mlops-sm-project-template/README.md @@ -118,12 +118,12 @@ There are 2 way to trigger the deployment CI/CD Pipeline: - **Model Events** - These are events which get triggered through a status change to the model package group in SageMaker Model Registry. - **Code Events** - The pipeline is triggered on git update events over a specific branch, in this solution it is linked to the **main** branch. 
-**Note:** For the deployment stages for **PREPROD** and **PROD**, the roles defined for cloudformation deployment in `mlops_sm_project_template_rt/templates/constructs/deploy_pipeline_construct.py` lines 284-292 and lines 317-326 are created when the **PREPROD** and **PROD** are bootstrapped with CDK with trust policies for the deployment CI/CD pipeline account (**DEV** account in our solution); the roles must be created before deploying this stack to any account along with trust policies included between the accounts and the roles. If you can bootstrap those accounts for any reason you should ensure to create similar roles in each of those accounts and adding them to the lines mentioned above in the file. +**Note:** For the deployment stages for **PREPROD** and **PROD**, the roles defined for cloudformation deployment in `mlops_sm_project_template/templates/constructs/deploy_pipeline_construct.py` lines 284-292 and lines 317-326 are created when the **PREPROD** and **PROD** accounts are bootstrapped with CDK with trust policies for the deployment CI/CD pipeline account (**DEV** account in our solution); the roles must be created before deploying this stack to any account, along with trust policies between the accounts and the roles. If you cannot bootstrap those accounts for any reason, you should create similar roles in each of those accounts and add them to the lines mentioned above in the file. ### CodeCommit Stack *This stack is only needed if you want to handle deployments of this folder of the repository to be managed through a CICD pipeline.* -This stack handles setting up an AWS CodeCommit repository for this folder of the repository. This repository will be used as the source for the CI/CD pipeline defined in [Pipeline Stack](#pipeline-stack). The repository will be named based on the value defined in `mlops_sm_project_template_rt/config/constants.py` with this variable `CODE_COMMIT_REPO_NAME`. 
The repository will be intialised with a default branch as defined in the `constants.py` file under `PIPELINE_BRANCH` variable. +This stack handles setting up an AWS CodeCommit repository for this folder of the repository. This repository will be used as the source for the CI/CD pipeline defined in [Pipeline Stack](#pipeline-stack). The repository will be named based on the value defined in `mlops_sm_project_template/config/constants.py` with this variable `CODE_COMMIT_REPO_NAME`. The repository will be initialised with a default branch as defined in the `constants.py` file under `PIPELINE_BRANCH` variable. ### Pipeline Stack @@ -131,7 +131,7 @@ This stack handles setting up an AWS CodeCommit repository for this folder of th The CICD pipeline in this repository is setup to monitor an AWS CodeCommit repository as defined in [CodeCommit Stack](#codecommit-stack). -If you are using other sources like github or bitbucket for your repository, you will need to modify the connection to the appropriate repository as defined in `mlops_sm_project_template_rt/pipeline_stack.py`. This can be done using AWS CodeStar but must be setup on the account. +If you are using other sources like GitHub or Bitbucket for your repository, you will need to modify the connection to the appropriate repository as defined in `mlops_sm_project_template/pipeline_stack.py`. This can be done using AWS CodeStar but must be set up on the account. Make sure the pipelines also point to your targeted branch; by default the pipeline is linked to `main` branch events, this is defined in the `constants.py` file under `PIPELINE_BRANCH` variable. @@ -162,7 +162,7 @@ This is an AWS CDK project written in Python 3.8. Here's what you need to have o ├── app.py ├── cdk.json ├── diagrams -├── mlops_sm_project_template_rt +├── mlops_sm_project_template │   ├── README.md │   ├── __init__.py │   ├── cdk_helper_scripts @@ -219,7 +219,7 @@ aws_session_token = YOUR_SESSION_TOKEN # this token is generated if you are usi ... 
``` -Before you start with the deployment of the solution make sure to bootstrap your accounts. Ensure you add the account details in `mlops_sm_project_template_rt/config/constants.py` mainly the target deployment accounts: **DEV**, **PREPROD** and **PROD**. +Before you start with the deployment of the solution make sure to bootstrap your accounts. Ensure you add the account details in `mlops_sm_project_template/config/constants.py` mainly the target deployment accounts: **DEV**, **PREPROD** and **PROD**. ``` PIPELINE_ACCOUNT = "" # account to host the pipeline handling updates of this repository diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/app.py b/mlops-multi-account-cdk/mlops-sm-project-template/app.py similarity index 85% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/app.py rename to mlops-multi-account-cdk/mlops-sm-project-template/app.py index 889a0340..bcfb24ef 100644 --- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/app.py +++ b/mlops-multi-account-cdk/mlops-sm-project-template/app.py @@ -18,9 +18,9 @@ import aws_cdk as cdk import os -from mlops_sm_project_template_rt.pipeline_stack import PipelineStack, CoreStage -from mlops_sm_project_template_rt.codecommit_stack import CodeCommitStack -from mlops_sm_project_template_rt.config.constants import DEFAULT_DEPLOYMENT_REGION, PIPELINE_ACCOUNT, DEV_ACCOUNT +from mlops_sm_project_template.pipeline_stack import PipelineStack, CoreStage +from mlops_sm_project_template.codecommit_stack import CodeCommitStack +from mlops_sm_project_template.config.constants import DEFAULT_DEPLOYMENT_REGION, PIPELINE_ACCOUNT, DEV_ACCOUNT app = cdk.App() diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/cdk.json b/mlops-multi-account-cdk/mlops-sm-project-template/cdk.json similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/cdk.json rename to mlops-multi-account-cdk/mlops-sm-project-template/cdk.json diff --git 
a/mlops-multi-account-cdk/mlops-sm-project-template-rt/diagrams/MLOPs Foundation Architecture-mlops project cicd architecture.jpg b/mlops-multi-account-cdk/mlops-sm-project-template/diagrams/MLOPs Foundation Architecture-mlops project cicd architecture.jpg similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/diagrams/MLOPs Foundation Architecture-mlops project cicd architecture.jpg rename to mlops-multi-account-cdk/mlops-sm-project-template/diagrams/MLOPs Foundation Architecture-mlops project cicd architecture.jpg diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/diagrams/MLOPs Foundation Architecture-sagemaker project architecture.jpg b/mlops-multi-account-cdk/mlops-sm-project-template/diagrams/MLOPs Foundation Architecture-sagemaker project architecture.jpg similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/diagrams/MLOPs Foundation Architecture-sagemaker project architecture.jpg rename to mlops-multi-account-cdk/mlops-sm-project-template/diagrams/MLOPs Foundation Architecture-sagemaker project architecture.jpg diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/diagrams/building.png b/mlops-multi-account-cdk/mlops-sm-project-template/diagrams/building.png similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/diagrams/building.png rename to mlops-multi-account-cdk/mlops-sm-project-template/diagrams/building.png diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/diagrams/deployment.png b/mlops-multi-account-cdk/mlops-sm-project-template/diagrams/deployment.png similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/diagrams/deployment.png rename to mlops-multi-account-cdk/mlops-sm-project-template/diagrams/deployment.png diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/__init__.py 
b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/__init__.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/__init__.py rename to mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/__init__.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/cdk_helper_scripts/zip-image/Dockerfile b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/cdk_helper_scripts/zip-image/Dockerfile similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/cdk_helper_scripts/zip-image/Dockerfile rename to mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/cdk_helper_scripts/zip-image/Dockerfile diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/codecommit_stack.py b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/codecommit_stack.py similarity index 94% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/codecommit_stack.py rename to mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/codecommit_stack.py index 611bc5f2..b62ef866 100644 --- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/codecommit_stack.py +++ b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/codecommit_stack.py @@ -26,10 +26,7 @@ from constructs import Construct -from mlops_sm_project_template_rt.config.constants import ( - CODE_COMMIT_REPO_NAME, - PIPELINE_BRANCH -) +from mlops_sm_project_template.config.constants import CODE_COMMIT_REPO_NAME, PIPELINE_BRANCH class CodeCommitStack(Stack): @@ -52,7 +49,7 @@ def __init__( "DeployAsset", path="", bundling=BundlingOptions( - 
image=DockerImage.from_build("mlops_sm_project_template_rt/cdk_helper_scripts/zip-image"), + image=DockerImage.from_build("mlops_sm_project_template/cdk_helper_scripts/zip-image"), command=[ "sh", "-c", diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/config/constants.py b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/config/constants.py similarity index 96% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/config/constants.py rename to mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/config/constants.py index 1568cc11..cd64f57d 100644 --- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/config/constants.py +++ b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/config/constants.py @@ -15,7 +15,7 @@ # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE # SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
-CODE_COMMIT_REPO_NAME = "mlops-sm-project-template-rt" +CODE_COMMIT_REPO_NAME = "mlops-sm-project-template" PIPELINE_BRANCH = "main" PIPELINE_ACCOUNT = "" # account used to host the pipeline handling updates of this repository diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/pipeline_stack.py b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/pipeline_stack.py similarity index 97% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/pipeline_stack.py rename to mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/pipeline_stack.py index 7b5b2bda..2d34267b 100644 --- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/pipeline_stack.py +++ b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/pipeline_stack.py @@ -26,7 +26,7 @@ from constructs import Construct -from mlops_sm_project_template_rt.config.constants import ( +from mlops_sm_project_template.config.constants import ( APP_PREFIX, CODE_COMMIT_REPO_NAME, DEV_ACCOUNT, @@ -34,7 +34,7 @@ PIPELINE_BRANCH, ) -from mlops_sm_project_template_rt.service_catalog_stack import ServiceCatalogStack +from mlops_sm_project_template.service_catalog_stack import ServiceCatalogStack class CoreStage(Stage): diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/service_catalog_stack.py b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/service_catalog_stack.py similarity index 81% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/service_catalog_stack.py rename to mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/service_catalog_stack.py index 1a890dde..99e38f53 100644 --- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/service_catalog_stack.py +++ 
b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/service_catalog_stack.py @@ -37,8 +37,8 @@ from constructs import Construct -from mlops_sm_project_template_rt.templates.basic_project_stack import MLOpsStack -from mlops_sm_project_template_rt.ssm_construct import SSMConstruct +from mlops_sm_project_template.templates.basic_project_stack import MLOpsStack +from mlops_sm_project_template.ssm_construct import SSMConstruct # Get environment variables LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO").upper() @@ -137,6 +137,9 @@ def __init__( products_launch_role.add_managed_policy( iam.ManagedPolicy.from_aws_managed_policy_name("AmazonSSMReadOnlyAccess") ) + products_launch_role.add_managed_policy( + iam.ManagedPolicy.from_aws_managed_policy_name("AmazonEC2ContainerRegistryFullAccess") + ) products_launch_role.add_to_policy( iam.PolicyStatement( @@ -196,59 +199,61 @@ def __init__( principal_type="IAM", ) - product = servicecatalog.CloudFormationProduct( - self, - "DeployProduct", - owner=portfolio_owner, - product_name=MLOpsStack.TEMPLATE_NAME, - product_versions=[ - servicecatalog.CloudFormationProductVersion( - cloud_formation_template=servicecatalog.CloudFormationTemplate.from_asset( - self.generate_template(MLOpsStack, f"MLOpsApp-{stage_name}", **kwargs) - ), - product_version_name=product_version, - ) - ], - description=MLOpsStack.DESCRIPTION, - ) + # product = servicecatalog.CloudFormationProduct( + # self, + # "DeployProduct", + # owner=portfolio_owner, + # product_name=MLOpsStack.TEMPLATE_NAME, + # product_versions=[ + # servicecatalog.CloudFormationProductVersion( + # cloud_formation_template=servicecatalog.CloudFormationTemplate.from_asset( + # self.generate_template(MLOpsStack, f"MLOpsApp-{stage_name}", **kwargs) + # ), + # product_version_name=product_version, + # ) + # ], + # description=MLOpsStack.DESCRIPTION, + # ) - portfolio_association.node.add_dependency(product) + # portfolio_association.node.add_dependency(product) - # Add 
product tags, and create role constraint for each product + # # Add product tags, and create role constraint for each product - portfolio.add_product(product) + # portfolio.add_product(product) - Tags.of(product).add(key="sagemaker:studio-visibility", value="true") + # Tags.of(product).add(key="sagemaker:studio-visibility", value="true") - role_constraint = servicecatalog.CfnLaunchRoleConstraint( - self, - f"LaunchRoleConstraint", - portfolio_id=portfolio.portfolio_id, - product_id=product.product_id, - role_arn=products_launch_role.role_arn, - description=f"Launch as {products_launch_role.role_arn}", - ) - role_constraint.add_depends_on(portfolio_association) + # role_constraint = servicecatalog.CfnLaunchRoleConstraint( + # self, + # f"LaunchRoleConstraint", + # portfolio_id=portfolio.portfolio_id, + # product_id=product.product_id, + # role_arn=products_launch_role.role_arn, + # description=f"Launch as {products_launch_role.role_arn}", + # ) + # role_constraint.add_depends_on(portfolio_association) # uncomment this block if you want to create service catalog products based on all templates - # make sure you comment out lines 213-247 - # products = self.deploy_all_products( - # portfolio_association, - # portfolio, - # products_launch_role, - # portfolio_owner, - # product_version, - # stage_name, - # **kwargs, - # ) + # make sure you comment out lines 202-234 + products = self.deploy_all_products( + portfolio_association, + portfolio, + products_launch_role, + portfolio_owner, + product_version, + stage_name, + **kwargs, + ) # Create the build and deployment asset as an output to pass to pipeline stack + zip_image = DockerImage.from_build("mlops_sm_project_template/cdk_helper_scripts/zip-image") + build_app_asset = s3_assets.Asset( self, "BuildAsset", path="seed_code/build_app/", bundling=BundlingOptions( - image=DockerImage.from_build("mlops_sm_project_template_rt/cdk_helper_scripts/zip-image"), + image=zip_image, command=[ "sh", "-c", @@ -258,12 +263,27 @@ def 
__init__( ), ) + byoc_build_app_asset = s3_assets.Asset( + self, + "BYOCBuildAsset", + path="seed_code/byoc_build_app/", + bundling=BundlingOptions( + image=zip_image, + command=[ + "sh", + "-c", + """zip -r /asset-output/byoc_build_app.zip .""", + ], + output_type=BundlingOutput.ARCHIVED, + ), + ) + deploy_app_asset = s3_assets.Asset( self, "DeployAsset", path="seed_code/deploy_app/", bundling=BundlingOptions( - image=DockerImage.from_build("mlops_sm_project_template_rt/cdk_helper_scripts/zip-image"), + image=zip_image, command=[ "sh", "-c", @@ -275,6 +295,7 @@ def __init__( build_app_asset.grant_read(grantee=products_launch_role) deploy_app_asset.grant_read(grantee=products_launch_role) + byoc_build_app_asset.grant_read(grantee=products_launch_role) # Output the deployment bucket and key, for input into pipeline stack self.export_ssm( @@ -287,6 +308,11 @@ def __init__( "/mlops/code/build", build_app_asset.s3_object_key, ) + self.export_ssm( + "BYOCCodeBuildKey", + "/mlops/code/build/byoc", + byoc_build_app_asset.s3_object_key, + ) self.export_ssm( "CodeDeployKey", "/mlops/code/deploy", @@ -303,7 +329,7 @@ def deploy_all_products( portfolio_owner: str, product_version: str, stage_name: str, - templates_directory: str = "mlops_sm_project_template_rt/templates", + templates_directory: str = "mlops_sm_project_template/templates", **kwargs, ): @@ -314,7 +340,7 @@ def deploy_all_products( if filename.endswith("_stack.py"): template_py_file = filename[:-3] - template_module = importlib.import_module(f"mlops_sm_project_template_rt.templates.{template_py_file}") + template_module = importlib.import_module(f"mlops_sm_project_template.templates.{template_py_file}") template_py_file = template_py_file.replace("_", "-") diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/ssm_construct.py b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/ssm_construct.py similarity index 97% rename from 
mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/ssm_construct.py rename to mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/ssm_construct.py index a768c72b..9fc3eb46 100644 --- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/ssm_construct.py +++ b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/ssm_construct.py @@ -21,7 +21,7 @@ from constructs import Construct -from mlops_sm_project_template_rt.config.constants import ( +from mlops_sm_project_template.config.constants import ( DEV_ACCOUNT, PREPROD_ACCOUNT, PROD_ACCOUNT, diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/templates/basic_project_stack.py b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/basic_project_stack.py similarity index 95% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/templates/basic_project_stack.py rename to mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/basic_project_stack.py index 30f2bd67..886c966a 100644 --- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/templates/basic_project_stack.py +++ b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/basic_project_stack.py @@ -31,18 +31,18 @@ from constructs import Construct -from mlops_sm_project_template_rt.templates.pipeline_constructs.build_pipeline_construct import ( +from mlops_sm_project_template.templates.pipeline_constructs.build_pipeline_construct import ( BuildPipelineConstruct, ) -from mlops_sm_project_template_rt.templates.pipeline_constructs.deploy_pipeline_construct import ( +from mlops_sm_project_template.templates.pipeline_constructs.deploy_pipeline_construct import ( DeployPipelineConstruct, ) -from mlops_sm_project_template_rt.config.constants import PREPROD_ACCOUNT, 
PROD_ACCOUNT, DEFAULT_DEPLOYMENT_REGION +from mlops_sm_project_template.config.constants import PREPROD_ACCOUNT, PROD_ACCOUNT, DEFAULT_DEPLOYMENT_REGION class MLOpsStack(Stack): - DESCRIPTION: str = "This template includes a model building pipeline that includes a workflow to pre-process, train, evaluate and register a model. The deploy pipeline creates a preprod and production endpoint. The target DEV/PREPROD/PROD accounts are predefined in the template." + DESCRIPTION: str = "This template includes a model building pipeline that includes a workflow to pre-process, train, evaluate and register a model. The deploy pipeline creates a dev, preprod and production endpoint. The target DEV/PREPROD/PROD accounts are predefined in the template." TEMPLATE_NAME: str = "Basic MLOps template for real-time deployment" def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None: diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/byoc_pipeline_constructs/build_pipeline_construct.py b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/byoc_pipeline_constructs/build_pipeline_construct.py new file mode 100644 index 00000000..a6434c47 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/byoc_pipeline_constructs/build_pipeline_construct.py @@ -0,0 +1,270 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so.
+# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +from aws_cdk import ( + Aws, + aws_codecommit as codecommit, + aws_codebuild as codebuild, + aws_s3 as s3, + aws_iam as iam, + aws_codepipeline as codepipeline, + aws_codepipeline_actions as codepipeline_actions, +) +import aws_cdk +from constructs import Construct + + +class BuildPipelineConstruct(Construct): + def __init__( + self, + scope: Construct, + construct_id: str, + project_name: str, + project_id: str, + s3_artifact: s3.IBucket, + pipeline_artifact_bucket: s3.IBucket, + model_package_group_name: str, + ecr_repository_name: str, + repo_s3_bucket_name: str, + repo_s3_object_key: str, + **kwargs, + ) -> None: + super().__init__(scope, construct_id, **kwargs) + + # Define resource names + pipeline_name = f"{project_name}-{construct_id}" + pipeline_description = f"{project_name} Model Build Pipeline" + + # Create source repo from seed bucket/key + build_app_cfnrepository = codecommit.CfnRepository( + self, + "BuildAppCodeRepo", + repository_name=f"{project_name}-{construct_id}", + code=codecommit.CfnRepository.CodeProperty( + s3=codecommit.CfnRepository.S3Property( + bucket=repo_s3_bucket_name, + key=repo_s3_object_key, + object_version=None, + ), + branch_name="main", + ), + tags=[ + aws_cdk.CfnTag(key="sagemaker:project-id", value=project_id), + aws_cdk.CfnTag(key="sagemaker:project-name", value=project_name), + ], + ) + + # Reference the newly created repository + build_app_repository = codecommit.Repository.from_repository_name( + self, "ImportedBuildRepo", 
build_app_cfnrepository.attr_name + ) + + codebuild_role = iam.Role( + self, + "CodeBuildRole", + assumed_by=iam.ServicePrincipal("codebuild.amazonaws.com"), + path="/service-role/", + ) + + sagemaker_execution_role = iam.Role( + self, + "SageMakerExecutionRole", + assumed_by=iam.ServicePrincipal("sagemaker.amazonaws.com"), + path="/service-role/", + ) + + # Create a policy statement for SM and ECR pull + sagemaker_policy = iam.Policy( + self, + "SageMakerPolicy", + document=iam.PolicyDocument( + statements=[ + iam.PolicyStatement( + actions=[ + "logs:CreateLogGroup", + "logs:CreateLogStream", + "logs:PutLogEvents", + ], + resources=["*"], + ), + iam.PolicyStatement( + actions=["sagemaker:*"], + not_resources=[ + "arn:aws:sagemaker:*:*:domain/*", + "arn:aws:sagemaker:*:*:user-profile/*", + "arn:aws:sagemaker:*:*:app/*", + "arn:aws:sagemaker:*:*:flow-definition/*", + ], + ), + iam.PolicyStatement( + actions=[ + "ecr:BatchCheckLayerAvailability", + "ecr:BatchGetImage", + "ecr:Describe*", + "ecr:GetAuthorizationToken", + "ecr:GetDownloadUrlForLayer", + ], + resources=["*"], + ), + iam.PolicyStatement( + actions=[ + "cloudwatch:PutMetricData", + ], + resources=["*"], + ), + iam.PolicyStatement( + actions=[ + "s3:AbortMultipartUpload", + "s3:DeleteObject", + "s3:GetBucket*", + "s3:GetObject*", + "s3:List*", + "s3:PutObject*", + "s3:Create*", + ], + resources=[ + s3_artifact.bucket_arn, + f"{s3_artifact.bucket_arn}/*", + "arn:aws:s3:::sagemaker-*", + ], + ), + iam.PolicyStatement( + actions=["iam:PassRole"], + resources=[sagemaker_execution_role.role_arn], + ), + iam.PolicyStatement( + actions=[ + "kms:Encrypt", + "kms:ReEncrypt*", + "kms:GenerateDataKey*", + "kms:Decrypt", + "kms:DescribeKey", + ], + effect=iam.Effect.ALLOW, + resources=[f"arn:aws:kms:{Aws.REGION}:{Aws.ACCOUNT_ID}:key/*"], + ), + ] + ), + ) + + sagemaker_policy.attach_to_role(sagemaker_execution_role) + sagemaker_policy.attach_to_role(codebuild_role) + + sm_pipeline_build = codebuild.PipelineProject( + 
self, + "SMPipelineBuild", + project_name=f"{project_name}-{construct_id}", + role=codebuild_role, # TODO: determine the minimal permissions this role actually needs + build_spec=codebuild.BuildSpec.from_source_filename("buildspec.yml"), + environment=codebuild.BuildEnvironment( + build_image=codebuild.LinuxBuildImage.STANDARD_5_0, + environment_variables={ + "SAGEMAKER_PROJECT_NAME": codebuild.BuildEnvironmentVariable(value=project_name), + "SAGEMAKER_PROJECT_ID": codebuild.BuildEnvironmentVariable(value=project_id), + "MODEL_PACKAGE_GROUP_NAME": codebuild.BuildEnvironmentVariable(value=model_package_group_name), + "AWS_REGION": codebuild.BuildEnvironmentVariable(value=Aws.REGION), + "SAGEMAKER_PIPELINE_NAME": codebuild.BuildEnvironmentVariable( + value=pipeline_name, + ), + "SAGEMAKER_PIPELINE_DESCRIPTION": codebuild.BuildEnvironmentVariable( + value=pipeline_description, + ), + "SAGEMAKER_PIPELINE_ROLE_ARN": codebuild.BuildEnvironmentVariable( + value=sagemaker_execution_role.role_arn, + ), + "ARTIFACT_BUCKET": codebuild.BuildEnvironmentVariable(value=s3_artifact.bucket_name), + "ARTIFACT_BUCKET_KMS_ID": codebuild.BuildEnvironmentVariable( + value=s3_artifact.encryption_key.key_id + ), + "ECR_REPO_URI": codebuild.BuildEnvironmentVariable( + value=f"{Aws.ACCOUNT_ID}.dkr.ecr.{Aws.REGION}.amazonaws.com/{ecr_repository_name}" + ), + }, + ), + ) + + # CodeBuild project that builds the container image by running source_scripts/docker-build.sh + docker_build = codebuild.Project( + self, + "DockerBuild", + build_spec=codebuild.BuildSpec.from_object( + { + "version": 0.2, + "phases": { + "build": { + "commands": [ + "cd source_scripts", + "chmod +x docker-build.sh", + f"./docker-build.sh {ecr_repository_name}", + ] + }, + }, + } + ), + environment=codebuild.BuildEnvironment(build_image=codebuild.LinuxBuildImage.STANDARD_5_0, privileged=True), + ) + + docker_build.add_to_role_policy( + iam.PolicyStatement( + actions=["ecr:*"], + effect=iam.Effect.ALLOW, +
resources=[f"arn:aws:ecr:{Aws.REGION}:{Aws.ACCOUNT_ID}:repository/{ecr_repository_name}"], + ) + ) + + docker_build.add_to_role_policy( + iam.PolicyStatement( + actions=["ecr:Get*"], + effect=iam.Effect.ALLOW, + resources=["*"], + ) + ) + + source_artifact = codepipeline.Artifact(artifact_name="GitSource") + + build_pipeline = codepipeline.Pipeline( + self, + "Pipeline", + pipeline_name=f"{project_name}-{construct_id}", + artifact_bucket=pipeline_artifact_bucket, + ) + + # add a source stage + source_stage = build_pipeline.add_stage(stage_name="Source") + source_stage.add_action( + codepipeline_actions.CodeCommitSourceAction( + action_name="Source", + output=source_artifact, + repository=build_app_repository, + branch="main", + ) + ) + + # add a build stage + build_stage = build_pipeline.add_stage(stage_name="Build") + + build_stage.add_action( + codepipeline_actions.CodeBuildAction( + action_name="DockerBuild", input=source_artifact, project=docker_build, run_order=1 + ) + ) + + build_stage.add_action( + codepipeline_actions.CodeBuildAction( + action_name="SMPipeline", input=source_artifact, project=sm_pipeline_build, run_order=2 + ) + ) diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/byoc_pipeline_constructs/deploy_pipeline_construct.py b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/byoc_pipeline_constructs/deploy_pipeline_construct.py new file mode 100644 index 00000000..27280fcd --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/byoc_pipeline_constructs/deploy_pipeline_construct.py @@ -0,0 +1,352 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 
+# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +from aws_cdk import ( + Aws, + CfnCapabilities, + aws_codecommit as codecommit, + aws_codebuild as codebuild, + aws_codepipeline_actions as codepipeline_actions, + aws_codepipeline as codepipeline, + aws_events as events, + aws_events_targets as targets, + aws_s3 as s3, + aws_iam as iam, +) +import aws_cdk +from constructs import Construct + + +class DeployPipelineConstruct(Construct): + def __init__( + self, + scope: Construct, + construct_id: str, + project_name: str, + project_id: str, + pipeline_artifact_bucket: s3.IBucket, + model_package_group_name: str, + ecr_repo_arn: str, + model_bucket_arn: str, + repo_s3_bucket_name: str, + repo_s3_object_key: str, + preprod_account: int, + prod_account: int, + deployment_region: str, + **kwargs, + ) -> None: + super().__init__(scope, construct_id, **kwargs) + + # Define resource names + pipeline_name = f"{project_name}-{construct_id}" + + # Create source repo from seed bucket/key + deploy_app_cfnrepository = codecommit.CfnRepository( + self, + "BuildAppCodeRepo", + 
repository_name=f"{project_name}-{construct_id}", + code=codecommit.CfnRepository.CodeProperty( + s3=codecommit.CfnRepository.S3Property( + bucket=repo_s3_bucket_name, + key=repo_s3_object_key, + object_version=None, + ), + branch_name="main", + ), + tags=[ + aws_cdk.CfnTag(key="sagemaker:project-id", value=project_id), + aws_cdk.CfnTag(key="sagemaker:project-name", value=project_name), + ], + ) + + # Reference the newly created repository + deploy_app_repository = codecommit.Repository.from_repository_name( + self, "ImportedDeployRepo", deploy_app_cfnrepository.attr_name + ) + + cdk_synth_build_role = iam.Role( + self, + "CodeBuildRole", + assumed_by=iam.ServicePrincipal("codebuild.amazonaws.com"), + path="/service-role/", + ) + + cdk_synth_build_role.add_to_policy( + iam.PolicyStatement( + actions=["sagemaker:ListModelPackages"], + resources=[ + f"arn:{Aws.PARTITION}:sagemaker:{Aws.REGION}:{Aws.ACCOUNT_ID}:model-package-group/{project_name}-{project_id}*", + f"arn:{Aws.PARTITION}:sagemaker:{Aws.REGION}:{Aws.ACCOUNT_ID}:model-package/{project_name}-{project_id}/*", + ], + ) + ) + + cdk_synth_build_role.add_to_policy( + iam.PolicyStatement( + actions=["ssm:GetParameter"], + resources=[ + f"arn:{Aws.PARTITION}:ssm:{Aws.REGION}:{Aws.ACCOUNT_ID}:parameter/*", + ], + ) + ) + + cdk_synth_build_role.add_to_policy( + iam.PolicyStatement( + actions=[ + "kms:Encrypt", + "kms:ReEncrypt*", + "kms:GenerateDataKey*", + "kms:Decrypt", + "kms:DescribeKey", + ], + effect=iam.Effect.ALLOW, + resources=[f"arn:aws:kms:{Aws.REGION}:{Aws.ACCOUNT_ID}:key/*"], + ), + ) + + cdk_synth_build = codebuild.PipelineProject( + self, + "CDKSynthBuild", + role=cdk_synth_build_role, + build_spec=codebuild.BuildSpec.from_object( + { + "version": "0.2", + "phases": { + "build": { + "commands": [ + "npm install -g aws-cdk", + "pip install -r requirements.txt", + "cdk synth --no-lookups", + ] + } + }, + "artifacts": {"base-directory": "cdk.out", "files": "**/*"}, + } + ), + 
environment=codebuild.BuildEnvironment( + build_image=codebuild.LinuxBuildImage.STANDARD_5_0, + environment_variables={ + "MODEL_PACKAGE_GROUP_NAME": codebuild.BuildEnvironmentVariable(value=model_package_group_name), + "PROJECT_ID": codebuild.BuildEnvironmentVariable(value=project_id), + "PROJECT_NAME": codebuild.BuildEnvironmentVariable(value=project_name), + "ECR_REPO_ARN": codebuild.BuildEnvironmentVariable(value=ecr_repo_arn), + "MODEL_BUCKET_ARN": codebuild.BuildEnvironmentVariable(value=model_bucket_arn), + }, + ), + ) + + # code build to include security scan over cloudformation template + security_scan = codebuild.Project( + self, + "SecurityScanTooling", + build_spec=codebuild.BuildSpec.from_object( + { + "version": 0.2, + "env": { + "shell": "bash", + "variables": { + "TemplateFolder": "./*.template.json", + "FAIL_BUILD": "true", + }, + }, + "phases": { + "install": { + "runtime-versions": {"ruby": 2.6}, + "commands": [ + "export date=`date +%Y-%m-%dT%H:%M:%S.%NZ`", + "echo Installing cfn_nag - `pwd`", + "gem install cfn-nag", + "echo cfn_nag installation complete `date`", + ], + }, + "build": { + "commands": [ + "echo Starting cfn scanning `date` in `pwd`", + "echo 'RulesToSuppress:\n- id: W58\n reason: W58 is a warning raised because Lambda functions require permission to write CloudWatch Logs; although the Lambda role contains the policy that supports these permissions, cfn_nag continues to throw this warning (https://github.com/stelligent/cfn_nag/issues/422)' > cfn_nag_ignore.yml", # this is a temporary workaround for an issue with the W58 rule in cfn_nag + 'mkdir report || echo "dir report exists"', + "SCAN_RESULT=$(cfn_nag_scan --fail-on-warnings --deny-list-path cfn_nag_ignore.yml --input-path ${TemplateFolder} -o json > ./report/cfn_nag.out.json && echo OK || echo FAILED)", + "echo Completed cfn scanning `date`", + "echo $SCAN_RESULT", + "echo $FAIL_BUILD", + """if [[ "$FAIL_BUILD" = "true" && "$SCAN_RESULT" = "FAILED" ]]; then printf "\n\nFailing
pipeline as possible insecure configurations were detected\n\n" && exit 1; fi""", + ] + }, + }, + "artifacts": {"files": "./report/cfn_nag.out.json"}, + } + ), + environment=codebuild.BuildEnvironment( + build_image=codebuild.LinuxBuildImage.STANDARD_5_0, + ), + ) + + source_artifact = codepipeline.Artifact(artifact_name="GitSource") + cdk_synth_artifact = codepipeline.Artifact(artifact_name="CDKSynth") + cfn_nag_artifact = codepipeline.Artifact(artifact_name="CfnNagScanReport") + + deploy_code_pipeline = codepipeline.Pipeline( + self, + "DeployPipeline", + cross_account_keys=True, + pipeline_name=pipeline_name, + artifact_bucket=pipeline_artifact_bucket, + ) + + # add a source stage + source_stage = deploy_code_pipeline.add_stage(stage_name="Source") + source_stage.add_action( + codepipeline_actions.CodeCommitSourceAction( + action_name="Source", + output=source_artifact, + repository=deploy_app_repository, + branch="main", + ) + ) + + # add a build stage + build_stage = deploy_code_pipeline.add_stage(stage_name="Build") + + build_stage.add_action( + codepipeline_actions.CodeBuildAction( + action_name="Synth", + input=source_artifact, + outputs=[cdk_synth_artifact], + project=cdk_synth_build, + ) + ) + + # add a security evaluation stage for cloudformation templates + security_stage = deploy_code_pipeline.add_stage(stage_name="SecurityEvaluation") + + security_stage.add_action( + codepipeline_actions.CodeBuildAction( + action_name="CFNNag", + input=cdk_synth_artifact, + outputs=[cfn_nag_artifact], + project=security_scan, + ) + ) + + # add stages to deploy to the different environments + deploy_code_pipeline.add_stage( + stage_name="DeployDev", + actions=[ + codepipeline_actions.CloudFormationCreateUpdateStackAction( + action_name="Deploy_CFN_Dev", + run_order=1, + template_path=cdk_synth_artifact.at_path("dev.template.json"), + stack_name=f"{project_name}-{construct_id}-dev", + admin_permissions=False, + replace_on_failure=True, + role=iam.Role.from_role_arn( + 
self, + "DevActionRole", + f"arn:{Aws.PARTITION}:iam::{Aws.ACCOUNT_ID}:role/cdk-hnb659fds-deploy-role-{Aws.ACCOUNT_ID}-{Aws.REGION}", + ), + deployment_role=iam.Role.from_role_arn( + self, + "DevDeploymentRole", + f"arn:{Aws.PARTITION}:iam::{Aws.ACCOUNT_ID}:role/cdk-hnb659fds-cfn-exec-role-{Aws.ACCOUNT_ID}-{Aws.REGION}", + ), + cfn_capabilities=[ + CfnCapabilities.AUTO_EXPAND, + CfnCapabilities.NAMED_IAM, + ], + ), + codepipeline_actions.ManualApprovalAction( + action_name="Approve_PreProd", + run_order=2, + additional_information="Approving deployment for preprod", + ), + ], + ) + + deploy_code_pipeline.add_stage( + stage_name="DeployPreProd", + actions=[ + codepipeline_actions.CloudFormationCreateUpdateStackAction( + action_name="Deploy_CFN_PreProd", + run_order=1, + template_path=cdk_synth_artifact.at_path("preprod.template.json"), + stack_name=f"{project_name}-{construct_id}-preprod", + admin_permissions=False, + replace_on_failure=True, + role=iam.Role.from_role_arn( + self, + "PreProdActionRole", + f"arn:{Aws.PARTITION}:iam::{preprod_account}:role/cdk-hnb659fds-deploy-role-{preprod_account}-{deployment_region}", + ), + deployment_role=iam.Role.from_role_arn( + self, + "PreProdDeploymentRole", + f"arn:{Aws.PARTITION}:iam::{preprod_account}:role/cdk-hnb659fds-cfn-exec-role-{preprod_account}-{deployment_region}", + ), + cfn_capabilities=[ + CfnCapabilities.AUTO_EXPAND, + CfnCapabilities.NAMED_IAM, + ], + ), + codepipeline_actions.ManualApprovalAction( + action_name="Approve_Prod", + run_order=2, + additional_information="Approving deployment for prod", + ), + ], + ) + + deploy_code_pipeline.add_stage( + stage_name="DeployProd", + actions=[ + codepipeline_actions.CloudFormationCreateUpdateStackAction( + action_name="Deploy_CFN_Prod", + run_order=1, + template_path=cdk_synth_artifact.at_path("prod.template.json"), + stack_name=f"{project_name}-{construct_id}-prod", + admin_permissions=False, + replace_on_failure=True, + role=iam.Role.from_role_arn( + self, + 
"ProdActionRole", + f"arn:{Aws.PARTITION}:iam::{prod_account}:role/cdk-hnb659fds-deploy-role-{prod_account}-{deployment_region}", + ), + deployment_role=iam.Role.from_role_arn( + self, + "ProdDeploymentRole", + f"arn:{Aws.PARTITION}:iam::{prod_account}:role/cdk-hnb659fds-cfn-exec-role-{prod_account}-{deployment_region}", + ), + cfn_capabilities=[ + CfnCapabilities.AUTO_EXPAND, + CfnCapabilities.NAMED_IAM, + ], + ), + ], + ) + + # CloudWatch rule to trigger model pipeline when a status change event happens to the model package group + model_event_rule = events.Rule( + self, + "ModelEventRule", + event_pattern=events.EventPattern( + source=["aws.sagemaker"], + detail_type=["SageMaker Model Package State Change"], + detail={ + "ModelPackageGroupName": [model_package_group_name], + "ModelApprovalStatus": ["Approved", "Rejected"], + }, + ), + targets=[targets.CodePipeline(deploy_code_pipeline)], + ) diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/byoc_project_stack.py b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/byoc_project_stack.py new file mode 100644 index 00000000..8776d773 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/byoc_project_stack.py @@ -0,0 +1,314 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. 
+# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +from aws_cdk import ( + Aws, + CfnDynamicReference, + CfnDynamicReferenceService, + Stack, + Tags, + aws_s3 as s3, + aws_iam as iam, + aws_kms as kms, + aws_ecr as ecr, + aws_sagemaker as sagemaker, +) + +import aws_cdk + +from constructs import Construct + +from mlops_sm_project_template.templates.byoc_pipeline_constructs.build_pipeline_construct import ( + BuildPipelineConstruct, +) +from mlops_sm_project_template.templates.byoc_pipeline_constructs.deploy_pipeline_construct import ( + DeployPipelineConstruct, +) + +from mlops_sm_project_template.config.constants import PREPROD_ACCOUNT, PROD_ACCOUNT, DEFAULT_DEPLOYMENT_REGION + + +class MLOpsStack(Stack): + DESCRIPTION: str = "This template includes a model building pipeline that includes a workflow to build your own containers, pre-process, train, evaluate and register a model. The deploy pipeline creates a dev, preprod and production endpoint. The target DEV/PREPROD/PROD accounts are predefined in the template." 
+ TEMPLATE_NAME: str = "MLOps template for real-time deployment using your own container" + + def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None: + super().__init__(scope, construct_id, **kwargs) + + # Define required parameters + project_name = aws_cdk.CfnParameter( + self, + "SageMakerProjectName", + type="String", + description="The name of the SageMaker project.", + min_length=1, + max_length=32, + ).value_as_string + + project_id = aws_cdk.CfnParameter( + self, + "SageMakerProjectId", + type="String", + min_length=1, + max_length=16, + description="Service generated Id of the project.", + ).value_as_string + + Tags.of(self).add("sagemaker:project-id", project_id) + Tags.of(self).add("sagemaker:project-name", project_name) + + # create kms key to be used by the assets bucket + kms_key = kms.Key( + self, + "ArtifactsBucketKMSKey", + description="key used for encryption of data in Amazon S3", + enable_key_rotation=True, + policy=iam.PolicyDocument( + statements=[ + iam.PolicyStatement( + actions=["kms:*"], + effect=iam.Effect.ALLOW, + resources=["*"], + principals=[iam.AccountRootPrincipal()], + ) + ] + ), + ) + + # allow cross account access to the kms key + kms_key.add_to_resource_policy( + iam.PolicyStatement( + actions=[ + "kms:Encrypt", + "kms:Decrypt", + "kms:ReEncrypt*", + "kms:GenerateDataKey*", + "kms:DescribeKey", + ], + resources=[ + "*", + ], + principals=[ + iam.ArnPrincipal(f"arn:aws:iam::{PREPROD_ACCOUNT}:root"), + iam.ArnPrincipal(f"arn:aws:iam::{PROD_ACCOUNT}:root"), + ], + ) + ) + + s3_artifact = s3.Bucket( + self, + "S3Artifact", + bucket_name=f"mlops-{project_name}-{project_id}-{Aws.REGION}", + encryption_key=kms_key, + versioned=True, + removal_policy=aws_cdk.RemovalPolicy.DESTROY, + ) + + # Block insecure requests to the bucket + s3_artifact.add_to_resource_policy( + iam.PolicyStatement( + sid="AllowSSLRequestsOnly", + actions=["s3:*"], + effect=iam.Effect.DENY, + resources=[ + s3_artifact.bucket_arn, +
s3_artifact.arn_for_objects(key_pattern="*"), + ], + conditions={"Bool": {"aws:SecureTransport": "false"}}, + principals=[iam.AnyPrincipal()], + ) + ) + + # DEV account access to objects in the bucket + s3_artifact.add_to_resource_policy( + iam.PolicyStatement( + sid="AddDevPermissions", + actions=["s3:*"], + resources=[ + s3_artifact.arn_for_objects(key_pattern="*"), + s3_artifact.bucket_arn, + ], + principals=[ + iam.ArnPrincipal(f"arn:aws:iam::{Aws.ACCOUNT_ID}:root"), + ], + ) + ) + + # PREPROD and PROD account access to objects in the bucket + s3_artifact.add_to_resource_policy( + iam.PolicyStatement( + sid="AddCrossAccountPermissions", + actions=["s3:List*", "s3:Get*", "s3:Put*"], + resources=[ + s3_artifact.arn_for_objects(key_pattern="*"), + s3_artifact.bucket_arn, + ], + principals=[ + iam.ArnPrincipal(f"arn:aws:iam::{PREPROD_ACCOUNT}:root"), + iam.ArnPrincipal(f"arn:aws:iam::{PROD_ACCOUNT}:root"), + ], + ) + ) + + model_package_group_name = f"{project_name}-{project_id}" + + # cross account model registry resource policy + model_package_group_policy = iam.PolicyDocument( + statements=[ + iam.PolicyStatement( + sid="ModelPackageGroup", + actions=[ + "sagemaker:DescribeModelPackageGroup", + ], + resources=[ + f"arn:aws:sagemaker:{Aws.REGION}:{Aws.ACCOUNT_ID}:model-package-group/{model_package_group_name}" + ], + principals=[ + iam.ArnPrincipal(f"arn:aws:iam::{PREPROD_ACCOUNT}:root"), + iam.ArnPrincipal(f"arn:aws:iam::{PROD_ACCOUNT}:root"), + ], + ), + iam.PolicyStatement( + sid="ModelPackage", + actions=[ + "sagemaker:DescribeModelPackage", + "sagemaker:ListModelPackages", + "sagemaker:UpdateModelPackage", + "sagemaker:CreateModel", + ], + resources=[ + f"arn:aws:sagemaker:{Aws.REGION}:{Aws.ACCOUNT_ID}:model-package/{model_package_group_name}/*" + ], + principals=[ + iam.ArnPrincipal(f"arn:aws:iam::{PREPROD_ACCOUNT}:root"), + iam.ArnPrincipal(f"arn:aws:iam::{PROD_ACCOUNT}:root"), + ], + ), + ] + ).to_json() + + model_package_group = sagemaker.CfnModelPackageGroup( +
self, + "ModelPackageGroup", + model_package_group_name=model_package_group_name, + model_package_group_description=f"Model Package Group for {project_name}", + model_package_group_policy=model_package_group_policy, + tags=[ + aws_cdk.CfnTag(key="sagemaker:project-id", value=project_id), + aws_cdk.CfnTag(key="sagemaker:project-name", value=project_name), + ], + ) + + # create ECR repository + ml_models_ecr_repo = ecr.Repository( + self, + "MLModelsECRRepository", + image_scan_on_push=True, + image_tag_mutability=ecr.TagMutability.MUTABLE, + repository_name=f"{project_name}", + ) + + # add cross account resource policies + ml_models_ecr_repo.add_to_resource_policy( + iam.PolicyStatement( + actions=[ + "ecr:BatchCheckLayerAvailability", + "ecr:BatchGetImage", + "ecr:CompleteLayerUpload", + "ecr:GetDownloadUrlForLayer", + "ecr:InitiateLayerUpload", + "ecr:PutImage", + "ecr:UploadLayerPart", + ], + principals=[ + iam.ArnPrincipal(f"arn:aws:iam::{Aws.ACCOUNT_ID}:root"), + ], + ) + ) + + ml_models_ecr_repo.add_to_resource_policy( + iam.PolicyStatement( + actions=[ + "ecr:BatchCheckLayerAvailability", + "ecr:BatchGetImage", + "ecr:GetDownloadUrlForLayer", + ], + principals=[ + iam.ArnPrincipal(f"arn:aws:iam::{PREPROD_ACCOUNT}:root"), + iam.ArnPrincipal(f"arn:aws:iam::{PROD_ACCOUNT}:root"), + ], + ) + ) + + seed_bucket = CfnDynamicReference(CfnDynamicReferenceService.SSM, "/mlops/code/seed_bucket").to_string() + build_app_key = CfnDynamicReference(CfnDynamicReferenceService.SSM, "/mlops/code/build/byoc").to_string() + deploy_app_key = CfnDynamicReference(CfnDynamicReferenceService.SSM, "/mlops/code/deploy").to_string() + + kms_key = kms.Key( + self, + "PipelineBucketKMSKey", + description="key used for encryption of data in Amazon S3", + enable_key_rotation=True, + policy=iam.PolicyDocument( + statements=[ + iam.PolicyStatement( + actions=["kms:*"], + effect=iam.Effect.ALLOW, + resources=["*"], + principals=[iam.AccountRootPrincipal()], + ) + ] + ), + ) + + 
pipeline_artifact_bucket = s3.Bucket( + self, + "PipelineBucket", + bucket_name=f"pipeline-{project_id}-{Aws.REGION}", + encryption_key=kms_key, + versioned=True, + removal_policy=aws_cdk.RemovalPolicy.DESTROY, + ) + + BuildPipelineConstruct( + self, + "build", + project_name, + project_id, + s3_artifact, + pipeline_artifact_bucket, + model_package_group_name, + ml_models_ecr_repo.repository_name, + seed_bucket, + build_app_key, + ) + + DeployPipelineConstruct( + self, + "deploy", + project_name, + project_id, + pipeline_artifact_bucket, + model_package_group_name, + ml_models_ecr_repo.repository_arn, + s3_artifact.bucket_arn, + seed_bucket, + deploy_app_key, + PREPROD_ACCOUNT, + PROD_ACCOUNT, + DEFAULT_DEPLOYMENT_REGION, + ) diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/templates/dynamic_accounts_project_stack.py b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/dynamic_accounts_project_stack.py similarity index 96% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/templates/dynamic_accounts_project_stack.py rename to mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/dynamic_accounts_project_stack.py index 1649cde7..e041449a 100644 --- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/templates/dynamic_accounts_project_stack.py +++ b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/dynamic_accounts_project_stack.py @@ -31,16 +31,16 @@ from constructs import Construct -from mlops_sm_project_template_rt.templates.pipeline_constructs.build_pipeline_construct import ( +from mlops_sm_project_template.templates.pipeline_constructs.build_pipeline_construct import ( BuildPipelineConstruct, ) -from mlops_sm_project_template_rt.templates.pipeline_constructs.deploy_pipeline_construct import ( +from 
mlops_sm_project_template.templates.pipeline_constructs.deploy_pipeline_construct import ( DeployPipelineConstruct, ) class MLOpsStack(Stack): - DESCRIPTION: str = "This template includes a model building pipeline that includes a workflow to pre-process, train, evaluate and register a model. The deploy pipeline creates a preprod and production endpoint. The target PREPROD/PROD accounts are provided as cloudformation paramters and must be provided during project creation." + DESCRIPTION: str = "This template includes a model building pipeline that includes a workflow to pre-process, train, evaluate and register a model. The deploy pipeline creates a dev, preprod and production endpoint. The target PREPROD/PROD accounts are provided as CloudFormation parameters and must be provided during project creation." TEMPLATE_NAME: str = "Dynamic Accounts MLOps template for real-time deployment" def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None: diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/templates/pipeline_constructs/build_pipeline_construct.py b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/pipeline_constructs/build_pipeline_construct.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/templates/pipeline_constructs/build_pipeline_construct.py rename to mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/pipeline_constructs/build_pipeline_construct.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/templates/pipeline_constructs/deploy_pipeline_construct.py b/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/pipeline_constructs/deploy_pipeline_construct.py similarity index 100% rename from 
mlops-multi-account-cdk/mlops-sm-project-template-rt/mlops_sm_project_template_rt/templates/pipeline_constructs/deploy_pipeline_construct.py rename to mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/pipeline_constructs/deploy_pipeline_construct.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/requirements-dev.txt b/mlops-multi-account-cdk/mlops-sm-project-template/requirements-dev.txt similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/requirements-dev.txt rename to mlops-multi-account-cdk/mlops-sm-project-template/requirements-dev.txt diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/requirements.txt b/mlops-multi-account-cdk/mlops-sm-project-template/requirements.txt similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/requirements.txt rename to mlops-multi-account-cdk/mlops-sm-project-template/requirements.txt diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/scripts/cdk-account-setup.sh b/mlops-multi-account-cdk/mlops-sm-project-template/scripts/cdk-account-setup.sh similarity index 96% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/scripts/cdk-account-setup.sh rename to mlops-multi-account-cdk/mlops-sm-project-template/scripts/cdk-account-setup.sh index 19061231..5b38d33a 100755 --- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/scripts/cdk-account-setup.sh +++ b/mlops-multi-account-cdk/mlops-sm-project-template/scripts/cdk-account-setup.sh @@ -15,7 +15,7 @@ sed -i '' -e "s/^PIPELINE_ACCOUNT = \"$pattern\"/PIPELINE_ACCOUNT = \"$gov_accou -e "s/^PREPROD_ACCOUNT = \"$pattern\"/PREPROD_ACCOUNT = \"$preprod_account\"/" \ -e "s/^PROD_ACCOUNT = \"$pattern\"/PROD_ACCOUNT = \"$prod_account\"/" \ -e "s/^DEFAULT_DEPLOYMENT_REGION = \"$pattern\"/DEFAULT_DEPLOYMENT_REGION = \"$region\"/" \ - mlops_sm_project_template_rt/config/constants.py + mlops_sm_project_template/config/constants.py 
echo 'AWS profiles to be used for each account' read -p 'Governance Account AWS Profile: ' gov_profile diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/scripts/install-prerequisites-brew.sh b/mlops-multi-account-cdk/mlops-sm-project-template/scripts/install-prerequisites-brew.sh similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/scripts/install-prerequisites-brew.sh rename to mlops-multi-account-cdk/mlops-sm-project-template/scripts/install-prerequisites-brew.sh diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/.githooks/pre-commit b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/.githooks/pre-commit similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/.githooks/pre-commit rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/.githooks/pre-commit diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/.pre-commit-config.yaml b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/.pre-commit-config.yaml similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/.pre-commit-config.yaml rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/.pre-commit-config.yaml diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/Makefile b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/Makefile similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/Makefile rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/Makefile diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/README.md similarity index 100% rename from 
mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/README.md rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/README.md diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/buildspec.yml b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/buildspec.yml similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/buildspec.yml rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/buildspec.yml diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/README.md similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/README.md rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/README.md diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/__init__.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/__init__.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/__init__.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/__init__.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/__version__.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/__version__.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/__version__.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/__version__.py diff --git 
a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/_utils.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/_utils.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/_utils.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/_utils.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/get_pipeline_definition.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/get_pipeline_definition.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/get_pipeline_definition.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/get_pipeline_definition.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/run_pipeline.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/run_pipeline.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/run_pipeline.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/run_pipeline.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/training/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/training/README.md similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/training/README.md rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/training/README.md diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/training/__init__.py 
b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/training/__init__.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/training/__init__.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/training/__init__.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/training/_utils.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/training/_utils.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/training/_utils.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/training/_utils.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/training/pipeline.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/training/pipeline.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/ml_pipelines/training/pipeline.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/ml_pipelines/training/pipeline.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/notebooks/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/notebooks/README.md similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/notebooks/README.md rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/notebooks/README.md diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/notebooks/sm_pipelines_runbook.ipynb b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/notebooks/sm_pipelines_runbook.ipynb similarity index 100% 
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/notebooks/sm_pipelines_runbook.ipynb rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/notebooks/sm_pipelines_runbook.ipynb diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/setup.cfg b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/setup.cfg similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/setup.cfg rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/setup.cfg diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/setup.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/setup.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/setup.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/setup.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/README.md similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/README.md rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/README.md diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/README.md similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/README.md rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/README.md 
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/main.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/main.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/main.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/main.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/requirements.txt b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/requirements.txt similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/requirements.txt rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/requirements.txt diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/helpers/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/helpers/README.md similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/helpers/README.md rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/helpers/README.md diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/helpers/logger.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/helpers/logger.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/helpers/logger.py rename to 
mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/helpers/logger.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/helpers/requirements.txt b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/helpers/requirements.txt similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/helpers/requirements.txt rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/helpers/requirements.txt diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/helpers/s3_helper.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/helpers/s3_helper.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/helpers/s3_helper.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/helpers/s3_helper.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/helpers/test/test_a.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/helpers/test/test_a.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/helpers/test/test_a.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/helpers/test/test_a.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/README.md similarity index 100% rename from 
mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/README.md rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/README.md diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/main.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/main.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/main.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/main.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/requirements.txt b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/requirements.txt similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/requirements.txt rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/requirements.txt diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/training/xgboost/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/training/xgboost/README.md similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/training/xgboost/README.md rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/training/xgboost/README.md diff --git 
a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/training/xgboost/__main__.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/training/xgboost/__main__.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/training/xgboost/__main__.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/training/xgboost/__main__.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/training/xgboost/requirements.txt b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/training/xgboost/requirements.txt similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/training/xgboost/requirements.txt rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/training/xgboost/requirements.txt diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/training/xgboost/test/test_a.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/training/xgboost/test/test_a.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/source_scripts/training/xgboost/test/test_a.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/build_app/source_scripts/training/xgboost/test/test_a.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/.githooks/pre-commit b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/.githooks/pre-commit similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/.githooks/pre-commit rename to 
mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/.githooks/pre-commit diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/.pre-commit-config.yaml b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/.pre-commit-config.yaml similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/build_app/.pre-commit-config.yaml rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/.pre-commit-config.yaml diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/Makefile b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/Makefile similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/Makefile rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/Makefile diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/README.md new file mode 100644 index 00000000..5f37e522 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/README.md @@ -0,0 +1,24 @@ +# SageMaker Build - Train Pipelines + +This folder contains all the SageMaker Pipelines of your project. + +`buildspec.yml` defines how to run a pipeline after each commit to this repository. +`ml_pipelines/` contains the SageMaker pipeline definitions. +The expected output of your main pipeline (here `training/pipeline.py`) is a model registered to SageMaker Model Registry. + +`source_scripts/` contains the underlying scripts run by the steps of your SageMaker Pipelines. For example, if your SageMaker Pipeline runs a Processing Job as part of a Processing Step, the code being run inside the Processing Job should be defined in this folder. 
+A typical folder structure for `source_scripts/` can contain `helpers`, `preprocessing`, `training`, `postprocessing`, `evaluate`, depending on the nature of the steps run as part of the SageMaker Pipeline. +We provide here an example with the Abalone dataset: training an XGBoost model (using), and evaluating the model on a test set before sending it for manual approval to SageMaker Model Registry inside the SageMaker ModelPackageGroup defined when creating the SageMaker Project. +Additionally, if you use custom containers, the Dockerfile definitions should be found in that folder. + +`tests/` contains the unit tests for your `source_scripts/` + +`notebooks/` contains experimentation notebooks. + +# Run pipeline from command line from this folder + +``` +pip install -e . + +run-pipeline --module-name ml_pipelines.training.pipeline --role-arn YOUR_SAGEMAKER_EXECUTION_ROLE_ARN --kwargs '{"region":"eu-west-1"}' +``` diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/buildspec.yml b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/buildspec.yml new file mode 100644 index 00000000..b94ff07c --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/buildspec.yml @@ -0,0 +1,22 @@ +version: 0.2 + +phases: + install: + runtime-versions: + python: 3.8 + commands: + - pip install --upgrade --force-reinstall . "awscli>1.20.30" + + build: + commands: + - export PYTHONUNBUFFERED=TRUE + - export SAGEMAKER_PROJECT_NAME_ID="${SAGEMAKER_PROJECT_NAME}-${SAGEMAKER_PROJECT_ID}" + # Copy sample dataset for template - REMOVE when using your own data + - aws s3 cp s3://sagemaker-sample-files/datasets/tabular/uci_abalone/abalone.csv . 
+ - aws s3 cp abalone.csv s3://${ARTIFACT_BUCKET} + - | + run-pipeline --module-name ml_pipelines.training.pipeline \ + --role-arn $SAGEMAKER_PIPELINE_ROLE_ARN \ + --tags "[{\"Key\":\"sagemaker:project-name\", \"Value\":\"${SAGEMAKER_PROJECT_NAME}\"}, {\"Key\":\"sagemaker:project-id\", \"Value\":\"${SAGEMAKER_PROJECT_ID}\"}]" \ + --kwargs "{\"region\":\"${AWS_REGION}\",\"role\":\"${SAGEMAKER_PIPELINE_ROLE_ARN}\",\"default_bucket\":\"${ARTIFACT_BUCKET}\",\"pipeline_name\":\"${SAGEMAKER_PROJECT_NAME_ID}\",\"model_package_group_name\":\"${MODEL_PACKAGE_GROUP_NAME}\",\"base_job_prefix\":\"${SAGEMAKER_PROJECT_NAME_ID}\", \"bucket_kms_id\":\"${ARTIFACT_BUCKET_KMS_ID}\", \"git_hash\":\"${CODEBUILD_RESOLVED_SOURCE_VERSION}\", \"ecr_repo_uri\":\"${ECR_REPO_URI}\", \"default_input_data\":\"s3://${ARTIFACT_BUCKET}/abalone.csv\"}" + - echo "Create/Update of the SageMaker Pipeline and execution completed." diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/README.md new file mode 100644 index 00000000..8e309f81 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/README.md @@ -0,0 +1,7 @@ +# SageMaker Pipelines + +This folder contains SageMaker Pipeline definitions and helper scripts to either simply "get" a SageMaker Pipeline definition (JSON dictionary) with `get_pipeline_definition.py`, or "run" a SageMaker Pipeline from a SageMaker pipeline definition with `run_pipeline.py`. + +Those files are generic and can be reused to call any SageMaker Pipeline. + +Each SageMaker Pipeline definition should be treated as a module inside its own folder, for example here the "training" pipeline, contained inside `training/`. 
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/__init__.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/__init__.py new file mode 100644 index 00000000..ff79f21c --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/__init__.py @@ -0,0 +1,30 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +# © 2021 Amazon Web Services, Inc. or its affiliates. All Rights Reserved. This +# AWS Content is provided subject to the terms of the AWS Customer Agreement +# available at http://aws.amazon.com/agreement or other written agreement between +# Customer and either Amazon Web Services, Inc. or Amazon Web Services EMEA SARL +# or both. +# +# Any code, applications, scripts, templates, proofs of concept, documentation +# and other items provided by AWS under this SOW are "AWS Content," as defined +# in the Agreement, and are provided for illustration purposes only. 
All such +# AWS Content is provided solely at the option of AWS, and is subject to the +# terms of the Addendum and the Agreement. Customer is solely responsible for +# using, deploying, testing, and supporting any code and applications provided +# by AWS under this SOW. diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/__version__.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/__version__.py new file mode 100644 index 00000000..660d19ee --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/__version__.py @@ -0,0 +1,26 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ +"""Metadata for the ml pipelines package.""" + +__title__ = "ml_pipelines" +__description__ = "ml pipelines - template package" +__version__ = "0.0.1" +__author__ = "" +__author_email__ = "" +__license__ = "Apache 2.0" +__url__ = "" diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/_utils.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/_utils.py new file mode 100644 index 00000000..581e1eb7 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/_utils.py @@ -0,0 +1,91 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +# © 2021 Amazon Web Services, Inc. or its affiliates. All Rights Reserved. This +# AWS Content is provided subject to the terms of the AWS Customer Agreement +# available at http://aws.amazon.com/agreement or other written agreement between +# Customer and either Amazon Web Services, Inc. or Amazon Web Services EMEA SARL +# or both. 
+# +# Any code, applications, scripts, templates, proofs of concept, documentation +# and other items provided by AWS under this SOW are "AWS Content," as defined +# in the Agreement, and are provided for illustration purposes only. All such +# AWS Content is provided solely at the option of AWS, and is subject to the +# terms of the Addendum and the Agreement. Customer is solely responsible for +# using, deploying, testing, and supporting any code and applications provided +# by AWS under this SOW. + +# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"). You +# may not use this file except in compliance with the License. A copy of +# the License is located at +# +# http://aws.amazon.com/apache2.0/ +# +# or in the "license" file accompanying this file. This file is +# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF +# ANY KIND, either express or implied. See the License for the specific +# language governing permissions and limitations under the License. +"""Provides utilities for SageMaker Pipeline CLI.""" +from __future__ import absolute_import + +import ast + + +def get_pipeline_driver(module_name, passed_args=None): + """Gets the driver for generating your pipeline definition. + + Pipeline modules must define a get_pipeline() module-level function. + + Args: + module_name: The module name of your pipeline. + passed_args: Optional passed arguments that your pipeline may be templated by. + + Returns: + The SageMaker Workflow pipeline. + """ + _imports = __import__(module_name, fromlist=["get_pipeline"]) + kwargs = convert_struct(passed_args) + return _imports.get_pipeline(**kwargs) + + +def convert_struct(str_struct=None): + """Convert the string argument to its proper type. + + Args: + str_struct (str, optional): string to be evaluated. Defaults to None.
+ + Returns: + The string struct converted to its actual evaluated type + """ + return ast.literal_eval(str_struct) if str_struct else {} + + +def get_pipeline_custom_tags(module_name, args, tags): + """Gets the custom tags for pipeline + + Returns: + Custom tags to be added to the pipeline + """ + try: + _imports = __import__(module_name, fromlist=["get_pipeline_custom_tags"]) + kwargs = convert_struct(args) + return _imports.get_pipeline_custom_tags(tags, kwargs["region"], kwargs["sagemaker_project_arn"]) + except Exception as e: + print(f"Error getting project tags: {e}") + return tags diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/get_pipeline_definition.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/get_pipeline_definition.py new file mode 100644 index 00000000..edfb6b40 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/get_pipeline_definition.py @@ -0,0 +1,77 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +"""A CLI to get pipeline definitions from pipeline modules.""" +from __future__ import absolute_import + +import argparse +import sys + +from ml_pipelines._utils import get_pipeline_driver + + +def main(): # pragma: no cover + """The main harness that gets the pipeline definition JSON. + + Prints the json to stdout or saves to file. + """ + parser = argparse.ArgumentParser("Gets the pipeline definition for the pipeline script.") + + parser.add_argument( + "-n", + "--module-name", + dest="module_name", + type=str, + help="The module name of the pipeline to import.", + ) + parser.add_argument( + "-f", + "--file-name", + dest="file_name", + type=str, + default=None, + help="The file to output the pipeline definition json to.", + ) + parser.add_argument( + "-kwargs", + "--kwargs", + dest="kwargs", + default=None, + help="Dict string of keyword arguments for the pipeline generation (if supported)", + ) + args = parser.parse_args() + + if args.module_name is None: + parser.print_help() + sys.exit(2) + + try: + pipeline = get_pipeline_driver(args.module_name, args.kwargs) + content = pipeline.definition() + if args.file_name: + with open(args.file_name, "w") as f: + f.write(content) + else: + print(content) + except Exception as e: # pylint: disable=W0703 + print(f"Exception: {e}") + sys.exit(1) + + +if __name__ == "__main__": + main() diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/run_pipeline.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/run_pipeline.py new file mode 100644 index 00000000..d91be30b --- /dev/null +++ 
b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/run_pipeline.py @@ -0,0 +1,109 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +"""A CLI to create or update and run pipelines.""" +from __future__ import absolute_import + +import argparse +import json +import sys + +from ml_pipelines._utils import get_pipeline_driver, convert_struct, get_pipeline_custom_tags + + +def main(): # pragma: no cover + """The main harness that creates or updates and runs the pipeline. + + Creates or updates the pipeline and runs it. 
+ """ + parser = argparse.ArgumentParser("Creates or updates and runs the pipeline for the pipeline script.") + + parser.add_argument( + "-n", + "--module-name", + dest="module_name", + type=str, + help="The module name of the pipeline to import.", + ) + parser.add_argument( + "-kwargs", + "--kwargs", + dest="kwargs", + default=None, + help="Dict string of keyword arguments for the pipeline generation (if supported)", + ) + parser.add_argument( + "-role-arn", + "--role-arn", + dest="role_arn", + type=str, + help="The role arn for the pipeline service execution role.", + ) + parser.add_argument( + "-description", + "--description", + dest="description", + type=str, + default=None, + help="The description of the pipeline.", + ) + parser.add_argument( + "-tags", + "--tags", + dest="tags", + default=None, + help="""List of dict strings of '[{"Key": "string", "Value": "string"}, ..]'""", + ) + args = parser.parse_args() + + if args.module_name is None or args.role_arn is None: + parser.print_help() + sys.exit(2) + tags = convert_struct(args.tags) + + try: + pipeline = get_pipeline_driver(args.module_name, args.kwargs) + print("###### Creating/updating a SageMaker Pipeline with the following definition:") + parsed = json.loads(pipeline.definition()) + print(json.dumps(parsed, indent=2, sort_keys=True)) + + all_tags = get_pipeline_custom_tags(args.module_name, args.kwargs, tags) + + upsert_response = pipeline.upsert(role_arn=args.role_arn, description=args.description, tags=all_tags) + + upsert_response = pipeline.upsert( + role_arn=args.role_arn, description=args.description + ) # , tags=tags) # Tags temporarily removed from this call + print("\n###### Created/Updated SageMaker Pipeline: Response received:") + print(upsert_response) + + execution = pipeline.start() + print(f"\n###### Execution started with PipelineExecutionArn: {execution.arn}") + + # TODO: consider removing this wait, as training can take some time + print("Waiting for the execution to finish...") + execution.wait() +
print("\n#####Execution completed. Execution step details:") + + print(execution.list_steps()) + except Exception as e: # pylint: disable=W0703 + print(f"Exception: {e}") + sys.exit(1) + + +if __name__ == "__main__": + main() diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/training/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/training/README.md new file mode 100644 index 00000000..8a493ac6 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/training/README.md @@ -0,0 +1,7 @@ +# Training SageMaker Pipeline + +This SageMaker Pipeline definition creates a workflow that will: +- Prepare the Abalone dataset through a SageMaker Processing Job +- Train an XGBoost algorithm on the train set +- Evaluate the performance of the trained XGBoost algorithm on the validation set +- If the performance reaches a specified threshold, send the model for Manual Approval to SageMaker Model Registry. diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/training/__init__.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/training/__init__.py new file mode 100644 index 00000000..ff79f21c --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/training/__init__.py @@ -0,0 +1,30 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 
+# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +# © 2021 Amazon Web Services, Inc. or its affiliates. All Rights Reserved. This +# AWS Content is provided subject to the terms of the AWS Customer Agreement +# available at http://aws.amazon.com/agreement or other written agreement between +# Customer and either Amazon Web Services, Inc. or Amazon Web Services EMEA SARL +# or both. +# +# Any code, applications, scripts, templates, proofs of concept, documentation +# and other items provided by AWS under this SOW are "AWS Content," as defined +# in the Agreement, and are provided for illustration purposes only. All such +# AWS Content is provided solely at the option of AWS, and is subject to the +# terms of the Addendum and the Agreement. Customer is solely responsible for +# using, deploying, testing, and supporting any code and applications provided +# by AWS under this SOW. 
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/training/_utils.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/training/_utils.py new file mode 100644 index 00000000..78330433 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/training/_utils.py @@ -0,0 +1,86 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ +import logging + +from botocore.exceptions import ClientError + +logger = logging.getLogger(__name__) + + +def resolve_ecr_uri_from_image_versions(sagemaker_session, image_versions, image_name): + """Gets ECR URI from image versions + Args: + sagemaker_session: SageMaker session that provides the sagemaker client + image_versions: list of the image versions + image_name: Name of the image + + Returns: + ECR URI of the image version + """ + + # Fetch image details to get the Base Image URI + for image_version in image_versions: + if image_version["ImageVersionStatus"] == "CREATED": + image_arn = image_version["ImageVersionArn"] + version = image_version["Version"] + logger.info(f"Identified the latest image version: {image_arn}") + response = sagemaker_session.sagemaker_client.describe_image_version(ImageName=image_name, Version=version) + return response["ContainerImage"] + return None + + +def resolve_ecr_uri(sagemaker_session, image_arn): + """Gets the ECR URI from the image ARN + + Args: + sagemaker_session: SageMaker session that provides the sagemaker client + image_arn: ARN of the image + + Returns: + ECR URI of the latest image version + """ + + # Fetching image name from image_arn (^arn:aws(-[\w]+)*:sagemaker:.+:[0-9]{12}:image/[a-z0-9]([-.]?[a-z0-9])*$) + image_name = image_arn.partition("image/")[2] + try: + # Fetch the image versions + next_token = "" + while True: + response = sagemaker_session.sagemaker_client.list_image_versions( + ImageName=image_name, MaxResults=100, SortBy="VERSION", SortOrder="DESCENDING", NextToken=next_token + ) + + ecr_uri = resolve_ecr_uri_from_image_versions(sagemaker_session, response["ImageVersions"], image_name) + + if ecr_uri is not None: + return ecr_uri + + if "NextToken" in response: + next_token = response["NextToken"] + else: + break + + # Raise an error if no version of the image was found + error_message = f"No image version found for image name: {image_name}" + logger.error(error_message) + raise Exception(error_message) + + except (ClientError,
sagemaker_session.sagemaker_client.exceptions.ResourceNotFound) as e: + error_message = e.response["Error"]["Message"] + logger.error(error_message) + raise Exception(error_message) diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/training/pipeline.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/training/pipeline.py new file mode 100644 index 00000000..0f84194c --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/ml_pipelines/training/pipeline.py @@ -0,0 +1,293 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +"""Example workflow pipeline script for abalone pipeline. + + . -RegisterModel + . + Process-> Train -> Evaluate -> Condition . + . + . -(stop) + +Implements a get_pipeline(**kwargs) method. 
+""" +import os + +import boto3 +import logging +import sagemaker +import sagemaker.session + +from sagemaker.estimator import Estimator +from sagemaker.inputs import TrainingInput +from sagemaker.model_metrics import ( + MetricsSource, + ModelMetrics, +) +from sagemaker.processing import ( + ProcessingInput, + ProcessingOutput, + ScriptProcessor, +) +from sagemaker.sklearn.processing import SKLearnProcessor +from sagemaker.workflow.conditions import ConditionLessThanOrEqualTo +from sagemaker.workflow.condition_step import ( + ConditionStep, +) +from sagemaker.workflow.functions import ( + JsonGet, +) +from sagemaker.workflow.parameters import ( + ParameterInteger, + ParameterString, +) +from sagemaker.workflow.pipeline import Pipeline +from sagemaker.workflow.properties import PropertyFile +from sagemaker.workflow.steps import ( + ProcessingStep, + TrainingStep, +) +from sagemaker.workflow.step_collections import RegisterModel + +from botocore.exceptions import ClientError +from sagemaker.network import NetworkConfig + + +# BASE_DIR = os.path.dirname(os.path.realpath(__file__)) + +logger = logging.getLogger(__name__) + + +def get_session(region, default_bucket): + """Gets the sagemaker session based on the region. 
+ + Args: + region: the AWS region to start the session + default_bucket: the bucket to use for storing the artifacts + + Returns: + `sagemaker.session.Session` instance + """ + + boto_session = boto3.Session(region_name=region) + + sagemaker_client = boto_session.client("sagemaker") + runtime_client = boto_session.client("sagemaker-runtime") + session = sagemaker.session.Session( + boto_session=boto_session, + sagemaker_client=sagemaker_client, + sagemaker_runtime_client=runtime_client, + default_bucket=default_bucket, + ) + + return session + + +def get_pipeline( + region, + role=None, + default_bucket=None, + bucket_kms_id=None, + model_package_group_name="AbalonePackageGroup", + pipeline_name="AbalonePipeline", + base_job_prefix="Abalone", + project_id="SageMakerProjectId", + git_hash="", + ecr_repo_uri="", + default_input_data="", +): + """Gets a SageMaker ML Pipeline instance working on abalone data. + + Args: + region: AWS region to create and run the pipeline. + role: IAM role to create and run steps and pipeline. + default_bucket: the bucket to use for storing the artifacts + git_hash: the hash id of the current commit.
Used to determine which docker image version to use + ecr_repo_uri: uri of the ECR repository used by this project + default_input_data: s3 location with data to be used by pipeline + + Returns: + an instance of a pipeline + """ + + sagemaker_session = get_session(region, default_bucket) + if role is None: + role = sagemaker.session.get_execution_role(sagemaker_session) + + # parameters for pipeline execution + processing_instance_count = ParameterInteger(name="ProcessingInstanceCount", default_value=1) + processing_instance_type = ParameterString(name="ProcessingInstanceType", default_value="ml.m5.xlarge") + training_instance_type = ParameterString(name="TrainingInstanceType", default_value="ml.m5.xlarge") + inference_instance_type = ParameterString(name="InferenceInstanceType", default_value="ml.m5.xlarge") + model_approval_status = ParameterString(name="ModelApprovalStatus", default_value="PendingManualApproval") + input_data = ParameterString( + name="InputDataUrl", + default_value=default_input_data, + ) + processing_image_uri = f"{ecr_repo_uri}:processing-{git_hash}" + training_image_uri = f"{ecr_repo_uri}:training-{git_hash}" + inference_image_uri = f"{ecr_repo_uri}:training-{git_hash}" + + # network_config = NetworkConfig( + # enable_network_isolation=True, + # security_group_ids=security_group_ids, + # subnets=subnets, + # encrypt_inter_container_traffic=True, + # ) + + script_processor = ScriptProcessor( + image_uri=processing_image_uri, + instance_type=processing_instance_type, + instance_count=processing_instance_count, + base_job_name=f"{base_job_prefix}/byoc-abalone-preprocess", + command=["Rscript"], + sagemaker_session=sagemaker_session, + role=role, + output_kms_key=bucket_kms_id, + ) + step_process = ProcessingStep( + name="PreprocessAbaloneData", + processor=script_processor, + inputs=[ProcessingInput(source=input_data, destination="/opt/ml/processing/input")], + outputs=[ + ProcessingOutput(output_name="train", 
source="/opt/ml/processing/output/train"), + ProcessingOutput(output_name="validation", source="/opt/ml/processing/output/validation"), + ProcessingOutput(output_name="test", source="/opt/ml/processing/output/test"), + ], + code="source_scripts/preprocessing/prepare_abalone_data/preprocessing.R", # we must figure out this path to get it from step_source directory + ) + + # training step for generating model artifacts + model_path = f"s3://{default_bucket}/{base_job_prefix}/AbaloneTrain" + + train_estimator = Estimator( + image_uri=training_image_uri, + instance_type=training_instance_type, + instance_count=1, + output_path=model_path, + base_job_name=f"{base_job_prefix}/abalone-train", + sagemaker_session=sagemaker_session, + role=role, + output_kms_key=bucket_kms_id, + source_dir="source_scripts/training/", + entry_point="train.R", + metric_definitions=[{"Name": "rmse-validation", "Regex": "Calculated validation RMSE: ([0-9.]+);.*$"}], + ) + + step_train = TrainingStep( + name="TrainAbaloneModel", + estimator=train_estimator, + inputs={ + "train": TrainingInput( + s3_data=step_process.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri, + content_type="text/csv", + ), + "validation": TrainingInput( + s3_data=step_process.properties.ProcessingOutputConfig.Outputs["validation"].S3Output.S3Uri, + content_type="text/csv", + ), + }, + ) + + # processing step for evaluation + script_eval = ScriptProcessor( + image_uri=training_image_uri, + command=["Rscript"], + instance_type=processing_instance_type, + instance_count=1, + base_job_name=f"{base_job_prefix}/script-abalone-eval", + sagemaker_session=sagemaker_session, + role=role, + output_kms_key=bucket_kms_id, + ) + evaluation_report = PropertyFile( + name="AbaloneEvaluationReport", + output_name="evaluation", + path="evaluation.json", + ) + step_eval = ProcessingStep( + name="EvaluateAbaloneModel", + processor=script_eval, + inputs=[ + ProcessingInput( + 
source=step_train.properties.ModelArtifacts.S3ModelArtifacts, + destination="/opt/ml/processing/model", + ), + ProcessingInput( + source=step_process.properties.ProcessingOutputConfig.Outputs["test"].S3Output.S3Uri, + destination="/opt/ml/processing/test", + ), + ], + outputs=[ + ProcessingOutput(output_name="evaluation", source="/opt/ml/processing/evaluation"), + ], + code="source_scripts/evaluate/evaluation.R", + property_files=[evaluation_report], + ) + + # register model step that will be conditionally executed + model_metrics = ModelMetrics( + model_statistics=MetricsSource( + s3_uri="{}/evaluation.json".format( + step_eval.arguments["ProcessingOutputConfig"]["Outputs"][0]["S3Output"]["S3Uri"] + ), + content_type="application/json", + ) + ) + + step_register = RegisterModel( + name="RegisterAbaloneModel", + estimator=train_estimator, + image_uri=inference_image_uri, + model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts, + content_types=["application/json"], + response_types=["application/json"], + inference_instances=["ml.t2.medium", "ml.m5.large"], + transform_instances=["ml.m5.large"], + model_package_group_name=model_package_group_name, + approval_status=model_approval_status, + model_metrics=model_metrics, + ) + + # condition step for evaluating model quality and branching execution + cond_lte = ConditionLessThanOrEqualTo( + left=JsonGet( + step_name=step_eval.name, property_file=evaluation_report, json_path="regression_metrics.rmse.value" + ), + right=6.0, + ) + step_cond = ConditionStep( + name="CheckMSEAbaloneEvaluation", + conditions=[cond_lte], + if_steps=[step_register], + else_steps=[], + ) + + # pipeline instance + pipeline = Pipeline( + name=pipeline_name, + parameters=[ + processing_instance_type, + processing_instance_count, + training_instance_type, + model_approval_status, + input_data, + ], + steps=[step_process, step_train, step_eval, step_cond], + sagemaker_session=sagemaker_session, + ) + return pipeline diff --git 
a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/notebooks/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/notebooks/README.md new file mode 100644 index 00000000..c0749333 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/notebooks/README.md @@ -0,0 +1,4 @@ +# Jupyter Notebooks + +This folder is intended to store your experiment notebooks. +Typically the first step would be to store your Data Science notebooks, and start defining example SageMaker pipelines in here. Once satisfied with the first iteration of a SageMaker pipeline, the code should move as python scripts inside the respective `ml_pipelines/` and `source_scripts/` folders. diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/setup.cfg b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/setup.cfg new file mode 100644 index 00000000..6f878705 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/setup.cfg @@ -0,0 +1,14 @@ +[tool:pytest] +addopts = + -vv +testpaths = tests + +[aliases] +test=pytest + +[metadata] +description-file = README.md +license_file = LICENSE + +[wheel] +universal = 1 diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/setup.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/setup.py new file mode 100644 index 00000000..b10bb142 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/setup.py @@ -0,0 +1,77 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 
+# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +import os +import setuptools + + +about = {} +here = os.path.abspath(os.path.dirname(__file__)) +with open(os.path.join(here, "ml_pipelines", "__version__.py")) as f: + exec(f.read(), about) + + +with open("README.md", "r") as f: + readme = f.read() + + +required_packages = ["sagemaker"] +extras = { + "test": [ + "black", + "coverage", + "flake8", + "mock", + "pydocstyle", + "pytest", + "pytest-cov", + "sagemaker", + "tox", + ] +} +setuptools.setup( + name=about["__title__"], + description=about["__description__"], + version=about["__version__"], + author=about["__author__"], + author_email=about["__author_email__"], + long_description=readme, + long_description_content_type="text/markdown", + url=about["__url__"], + license=about["__license__"], + packages=setuptools.find_packages(), + include_package_data=True, + python_requires=">=3.6", + install_requires=required_packages, + extras_require=extras, + entry_points={ + "console_scripts": [ + "get-pipeline-definition=ml_pipelines.get_pipeline_definition:main", +
"run-pipeline=ml_pipelines.run_pipeline:main", + ] + }, + classifiers=[ + "Development Status :: 3 - Alpha", + "Intended Audience :: Developers", + "Natural Language :: English", + "Programming Language :: Python", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.6", + "Programming Language :: Python :: 3.7", + "Programming Language :: Python :: 3.8", + ], +) diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/Dockerfile b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/Dockerfile new file mode 100644 index 00000000..696959c0 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/Dockerfile @@ -0,0 +1,37 @@ +FROM public.ecr.aws/docker/library/r-base:4.1.2 as base + +# Install tidyverse +RUN apt update && apt-get install -y --no-install-recommends \ + r-cran-tidyverse + +RUN R -e "install.packages(c('rjson'))" + + +### start of PROCESSING container +FROM base as processing +ENTRYPOINT ["Rscript"] + +### start of TRAINING container +FROM base as training +RUN apt-get -y update && apt-get install -y --no-install-recommends \ + wget \ + apt-transport-https \ + ca-certificates \ + libcurl4-openssl-dev \ + libsodium-dev + +RUN apt-get update && apt-get install -y python3-dev python3-pip +RUN pip3 install boto3 +RUN R -e "install.packages('reticulate',dependencies=TRUE, repos='http://cran.rstudio.com/')" +RUN R -e "install.packages(c('readr','plumber'))" + +ENV PATH="/opt/ml/code:${PATH}" + +WORKDIR /opt/ml/code + +COPY docker_helpers/run.sh /opt/ml/code/run.sh +COPY docker_helpers/entrypoint.R /opt/ml/entrypoint.R + +RUN /bin/bash -c 'chmod +x /opt/ml/code/run.sh' + +ENTRYPOINT ["/bin/bash", "run.sh"] diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/tests/README.md 
b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/README.md similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/tests/README.md rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/README.md diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/docker-build.sh b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/docker-build.sh new file mode 100755 index 00000000..22e0d653 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/docker-build.sh @@ -0,0 +1,30 @@ +#!/bin/bash + +REPO_NAME=$1 + +echo $REPO_NAME + +aws ecr describe-repositories --region $AWS_DEFAULT_REGION --repository-names $REPO_NAME | jq --raw-output '.repositories[0]' > repository-info.json; + +AWS_ACCOUNT_ID=$(jq -r .registryId repository-info.json); +REPOSITORY_URI=${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com/${REPO_NAME}; +# REPOSITORY_URI=local + +aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com + +target_stages=("processing" "training") + +for stage in "${target_stages[@]}" +do + + IMAGE_TAG=$stage-$CODEBUILD_RESOLVED_SOURCE_VERSION; + + echo $IMAGE_TAG + + docker build --target $stage -t $REPOSITORY_URI:$stage . 
+ docker tag $REPOSITORY_URI:$stage $REPOSITORY_URI:$IMAGE_TAG + + docker push $REPOSITORY_URI:$stage + docker push $REPOSITORY_URI:$IMAGE_TAG + +done diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/docker_helpers/entrypoint.R b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/docker_helpers/entrypoint.R new file mode 100644 index 00000000..b69849d1 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/docker_helpers/entrypoint.R @@ -0,0 +1,62 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ + +library(jsonlite) +library(reticulate) +library(stringr) + + +args = commandArgs(trailingOnly=TRUE) +print(args) + +boto3 <- import('boto3') +s3 <- boto3$client('s3') + +# Setup parameters +# Container directories +prefix <- '/opt/ml' +input_path <- paste(prefix, 'input/data', sep='/') +output_path <- paste(prefix, 'output', sep='/') +model_path <- paste(prefix, 'model', sep='/') +code_dir <- paste(prefix, 'code', sep='/') +inference_code_dir <- paste(model_path, 'code', sep='/') + + +if (args=="train") { + + # This is where the hyperparameters are saved by the estimator on the container instance + param_path <- paste(prefix, 'input/config/hyperparameters.json', sep='/') + params <- read_json(param_path) + + s3_source_code_tar <- gsub('"', '', params$sagemaker_submit_directory) + script <- gsub('"', '', params$sagemaker_program) + + bucketkey <- str_replace(s3_source_code_tar, "s3://", "") + bucket <- str_remove(bucketkey, "/.*") + key <- str_remove(bucketkey, ".*?/") + + s3$download_file(bucket, key, "sourcedir.tar.gz") + untar("sourcedir.tar.gz", exdir=code_dir) + + print("training started") + source(file.path(code_dir, script)) + +} else if(args=="serve"){ + print("inference time") + source(file.path(inference_code_dir, "deploy.R")) +} diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/docker_helpers/run.sh b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/docker_helpers/run.sh new file mode 100644 index 00000000..3b4a2d2e --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/docker_helpers/run.sh @@ -0,0 +1,3 @@ +#!/bin/bash +echo "ready to execute" +Rscript /opt/ml/entrypoint.R $1 diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/evaluate/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/evaluate/README.md 
new file mode 100644 index 00000000..3727ec16 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/evaluate/README.md @@ -0,0 +1 @@ +Use this folder to add all code related to evaluate the performance of your model. diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/evaluate/evaluation.R b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/evaluate/evaluation.R new file mode 100644 index 00000000..4976fed1 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/evaluate/evaluation.R @@ -0,0 +1,47 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ + +library(readr) +library(rjson) + +model_path <- "/opt/ml/processing/model/" +model_file_tar <- paste0(model_path, "model.tar.gz") +model_file <- paste0(model_path, "model") + +untar(model_file_tar, exdir = "/opt/ml/processing/model") + +load(model_file) + +test_path <- "/opt/ml/processing/test/" +abalone_test <- read_csv(paste0(test_path, 'abalone_test.csv')) + + +y_pred= predict(regressor, newdata=abalone_test[,-1]) +rmse <- sqrt(mean(((abalone_test[,1] - y_pred)^2)[,])) +print(paste0("Calculated test RMSE: ",rmse,";")) + +report_dict = list( + regression_metrics = list( + rmse= list(value= rmse, standard_deviation = NA) + ) +) + +output_dir = "/opt/ml/processing/evaluation/evaluation.json" + +jsonData <- toJSON(report_dict) +write(jsonData, output_dir) diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/helpers/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/helpers/README.md new file mode 100644 index 00000000..e69de29b diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/deploy_endpoint/__init__.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/helpers/logger.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/deploy_endpoint/__init__.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/helpers/logger.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/helpers/requirements.txt b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/helpers/requirements.txt new file mode 100644 index 00000000..e69de29b diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/tests/integration_tests/__init__.py 
b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/helpers/s3_helper.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/tests/integration_tests/__init__.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/helpers/s3_helper.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/tests/unittests/__init__.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/helpers/test/test_a.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/tests/unittests/__init__.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/helpers/test/test_a.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/preprocessing/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/preprocessing/README.md new file mode 100644 index 00000000..0b8678a4 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/preprocessing/README.md @@ -0,0 +1 @@ +Use this folder to add all code related to preprocessing your data. diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/preprocessing/prepare_abalone_data/preprocessing.R b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/preprocessing/prepare_abalone_data/preprocessing.R new file mode 100644 index 00000000..a05da8b8 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/preprocessing/prepare_abalone_data/preprocessing.R @@ -0,0 +1,51 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 
+# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +library(readr) +library(dplyr) +library(ggplot2) +library(forcats) + +input_dir <- "/opt/ml/processing/input/" +output_dir <- "/opt/ml/processing/output/" +#dir.create(output_dir, showWarnings = FALSE) + +filename <- Sys.glob(paste(input_dir, "*.csv", sep="")) +abalone <- read_csv(filename) + +names(abalone) <- c('sex', 'length', 'diameter', 'height', 'whole_weight', 'shucked_weight', 'viscera_weight', 'shell_weight', 'rings') + +abalone <- abalone %>% + mutate(female = as.integer(ifelse(sex == 'F', 1, 0)), + male = as.integer(ifelse(sex == 'M', 1, 0)), + infant = as.integer(ifelse(sex == 'I', 1, 0))) %>% + select(-sex) +abalone <- abalone %>% select(rings:infant, length:shell_weight) + + +abalone_train <- abalone %>% + sample_frac(size = 0.7) +abalone <- anti_join(abalone, abalone_train) +abalone_test <- abalone %>% + sample_frac(size = 0.5) +abalone_valid <- anti_join(abalone, abalone_test) + + +write_csv(abalone_train, paste0(output_dir,'train/abalone_train.csv')) +write_csv(abalone_valid, paste0(output_dir,'validation/abalone_valid.csv')) 
+write_csv(abalone_test, paste0(output_dir,'test/abalone_test.csv')) diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/README.md new file mode 100644 index 00000000..fcf7e627 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/README.md @@ -0,0 +1 @@ +Use this folder to add all code related to training your model. diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/deploy.R b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/deploy.R new file mode 100644 index 00000000..82848a64 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/deploy.R @@ -0,0 +1,39 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ + +library(plumber) +library(readr) +library(jsonlite) + +# load the trained model +prefix <- '/opt/ml/' +model_path <- paste0(prefix, 'model/model') +code_path <- paste0(prefix, 'model/code/') + +load(model_path) +print("Loaded model successfully") + +# function to use our model. You may require to transform data to make compatible with model +inference <- function(x){ + data = read_csv(x) + output <- predict(regressor, newdata=data) + list(output=output) +} + +app <- plumb(paste0(code_path,'endpoints.R')) +app$run(host='0.0.0.0', port=8080) diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/endpoints.R b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/endpoints.R new file mode 100644 index 00000000..f0126104 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/endpoints.R @@ -0,0 +1,37 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ + +#' Ping to show server is there +#' @get /ping +function() { + return('Alive') +} + + +#' Parse input and return prediction from model +#' @param req The http request sent +#' @post /invocations +function(req) { + + # Read in data + input_json <- fromJSON(req$postBody) + output <- inference(input_json$features) + # Return prediction + return(output) + +} diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/tests/__init__.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/test/test_a.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/tests/__init__.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/test/test_a.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/train.R b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/train.R new file mode 100644 index 00000000..6559a239 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/byoc_build_app/source_scripts/training/train.R @@ -0,0 +1,53 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +library(readr) + +prefix <- '/opt/ml/' + +input_path <- paste0(prefix , 'input/data/train/') +input_path_v <- paste0(prefix , 'input/data/validation/') +output_path <- paste0(prefix, 'output/') +model_path <- paste0(prefix, 'model/') +code_path <- paste(prefix, 'code', sep='/') +inference_code_dir <- paste(model_path, 'code', sep='/') + + +abalone_train <- read_csv(paste0(input_path, 'abalone_train.csv')) +abalone_valid <- read_csv(paste0(input_path_v, 'abalone_valid.csv')) + +regressor = lm(formula = rings ~ female + male + length + diameter + height + whole_weight + shucked_weight + viscera_weight + shell_weight, data = abalone_train) +summary(regressor) + +y_pred= predict(regressor, newdata=abalone_valid[,-1]) +rmse <- sqrt(mean(((abalone_valid[,1] - y_pred)^2)[,])) +print(paste0("Calculated validation RMSE: ",rmse,";")) + + +# Save trained model +save(regressor, file = paste0(model_path,"model")) + +# Save inference code to be used with model +# find the files that you want +list_of_files <- list.files(code_path) + +# copy the files to the new folder +dir.create(inference_code_dir) +file.copy(list_of_files, inference_code_dir, recursive=TRUE) + +print("successfully saved model & code") diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/.githooks/pre-commit b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/.githooks/pre-commit new file mode 100755 index 00000000..12eaeef7 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/.githooks/pre-commit @@ -0,0 +1,44 @@ +#!/usr/bin/env python3 +# File generated by pre-commit: https://pre-commit.com +# ID: 138fd403232d2ddd5efb44317e38bf03 +import os +import sys 
+ +# we try our best, but the shebang of this script is difficult to determine: +# - macos doesn't ship with python3 +# - windows executables are almost always `python.exe` +# therefore we continue to support python2 for this small script +if sys.version_info < (3, 3): + from distutils.spawn import find_executable as which +else: + from shutil import which + +# work around https://github.com/Homebrew/homebrew-core/issues/30445 +os.environ.pop("__PYVENV_LAUNCHER__", None) + +# start templated +INSTALL_PYTHON = "/usr/local/Caskroom/miniconda/base/envs/aws/bin/python" +ARGS = ["hook-impl", "--config=.pre-commit-config.yaml", "--hook-type=pre-commit"] +# end templated +ARGS.extend(("--hook-dir", os.path.realpath(os.path.dirname(__file__)))) +ARGS.append("--") +ARGS.extend(sys.argv[1:]) + +DNE = "`pre-commit` not found. Did you forget to activate your virtualenv?" +if os.access(INSTALL_PYTHON, os.X_OK): + CMD = [INSTALL_PYTHON, "-mpre_commit"] +elif which("pre-commit"): + CMD = ["pre-commit"] +else: + raise SystemExit(DNE) + +CMD.extend(ARGS) +if sys.platform == "win32": # https://bugs.python.org/issue19124 + import subprocess + + if sys.version_info < (3, 7): # https://bugs.python.org/issue25942 + raise SystemExit(subprocess.Popen(CMD).wait()) + else: + raise SystemExit(subprocess.call(CMD)) +else: + os.execvp(CMD[0], CMD) diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/.pre-commit-config.yaml b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/.pre-commit-config.yaml similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/.pre-commit-config.yaml rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/.pre-commit-config.yaml diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/Makefile b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/Makefile new file mode 100644 index 
00000000..ce0bc7b2 --- /dev/null +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/Makefile @@ -0,0 +1,102 @@ +.PHONY: lint init + +################################################################################# +# GLOBALS # +################################################################################# + +PROJECT_DIR := $(shell dirname $(realpath $(lastword $(MAKEFILE_LIST)))) +PROJECT_NAME = gfdtv-dataanalysis-data-models +PYTHON_INTERPRETER = python3 + +ifeq (,$(shell which conda)) +HAS_CONDA=False +else +HAS_CONDA=True +endif + +################################################################################# +# COMMANDS # +################################################################################# + +## Lint using flake8 +lint: + flake8 src +## Setup git hooks +init: + git config core.hooksPath .githooks + +clean: + rm -f cdk.staging + rm -rf cdk.out + find . -name '*.egg-info' -exec rm -fr {} + + find . -name '.coverage' -exec rm -fr {} + + find . -name '.pytest_cache' -exec rm -fr {} + + find . -name '.tox' -exec rm -fr {} + + find . 
-name '__pycache__' -exec rm -fr {} + +################################################################################# +# PROJECT RULES # +################################################################################# + + + + +################################################################################# +# Self Documenting Commands # +################################################################################# + +.DEFAULT_GOAL := help + +# Inspired by +# sed script explained: +# /^##/: +# * save line in hold space +# * purge line +# * Loop: +# * append newline + line to hold space +# * go to next line +# * if line starts with doc comment, strip comment character off and loop +# * remove target prerequisites +# * append hold space (+ newline) to line +# * replace newline plus comments by `---` +# * print line +# Separate expressions are necessary because labels cannot be delimited by +# semicolon; see +.PHONY: help +help: + @echo "$$(tput bold)Available rules:$$(tput sgr0)" + @echo + @sed -n -e "/^## / { \ + h; \ + s/.*//; \ + :doc" \ + -e "H; \ + n; \ + s/^## //; \ + t doc" \ + -e "s/:.*//; \ + G; \ + s/\\n## /---/; \ + s/\\n/ /g; \ + p; \ + }" ${MAKEFILE_LIST} \ + | LC_ALL='C' sort --ignore-case \ + | awk -F '---' \ + -v ncol=$$(tput cols) \ + -v indent=19 \ + -v col_on="$$(tput setaf 6)" \ + -v col_off="$$(tput sgr0)" \ + '{ \ + printf "%s%*s%s ", col_on, -indent, $$1, col_off; \ + n = split($$2, words, " "); \ + line_length = ncol - indent; \ + for (i = 1; i <= n; i++) { \ + line_length -= length(words[i]) + 1; \ + if (line_length <= 0) { \ + line_length = ncol - indent - length(words[i]) - 1; \ + printf "\n%*s ", -indent, " "; \ + } \ + printf "%s ", words[i]; \ + } \ + printf "\n"; \ + }' \ + | more $(shell test $(shell uname) = Darwin && echo '--no-init --raw-control-chars') diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/README.md 
b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/README.md similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/README.md rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/README.md diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/app.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/app.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/app.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/app.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/cdk.json b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/cdk.json similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/cdk.json rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/cdk.json diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/config_mux.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/config_mux.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/config_mux.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/config_mux.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/constants.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/constants.py similarity index 93% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/constants.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/constants.py index 1ecfd485..c4bb7d31 100644 --- 
a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/constants.py +++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/constants.py @@ -33,3 +33,5 @@ PROJECT_NAME = os.getenv("PROJECT_NAME", "") PROJECT_ID = os.getenv("PROJECT_ID", "") MODEL_PACKAGE_GROUP_NAME = os.getenv("MODEL_PACKAGE_GROUP_NAME", "") +MODEL_BUCKET_ARN = os.getenv("MODEL_BUCKET_ARN", "arn:aws:s3:::*mlops*") +ECR_REPO_ARN = os.getenv("ECR_REPO_ARN", None) diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/dev/constants.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/dev/constants.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/dev/constants.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/dev/constants.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/dev/endpoint-config.yml b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/dev/endpoint-config.yml similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/dev/endpoint-config.yml rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/dev/endpoint-config.yml diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/prod/constants.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/prod/constants.py similarity index 100% rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/prod/constants.py rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/prod/constants.py diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/prod/endpoint-config.yml 
b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/prod/endpoint-config.yml
similarity index 100%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/prod/endpoint-config.yml
rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/prod/endpoint-config.yml
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/staging/constants.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/staging/constants.py
similarity index 100%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/staging/constants.py
rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/staging/constants.py
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/staging/endpoint-config.yml b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/staging/endpoint-config.yml
similarity index 100%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/config/staging/endpoint-config.yml
rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/config/staging/endpoint-config.yml
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/tests/unit/__init__.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/deploy_endpoint/__init__.py
similarity index 100%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/tests/unit/__init__.py
rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/deploy_endpoint/__init__.py
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/deploy_endpoint/deploy_endpoint_stack.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/deploy_endpoint/deploy_endpoint_stack.py
similarity index 88%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/deploy_endpoint/deploy_endpoint_stack.py
rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/deploy_endpoint/deploy_endpoint_stack.py
index b21a80e0..bd2577ac 100644
--- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/deploy_endpoint/deploy_endpoint_stack.py
+++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/deploy_endpoint/deploy_endpoint_stack.py
@@ -30,7 +30,14 @@

 from .get_approved_package import get_approved_package

-from config.constants import PROJECT_NAME, PROJECT_ID, MODEL_PACKAGE_GROUP_NAME, DEV_ACCOUNT
+from config.constants import (
+    PROJECT_NAME,
+    PROJECT_ID,
+    MODEL_PACKAGE_GROUP_NAME,
+    DEV_ACCOUNT,
+    ECR_REPO_ARN,
+    MODEL_BUCKET_ARN,
+)

 from datetime import datetime, timezone
 from dataclasses import dataclass
@@ -51,7 +58,9 @@ class EndpointConfigProductionVariant(StageYamlDataClassConfig):
     instance_type: str = "ml.m5.2xlarge"
     variant_name: str = "AllTraffic"

-    FILE_PATH: Path = create_file_path_field("endpoint-config.yml", path_is_absolute=True)
+    FILE_PATH: Path = create_file_path_field(
+        "endpoint-config.yml", path_is_absolute=True
+    )

     def get_endpoint_config_production_variant(self, model_name):
         """
@@ -126,7 +135,8 @@ def __init__(
                         ],
                         effect=iam.Effect.ALLOW,
                         resources=[
-                            f"arn:aws:s3:::*mlops*",
+                            MODEL_BUCKET_ARN,
+                            f"{MODEL_BUCKET_ARN}/*",
                         ],
                     ),
                     iam.PolicyStatement(
@@ -144,13 +154,24 @@ def __init__(
             ),
         )

+        if ECR_REPO_ARN:
+            model_execution_policy.add_statements(
+                iam.PolicyStatement(
+                    actions=["ecr:Get*"],
+                    effect=iam.Effect.ALLOW,
+                    resources=[ECR_REPO_ARN],
+                )
+            )
+
         model_execution_role = iam.Role(
             self,
             "ModelExecutionRole",
             assumed_by=iam.ServicePrincipal("sagemaker.amazonaws.com"),
             managed_policies=[
                 model_execution_policy,
-                iam.ManagedPolicy.from_aws_managed_policy_name("AmazonSageMakerFullAccess"),
+                iam.ManagedPolicy.from_aws_managed_policy_name(
+                    "AmazonSageMakerFullAccess"
+                ),
             ],
         )

@@ -171,7 +192,9 @@ def __init__(
             execution_role_arn=model_execution_role.role_arn,
             model_name=model_name,
             containers=[
-                sagemaker.CfnModel.ContainerDefinitionProperty(model_package_name=latest_approved_model_package)
+                sagemaker.CfnModel.ContainerDefinitionProperty(
+                    model_package_name=latest_approved_model_package
+                )
             ],
             vpc_config=sagemaker.CfnModel.VpcConfigProperty(
                 security_group_ids=[sg_id],
@@ -210,7 +233,9 @@ def __init__(
             endpoint_config_name=endpoint_config_name,
             kms_key_id=kms_key.key_id,
             production_variants=[
-                endpoint_config_production_variant.get_endpoint_config_production_variant(model.model_name)
+                endpoint_config_production_variant.get_endpoint_config_production_variant(
+                    model.model_name
+                )
             ],
         )

diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/deploy_endpoint/get_approved_package.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/deploy_endpoint/get_approved_package.py
similarity index 100%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/deploy_endpoint/get_approved_package.py
rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/deploy_endpoint/get_approved_package.py
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/requirements-dev.txt b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/requirements-dev.txt
similarity index 100%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/requirements-dev.txt
rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/requirements-dev.txt
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/requirements.txt b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/requirements.txt
similarity index 100%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/requirements.txt
rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/requirements.txt
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/source.bat b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/source.bat
similarity index 100%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/source.bat
rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/source.bat
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/README.md b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/README.md
new file mode 100644
index 00000000..e69de29b
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/integration_tests/__init__.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/integration_tests/__init__.py
new file mode 100644
index 00000000..bc27f7d9
--- /dev/null
+++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/integration_tests/__init__.py
@@ -0,0 +1,16 @@
+# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+#
+# SPDX-License-Identifier: MIT-0
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy of this
+# software and associated documentation files (the "Software"), to deal in the Software
+# without restriction, including without limitation the rights to use, copy, modify,
+# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
+# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
+# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/tests/integration_tests/buildspec.yml b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/integration_tests/buildspec.yml
similarity index 100%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/tests/integration_tests/buildspec.yml
rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/integration_tests/buildspec.yml
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/tests/integration_tests/endpoint_test.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/integration_tests/endpoint_test.py
similarity index 100%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/tests/integration_tests/endpoint_test.py
rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/integration_tests/endpoint_test.py
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/unittests/__init__.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/unittests/__init__.py
new file mode 100644
index 00000000..bc27f7d9
--- /dev/null
+++ b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/unittests/__init__.py
@@ -0,0 +1,16 @@
+# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+#
+# SPDX-License-Identifier: MIT-0
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy of this
+# software and associated documentation files (the "Software"), to deal in the Software
+# without restriction, including without limitation the rights to use, copy, modify,
+# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
+# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
+# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/tests/unittests/test_deploy_app_stack.py b/mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/unittests/test_deploy_app_stack.py
similarity index 100%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/seed_code/deploy_app/tests/unittests/test_deploy_app_stack.py
rename to mlops-multi-account-cdk/mlops-sm-project-template/seed_code/deploy_app/tests/unittests/test_deploy_app_stack.py
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/source.bat b/mlops-multi-account-cdk/mlops-sm-project-template/source.bat
similarity index 100%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/source.bat
rename to mlops-multi-account-cdk/mlops-sm-project-template/source.bat
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/tests/__init__.py b/mlops-multi-account-cdk/mlops-sm-project-template/tests/__init__.py
new file mode 100644
index 00000000..bc27f7d9
--- /dev/null
+++ b/mlops-multi-account-cdk/mlops-sm-project-template/tests/__init__.py
@@ -0,0 +1,16 @@
+# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+#
+# SPDX-License-Identifier: MIT-0
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy of this
+# software and associated documentation files (the "Software"), to deal in the Software
+# without restriction, including without limitation the rights to use, copy, modify,
+# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
+# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
+# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template/tests/unit/__init__.py b/mlops-multi-account-cdk/mlops-sm-project-template/tests/unit/__init__.py
new file mode 100644
index 00000000..bc27f7d9
--- /dev/null
+++ b/mlops-multi-account-cdk/mlops-sm-project-template/tests/unit/__init__.py
@@ -0,0 +1,16 @@
+# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+#
+# SPDX-License-Identifier: MIT-0
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy of this
+# software and associated documentation files (the "Software"), to deal in the Software
+# without restriction, including without limitation the rights to use, copy, modify,
+# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
+# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
+# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/mlops-multi-account-cdk/mlops-sm-project-template-rt/tests/unit/test_mlops_batch_v2_stack.py b/mlops-multi-account-cdk/mlops-sm-project-template/tests/unit/test_mlops_batch_v2_stack.py
similarity index 89%
rename from mlops-multi-account-cdk/mlops-sm-project-template-rt/tests/unit/test_mlops_batch_v2_stack.py
rename to mlops-multi-account-cdk/mlops-sm-project-template/tests/unit/test_mlops_batch_v2_stack.py
index add758e3..2201746a 100644
--- a/mlops-multi-account-cdk/mlops-sm-project-template-rt/tests/unit/test_mlops_batch_v2_stack.py
+++ b/mlops-multi-account-cdk/mlops-sm-project-template/tests/unit/test_mlops_batch_v2_stack.py
@@ -18,11 +18,11 @@
 import aws_cdk as core
 import aws_cdk.assertions as assertions

-from mlops_sm_project_template_rt.sm_project_stack import MlopsBatchV2Stack
+from mlops_sm_project_template.sm_project_stack import MlopsBatchV2Stack

 # example tests. To run these tests, uncomment this file along with the example
-# resource in mlops_sm_project_template_rt_v2/mlops_sm_project_template_rt_v2_stack.py
+# resource in mlops_sm_project_template_v2/mlops_sm_project_template_v2_stack.py
 def test_sqs_queue_created():
     app = core.App()
     stack = MlopsBatchV2Stack(app, "mlops-batch-v2")