aws-samples · durgasury · Sep 9, 2021 · Aug 20, 2021 · Aug 22, 2021 · Aug 22, 2021
diff --git a/multi-branch-mlops-train/README.md b/multi-branch-mlops-train/README.md
@@ -0,0 +1,148 @@
+# Multi-Branch MLOps training pipeline
+
+## Purpose
+
+This template lets multiple data scientists run concurrent experiments in parallel without interfering with each other or submitting conflicting changes to the repository.
+
+Just as software engineering uses feature branches and GitFlow, this sample introduces the concept of experiment branches.
+
+Each experiment, when pushed to the remote repository with `git push`, triggers a training job that generates a model artifact tagged with the commit hash and a `Pending` status.
+
+When a pull request gets approved from the ``experiment/<experiment_name>`` branch into `main`, the produced model artifact status gets automatically changed to `Approved`.
+
+![experiment-branch.jpg](images/experiment-branch.jpg)
+
+## Architecture
+
+Two architectures are available: one using AWS CodePipeline and AWS CodeCommit, and another using Jenkins and GitHub.
+
+### AWS CodePipeline and AWS CodeCommit
+
+![codepipeline-codecommit-arch-train-complete.png](images/codepipeline-codecommit-arch-train-complete.png)
+
+### Jenkins and GitHub
+
+![jenkins-arch-train-complete.png](images/jenkins-arch-train-complete.png)
+
+## Usage (Adding the template to Amazon SageMaker Projects in Studio)
+
+### Step 1. Deploy the baseline stack 
+
+```
+git clone https://github.com/aws-samples/sagemaker-custom-project-templates.git
+mkdir sample-multi-branch-train
+cp -r sagemaker-custom-project-templates/multi-branch-mlops-train/* sample-multi-branch-train
+cd sample-multi-branch-train
+./deploy.sh -p code_pipeline+code_commit
+```
+
+To deploy the stack with Jenkins and GitHub support instead, run `./deploy.sh -p jenkins`.
+
+### Step 2. Create portfolio in AWS Service Catalog
+
+![img.png](images/create-portfolio.png)
+
+### Step 3. Create a new product for the portfolio
+
+![img.png](images/create-product-1.png)
+
+![img.png](images/create-product-2.png)
+
+Use the AWS CloudFormation template deployed by the baseline stack.
+
+`https://cloud-formation-<ACCOUNT-ID>-us-east-1.s3.amazonaws.com/model_train.yaml`
+
+![img.png](images/create-product-3.png)
+
+### Step 4. Add SageMaker visibility tag to the product
+
+Add the tag `sagemaker:studio-visibility` with the value `true`.
+
+![img.png](images/add-product-tag.png)
+
+### Step 5. Go to the Portfolio created and add a constraint.
+
+The role `MultiBranchTrainMLOpsLaunchRole` was created by the baseline stack.
+
+![img.png](images/add-portfolio-constraint.png)
+
+### Step 6. Go to the Portfolio created and share it with the relevant users and with the SageMaker execution role used by SageMaker Studio.
+
+![img.png](images/add-portfolio-roles.png)
+
+### Step 7. The template becomes available in SageMaker Studio
+
+![img.png](images/studio-project-available.png)
+
+## Usage (Creating a new project)
+
+### Step 1. Select the template in the example above and provide a name.
+
+Note that the name can be at most 18 characters long.
+
+![img.png](images/create-project.png)
+
+### Step 2. Wait for the project to be created.
+
+![img.png](images/wait-project-create.png)
+
+### Step 3. Add the sample code to the created repository
+
+Continue from the previously used terminal.
+
+Note that the user or role being used must have permission to use CodeCommit, for example through the [AWSCodeCommitPowerUser](https://docs.aws.amazon.com/codecommit/latest/userguide/security-iam-awsmanpol.html#managed-policies-poweruser) managed policy.
+
+```
+git init
+git stage .
+git commit -m "adds sample code"
+git remote add origin-aws https://git-codecommit.us-east-1.amazonaws.com/v1/repos/model-myawesomeproject-train
+git push --set-upstream origin-aws main
+```
+
+## Usage (Creating a new experiment)
+
+### Step 1. Submit experiment code to the repository.
+
+Either clone the CodeCommit repository or start from the previous terminal.
+
+```
+git checkout -b experiment/myexperiment
+# make some changes to the code
+git stage .
+git commit -m "adds some-change"
+git push --set-upstream origin-aws experiment/myexperiment
+```
+
+After a few seconds, a new pipeline is created in AWS CodePipeline.
+
+![img.png](images/codepipeline-running.png)
+
+The `Train` step of the pipeline launches a new Amazon SageMaker Pipelines pipeline that trains the model.
+
+![img.png](images/sagemakerpipeline-running.png)
+
+When the pipeline finishes, a new model gets stored in SageMaker Model Registry with `Pending` status.
+
+![img.png](images/model-registry-pending.png)
+
+At this point, the data scientist can assess the experiment results and push subsequent commits to try to improve on them. Each push triggers the pipeline again and stores a new model version in the Model Registry.
+
+If, on the other hand, the data scientist deems the experiment successful, they can create a pull request asking to merge the changes from the `experiment/myexperiment` branch into `main`.
+
+### Step 2. Open pull request with successful experiment code.
+
+![img.png](images/open-pr-button.png)
+
+![img.png](images/create-pr.png)
+
+![img.png](images/pr-created.png)
+
+Once the pull request is created, reviewers can assess not just the code but also the results of the experiment.
+
+If everything looks good, merge the pull request using a fast-forward merge.
+
+![img.png](images/merge-pr.png)
+
+As soon as the merge is done, the respective model gets automatically approved in the Model Registry.
+
+![img.png](images/model-registry-approved.png)
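
The naming conventions in the README above (project names of at most 18 characters, `experiment/<experiment_name>` branches, and a template URL that contains the account ID) can be checked or generated with small shell helpers. The following is a minimal sketch and not part of the template: the function names are hypothetical, the non-empty check is an added assumption, and the account ID in the usage lines is a placeholder.

```shell
# Hypothetical helpers (not part of the template) for the README's naming rules.

# Succeeds when the project name is non-empty (an added assumption)
# and at most 18 characters long.
valid_project_name() {
  [ -n "$1" ] && [ "${#1}" -le 18 ]
}

# Prints the branch name in the experiment/<experiment_name> convention.
experiment_branch() {
  printf 'experiment/%s\n' "$1"
}

# Prints the template URL from Step 3 for a given account ID.
template_url() {
  printf 'https://cloud-formation-%s-us-east-1.s3.amazonaws.com/model_train.yaml\n' "$1"
}

# Example usage with placeholder values.
if valid_project_name "myawesomeproject"; then
  experiment_branch "myexperiment"   # prints experiment/myexperiment
  template_url "123456789012"
fi
```

Checking the name locally before creating the project avoids waiting on a project that would be rejected for exceeding the length limit.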
diff --git a/multi-branch-mlops-train/buildspec_train.yml b/multi-branch-mlops-train/buildspec_train.yml
@@ -0,0 +1,18 @@
+version: 0.2
+
+env:
+  shell: bash
+
+phases:
+  install:
+    runtime-versions:
+      python: 3.8
+    commands:
+      - pip install --upgrade --force-reinstall awscli
+      - pip install -r requirements.txt
+  build:
+    commands:
+      - export PYTHONUNBUFFERED=TRUE
+      - export BRANCH_NAME_NORM=$(echo $BRANCH_NAME | sed 's/origin\///;s/\//-/')
+      - export COMMIT_HASH=${CODEBUILD_RESOLVED_SOURCE_VERSION:-${COMMIT_HASH:-}}
+      - python pipelines/run_pipeline.py --region $AWS_DEFAULT_REGION --experiment-name $BRANCH_NAME_NORM --model-package-group-name $MODEL_PACKAGE_GROUP_NAME --model-name $MODEL_NAME --project-id $PROJECT_ID --commit-id $COMMIT_HASH --role-arn $SAGEMAKER_PIPELINE_ROLE_ARN
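
The two `export` lines in the build phase above can be tried locally. The sketch below wraps the same `sed` expression and the same nested default expansion in functions. The function names are hypothetical, and positional arguments stand in for the `CODEBUILD_RESOLVED_SOURCE_VERSION` and `COMMIT_HASH` environment variables.

```shell
# Hypothetical wrappers around the expressions used in buildspec_train.yml.

# Strips the first "origin/" and replaces the first remaining "/" with "-",
# exactly as the sed expression in the buildspec does.
normalize_branch() {
  printf '%s\n' "$1" | sed 's/origin\///;s/\//-/'
}

# Prints the first non-empty value of: $1 (stand-in for
# CODEBUILD_RESOLVED_SOURCE_VERSION), then $2 (stand-in for COMMIT_HASH),
# falling back to an empty string.
resolve_commit_hash() {
  printf '%s\n' "${1:-${2:-}}"
}

normalize_branch "origin/experiment/myexperiment"   # prints experiment-myexperiment
resolve_commit_hash "" "abc123"                     # prints abc123
```

Because the second substitution has no `g` flag, only the first remaining `/` is replaced, so a branch such as `experiment/a/b` would be passed on as `experiment-a/b`.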