## MLOps Pipeline Template

Overview, code templates, and model options for the stages of an end-to-end, full-stack MLOps pipeline.

***

### Create Directory Structure

```
- mlops_pipeline/
  - cdk/
    - app.py
    - stack.py
  - model/
    - train.py (PyTorch code)
    - train_tf.py (TensorFlow code)
  - docker/
    - Dockerfile
  - .gitlab-ci.yml

```

***Bash script to create the directory structure:***

```
#!/bin/bash

# Create the main directory
mkdir -p mlops_pipeline

# Create subdirectories for CDK, model, and Docker (the GitLab CI file lives at the top level)
mkdir -p mlops_pipeline/cdk
mkdir -p mlops_pipeline/model
mkdir -p mlops_pipeline/docker

# Create the necessary files
touch mlops_pipeline/cdk/app.py
touch mlops_pipeline/cdk/stack.py
touch mlops_pipeline/model/train.py
touch mlops_pipeline/model/train_tf.py
touch mlops_pipeline/docker/Dockerfile
touch mlops_pipeline/.gitlab-ci.yml

echo "Directory structure created."

```

***Save this content in a file named*** `create_directory_structure.sh`***, make it executable, and run it:***


```
chmod +x create_directory_structure.sh
./create_directory_structure.sh

```

***

### AWS CDK - Define CloudFormation Resources

`cdk/app.py`

In [None]:
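from aws_cdk import core  # CDK v1-style import
from stack import MLOpsStack

app = core.App()
MLOpsStack(app, "MLOpsStack")

# Synthesize the CloudFormation template; without this call there is nothing to deploy
app.synth()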

`cdk/stack.py`

In [None]:
from aws_cdk import (
    aws_s3 as s3,
    aws_sagemaker as sagemaker,
    core,
    ##-- import other required services
)

class MLOpsStack(core.Stack):
    def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        ##-- Create an S3 bucket
        data_bucket = s3.Bucket(self, "DataBucket")
        ##-- Add other resources like SageMaker, Lambda, etc.

***Run*** `cdk deploy` ***to deploy the stack***

### Random Forest Model

##### PyTorch Model

In [None]:
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
import torch

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier()
clf.fit(X, y)

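# Note: this does not convert the model to PyTorch. It only wraps the fitted
# forest's feature importances in a tensor. Converting the trees themselves
# requires a dedicated tool (for example, Hummingbird).
feature_importances = torch.from_numpy(clf.feature_importances_)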

##### TensorFlow Model

In [None]:
import tensorflow as tf

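# Note: `clf` from the previous cell is not used or converted here. This defines a
# separate gradient-boosted trees estimator over the same features (`X` comes from
# the previous cell). The tf.estimator and tf.feature_column APIs are deprecated and
# are not available in recent TensorFlow releases.
feature_columns = [tf.feature_column.numeric_column(key=str(i)) for i in range(X.shape[1])]
tf_estimator = tf.estimator.BoostedTreesClassifier(feature_columns, n_batches_per_layer=1)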

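##### Scalable Model with Spark (Scala API shown)

In [None]:
import org.apache.spark.ml.classification.RandomForestClassifier

// Configure the classifier; "label" and "features" must be columns of the training DataFrame
val rf = new RandomForestClassifier()
  .setLabelCol("label")
  .setFeaturesCol("features")

// Train the model (`trainingData` is assumed to be a DataFrame prepared elsewhere)
val model = rf.fit(trainingData)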

### Deploy Model 

##### SageMaker (Python SDK)

In [2]:
from sagemaker.pytorch import PyTorchModel
from sagemaker import get_execution_role

role = get_execution_role()

# framework_version and py_version together select the PyTorch serving image
pytorch_model = PyTorchModel(
    model_data='s3://path/to/model.tar.gz',
    role=role,
    framework_version='1.5.0',
    py_version='py3',
    entry_point='inference.py',
)

predictor = pytorch_model.deploy(instance_type='ml.m4.xlarge', initial_instance_count=1)
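
The `model_data` argument above points at a `model.tar.gz` archive in S3. A minimal sketch of building such an archive from a local directory, using only the standard library, might look like the following. The function name `package_model` is hypothetical, and the internal layout a given serving container expects is not specified in this template, so check it against the image you deploy.

```python
import tarfile
from pathlib import Path


def package_model(artifact_dir: str, output_path: str = "model.tar.gz") -> Path:
    """Bundle every file under artifact_dir into a gzipped tarball.

    Files are stored relative to artifact_dir, so they sit at the archive
    root rather than under a parent folder.
    """
    src = Path(artifact_dir)
    out = Path(output_path)
    with tarfile.open(out, "w:gz") as tar:
        for path in sorted(src.rglob("*")):
            # Skip directories and the archive itself if it is written inside src
            if path.is_file() and path.resolve() != out.resolve():
                tar.add(path, arcname=path.relative_to(src).as_posix())
    return out
```

The resulting file can then be uploaded to the S3 location referenced by `model_data`.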

##### SageMaker (Boto3)

In [None]:
import boto3

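sagemaker_client = boto3.client('sagemaker')  # named to avoid shadowing the `sagemaker` SDK module

model_url = "s3://my-bucket/model.tar.gz"

# Register the model: a container image plus the model artifacts in S3
response = sagemaker_client.create_model(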
    ModelName='my-random-forest-model',
    PrimaryContainer={
        'Image': 'your-docker-image-url',
        'ModelDataUrl': model_url
    },
    ExecutionRoleArn='your-iam-role-arn'
)

##-- Code for endpoint creation, model deployment
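
One possible way to fill in the endpoint step is sketched below. It calls the Boto3 SageMaker operations `create_endpoint_config` and `create_endpoint` on a client passed in by the caller. The helper name `deploy_endpoint`, the `-config` suffix, the variant name, and the default instance type are illustrative assumptions, not part of the original template.

```python
def deploy_endpoint(client, model_name, endpoint_name,
                    instance_type="ml.m4.xlarge", instance_count=1):
    """Create an endpoint config and an endpoint for an existing SageMaker model.

    `client` is expected to behave like boto3.client("sagemaker").
    """
    config_name = f"{endpoint_name}-config"  # hypothetical naming convention

    # The endpoint config describes which model runs on which instances
    client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
        }],
    )

    # The endpoint itself is created asynchronously from that config
    return client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_name,
    )
```

A call such as `deploy_endpoint(boto3.client("sagemaker"), "my-random-forest-model", "my-random-forest-endpoint")` would start the deployment. Because creation is asynchronous, the endpoint is not usable until its status reaches `InService`.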

### AWS SDK for Python (Boto3) - Create S3 Bucket

In [None]:
import boto3

s3 = boto3.client('s3')
s3.create_bucket(Bucket='my-data-bucket')
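
The call above works as written only in `us-east-1`; in other regions, S3 expects a `CreateBucketConfiguration` with a matching `LocationConstraint`. A small region-aware wrapper could be sketched as follows. The helper name `create_bucket` and its parameters are illustrative, and the client is passed in so the function makes no assumption about how credentials are configured.

```python
def create_bucket(client, bucket_name, region="us-east-1"):
    """Create an S3 bucket, adding a location constraint outside us-east-1.

    `client` is expected to behave like boto3.client("s3").
    """
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        # us-east-1 is the default and must not be passed as a constraint
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return client.create_bucket(**kwargs)
```

For example, `create_bucket(boto3.client("s3", region_name="eu-west-1"), "my-data-bucket", "eu-west-1")` would create the bucket in `eu-west-1`. Bucket names are globally unique, so `my-data-bucket` is a placeholder.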

***

### Productionize with Docker
`docker/Dockerfile`

`dockerfile`

```
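# Use an official PyTorch runtime as the parent image
# (use tensorflow/tensorflow instead for train_tf.py; pin a tag for reproducible builds)
FROM pytorch/pytorch

# Set the working directory
WORKDIR /app

# Copy the build context into the image (assumed to be the repository root)
COPY . /app

# Install dependencies (assumes a requirements.txt at the root of the build context)
RUN pip install --no-cache-dir -r requirements.txt

# Run the training script when the container launches
CMD ["python", "model/train.py"]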
```

### CI/CD & Version Control with GitLab
`.gitlab-ci.yml`

`yml`

```
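stages:
  - build
  - deploy

build:
  stage: build
  script:
    # Replace my-model with a registry-qualified tag (for SageMaker, typically an
    # Amazon ECR repository URI) and log in to that registry before pushing
    - docker build -f docker/Dockerfile -t my-model .
    - docker push my-model

deploy:
  stage: deploy
  script:
    - aws s3 cp model.tar.gz s3://my-bucket/
    - aws sagemaker create-model --model-name "my-model" ...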
```

### Initialize New Repo for Pipeline
`bash`

```
git init
git add .
git commit -m "Initial commit"
git remote add origin [GitLab Repo URL]
git push -u origin master
```