## Running numerical optimization jobs with your own dependencies

This section walks through how to build a SageMaker Processing container and how to use a `ScriptProcessor` to run your own numerical optimization code inside it. Because you build the image yourself, you can install whatever dependencies your processing script needs.

In [8]:
import boto3
import sagemaker
from sagemaker import get_execution_role

region = boto3.session.Session().region_name

role = get_execution_role()

In [2]:
!rm -rf docker

In [2]:
!mkdir docker

In [3]:
%%writefile docker/Dockerfile

FROM continuumio/anaconda3

RUN pip install boto3 pandas scikit-learn pulp pyomo inspyred ortools scipy deap 

RUN conda install -c conda-forge ipopt coincbc glpk

ENV PYTHONUNBUFFERED=TRUE

ENTRYPOINT ["python"]

Writing docker/Dockerfile


This block of code builds the container using the `docker` command, creates an Amazon Elastic Container Registry (Amazon ECR) repository, and pushes the image to Amazon ECR.
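The image URI assembled in the next cell follows the usual pattern for a private Amazon ECR repository: account ID, region, repository name, and tag. As a minimal sketch of that pattern, the helper below builds the same string. The name `ecr_image_uri` and the account ID are placeholders for illustration, not part of the notebook.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Return the ECR image URI for the given repository and tag."""
    return "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(
        account_id, region, repository, tag
    )


# Placeholder account ID and region, for illustration only.
print(ecr_image_uri("123456789012", "us-west-2", "sagemaker-opt-container"))
```

Keeping the tag as a separate argument avoids having to remember whether the colon belongs to the repository name or to the tag.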

In [4]:
import boto3

account_id = boto3.client('sts').get_caller_identity().get('Account')
ecr_repository = 'sagemaker-opt-container'
tag = ':latest'
processing_repository_uri = '{}.dkr.ecr.{}.amazonaws.com/{}'.format(account_id, region, ecr_repository + tag)

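# Build the image, then create the ECR repository and push the image to it
!docker build -t $ecr_repository docker
# `aws ecr get-login` is not available in AWS CLI v2; `get-login-password` works with v2 and recent v1 releases
!aws ecr get-login-password --region $region | docker login --username AWS --password-stdin {account_id}.dkr.ecr.{region}.amazonaws.com
!aws ecr create-repository --repository-name $ecr_repository
!docker tag {ecr_repository + tag} $processing_repository_uri
!docker push $processing_repository_uri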

Sending build context to Docker daemon  2.048kB
Step 1/5 : FROM continuumio/anaconda3
latest: Pulling from continuumio/anaconda3

libcblas-3.9.0       | 11 KB     | ########## | 100% 
ipopt-3.12.13        | 928 KB    | ########## | 100% 
conda-4.10.3         | 3.1 MB    | ########## | 100% 
ampl-mp-3.1.0        | 1.2 MB    | ########## | 100% 
mumps-include-5.2.1  | 23 KB     | ########## | 100% 
mumps-seq-5.2.1      | 3.4 MB    | ########## | 100% 
coincbc-2.10.5       | 7.9 MB    | ########## | 100% 
python_abi-3.8       | 4 KB      | ########## | 100% 
liblapack-3.9.0      | 11 KB     | ########## | 100% 
libblas-3.9.0        | 12 KB     | ########## | 100% 
metis-5.1.0          | 4.1 MB    | ########## | 100% 
scotch-6.0.8         | 1.4 MB    | ########## | 100% 
glpk-4.65            | 1.0 MB    | ########## | 100% 
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done


  current version: 4.10.1
  latest version: 4.10.3

Please update conda by running

    $ conda update -n base -c defaults conda


Removing intermediate containe

The `ScriptProcessor` class lets you run a command inside this container, which you can use to run your own script.

In [16]:
%%writefile parameters.json

{
    "node": 10, 
    "connect_prob": 0.5, 
    "parallel": 6
}


Overwriting parameters.json
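Because the processing script reads these three values without checking them, it can be useful to validate the file before uploading it. The following is a minimal sketch of such a check. The function name `load_parameters` and the specific bounds (at least 2 nodes, at least 1 worker) are assumptions made for illustration.

```python
import json


def load_parameters(text):
    """Parse the JSON parameter text and check the three expected fields."""
    params = json.loads(text)

    n_nodes = params["node"]
    connect_prob = params["connect_prob"]
    workers = params["parallel"]

    if not isinstance(n_nodes, int) or n_nodes < 2:
        raise ValueError("'node' must be an integer of at least 2")
    if not 0.0 <= connect_prob <= 1.0:
        raise ValueError("'connect_prob' must lie between 0 and 1")
    if not isinstance(workers, int) or workers < 1:
        raise ValueError("'parallel' must be a positive integer")
    return n_nodes, connect_prob, workers


example = '{"node": 10, "connect_prob": 0.5, "parallel": 6}'
print(load_parameters(example))
```

Failing early in the notebook is cheaper than waiting for a processing instance to start and then fail on a malformed parameter.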


In [14]:
import sagemaker 
session = sagemaker.session.Session()
bucket = session.default_bucket()

In [15]:
!aws s3 cp parameters.json s3://$bucket/opt-example/parameters.json

upload: ./parameters.json to s3://sagemaker-us-west-2-230755935769/opt-example/parameters.json


In [5]:
from sagemaker.processing import ScriptProcessor

script_processor = ScriptProcessor(command=['python'],
                image_uri=processing_repository_uri,
                role=role,
                instance_count=1,
                instance_type='ml.m5.xlarge')
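To illustrate how `command` and the script passed to `run()` fit together, the sketch below reconstructs the command line the container ends up executing. It assumes the uploaded script is mounted under `/opt/ml/processing/input/code`, which matches the `LocalPath` of the `code` input in the job description shown later in this notebook. The helper `container_entrypoint` is hypothetical and only mimics what the SDK appears to do.

```python
from pathlib import PurePosixPath

# Container directory where the uploaded script is mounted (assumed from
# the 'LocalPath' of the 'code' input in the job description).
CODE_DIR = PurePosixPath("/opt/ml/processing/input/code")


def container_entrypoint(command, script_name):
    """Return the command list followed by the script's path in the container."""
    return list(command) + [str(CODE_DIR / script_name)]


print(container_entrypoint(["python"], "preprocessing.py"))
```

With `command=['python']`, the job therefore behaves like running `python /opt/ml/processing/input/code/preprocessing.py` inside the image.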

## MaxCut - Using OR-Tools

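### A maximum cut problem

In the next example, the script generates a random Erdős–Rényi graph and searches for a maximum cut: a split of the nodes into two groups that maximizes the number of edges running between the groups. The model is built as follows:

- Each node `i` gets an integer variable `x_i`, where -1 and 1 mark the two groups.
- For each pair of nodes `i < j`, an auxiliary variable is constrained to equal the product `x_i * x_j`.
- Each edge adds `1 - x_i * x_j` to the objective, which is 2 when the edge is cut and 0 when both ends are in the same group, so maximizing the objective maximizes the cut.

The number of nodes, the edge probability, and the number of parallel solver workers are read from `parameters.json`. The model is solved with the OR-Tools CP-SAT solver, which the OR-Tools employee scheduling guide also uses: https://developers.google.com/optimization/scheduling/employee_scheduling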
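To make the MaxCut objective used by the script below concrete, here is a small brute-force sketch in plain Python. It checks that the sum of `1 - x_i * x_j` over the edges equals twice the number of cut edges when every `x_i` is -1 or 1. The example graph and the function names are illustrative assumptions, not part of the notebook, and the exhaustive search is only practical for very small graphs.

```python
from itertools import product


def cut_size(edges, spins):
    """Count the edges whose endpoints carry opposite spins (+1 or -1)."""
    return sum(1 for i, j in edges if spins[i] != spins[j])


def spin_objective(edges, spins):
    """Sum of (1 - x_i * x_j) over the edges, the quantity being maximized."""
    return sum(1 - spins[i] * spins[j] for i, j in edges)


def brute_force_max_cut(n_nodes, edges):
    """Try every +1/-1 assignment and keep one with the largest cut."""
    best = max(product((-1, 1), repeat=n_nodes),
               key=lambda s: cut_size(edges, s))
    return best, cut_size(edges, best)


# A 4-cycle with one chord (hypothetical example graph).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
spins, size = brute_force_max_cut(4, edges)
print(spins, size, spin_objective(edges, spins))
```

The exhaustive search grows as 2^n in the number of nodes, which is why the notebook hands the same objective to a constraint solver instead.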

In [25]:
%%writefile preprocessing.py


import json

import networkx as nx
from ortools.sat.python import cp_model


def objective(optvar, edges, n_nodes):
    """Build the sum of (1 - x_i * x_j) over all edges (i, j) with i < j."""
    exp = None
    for j in range(n_nodes):
        for i in range(j):
            if edges[i][j] > 0:
                term = 1 - optvar[i][j] * int(edges[i][j])
                exp = term if exp is None else exp + term
    return exp


# SageMaker Processing copies the S3 input to this path inside the container.
with open('/opt/ml/processing/input/parameters.json') as input_f:
    data = json.load(input_f)
n_nodes = data['node']
p = data['connect_prob']  # probability of an edge
workers_n = data['parallel']  # number of parallel search workers
seed = 1967

# Random graph and its adjacency matrix as nested lists.
g = nx.erdos_renyi_graph(n_nodes, p=p, seed=seed)
# to_numpy_matrix was removed in NetworkX 3.0; to_numpy_array holds the same entries.
edges = nx.to_numpy_array(g).tolist()
model = cp_model.CpModel()

# One variable per node: -1 and 1 mark the two sides of the cut.
nodes = []
for i in range(n_nodes):
    var = model.NewIntVar(-1, 1, "x" + str(i))
    model.Add(var != 0)  # exclude 0 so every node lies on one side of the cut
    nodes.append(var)

# optvar[i][j] equals nodes[i] * nodes[j] for every pair i < j.
optvar = [[None for i in range(n_nodes)] for j in range(n_nodes)]
for j in range(n_nodes):
    for i in range(j):
        name = "x{}x{}".format(i, j)
        optvar[i][j] = model.NewIntVar(-1, 1, name)
        model.AddMultiplicationEquality(optvar[i][j], [nodes[i], nodes[j]])

model.Maximize(objective(optvar, edges, n_nodes))

solver = cp_model.CpSolver()
solution_printer = cp_model.VarArrayAndObjectiveSolutionPrinter(nodes)
solver.parameters.num_search_workers = workers_n
status = solver.Solve(model, solution_printer)
print(solver.StatusName(status))


Overwriting preprocessing.py


In [26]:
from sagemaker.processing import ProcessingInput, ProcessingOutput

script_processor.run(code='preprocessing.py',
                      inputs=[ProcessingInput(
                        source='s3://{}/opt-example/parameters.json'.format(bucket),
                        destination='/opt/ml/processing/input')],
                      outputs=[ProcessingOutput(output_name='data',
                                                source='/opt/ml/processing/data')])

script_processor_job_description = script_processor.jobs[-1].describe()
print(script_processor_job_description)


Job Name:  sagemaker-opt-container-2021-07-27-14-45-33-319
Inputs:  [{'InputName': 'input-1', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-west-2-230755935769/opt-example/parameters.json', 'LocalPath': '/opt/ml/processing/input', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'code', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-west-2-230755935769/sagemaker-opt-container-2021-07-27-14-45-33-319/input/code/preprocessing.py', 'LocalPath': '/opt/ml/processing/input/code', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]
Outputs:  [{'OutputName': 'data', 'AppManaged': False, 'S3Output': {'S3Uri': 's3://sagemaker-us-west-2-230755935769/sagemaker-opt-container-2021-07-27-14-45-33-319/output/data', 'LocalPath': '/opt/ml/processing/data', 'S3UploadMode': 'EndOfJob'}}]
.........................
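Once the job finishes, the dictionary returned by `describe()` can be used to locate the job's outputs in Amazon S3. The sketch below pulls the S3 URI of each output out of such a dictionary. The helper name `output_s3_uris` and the sample dictionary (with a placeholder bucket and job name) are assumptions for illustration, and the key layout follows the `ProcessingOutputConfig` structure as I understand it from the processing job description.

```python
def output_s3_uris(job_description):
    """Map each output name to its S3 URI from a processing job description."""
    config = job_description.get("ProcessingOutputConfig", {})
    outputs = config.get("Outputs", [])
    return {o["OutputName"]: o["S3Output"]["S3Uri"] for o in outputs}


# Hand-written stand-in for the dictionary returned by describe().
sample_description = {
    "ProcessingOutputConfig": {
        "Outputs": [
            {
                "OutputName": "data",
                "S3Output": {
                    "S3Uri": "s3://example-bucket/example-job/output/data",
                    "LocalPath": "/opt/ml/processing/data",
                    "S3UploadMode": "EndOfJob",
                },
            }
        ]
    }
}
print(output_s3_uris(sample_description))
```

Note that the script in this example only prints its solutions and does not write anything to `/opt/ml/processing/data`, so the output prefix stays empty unless the script is extended to save its results there.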

### Summary

We used several examples, front ends, and solvers to solve numerical optimization problems with SageMaker Processing. As a next step, try `scipy.optimize`, DEAP, or inspyred, which are already installed in the container, to explore other examples.

### References

1. https://sagemaker.readthedocs.io/en/stable/processing.html
1. https://pythonhosted.org/PuLP/
1. https://developers.google.com/optimization/introduction/get_started
1. https://pyomo.readthedocs.io/en/stable/
1. https://pythonhosted.org/inspyred/recipes.html
1. https://docs.scipy.org/doc/scipy/reference/optimize.html
1. https://deap.readthedocs.io/en/master/