## Running numerical optimization jobs with your own dependencies

This notebook walks through building a SageMaker Processing container and using a `ScriptProcessor` to run your own numerical optimization code inside it. Because you build the image yourself, you can install whatever dependencies your processing script needs.

In [None]:
import boto3
import sagemaker
from sagemaker import get_execution_role

region = boto3.session.Session().region_name

role = get_execution_role()

In [None]:
!rm -rf docker

In [None]:
!mkdir docker

In [None]:
%%writefile docker/Dockerfile

FROM continuumio/anaconda3

RUN pip install boto3 pandas scikit-learn pulp pyomo inspyred ortools scipy deap 

RUN conda install -y -c conda-forge ipopt coincbc glpk

ENV PYTHONUNBUFFERED=TRUE

ENTRYPOINT ["python"]

The next cell builds the image with the `docker` command, creates an Amazon Elastic Container Registry (Amazon ECR) repository, and pushes the image to Amazon ECR.
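
The image URI pushed to Amazon ECR follows a fixed pattern, which the sketch below spells out. `ecr_image_uri` is a hypothetical helper written for illustration (it is not part of boto3 or the SageMaker SDK), the account ID and Region are placeholders, and the `amazonaws.com` suffix assumes a standard commercial Region.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Return a fully qualified ECR image URI (hypothetical helper, for illustration)."""
    # Registry host: <account>.dkr.ecr.<region>.amazonaws.com (assumes a standard commercial Region)
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repository}:{tag}"


# Placeholder account ID and Region, not real values.
example_uri = ecr_image_uri("123456789012", "us-east-1", "sagemaker-opt-container")
print(example_uri)
```

Keeping the tag as a separate argument avoids the easy mistake of appending `:latest` twice when the same repository name is reused for `docker build` and `docker tag`.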

In [None]:
import boto3

account_id = boto3.client('sts').get_caller_identity().get('Account')
ecr_repository = 'sagemaker-opt-container'
tag = ':latest'
processing_repository_uri = '{}.dkr.ecr.{}.amazonaws.com/{}'.format(account_id, region, ecr_repository + tag)

# Create ECR repository and push docker image
!docker build -t $ecr_repository docker
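# `aws ecr get-login` was removed in AWS CLI v2; `get-login-password` works with recent v1 releases and with v2
!aws ecr get-login-password --region $region | docker login --username AWS --password-stdin {account_id}.dkr.ecr.{region}.amazonaws.com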
!aws ecr create-repository --repository-name $ecr_repository
!docker tag {ecr_repository + tag} $processing_repository_uri
!docker push $processing_repository_uri

The `ScriptProcessor` class lets you run a command inside this container, which you can use to run your own script.
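
Inside the container, the script only sees local files: SageMaker Processing copies each input to its destination directory before the command runs. A minimal sketch of reading and sanity-checking the parameter file this notebook uses might look like the following. `load_parameters` and its validation rules are assumptions added for illustration; the key names (`node`, `connect_prob`, `parallel`) come from the `parameters.json` file in this notebook.

```python
import json
import os
import tempfile


def load_parameters(path):
    """Read the job parameters and check them (hypothetical helper)."""
    with open(path) as f:
        params = json.load(f)
    n_nodes = int(params["node"])
    connect_prob = float(params["connect_prob"])
    workers = int(params["parallel"])
    # Illustrative checks; adjust the limits to your own problem.
    if n_nodes < 2:
        raise ValueError("node must be at least 2")
    if not 0.0 <= connect_prob <= 1.0:
        raise ValueError("connect_prob must be between 0 and 1")
    if workers < 1:
        raise ValueError("parallel must be at least 1")
    return n_nodes, connect_prob, workers


# Local usage example with a temporary file standing in for the container path.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "parameters.json")
    with open(sample_path, "w") as f:
        json.dump({"node": 20, "connect_prob": 0.5, "parallel": 12}, f)
    loaded = load_parameters(sample_path)
print(loaded)
```

In the processing job itself, the path would be the input destination, for example `/opt/ml/processing/input/parameters.json`.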

In [None]:
%%writefile parameters.json

{
    "node": 20, 
    "connect_prob": 0.5, 
    "parallel": 12
}


In [None]:
import sagemaker 
session = sagemaker.session.Session()
bucket = session.default_bucket()

In [None]:
!aws s3 cp parameters.json s3://$bucket/opt-example/parameters.json

In [None]:
from sagemaker.processing import ScriptProcessor

script_processor = ScriptProcessor(command=['python'],
                image_uri=processing_repository_uri,
                role=role,
                instance_count=1,
                instance_type='ml.m5.xlarge')  # for larger jobs, switch to a bigger instance type, for example ml.c5.4xlarge (ref: https://aws.amazon.com/ec2/instance-types/)

## MaxCut - Using OR-Tools
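
The script in this section models MaxCut with one spin variable per node: an edge is cut when its two endpoints have opposite signs, so their product is -1 and the term `(1 - x_i * x_j) / 2` counts that edge once. The script's objective omits the factor 1/2, which does not change which assignment is best. As a sanity check on the formulation, here is a brute-force sketch for very small graphs. It is not what CP-SAT does internally, it enumerates all 2^n assignments, and the function names are made up for illustration.

```python
from itertools import product


def cut_size(spins, edges):
    """Count edges whose endpoints have opposite spins (+1 / -1)."""
    # (1 - s_i * s_j) is 2 for a cut edge and 0 otherwise, so halve it.
    return sum((1 - spins[i] * spins[j]) // 2 for i, j in edges)


def brute_force_maxcut(n_nodes, edges):
    """Try every spin assignment; only practical for tiny graphs."""
    best_cut, best_spins = -1, None
    for spins in product((-1, 1), repeat=n_nodes):
        current = cut_size(spins, edges)
        if current > best_cut:
            best_cut, best_spins = current, spins
    return best_cut, best_spins


# A 4-cycle: alternating the spins around the cycle cuts every edge.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
best_cut, best_spins = brute_force_maxcut(4, square)
print(best_cut, best_spins)
```

Comparing this count against half of the solver's objective value on a small random graph is a quick way to confirm that the model encodes the intended problem.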

In [None]:
%%writefile preprocessing.py


import json 
import networkx as nx
from ortools.sat.python import cp_model


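def objective(optvar, edges, n_nodes):
    """Build the MaxCut objective: the sum of (1 - x_i * x_j) over all edges (i, j)."""
    exp = None
    for j in range(n_nodes):
        for i in range(j):  # visit each unordered pair i < j once
            if edges[i][j] > 0:
                # optvar[i][j] holds the product x_i * x_j, so the term is
                # largest when nodes i and j take opposite signs
                term = 1 - optvar[i][j] * int(edges[i][j])
                exp = term if exp is None else exp + term
    return exp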

# SageMaker Processing copies the S3 input to this path before the script starts
with open('/opt/ml/processing/input/parameters.json') as input_f:
    data = json.load(input_f)
n_nodes = data['node']
p = data['connect_prob']  # probability of an edge
workers_n = data['parallel']
seed = 1967

g = nx.erdos_renyi_graph(n_nodes, p=p, seed=seed)
# nx.draw(g, with_labels=True, pos=positions, node_size=600)
# to_numpy_matrix was removed in NetworkX 3.0; to_numpy_array returns the same adjacency data
edges = nx.to_numpy_array(g).tolist()
model = cp_model.CpModel()

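# One spin variable per node, restricted to {-1, 1}; a value of 0 would put the node on neither side of the cut
spin_domain = cp_model.Domain.FromValues([-1, 1])
nodes = []
for i in range(n_nodes):
    nodes.append(model.NewIntVarFromDomain(spin_domain, "x" + str(i)))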

optvar = [[None for i in range(0, n_nodes)] for j in range(0, n_nodes)]     
for j in range(0, n_nodes):
    for i in range(0, n_nodes):
        if j > i:
            name = "x{}x{}".format(str(i), str(j))
            optvar[i][j] = model.NewIntVar(-1, 1, name)
            model.AddMultiplicationEquality( optvar[i][j], [nodes[i], nodes[j]])
            
model.Maximize(objective(optvar, edges, n_nodes))            

solver = cp_model.CpSolver()
solution_printer = cp_model.VarArrayAndObjectiveSolutionPrinter(nodes)
solver.parameters.num_search_workers = workers_n
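status = solver.Solve(model, solution_printer)

# Report the final status and, when a solution exists, the objective value
print('Status:', solver.StatusName(status))
if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    print('Objective value:', solver.ObjectiveValue())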


In [None]:
from sagemaker.processing import ProcessingInput, ProcessingOutput

script_processor.run(code='preprocessing.py',
                      inputs=[ProcessingInput(
                        source='s3://{}/opt-example/parameters.json'.format(bucket),
                        destination='/opt/ml/processing/input')],
                      outputs=[ProcessingOutput(output_name='data',
                                                source='/opt/ml/processing/data')])

script_processor_job_description = script_processor.jobs[-1].describe()
print(script_processor_job_description)
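
The job description printed above is a plain dictionary, so the S3 location of each output can be read from it directly. The sketch below assumes the usual `DescribeProcessingJob` response layout (`ProcessingOutputConfig` → `Outputs` → `S3Output` → `S3Uri`); `output_s3_uris` is a hypothetical helper, and the sample dictionary uses a placeholder bucket rather than real job output.

```python
def output_s3_uris(job_description):
    """Map each output name to its S3 URI (hypothetical helper)."""
    outputs = job_description.get("ProcessingOutputConfig", {}).get("Outputs", [])
    return {
        output["OutputName"]: output["S3Output"]["S3Uri"]
        for output in outputs
        if "S3Output" in output
    }


# Trimmed-down stand-in for a real job description; the bucket name is a placeholder.
sample_description = {
    "ProcessingJobStatus": "Completed",
    "ProcessingOutputConfig": {
        "Outputs": [
            {
                "OutputName": "data",
                "S3Output": {
                    "S3Uri": "s3://example-bucket/example-job/output/data",
                    "LocalPath": "/opt/ml/processing/data",
                    "S3UploadMode": "EndOfJob",
                },
            }
        ]
    },
}
uris = output_s3_uris(sample_description)
print(uris)
```

With the real description, `output_s3_uris(script_processor_job_description)["data"]` would give the prefix where anything the script writes to `/opt/ml/processing/data` is uploaded.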

### Summary

This notebook used SageMaker Processing to run numerical optimization code in a custom container, here a MaxCut model solved with the OR-Tools CP-SAT solver. Next, try `scipy.optimize`, DEAP, or inspyred, which are already installed in the container, to explore other examples.