## Comet.ml: Sagemaker Linear Learner Introduction Integration

The code below is taken directly from Amazon Sagemaker's official [An Introduction to Linear Learner with MNIST](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/linear_learner_mnist/linear_learner_mnist.ipynb) notebook.

Most of the descriptive text has been removed, but the code is identical.

Follow along below to learn how to log Sagemaker training jobs to Comet.ml.

#### Install the comet_ml_sagemaker Python package

Comet's SageMaker integration is available to Enterprise customers only. If you would like to learn more about Comet Enterprise, or you are in a trial period with Comet.ml and would like to evaluate the SageMaker integration, email support@comet.ml to receive credentials for downloading the required packages.

### Prerequisites and Preprocessing
#### Permissions and Environment Variables

In [None]:
bucket = "NAME_YOUR_BUCKET"
prefix = "sagemaker/DEMO-linear-mnist"

# Define IAM role
import boto3
import re
from sagemaker import get_execution_role

role = get_execution_role()

### Data Ingestion

In [None]:
%%time
import pickle, gzip, numpy, urllib.request, json

# Load the dataset
urllib.request.urlretrieve(
    "http://deeplearning.net/data/mnist/mnist.pkl.gz", "mnist.pkl.gz"
)
with gzip.open("mnist.pkl.gz", "rb") as f:
    train_set, valid_set, test_set = pickle.load(f, encoding="latin1")

### Data Inspection

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = (2, 10)


def show_digit(img, caption="", subplot=None):
    if subplot is None:
        _, (subplot) = plt.subplots(1, 1)
    imgr = img.reshape((28, 28))
    subplot.axis("off")
    subplot.imshow(imgr, cmap="gray")
    plt.title(caption)


show_digit(train_set[0][30], "This is a {}".format(train_set[1][30]))

### Data Conversion

In [None]:
import io
import numpy as np
import sagemaker.amazon.common as smac

vectors = np.array([t.tolist() for t in train_set[0]]).astype("float32")
labels = np.where(np.array([t.tolist() for t in train_set[1]]) == 0, 1, 0).astype(
    "float32"
)

buf = io.BytesIO()
smac.write_numpy_to_dense_tensor(buf, vectors, labels)
buf.seek(0)
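
The conversion cell above turns the ten digit classes into a binary target: a label becomes 1 when the digit is 0, and 0 otherwise. As a dependency-free illustration of that mapping, here is a minimal sketch on a handful of made-up labels (the `sample_digits` values are hypothetical and are not rows from MNIST):

```python
# Pure-Python equivalent of np.where(labels == 0, 1, 0) on made-up labels.
sample_digits = [5, 0, 4, 1, 0, 9]  # hypothetical digit labels, not real MNIST rows

# 1.0 marks "this image is a zero"; every other digit maps to 0.0.
binary_labels = [1.0 if digit == 0 else 0.0 for digit in sample_digits]

print(binary_labels)  # [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
```

This is the target that the `binary_classifier` predictor type is trained on later in the notebook.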

### Upload Training Data

In [None]:
import boto3
import os

key = "recordio-pb-data"
boto3.resource("s3").Bucket(bucket).Object(
    os.path.join(prefix, "train", key)
).upload_fileobj(buf)
s3_train_data = "s3://{}/{}/train/{}".format(bucket, prefix, key)
print("uploaded training data location: {}".format(s3_train_data))
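
The upload cell builds the S3 object key with `os.path.join`, which uses the separator of the host operating system. On a SageMaker notebook instance (Linux) that yields forward slashes, so the cell works as written. If you run the notebook elsewhere, a sketch like the following keeps the key and the URI consistent; the helper names `s3_key` and `s3_uri` are illustrative and not part of any SDK:

```python
import posixpath


def s3_key(prefix, *parts):
    """Join key segments with forward slashes, regardless of the host OS."""
    return posixpath.join(prefix, *parts)


def s3_uri(bucket, key):
    """Format a bucket and key as an s3:// URI."""
    return "s3://{}/{}".format(bucket, key)


# Same placeholder bucket, prefix, and key name as the cells above.
train_key = s3_key("sagemaker/DEMO-linear-mnist", "train", "recordio-pb-data")
print(s3_uri("NAME_YOUR_BUCKET", train_key))
```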

#### Set up the S3 output location for the model artifact produced by training

In [None]:
output_location = "s3://{}/{}/output".format(bucket, prefix)
print("training artifacts will be uploaded to: {}".format(output_location))

### Training the Linear Model

In [None]:
from sagemaker.amazon.amazon_estimator import get_image_uri

container = get_image_uri(boto3.Session().region_name, "linear-learner")

In [None]:
import boto3
import sagemaker

sess = sagemaker.Session()

linear = sagemaker.estimator.Estimator(
    container,
    role,
    train_instance_count=1,
    train_instance_type="ml.c4.xlarge",
    output_path=output_location,
    sagemaker_session=sess,
)
linear.set_hyperparameters(
    feature_dim=784, predictor_type="binary_classifier", mini_batch_size=200
)

linear.fit({"train": s3_train_data})

## Logging to Comet.ml

Define your Comet [REST API key](https://www.comet.com/docs/rest-api/getting-started/) and your [workspace](https://www.comet.com/docs/user-interface/#workspaces). See the [configuration documentation](http://docs.comet.ml/python-sdk/advanced/#python-configuration) for details on both settings.

In [None]:
COMET_REST_API = "YOUR_API_KEY"
COMET_WORKSPACE = "YOUR_WORKSPACE"
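
If you would rather not hardcode credentials in a notebook, one option is to read them from environment variables. The sketch below assumes the variables `COMET_API_KEY` and `COMET_WORKSPACE` have been exported in the notebook's environment, and falls back to the placeholders otherwise. Whether `comet_ml_sagemaker` also reads these variables on its own is not confirmed here, so the values are still passed explicitly in the cells that follow.

```python
import os

# Assumption: COMET_API_KEY and COMET_WORKSPACE were exported before the
# notebook kernel started. The fallbacks are the same placeholders as above.
COMET_REST_API = os.environ.get("COMET_API_KEY", "YOUR_API_KEY")
COMET_WORKSPACE = os.environ.get("COMET_WORKSPACE", "YOUR_WORKSPACE")

if COMET_REST_API == "YOUR_API_KEY":
    # Warn without ever printing a real key.
    print("COMET_API_KEY is not set; using the placeholder value")
```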

Import the `comet_ml_sagemaker` package.

In [None]:
import comet_ml_sagemaker

### comet_ml_sagemaker.log_sagemaker_job(estimator/regressor, api_key, workspace, project_name)
Logs a Sagemaker job based on an estimator/regressor object.

* estimator/regressor = Sagemaker estimator/regressor object
* api_key = your Comet REST API key
* workspace = your Comet workspace
* project_name = your Comet project_name

In [None]:
# Use this after training a model with the Sagemaker SDK, when you still have the estimator/regressor object.
# api_key is optional; if omitted, it is taken from the usual Comet configuration sources.
experiment = comet_ml_sagemaker.log_sagemaker_job(
    linear, api_key=COMET_REST_API, workspace=COMET_WORKSPACE, project_name="sagemaker"
)
print(experiment.url)

### comet_ml_sagemaker.log_sagemaker_job_by_name(job_name, api_key, workspace, project_name)
Logs a specific Sagemaker training job based on the jobname from the Sagemaker SDK.

* job_name = Cloudwatch/Sagemaker training job name
* api_key = your Comet REST API key
* workspace = your Comet workspace
* project_name = your Comet project_name

In [None]:
# Use this when you have the name of a completed training job you want to log.
# Same as log_sagemaker_job, except you pass the job name instead of the estimator/regressor object.
SAGEMAKER_TRAINING_JOB_NAME = "SAGEMAKER_TRAINING_JOB_NAME"
experiment = comet_ml_sagemaker.log_sagemaker_job_by_name(
    SAGEMAKER_TRAINING_JOB_NAME,
    api_key=COMET_REST_API,
    workspace=COMET_WORKSPACE,
    project_name="sagemaker",
)
print(experiment.url)

### comet_ml_sagemaker.log_last_sagemaker_job(api_key, workspace, project_name)
Logs the most recently *started* Sagemaker training job based on the current configuration.

* api_key = your Comet REST API key
* workspace = your Comet workspace
* project_name = your Comet project_name

In [None]:
# Logs the last job for your current Amazon Region / S3
experiment = comet_ml_sagemaker.log_last_sagemaker_job(
    api_key=COMET_REST_API, workspace=COMET_WORKSPACE, project_name="sagemaker"
)
print(experiment.url)
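
To check which job counts as the most recent one in your region before logging it, you could query SageMaker directly. The sketch below is not how `comet_ml_sagemaker` finds the job internally (that is not documented here). It uses the boto3 SageMaker client's `list_training_jobs` call and treats creation time as a proxy for start time. The helper name `latest_started_job_name` and the `StubClient` class are illustrative; the stub only exists so the sketch runs without AWS credentials.

```python
def latest_started_job_name(sagemaker_client):
    """Return the name of the most recently created training job, or None."""
    response = sagemaker_client.list_training_jobs(
        SortBy="CreationTime", SortOrder="Descending", MaxResults=1
    )
    summaries = response.get("TrainingJobSummaries", [])
    return summaries[0]["TrainingJobName"] if summaries else None


class StubClient:
    """Stand-in for boto3.client("sagemaker") so the sketch runs offline."""

    def list_training_jobs(self, **kwargs):
        return {"TrainingJobSummaries": [{"TrainingJobName": "example-job-name"}]}


print(latest_started_job_name(StubClient()))  # example-job-name
# With AWS credentials configured, pass a real client instead:
#   latest_started_job_name(boto3.client("sagemaker"))
```

The returned name can then be passed to `log_sagemaker_job_by_name` if you prefer to log an explicitly named job.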

#### Note on SageMaker configuration

The Comet.ml SageMaker integration uses boto to find your training job data. Refer to the [boto documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html) to configure the region and/or credentials if needed.
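
As one example of such configuration, boto reads the region and credentials from standard environment variables such as `AWS_DEFAULT_REGION`, `AWS_PROFILE`, and `AWS_ACCESS_KEY_ID`. The following is a minimal sketch that sets a default region and reports which variables are present, without printing their values. The region `us-east-1` is a placeholder assumption; inside a SageMaker notebook instance the execution role usually supplies credentials, so this step may be unnecessary there.

```python
import os

# Placeholder region; replace with the region where your training jobs run.
# setdefault leaves an existing value untouched.
os.environ.setdefault("AWS_DEFAULT_REGION", "us-east-1")

# Report presence only, never the values, to avoid leaking secrets.
for name in ("AWS_DEFAULT_REGION", "AWS_PROFILE", "AWS_ACCESS_KEY_ID"):
    status = "set" if os.environ.get(name) else "not set"
    print("{}: {}".format(name, status))
```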