# Now, we can start a new training job

We'll send a zip file called **trainingjob.zip** with the following structure:
 - trainingjob.json (SageMaker training job descriptor)
 - assets/deploy-model-prd.yml (CloudFormation template for deploying our model into Production)
 - assets/deploy-model-dev.yml (CloudFormation template for deploying our model into Development)

## Then, let's create the training job descriptor

In [None]:
import boto3
import time
import sagemaker
import os

# Get the IAM execution role attached to this notebook
role = sagemaker.get_execution_role()

# Get the current SageMaker session and its default S3 bucket
sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()
prefix = 'sagemaker-r-mars'

region = boto3.Session().region_name
account = boto3.client('sts').get_caller_identity().get('Account')

In [None]:
r_job = prefix + '-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())

print("Training job", r_job)

r_training_params = {
    "RoleArn": role,
    "TrainingJobName": r_job,
    "AlgorithmSpecification": {
        "TrainingImage": '{}.dkr.ecr.{}.amazonaws.com/sagemaker-rmars:latest'.format(account, region),
        "TrainingInputMode": "File"
    },
    "ResourceConfig": {
        "InstanceCount": 1,
        "InstanceType": "ml.m4.xlarge",
        "VolumeSizeInGB": 10
    },
    "InputDataConfig": [
        {
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": "s3://{}/{}/train".format(bucket, prefix),
                    "S3DataDistributionType": "FullyReplicated"
                }
            },
            "CompressionType": "None",
            "RecordWrapperType": "None"
        }
    ],
    "OutputDataConfig": {
        "S3OutputPath": "s3://{}/{}/output".format(bucket, prefix)
    },
    "HyperParameters": {
        "target": "Sepal.Length",
        "degree": "2"
    },
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 60 * 60
    }
}
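
Before packaging the descriptor, it can help to sanity-check it locally. The sketch below is a minimal, hypothetical check and not part of the SageMaker SDK: `EXPECTED_KEYS`, `missing_keys` and `non_string_hyperparameters` are names introduced here for illustration, and the key list simply mirrors the top-level keys used in this notebook. It also flags hyperparameter values that are not strings, since SageMaker expects hyperparameters as a string-to-string map, and confirms the dictionary survives a JSON round trip. The sample values are placeholders.

```python
import json

# Hypothetical list: the top-level keys used by this notebook's descriptor
EXPECTED_KEYS = [
    "RoleArn", "TrainingJobName", "AlgorithmSpecification", "ResourceConfig",
    "InputDataConfig", "OutputDataConfig", "HyperParameters", "StoppingCondition",
]

def missing_keys(params, expected=EXPECTED_KEYS):
    """Return the expected top-level keys that are absent from params."""
    return [key for key in expected if key not in params]

def non_string_hyperparameters(params):
    """Return the names of hyperparameters whose values are not strings."""
    hyper = params.get("HyperParameters", {})
    return [name for name, value in hyper.items() if not isinstance(value, str)]

# Placeholder descriptor with the same shape as r_training_params
sample_params = {
    "RoleArn": "arn:aws:iam::123456789012:role/example-role",
    "TrainingJobName": "sagemaker-r-mars-example",
    "AlgorithmSpecification": {"TrainingImage": "example-image", "TrainingInputMode": "File"},
    "ResourceConfig": {"InstanceCount": 1, "InstanceType": "ml.m4.xlarge", "VolumeSizeInGB": 10},
    "InputDataConfig": [],
    "OutputDataConfig": {"S3OutputPath": "s3://example-bucket/output"},
    "HyperParameters": {"target": "Sepal.Length", "degree": "2"},
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}

# The descriptor is written to trainingjob.json, so it must round-trip through JSON
round_tripped = json.loads(json.dumps(sample_params))

print("Missing keys:", missing_keys(sample_params))
print("Non-string hyperparameters:", non_string_hyperparameters(sample_params))
```

In the notebook itself, the same two functions could be called on `r_training_params` instead of the placeholder dictionary.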

## Before we start the training process, we need to upload our dataset to S3

In [None]:
train_file = 'iris.csv'
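# Build the S3 key with forward slashes (os.path.join would use backslashes on Windows)
train_key = '/'.join([prefix, 'train', train_file])
boto3.Session().resource('s3').Bucket(bucket).Object(train_key).upload_file(train_file)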
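
S3 object keys use forward slashes regardless of the operating system the notebook runs on. As a small sketch of how the key for the `train` channel could be assembled, the helper below uses `posixpath.join` from the standard library. `build_train_key` is a hypothetical name introduced here, not something from boto3 or the SageMaker SDK.

```python
import posixpath

def build_train_key(prefix, file_name, channel="train"):
    """Join the parts of an S3 object key with forward slashes."""
    return posixpath.join(prefix, channel, file_name)

train_key = build_train_key("sagemaker-r-mars", "iris.csv")
print(train_key)  # sagemaker-r-mars/train/iris.csv
```

The resulting key sits under the same `s3://<bucket>/<prefix>/train` location that the descriptor's `S3Uri` points to.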

## Alright! Now it's time to start the training process

In [None]:
import boto3
import io
import zipfile
import json

s3 = boto3.client('s3')

bucket_name = "mlops-%s-%s" % (region, account)
key_name = "training_jobs/rmars-model/trainingjob.zip"

# Build the zip archive in memory
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, 'w') as zf:
    zf.writestr('trainingjob.json', json.dumps(r_training_params))
    # Copy both CloudFormation templates into the archive, closing each file after reading
    for asset in ('assets/deploy-model-prd.yml', 'assets/deploy-model-dev.yml'):
        with open('../../' + asset, 'r') as f:
            zf.writestr(asset, f.read())

# Upload the finished archive to the MLOps bucket
s3.put_object(Bucket=bucket_name, Key=key_name, Body=zip_buffer.getvalue())
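
If you want to double-check the archive layout before uploading, one option is to read the zip back from memory and compare its entries with the structure listed at the top of this notebook. This is a minimal sketch: `EXPECTED_ENTRIES` and `list_archive` are hypothetical names, and the archive built here contains placeholder contents rather than the real descriptor and templates.

```python
import io
import json
import zipfile

# Entries the archive is expected to contain, per the structure described earlier
EXPECTED_ENTRIES = {
    'trainingjob.json',
    'assets/deploy-model-prd.yml',
    'assets/deploy-model-dev.yml',
}

def list_archive(zip_bytes):
    """Return the entry names stored in an in-memory zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes), 'r') as zf:
        return zf.namelist()

# Build a stand-in archive with placeholder contents
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('trainingjob.json', json.dumps({"TrainingJobName": "example"}))
    zf.writestr('assets/deploy-model-prd.yml', '# placeholder template\n')
    zf.writestr('assets/deploy-model-dev.yml', '# placeholder template\n')

entries = list_archive(buffer.getvalue())
missing = EXPECTED_ENTRIES - set(entries)

print("Entries:", entries)
print("Missing:", sorted(missing))
```

In the notebook, passing the bytes of `zip_buffer` to `list_archive` would run the same check against the real archive.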

### OK, now open the AWS console in another tab and go to the CodePipeline console to see the status of our build pipeline

> Finally, click here [NOTEBOOK](04_Check%20Progress%20and%20Test%20the%20endpoint.ipynb) to see the progress and test your endpoint