# Now, we can start a new training job

We'll send a zip file called **data-source.zip**, containing a single file:
 - inputData.json (S3 URIs of the training and validation data, plus the baseline data used to compute statistics for the monitoring schedule)

In [None]:
import time
import sagemaker
import boto3
import os

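# Keep the session in a variable: later cells call sagemaker_session.upload_data()
sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()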
artifact_bucket = os.environ['ARTIFACT_BUCKET']
prefix = os.environ['MODEL_NAME']

print('default bucket: {}'.format(bucket))
print('artifact bucket: {}'.format(artifact_bucket))

### Upload training data

Validate and upload the training and validation datasets to S3.
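
Before uploading, it can help to confirm that both dataset directories exist and are not empty. What follows is a minimal sketch, assuming the `input/data/training` and `input/data/validation` layout used in the upload cell. `count_files` and `check_datasets` are hypothetical helpers, not part of the SageMaker SDK:

```python
import os

def count_files(directory):
    """Return the number of regular files under directory, searched recursively."""
    if not os.path.isdir(directory):
        raise FileNotFoundError('Missing data directory: {}'.format(directory))
    return sum(len(files) for _, _, files in os.walk(directory))

def check_datasets(base_dir='input/data', channels=('training', 'validation')):
    """Raise if a channel directory is missing or empty; return file counts per channel."""
    counts = {}
    for channel in channels:
        path = os.path.join(base_dir, channel)
        counts[channel] = count_files(path)
        if counts[channel] == 0:
            raise ValueError('No files found in {}'.format(path))
    return counts
```

Calling `check_datasets()` right before the upload cell fails fast if a directory is missing, rather than uploading an incomplete dataset.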

In [None]:
!ls -R input/data/

In [None]:
training_uri = sagemaker_session.upload_data(path='input/data/training', key_prefix=prefix+'/input/training')
validation_uri = sagemaker_session.upload_data(path='input/data/validation', key_prefix=prefix+'/input/validation')
output_uri = 's3://{}/{}'.format(bucket, prefix)

print('Training uri: {}'.format(training_uri))
print('Validation uri: {}'.format(validation_uri))
print('Model output uri: {}'.format(output_uri))

### Upload baseline data

Validate and upload the baseline input to S3.
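
The note in the next cell warns that values written in scientific notation are treated as strings. A minimal sketch of one way to spot such values in the baseline CSV before uploading it follows. `find_scientific_values` is a hypothetical helper and the regular expression is an assumption about what counts as scientific notation here:

```python
import csv
import io
import re

# Matches values such as 1e-05, 3.2E+4 or -7.1e3
SCI_NOTATION = re.compile(r'^[+-]?(\d+\.?\d*|\.\d+)[eE][+-]?\d+$')

def find_scientific_values(csv_text, max_rows=100):
    """Return (row, column, value) for each field written in scientific notation."""
    hits = []
    for row_idx, row in enumerate(csv.reader(io.StringIO(csv_text))):
        if row_idx >= max_rows:
            break  # only inspect the first max_rows rows
        for col_idx, value in enumerate(row):
            value = value.strip()
            if SCI_NOTATION.match(value):
                hits.append((row_idx, col_idx, value))
    return hits
```

For example, reading `output/data/predictions.csv` with `open(...).read()` and passing the text to `find_scientific_values` returns an empty list when no field uses scientific notation.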

In [None]:
# Inspect the output predictions (NOTE: values in scientific notation will be treated as strings)
baseline_file = 'output/data/predictions.csv'

!head -2 $baseline_file

In [None]:
baseline_uri = sagemaker_session.upload_data(path=baseline_file, key_prefix=prefix+'/input/baseline')

print('Baseline uri: {}'.format(baseline_uri))

### Define input data

Define the input data that will kick off the training process once it is uploaded.

In [None]:
input_data = {
    'UpdatedAt': time.time(),
    'TrainingUri': training_uri,
    'ValidationUri': validation_uri,
    'BaselineUri': baseline_uri
}

## Alright! Now it's time to start the training process
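
The archive is built in memory, so its contents can be checked locally before anything is sent to S3. A minimal sketch, assuming the `inputData.json` member name used in the upload cell, is shown here. The URIs are placeholders, and `build_archive` and `read_archive` are hypothetical helpers:

```python
import io
import json
import time
import zipfile

def build_archive(payload, member='inputData.json'):
    """Serialize payload as JSON into an in-memory zip and return the raw bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w') as zf:
        zf.writestr(member, json.dumps(payload))
    return buffer.getvalue()

def read_archive(data, member='inputData.json'):
    """Read the JSON member back out of the zip bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return json.loads(zf.read(member))

# Placeholder values for illustration only
sample = {
    'UpdatedAt': time.time(),
    'TrainingUri': 's3://example-bucket/example-model/input/training',
    'ValidationUri': 's3://example-bucket/example-model/input/validation',
    'BaselineUri': 's3://example-bucket/example-model/input/baseline/predictions.csv',
}

archive = build_archive(sample)
# The payload should survive the round trip unchanged
assert read_archive(archive) == sample
```

The bytes returned by `build_archive` can be passed directly as the `Body` argument of `put_object`.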

In [None]:
import boto3
import io
import zipfile
import json

s3 = boto3.client('s3')

key_name = "data-source.zip"

# Build the zip archive in memory
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, 'w') as zf:
    zf.writestr('inputData.json', json.dumps(input_data))

# Upload the archive to the artifact bucket to kick off the training process
s3.put_object(Bucket=artifact_bucket, Key=key_name, Body=zip_buffer.getvalue())

### Ok, now open the AWS console in another tab and go to the CodePipeline console to check the status of the build pipeline
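
The same status can also be read from the notebook with the CodePipeline `get_pipeline_state` call in boto3. In this minimal sketch the pipeline name is an assumption, because this notebook does not give it: replace `pipeline_name` with the name shown in the console. `summarize_stages` and `fetch_stage_summary` are hypothetical helpers:

```python
def summarize_stages(state):
    """Map each stage name to its latest execution status.

    `state` is the dict returned by CodePipeline's get_pipeline_state.
    """
    summary = {}
    for stage in state.get('stageStates', []):
        latest = stage.get('latestExecution', {})
        # 'NotStarted' is a local label for stages with no execution yet,
        # not a status returned by the API
        summary[stage['stageName']] = latest.get('status', 'NotStarted')
    return summary

def fetch_stage_summary(pipeline_name):
    """Call CodePipeline and summarize the result; requires AWS credentials."""
    import boto3  # imported lazily so summarize_stages works without AWS access
    codepipeline = boto3.client('codepipeline')
    return summarize_stages(codepipeline.get_pipeline_state(name=pipeline_name))
```

Calling `fetch_stage_summary` with your pipeline's name returns a dictionary with one entry per stage, which can be printed repeatedly while the pipeline runs.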

> Finally, click here [NOTEBOOK](04_Check%20Progress%20and%20Test%20the%20endpoint.ipynb) to see the progress and test your endpoint