# Vertex Tabular Binary Classification with .CustomTrainingJob()

<center><img src="../images/03.png"/></center>

## Set Constants

In [13]:
PROJECT_ID = 'jchavezar-demo'
REGION = 'us-central1'
DATASET_URI = 'gs://vtx-datasets-public/ecommerce/datasets.csv'
MODEL_URI = 'gs://vtx-models/pytorch/ecommerce'
STAGING_URI = 'gs://vtx-staging/pytorch/ecommerce/'
TRAIN_IMAGE_URI = 'us-docker.pkg.dev/vertex-ai/training/pytorch-xla.1-11:latest'

## Create Folder Structure

```
source
└── trainer
    └── train.py
```

In [6]:
!rm -fr source
!mkdir -p source/trainer

## Intro

The training code is written in PyTorch and builds a neural network with these components:

- Two feature sets: categorical and numerical.
- Embedding layers for the categorical features, with their shapes detected automatically.
- Dropout to reduce overfitting during training.
- Batch normalization to standardize the inputs.
- 1 input layer, shape 114x32:
  - 114 is the total number of features (categorical and numerical) after the embedding.
  - 32 is the number of neurons.
- Activation function applied to the output of the input layer to introduce non-linearity.
- 1 output layer, shape 32x2.

The following diagram shows the neural network and the order of the steps used in the model-building class, ShelterOutcomeModel.

<center><img src="../images/04-pytorch-nn.png"/></center>
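The components listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual `ShelterOutcomeModel`: the class name `TabularNet`, the dropout rate, the choice of ReLU, and the column cardinalities are all assumptions, picked only so that the input width comes out to 114 and the output to 2 classes.

```python
import torch
import torch.nn as nn


class TabularNet(nn.Module):
    """Hypothetical stand-in for the network described above."""

    def __init__(self, embedding_sizes, n_numerical, hidden=32, n_classes=2, p_drop=0.3):
        super().__init__()
        # One embedding table per categorical column: (cardinality, embedding_dim).
        self.embeddings = nn.ModuleList(
            [nn.Embedding(cardinality, dim) for cardinality, dim in embedding_sizes]
        )
        n_embedded = sum(dim for _, dim in embedding_sizes)
        self.emb_dropout = nn.Dropout(p_drop)
        # Batch normalization standardizes the numerical columns.
        self.bn_numerical = nn.BatchNorm1d(n_numerical)
        # Input layer: (embedded + numerical features) x hidden neurons.
        self.input_layer = nn.Linear(n_embedded + n_numerical, hidden)
        self.bn_hidden = nn.BatchNorm1d(hidden)
        self.dropout = nn.Dropout(p_drop)
        # Output layer: hidden neurons x classes.
        self.output_layer = nn.Linear(hidden, n_classes)

    def forward(self, x_cat, x_num):
        # Look up each categorical column in its own embedding table.
        embedded = [emb(x_cat[:, i]) for i, emb in enumerate(self.embeddings)]
        x = self.emb_dropout(torch.cat(embedded, dim=1))
        x = torch.cat([x, self.bn_numerical(x_num)], dim=1)
        # ReLU (an assumed choice) introduces the non-linearity.
        x = torch.relu(self.input_layer(x))
        x = self.dropout(self.bn_hidden(x))
        return self.output_layer(x)


# Hypothetical column layout: embedding dims sum to 104, plus 10 numerical = 114.
embedding_sizes = [(100, 50), (60, 30), (40, 20), (8, 4)]
model = TabularNet(embedding_sizes, n_numerical=10)
model.eval()

# A fake batch of 4 rows, only to check the shapes.
x_cat = torch.stack([torch.randint(0, c, (4,)) for c, _ in embedding_sizes], dim=1)
x_num = torch.randn(4, 10)
logits = model(x_cat, x_num)
print(model.input_layer.in_features, tuple(logits.shape))
```

The two logits per row would typically be passed to a cross-entropy loss for binary classification.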

In [14]:
%%writefile source/trainer/train2.py

import os

# Vertex AI sets AIP_MODEL_DIR inside the training container.
print(os.environ['AIP_MODEL_DIR'])

Overwriting source/trainer/train2.py
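Because `AIP_MODEL_DIR` exists only inside the Vertex AI job, reading it with `os.environ[...]` raises `KeyError` when the script runs locally. A minimal sketch of a more forgiving lookup follows; the helper name `resolve_model_dir` and the `model_output` fallback directory are assumptions made for illustration.

```python
import os


def resolve_model_dir(default="model_output"):
    """Return the directory where the trained model should be written."""
    # AIP_MODEL_DIR is set by Vertex AI inside the job; `default` is a
    # hypothetical local fallback so the script also runs on a workstation.
    return os.environ.get("AIP_MODEL_DIR", default)


model_dir = resolve_model_dir()
print(f"Model artifacts will be written to: {model_dir}")
```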


## Training Job (CustomJob.from_local_script)

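An NVIDIA Tesla T4 GPU speeds up the training, which should then take around 2 minutes to finish. Note that the call below requests only an `n1-standard-4` machine and sets no accelerator, so as written it runs on CPU.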

In [15]:
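from google.cloud import aiplatform as aip

# Set the default project and region for the SDK calls that follow.
aip.init(project=PROJECT_ID, location=REGION)

# Package the local script and define a Vertex AI custom job that runs it.
customJob = aip.CustomJob.from_local_script(
    display_name='test',
    script_path='source/trainer/train2.py',
    container_uri=TRAIN_IMAGE_URI,
    replica_count=1,
    machine_type='n1-standard-4',
    staging_bucket=STAGING_URI,  # where the packaged script is uploaded
    base_output_dir=MODEL_URI,  # Vertex AI derives AIP_MODEL_DIR from this path
)

# Submit the job and block until it reaches a terminal state.
customJob.run()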

Training script copied to:
gs://vtx-staging/pytorch/ecommerce/aiplatform-2022-11-16-15:40:05.233-aiplatform_custom_trainer_script-0.1.tar.gz.
Creating CustomJob
CustomJob created. Resource name: projects/569083142710/locations/us-central1/customJobs/1217814406000279552
To use this CustomJob in another session:
custom_job = aiplatform.CustomJob.get('projects/569083142710/locations/us-central1/customJobs/1217814406000279552')
View Custom Job:
https://console.cloud.google.com/ai/platform/locations/us-central1/training/1217814406000279552?project=569083142710
CustomJob projects/569083142710/locations/us-central1/customJobs/1217814406000279552 current state:
JobState.JOB_STATE_PENDING
CustomJob projects/569083142710/locations/us-central1/customJobs/1217814406000279552 current state:
JobState.JOB_STATE_PENDING
CustomJob projects/569083142710/locations/us-central1/customJobs/1217814406000279552 current state:
JobState.JOB_STATE_PENDING
CustomJob projects/569083142710/locations/us-central1/cus
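To attach the T4 GPU mentioned earlier, `CustomJob.from_local_script` accepts `accelerator_type` and `accelerator_count`. The sketch below only builds the keyword arguments, so it can be inspected without submitting a job. The helper name `gpu_job_kwargs`, the display name, and the image placeholder are assumptions; the container image must itself support GPUs, which has not been checked for `TRAIN_IMAGE_URI`.

```python
def gpu_job_kwargs(container_uri, staging_bucket, base_output_dir):
    """Build keyword arguments for CustomJob.from_local_script with one T4 GPU."""
    return {
        "display_name": "test-gpu",  # hypothetical display name
        "script_path": "source/trainer/train2.py",
        "container_uri": container_uri,  # assumed to be a GPU-capable image
        "replica_count": 1,
        "machine_type": "n1-standard-4",
        "accelerator_type": "NVIDIA_TESLA_T4",
        "accelerator_count": 1,
        "staging_bucket": staging_bucket,
        "base_output_dir": base_output_dir,
    }


kwargs = gpu_job_kwargs(
    container_uri="<gpu-capable-training-image>",  # placeholder, not a real image
    staging_bucket="gs://vtx-staging/pytorch/ecommerce/",
    base_output_dir="gs://vtx-models/pytorch/ecommerce",
)
print(kwargs["accelerator_type"], kwargs["accelerator_count"])

# With the SDK installed and credentials configured, the job would be created with:
# job = aip.CustomJob.from_local_script(**kwargs)
```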

In [22]:
x = 'gs://vtx-models/pytorch/ecommerce/model/'
'/'.join(x.split('/')[3:])

'pytorch/ecommerce/model/'

In [26]:
x.split('/')[2]

'vtx-models'
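The two cells above split a `gs://` URI into its bucket and object prefix by index. The same split can be wrapped in a small helper using the standard library; `split_gcs_uri` is a hypothetical name, and this is only one way to do it.

```python
from urllib.parse import urlparse


def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/object' into (bucket, object_prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"Not a gs:// URI: {uri!r}")
    # netloc holds the bucket; the path keeps a leading slash that we drop.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, prefix = split_gcs_uri("gs://vtx-models/pytorch/ecommerce/model/")
print(bucket)  # vtx-models
print(prefix)  # pytorch/ecommerce/model/
```

Unlike the index-based split, the helper rejects strings that are not `gs://` URIs instead of silently returning the wrong pieces.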