# Vertex AI Image Classifier — Starter Notebook

This starter notebook is a skeleton for training a PyTorch model on tabular data
and, optionally, deploying it to Vertex AI. Replace the placeholders with your
own data and project-specific code.

## How to use

1. Install dependencies (see `requirements.txt`).
2. Fill in `PROJECT_ID`, `GCS_BUCKET`, and `LOCAL_DATA_PATH` in the Configuration cell.
3. Run cells sequentially.


In [None]:
# Install dependencies (run once in a fresh environment)
%pip install -r requirements.txt


## Authentication

If you run this locally and plan to use Vertex AI, authenticate with Google Cloud:

```bash
gcloud auth login
gcloud config set project <YOUR_PROJECT_ID>
gcloud auth application-default login
```


In [None]:
# Imports
import os
import pandas as pd
import numpy as np
import torch
from sklearn.model_selection import train_test_split

print('imports ok')


## Configuration: update these variables
Fill in your Google Cloud project ID, GCS bucket, and dataset path here.


In [None]:
PROJECT_ID = '<YOUR_PROJECT_ID>'
GCS_BUCKET = 'gs://<YOUR_BUCKET>'
LOCAL_DATA_PATH = 'data/your_dataset.csv'  # update

print(PROJECT_ID, GCS_BUCKET, LOCAL_DATA_PATH)
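
A quick check can catch settings that still hold a `<...>` placeholder before any cloud call is made. This is a minimal sketch; the helper name `unreplaced_placeholders` is hypothetical and not part of the notebook.

```python
def unreplaced_placeholders(settings):
    """Return the names of settings whose value still contains a '<...>' placeholder."""
    return [name for name, value in settings.items() if '<' in value and '>' in value]


# Values copied from the configuration cell; swap in your own.
settings = {
    'PROJECT_ID': '<YOUR_PROJECT_ID>',
    'GCS_BUCKET': 'gs://<YOUR_BUCKET>',
}

missing = unreplaced_placeholders(settings)
if missing:
    print('Still to fill in:', ', '.join(missing))
```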


## Load data (placeholder)
Replace this with code to load your CSV/Parquet.


In [None]:
# Example placeholder for loading a CSV
# df = pd.read_csv(LOCAL_DATA_PATH)
# print(df.head())

# Small built-in dataset (Iris) for a quick test
from sklearn.datasets import load_iris
iris = load_iris(as_frame=True)
df = iris.frame
print(df.shape)
df.head()
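
If your dataset may be either CSV or Parquet, one option is to pick the reader from the file extension. The sketch below assumes a hypothetical helper `load_table` and demonstrates it on a tiny temporary CSV; reading Parquet additionally needs an engine such as `pyarrow` installed.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_table(path):
    """Load a CSV or Parquet file into a DataFrame, chosen by file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == '.csv':
        return pd.read_csv(path)
    if suffix in ('.parquet', '.pq'):
        # Requires a Parquet engine such as pyarrow.
        return pd.read_parquet(path)
    raise ValueError(f'Unsupported file type: {suffix!r}')


# Demo on a throwaway CSV so the helper can be tried without real data.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / 'sample.csv'
    pd.DataFrame({'feature': [1.0, 2.0], 'target': [0, 1]}).to_csv(sample_path, index=False)
    sample = load_table(sample_path)

print(sample.shape)
```

With real data, the call would be `df = load_table(LOCAL_DATA_PATH)`.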


## Preprocess and split
Add your feature engineering and train/test split here.


In [None]:
# Example train/test split
X = df.drop(columns=['target'])
y = df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape)
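
As one possible preprocessing step, the sketch below stratifies the split by class and standardizes the features. It reloads Iris so that it runs on its own; the choice of `StandardScaler` is an assumption, not a requirement of this notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = load_iris(as_frame=True)
X = iris.frame.drop(columns=['target'])
y = iris.frame['target']

# stratify=y keeps the class proportions the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit the scaler on the training split only, then reuse it on the test split,
# so no statistics from the test data leak into training.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```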


## Model training (PyTorch placeholder)
Replace this with your model architecture and training loop.


In [None]:
import torch.nn as nn

class SimpleNet(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64),
            nn.ReLU(),
            nn.Linear(64, out_dim)
        )
    def forward(self, x):
        return self.net(x)

# Instantiate the model only; convert the pandas data to tensors before training
in_dim = X_train.shape[1]
out_dim = len(set(y_train))
model = SimpleNet(in_dim, out_dim)
print(model)
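
The cell above only builds the model. A minimal full-batch training loop might look like the sketch below. It uses stand-in tensors with the same shape as the Iris training split so that it runs on its own; the optimizer, learning rate, and epoch count are assumptions to tune for your data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in data: 120 rows, 4 features, 3 classes.
# With real data, use torch.tensor(X_train.values, dtype=torch.float32) and
# torch.tensor(y_train.values, dtype=torch.long) instead.
features = torch.randn(120, 4)
labels = features[:, :3].argmax(dim=1)  # a learnable synthetic target

model = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 3))
loss_fn = nn.CrossEntropyLoss()  # expects raw logits and integer class labels
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for epoch in range(20):
    optimizer.zero_grad()                    # clear gradients from the previous step
    loss = loss_fn(model(features), labels)  # forward pass on the whole batch
    loss.backward()                          # backpropagate
    optimizer.step()                         # update the weights
    losses.append(loss.item())

print(f'first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}')
```

For larger datasets, wrapping the tensors in a `torch.utils.data.DataLoader` and iterating over mini-batches is the usual next step.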


## Save model artifact (placeholder)
Save your trained model and upload to GCS if you plan to deploy via Vertex AI.


In [None]:
# Example: save PyTorch model locally
# torch.save(model.state_dict(), 'model.pt')
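
The sketch below saves a `state_dict`, reloads it into a fresh model to confirm the artifact is usable, and defines an upload helper based on the `google-cloud-storage` client. The helper names and the object path in the commented call are hypothetical, and the upload itself is left commented out because it needs credentials and a real bucket.

```python
import os
import tempfile

import torch
import torch.nn as nn


def split_gcs_uri(gcs_uri):
    """Split 'gs://bucket/path/to/file' into ('bucket', 'path/to/file')."""
    if not gcs_uri.startswith('gs://'):
        raise ValueError(f'Not a GCS URI: {gcs_uri!r}')
    bucket_name, _, blob_name = gcs_uri[len('gs://'):].partition('/')
    return bucket_name, blob_name


def upload_to_gcs(local_path, gcs_uri):
    """Upload a local file to GCS (needs google-cloud-storage and credentials)."""
    from google.cloud import storage  # imported lazily so the rest runs offline

    bucket_name, blob_name = split_gcs_uri(gcs_uri)
    storage.Client().bucket(bucket_name).blob(blob_name).upload_from_filename(local_path)


model = nn.Linear(4, 3)  # stand-in for the trained SimpleNet
with tempfile.TemporaryDirectory() as tmp:
    local_path = os.path.join(tmp, 'model.pt')
    torch.save(model.state_dict(), local_path)

    # Reload into a fresh instance with the same architecture.
    restored = nn.Linear(4, 3)
    restored.load_state_dict(torch.load(local_path))

    # upload_to_gcs(local_path, 'gs://<YOUR_BUCKET>/models/model.pt')

weights_match = torch.equal(model.weight, restored.weight)
print(weights_match)
```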


## Vertex AI: submit training job (high-level example)
This is a high-level example using `google-cloud-aiplatform`. Replace with your own training container or script.


In [None]:
from google.cloud import aiplatform

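# Initialize the client (make sure PROJECT_ID and GCS_BUCKET are set above).
# staging_bucket is where the SDK stages the packaged script for CustomJob.from_local_script.
aiplatform.init(project=PROJECT_ID, location='us-central1', staging_bucket=GCS_BUCKET)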

# Example: create a custom job (this is a high-level template)
# job = aiplatform.CustomJob.from_local_script(
#     display_name='pytorch-train',
#     script_path='train.py',
#     container_uri='gcr.io/your-project/your-train-image',
#     args=['--epochs', '10'],
#     replica_count=1,
# )
# job.run(sync=True)

print('Vertex AI client initialized (template)')
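
The template passes `--epochs` to `train.py`, so that script needs a matching argument parser. Below is a hedged sketch of such an entry point. The `--model-dir` flag and the `model_output` fallback are assumptions; `AIP_MODEL_DIR` is the environment variable Vertex AI custom training can set to point at an output location.

```python
import argparse
import os


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description='Hypothetical train.py entry point')
    parser.add_argument('--epochs', type=int, default=10)
    # Prefer the location provided by Vertex AI; otherwise fall back to a local folder.
    parser.add_argument('--model-dir', default=os.environ.get('AIP_MODEL_DIR', 'model_output'))
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    print(f'Training for {args.epochs} epochs; saving to {args.model_dir}')
    # Placeholder: build the model, run the training loop, save the artifact.
    return args


# Explicit arguments keep this runnable inside a notebook. In train.py itself,
# call main() with no arguments so argparse reads the command line.
args = main(['--epochs', '10'])
```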


## Vertex AI: deploy model (high-level)
After training, register the saved artifact as a Vertex AI model resource, then deploy that model to an endpoint. See the Google Cloud documentation for the exact commands.
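
As a rough sketch of that flow with `google-cloud-aiplatform`, the function below uploads a model resource and deploys it. The function name, display name, and machine type are assumptions. The serving image must be able to load your artifact format, and a bare `state_dict` typically needs a custom handler or container. Nothing is called here, because deployment requires credentials and incurs cost.

```python
def deploy_model(project_id, artifact_uri, serving_image_uri, location='us-central1'):
    """Register a model artifact and deploy it to a new endpoint (incurs cost)."""
    from google.cloud import aiplatform  # lazy import: only needed when deploying

    aiplatform.init(project=project_id, location=location)
    model = aiplatform.Model.upload(
        display_name='pytorch-model',  # hypothetical name
        artifact_uri=artifact_uri,     # GCS folder holding the saved artifact
        serving_container_image_uri=serving_image_uri,
    )
    # Creates an endpoint if none is given; the machine type is an assumption.
    endpoint = model.deploy(machine_type='n1-standard-4')
    return endpoint


# Example call (not executed here):
# endpoint = deploy_model(PROJECT_ID, f'{GCS_BUCKET}/models/', '<YOUR_SERVING_IMAGE_URI>')
# endpoint.predict(instances=[[5.1, 3.5, 1.4, 0.2]])
```

Remember to undeploy the model and delete the endpoint when you are done, since a deployed endpoint is billed while it is running.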


## Notes & Next steps
- Replace placeholders with real scripts and data.
- Use `requirements.txt` to install dependencies.
- For reproducible pipelines, consider packaging training code into a container and using Vertex AI Pipelines.
