# Python GCP Setup For Notebook Kernels - Template

## 📋 About This Template

This notebook serves as a **template** for creating new workflow notebooks in this repository. It demonstrates the standardized environment setup pattern that all notebooks should use.

### What This Template Accomplishes

Each notebook needs initial setup that includes:
- **Authentication**: Ensuring the local session has access to GCP via Application Default Credentials (ADC)
- **API Enablement**: Activating the required Google Cloud APIs for the services you'll use
- **Package Management**: Installing the correct set of Python packages with environment-specific handling

### Key Benefits of Centralized Setup

This template uses **centralized setup code** ([`python_setup.py`](./python_setup.py)) hosted in this repository, which provides:
- ✅ **Single point of maintenance**: Updates to setup logic happen in one file, automatically benefiting all notebooks
- ✅ **Automatic environment detection**: Intelligently handles Colab vs local environments
- ✅ **Flexible package installation**: Choose between quick setup (PRIMARY), exact reproduction (ALL), or Colab-optimized (COLAB)
- ✅ **Automatic kernel restart handling**: Manages Colab kernel restarts when packages are installed

### How to Use This Template

To create a new workflow notebook:
1. **Copy this notebook** as your starting point
2. **Update this header cell** to describe your specific workflow/use case
3. **Modify the "Configuration" cell** (`REQUIRED_APIS` list, `REQUIREMENTS_URLS`) to match your workflow
4. **Set your `PROJECT_ID`** in the "Set Your Project ID" cell
5. **Start building your workflow** after the setup cells

The setup code will automatically:
- Detect whether you're in Colab or a local environment
- Authenticate and configure your GCP project
- Enable required APIs
- Install necessary packages
- Handle kernel restarts if needed (in Colab)
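
The first step in that list, environment detection, can be sketched as follows. This is a minimal illustration that assumes Colab is identified by the presence of the `google.colab` package; the helper name `in_colab` is hypothetical, and the actual logic in `python_setup.py` may differ.

```python
import importlib.util
import sys


def in_colab() -> bool:
    """Return True when the google.colab package is available (likely a Colab runtime)."""
    if "google.colab" in sys.modules:
        return True
    try:
        return importlib.util.find_spec("google.colab") is not None
    except (ImportError, ValueError):
        # Raised when the parent `google` package is missing or has no usable spec.
        return False


environment = "colab" if in_colab() else "local"
print(f"Detected environment: {environment}")
```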

---
## Environment Setup

This section will authenticate your session, enable required Google Cloud APIs, and install necessary Python packages.

**Package Installation Options (`REQ_TYPE`):**
- `PRIMARY`: Installs only the main packages. Faster, but pip resolves sub-dependencies on its own, so the installed versions may differ from those used in development.
- `ALL` (Default): Installs exact versions of all packages and dependencies. Best for perfectly reproducing the development environment.
- `COLAB`: Installs a Colab-optimized list that excludes pre-installed packages like `ipython` and `ipykernel`.

> **Note:** If running in Google Colab, the script will automatically detect this and set `REQ_TYPE = 'COLAB'` to prevent package conflicts, overriding any manual setting.
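
The override described in the note can be pictured with a small sketch. The function `resolve_req_type` and its parameters are hypothetical names chosen for illustration, not the actual API of `python_setup.py`.

```python
VALID_REQ_TYPES = ("PRIMARY", "ALL", "COLAB")


def resolve_req_type(req_type: str, running_in_colab: bool) -> str:
    """Return the effective REQ_TYPE, forcing COLAB when running in Colab."""
    if running_in_colab:
        # Colab ships its own ipython/ipykernel, so the Colab list always wins.
        return "COLAB"
    normalized = req_type.strip().upper()
    if normalized not in VALID_REQ_TYPES:
        raise ValueError(f"REQ_TYPE must be one of {VALID_REQ_TYPES}, got {req_type!r}")
    return normalized


print(resolve_req_type("ALL", running_in_colab=False))     # ALL
print(resolve_req_type("PRIMARY", running_in_colab=True))  # COLAB
```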

### Set Your Project ID

⚠️ **Action Required:** Replace the `PROJECT_ID` value below with your Google Cloud project ID before running this cell.

In [1]:
PROJECT_ID = 'statmike-mlops-349915' # replace with GCP project ID
REQ_TYPE = 'ALL' # Specify PRIMARY or ALL or COLAB
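
Before running setup, it can help to catch a malformed project ID early. The sketch below checks the standard format for Google Cloud project IDs (6 to 30 characters; lowercase letters, digits, and hyphens; starts with a letter; does not end with a hyphen). The helper `is_valid_project_id` is a hypothetical name, and the check does not cover legacy domain-scoped IDs or confirm that the project exists.

```python
import re

# Standard project ID format: 6-30 chars, starts with a letter, no trailing hyphen.
PROJECT_ID_PATTERN = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def is_valid_project_id(project_id: str) -> bool:
    """Return True when project_id matches the standard GCP project ID format."""
    return PROJECT_ID_PATTERN.fullmatch(project_id) is not None


print(is_valid_project_id("statmike-mlops-349915"))  # True
print(is_valid_project_id("My_Project"))             # False
```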

### Configuration

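This cell defines the requirements files and Google Cloud APIs needed for this notebook. Run it as-is for this template; in a notebook copied from the template, adjust `REQUIRED_APIS` (and `REQUIREMENTS_URLS`, if needed) for your workflow.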

In [2]:
REQUIREMENTS_URLS = dict(
    PRIMARY = 'https://raw.githubusercontent.com/statmike/vertex-ai-mlops/refs/heads/main/core/requirements-brief.txt',
    ALL = 'https://raw.githubusercontent.com/statmike/vertex-ai-mlops/refs/heads/main/core/requirements.txt',
    COLAB = 'https://raw.githubusercontent.com/statmike/vertex-ai-mlops/refs/heads/main/core/requirements-colab.txt'
)

REQUIRED_APIS = [
    "bigquery.googleapis.com",
    "storage.googleapis.com",
]
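
One way to check and enable the APIs in `REQUIRED_APIS` is through the `gcloud services` commands, sketched below. The function names are hypothetical and this is not necessarily how `python_setup.py` does it; `enable_missing_apis` needs an installed, authenticated `gcloud` CLI, so only the pure helper is exercised in the example.

```python
import subprocess


def missing_apis(required: list[str], enabled_output: str) -> list[str]:
    """Return the required APIs absent from `gcloud services list` output (one name per line)."""
    enabled = {line.strip() for line in enabled_output.splitlines() if line.strip()}
    return [api for api in required if api not in enabled]


def enable_missing_apis(project_id: str, required: list[str]) -> list[str]:
    """Enable any required APIs that are not yet enabled; return the ones that were enabled."""
    listing = subprocess.run(
        ["gcloud", "services", "list", "--enabled",
         f"--project={project_id}", "--format=value(config.name)"],
        capture_output=True, text=True, check=True,
    ).stdout
    to_enable = missing_apis(required, listing)
    if to_enable:
        subprocess.run(
            ["gcloud", "services", "enable", *to_enable, f"--project={project_id}"],
            check=True,
        )
    return to_enable


# Example with sample CLI output: only Cloud Storage still needs enabling.
sample_output = "bigquery.googleapis.com\ncompute.googleapis.com\n"
print(missing_apis(["bigquery.googleapis.com", "storage.googleapis.com"], sample_output))
```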

### Run Setup

This cell downloads the centralized setup code and configures your environment. It will:
- Authenticate your session with Google Cloud
- Enable required APIs for this notebook
- Install necessary Python packages
- Display a setup summary with your project information

> **Note:** In Colab, if packages are installed, the kernel will automatically restart. After restart, continue from the next cell without re-running earlier cells.

In [7]:
import os, urllib.request

# Download and import setup code
url = 'https://raw.githubusercontent.com/statmike/vertex-ai-mlops/refs/heads/main/core/python_setup.py'
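urllib.request.urlretrieve(url, 'python_setup_local.py')
import python_setup_local as python_setup  # module name must match the downloaded file
os.remove('python_setup_local.py')  # the module stays loaded after the file is removed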

# Run setup
setup_info = python_setup.setup_environment(PROJECT_ID, REQ_TYPE, REQUIREMENTS_URLS, REQUIRED_APIS)


PYTHON GCP ENVIRONMENT SETUP

AUTHENTICATION
Checking for existing ADC...
✅ Existing ADC found.
✅ Project is correctly set to 'statmike-mlops-349915'.

API CHECK & ENABLE
✅ bigquery.googleapis.com is already enabled.
✅ storage.googleapis.com is already enabled.

PACKAGE MANAGEMENT
Checking and installing dependencies from: https://raw.githubusercontent.com/statmike/vertex-ai-mlops/refs/heads/main/core/requirements.txt
✅ All packages are already installed and up to date.

Google Cloud Project Information
PROJECT_ID     = statmike-mlops-349915
PROJECT_NUMBER = 1026793852137


SETUP SUMMARY
✅ Authentication:    Success
✅ API Configuration: Success
✅ Package Install:   Already up to date
✅ Project ID:        statmike-mlops-349915
✅ Project Number:    1026793852137
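
The download-then-import pattern used in the setup cell can also be written with `importlib`, which loads a module from an explicit file path instead of relying on the working directory being importable. This is an alternative sketch, not what the notebook does; `load_module_from_path` and the `demo_setup` module are hypothetical, and a temporary file stands in for the downloaded script.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(module_name: str, path: str) -> ModuleType:
    """Load and execute a Python source file as a module object."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load {module_name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# A temporary file stands in for the downloaded setup script.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "demo_setup.py"
    source.write_text(
        "def setup_environment(project_id):\n"
        "    return {'project_id': project_id}\n"
    )
    demo = load_module_from_path("demo_setup", str(source))

# The module remains usable after the file has been cleaned up.
print(demo.setup_environment("my-project-123"))
```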



---
## Python Setup

### Imports

In [5]:
import subprocess

### Variables - User Set

### Variables - Auto Set

In [6]:
PROJECT_ID = subprocess.run(['gcloud', 'config', 'get-value', 'project'], capture_output=True, text=True, check=True).stdout.strip()
PROJECT_NUMBER = subprocess.run(['gcloud', 'projects', 'describe', PROJECT_ID, '--format=value(projectNumber)'], capture_output=True, text=True, check=True).stdout.strip()

print(f"\n{'='*50}\nGoogle Cloud Project Information\n{'='*50}\nPROJECT_ID     = {PROJECT_ID}\nPROJECT_NUMBER = {PROJECT_NUMBER}\n{'='*50}\n")


Google Cloud Project Information
PROJECT_ID     = statmike-mlops-349915
PROJECT_NUMBER = 1026793852137



### Configurations

### Client Setup

---
## Continue ...