From d2c92fa0a9fb1efd1f4c4e071c5d8ffdc5b478bb Mon Sep 17 00:00:00 2001 From: remzy-coder Date: Mon, 13 Oct 2025 18:28:30 -0400 Subject: [PATCH] Harden Cloud Shell deploy script with verification --- Makefile | 8 + README.md | 62 +++++- deploy/cloudshell_deploy.sh | 429 ++++++++++++++++++++++++++++++++++++ docs/architecture.md | 126 +++++++++++ 4 files changed, 624 insertions(+), 1 deletion(-) create mode 100644 Makefile create mode 100755 deploy/cloudshell_deploy.sh create mode 100644 docs/architecture.md diff --git a/Makefile b/Makefile new file mode 100644 index 0000000..a8c4428 --- /dev/null +++ b/Makefile @@ -0,0 +1,8 @@ +SHELL := /bin/bash + +.PHONY: test lint + +test: lint + +lint: + shellcheck deploy/cloudshell_deploy.sh diff --git a/README.md b/README.md index 2e20527..0af34af 100644 --- a/README.md +++ b/README.md @@ -1 +1,61 @@ -# XMIND# XMIND +# XMIND Quant Sports AI Platform + +This repository packages architecture documentation and a Google Cloud Shell bootstrap script for deploying the XMIND advanced reasoning platform tailored for quant-level sports betting analysis. + +## Contents +- `docs/architecture.md` – Detailed system design covering services, agents, and workflows. +- `deploy/cloudshell_deploy.sh` – One-command bootstrap script to initialize Google Cloud resources. + +## Quick Start +1. Open [Google Cloud Shell](https://shell.cloud.google.com) in the project you wish to use. +2. Clone this repository or copy the `deploy/cloudshell_deploy.sh` script into Cloud Shell. +3. Confirm you have an active gcloud account (`gcloud auth login`) and set required environment variables: + ```bash + export PROJECT_ID="your-project-id" + export REGION="us-central1" + export USER_EMAIL="you@example.com" + export ZONE="us-central1-a" # Optional: overrides default REGION-derived zone + export APP_ENGINE_LOCATION="us-central" # Optional: override for App Engine app required by Cloud Scheduler + ``` + Optional overrides include: + - `SCHEDULER_LOCATION` to target a specific Cloud Scheduler region (defaults to `REGION`). + - `CLOUD_TASKS_QUEUE` to rename the orchestration queue (defaults to `orchestration-dispatch`). + - `BQ_DATASET`, `BQ_LOCATION`, `GCS_BUCKET`, and `ENVIRONMENT_LABEL` for data residency and labeling preferences. +4. Run the deploy script (it will validate credentials, set Cloud defaults, and double-check created resources): + ```bash + bash deploy/cloudshell_deploy.sh + ``` +5. Replace Secret Manager placeholders with your actual meta-engineered prompt and external API keys. +6. Review the created Pub/Sub subscriptions, Cloud Tasks queue, and Scheduler jobs to align them with your ingestion cadence. +7. Confirm the verification summary at the end of the script for Pub/Sub topics, Cloud Run services, Cloud Tasks queue, BigQuery dataset, and Cloud Storage bucket. +8. Build and deploy service code (orchestrator, verifier, UI, tooling) using Cloud Build and Cloud Run as described in the architecture doc. + +## Post-Deployment Checklist +- Configure Identity-Aware Proxy (IAP) for the Cloud Run UI. +- Wire API Gateway or load balancer routing for unified entry points. +- Connect Vertex AI Extensions/tools for sportsbook APIs, simulations, and compliance automation. +- Enable Cloud Monitoring dashboards, alerting policies, and QA/governance audit exports. +- Populate new BigQuery tables (`qa_audits`, `portfolio_positions`) and the `recommendation_summary` view with your historical data. 
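+
+For example, the secret placeholders and new tables can be populated from Cloud Shell (a minimal sketch; the local file names are placeholders):
+```bash
+# Replace the PLACEHOLDER secret version with your real prompt text.
+gcloud secrets versions add xmind-base-prompt --data-file=prompt.txt
+# Backfill a new table from a local CSV export (columns must match the table schema).
+bq load --source_format=CSV --skip_leading_rows=1 \
+  "${PROJECT_ID}:${BQ_DATASET:-xmind_ai}.portfolio_positions" ./portfolio_positions.csv
+```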
+
+## Provisioned Resources at a Glance
+- **Networking:** Custom VPC and regional subnet.
+- **Storage:** Regional GCS bucket plus placeholder folders for prompts, model artifacts, and simulation outputs.
+- **Messaging:** Pub/Sub topics and opinionated subscriptions for orchestrator, verifier, UI, and governance agents.
+- **Data Warehouse:** BigQuery dataset seeded with live odds, bet recommendations, QA audits, portfolio positions, and metrics tables.
+- **Orchestration:** Cloud Tasks queue and Cloud Scheduler jobs for odds refresh, QA sweeping, and governance alerts.
+- **Security & Secrets:** Service accounts with scoped IAM roles (including Cloud Tasks enqueue, Pub/Sub publish/subscribe, BigQuery viewer access, and Run invoker bindings) and Secret Manager placeholders (including a governance policy slot).
+- **Compute Skeletons:** Cloud Run services deployed with baseline environment variables and a Vertex AI Workbench instance placeholder.
+
+## Notes
+- The bootstrap script performs pre-flight checks for `gcloud`, `bq`, and `gsutil` before provisioning resources.
+- App Engine is created automatically (configurable via `APP_ENGINE_LOCATION`) so Cloud Scheduler jobs succeed on first run.
+- Adjust IAM roles and resource sizing to match your risk tolerance and compliance requirements.
+- Update the created Scheduler jobs to call your deployed service revisions once application images are available.
+
+## Testing
+Run ShellCheck to lint the bootstrap script before deploying:
+```bash
+make test
+```
+Ensure [`shellcheck`](https://www.shellcheck.net/) is installed in your environment or use Cloud Shell's package manager to install it (e.g., `sudo apt-get install -y shellcheck`).
diff --git a/deploy/cloudshell_deploy.sh b/deploy/cloudshell_deploy.sh
new file mode 100755
index 0000000..0fc40f5
--- /dev/null
+++ b/deploy/cloudshell_deploy.sh
@@ -0,0 +1,429 @@
+#!/bin/bash
+set -euo pipefail
+
+# XMIND Cloud Shell bootstrap script for Google Cloud deployment
+# Usage:
+#   export PROJECT_ID="your-gcp-project"
+#   export REGION="us-central1"          # or a preferred region supporting Vertex AI
+#   export USER_EMAIL="you@example.com"  # IAM principal for IAP/roles
+#   bash cloudshell_deploy.sh
+
+if [[ -z "${PROJECT_ID:-}" || -z "${REGION:-}" || -z "${USER_EMAIL:-}" ]]; then
+  echo "Error: PROJECT_ID, REGION, and USER_EMAIL environment variables must be set before running." >&2
+  exit 1
+fi
+
+for cmd in gcloud bq gsutil; do
+  if ! command -v "$cmd" >/dev/null 2>&1; then
+    echo "Error: required command '$cmd' not found in PATH." >&2
+    exit 1
+  fi
+done
+
+ACTIVE_ACCOUNT=$(gcloud auth list --filter=status:ACTIVE --format="value(account)" 2>/dev/null || true)
+if [[ -z "$ACTIVE_ACCOUNT" ]]; then
+  echo "Error: no active gcloud account found. Run 'gcloud auth login' before executing this script." >&2
+  exit 1
+fi
+
+trap 'echo "Error: deployment failed; check the Cloud Shell output for details." >&2' ERR
+
+ZONE=${ZONE:-"${REGION}-a"}
+ENVIRONMENT_LABEL=${ENVIRONMENT_LABEL:-"prod"}
+BQ_DATASET=${BQ_DATASET:-"xmind_ai"}
+GCS_BUCKET=${GCS_BUCKET:-"${PROJECT_ID}-xmind-artifacts"}
+ARTIFACT_REPO=${ARTIFACT_REPO:-"xmind-services"}
+NETWORK=${NETWORK:-"xmind-vpc"}
+SUBNET=${SUBNET:-"${NETWORK}-subnet"}
+REGISTRY_LOCATION=${REGISTRY_LOCATION:-"${REGION}"}
+SCHEDULER_LOCATION=${SCHEDULER_LOCATION:-"${REGION}"}
+CLOUD_TASKS_QUEUE=${CLOUD_TASKS_QUEUE:-"orchestration-dispatch"}
+APP_ENGINE_LOCATION=${APP_ENGINE_LOCATION:-"us-central"}
+
+echo "==> Setting project ${PROJECT_ID}"
+gcloud config set project "$PROJECT_ID" --quiet
+gcloud projects describe "$PROJECT_ID" --format="value(projectId)" >/dev/null
+gcloud config set compute/region "$REGION" --quiet
+gcloud config set compute/zone "$ZONE" --quiet
+
+echo "==> Enabling required services"
+gcloud services enable \
+  compute.googleapis.com \
+  artifactregistry.googleapis.com \
+  run.googleapis.com \
+  cloudbuild.googleapis.com \
+  secretmanager.googleapis.com \
+  aiplatform.googleapis.com \
+  iam.googleapis.com \
+  bigquery.googleapis.com \
+  bigquerystorage.googleapis.com \
+  pubsub.googleapis.com \
+  dataflow.googleapis.com \
+  monitoring.googleapis.com \
+  logging.googleapis.com \
+  cloudresourcemanager.googleapis.com \
+  cloudscheduler.googleapis.com \
+  cloudtasks.googleapis.com \
+  notebooks.googleapis.com \
+  servicenetworking.googleapis.com \
+  workflows.googleapis.com \
+  eventarc.googleapis.com \
+  securetoken.googleapis.com \
+  appengine.googleapis.com
+
+LABELS="env=${ENVIRONMENT_LABEL},app=xmind"
+
+if ! gcloud app describe >/dev/null 2>&1; then
+  echo "==> Creating App Engine application in ${APP_ENGINE_LOCATION} for Cloud Scheduler support"
+  gcloud app create --region="$APP_ENGINE_LOCATION"
+fi
+
+if ! gcloud compute networks describe "$NETWORK" --format=json >/dev/null 2>&1; then
+  echo "==> Creating VPC ${NETWORK}"
+  gcloud compute networks create "$NETWORK" --subnet-mode=custom --bgp-routing-mode=regional
+fi
+
+if ! gcloud compute networks subnets describe "$SUBNET" --region "$REGION" >/dev/null 2>&1; then
+  echo "==> Creating subnet ${SUBNET} in ${REGION}"
+  gcloud compute networks subnets create "$SUBNET" \
+    --network="$NETWORK" \
+    --region="$REGION" \
+    --range=10.10.0.0/20
+fi
+
+echo "==> Creating Cloud Storage bucket ${GCS_BUCKET}"
+if ! gsutil ls -b "gs://${GCS_BUCKET}" >/dev/null 2>&1; then
+  gsutil mb -p "$PROJECT_ID" -l "$REGION" "gs://${GCS_BUCKET}"
+fi
+
+echo "==> Creating baseline Cloud Storage folder placeholders"
+# GCS has no real directories; create zero-byte .keep objects so the
+# prefixes appear as folders in the console (gsutil has no mkdir command).
+for prefix in model-artifacts simulation-outputs prompts; do
+  gsutil -q cp /dev/null "gs://${GCS_BUCKET}/${prefix}/.keep" || true
+done
+
+echo "==> Creating Artifact Registry repo ${ARTIFACT_REPO}"
+if ! gcloud artifacts repositories describe "$ARTIFACT_REPO" --location="$REGISTRY_LOCATION" >/dev/null 2>&1; then
+  gcloud artifacts repositories create "$ARTIFACT_REPO" \
+    --repository-format=DOCKER \
+    --location="$REGISTRY_LOCATION" \
+    --description="Container images for XMIND services" \
+    --labels="$LABELS"
+fi
+
+echo "==> Creating Pub/Sub topics"
+for topic in odds-updates bet-recommendations qa-results orchestration-events governance-alerts; do
+  if ! gcloud pubsub topics describe "$topic" >/dev/null 2>&1; then
+    gcloud pubsub topics create "$topic" --labels="$LABELS"
+  fi
+done
+
+echo "==> Creating Pub/Sub subscriptions"
+declare -A PUBSUB_SUBSCRIPTIONS=(
+  [odds-updates]=odds-updates-orchestrator
+  [bet-recommendations]=bet-review-verifier
+  [qa-results]=qa-results-ui
+)
+
+for topic in "${!PUBSUB_SUBSCRIPTIONS[@]}"; do
+  subscription="${PUBSUB_SUBSCRIPTIONS[$topic]}"
+  if ! gcloud pubsub subscriptions describe "$subscription" >/dev/null 2>&1; then
+    gcloud pubsub subscriptions create "$subscription" \
+      --topic="$topic" \
+      --retain-acked-messages \
+      --message-retention-duration="604800s"
+  fi
+done
+
+BQ_LOCATION=${BQ_LOCATION:-"US"}
+echo "==> Creating BigQuery dataset ${BQ_DATASET}"
+if ! bq --location="$BQ_LOCATION" show --format=none "$PROJECT_ID:$BQ_DATASET" >/dev/null 2>&1; then
+  bq --location="$BQ_LOCATION" mk --dataset "$PROJECT_ID:$BQ_DATASET"
+fi
+
+echo "==> Creating BigQuery tables"
+# NOTE: The SQL body of this step was lost in transit; the statements below are
+# a minimal reconstruction covering the tables and view referenced elsewhere in
+# this repo. Adjust the column definitions to your actual schemas.
+bq query --location="$BQ_LOCATION" --use_legacy_sql=false <<SQL
+CREATE TABLE IF NOT EXISTS \`${PROJECT_ID}.${BQ_DATASET}.live_odds\` (
+  event_id STRING,
+  sportsbook STRING,
+  market STRING,
+  price FLOAT64,
+  captured_at TIMESTAMP
+);
+CREATE TABLE IF NOT EXISTS \`${PROJECT_ID}.${BQ_DATASET}.bet_recommendations\` (
+  recommendation_id STRING,
+  event_id STRING,
+  market STRING,
+  stake NUMERIC,
+  expected_value FLOAT64,
+  model_version STRING,
+  created_at TIMESTAMP
+);
+CREATE TABLE IF NOT EXISTS \`${PROJECT_ID}.${BQ_DATASET}.qa_audits\` (
+  recommendation_id STRING,
+  verdict STRING,
+  checks JSON,
+  audited_at TIMESTAMP
+);
+CREATE TABLE IF NOT EXISTS \`${PROJECT_ID}.${BQ_DATASET}.portfolio_positions\` (
+  position_id STRING,
+  event_id STRING,
+  stake NUMERIC,
+  status STRING,
+  updated_at TIMESTAMP
+);
+CREATE TABLE IF NOT EXISTS \`${PROJECT_ID}.${BQ_DATASET}.evaluation_metrics\` (
+  metric_name STRING,
+  metric_value FLOAT64,
+  recorded_at TIMESTAMP
+);
+CREATE OR REPLACE VIEW \`${PROJECT_ID}.${BQ_DATASET}.recommendation_summary\` AS
+SELECT r.*, q.verdict, q.audited_at
+FROM \`${PROJECT_ID}.${BQ_DATASET}.bet_recommendations\` r
+LEFT JOIN \`${PROJECT_ID}.${BQ_DATASET}.qa_audits\` q USING (recommendation_id);
+SQL
+
+echo "==> Creating Secret Manager placeholders"
+for secret in xmind-base-prompt sportsbook-api-key news-api-key orchestrator-config governance-policy; do
+  if ! gcloud secrets describe "$secret" >/dev/null 2>&1; then
+    gcloud secrets create "$secret" --replication-policy="automatic" --labels="$LABELS"
+    echo "PLACEHOLDER" | gcloud secrets versions add "$secret" --data-file=- >/dev/null
+  fi
+done
+
+echo "==> Provisioning Cloud Run skeleton services"
+for service in xmind-orchestrator xmind-verifier xmind-ui xmind-odds-aggregator; do
+  if ! gcloud run services describe "$service" --region="$REGION" >/dev/null 2>&1; then
+    gcloud run deploy "$service" \
+      --image="us-docker.pkg.dev/cloudrun/container/hello" \
+      --region="$REGION" \
+      --platform=managed \
+      --no-allow-unauthenticated \
+      --labels="$LABELS" \
+      --quiet
+  fi
+  gcloud run services update "$service" \
+    --platform=managed \
+    --region="$REGION" \
+    --set-env-vars="PROJECT_ID=$PROJECT_ID,REGION=$REGION,BQ_DATASET=$BQ_DATASET" \
+    --update-labels="$LABELS" \
+    --quiet
+done
+
+echo "==> Assigning IAM roles to ${USER_EMAIL}"
+# NOTE: roles/owner is intentionally broad for a single-user bootstrap;
+# narrow this binding for shared or production projects.
+gcloud projects add-iam-policy-binding "$PROJECT_ID" \
+  --member="user:${USER_EMAIL}" \
+  --role="roles/owner"
+
+echo "==> Granting Cloud Run Invoker to ${USER_EMAIL}"
+for service in xmind-orchestrator xmind-verifier xmind-ui xmind-odds-aggregator; do
+  gcloud run services add-iam-policy-binding "$service" \
+    --platform=managed \
+    --member="user:${USER_EMAIL}" \
+    --role="roles/run.invoker" \
+    --region="$REGION"
+done
+
+echo "==> Creating IAM service accounts"
+declare -A SERVICE_ACCOUNTS=(
+  [orchestrator]="Orchestrates prompts and tool calls"
+  [verifier]="Runs QA checks"
+  [ui]="Serves user interface"
+  [pipelines]="Runs Vertex AI pipelines"
+  [governance]="Automates guardrails and compliance checks"
+)
+
+for sa in "${!SERVICE_ACCOUNTS[@]}"; do
+  SA_EMAIL="$sa-sa@$PROJECT_ID.iam.gserviceaccount.com"
+  if ! gcloud iam service-accounts describe "$SA_EMAIL" >/dev/null 2>&1; then
+    gcloud iam service-accounts create "$sa-sa" \
+      --display-name="XMIND ${sa^}"
+  fi
+done
+
+declare -A SERVICE_BINDINGS=(
+  [xmind-orchestrator]=orchestrator-sa
+  [xmind-verifier]=verifier-sa
+  [xmind-ui]=ui-sa
+  [xmind-odds-aggregator]=orchestrator-sa
+)
+
+for service in "${!SERVICE_BINDINGS[@]}"; do
+  gcloud run services update "$service" \
+    --platform=managed \
+    --region="$REGION" \
+    --service-account="${SERVICE_BINDINGS[$service]}@$PROJECT_ID.iam.gserviceaccount.com" \
+    --quiet
+done
+
+echo "==> Granting service account roles"
+# Role grants per service account, applied in a loop to keep the list auditable.
+declare -A SA_ROLES=(
+  [orchestrator]="roles/aiplatform.user roles/secretmanager.secretAccessor roles/pubsub.publisher roles/pubsub.subscriber roles/cloudtasks.enqueuer"
+  [verifier]="roles/bigquery.dataViewer roles/secretmanager.secretAccessor roles/pubsub.subscriber roles/pubsub.publisher"
+  [ui]="roles/run.invoker roles/bigquery.dataViewer"
+  [pipelines]="roles/aiplatform.admin"
+  [governance]="roles/pubsub.subscriber roles/secretmanager.secretAccessor roles/bigquery.dataViewer roles/pubsub.publisher"
+)
+
+for sa in "${!SA_ROLES[@]}"; do
+  for role in ${SA_ROLES[$sa]}; do
+    gcloud projects add-iam-policy-binding "$PROJECT_ID" \
+      --member="serviceAccount:${sa}-sa@$PROJECT_ID.iam.gserviceaccount.com" \
+      --role="$role"
+  done
+done
+
+for service in xmind-odds-aggregator xmind-orchestrator xmind-verifier xmind-ui; do
+  gcloud run services add-iam-policy-binding "$service" \
+    --platform=managed \
+    --member="serviceAccount:orchestrator-sa@$PROJECT_ID.iam.gserviceaccount.com" \
+    --role="roles/run.invoker" \
+    --region="$REGION" \
+    --quiet
+done
+
+echo "==> Creating Cloud Tasks queue ${CLOUD_TASKS_QUEUE}"
+if ! gcloud tasks queues describe "$CLOUD_TASKS_QUEUE" --location="$REGION" >/dev/null 2>&1; then
+  gcloud tasks queues create "$CLOUD_TASKS_QUEUE" \
+    --location="$REGION" \
+    --max-attempts=5 \
+    --max-retry-duration="3600s"
+fi
+
+echo "==> Creating Scheduler jobs"
+if ! gcloud scheduler jobs describe odds-refresh --location="$SCHEDULER_LOCATION" >/dev/null 2>&1; then
+  # Target the Cloud Run service URL directly; the Cloud Run Admin API cannot
+  # be used to invoke a service. Update this job once real images are deployed.
+  ODDS_AGGREGATOR_URL=$(gcloud run services describe xmind-odds-aggregator \
+    --region="$REGION" --platform=managed --format="value(status.url)")
+  gcloud scheduler jobs create http odds-refresh \
+    --location="$SCHEDULER_LOCATION" \
+    --schedule="*/5 * * * *" \
+    --uri="${ODDS_AGGREGATOR_URL}/" \
+    --http-method=POST \
+    --oidc-service-account-email="orchestrator-sa@$PROJECT_ID.iam.gserviceaccount.com"
+fi
+
+if ! gcloud scheduler jobs describe qa-sweeper --location="$SCHEDULER_LOCATION" >/dev/null 2>&1; then
+  gcloud scheduler jobs create pubsub qa-sweeper \
+    --location="$SCHEDULER_LOCATION" \
+    --schedule="0 * * * *" \
+    --topic="qa-results" \
+    --message-body='{"task":"sweep"}'
+fi
+
+if ! gcloud scheduler jobs describe governance-audit --location="$SCHEDULER_LOCATION" >/dev/null 2>&1; then
+  gcloud scheduler jobs create pubsub governance-audit \
+    --location="$SCHEDULER_LOCATION" \
+    --schedule="30 2 * * *" \
+    --topic="governance-alerts" \
+    --message-body='{"task":"daily_governance_review"}'
+fi
+
+echo "==> Creating Vertex AI workbench instance placeholder"
+# Notebook instances are zonal, so pass the zone (not the region) as location.
+if ! gcloud notebooks instances describe xmind-lab --location="$ZONE" >/dev/null 2>&1; then
+  gcloud notebooks instances create xmind-lab \
+    --vm-image-project=deeplearning-platform-release \
+    --vm-image-family=common-container \
+    --location="$ZONE" \
+    --machine-type=n1-standard-4 \
+    --boot-disk-size=200GB \
+    --labels="$LABELS"
+fi
+
+echo "==> Verifying critical resources"
+gcloud run services list --region="$REGION" --platform=managed \
+  --filter="metadata.labels.app=xmind" --format="table(metadata.name,status.url)"
+for topic in odds-updates bet-recommendations qa-results orchestration-events governance-alerts; do
+  gcloud pubsub topics describe "$topic" --format="value(name)" >/dev/null
+done
+gcloud tasks queues describe "$CLOUD_TASKS_QUEUE" --location="$REGION" --format="yaml" >/dev/null
+bq show --format=none "$PROJECT_ID:$BQ_DATASET" >/dev/null
+gsutil ls "gs://${GCS_BUCKET}" >/dev/null
+
+echo "==> Deployment bootstrap complete"
+echo "Next steps:"
+echo "1. Replace Secret Manager placeholders with your actual prompt and API keys."
+echo "2. Build and deploy service images to Artifact Registry using Cloud Build."
+echo "3. Review Cloud Run service revisions (service accounts, ingress, concurrency) and tighten access if needed."
+echo "4. Wire API Gateway / IAP, Vertex AI Extensions, and finalize UI deployment."
+echo "5. Configure alerting, dashboards, and CI/CD triggers aligned with your betting workflow."
diff --git a/docs/architecture.md b/docs/architecture.md
new file mode 100644
index 0000000..dd3c064
--- /dev/null
+++ b/docs/architecture.md
@@ -0,0 +1,126 @@
+# XMIND Quant Sports AI Platform Architecture
+
+## 1. Vision and Capabilities
+The XMIND platform delivers a quant-grade sports analytics and betting copilot. It pairs an advanced reasoning foundation model with rich market/metrics data ingestion, multi-agent orchestration for verification and scenario analysis, and automated trading guardrails. The architecture focuses on modular Google Cloud services for scalability, resilience, and security, while allowing rapid experimentation with prompts and strategy logic.
+
+### Core Features
+1. **Advanced Reasoning Model Hub** – Vertex AI-hosted large language model (e.g., Gemini 1.5 Pro) fine-tuned via adapters on proprietary sports data, exposed through Vertex AI Endpoints and Vertex AI Extensions for tool calling.
+2. **Specialist Tooling** – API tools for odds aggregation, injury/news feeds, a simulation engine, and bankroll/risk calculators, accessible via API Gateway + Cloud Functions.
+3. **Quality Assurance Agent** – Dedicated verifier agent running on Cloud Run, orchestrated through LangChain (or similar) with cross-check prompts, statistical sanity checks, and bet validation policies.
+4. **Scenario Simulation & Portfolio Analytics** – Batch and streaming pipelines on Vertex AI Pipelines + BigQuery ML for Monte Carlo simulations, expected value tracking, and historical analysis.
+5. **Live Risk Monitoring** – Pub/Sub + Dataflow (Apache Beam) streaming ingestion from sportsbooks, with Looker Studio dashboards fed by BigQuery.
+6. **User Interface** – Secure web app on Cloud Run (Next.js/React) behind Identity-Aware Proxy (IAP), providing a prompt console, bet builder, and analytics workspace.
+7. **Governance & Auditability** – Cloud Logging, Cloud Audit Logs, Artifact Registry, Secret Manager, IAM least-privilege policies, and a governance automation service account that monitors QA outcomes and risk alerts.
+
+## 2. High-Level Architecture Diagram (Textual)
+```
+User/UI (Cloud Run Web App + IAP)
+        ↕
+API Gateway ─────────────┐
+        ↕                │
+Advanced Reasoning API (Vertex AI Endpoint)
+        ↕                │
+Tooling Microservices (Cloud Run / Functions) ──▶ Odds Aggregators, Injuries, Sim Engine
+        ↕
+Verifier Agent (Cloud Run Service)
+        ↕
+Orchestration (Cloud Tasks + Pub/Sub)
+
+Data Ingestion Pipelines
+  - Streaming: Pub/Sub → Dataflow → BigQuery
+  - Batch: Cloud Storage → Vertex AI Pipelines / BigQuery
+
+Analytics & Storage
+  - Feature Store (Vertex AI Feature Store)
+  - Betting Ledger (BigQuery)
+  - Model Artifacts (Artifact Registry + Cloud Storage)
+
+Monitoring & Governance
+  - Cloud Monitoring & Alerting
+  - Cloud Logging / Audit Logs
+  - Secret Manager for API keys
+```
+
+## 3. Component Breakdown
+
+### 3.1 Model Layer
+- **Vertex AI Model Endpoint:** Deploy Gemini 1.5 Pro or a custom fine-tuned base model. Configure automatic scaling and enable tool calling via Vertex Extensions for invoking REST APIs.
+- **Prompt Orchestration:** Store your "meta-perfected" prompt in Secret Manager. A Cloud Run orchestration service assembles the prompt plus context (recent bets, odds updates) and calls the Vertex endpoint. The Cloud Tasks queue `orchestration-dispatch` buffers prompt workflows so you can throttle concurrency and retry gracefully.
+- **Retrieval-Augmented Generation (RAG):** Use BigQuery and Cloud Storage for historical datasets, and Vertex AI Search or PaLM embeddings with a Pinecone-compatible vector store (e.g., AlloyDB AI) for semantic recall of player trends, team dynamics, and your proprietary notes.
+
+### 3.2 Tooling & Agents
+- **Odds Aggregation Service:** Cloud Run service calling sportsbook APIs, normalizing prices, and pushing updates to Pub/Sub.
+- **Injury/News Feed:** Cloud Functions triggered by webhooks or Cloud Scheduler to fetch from news APIs, stored in BigQuery and Firestore for low-latency reads.
+- **Simulation Engine:** Vertex AI Pipelines and notebooks using BigQuery ML, with scheduled runs for scenario planning, storing outputs in BigQuery tables accessible by the model.
+- **Verifier Agent:** Independent Cloud Run service receiving candidate bets via Pub/Sub and running rule-based and statistical checks (Kelly criterion thresholds, edge validation). Uses BigQuery for historical edges and Cloud Monitoring metrics for drift detection. QA verdicts are persisted to the `qa_audits` table for downstream reporting.
+- **Governance Agent:** Automation on Cloud Run (backed by the `governance-sa` identity) to enforce change management, requiring human approval via Firestore + Cloud Tasks before executing high-value wagers. Scheduler jobs publish to the `governance-alerts` topic to trigger daily policy reviews, and the service account has BigQuery reader and Pub/Sub publisher rights so it can both inspect historical exposure and raise alerts.
+
+### 3.3 Data Platform
+- **Storage Tiers:**
+  - Cloud Storage: Raw ingestion, model artifacts, prompt templates, simulation outputs.
+  - BigQuery: Curated datasets, live odds (`live_odds`), bet recommendations (`bet_recommendations`), QA audits (`qa_audits`), portfolio exposure (`portfolio_positions`), evaluation metrics, and a `recommendation_summary` view blending QA verdicts with model output.
+  - Vertex Feature Store: Structured features for training/updating predictive models.
+- **ETL/ELT:**
+  - Streaming: Dataflow pipelines enrich odds and news data in-flight.
+  - Batch: Cloud Composer (Airflow) orchestrates scheduled jobs (stat scraping, model retraining).
+
+### 3.4 User Experience
+- **Web App:** Next.js front end served from Cloud Run (fully managed). Integrate with Google Identity Platform or Cloud Identity; restrict access via IAP.
+- **Prompt Console:** UI component for sending prompts to the advanced model and viewing responses, with QA agent outputs displayed side by side for verification.
+- **Bet Ticket Builder:** Compose parlays and view implied probabilities, expected value, and the QA agent verdict.
+- **Portfolio Dashboard:** Embedded Looker Studio or custom D3 visualizations pulling from BigQuery.
+
+### 3.5 Operations & Security
+- **CI/CD:** Cloud Build triggers on Cloud Source Repositories/GitHub, building Docker images pushed to Artifact Registry, then deploying to Cloud Run/Vertex AI.
+- **Monitoring:** Cloud Monitoring dashboards, SLO-based alerts for latency, errors, and model drift (Vertex AI Model Monitoring), and scheduled QA/governance sweeps via Cloud Scheduler and Pub/Sub. The bootstrap script grants the orchestrator/verifier/governance service accounts publisher and subscriber roles plus Cloud Tasks enqueuer capability so observability hooks can emit events without manual IAM tweaks.
+- **Secrets & Config:** Secret Manager for API keys, prompt text, and database credentials. Use VPC Service Controls and private connectivity for sensitive services. The App Engine bootstrap (created automatically by the script) unlocks Cloud Scheduler so cron triggers work on first run.
+- **Cost Controls:** Budgets and alerts, usage caps on Vertex AI endpoints, and Cloud Scheduler to shut down idle resources.
+
+## 4. Data Flow Scenarios
+
+### 4.1 Real-Time Odds Update
+1. Cloud Scheduler triggers the Odds Aggregator service every five minutes (default cron `*/5 * * * *`).
+2. The service fetches bookmaker odds, normalizes lines, and publishes to the Pub/Sub `odds-updates` topic.
+3. Dataflow enriches the stream with team stats, writes to the BigQuery `live_odds` table, and optionally caches to Memorystore for low-latency retrieval.
+4. The Vertex model receives a prompt including the latest odds via RAG retrieval; the QA agent cross-checks implied probability before bet approval, logging decisions to `qa_audits`.
+
+### 4.2 Prompt Execution & Verification
+1. User submits a prompt via the UI.
+2. API Gateway authenticates the request and routes it to the Orchestration service.
+3. The Orchestration service retrieves the base prompt from Secret Manager, supplements context, and calls the Vertex AI endpoint.
+4. The response and supporting data are sent to the Verifier agent through Pub/Sub and Cloud Tasks.
+5. The QA agent verifies; results are stored in the BigQuery `bet_recommendations` table with the QA verdict appended to `qa_audits` and surfaced to the UI via the `recommendation_summary` view.
+
+### 4.3 Model Retraining Loop
+1. A Cloud Composer DAG triggers a Vertex AI Pipelines job weekly.
+2. The pipeline pulls curated features from the Feature Store and trains adapters/fine-tunes the model.
+3. Evaluation metrics are logged to BigQuery and Cloud Monitoring; the QA agent approves deployment via a manual checkpoint.
+4. If approved, the new model version is deployed to the Vertex endpoint via CI/CD.
+
+## 5. Deployment Strategy
+- **Infrastructure-as-Code:** Use Terraform for production; for rapid iteration, use the provided Cloud Shell deploy script (see `deploy/cloudshell_deploy.sh`).
+- **Environments:** Separate projects for dev/staging/prod with Shared VPC. The minimal script provisions core services in a single project with environment labels.
+- **Secrets Bootstrapping:** The script creates Secret Manager placeholders; populate the prompt and API keys manually after deployment.
+
+## 6. Security & Compliance Considerations
+- Enable CMEK for BigQuery and Cloud Storage to control encryption keys.
+- Configure VPC Service Controls to close off data exfiltration paths.
+- Log all model invocations and QA decisions for audit. Export Cloud Audit Logs to BigQuery for compliance analytics.
+- Implement responsible AI guardrails: Vertex AI safety filters, fairness evaluation, and human-in-the-loop override before execution of real wagers.
+
+## 7. Future Enhancements
+- Integrate a reinforcement learning from human feedback (RLHF) loop for strategy refinement.
+- Add an on-chain execution layer (if required) via Cloud Run connectors to approved betting exchanges.
+- Expand the QA agent to include portfolio impact analysis and optional ensemble models (e.g., an XGBoost risk model) to co-sign high-stake bets.
+- Introduce streaming LLM evaluation metrics using Weights & Biases or Vertex Experiments.
+
+## 8. Reference Deploy Script
+Refer to [`deploy/cloudshell_deploy.sh`](../deploy/cloudshell_deploy.sh) for a Cloud Shell-ready bootstrap script that enables APIs and sets up the Artifact Registry, Cloud Storage, Pub/Sub, BigQuery, and Cloud Run skeleton resources required by the architecture. The script performs gcloud auth/project validation, applies opinionated IAM bindings (Run invoker, BigQuery viewer, Pub/Sub publish/subscribe, Cloud Tasks enqueuer), and emits a verification summary so you can confirm one-shot success from the same terminal run.
+
+## 9. Next Steps
+1. Review and customize the environment variable defaults in `deploy/cloudshell_deploy.sh`.
+2. Run the script from Google Cloud Shell to provision baseline infrastructure (it performs pre-flight binary checks and creates resources idempotently).
+3. Upload your prompt, API keys, and governance policy into the Secret Manager entries created by the script.
+4. Implement application code for the orchestration service, verifier agent, governance agent, and UI, using the provisioned infrastructure as a foundation.
+5. Configure CI/CD pipelines (Cloud Build triggers) and monitoring dashboards tailored to your betting strategies. Update the Scheduler job targets once production images are deployed, as sketched below.
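+
+As a sketch of that last step, once a real image is serving traffic the `odds-refresh` job can be repointed at the deployed revision's URL. This assumes the defaults used by `deploy/cloudshell_deploy.sh`; the `/refresh` path is a placeholder for your service's actual handler:
+
+```bash
+# Look up the deployed service URL, then retarget the Scheduler job.
+SERVICE_URL=$(gcloud run services describe xmind-odds-aggregator \
+  --region="$REGION" --platform=managed --format="value(status.url)")
+gcloud scheduler jobs update http odds-refresh \
+  --location="${SCHEDULER_LOCATION:-$REGION}" \
+  --uri="${SERVICE_URL}/refresh" \
+  --oidc-service-account-email="orchestrator-sa@${PROJECT_ID}.iam.gserviceaccount.com"
+```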