# Model Deployment Guide

> Quickly deploy models to a SageMaker Endpoint with the sm_deploy utility library

---

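## Table of Contents

1. [Environment Setup](#1-environment-setup)
2. [Configuration Check](#2-configuration-check)
3. [Deploy a Real-Time Endpoint](#3-deploy-a-real-time-endpoint)
4. [Deploy a Serverless Endpoint](#4-deploy-a-serverless-endpoint)
5. [Invoke the Endpoint](#5-invoke-the-endpoint)
6. [Manage Endpoints](#6-manage-endpoints)
7. [Clean Up Resources](#7-clean-up-resources)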


## 1. Environment Setup

### 1.1 Set Environment Variables

> Adjust the following settings for your project


In [None]:
import sys
import os

# ============================================
# Add the SDK path (if the SDK was copied from S3)
# ============================================
# sys.path.insert(0, './sdk')

# ============================================
# Required settings: adjust to your environment
# ============================================

os.environ["COMPANY"] = "acme"  # company name
os.environ["TEAM"] = "rc"       # team ID
os.environ["PROJECT"] = "fraud-detection"  # project name

# ============================================
# Team name mapping (used to build the Role ARN)
# ============================================
os.environ["TEAM_RC_FULLNAME"] = "RiskControl"

# ============================================
# VPC settings (usually auto-discovered from the Domain)
# If auto-discovery fails, uncomment and set these manually
# ============================================

# os.environ["VPC_ID"] = "vpc-xxx"
# os.environ["PRIVATE_SUBNET_1_ID"] = "subnet-xxx"
# os.environ["PRIVATE_SUBNET_2_ID"] = "subnet-yyy"
# os.environ["SG_SAGEMAKER_STUDIO"] = "sg-xxx"
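
Later cells depend on these variables, so it can help to fail fast when one is missing. The sketch below is a standalone check and is not part of sm_deploy. The helper names `missing_env_vars` and `team_fullname_key` are hypothetical, and both the list of required variables and the `TEAM_<ID>_FULLNAME` naming pattern are assumptions inferred from the cell above.

```python
import os

# Assumed to be required, based on the "required settings" block above.
REQUIRED_VARS = ("COMPANY", "TEAM", "PROJECT")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


def team_fullname_key(team_id):
    """Build the mapping variable name for a team ID (assumed pattern)."""
    return f"TEAM_{team_id.upper()}_FULLNAME"


# Illustration with an explicit dict instead of the real environment.
example_env = {"COMPANY": "acme", "TEAM": "rc", "PROJECT": ""}
print("Missing:", missing_env_vars(example_env))
print("Mapping key:", team_fullname_key(example_env["TEAM"]))
```

Calling `missing_env_vars()` with no arguments checks the real process environment.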


## 2. Configuration Check


In [None]:
from sm_deploy import get_config
from sm_deploy.config import print_config

# Load and print the configuration
config = get_config()
print_config(config)


In [None]:
# Verify the VPC settings (injected automatically into CreateModel)
print("VPC settings:")
print(f"  VPC ID: {config.vpc_id}")
print(f"  Subnets: {config.subnet_ids}")
print(f"  Security Groups: {config.security_group_ids}")
print()
print("VpcConfig (injected automatically into CreateModel):")
print(config.get_vpc_config())
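
For reference, the SageMaker `CreateModel` API takes `VpcConfig` as a dictionary with `SecurityGroupIds` and `Subnets` keys. The sketch below builds such a dictionary from plain values. `build_vpc_config` is a hypothetical helper, and whether `config.get_vpc_config()` returns exactly this shape is an assumption.

```python
def build_vpc_config(subnet_ids, security_group_ids):
    """Return a dict in the shape CreateModel accepts for VpcConfig."""
    subnets = [s for s in subnet_ids if s]
    groups = [g for g in security_group_ids if g]
    if not subnets or not groups:
        raise ValueError("VpcConfig needs at least one subnet and one security group")
    return {"SecurityGroupIds": groups, "Subnets": subnets}


# Placeholder IDs, for illustration only.
print(build_vpc_config(["subnet-aaa", "subnet-bbb"], ["sg-ccc"]))
```

Comparing this shape with the printed output of the cell above is a quick way to spot an empty subnet or security group list before deploying.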


## 3. Deploy a Real-Time Endpoint

### 3.1 Prepare Model Information


In [None]:
import boto3
import sagemaker

session = sagemaker.Session()
region = session.boto_region_name

# ============================================
# Model settings: adjust to your environment
# ============================================

# Path to the model artifact
MODEL_DATA_URL = f"s3://{config.bucket}/models/sklearn/model.tar.gz"

# Inference image URI (uses the AWS-provided SKLearn image)
IMAGE_URI = sagemaker.image_uris.retrieve(
    framework="sklearn",
    region=region,
    version="1.2-1",
    py_version="py3",
    instance_type="ml.m5.large"
)

print(f"Model Data: {MODEL_DATA_URL}")
print(f"Image URI: {IMAGE_URI}")
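
A malformed artifact path typically only surfaces once deployment has started, so a quick local check can save a round trip. The sketch below splits an S3 URI into bucket and key using the standard library. `split_s3_uri` is a hypothetical helper, and the `.tar.gz` suffix check assumes the artifact is packaged as in the cell above.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split "s3://bucket/key" into (bucket, key), validating the basics."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    key = parsed.path.lstrip("/")
    if not key.endswith(".tar.gz"):
        raise ValueError(f"Expected a .tar.gz model archive, got key {key!r}")
    return parsed.netloc, key


# Placeholder bucket name, for illustration only.
bucket, key = split_s3_uri("s3://example-bucket/models/sklearn/model.tar.gz")
print(bucket, key)
```

In the notebook, `split_s3_uri(MODEL_DATA_URL)` would run the same check against the real path.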


### 3.2 One-Step Deployment


In [None]:
from sm_deploy import deploy_model

# Deploy a Real-Time Endpoint
endpoint_name = deploy_model(
    model_name="sklearn-v1",
    model_data_url=MODEL_DATA_URL,
    image_uri=IMAGE_URI,
    instance_type="ml.t2.medium",  # small instance for development and testing
    instance_count=1,
    wait=True  # wait for the deployment to finish
)

print(f"\n✅ Endpoint deployed: {endpoint_name}")
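
To make the waiting step concrete, here is a minimal polling sketch built on the `DescribeEndpoint` call. It is not how `deploy_model(wait=True)` is implemented (that is internal to sm_deploy and unknown here); `wait_for_endpoint` and its default timings are assumptions. The client is passed in, so the function works with `boto3.client("sagemaker")` or with a stub.

```python
import time


def wait_for_endpoint(client, endpoint_name, poll_seconds=30, timeout_seconds=1800):
    """Poll DescribeEndpoint until the endpoint is InService, fails, or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            reason = response.get("FailureReason", "unknown reason")
            raise RuntimeError(f"Endpoint {endpoint_name} failed: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Endpoint {endpoint_name} still {status} after timeout")
        time.sleep(poll_seconds)


# Usage with a real client (requires AWS credentials):
#   import boto3
#   wait_for_endpoint(boto3.client("sagemaker"), endpoint_name)
```

boto3 also ships a built-in waiter for this, `client.get_waiter("endpoint_in_service")`, which may be preferable when custom error handling is not needed.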


## 4. Deploy a Serverless Endpoint

> Serverless suits low-traffic, intermittent workloads: you pay for request processing rather than for idle instances


In [None]:
# Deploy a Serverless Endpoint
serverless_endpoint = deploy_model(
    model_name="sklearn-serverless",
    model_data_url=MODEL_DATA_URL,
    image_uri=IMAGE_URI,
    serverless=True,
    serverless_memory_mb=2048,
    serverless_max_concurrency=5,
    wait=True
)

print(f"\n✅ Serverless Endpoint deployed: {serverless_endpoint}")
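
The two serverless parameters map onto the `ServerlessConfig` structure of an endpoint config, which has `MemorySizeInMB` and `MaxConcurrency` fields. The sketch below validates the values locally before any API call. `build_serverless_config` is a hypothetical helper, the limits reflect the SageMaker API reference at the time of writing and may change, and whether sm_deploy performs a similar check is unknown.

```python
# Limits as documented for ServerlessConfig at the time of writing (assumption: may change).
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)
MAX_CONCURRENCY_LIMIT = 200


def build_serverless_config(memory_mb, max_concurrency):
    """Return a ServerlessConfig-shaped dict after checking both values."""
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}, got {memory_mb}")
    if not 1 <= max_concurrency <= MAX_CONCURRENCY_LIMIT:
        raise ValueError(
            f"max_concurrency must be between 1 and {MAX_CONCURRENCY_LIMIT}, got {max_concurrency}"
        )
    return {"MemorySizeInMB": memory_mb, "MaxConcurrency": max_concurrency}


# Same values as the deployment cell above.
print(build_serverless_config(2048, 5))
```

Account-level quotas can be lower than these API limits, so a value that passes this check may still be rejected at deployment time.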


## 5. Invoke the Endpoint


In [None]:
from sm_deploy import invoke_endpoint

# Test data
test_data = {
    "instances": [
        [0.5, 1.2, 0.3, 0.8, 0.1],
        [-0.2, 0.7, 1.1, 0.4, 0.6]
    ]
}

# Invoke the Endpoint
result = invoke_endpoint(
    endpoint_name="sklearn-v1",
    data=test_data
)

print("Inference result:")
print(result)
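
For comparison, the sketch below shows roughly what a JSON invocation looks like against the SageMaker runtime API directly. It is not the implementation of `sm_deploy.invoke_endpoint`, which is unknown here. `invoke_json` is a hypothetical helper, and it assumes the container's inference script accepts and returns `application/json`.

```python
import json


def invoke_json(runtime_client, endpoint_name, payload):
    """Send `payload` as JSON to an endpoint and decode the JSON response body."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps(payload),
    )
    # The response body is a stream; read it fully before decoding.
    return json.loads(response["Body"].read())


# Usage with a real client (requires AWS credentials and a deployed endpoint):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   invoke_json(runtime, endpoint_name, {"instances": [[0.5, 1.2, 0.3, 0.8, 0.1]]})
```

Passing the client in as an argument keeps the function easy to exercise with a stub during local testing.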


## 6. Manage Endpoints


In [None]:
from sm_deploy import list_endpoints
from sm_deploy.endpoint import describe_endpoint

# List all Endpoints
endpoints = list_endpoints()
print("Project Endpoints:")
for ep in endpoints:
    print(f"  - {ep['name']}: {ep['status']}")
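
When a project has many endpoints, grouping them by status makes it easier to spot ones that are stuck or failed. The sketch below assumes each entry is a dict with `name` and `status` keys, as the loop above suggests; `group_by_status` is a hypothetical helper and the sample data is made up for illustration.

```python
from collections import defaultdict


def group_by_status(endpoints):
    """Map each status to the list of endpoint names currently in that status."""
    grouped = defaultdict(list)
    for ep in endpoints:
        grouped[ep["status"]].append(ep["name"])
    return dict(grouped)


# Made-up sample in the assumed shape of list_endpoints() output.
sample = [
    {"name": "sklearn-v1", "status": "InService"},
    {"name": "sklearn-serverless", "status": "Creating"},
    {"name": "sklearn-old", "status": "InService"},
]
for status, names in group_by_status(sample).items():
    print(f"{status}: {', '.join(names)}")
```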


## 7. Clean Up Resources

> ⚠️ **Important**: Always delete Endpoints when you are done with them to avoid ongoing charges!


In [None]:
from sm_deploy import delete_endpoint

# Delete the Real-Time Endpoint (along with its Config and Model)
delete_endpoint(
    "sklearn-v1",
    delete_config=True,
    delete_model=True
)

# Delete the Serverless Endpoint
delete_endpoint(
    "sklearn-serverless",
    delete_config=True,
    delete_model=True
)
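
Deleting an endpoint through the SageMaker API does not remove its endpoint config or model, which is presumably why `delete_config` and `delete_model` exist. The sketch below shows one way to do the full cleanup with the underlying API calls. It is not the sm_deploy implementation; `delete_endpoint_stack` is a hypothetical helper. The related names are looked up first, because they can no longer be read from the endpoint once it is gone.

```python
def delete_endpoint_stack(client, endpoint_name):
    """Delete an endpoint, then its endpoint config, then the models it referenced."""
    # Resolve dependent resource names before deleting anything.
    config_name = client.describe_endpoint(EndpointName=endpoint_name)["EndpointConfigName"]
    variants = client.describe_endpoint_config(EndpointConfigName=config_name)["ProductionVariants"]
    model_names = [variant["ModelName"] for variant in variants]

    client.delete_endpoint(EndpointName=endpoint_name)
    client.delete_endpoint_config(EndpointConfigName=config_name)
    for model_name in model_names:
        client.delete_model(ModelName=model_name)
    return config_name, model_names


# Usage with a real client (requires AWS credentials):
#   import boto3
#   delete_endpoint_stack(boto3.client("sagemaker"), "my-endpoint-name")
```

If a model is shared by several endpoint configs, deleting it here would affect the others, so a shared-model setup would need an extra check before the final loop.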
