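# Lab-2.2.1: Enterprise Multi-Model Repository Architecture Design

## 🎯 Learning Objectives

- Understand the design principles of an enterprise multi-model repository
- Implement dependency management between models
- Apply resource allocation and isolation strategies
- Build mechanisms for concurrent deployment and version-conflict handling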

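## 🏢 Enterprise Case Study: Netflix Recommendation System Multi-Model Architecture

This lab models a scenario in which Netflix serves more than 20 models at the same time:
- **Recommendation models**: personalized recommendation, similar-content recommendation
- **Search models**: query understanding, content retrieval
- **Content models**: thumbnail generation, subtitle translation
- **Analytics models**: viewing-behavior analysis, churn prediction

Each model has several versions running in production, so a unified management architecture is needed.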

## 🛠️ Environment Setup and Dependency Check

In [None]:
import os
import sys
import json
import yaml
import shutil
import subprocess
from pathlib import Path
from typing import Dict, List, Optional, Any
from dataclasses import dataclass, asdict
from datetime import datetime
import logging

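# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

print("🚀 Enterprise multi-model repository design - environment check")
print(f"Python version: {sys.version}")
print(f"Working directory: {os.getcwd()}")

# Check for Triton-related tools
def check_triton_tools():
    tools = {
        'curl': 'HTTP client for testing',
        'docker': 'Triton container management',
        'nvidia-smi': 'GPU monitoring',
    }
    
    for tool, desc in tools.items():
        try:
            result = subprocess.run([tool, '--version'], 
                                  capture_output=True, text=True, timeout=5)
            # A non-zero exit code means the tool was found but did not run cleanly
            if result.returncode == 0:
                print(f"✅ {tool}: {desc} - available")
            else:
                print(f"⚠️ {tool}: {desc} - found, but exited with code {result.returncode}")
        except (subprocess.TimeoutExpired, OSError):
            print(f"⚠️ {tool}: {desc} - not installed or unavailable")

check_triton_tools()
print("\n✅ Environment check complete")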
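
Running each tool with `--version` also executes it, which can be slow or fail for reasons unrelated to installation. A lighter alternative, sketched below, only asks whether the executable is on `PATH` via `shutil.which`. The helper name `find_missing_tools` is hypothetical and not part of the lab code.

```python
import shutil
from typing import Iterable, List


def find_missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the tools that cannot be found on PATH (hypothetical helper)."""
    # shutil.which returns None when no matching executable is found
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools(["curl", "docker", "nvidia-smi"])
print("Missing tools:", missing or "none")
```

This only proves the binary exists; it says nothing about whether a GPU driver or Docker daemon is actually working, so it complements the subprocess check rather than replacing it.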

## 🏗️ Enterprise Model Repository Architecture Design

### 1. Model Classification and Organization Strategy

In [None]:
@dataclass
class ModelMetadata:
    """Model metadata definition"""
    name: str
    version: str
    model_type: str  # 'classification', 'generation', 'embedding', 'ensemble'
    framework: str   # 'pytorch', 'tensorflow', 'onnx', 'python'
    domain: str      # 'recommendation', 'search', 'content', 'analytics'
    owner_team: str
    description: str
    created_at: str
    # None means "use the defaults filled in by __post_init__"
    dependencies: Optional[List[str]] = None
    resource_requirements: Optional[Dict[str, Any]] = None
    sla_requirements: Optional[Dict[str, Any]] = None
    
    def __post_init__(self):
        if self.dependencies is None:
            self.dependencies = []
        if self.resource_requirements is None:
            self.resource_requirements = {
                'memory_mb': 2048,
                'gpu_memory_mb': 4096,
                'cpu_cores': 2,
                'max_batch_size': 32
            }
        if self.sla_requirements is None:
            self.sla_requirements = {
                'max_latency_p99_ms': 100,
                'min_throughput_rps': 10,
                'availability': 0.999
            }

class EnterpriseModelRepository:
    """Enterprise model repository manager"""
    
    def __init__(self, base_path: str = "./enterprise_model_repository"):
        self.base_path = Path(base_path)
        self.metadata_file = self.base_path / "repository_metadata.json"
        self.models = {}
        self._initialize_repository()
    
    def _initialize_repository(self):
        """Initialize the repository structure"""
        # parents=True lets base_path point at a nested location that does not exist yet
        self.base_path.mkdir(parents=True, exist_ok=True)
        
        # Create one directory per business domain
        domains = ['recommendation', 'search', 'content', 'analytics', 'shared']
        for domain in domains:
            (self.base_path / domain).mkdir(exist_ok=True)
        
        # Load existing metadata
        if self.metadata_file.exists():
            with open(self.metadata_file, 'r', encoding='utf-8') as f:
                data = json.load(f)
                self.models = {k: ModelMetadata(**v) for k, v in data.items()}
        
        logger.info(f"Enterprise model repository initialized: {self.base_path}")
    
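    def register_model(self, metadata: ModelMetadata) -> bool:
        """Register a new model in the repository"""
        model_id = f"{metadata.domain}/{metadata.name}"
        
        # Check dependencies
        if not self._validate_dependencies(metadata.dependencies):
            logger.error(f"Dependency check failed for model {model_id}")
            return False
        
        # Check for resource conflicts (a conflict is logged but does not block registration)
        if not self._check_resource_conflicts(metadata):
            logger.warning(f"Model {model_id} may have a resource conflict")
        
        # Create the model directory structure.
        # Note: Triton itself expects config.pbtxt in the model directory, next to
        # integer-named version directories (1/, 2/, ...). The domain/name/version
        # layout used here is a teaching layout and would need adapting before
        # tritonserver could load it directly.
        model_path = self.base_path / metadata.domain / metadata.name
        version_path = model_path / metadata.version
        version_path.mkdir(parents=True, exist_ok=True)
        
        # Generate the Triton configuration
        config = self._generate_triton_config(metadata)
        config_path = version_path / "config.pbtxt"
        with open(config_path, 'w') as f:
            f.write(config)
        
        # Update metadata
        self.models[model_id] = metadata
        self._save_metadata()
        
        logger.info(f"✅ Model {model_id} v{metadata.version} registered")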
        return True
    
    def _validate_dependencies(self, dependencies: List[str]) -> bool:
        """Validate model dependencies"""
        for dep in dependencies:
            if dep not in self.models:
                logger.error(f"Dependency {dep} is not registered")
                return False
        return True
    
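    def _check_resource_conflicts(self, metadata: ModelMetadata) -> bool:
        """Check for resource conflicts"""
        model_id = f"{metadata.domain}/{metadata.name}"
        # Skip the model's own earlier registration so that re-registering
        # (for example, re-running the cell) does not count it twice
        total_gpu_memory = sum(
            model.resource_requirements.get('gpu_memory_mb', 0)
            for mid, model in self.models.items()
            if mid != model_id
        )
        
        new_gpu_memory = metadata.resource_requirements.get('gpu_memory_mb', 0)
        
        # Assume 24 GB of total GPU memory
        max_gpu_memory = 24 * 1024  # MB
        
        if total_gpu_memory + new_gpu_memory > max_gpu_memory:
            logger.warning(f"GPU memory may be insufficient: {total_gpu_memory + new_gpu_memory}MB > {max_gpu_memory}MB")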
            return False
        
        return True
    
    def _generate_triton_config(self, metadata: ModelMetadata) -> str:
        """Generate the Triton configuration file"""
        backend = {
            'pytorch': 'pytorch',
            'tensorflow': 'tensorflow',
            'onnx': 'onnxruntime',
            'python': 'python'
        }.get(metadata.framework, 'python')
        
        config = f'''
name: "{metadata.name}"
backend: "{backend}"
max_batch_size: {metadata.resource_requirements['max_batch_size']}

# Input configuration (example)
input [
  {{
    name: "INPUT"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }}
]

# Output configuration (example)
output [
  {{
    name: "OUTPUT"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }}
]

# Dynamic batching: the queue delay is capped at 10% of the p99 latency budget
# (an assumed ratio) so that queuing alone cannot consume the whole SLA
dynamic_batching {{
  max_queue_delay_microseconds: {int(metadata.sla_requirements['max_latency_p99_ms'] * 1000) // 10}
}}

# Instance group configuration
instance_group [
  {{
    count: 1
    kind: KIND_GPU
    gpus: [ 0 ]
  }}
]

# Model metadata
parameters: {{
  key: "domain"
  value: {{ string_value: "{metadata.domain}" }}
}}
parameters: {{
  key: "owner_team"
  value: {{ string_value: "{metadata.owner_team}" }}
}}
parameters: {{
  key: "sla_latency_p99_ms"
  value: {{ string_value: "{metadata.sla_requirements['max_latency_p99_ms']}" }}
}}
'''.strip()
        
        return config
    
    def _save_metadata(self):
        """Save metadata to file"""
        data = {k: asdict(v) for k, v in self.models.items()}
        with open(self.metadata_file, 'w', encoding='utf-8') as f:
            json.dump(data, f, indent=2, ensure_ascii=False)
    
    def list_models(self, domain: Optional[str] = None) -> List[ModelMetadata]:
        """List models, optionally filtered by domain"""
        if domain:
            return [model for model_id, model in self.models.items() 
                   if model.domain == domain]
        return list(self.models.values())
    
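    def get_model_dependencies(self, model_id: str) -> List[str]:
        """Return all direct and transitive dependencies of a model, each listed once"""
        if model_id not in self.models:
            return []
        
        def get_deps(mid: str, visited: set) -> List[str]:
            if mid in visited:
                return []  # Already expanded; this also guards against circular dependencies
            
            visited.add(mid)
            deps = []
            
            if mid in self.models:
                for dep in self.models[mid].dependencies:
                    if dep not in deps:
                        deps.append(dep)
                    # Share one visited set so shared dependencies are not repeated
                    for sub in get_deps(dep, visited):
                        if sub not in deps:
                            deps.append(sub)
            
            return deps
        
        return get_deps(model_id, set())

# Initialize the enterprise model repository
repo = EnterpriseModelRepository()
print("\n✅ Enterprise model repository initialized")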
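
`_save_metadata` rewrites `repository_metadata.json` in place, so a crash in the middle of a write (or a reader that opens the file at the wrong moment) could see a truncated file. One common mitigation, sketched here under the assumption that a single writer is active at a time, is to write to a temporary file in the same directory and then swap it into place with `os.replace`. The function name `save_json_atomically` is hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_json_atomically(path: Path, data: Dict[str, Any]) -> None:
    """Write JSON to a temp file next to `path`, then swap it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory so os.replace stays on one filesystem
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2, ensure_ascii=False)
        os.replace(tmp_name, path)  # readers see either the old file or the new one
    except BaseException:
        # Remove the partial temp file if anything went wrong
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise


with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "repository_metadata.json"
    save_json_atomically(target, {"search/query_understanding": {"version": "1.5"}})
    save_json_atomically(target, {"search/query_understanding": {"version": "1.6"}})
    loaded = json.loads(target.read_text(encoding="utf-8"))
    print(loaded)
```

This avoids torn files but does not by itself prevent lost updates when two processes register models concurrently; that would need a lock or a version check on top.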

### 2. Netflix Case Study: Building the Recommendation Model Group

In [None]:
# Define the Netflix recommendation model group
netflix_models = [
    ModelMetadata(
        name="user_embedding",
        version="1.0",
        model_type="embedding",
        framework="pytorch",
        domain="recommendation",
        owner_team="recommendation_platform",
        description="User embedding model that converts user behavior into dense vectors",
        created_at=datetime.now().isoformat(),
        dependencies=[],
        resource_requirements={
            'memory_mb': 4096,
            'gpu_memory_mb': 6144,
            'cpu_cores': 4,
            'max_batch_size': 128
        },
        sla_requirements={
            'max_latency_p99_ms': 50,
            'min_throughput_rps': 100,
            'availability': 0.9999
        }
    ),
    ModelMetadata(
        name="content_embedding",
        version="2.1",
        model_type="embedding",
        framework="pytorch",
        domain="recommendation",
        owner_team="content_intelligence",
        description="Content embedding model that extracts features from film and TV content",
        created_at=datetime.now().isoformat(),
        dependencies=[],
        resource_requirements={
            'memory_mb': 6144,
            'gpu_memory_mb': 8192,
            'cpu_cores': 6,
            'max_batch_size': 64
        },
        sla_requirements={
            'max_latency_p99_ms': 75,
            'min_throughput_rps': 50,
            'availability': 0.9995
        }
    ),
    ModelMetadata(
        name="recommendation_ranker",
        version="3.0",
        model_type="classification",
        framework="pytorch",
        domain="recommendation",
        owner_team="recommendation_platform",
        description="Recommendation ranking model that performs personalized ranking from user and content features",
        created_at=datetime.now().isoformat(),
        dependencies=["recommendation/user_embedding", "recommendation/content_embedding"],
        resource_requirements={
            'memory_mb': 8192,
            'gpu_memory_mb': 12288,
            'cpu_cores': 8,
            'max_batch_size': 32
        },
        sla_requirements={
            'max_latency_p99_ms': 100,
            'min_throughput_rps': 25,
            'availability': 0.9999
        }
    )
]

# Register the models in the repository
print("📊 Registering the Netflix recommendation model group...")
for model in netflix_models:
    success = repo.register_model(model)
    if success:
        print(f"✅ {model.domain}/{model.name} v{model.version} registered")
    else:
        print(f"❌ {model.domain}/{model.name} v{model.version} registration failed")

print(f"\n📈 Repository statistics:")
print(f"- Total models: {len(repo.models)}")
print(f"- Recommendation-domain models: {len(repo.list_models('recommendation'))}")
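
On a fresh repository, the third registration is expected to log a GPU memory warning while still succeeding, because `_check_resource_conflicts` compares the running total against its assumed 24 GB budget. The arithmetic can be checked by hand; the snippet below simply replays it with the `gpu_memory_mb` values from the three models above.

```python
# GPU memory requests (MB) copied from the three recommendation models above
requests_mb = {
    "user_embedding": 6144,
    "content_embedding": 8192,
    "recommendation_ranker": 12288,
}
budget_mb = 24 * 1024  # the 24 GB budget assumed by _check_resource_conflicts

running_total = 0
over_budget = []  # models whose registration pushes the total past the budget
for name, mb in requests_mb.items():
    running_total += mb
    if running_total > budget_mb:
        over_budget.append(name)
    print(f"{name}: cumulative {running_total} MB ({running_total / 1024:.1f} GB)")

print("Exceeds budget at:", over_budget)
```

The total of 26624 MB (26 GB) exceeds 24576 MB only once the ranker is added, which is why the warning appears for that model and not for the two embedding models.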

### 3. Adding Search-Domain Models

In [None]:
# Add search-domain models
search_models = [
    ModelMetadata(
        name="query_understanding",
        version="1.5",
        model_type="classification",
        framework="pytorch",
        domain="search",
        owner_team="search_experience",
        description="Query understanding model that parses search intent and recognizes entities",
        created_at=datetime.now().isoformat(),
        dependencies=[],
        resource_requirements={
            'memory_mb': 3072,
            'gpu_memory_mb': 4096,
            'cpu_cores': 4,
            'max_batch_size': 64
        },
        sla_requirements={
            'max_latency_p99_ms': 80,
            'min_throughput_rps': 200,
            'availability': 0.9995
        }
    ),
    ModelMetadata(
        name="content_retrieval",
        version="2.0",
        model_type="embedding",
        framework="onnx",
        domain="search",
        owner_team="search_experience",
        description="Content retrieval model that retrieves relevant content by semantic similarity",
        created_at=datetime.now().isoformat(),
        dependencies=["search/query_understanding"],
        resource_requirements={
            'memory_mb': 2048,
            'gpu_memory_mb': 3072,
            'cpu_cores': 2,
            'max_batch_size': 128
        },
        sla_requirements={
            'max_latency_p99_ms': 60,
            'min_throughput_rps': 150,
            'availability': 0.999
        }
    )
]

# Register the search models
print("🔍 Registering search-domain models...")
for model in search_models:
    success = repo.register_model(model)
    if success:
        print(f"✅ {model.domain}/{model.name} v{model.version} registered")
    else:
        print(f"❌ {model.domain}/{model.name} v{model.version} registration failed")

# Inspect dependencies
print("\n🔗 Dependency analysis:")
for model_id in repo.models.keys():
    deps = repo.get_model_dependencies(model_id)
    if deps:
        print(f"- {model_id}: depends on {', '.join(deps)}")
    else:
        print(f"- {model_id}: no dependencies")
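
The loop above answers "what does this model depend on?". The reverse question, "which models are affected if this one changes?", is often just as useful when planning a rollout. A minimal sketch follows, using a plain dictionary of the dependency edges registered above; the function names `build_reverse_index` and `affected_models` are hypothetical.

```python
from collections import deque
from typing import Dict, List

# Dependency edges copied from the models registered above
DEPENDENCIES: Dict[str, List[str]] = {
    "recommendation/user_embedding": [],
    "recommendation/content_embedding": [],
    "recommendation/recommendation_ranker": [
        "recommendation/user_embedding",
        "recommendation/content_embedding",
    ],
    "search/query_understanding": [],
    "search/content_retrieval": ["search/query_understanding"],
}


def build_reverse_index(graph: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each model to the models that list it as a dependency."""
    reverse: Dict[str, List[str]] = {model_id: [] for model_id in graph}
    for model_id, deps in graph.items():
        for dep in deps:
            reverse.setdefault(dep, []).append(model_id)
    return reverse


def affected_models(changed: str, graph: Dict[str, List[str]]) -> List[str]:
    """Return every model that directly or indirectly depends on `changed`."""
    reverse = build_reverse_index(graph)
    affected: List[str] = []
    seen = {changed}
    queue = deque([changed])
    while queue:  # breadth-first walk over the reversed edges
        current = queue.popleft()
        for dependent in reverse.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                affected.append(dependent)
                queue.append(dependent)
    return affected


print(affected_models("recommendation/user_embedding", DEPENDENCIES))
```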

## 📊 Repository Structure Visualization and Analysis

In [None]:
def visualize_repository_structure():
    """Visualize the repository structure"""
    print("🏗️ Enterprise model repository structure:")
    print("="*60)
    
    # Group models by domain
    domains = {}
    for model_id, model in repo.models.items():
        if model.domain not in domains:
            domains[model.domain] = []
        domains[model.domain].append(model)
    
    for domain, models in domains.items():
        print(f"\n📁 {domain.upper()} domain ({len(models)} model(s))")
        print("-" * 40)
        
        for model in models:
            deps_info = f" (dependencies: {len(model.dependencies)})" if model.dependencies else " (no dependencies)"
            gpu_mb = model.resource_requirements.get('gpu_memory_mb', 0)
            latency = model.sla_requirements.get('max_latency_p99_ms', 0)
            # Append an ellipsis only when the description is actually truncated
            description = model.description if len(model.description) <= 50 else model.description[:50] + "..."
            
            print(f"  📦 {model.name} v{model.version}")
            print(f"     Framework: {model.framework} | GPU: {gpu_mb}MB | Latency: {latency}ms{deps_info}")
            print(f"     Team: {model.owner_team}")
            print(f"     Description: {description}")
            print()

def analyze_resource_usage():
    """Analyze resource usage"""
    print("\n📈 Resource usage analysis:")
    print("="*60)
    
    total_memory = 0
    total_gpu_memory = 0
    total_cpu_cores = 0
    
    framework_count = {}
    domain_resources = {}
    
    for model in repo.models.values():
        # Accumulate resources
        total_memory += model.resource_requirements.get('memory_mb', 0)
        total_gpu_memory += model.resource_requirements.get('gpu_memory_mb', 0)
        total_cpu_cores += model.resource_requirements.get('cpu_cores', 0)
        
        # Framework statistics
        framework_count[model.framework] = framework_count.get(model.framework, 0) + 1
        
        # Per-domain resource statistics
        if model.domain not in domain_resources:
            domain_resources[model.domain] = {'memory': 0, 'gpu_memory': 0, 'models': 0}
        
        domain_resources[model.domain]['memory'] += model.resource_requirements.get('memory_mb', 0)
        domain_resources[model.domain]['gpu_memory'] += model.resource_requirements.get('gpu_memory_mb', 0)
        domain_resources[model.domain]['models'] += 1
    
    print(f"💾 Total memory required: {total_memory/1024:.1f} GB")
    print(f"🎮 Total GPU memory required: {total_gpu_memory/1024:.1f} GB")
    print(f"⚡ Total CPU cores required: {total_cpu_cores}")
    
    print(f"\n🔧 Framework distribution:")
    for framework, count in framework_count.items():
        percentage = (count / len(repo.models)) * 100
        print(f"  - {framework}: {count} model(s) ({percentage:.1f}%)")
    
    print(f"\n📊 Resource distribution by domain:")
    for domain, resources in domain_resources.items():
        print(f"  - {domain}:")
        print(f"    Models: {resources['models']}")
        print(f"    Memory: {resources['memory']/1024:.1f} GB")
        print(f"    GPU memory: {resources['gpu_memory']/1024:.1f} GB")

def analyze_sla_requirements():
    """Analyze SLA requirements"""
    print(f"\n⏱️ SLA requirements analysis:")
    print("="*60)
    
    latencies = []
    throughputs = []
    availabilities = []
    
    print("Model\t\tLatency(ms)\tThroughput(RPS)\tAvailability")
    print("-" * 60)
    
    for model in repo.models.values():
        latency = model.sla_requirements.get('max_latency_p99_ms', 0)
        throughput = model.sla_requirements.get('min_throughput_rps', 0)
        availability = model.sla_requirements.get('availability', 0)
        
        latencies.append(latency)
        throughputs.append(throughput)
        availabilities.append(availability)
        
        print(f"{model.name[:15]:<15}\t{latency:<8}\t{throughput:<12}\t{availability*100:.2f}%")
    
    if latencies:
        print(f"\n📊 Summary statistics:")
        print(f"  Average latency: {sum(latencies)/len(latencies):.1f} ms")
        print(f"  Average throughput: {sum(throughputs)/len(throughputs):.1f} RPS")
        print(f"  Average availability: {(sum(availabilities)/len(availabilities))*100:.3f}%")
        print(f"  Strictest latency requirement: {min(latencies)} ms")
        print(f"  Highest throughput requirement: {max(throughputs)} RPS")

# Run the analyses
visualize_repository_structure()
analyze_resource_usage()
analyze_sla_requirements()
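
The SLA table treats each model's availability target in isolation. For a model that calls its dependencies on every request, a rough composite figure is the product of the individual availabilities. The sketch below makes two simplifying assumptions that are not stated in the lab: failures are independent, and only direct dependencies are on the request path. The name `composite_availability` is hypothetical.

```python
from math import prod
from typing import Dict, List

# Availability targets and dependency edges copied from the models above
AVAILABILITY: Dict[str, float] = {
    "recommendation/user_embedding": 0.9999,
    "recommendation/content_embedding": 0.9995,
    "recommendation/recommendation_ranker": 0.9999,
}
DEPENDS_ON: Dict[str, List[str]] = {
    "recommendation/recommendation_ranker": [
        "recommendation/user_embedding",
        "recommendation/content_embedding",
    ],
}


def composite_availability(model_id: str,
                           availability: Dict[str, float],
                           depends_on: Dict[str, List[str]]) -> float:
    """Availability of a request that needs the model and all of its direct dependencies."""
    needed = [model_id] + depends_on.get(model_id, [])
    # Assuming independent failures, every component must be up at the same time
    return prod(availability[m] for m in needed)


ranker = composite_availability(
    "recommendation/recommendation_ranker", AVAILABILITY, DEPENDS_ON)
print(f"Composite availability: {ranker * 100:.3f}%")
```

Under these assumptions the ranker's effective availability is about 99.930%, lower than its own 99.99% target, which is one reason dependency SLAs are worth validating together rather than one model at a time.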

## 🔄 Model Deployment Order Planning

In [None]:
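class DeploymentPlanner:
    """Model deployment order planner"""
    
    def __init__(self, repository: EnterpriseModelRepository):
        self.repo = repository
    
    def create_deployment_plan(self) -> List[List[str]]:
        """Create a deployment plan: a topological sort that respects dependencies"""
        # Build the dependency graph
        graph = {}
        in_degree = {}
        
        for model_id in self.repo.models.keys():
            graph[model_id] = []
            in_degree[model_id] = 0
        
        # Add edges and count in-degrees
        for model_id, model in self.repo.models.items():
            for dep in model.dependencies:
                if dep in graph:
                    graph[dep].append(model_id)
                    in_degree[model_id] += 1
        
        # Kahn's algorithm, grouped into waves whose models can be deployed together
        deployment_waves = []
        queue = [model_id for model_id, degree in in_degree.items() if degree == 0]
        
        while queue:
            current_wave = queue.copy()
            deployment_waves.append(current_wave)
            queue.clear()
            
            for model_id in current_wave:
                for neighbor in graph[model_id]:
                    in_degree[neighbor] -= 1
                    if in_degree[neighbor] == 0:
                        queue.append(neighbor)
        
        # Models on a dependency cycle never reach in-degree 0 and would be dropped silently
        planned = sum(len(wave) for wave in deployment_waves)
        if planned != len(self.repo.models):
            logger.warning(f"{len(self.repo.models) - planned} model(s) omitted from the plan: circular dependency suspected")
        
        return deployment_waves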
    
    def estimate_deployment_time(self, deployment_waves: List[List[str]]) -> Dict[str, Any]:
        """Estimate deployment time"""
        # Assume deployment time scales with model size
        base_deployment_time = 2  # Base deployment time (minutes)
        
        wave_times = []
        total_time = 0
        
        for i, wave in enumerate(deployment_waves):
            wave_time = 0
            for model_id in wave:
                model = self.repo.models[model_id]
                # Estimate deployment time from the GPU memory requirement
                gpu_memory_gb = model.resource_requirements.get('gpu_memory_mb', 0) / 1024
                deployment_time = base_deployment_time + (gpu_memory_gb * 0.5)
                wave_time = max(wave_time, deployment_time)  # Parallel deployment: a wave takes as long as its slowest model
            
            wave_times.append(wave_time)
            total_time += wave_time
        
        return {
            'total_time_minutes': total_time,
            'wave_times': wave_times,
            'total_waves': len(deployment_waves)
        }
    
    def generate_deployment_script(self, deployment_waves: List[List[str]]) -> str:
        """Generate the deployment script"""
        script_lines = [
            "#!/bin/bash",
            "# Enterprise multi-model deployment script",
            "# Auto-generated - do not edit by hand",
            "",
            "set -e  # Exit immediately on error",
            "",
            "echo '🚀 Starting enterprise multi-model deployment'",
            "echo '==============================================='",
            ""
        ]
        
        for i, wave in enumerate(deployment_waves, 1):
            script_lines.extend([
                f"echo '📦 Deployment wave {i}: {len(wave)} model(s)'",
                "echo '-----------------------------------------------'",
                ""
            ])
            
            # Models in the same wave do not depend on each other; the generated
            # script verifies each one through Triton's readiness endpoint
            for model_id in wave:
                domain, name = model_id.split('/')
                model = self.repo.models[model_id]
                
                script_lines.extend([
                    f"echo '  🔄 Deploying {model_id} v{model.version}'",
                    f"# Health check",
                    f"curl -f http://localhost:8000/v2/models/{name}/ready || {{",
                    f"  echo '❌ Model {name} deployment failed'",
                    f"  exit 1",
                    f"}}",
                    f"echo '✅ Model {name} deployed'",
                    ""
                ])
            
            if i < len(deployment_waves):
                script_lines.extend([
                    "echo '⏳ Waiting for the current wave to finish...'",
                    "sleep 30",
                    ""
                ])
        
        script_lines.extend([
            "echo '✅ All models deployed!'",
            "echo '📊 Final status check...'",
            "# Triton lists models through the repository index endpoint, which expects POST",
            "curl -s -X POST http://localhost:8000/v2/repository/index | jq .",
            "echo '🎉 Enterprise multi-model platform deployed!'"
        ])
        
        return "\n".join(script_lines)

# Create the deployment plan
planner = DeploymentPlanner(repo)
deployment_waves = planner.create_deployment_plan()
time_estimate = planner.estimate_deployment_time(deployment_waves)

print("📅 Dependency-aware deployment plan:")
print("="*60)

for i, wave in enumerate(deployment_waves, 1):
    print(f"\n🌊 Wave {i} (estimated {time_estimate['wave_times'][i-1]:.1f} minutes):")
    for model_id in wave:
        model = repo.models[model_id]
        deps_info = f" (depends on: {', '.join(model.dependencies)})" if model.dependencies else " (no dependencies)"
        print(f"  📦 {model_id} v{model.version}{deps_info}")

print(f"\n⏱️ Estimated total deployment time: {time_estimate['total_time_minutes']:.1f} minutes")
print(f"📊 Total deployment waves: {time_estimate['total_waves']}")

# Generate the deployment script
deployment_script = planner.generate_deployment_script(deployment_waves)
script_path = repo.base_path / "deploy_models.sh"

# The script contains non-ASCII characters, so write it as UTF-8 explicitly
with open(script_path, 'w', encoding='utf-8') as f:
    f.write(deployment_script)

# Make the script executable
os.chmod(script_path, 0o755)

print(f"\n📜 Deployment script generated: {script_path}")
print(f"Run it with: bash {script_path}")
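
The generated script checks each model's readiness once and exits on the first failure, yet a model that is still loading would also fail that single check. A polling helper is one way to tolerate load time. The sketch below is an assumption-laden illustration: `http_ready` and `wait_until_ready` are hypothetical names, and the URL reuses the `/v2/models/<name>/ready` path from the script above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ready(url: str, timeout_s: float = 2.0) -> bool:
    """Return True when the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status all count as "not ready"
        return False


def wait_until_ready(check: Callable[[], bool], timeout_s: float = 60.0,
                     interval_s: float = 2.0) -> bool:
    """Poll `check` until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example wiring (left commented out because it needs a running server):
# ready = wait_until_ready(
#     lambda: http_ready("http://localhost:8000/v2/models/user_embedding/ready"))
```

Passing the check as a callable keeps the retry loop independent of HTTP, so the same helper can be exercised in tests with a stub.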

## 🔍 Repository Validation and Health Check

In [None]:
class RepositoryValidator:
    """Repository validator"""
    
    def __init__(self, repository: EnterpriseModelRepository):
        self.repo = repository
        self.issues = []
        self.warnings = []
    
    def validate_all(self) -> Dict[str, Any]:
        """Run the full validation"""
        self.issues.clear()
        self.warnings.clear()
        
        print("🔍 Running repository health check...")
        print("="*60)
        
        # Individual checks
        self._check_dependency_cycles()
        self._check_resource_constraints()
        self._check_sla_consistency()
        self._check_file_structure()
        self._check_naming_conventions()
        
        return {
            'total_issues': len(self.issues),
            'total_warnings': len(self.warnings),
            'issues': self.issues,
            'warnings': self.warnings,
            'status': 'healthy' if len(self.issues) == 0 else 'needs_attention'
        }
    
    def _check_dependency_cycles(self):
        """Check for circular dependencies"""
        print("🔄 Checking for circular dependencies...")
        
        def has_cycle(model_id: str, visited: set, rec_stack: set) -> bool:
            visited.add(model_id)
            rec_stack.add(model_id)
            
            if model_id in self.repo.models:
                for dep in self.repo.models[model_id].dependencies:
                    if dep not in visited:
                        if has_cycle(dep, visited, rec_stack):
                            return True
                    elif dep in rec_stack:
                        return True
            
            rec_stack.remove(model_id)
            return False
        
        visited = set()
        cycle_found = False  # Track the result directly instead of searching message text
        for model_id in self.repo.models.keys():
            if model_id not in visited:
                if has_cycle(model_id, visited, set()):
                    cycle_found = True
                    self.issues.append(f"Circular dependency detected, involving model: {model_id}")
        
        if not cycle_found:
            print("  ✅ No circular dependencies")
    
    def _check_resource_constraints(self):
        """Check resource constraints"""
        print("💾 Checking resource constraints...")
        
        total_gpu_memory = sum(
            model.resource_requirements.get('gpu_memory_mb', 0)
            for model in self.repo.models.values()
        )
        
        # Assumed system limits (note that register_model checks against a separate 24 GB budget)
        max_gpu_memory = 32 * 1024  # 32 GB, in MB
        max_cpu_cores = 64
        
        if total_gpu_memory > max_gpu_memory:
            self.issues.append(
                f"GPU memory demand exceeds the limit: {total_gpu_memory/1024:.1f}GB > {max_gpu_memory/1024:.0f}GB"
            )
        elif total_gpu_memory > max_gpu_memory * 0.8:
            self.warnings.append(
                f"High GPU memory utilization: {total_gpu_memory/1024:.1f}GB (over 80% of capacity)"
            )
        
        total_cpu_cores = sum(
            model.resource_requirements.get('cpu_cores', 0)
            for model in self.repo.models.values()
        )
        
        if total_cpu_cores > max_cpu_cores:
            self.issues.append(
                f"CPU core demand exceeds the limit: {total_cpu_cores} > {max_cpu_cores}"
            )
        
        print(f"  📊 GPU memory: {total_gpu_memory/1024:.1f}GB / {max_gpu_memory/1024:.0f}GB")
        print(f"  ⚡ CPU cores: {total_cpu_cores} / {max_cpu_cores}")
    
    def _check_sla_consistency(self):
        """Check SLA consistency"""
        print("⏱️ Checking SLA consistency...")
        
        for model_id, model in self.repo.models.items():
            sla = model.sla_requirements
            
            # Check SLA consistency along the dependency chain
            for dep_id in model.dependencies:
                if dep_id in self.repo.models:
                    dep_model = self.repo.models[dep_id]
                    dep_sla = dep_model.sla_requirements
                    
                    # A dependency's latency target should be stricter than its consumer's
                    if dep_sla['max_latency_p99_ms'] >= sla['max_latency_p99_ms']:
                        self.warnings.append(
                            f"Inconsistent SLA: {model_id} depends on {dep_id}, whose latency target is not strict enough"
                        )
            
            # Flag SLA settings that look unrealistic
            if sla['max_latency_p99_ms'] < 10:
                self.warnings.append(
                    f"SLA may be too strict: {model_id} requires {sla['max_latency_p99_ms']}ms latency"
                )
            
            if sla['availability'] > 0.9999:
                self.warnings.append(
                    f"Very high availability target: {model_id} requires {sla['availability']*100:.3f}% availability"
                )
    
    def _check_file_structure(self):
        """Check the file structure"""
        print("📁 Checking file structure...")
        
        for model_id, model in self.repo.models.items():
            domain, name = model_id.split('/')
            model_path = self.repo.base_path / domain / name / model.version
            config_path = model_path / "config.pbtxt"
            
            if not model_path.exists():
                self.issues.append(f"Model directory does not exist: {model_path}")
            elif not config_path.exists():
                self.issues.append(f"Configuration file does not exist: {config_path}")
    
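    def _check_naming_conventions(self):
        """Check naming conventions"""
        print("📝 Checking naming conventions...")
        
        import re
        
        # Model names should use lowercase letters, digits, and underscores
        name_pattern = re.compile(r'^[a-z][a-z0-9_]*$')
        # Versions should look like MAJOR.MINOR with an optional third component
        # (looser than strict semantic versioning, which requires MAJOR.MINOR.PATCH)
        version_pattern = re.compile(r'^\d+\.\d+(\..+)?$')
        
        for model_id, model in self.repo.models.items():
            if not name_pattern.match(model.name):
                self.warnings.append(
                    f"Nonstandard name: {model.name} (use lowercase letters, digits, and underscores)"
                )
            
            if not version_pattern.match(model.version):
                self.warnings.append(
                    f"Nonstandard version: {model.version} (use dotted versions such as 1.0 or 2.1.3)"
                )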

# Run the validation
validator = RepositoryValidator(repo)
validation_result = validator.validate_all()

print(f"\n📋 Validation summary:")
print("="*60)
print(f"🔴 Critical issues: {validation_result['total_issues']}")
print(f"🟡 Warnings: {validation_result['total_warnings']}")
print(f"📊 Overall status: {validation_result['status'].upper()}")

if validation_result['issues']:
    print(f"\n🔴 Issues to fix:")
    for i, issue in enumerate(validation_result['issues'], 1):
        print(f"  {i}. {issue}")

if validation_result['warnings']:
    print(f"\n🟡 Warnings to consider:")
    for i, warning in enumerate(validation_result['warnings'], 1):
        print(f"  {i}. {warning}")

if validation_result['status'] == 'healthy':
    print(f"\n✅ The repository structure is healthy and ready for deployment")
else:
    print(f"\n⚠️ Fix the issues above before deploying")
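
The cycle check reports only the model at which a cycle was detected. When debugging a real dependency graph it usually helps to see the whole loop. The following is a minimal sketch of a depth-first search that returns one cycle as a path; `find_cycle` is a hypothetical helper that operates on a plain dictionary instead of the repository object.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of model IDs, or None if there is none."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so the loop runs from dep back to dep
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


print(find_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))
```

The recursion depth grows with the longest dependency chain, which is fine for a few dozen models; a very deep graph would call for an explicit stack.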

## 📈 Actual Directory Structure

In [None]:
def show_directory_tree(path: Path, prefix: str = "", max_depth: int = 4, current_depth: int = 0):
    """Print the directory tree under `path`, descending at most `max_depth` levels"""
    if current_depth == 0:
        print(f"📁 {path.name}/")
    if current_depth >= max_depth or not path.is_dir():
        return
    
    try:
        children = sorted(path.iterdir())
    except PermissionError:
        print(f"{prefix}    ❌ Permission denied")
        return
    
    for i, child in enumerate(children):
        is_last = i == len(children) - 1
        connector = "└── " if is_last else "├── "
        
        if child.is_dir():
            print(f"{prefix}{connector}📁 {child.name}/")
            # Recurse with an extended prefix so nested entries line up under their parent
            child_prefix = prefix + ("    " if is_last else "│   ")
            show_directory_tree(child, child_prefix, max_depth, current_depth + 1)
        else:
            file_size = child.stat().st_size
            size_str = f" ({file_size} bytes)" if file_size < 1024 else f" ({file_size/1024:.1f} KB)"
            print(f"{prefix}{connector}📄 {child.name}{size_str}")

print("🏗️ Actual enterprise model repository structure:")
print("="*80)
show_directory_tree(repo.base_path)

# Show a sample configuration file
print("\n📋 Sample configuration file (recommendation/user_embedding/1.0/config.pbtxt):")
print("="*80)
config_path = repo.base_path / "recommendation" / "user_embedding" / "1.0" / "config.pbtxt"
if config_path.exists():
    with open(config_path, 'r') as f:
        config_content = f.read()
        print(config_content)
else:
    print("❌ Configuration file not found")

# Show the metadata file
print("\n📊 Repository metadata (repository_metadata.json):")
print("="*80)
if repo.metadata_file.exists():
    with open(repo.metadata_file, 'r', encoding='utf-8') as f:
        metadata_content = f.read()
        # Show only the first 500 characters to keep the output short
        if len(metadata_content) > 500:
            print(metadata_content[:500] + "\n... (truncated)")
        else:
            print(metadata_content)
else:
    print("❌ Metadata file not found")
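
The tree printed above uses this lab's `domain/name/version` layout with dotted version names. Triton's own model repository convention places `config.pbtxt` in the model directory and uses integer-named version subdirectories. The sketch below checks a single model directory against that convention; `find_layout_problems` is a hypothetical helper, and because some backends can derive a configuration automatically, a missing `config.pbtxt` is best treated as advisory.

```python
import tempfile
from pathlib import Path
from typing import List


def find_layout_problems(model_dir: Path) -> List[str]:
    """Report deviations from the <model>/config.pbtxt plus <model>/<integer>/ layout."""
    problems = []
    if not (model_dir / "config.pbtxt").is_file():
        problems.append(f"{model_dir.name}: no config.pbtxt in the model directory")
    version_dirs = sorted(p for p in model_dir.iterdir() if p.is_dir())
    if not any(p.name.isdigit() for p in version_dirs):
        problems.append(f"{model_dir.name}: no integer-named version directory")
    for p in version_dirs:
        if not p.name.isdigit():
            problems.append(f"{model_dir.name}: version directory '{p.name}' is not an integer")
    return problems


with tempfile.TemporaryDirectory() as tmp_dir:
    # Layout as written by this lab: config.pbtxt inside a dotted version directory
    lab_style = Path(tmp_dir) / "user_embedding"
    (lab_style / "1.0").mkdir(parents=True)
    (lab_style / "1.0" / "config.pbtxt").write_text('name: "user_embedding"\n')
    lab_problems = find_layout_problems(lab_style)

    # Layout following the Triton convention
    triton_style = Path(tmp_dir) / "content_embedding"
    (triton_style / "1").mkdir(parents=True)
    (triton_style / "config.pbtxt").write_text('name: "content_embedding"\n')
    triton_problems = find_layout_problems(triton_style)

print(lab_problems)
print(triton_problems)
```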

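## 📝 Lab Summary and Next Steps

### 🎯 Learning Objectives Completed in This Lab

✅ **Enterprise model repository architecture design**
- Built a hierarchical organization structure for models
- Implemented model metadata management
- Added GPU memory and CPU core budget checks as a basis for resource allocation and isolation

✅ **Model dependency management**
- Implemented dependency validation
- Built dependency-aware deployment order planning
- Provided circular dependency detection

✅ **A Netflix-style case study**
- Simulated a multi-model recommendation architecture
- Demonstrated integration of search-domain models
- Recorded and checked per-model SLA requirements

### 🚀 Core Technical Outcomes

1. **EnterpriseModelRepository**: enterprise model repository manager
2. **DeploymentPlanner**: dependency-aware deployment order planner
3. **RepositoryValidator**: repository health check tool
4. **Automated config generation**: Triton configuration files generated from metadata
5. **Deployment script generation**: a generated script that checks model readiness wave by wave

### 📊 Enterprise Features

- **Resource management**: GPU/CPU resource accounting and conflict detection
- **SLA tracking**: latency, throughput, and availability requirements
- **Dependency management**: topological sorting to ensure a correct deployment order
- **Health checks**: multi-level repository validation
- **Enterprise conventions**: naming conventions, version format checks, and related practices

### 🎓 Next Steps

Next up is **Lab-2.2.2: A/B Testing and Version Control**, which covers:
- Implementing traffic-splitting mechanisms
- Setting up statistical significance tests
- Applying progressive deployment strategies (Canary/Blue-Green)
- Designing a model performance comparison framework

### 💡 Further Thinking

1. How would you introduce this model management architecture step by step into an existing enterprise environment?
2. What additional design considerations arise at very large scale, with 100+ models?
3. How would you integrate with an existing MLOps toolchain (Kubeflow, MLflow, etc.)?
4. How can you keep the model repository consistent and portable across multiple clouds?

---

**🎉 Congratulations on completing the enterprise multi-model repository architecture lab! You have built the core components of a Netflix-style model management workflow.**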