# Project: The Forge (Distributed Factory)

**Goal**: Simulate an ML platform that trains a model on a distributed Ray cluster and then serves it through a continuous-batching inference engine.

## Components
1.  **Training**: `src.distributed_ray` (Parameter Server).
2.  **Serving**: `src.serving_optimization` (vLLM Lite).
3.  **Infrastructure**: Simulated Auto-Scaling.
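
The auto-scaling component has no dedicated module in this notebook, so the following is only a minimal sketch of one common rule (target tracking on queue depth). The function name `desired_replicas` and all of its parameters are hypothetical and are not part of `src`.

```python
import math


def desired_replicas(queue_depth, target_per_replica=4,
                     min_replicas=1, max_replicas=8):
    """Return how many serving replicas a queue of this depth calls for.

    Hypothetical target-tracking rule: aim for at most
    `target_per_replica` queued requests per replica, clamped to
    the range [min_replicas, max_replicas].
    """
    needed = math.ceil(queue_depth / target_per_replica)
    return max(min_replicas, min(max_replicas, needed))


# Simulated load: the queue grows, spikes, then drains.
for depth in (0, 20, 100, 6):
    print(f"queue={depth:3d} -> replicas={desired_replicas(depth)}")
```

Clamping to a minimum keeps one replica warm when the queue is empty, and the maximum caps cost during a spike.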

In [None]:
import sys
import os
import time
# Add the parent directory (which contains `src`) to the import path.
sys.path.append(os.path.abspath('..'))

from src.distributed_ray.distributed_trainer import ParameterServer, ray
from src.serving_optimization.vllm_lite import VLLMLite

print("Modules loaded.")

### Step 1: Distributed Training Job
Start a local Ray instance, create a parameter server that coordinates five workers, and run three training epochs to produce a new model version.

In [None]:
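# ignore_reinit_error lets this cell be re-run without raising if Ray is already up.
ray.init(ignore_reinit_error=True)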
print("Ray Cluster Online.")

# Spawn a parameter server with 5 workers
ps = ParameterServer.remote(num_workers=5)

print("Training Loop Started...")
for epoch in range(3):
    # ray.get blocks until the remote run_epoch call returns.
    ray.get(ps.run_epoch.remote())
    print(f"Epoch {epoch + 1}/3 complete.")

print("Training Complete. Model Artifact 'v2.0' saved.")
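
The cell above hides the parameter-server logic behind `run_epoch`. As a rough illustration of the pattern, here is a single-process sketch with no Ray dependency: each worker computes a gradient on its own data shard, and the server averages those gradients before updating the shared weights. `ToyParameterServer`, `worker_gradient`, and `train` are hypothetical names, and the linear-regression task is an assumption chosen for brevity. This is not the implementation in `src.distributed_ray`.

```python
import random


class ToyParameterServer:
    """Holds the shared weights and applies averaged worker gradients."""

    def __init__(self, dim, lr=0.1):
        self.weights = [0.0] * dim
        self.lr = lr

    def apply_gradients(self, grads):
        # Average the per-worker gradients, then take one SGD step.
        n = len(grads)
        for j in range(len(self.weights)):
            avg = sum(g[j] for g in grads) / n
            self.weights[j] -= self.lr * avg


def worker_gradient(weights, shard):
    """Mean-squared-error gradient of a linear model on one data shard."""
    grad = [0.0] * len(weights)
    for x, y in shard:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for j, xi in enumerate(x):
            grad[j] += 2 * err * xi / len(shard)
    return grad


def train(num_workers=5, epochs=200, seed=0):
    rng = random.Random(seed)
    true_w = [2.0, -1.0]  # weights the toy model should recover
    data = []
    for _ in range(100):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, sum(w * xi for w, xi in zip(true_w, x))))
    # Round-robin split so every worker gets an equal-sized shard.
    shards = [data[i::num_workers] for i in range(num_workers)]
    ps = ToyParameterServer(dim=2)
    for _ in range(epochs):
        # Workers run sequentially here; with Ray they would be remote calls.
        grads = [worker_gradient(ps.weights, s) for s in shards]
        ps.apply_gradients(grads)
    return ps.weights


print(train())
```

Because the shards are equal in size, averaging the worker gradients gives the same update as a single full-batch gradient step, which is why the synchronous version converges like ordinary gradient descent.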

### Step 2: Hot-Swap Deploy to vLLM Engine
Simulate deploying the newly trained model: start the optimized serving engine and push a burst of 20 requests through it with continuous batching.

In [None]:
engine = VLLMLite()

# Simulate high-load traffic
user_requests = [f"Request {i}" for i in range(20)]
engine.add_prompts(user_requests)

print("Serving Traffic with Continuous Batching...")
engine.run_engine_loop(num_steps=100)
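
To make the scheduling idea concrete, here is a small sketch of continuous batching: a finished request frees its batch slot immediately, and a waiting request takes that slot on the next step instead of waiting for the whole batch to drain. `Request`, `run_continuous_batching`, and the per-request token counts are hypothetical, and the internals of `VLLMLite` may differ.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    prompt: str
    tokens_left: int  # decode steps still needed (made-up workload)


def run_continuous_batching(requests, max_batch_size=4, num_steps=100):
    """Return a list of (prompt, step_finished) in completion order."""
    waiting = deque(requests)
    running = []
    finished = []
    for step in range(1, num_steps + 1):
        # Refill any free slots from the waiting queue before decoding.
        while waiting and len(running) < max_batch_size:
            running.append(waiting.popleft())
        if not running:
            break  # nothing left to serve
        # One decode step produces one token for every running request.
        for req in running:
            req.tokens_left -= 1
        # Evict completed requests so their slots free up for the next step.
        still_running = []
        for req in running:
            if req.tokens_left <= 0:
                finished.append((req.prompt, step))
            else:
                still_running.append(req)
        running = still_running
    return finished


reqs = [Request(f"Request {i}", 1 + i % 5) for i in range(20)]
for prompt, step in run_continuous_batching(reqs):
    print(f"{prompt} finished at step {step}")
```

With static batching, a one-token request would hold its slot until the longest request in its batch completed. Here it leaves after a single step and the slot is reused right away.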