diff --git a/_posts/2025-11-18-signal-decision.md b/_posts/2025-11-18-signal-decision.md
new file mode 100644
index 0000000..7ab477f
--- /dev/null
+++ b/_posts/2025-11-18-signal-decision.md
@@ -0,0 +1,620 @@
+---
+layout: post
+title: "Signal-Decision Driven Architecture: Reshaping Semantic Routing at Scale"
+author: "vLLM Semantic Router Team"
+image: /assets/logos/vllm-logo-text-light.png
+---
+
+Earlier versions of vLLM Semantic Router relied on classification-based routing: each user query is classified into one of 14 MMLU domain categories and then routed to the corresponding model. While this worked for basic scenarios, we quickly discovered its limitations when building production AI systems for enterprises.
+
+Consider this real-world scenario: A user asks, "I need urgent help reviewing a security vulnerability in my authentication code." The classification-based router would identify this as a "computer science" query and route it to a general coding model. But it misses critical context:
+
+- The **urgency** signal that requires immediate attention
+- The **security** sensitivity that demands specialized expertise and jailbreak protection
+- The **code review** intent that benefits from reasoning capabilities
+- The **authentication** complexity that needs careful analysis
+
+This single example reveals the fundamental constraint: **classification-based routing captures only one dimension of user intent—the domain—while ignoring the rich, multi-dimensional signals embedded in natural language queries.**
+
+Today, we're introducing the **Signal-Decision Architecture**—a complete reimagining of semantic routing that scales from 14 fixed categories to unlimited intelligent routing decisions. This new architecture combines multi-dimensional signal extraction, flexible decision logic with AND/OR operators, and built-in plugin orchestration to deliver production-ready semantic intelligence.
+
+
+
+## The Problem: Why Classification-Based Routing Doesn't Scale
+
+The previous vLLM Semantic Router architecture followed a simple pipeline:
+
+```text
+User Prompt → MMLU Domain Classification → Model Selection
+```
+
+This approach has several fundamental limitations that prevent it from scaling to enterprise requirements.
+
+### Single-Dimensional Analysis
+
+Classification-based routing only considers the **domain** or **subject matter** of the query. It cannot capture:
+
+- **Urgency signals**: "urgent", "immediate", "critical"
+- **Security sensitivity**: "vulnerability", "exploit", "breach"
+- **Intent types**: code review, architecture design, troubleshooting
+- **Complexity levels**: simple FAQ vs. complex reasoning tasks
+- **Compliance requirements**: PII handling, regulatory constraints
+
+**Real Impact**: A medical query about an "urgent patient data breach" gets routed to a medical model without PII protection or security filtering, potentially violating HIPAA requirements.
+
+### Fixed Category Constraint
+
+The router is limited to 14 predefined MMLU categories (math, physics, computer science, business, etc.), which makes it impossible to:
+
+- Create custom categories for specific business domains
+- Define fine-grained routing rules within a domain
+- Scale beyond academic subject classification
+
+**Real Impact**: An enterprise with 50+ specialized use cases (legal contracts, financial compliance, medical diagnostics, code security audits) cannot express their routing requirements within 14 categories.
+
+### Inflexible Logic
+
+The router cannot combine multiple conditions or implement complex routing strategies:
+
+- No support for AND/OR logic: "route to expert model only when query is both urgent AND security-related"
+- No priority-based selection when multiple conditions match
+- No conditional plugin application based on signal combinations
+
+**Real Impact**: Operators cannot implement layered routing strategies like "high-priority security issues get reasoning + jailbreak protection, while general questions get cached responses."
+
+
+
+## Introducing Signal-Decision Architecture
+
+The Signal-Decision Architecture fundamentally reimagines semantic routing by separating signal extraction from routing decisions and introducing a flexible decision engine with built-in plugin orchestration.
+
+### Architecture Overview
+
+
+
+The new architecture introduces three key innovations:
+
+1. **Multi-Signal Extraction**: Captures multiple dimensions of user intent simultaneously
+2. **Decision Engine**: Combines signals using flexible AND/OR logic with priority-based selection
+3. **Plugin Chain**: Provides built-in intelligence for caching, security, and optimization
+
+### Complete Request Flow
+
+
+
+## Core Concepts
+
+### Signals: Multi-Dimensional Prompt Analysis
+
+Instead of relying solely on domain classification, the Signal-Decision Architecture extracts three complementary types of signals from each user query. Each signal type leverages different AI/ML techniques and serves distinct purposes in the routing decision process.
+
+
+
+#### Keyword Signals: Interpretable Pattern Matching
+
+Keyword signals use regex-based pattern matching to detect specific terms or phrases in user queries. This approach provides **human-interpretable routing logic**—you can easily understand why a query matched a particular rule by examining the keywords.
+
+**Technical Approach**:
+
+- Compiled regex patterns for efficient matching
+- Support for AND/OR boolean operators
+- Case-sensitive and case-insensitive modes
+- No model inference required (zero ML overhead)
+
+**Key Advantage - Interpretability**: Unlike black-box ML models, keyword signals provide complete transparency. When debugging routing decisions, you can trace exactly which keywords triggered which rules. This is critical for compliance auditing and troubleshooting production issues.
+
+**Use Cases**:
+
+- Detect urgency markers: "urgent", "immediate", "asap", "critical"
+- Identify security keywords: "vulnerability", "exploit", "breach", "CVE"
+- Flag compliance terms: "HIPAA", "GDPR", "PII", "confidential"
+- Recognize intent patterns: "code review", "architecture design", "troubleshooting"
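+
+As a rough sketch, the compliance use case above could be declared as a keyword signal. The field names (`name`, `patterns`, `operator`) follow the IntelligentRoute example later in this post; the signal name `compliance` is hypothetical.
+
+```yaml
+# Fragment of an IntelligentRoute spec (illustrative only).
+# "compliance" is a hypothetical signal name.
+signals:
+  keywords:
+    - name: "compliance"
+      patterns: ["HIPAA", "GDPR", "PII", "confidential"]
+      operator: "OR"   # any single pattern is enough to raise the signal
+```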
+
+#### Embedding Signals: Scalable Semantic Understanding
+
+Embedding signals use neural embedding models to compute semantic similarity between user queries and candidate phrases. This approach provides **scalable semantic matching** that understands intent beyond exact keyword matches.
+
+**Technical Approach**:
+
+- Pre-computed embeddings for candidate phrases (offline)
+- Runtime query embedding using lightweight models (e.g., sentence-transformers)
+- Cosine similarity computation with configurable thresholds
+- Multiple aggregation strategies: max (any match), mean (average similarity), any (threshold-based)
+
+**Key Advantage - Scalability**: Embedding-based matching scales to thousands of candidate phrases efficiently. Adding new routing patterns doesn't require retraining models—simply add new candidate phrases and compute their embeddings. This enables rapid iteration and customization for specific business domains.
+
+**Use Cases**:
+
+- Intent understanding: "I need help" → "technical support request"
+- Paraphrase matching: "How do I fix this bug?" ≈ "debugging assistance"
+- Cross-lingual routing: Semantic similarity works across languages with multilingual embeddings
+- Fuzzy matching: Handles typos, abbreviations, and informal language
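+
+The paraphrase-matching use case might be expressed as an embedding signal along these lines. This is a sketch that reuses the fields from the IntelligentRoute example later in this post; the signal name `debugging-help` and its candidate phrases are assumptions chosen for illustration.
+
+```yaml
+# Fragment of an IntelligentRoute spec (illustrative only).
+# "debugging-help" and its candidate phrases are hypothetical.
+signals:
+  embeddings:
+    - name: "debugging-help"
+      candidates:
+        - "How do I fix this bug?"
+        - "debugging assistance"
+      threshold: 0.75      # similarity threshold for a match
+      aggregation: "max"   # one of the aggregation strategies listed above
+```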
+
+#### Domain Signals: Dataset-Driven Classification
+
+Domain signals use MMLU-trained classification models to identify the academic or professional domain of user queries. This approach provides **dataset-driven domain expertise** with support for custom domain expansion.
+
+**Technical Approach**:
+
+- Classification models fine-tuned on the MMLU dataset (14 base categories)
+- Support for custom domain expansion via **LoRA adapters**
+- Multi-label classification for queries spanning multiple domains
+- Confidence scoring for domain predictions
+
+**Key Advantage - Extensibility via LoRA**: While the base model covers 14 MMLU categories, enterprises can train lightweight LoRA adapters to add **private domain categories** without retraining the entire model. For example:
+
+- Healthcare: Add "medical_imaging", "clinical_trials", "pharmaceutical_research"
+- Finance: Add "risk_modeling", "algorithmic_trading", "regulatory_compliance"
+- Legal: Add "contract_law", "intellectual_property", "litigation_support"
+
+This enables organizations to extend domain classification to their specific verticals while maintaining the base model's general knowledge.
+
+
+
+**Use Cases**:
+
+- Route to domain-specific expert models (math queries → math-expert)
+- Apply domain-appropriate policies (medical queries → PII protection)
+- Select specialized knowledge bases (legal queries → legal document retrieval)
+- Trigger domain-specific plugins (code queries → syntax validation)
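+
+A minimal sketch of the first use case (math queries → math-expert), assuming the same schema as the IntelligentRoute example later in this post. The signal name `math`, the decision name, the priority value, and the model name `math-expert` are hypothetical.
+
+```yaml
+# Fragment of an IntelligentRoute spec (illustrative only).
+# "math", "math-routing", and "math-expert" are hypothetical names.
+signals:
+  domains:
+    - name: "math"
+      description: "mathematics queries"
+
+decisions:
+  - name: "math-routing"
+    priority: 50
+    rule:
+      operator: "OR"
+      conditions:
+        - signal: "math"
+          type: "domain"
+    modelRef:
+      name: "math-expert"
+```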
+
+### Signal Comparison
+
+| Signal Type | Technique | Interpretability | Scalability | Extensibility |
+|------------|-----------|------------------|-------------|---------------|
+| Keyword | Regex matching | High (transparent rules) | Medium (manual patterns) | Manual addition |
+| Embedding | Neural embeddings | Low (black-box similarity) | High (thousands of phrases) | Add phrases dynamically |
+| Domain | MMLU + LoRA | Medium (domain labels) | Medium (14+ categories) | LoRA adapters for custom domains |
+
+### Why Three Signal Types?
+
+The three signal types are **complementary**, not redundant:
+
+- **Keyword signals** provide fast, interpretable matching for known patterns
+- **Embedding signals** handle semantic variations and scale to large phrase sets
+- **Domain signals** leverage academic datasets and enable domain-specific expertise
+
+By combining all three, the Signal-Decision Architecture captures multiple dimensions of user intent simultaneously, enabling far more sophisticated routing logic than any single signal type could achieve.
+
+### Decisions: Flexible Routing Logic
+
+Decisions are the core routing rules that combine multiple signals using AND/OR logic to determine model selection and plugin configuration.
+
+#### Decision Structure
+
+Each decision consists of:
+
+**Signal Combination**: AND/OR logic combining multiple signal conditions
+
+- AND: All conditions must match (high precision)
+- OR: Any condition matches (high recall)
+
+**Priority**: Integer value for conflict resolution when multiple decisions match
+
+- Higher priority wins
+- Enables layered routing strategies
+
+**Model Reference**: Specifies which model (and optional LoRA adapter) to use
+
+- Supports base models with domain-specific LoRA adapters
+- Configures reasoning mode and effort level
+
+**Plugin Chain**: Ordered list of plugins to apply
+
+- Semantic caching for cost optimization
+- Jailbreak detection for security
+- PII protection for compliance
+- System prompt injection for behavior control
+- Header mutation for metadata propagation
+
+#### Decision Evaluation Flow
+
+
+
+When multiple decisions match, the system selects the one with the highest priority. If no decisions match, the system falls back to the default model.
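+
+To make the priority rule concrete, here is a pared-down sketch of two decisions, using names from the IntelligentRoute example later in this post. A query such as "urgent security vulnerability in my login code" raises both the `urgency` and `security` keyword signals, so both decisions match and the higher priority wins.
+
+```yaml
+# Fragment of an IntelligentRoute spec (illustrative only; plugins omitted).
+decisions:
+  - name: "urgent-security"
+    priority: 100          # selected: highest priority among the matches
+    rule:
+      operator: "AND"      # every condition must match
+      conditions:
+        - signal: "urgency"
+          type: "keyword"
+        - signal: "security"
+          type: "keyword"
+    modelRef:
+      name: "qwen-base"
+      lora: "security-expert"
+  - name: "general-support"
+    priority: 40           # also matches, but loses on priority
+    rule:
+      operator: "OR"       # any one condition is enough
+      conditions:
+        - signal: "security"
+          type: "keyword"
+        - signal: "computer-science"
+          type: "domain"
+    modelRef:
+      name: "qwen-base"
+```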
+
+### Plugins: Built-in Intelligence
+
+The architecture includes five built-in plugins that can be configured per decision:
+
+| Plugin | Purpose | Key Features |
+|--------|---------|--------------|
+| **semantic-cache** | Cache similar queries | Configurable similarity threshold, cost optimization |
+| **jailbreak** | Detect prompt injection attacks | Threshold-based detection, request blocking |
+| **pii** | Protect sensitive information | Redact/hash/mask modes, GDPR/HIPAA compliance |
+| **system_prompt** | Inject custom instructions | Replace or insert mode, role customization |
+| **header_mutation** | Modify HTTP headers | Add/update/delete headers, metadata propagation |
+
+Plugins execute in the configured order, with each plugin able to modify the request, block execution, or add metadata for downstream processing.
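+
+As an illustration of ordering, the sketch below places jailbreak detection ahead of PII handling and caching, so that a blocked request does not reach the later plugins. The plugin types and config keys are taken from the IntelligentRoute example later in this post; the specific ordering is a design choice shown for illustration, not a requirement stated by the project.
+
+```yaml
+# Fragment of a single decision (illustrative only).
+plugins:
+  - type: "jailbreak"        # first in the chain; can block the request
+    config:
+      threshold: 0.8
+      action: "block"
+  - type: "pii"              # next in the configured order
+    config:
+      mode: "redact"
+      entities: ["email"]
+  - type: "semantic-cache"   # last in the configured order
+    config:
+      threshold: 0.85
+      ttl: "1h"
+```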
+
+#### Plugin Chain Execution Flow
+
+
+
+## Scaling from 14 to Unlimited
+
+The Signal-Decision Architecture removes the fundamental constraint of fixed categories. Here's how it scales:
+
+### Traditional Approach (Limited)
+
+```text
+14 MMLU Categories → 14 Routing Rules → 14 Model Selections
+```
+
+**Constraints**:
+
+- Cannot create custom categories
+- Cannot combine multiple conditions
+- Cannot apply different policies per rule
+- Cannot scale beyond domain classification
+
+### Signal-Decision Approach (Unlimited)
+
+```text
+3 Signal Types × N Conditions × AND/OR Logic → Unlimited Decisions
+```
+
+**Capabilities**:
+
+- Create unlimited custom routing rules
+- Combine multiple signals with flexible logic
+- Apply unique plugin chains per decision
+- Scale to enterprise complexity
+
+### Scalability Example
+
+Consider an enterprise IT support system:
+
+**Traditional Routing**: Limited to 14 domain-based routes
+
+- "computer_science" → code-model
+- "engineering" → engineering-model
+- (12 more fixed categories)
+
+**Signal-Decision Routing**: Hundreds of specialized routes
+
+- Urgent + Security + Computer Science → security-expert + reasoning + jailbreak
+- Code Review + High Complexity → architecture-model + reasoning
+- FAQ + General → cached-model + semantic-cache
+- Medical + PII Detected → medical-expert + PII-protection + disclaimer
+- Legal + Confidential → law-expert + PII-hash + audit-headers
+- (Hundreds more custom combinations)
+
+Each decision can have unique model selection, reasoning configuration, and plugin chains—enabling fine-grained control at scale.
+
+## Kubernetes-Native Design
+
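+The Signal-Decision Architecture is designed for cloud-native environments with two Custom Resource Definitions (CRDs): IntelligentPool, which defines the available models and their LoRA adapters, and IntelligentRoute, which defines the signals and routing decisions.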
+
+### Complete Example: Enterprise IT Support System
+
+Let's walk through a complete example that demonstrates how IntelligentPool and IntelligentRoute work together to build an enterprise IT support routing system.
+
+#### IntelligentPool: Define Model Pool
+
+First, we define the available models and their LoRA adapters:
+
+```yaml
+apiVersion: vllm.ai/v1alpha1
+kind: IntelligentPool
+metadata:
+  name: enterprise-model-pool
+  namespace: vllm-system
+spec:
+  defaultModel: "general-assistant"
+  models:
+    - name: "qwen-base"
+      reasoningFamily: "qwen3"
+      loras:
+        - name: "security-expert"
+          description: "Specialized for security vulnerability analysis"
+        - name: "code-reviewer"
+          description: "Expert in code review and best practices"
+        - name: "architecture-expert"
+          description: "System architecture and design patterns"
+        - name: "devops-expert"
+          description: "Infrastructure and deployment expertise"
+    - name: "general-assistant"
+      reasoningFamily: "gpt"
+```
+
+This pool defines:
+
+- A base model "qwen-base" with 4 specialized LoRA adapters
+- A fallback "general-assistant" model for non-specialized queries
+- Reasoning family configuration for each model
+
+#### IntelligentRoute: Define Routing Logic
+
+Next, we define the routing decisions with multi-signal extraction:
+
+```yaml
+apiVersion: vllm.ai/v1alpha1
+kind: IntelligentRoute
+metadata:
+  name: enterprise-model-routing
+  namespace: vllm-system
+spec:
+  # Define signals for extraction
+  signals:
+    keywords:
+      - name: "urgency"
+        patterns: ["urgent", "critical", "immediate", "asap", "emergency"]
+        operator: "OR"
+      - name: "security"
+        patterns: ["security", "vulnerability", "exploit", "breach", "CVE"]
+        operator: "OR"
+      - name: "code-review"
+        patterns: ["code review", "pull request", "PR review", "review code"]
+        operator: "OR"
+
+    embeddings:
+      - name: "technical-support"
+        candidates:
+          - "I need help with technical issues"
+          - "troubleshooting assistance required"
+          - "debugging support needed"
+        threshold: 0.75
+        aggregation: "max"
+      - name: "architecture-design"
+        candidates:
+          - "system architecture design"
+          - "design patterns and best practices"
+          - "scalability and performance optimization"
+        threshold: 0.80
+        aggregation: "max"
+
+    domains:
+      - name: "computer-science"
+        description: "computer science and programming queries"
+
+  # Define routing decisions
+  decisions:
+    # Priority 100: Urgent security issues
+    - name: "urgent-security"
+      description: "Critical security vulnerabilities requiring immediate expert attention"
+      priority: 100
+      rule:
+        operator: "AND"
+        conditions:
+          - signal: "urgency"
+            type: "keyword"
+          - signal: "security"
+            type: "keyword"
+          - signal: "computer-science"
+            type: "domain"
+      modelRef:
+        name: "qwen-base"
+        lora: "security-expert"
+        reasoning:
+          enabled: true
+          effort: "high"
+      plugins:
+        - type: "jailbreak"
+          config:
+            threshold: 0.8
+            action: "block"
+        - type: "pii"
+          config:
+            mode: "redact"
+            entities: ["email", "ip_address", "api_key"]
+        - type: "header_mutation"
+          config:
+            add:
+              - key: "X-Priority"
+                value: "critical"
+              - key: "X-Security-Alert"
+                value: "true"
+
+    # Priority 80: Code review requests
+    - name: "code-review"
+      description: "Code review and best practices guidance"
+      priority: 80
+      rule:
+        operator: "AND"
+        conditions:
+          - signal: "code-review"
+            type: "keyword"
+          - signal: "computer-science"
+            type: "domain"
+      modelRef:
+        name: "qwen-base"
+        lora: "code-reviewer"
+        reasoning:
+          enabled: true
+          effort: "medium"
+      plugins:
+        - type: "semantic-cache"
+          config:
+            threshold: 0.85
+            ttl: "1h"
+        - type: "system_prompt"
+          config:
+            mode: "insert"
+            content: "You are an expert code reviewer. Focus on security, performance, and maintainability."
+
+    # Priority 60: Architecture design
+    - name: "architecture-design"
+      description: "System architecture and design patterns"
+      priority: 60
+      rule:
+        operator: "AND"
+        conditions:
+          - signal: "architecture-design"
+            type: "embedding"
+          - signal: "computer-science"
+            type: "domain"
+      modelRef:
+        name: "qwen-base"
+        lora: "architecture-expert"
+        reasoning:
+          enabled: true
+          effort: "high"
+      plugins:
+        - type: "semantic-cache"
+          config:
+            threshold: 0.80
+            ttl: "2h"
+
+    # Priority 40: General technical support
+    - name: "general-support"
+      description: "General technical support and troubleshooting"
+      priority: 40
+      rule:
+        operator: "OR"
+        conditions:
+          - signal: "technical-support"
+            type: "embedding"
+          - signal: "computer-science"
+            type: "domain"
+      modelRef:
+        name: "qwen-base"
+      plugins:
+        - type: "semantic-cache"
+          config:
+            threshold: 0.90
+            ttl: "4h"
+```
+
+This configuration demonstrates:
+
+**Multi-Signal Extraction**:
+
+- 3 keyword signals (urgency, security, code-review)
+- 2 embedding signals (technical-support, architecture-design)
+- 1 domain signal (computer-science)
+
+**Layered Decision Logic**:
+
+- Priority 100: Urgent + Security + CS → security-expert + high reasoning + jailbreak + PII protection
+- Priority 80: Code Review + CS → code-reviewer + medium reasoning + cache + custom prompt
+- Priority 60: Architecture Design + CS → architecture-expert + high reasoning + cache
+- Priority 40: General Support → base model + aggressive cache
+
+**Plugin Orchestration**:
+
+- Security-critical queries get jailbreak detection and PII protection
+- Code reviews get semantic caching and custom system prompts
+- Architecture queries get longer cache TTL (2h vs 1h)
+- General queries get the longest cache TTL (4h), paired with a strict 0.90 similarity threshold
+
+**Fallback Behavior**:
+
+- If no decision matches, route to defaultModel ("general-assistant")
+- If multiple decisions match, select highest priority
+
+### Dynamic Configuration Flow
+
+
+
+The Kubernetes-native design enables:
+
+- Zero-downtime configuration updates
+- GitOps workflows for change management
+- Multi-cluster deployment strategies
+- Namespace-based isolation and RBAC
+
+## Real-World Applications
+
+### Enterprise IT Support
+
+**Challenge**: Route support tickets based on urgency, technical domain, and security sensitivity.
+
+**Solution**: Multi-layered decisions with priority-based selection
+
+- Priority 100: Urgent + Security + CS → security-expert + reasoning + jailbreak
+- Priority 80: Technical Support + Debugging → code-expert + semantic-cache
+- Priority 60: General Questions → general-model + aggressive-cache
+
+**Results**: Appropriate model selection, cost optimization through caching, security protection for sensitive issues.
+
+### Healthcare Platform
+
+**Challenge**: HIPAA compliance requiring PII protection and medical disclaimers.
+
+**Solution**: Domain-based routing with mandatory compliance plugins
+
+- Health Domain → medical-expert + PII-redaction + disclaimer-prompt + audit-headers
+
+**Results**: Automatic PII protection, consistent disclaimers, audit trail for compliance.
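+
+One way this route could look, sketched with the schema from the IntelligentRoute example earlier in this post. The decision name, priority, `health` signal, `medical-expert` model, header key, and prompt text are all hypothetical placeholders, not values from a real deployment.
+
+```yaml
+# Fragment of an IntelligentRoute spec (illustrative only).
+# All names and values below are hypothetical.
+decisions:
+  - name: "health-compliance"
+    priority: 90
+    rule:
+      operator: "OR"
+      conditions:
+        - signal: "health"
+          type: "domain"
+    modelRef:
+      name: "medical-expert"
+    plugins:
+      - type: "pii"
+        config:
+          mode: "redact"
+          entities: ["email"]
+      - type: "system_prompt"
+        config:
+          mode: "insert"
+          content: "State clearly that responses are not medical advice."
+      - type: "header_mutation"
+        config:
+          add:
+            - key: "X-Audit"
+              value: "true"
+```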
+
+### Financial Services
+
+**Challenge**: Multi-layered security with PII protection, jailbreak detection, and cost optimization.
+
+**Solution**: Comprehensive plugin chain for financial queries
+
+- Economics Domain → finance-expert + jailbreak + PII-hash + disclaimer + cache + compliance-headers
+
+**Results**: Enterprise-grade security, regulatory compliance, cost efficiency.
+
+### Educational Platform
+
+**Challenge**: Personalized learning experiences based on subject and learning intent.
+
+**Solution**: Intent-based routing with customized teaching styles
+
+- Math + Learning Intent → math-expert + reasoning + patient-tutor-prompt + cache
+- Science + Tutorial → science-expert + engaging-educator-prompt
+
+**Results**: Personalized teaching approaches, appropriate reasoning for complex topics, cost optimization.
+
+### Code Assistant
+
+**Challenge**: Different complexity levels require different model capabilities.
+
+**Solution**: Complexity-aware routing with reasoning control
+
+- Architecture Design → reasoning-model + high-effort + complexity-header
+- Code Review → code-expert + medium-reasoning + cache
+- Simple Questions → code-expert + cache-only
+
+**Results**: Optimal model selection, cost-effective reasoning usage, fast responses for simple queries.
+
+## Future Roadmap
+
+The Signal-Decision Architecture provides a foundation for future enhancements across multiple dimensions:
+
+### Routing Core Performance Optimization
+
+**Radix Tree for Keyword Matching**: Replace regex-based keyword matching with radix tree data structures to achieve faster pattern matching for thousands of keyword patterns. This will enable enterprises to define 10,000+ keyword rules with consistent performance.
+
+**HNSW Index for Embedding Search**: Implement Hierarchical Navigable Small World (HNSW) graphs for approximate nearest neighbor search in embedding space. This will significantly improve embedding signal performance while supporting millions of candidate phrases.
+
+**Parallel LoRA for Decode-Only Models**: Enable parallel execution of multiple LoRA adapters during the decode phase, allowing a single base model to serve multiple specialized domains simultaneously. This will reduce model switching overhead and improve throughput for multi-tenant deployments.
+
+### Feature Enhancements
+
+**Visual Configuration Console**: Web-based UI for creating and managing decisions without YAML editing, with real-time validation and testing capabilities.
+
+**Custom Plugin Framework**: An SDK for developing custom plugins, together with a community marketplace, enabling enterprises to build domain-specific intelligence layers.
+
+**Advanced Analytics**: Real-time monitoring of decision performance, signal effectiveness, and cost optimization opportunities with ML-driven recommendations.
+
+**Model Evaluation via Multi-Turn Dialogue**: Intelligent model selection through multi-turn conversation evaluation. The system automatically engages multiple candidate models in parallel conversations, using LLM-as-a-Judge to assess response quality across dimensions like coherence, relevance, safety, and domain expertise. This enables dynamic routing optimization based on actual model performance rather than static rules.
+
+**Intent-Aware Internal/External Model Selection**: Smart routing between internal private models and external APIs (OpenAI, Anthropic, etc.) based on intent analysis. Sensitive data and proprietary information automatically route to internal models for privacy and compliance, while general queries leverage external APIs for broader knowledge. Cost, latency, and compliance requirements are balanced dynamically based on query characteristics.
+
+
+
+## Conclusion
+
+The Signal-Decision Architecture represents a fundamental shift in how we think about semantic routing. By moving from fixed classification to flexible signal-based decisions, we enable:
+
+**Unlimited Scalability**: From 14 categories to unlimited custom routing rules
+
+**Multi-Dimensional Intelligence**: Capture keyword, embedding, and domain signals simultaneously
+
+**Flexible Logic**: Combine signals with AND/OR operators and priority-based selection
+
+**Built-in Security**: Integrated plugins for jailbreak detection, PII protection, and compliance
+
+**Cloud-Native Design**: Kubernetes CRDs with dynamic configuration and zero-downtime updates
+
+Whether you're building an enterprise AI gateway, a multi-tenant SaaS platform, or an industry-specific AI assistant, the Signal-Decision Architecture provides the scalability, flexibility, and intelligence needed for production deployments.
+
+## Getting Started
+
+Ready to try Signal-Decision routing?
+
+Join our community to share feedback and learn from other users building intelligent routing systems at scale.
diff --git a/assets/figures/2025-11-10-bitwise-exact-rl/bf16-rounding-example.png b/assets/figures/2025-11-10-bitwise-exact-rl/bf16-rounding-example.png
index 9583712..316b22c 100644
Binary files a/assets/figures/2025-11-10-bitwise-exact-rl/bf16-rounding-example.png and b/assets/figures/2025-11-10-bitwise-exact-rl/bf16-rounding-example.png differ
diff --git a/assets/figures/2025-11-10-bitwise-exact-rl/floating-point-representation.png b/assets/figures/2025-11-10-bitwise-exact-rl/floating-point-representation.png
index 165c6aa..b002190 100644
Binary files a/assets/figures/2025-11-10-bitwise-exact-rl/floating-point-representation.png and b/assets/figures/2025-11-10-bitwise-exact-rl/floating-point-representation.png differ
diff --git a/assets/figures/2025-11-10-bitwise-exact-rl/reward-comparison.png b/assets/figures/2025-11-10-bitwise-exact-rl/reward-comparison.png
index 5603566..24b340c 100644
Binary files a/assets/figures/2025-11-10-bitwise-exact-rl/reward-comparison.png and b/assets/figures/2025-11-10-bitwise-exact-rl/reward-comparison.png differ
diff --git a/assets/figures/2025-11-10-bitwise-exact-rl/rl-script-demo.png b/assets/figures/2025-11-10-bitwise-exact-rl/rl-script-demo.png
index 3cca6be..4d2c7dd 100644
Binary files a/assets/figures/2025-11-10-bitwise-exact-rl/rl-script-demo.png and b/assets/figures/2025-11-10-bitwise-exact-rl/rl-script-demo.png differ
diff --git a/assets/figures/2025-11-10-bitwise-exact-rl/rounding-sequence.png b/assets/figures/2025-11-10-bitwise-exact-rl/rounding-sequence.png
index 6913b2e..b2588dd 100644
Binary files a/assets/figures/2025-11-10-bitwise-exact-rl/rounding-sequence.png and b/assets/figures/2025-11-10-bitwise-exact-rl/rounding-sequence.png differ
diff --git a/assets/figures/2025-11-10-bitwise-exact-rl/tensorboard-plot.png b/assets/figures/2025-11-10-bitwise-exact-rl/tensorboard-plot.png
index ab89acf..0b30b2e 100644
Binary files a/assets/figures/2025-11-10-bitwise-exact-rl/tensorboard-plot.png and b/assets/figures/2025-11-10-bitwise-exact-rl/tensorboard-plot.png differ
diff --git a/assets/figures/2025-shm-ipc-cache/processes1.png b/assets/figures/2025-shm-ipc-cache/processes1.png
index 64c8520..31949e0 100644
Binary files a/assets/figures/2025-shm-ipc-cache/processes1.png and b/assets/figures/2025-shm-ipc-cache/processes1.png differ
diff --git a/assets/figures/2025-shm-ipc-cache/processes2.png b/assets/figures/2025-shm-ipc-cache/processes2.png
index aaeb1b9..15aafe4 100644
Binary files a/assets/figures/2025-shm-ipc-cache/processes2.png and b/assets/figures/2025-shm-ipc-cache/processes2.png differ
diff --git a/assets/figures/2025-shm-ipc-cache/shared_memory_object_store.png b/assets/figures/2025-shm-ipc-cache/shared_memory_object_store.png
index 5e0720b..ad433cb 100644
Binary files a/assets/figures/2025-shm-ipc-cache/shared_memory_object_store.png and b/assets/figures/2025-shm-ipc-cache/shared_memory_object_store.png differ
diff --git a/assets/figures/2025-torch-compile/figure1.png b/assets/figures/2025-torch-compile/figure1.png
index 675465a..17cdbd4 100644
Binary files a/assets/figures/2025-torch-compile/figure1.png and b/assets/figures/2025-torch-compile/figure1.png differ
diff --git a/assets/figures/2025-torch-compile/figure2.png b/assets/figures/2025-torch-compile/figure2.png
index 5d7bdf6..791097e 100644
Binary files a/assets/figures/2025-torch-compile/figure2.png and b/assets/figures/2025-torch-compile/figure2.png differ
diff --git a/assets/figures/2025-torch-compile/figure3.png b/assets/figures/2025-torch-compile/figure3.png
index c58f6fe..9e32c89 100644
Binary files a/assets/figures/2025-torch-compile/figure3.png and b/assets/figures/2025-torch-compile/figure3.png differ
diff --git a/assets/figures/2025-torch-compile/figure4.png b/assets/figures/2025-torch-compile/figure4.png
index 973fa92..5fdcf88 100644
Binary files a/assets/figures/2025-torch-compile/figure4.png and b/assets/figures/2025-torch-compile/figure4.png differ
diff --git a/assets/figures/2025-torch-compile/figure5_b.png b/assets/figures/2025-torch-compile/figure5_b.png
index ffa3693..2fb1742 100644
Binary files a/assets/figures/2025-torch-compile/figure5_b.png and b/assets/figures/2025-torch-compile/figure5_b.png differ
diff --git a/assets/figures/2025-torch-compile/figure6.png b/assets/figures/2025-torch-compile/figure6.png
index 579d276..a2e647e 100644
Binary files a/assets/figures/2025-torch-compile/figure6.png and b/assets/figures/2025-torch-compile/figure6.png differ
diff --git a/assets/figures/2025-torch-compile/figure7.png b/assets/figures/2025-torch-compile/figure7.png
index 258e17e..d237222 100644
Binary files a/assets/figures/2025-torch-compile/figure7.png and b/assets/figures/2025-torch-compile/figure7.png differ
diff --git a/assets/figures/2025-torch-compile/figure8.png b/assets/figures/2025-torch-compile/figure8.png
index 81b236e..a17cb12 100644
Binary files a/assets/figures/2025-torch-compile/figure8.png and b/assets/figures/2025-torch-compile/figure8.png differ
diff --git a/assets/figures/2025-vllm-anatomy/chunked_pt1.png b/assets/figures/2025-vllm-anatomy/chunked_pt1.png
index b51abe0..50f4e23 100644
Binary files a/assets/figures/2025-vllm-anatomy/chunked_pt1.png and b/assets/figures/2025-vllm-anatomy/chunked_pt1.png differ
diff --git a/assets/figures/2025-vllm-anatomy/dpenginecoreproc.png b/assets/figures/2025-vllm-anatomy/dpenginecoreproc.png
index 6c54cc3..12b3e7f 100644
Binary files a/assets/figures/2025-vllm-anatomy/dpenginecoreproc.png and b/assets/figures/2025-vllm-anatomy/dpenginecoreproc.png differ
diff --git a/assets/figures/2025-vllm-anatomy/engine_constructor.png b/assets/figures/2025-vllm-anatomy/engine_constructor.png
index 44fcf31..7e3b811 100644
Binary files a/assets/figures/2025-vllm-anatomy/engine_constructor.png and b/assets/figures/2025-vllm-anatomy/engine_constructor.png differ
diff --git a/assets/figures/2025-vllm-anatomy/engine_loop.png b/assets/figures/2025-vllm-anatomy/engine_loop.png
index 165bf57..16790f2 100644
Binary files a/assets/figures/2025-vllm-anatomy/engine_loop.png and b/assets/figures/2025-vllm-anatomy/engine_loop.png differ
diff --git a/assets/figures/2025-vllm-anatomy/fsm.png b/assets/figures/2025-vllm-anatomy/fsm.png
index ca80f04..dba842b 100644
Binary files a/assets/figures/2025-vllm-anatomy/fsm.png and b/assets/figures/2025-vllm-anatomy/fsm.png differ
diff --git a/assets/figures/2025-vllm-anatomy/fsm2.png b/assets/figures/2025-vllm-anatomy/fsm2.png
index 679efb2..8945aa0 100644
Binary files a/assets/figures/2025-vllm-anatomy/fsm2.png and b/assets/figures/2025-vllm-anatomy/fsm2.png differ
diff --git a/assets/figures/2025-vllm-anatomy/fwd_pass.png b/assets/figures/2025-vllm-anatomy/fwd_pass.png
index 89d066e..bba2bed 100644
Binary files a/assets/figures/2025-vllm-anatomy/fwd_pass.png and b/assets/figures/2025-vllm-anatomy/fwd_pass.png differ
diff --git a/assets/figures/2025-vllm-anatomy/kv_cache_blocks.png b/assets/figures/2025-vllm-anatomy/kv_cache_blocks.png
index 2988826..710a498 100644
Binary files a/assets/figures/2025-vllm-anatomy/kv_cache_blocks.png and b/assets/figures/2025-vllm-anatomy/kv_cache_blocks.png differ
diff --git a/assets/figures/2025-vllm-anatomy/latency_diagram.png b/assets/figures/2025-vllm-anatomy/latency_diagram.png
index b514d81..d2072c7 100644
Binary files a/assets/figures/2025-vllm-anatomy/latency_diagram.png and b/assets/figures/2025-vllm-anatomy/latency_diagram.png differ
diff --git a/assets/figures/2025-vllm-anatomy/multiprocexecutor.png b/assets/figures/2025-vllm-anatomy/multiprocexecutor.png
index 59c2977..4203c2e 100644
Binary files a/assets/figures/2025-vllm-anatomy/multiprocexecutor.png and b/assets/figures/2025-vllm-anatomy/multiprocexecutor.png differ
diff --git a/assets/figures/2025-vllm-anatomy/pd.png b/assets/figures/2025-vllm-anatomy/pd.png
index 3dec052..7c236c5 100644
Binary files a/assets/figures/2025-vllm-anatomy/pd.png and b/assets/figures/2025-vllm-anatomy/pd.png differ
diff --git a/assets/figures/2025-vllm-anatomy/prefix_pt1.png b/assets/figures/2025-vllm-anatomy/prefix_pt1.png
index 3053390..206ed48 100644
Binary files a/assets/figures/2025-vllm-anatomy/prefix_pt1.png and b/assets/figures/2025-vllm-anatomy/prefix_pt1.png differ
diff --git a/assets/figures/2025-vllm-anatomy/prefix_pt2.png b/assets/figures/2025-vllm-anatomy/prefix_pt2.png
index 022e176..6cb450e 100644
Binary files a/assets/figures/2025-vllm-anatomy/prefix_pt2.png and b/assets/figures/2025-vllm-anatomy/prefix_pt2.png differ
diff --git a/assets/figures/2025-vllm-anatomy/prefix_pt3.png b/assets/figures/2025-vllm-anatomy/prefix_pt3.png
index bbaed33..cc8370f 100644
Binary files a/assets/figures/2025-vllm-anatomy/prefix_pt3.png and b/assets/figures/2025-vllm-anatomy/prefix_pt3.png differ
diff --git a/assets/figures/2025-vllm-anatomy/roofline.png b/assets/figures/2025-vllm-anatomy/roofline.png
index 4a74598..a77e43c 100644
Binary files a/assets/figures/2025-vllm-anatomy/roofline.png and b/assets/figures/2025-vllm-anatomy/roofline.png differ
diff --git a/assets/figures/2025-vllm-anatomy/server_setup.png b/assets/figures/2025-vllm-anatomy/server_setup.png
index 4186d77..5fc5b86 100644
Binary files a/assets/figures/2025-vllm-anatomy/server_setup.png and b/assets/figures/2025-vllm-anatomy/server_setup.png differ
diff --git a/assets/figures/2025-vllm-anatomy/specdec_pt1.png b/assets/figures/2025-vllm-anatomy/specdec_pt1.png
index 917dfcb..46b523e 100644
Binary files a/assets/figures/2025-vllm-anatomy/specdec_pt1.png and b/assets/figures/2025-vllm-anatomy/specdec_pt1.png differ
diff --git a/assets/figures/2025-vllm-anatomy/specdec_pt2.png b/assets/figures/2025-vllm-anatomy/specdec_pt2.png
index a6feeb3..fe7635a 100644
Binary files a/assets/figures/2025-vllm-anatomy/specdec_pt2.png and b/assets/figures/2025-vllm-anatomy/specdec_pt2.png differ
diff --git a/assets/figures/2025-vllm-nvidia-nemotron/figure1.png b/assets/figures/2025-vllm-nvidia-nemotron/figure1.png
index 82361bc..aad2f71 100644
Binary files a/assets/figures/2025-vllm-nvidia-nemotron/figure1.png and b/assets/figures/2025-vllm-nvidia-nemotron/figure1.png differ
diff --git a/assets/figures/2025-vllm-nvidia-nemotron/figure2.png b/assets/figures/2025-vllm-nvidia-nemotron/figure2.png
index e4fa070..d823b28 100644
Binary files a/assets/figures/2025-vllm-nvidia-nemotron/figure2.png and b/assets/figures/2025-vllm-nvidia-nemotron/figure2.png differ
diff --git a/assets/figures/2025-vllm-on-intel-arc/perf-figure1.png b/assets/figures/2025-vllm-on-intel-arc/perf-figure1.png
old mode 100755
new mode 100644
index 66d709d..2bac669
Binary files a/assets/figures/2025-vllm-on-intel-arc/perf-figure1.png and b/assets/figures/2025-vllm-on-intel-arc/perf-figure1.png differ
diff --git a/assets/figures/2025-vllm-on-intel-arc/perf-figure2.png b/assets/figures/2025-vllm-on-intel-arc/perf-figure2.png
old mode 100755
new mode 100644
index f6666d5..96b6d8c
Binary files a/assets/figures/2025-vllm-on-intel-arc/perf-figure2.png and b/assets/figures/2025-vllm-on-intel-arc/perf-figure2.png differ
diff --git a/assets/figures/2025-vllm-on-intel-arc/perf-figure3.png b/assets/figures/2025-vllm-on-intel-arc/perf-figure3.png
old mode 100755
new mode 100644
index 3db598d..f1cf796
Binary files a/assets/figures/2025-vllm-on-intel-arc/perf-figure3.png and b/assets/figures/2025-vllm-on-intel-arc/perf-figure3.png differ
diff --git a/assets/figures/2025-vllm-on-intel-arc/persistent-kernel1.png b/assets/figures/2025-vllm-on-intel-arc/persistent-kernel1.png
old mode 100755
new mode 100644
index b4ea55d..d425c0f
Binary files a/assets/figures/2025-vllm-on-intel-arc/persistent-kernel1.png and b/assets/figures/2025-vllm-on-intel-arc/persistent-kernel1.png differ
diff --git a/assets/figures/2025-vllm-on-intel-arc/persistent-kernel2.png b/assets/figures/2025-vllm-on-intel-arc/persistent-kernel2.png
old mode 100755
new mode 100644
index 3a458b1..959d0cb
Binary files a/assets/figures/2025-vllm-on-intel-arc/persistent-kernel2.png and b/assets/figures/2025-vllm-on-intel-arc/persistent-kernel2.png differ
diff --git a/assets/figures/2025-vllm-on-intel-arc/thread-load1.png b/assets/figures/2025-vllm-on-intel-arc/thread-load1.png
old mode 100755
new mode 100644
index 3bfc748..5faf160
Binary files a/assets/figures/2025-vllm-on-intel-arc/thread-load1.png and b/assets/figures/2025-vllm-on-intel-arc/thread-load1.png differ
diff --git a/assets/figures/2025-vllm-on-intel-arc/thread-load2.png b/assets/figures/2025-vllm-on-intel-arc/thread-load2.png
old mode 100755
new mode 100644
index 28aabec..ace59b6
Binary files a/assets/figures/2025-vllm-on-intel-arc/thread-load2.png and b/assets/figures/2025-vllm-on-intel-arc/thread-load2.png differ
diff --git a/assets/figures/2025-vllm-sleep-mode/sleepmode.png b/assets/figures/2025-vllm-sleep-mode/sleepmode.png
index 4a918ec..93c5ffa 100644
Binary files a/assets/figures/2025-vllm-sleep-mode/sleepmode.png and b/assets/figures/2025-vllm-sleep-mode/sleepmode.png differ
diff --git a/assets/figures/agent-lightning/1_rewards.png b/assets/figures/agent-lightning/1_rewards.png
index 0bd0711..3e348f0 100644
Binary files a/assets/figures/agent-lightning/1_rewards.png and b/assets/figures/agent-lightning/1_rewards.png differ
diff --git a/assets/figures/agent-lightning/2_having.png b/assets/figures/agent-lightning/2_having.png
index 2cf681c..4217167 100644
Binary files a/assets/figures/agent-lightning/2_having.png and b/assets/figures/agent-lightning/2_having.png differ
diff --git a/assets/figures/agent-lightning/3_agl.png b/assets/figures/agent-lightning/3_agl.png
index 886d0e2..af9b60a 100644
Binary files a/assets/figures/agent-lightning/3_agl.png and b/assets/figures/agent-lightning/3_agl.png differ
diff --git a/assets/figures/agent-lightning/4_tasks-spans-loop.svg b/assets/figures/agent-lightning/4_tasks-spans-loop.svg
index 75bf3d3..6f98f7e 100644
--- a/assets/figures/agent-lightning/4_tasks-spans-loop.svg
+++ b/assets/figures/agent-lightning/4_tasks-spans-loop.svg
@@ -1,17 +1 @@
-
+
\ No newline at end of file
diff --git a/assets/figures/aibrix/aibrix-diagram.png b/assets/figures/aibrix/aibrix-diagram.png
index 1797e01..0277861 100644
Binary files a/assets/figures/aibrix/aibrix-diagram.png and b/assets/figures/aibrix/aibrix-diagram.png differ
diff --git a/assets/figures/annimation0.gif b/assets/figures/annimation0.gif
index 9fccd03..9f1128d 100644
Binary files a/assets/figures/annimation0.gif and b/assets/figures/annimation0.gif differ
diff --git a/assets/figures/annimation1.gif b/assets/figures/annimation1.gif
index ff56fde..11b5554 100644
Binary files a/assets/figures/annimation1.gif and b/assets/figures/annimation1.gif differ
diff --git a/assets/figures/annimation2.gif b/assets/figures/annimation2.gif
index 260f05c..928a0e0 100644
Binary files a/assets/figures/annimation2.gif and b/assets/figures/annimation2.gif differ
diff --git a/assets/figures/annimation3.gif b/assets/figures/annimation3.gif
index 1f7446d..4f3053b 100644
Binary files a/assets/figures/annimation3.gif and b/assets/figures/annimation3.gif differ
diff --git a/assets/figures/beyond-text/io-plugins-flow.png b/assets/figures/beyond-text/io-plugins-flow.png
index 27291d5..37736e0 100644
Binary files a/assets/figures/beyond-text/io-plugins-flow.png and b/assets/figures/beyond-text/io-plugins-flow.png differ
diff --git a/assets/figures/beyond-text/models-diff.png b/assets/figures/beyond-text/models-diff.png
index daf1d70..72bec6e 100644
Binary files a/assets/figures/beyond-text/models-diff.png and b/assets/figures/beyond-text/models-diff.png differ
diff --git a/assets/figures/beyond-text/prithvi-prediction.png b/assets/figures/beyond-text/prithvi-prediction.png
index b0c836d..19e6f5e 100644
Binary files a/assets/figures/beyond-text/prithvi-prediction.png and b/assets/figures/beyond-text/prithvi-prediction.png differ
diff --git a/assets/figures/blackwell-inferencemax/gpt-oss-120b-1k-1k.png b/assets/figures/blackwell-inferencemax/gpt-oss-120b-1k-1k.png
index 00926ac..1e79027 100644
Binary files a/assets/figures/blackwell-inferencemax/gpt-oss-120b-1k-1k.png and b/assets/figures/blackwell-inferencemax/gpt-oss-120b-1k-1k.png differ
diff --git a/assets/figures/blackwell-inferencemax/llama-70b-1k-8k.png b/assets/figures/blackwell-inferencemax/llama-70b-1k-8k.png
index 2e1a9cc..e44580c 100644
Binary files a/assets/figures/blackwell-inferencemax/llama-70b-1k-8k.png and b/assets/figures/blackwell-inferencemax/llama-70b-1k-8k.png differ
diff --git a/assets/figures/deepseek-v3-2/dsa-explained.png b/assets/figures/deepseek-v3-2/dsa-explained.png
index d1aec6e..381217b 100644
Binary files a/assets/figures/deepseek-v3-2/dsa-explained.png and b/assets/figures/deepseek-v3-2/dsa-explained.png differ
diff --git a/assets/figures/deepseek-v3-2/mla-indexer-block.png b/assets/figures/deepseek-v3-2/mla-indexer-block.png
index 7fa824b..4b90fb7 100644
Binary files a/assets/figures/deepseek-v3-2/mla-indexer-block.png and b/assets/figures/deepseek-v3-2/mla-indexer-block.png differ
diff --git a/assets/figures/distributed-inference/kv_cache_effects.png b/assets/figures/distributed-inference/kv_cache_effects.png
index 6253d99..9a2e911 100644
Binary files a/assets/figures/distributed-inference/kv_cache_effects.png and b/assets/figures/distributed-inference/kv_cache_effects.png differ
diff --git a/assets/figures/distributed-inference/tp_strategies.png b/assets/figures/distributed-inference/tp_strategies.png
index bc87bc2..9a83999 100644
Binary files a/assets/figures/distributed-inference/tp_strategies.png and b/assets/figures/distributed-inference/tp_strategies.png differ
diff --git a/assets/figures/lfai/vllm-lfai-light.png b/assets/figures/lfai/vllm-lfai-light.png
index 8f44905..d190a39 100644
Binary files a/assets/figures/lfai/vllm-lfai-light.png and b/assets/figures/lfai/vllm-lfai-light.png differ
diff --git a/assets/figures/llama31/perf_llama3.png b/assets/figures/llama31/perf_llama3.png
index 51a2962..63293fe 100644
Binary files a/assets/figures/llama31/perf_llama3.png and b/assets/figures/llama31/perf_llama3.png differ
diff --git a/assets/figures/llama4/perf.png b/assets/figures/llama4/perf.png
index 0491837..0a60482 100644
Binary files a/assets/figures/llama4/perf.png and b/assets/figures/llama4/perf.png differ
diff --git a/assets/figures/lmsys_traffic.png b/assets/figures/lmsys_traffic.png
index fca8a79..18aec5e 100644
Binary files a/assets/figures/lmsys_traffic.png and b/assets/figures/lmsys_traffic.png differ
diff --git a/assets/figures/minimax-m1/benchmark.png b/assets/figures/minimax-m1/benchmark.png
index 45f3602..7309b2b 100644
Binary files a/assets/figures/minimax-m1/benchmark.png and b/assets/figures/minimax-m1/benchmark.png differ
diff --git a/assets/figures/minimax-m1/lightning_attention.png b/assets/figures/minimax-m1/lightning_attention.png
index a6e49ed..cdd51e5 100644
Binary files a/assets/figures/minimax-m1/lightning_attention.png and b/assets/figures/minimax-m1/lightning_attention.png differ
diff --git a/assets/figures/minimax-m1/moe.png b/assets/figures/minimax-m1/moe.png
index 712db33..27275d8 100644
Binary files a/assets/figures/minimax-m1/moe.png and b/assets/figures/minimax-m1/moe.png differ
diff --git a/assets/figures/notes-vllm-vs-deepspeed/s1.png b/assets/figures/notes-vllm-vs-deepspeed/s1.png
index 19ccfe7..b1792ed 100644
Binary files a/assets/figures/notes-vllm-vs-deepspeed/s1.png and b/assets/figures/notes-vllm-vs-deepspeed/s1.png differ
diff --git a/assets/figures/notes-vllm-vs-deepspeed/s2.png b/assets/figures/notes-vllm-vs-deepspeed/s2.png
index 00b7dc0..df43f83 100644
Binary files a/assets/figures/notes-vllm-vs-deepspeed/s2.png and b/assets/figures/notes-vllm-vs-deepspeed/s2.png differ
diff --git a/assets/figures/openrlhf-vllm/ray.png b/assets/figures/openrlhf-vllm/ray.png
index ca3c0c0..445cf28 100644
Binary files a/assets/figures/openrlhf-vllm/ray.png and b/assets/figures/openrlhf-vllm/ray.png differ
diff --git a/assets/figures/perf-v060/A100_70B.png b/assets/figures/perf-v060/A100_70B.png
index d42300f..a59ac1b 100644
Binary files a/assets/figures/perf-v060/A100_70B.png and b/assets/figures/perf-v060/A100_70B.png differ
diff --git a/assets/figures/perf-v060/A100_8B.png b/assets/figures/perf-v060/A100_8B.png
index a235dab..5edbaf5 100644
Binary files a/assets/figures/perf-v060/A100_8B.png and b/assets/figures/perf-v060/A100_8B.png differ
diff --git a/assets/figures/perf-v060/H100_70B.png b/assets/figures/perf-v060/H100_70B.png
index d14229d..40f13bf 100644
Binary files a/assets/figures/perf-v060/H100_70B.png and b/assets/figures/perf-v060/H100_70B.png differ
diff --git a/assets/figures/perf-v060/H100_8B.png b/assets/figures/perf-v060/H100_8B.png
index 7275df1..73b2680 100644
Binary files a/assets/figures/perf-v060/H100_8B.png and b/assets/figures/perf-v060/H100_8B.png differ
diff --git a/assets/figures/perf-v060/illustration-api-server.png b/assets/figures/perf-v060/illustration-api-server.png
index 36e3ad6..5e188ff 100644
Binary files a/assets/figures/perf-v060/illustration-api-server.png and b/assets/figures/perf-v060/illustration-api-server.png differ
diff --git a/assets/figures/perf-v060/illustration-async-output-processing.png b/assets/figures/perf-v060/illustration-async-output-processing.png
index 4c14ea3..f5ecc36 100644
Binary files a/assets/figures/perf-v060/illustration-async-output-processing.png and b/assets/figures/perf-v060/illustration-async-output-processing.png differ
diff --git a/assets/figures/perf-v060/illustration-multi-step.png b/assets/figures/perf-v060/illustration-multi-step.png
index 0407185..753173e 100644
Binary files a/assets/figures/perf-v060/illustration-multi-step.png and b/assets/figures/perf-v060/illustration-multi-step.png differ
diff --git a/assets/figures/perf-v060/llama70B_comparison.png b/assets/figures/perf-v060/llama70B_comparison.png
index 22dce6e..fddc92e 100644
Binary files a/assets/figures/perf-v060/llama70B_comparison.png and b/assets/figures/perf-v060/llama70B_comparison.png differ
diff --git a/assets/figures/perf-v060/llama8B_comparison.png b/assets/figures/perf-v060/llama8B_comparison.png
index f202da0..ee4d925 100644
Binary files a/assets/figures/perf-v060/llama8B_comparison.png and b/assets/figures/perf-v060/llama8B_comparison.png differ
diff --git a/assets/figures/perf-v060/overall_throughput.png b/assets/figures/perf-v060/overall_throughput.png
index 64c63d6..05fafba 100644
Binary files a/assets/figures/perf-v060/overall_throughput.png and b/assets/figures/perf-v060/overall_throughput.png differ
diff --git a/assets/figures/perf-v060/throughput.png b/assets/figures/perf-v060/throughput.png
index c6d2e26..8233fe3 100644
Binary files a/assets/figures/perf-v060/throughput.png and b/assets/figures/perf-v060/throughput.png differ
diff --git a/assets/figures/perf_a100_n1_dark.png b/assets/figures/perf_a100_n1_dark.png
index 97f331b..9d33780 100644
Binary files a/assets/figures/perf_a100_n1_dark.png and b/assets/figures/perf_a100_n1_dark.png differ
diff --git a/assets/figures/perf_a100_n1_light.png b/assets/figures/perf_a100_n1_light.png
index bd7186e..a518039 100644
Binary files a/assets/figures/perf_a100_n1_light.png and b/assets/figures/perf_a100_n1_light.png differ
diff --git a/assets/figures/perf_a100_n3_dark.png b/assets/figures/perf_a100_n3_dark.png
index 8d86cf5..8811407 100644
Binary files a/assets/figures/perf_a100_n3_dark.png and b/assets/figures/perf_a100_n3_dark.png differ
diff --git a/assets/figures/perf_a100_n3_light.png b/assets/figures/perf_a100_n3_light.png
index d900614..60dd755 100644
Binary files a/assets/figures/perf_a100_n3_light.png and b/assets/figures/perf_a100_n3_light.png differ
diff --git a/assets/figures/perf_a10g_n1_dark.png b/assets/figures/perf_a10g_n1_dark.png
index e46f5ff..85dfc48 100644
Binary files a/assets/figures/perf_a10g_n1_dark.png and b/assets/figures/perf_a10g_n1_dark.png differ
diff --git a/assets/figures/perf_a10g_n1_light.png b/assets/figures/perf_a10g_n1_light.png
index 89214a6..a7413f2 100644
Binary files a/assets/figures/perf_a10g_n1_light.png and b/assets/figures/perf_a10g_n1_light.png differ
diff --git a/assets/figures/perf_a10g_n3_dark.png b/assets/figures/perf_a10g_n3_dark.png
index 415ff78..bb003f5 100644
Binary files a/assets/figures/perf_a10g_n3_dark.png and b/assets/figures/perf_a10g_n3_dark.png differ
diff --git a/assets/figures/perf_a10g_n3_light.png b/assets/figures/perf_a10g_n3_light.png
index e3c959f..8f1ce93 100644
Binary files a/assets/figures/perf_a10g_n3_light.png and b/assets/figures/perf_a10g_n3_light.png differ
diff --git a/assets/figures/ptpc/PTPC-Diagram.png b/assets/figures/ptpc/PTPC-Diagram.png
index 81478b6..8681564 100644
Binary files a/assets/figures/ptpc/PTPC-Diagram.png and b/assets/figures/ptpc/PTPC-Diagram.png differ
diff --git a/assets/figures/ptpc/PTPC-tumbnail.png b/assets/figures/ptpc/PTPC-tumbnail.png
index 6550f81..ba96db4 100644
Binary files a/assets/figures/ptpc/PTPC-tumbnail.png and b/assets/figures/ptpc/PTPC-tumbnail.png differ
diff --git a/assets/figures/ptpc/PTPC121.png b/assets/figures/ptpc/PTPC121.png
index 42640d5..881bdd0 100644
Binary files a/assets/figures/ptpc/PTPC121.png and b/assets/figures/ptpc/PTPC121.png differ
diff --git a/assets/figures/ptpc/PTPCReqs.svg b/assets/figures/ptpc/PTPCReqs.svg
index e3ac0e9..cb84d5c 100644
--- a/assets/figures/ptpc/PTPCReqs.svg
+++ b/assets/figures/ptpc/PTPCReqs.svg
@@ -1 +1 @@
-
\ No newline at end of file
+
\ No newline at end of file
diff --git a/assets/figures/ptpc/PTPCSpeedup.svg b/assets/figures/ptpc/PTPCSpeedup.svg
index 9328387..9d4f364 100644
--- a/assets/figures/ptpc/PTPCSpeedup.svg
+++ b/assets/figures/ptpc/PTPCSpeedup.svg
@@ -1 +1 @@
-
\ No newline at end of file
+
\ No newline at end of file
diff --git a/assets/figures/qwen3-next/hybrid.png b/assets/figures/qwen3-next/hybrid.png
index f5f88d1..8c56c44 100644
Binary files a/assets/figures/qwen3-next/hybrid.png and b/assets/figures/qwen3-next/hybrid.png differ
diff --git a/assets/figures/qwen3-next/qwen.png b/assets/figures/qwen3-next/qwen.png
index 13992e3..09d566f 100644
Binary files a/assets/figures/qwen3-next/qwen.png and b/assets/figures/qwen3-next/qwen.png differ
diff --git a/assets/figures/semantic-router/full-params.png b/assets/figures/semantic-router/full-params.png
index a835b0d..0f55b3c 100644
Binary files a/assets/figures/semantic-router/full-params.png and b/assets/figures/semantic-router/full-params.png differ
diff --git a/assets/figures/semantic-router/lora.png b/assets/figures/semantic-router/lora.png
index 8e3e199..b213fd3 100644
Binary files a/assets/figures/semantic-router/lora.png and b/assets/figures/semantic-router/lora.png differ
diff --git a/assets/figures/semantic-router/modular.png b/assets/figures/semantic-router/modular.png
index 5e8f1db..ca33a77 100644
Binary files a/assets/figures/semantic-router/modular.png and b/assets/figures/semantic-router/modular.png differ
diff --git a/assets/figures/semantic-router/request.png b/assets/figures/semantic-router/request.png
index 202bb53..d8f740c 100644
Binary files a/assets/figures/semantic-router/request.png and b/assets/figures/semantic-router/request.png differ
diff --git a/assets/figures/semantic-router/signal-0.png b/assets/figures/semantic-router/signal-0.png
new file mode 100644
index 0000000..ab75793
Binary files /dev/null and b/assets/figures/semantic-router/signal-0.png differ
diff --git a/assets/figures/semantic-router/signal-1.png b/assets/figures/semantic-router/signal-1.png
new file mode 100644
index 0000000..ac1fa70
Binary files /dev/null and b/assets/figures/semantic-router/signal-1.png differ
diff --git a/assets/figures/semantic-router/signal-2.png b/assets/figures/semantic-router/signal-2.png
new file mode 100644
index 0000000..bfd2719
Binary files /dev/null and b/assets/figures/semantic-router/signal-2.png differ
diff --git a/assets/figures/semantic-router/signal-3.png b/assets/figures/semantic-router/signal-3.png
new file mode 100644
index 0000000..96bb5e0
Binary files /dev/null and b/assets/figures/semantic-router/signal-3.png differ
diff --git a/assets/figures/semantic-router/signal-4.png b/assets/figures/semantic-router/signal-4.png
new file mode 100644
index 0000000..0e7cd82
Binary files /dev/null and b/assets/figures/semantic-router/signal-4.png differ
diff --git a/assets/figures/semantic-router/signal-5.png b/assets/figures/semantic-router/signal-5.png
new file mode 100644
index 0000000..80b5c78
Binary files /dev/null and b/assets/figures/semantic-router/signal-5.png differ
diff --git a/assets/figures/semantic-router/signal-6.png b/assets/figures/semantic-router/signal-6.png
new file mode 100644
index 0000000..fbbd8fb
Binary files /dev/null and b/assets/figures/semantic-router/signal-6.png differ
diff --git a/assets/figures/semantic-router/signal-7.png b/assets/figures/semantic-router/signal-7.png
new file mode 100644
index 0000000..5ccf48f
Binary files /dev/null and b/assets/figures/semantic-router/signal-7.png differ
diff --git a/assets/figures/semantic-router/signal-8.png b/assets/figures/semantic-router/signal-8.png
new file mode 100644
index 0000000..072d7a3
Binary files /dev/null and b/assets/figures/semantic-router/signal-8.png differ
diff --git a/assets/figures/semantic-router/signal.png b/assets/figures/semantic-router/signal.png
new file mode 100644
index 0000000..214ee41
Binary files /dev/null and b/assets/figures/semantic-router/signal.png differ
diff --git a/assets/figures/spec-decode/figure1.png b/assets/figures/spec-decode/figure1.png
index 264951b..b6c2332 100644
Binary files a/assets/figures/spec-decode/figure1.png and b/assets/figures/spec-decode/figure1.png differ
diff --git a/assets/figures/spec-decode/figure10.png b/assets/figures/spec-decode/figure10.png
index 6905aa9..199e5bf 100644
Binary files a/assets/figures/spec-decode/figure10.png and b/assets/figures/spec-decode/figure10.png differ
diff --git a/assets/figures/spec-decode/figure2.png b/assets/figures/spec-decode/figure2.png
index 912b5c1..e29ded9 100644
Binary files a/assets/figures/spec-decode/figure2.png and b/assets/figures/spec-decode/figure2.png differ
diff --git a/assets/figures/spec-decode/figure3.png b/assets/figures/spec-decode/figure3.png
index 7243f88..4f3f8f9 100644
Binary files a/assets/figures/spec-decode/figure3.png and b/assets/figures/spec-decode/figure3.png differ
diff --git a/assets/figures/spec-decode/figure4.png b/assets/figures/spec-decode/figure4.png
index c6a04ff..2d49598 100644
Binary files a/assets/figures/spec-decode/figure4.png and b/assets/figures/spec-decode/figure4.png differ
diff --git a/assets/figures/spec-decode/figure5.png b/assets/figures/spec-decode/figure5.png
index c4ad93b..5d83cbf 100644
Binary files a/assets/figures/spec-decode/figure5.png and b/assets/figures/spec-decode/figure5.png differ
diff --git a/assets/figures/spec-decode/figure6.png b/assets/figures/spec-decode/figure6.png
index 7faecac..e10cfd6 100644
Binary files a/assets/figures/spec-decode/figure6.png and b/assets/figures/spec-decode/figure6.png differ
diff --git a/assets/figures/spec-decode/figure7.png b/assets/figures/spec-decode/figure7.png
index 6343d1d..ade4d82 100644
Binary files a/assets/figures/spec-decode/figure7.png and b/assets/figures/spec-decode/figure7.png differ
diff --git a/assets/figures/spec-decode/figure8.png b/assets/figures/spec-decode/figure8.png
index 4ffa0f1..bd87d3b 100644
Binary files a/assets/figures/spec-decode/figure8.png and b/assets/figures/spec-decode/figure8.png differ
diff --git a/assets/figures/spec-decode/figure9.png b/assets/figures/spec-decode/figure9.png
index b43df89..8ae8381 100644
Binary files a/assets/figures/spec-decode/figure9.png and b/assets/figures/spec-decode/figure9.png differ
diff --git a/assets/figures/stack/stack-itl.png b/assets/figures/stack/stack-itl.png
index 415ba85..6289627 100644
Binary files a/assets/figures/stack/stack-itl.png and b/assets/figures/stack/stack-itl.png differ
diff --git a/assets/figures/stack/stack-overview-2.png b/assets/figures/stack/stack-overview-2.png
index 38c462b..ecfa5fe 100644
Binary files a/assets/figures/stack/stack-overview-2.png and b/assets/figures/stack/stack-overview-2.png differ
diff --git a/assets/figures/stack/stack-panel.png b/assets/figures/stack/stack-panel.png
index adc2ad3..c095bbe 100644
Binary files a/assets/figures/stack/stack-panel.png and b/assets/figures/stack/stack-panel.png differ
diff --git a/assets/figures/stack/stack-table.png b/assets/figures/stack/stack-table.png
index 5c8a2c8..501e3e2 100644
Binary files a/assets/figures/stack/stack-table.png and b/assets/figures/stack/stack-table.png differ
diff --git a/assets/figures/stack/stack-thumbnail.png b/assets/figures/stack/stack-thumbnail.png
index 5f94c3a..360f762 100644
Binary files a/assets/figures/stack/stack-thumbnail.png and b/assets/figures/stack/stack-thumbnail.png differ
diff --git a/assets/figures/stack/stack-ttft.png b/assets/figures/stack/stack-ttft.png
index 961969e..8d2deec 100644
Binary files a/assets/figures/stack/stack-ttft.png and b/assets/figures/stack/stack-ttft.png differ
diff --git a/assets/figures/struct-decode-intro/mermaid-intro.svg b/assets/figures/struct-decode-intro/mermaid-intro.svg
index 23e1c4b..f03017e 100644
--- a/assets/figures/struct-decode-intro/mermaid-intro.svg
+++ b/assets/figures/struct-decode-intro/mermaid-intro.svg
@@ -1,271 +1 @@
-
-
+
\ No newline at end of file
diff --git a/assets/figures/struct-decode-intro/shogoth-gpt.png b/assets/figures/struct-decode-intro/shogoth-gpt.png
index 597fbd7..0f70673 100644
Binary files a/assets/figures/struct-decode-intro/shogoth-gpt.png and b/assets/figures/struct-decode-intro/shogoth-gpt.png differ
diff --git a/assets/figures/struct-decode-intro/vllm-new-xgrammar.png b/assets/figures/struct-decode-intro/vllm-new-xgrammar.png
index eb1e4c5..83cc338 100644
Binary files a/assets/figures/struct-decode-intro/vllm-new-xgrammar.png and b/assets/figures/struct-decode-intro/vllm-new-xgrammar.png differ
diff --git a/assets/figures/struct-decode-intro/vllm-xgrammar-decode-time-per-output-token.png b/assets/figures/struct-decode-intro/vllm-xgrammar-decode-time-per-output-token.png
index 59a3f27..48ebe91 100644
Binary files a/assets/figures/struct-decode-intro/vllm-xgrammar-decode-time-per-output-token.png and b/assets/figures/struct-decode-intro/vllm-xgrammar-decode-time-per-output-token.png differ
diff --git a/assets/figures/transformers-backend/transformers-backend.png b/assets/figures/transformers-backend/transformers-backend.png
index b57a948..e4bb6e6 100644
Binary files a/assets/figures/transformers-backend/transformers-backend.png and b/assets/figures/transformers-backend/transformers-backend.png differ
diff --git a/assets/figures/v1/persistent_batch.png b/assets/figures/v1/persistent_batch.png
index 9e0b05a..e6c1abe 100644
Binary files a/assets/figures/v1/persistent_batch.png and b/assets/figures/v1/persistent_batch.png differ
diff --git a/assets/figures/v1/torch_compile_cuda_graph.png b/assets/figures/v1/torch_compile_cuda_graph.png
index c46b643..490e1d2 100644
Binary files a/assets/figures/v1/torch_compile_cuda_graph.png and b/assets/figures/v1/torch_compile_cuda_graph.png differ
diff --git a/assets/figures/v1/v1_llama.png b/assets/figures/v1/v1_llama.png
index e228e60..034900c 100644
Binary files a/assets/figures/v1/v1_llama.png and b/assets/figures/v1/v1_llama.png differ
diff --git a/assets/figures/v1/v1_prefix_caching.png b/assets/figures/v1/v1_prefix_caching.png
index 1dcd341..3f42379 100644
Binary files a/assets/figures/v1/v1_prefix_caching.png and b/assets/figures/v1/v1_prefix_caching.png differ
diff --git a/assets/figures/v1/v1_qwen2vl.png b/assets/figures/v1/v1_qwen2vl.png
index 36e87ff..7af1353 100644
Binary files a/assets/figures/v1/v1_qwen2vl.png and b/assets/figures/v1/v1_qwen2vl.png differ
diff --git a/assets/figures/v1/v1_scheduling.png b/assets/figures/v1/v1_scheduling.png
index 1922530..1b82fce 100644
Binary files a/assets/figures/v1/v1_scheduling.png and b/assets/figures/v1/v1_scheduling.png differ
diff --git a/assets/figures/v1/v1_server_architecture.png b/assets/figures/v1/v1_server_architecture.png
index 7acae43..60387e0 100644
Binary files a/assets/figures/v1/v1_server_architecture.png and b/assets/figures/v1/v1_server_architecture.png differ
diff --git a/assets/figures/v1/v1_tp_architecture.png b/assets/figures/v1/v1_tp_architecture.png
index 51cb98c..bef64d0 100644
Binary files a/assets/figures/v1/v1_tp_architecture.png and b/assets/figures/v1/v1_tp_architecture.png differ
diff --git a/assets/figures/v1/vLLM_V1_Logo.png b/assets/figures/v1/vLLM_V1_Logo.png
index 45608a7..347302b 100644
Binary files a/assets/figures/v1/vLLM_V1_Logo.png and b/assets/figures/v1/vLLM_V1_Logo.png differ
diff --git a/assets/figures/vllm-2024-wrapped-2025-roadmap/gpu-hours-by-vendor.png b/assets/figures/vllm-2024-wrapped-2025-roadmap/gpu-hours-by-vendor.png
index 906601b..2cbfd7e 100644
Binary files a/assets/figures/vllm-2024-wrapped-2025-roadmap/gpu-hours-by-vendor.png and b/assets/figures/vllm-2024-wrapped-2025-roadmap/gpu-hours-by-vendor.png differ
diff --git a/assets/figures/vllm-2024-wrapped-2025-roadmap/model-architecture-serving-usage.png b/assets/figures/vllm-2024-wrapped-2025-roadmap/model-architecture-serving-usage.png
index 762e19a..16be796 100644
Binary files a/assets/figures/vllm-2024-wrapped-2025-roadmap/model-architecture-serving-usage.png and b/assets/figures/vllm-2024-wrapped-2025-roadmap/model-architecture-serving-usage.png differ
diff --git a/assets/figures/vllm-2024-wrapped-2025-roadmap/quantization-deployment-percentage.png b/assets/figures/vllm-2024-wrapped-2025-roadmap/quantization-deployment-percentage.png
index e0298ce..001f775 100644
Binary files a/assets/figures/vllm-2024-wrapped-2025-roadmap/quantization-deployment-percentage.png and b/assets/figures/vllm-2024-wrapped-2025-roadmap/quantization-deployment-percentage.png differ
diff --git a/assets/figures/vllm-2024-wrapped-2025-roadmap/vllm-contributor-groups.png b/assets/figures/vllm-2024-wrapped-2025-roadmap/vllm-contributor-groups.png
index 5b53e34..8491cad 100644
Binary files a/assets/figures/vllm-2024-wrapped-2025-roadmap/vllm-contributor-groups.png and b/assets/figures/vllm-2024-wrapped-2025-roadmap/vllm-contributor-groups.png differ
diff --git a/assets/figures/vllm-meetup/image-2.png b/assets/figures/vllm-meetup/image-2.png
index 8177145..e2d5ab9 100644
Binary files a/assets/figures/vllm-meetup/image-2.png and b/assets/figures/vllm-meetup/image-2.png differ
diff --git a/assets/figures/vllm-meetup/image-3.png b/assets/figures/vllm-meetup/image-3.png
index a48b310..7bfc52e 100644
Binary files a/assets/figures/vllm-meetup/image-3.png and b/assets/figures/vllm-meetup/image-3.png differ
diff --git a/assets/figures/vllm-meetup/image-6.png b/assets/figures/vllm-meetup/image-6.png
index 626daab..852bad7 100644
Binary files a/assets/figures/vllm-meetup/image-6.png and b/assets/figures/vllm-meetup/image-6.png differ
diff --git a/assets/figures/vllm-meetup/vllm_meetup_Daniele.png b/assets/figures/vllm-meetup/vllm_meetup_Daniele.png
index f7d1cbe..785fdda 100644
Binary files a/assets/figures/vllm-meetup/vllm_meetup_Daniele.png and b/assets/figures/vllm-meetup/vllm_meetup_Daniele.png differ
diff --git a/assets/figures/vllm-meetup/vllm_meetup_HJKim.jpg b/assets/figures/vllm-meetup/vllm_meetup_HJKim.jpg
index 88b3216..6dfc31a 100644
Binary files a/assets/figures/vllm-meetup/vllm_meetup_HJKim.jpg and b/assets/figures/vllm-meetup/vllm_meetup_HJKim.jpg differ
diff --git a/assets/figures/vllm-meetup/vllm_meetup_HSkim.png b/assets/figures/vllm-meetup/vllm_meetup_HSkim.png
index 1e2ab2d..0253a3d 100644
Binary files a/assets/figures/vllm-meetup/vllm_meetup_HSkim.png and b/assets/figures/vllm-meetup/vllm_meetup_HSkim.png differ
diff --git a/assets/figures/vllm-meetup/vllm_meetup_nicolo.jpg b/assets/figures/vllm-meetup/vllm_meetup_nicolo.jpg
index 7aec446..945888c 100644
Binary files a/assets/figures/vllm-meetup/vllm_meetup_nicolo.jpg and b/assets/figures/vllm-meetup/vllm_meetup_nicolo.jpg differ
diff --git a/assets/figures/vllm-serving-amd/405b1.png b/assets/figures/vllm-serving-amd/405b1.png
index f7f1220..9c79b67 100644
Binary files a/assets/figures/vllm-serving-amd/405b1.png and b/assets/figures/vllm-serving-amd/405b1.png differ
diff --git a/assets/figures/vllm-serving-amd/405b2.png b/assets/figures/vllm-serving-amd/405b2.png
index 6f13475..cd1ad2f 100644
Binary files a/assets/figures/vllm-serving-amd/405b2.png and b/assets/figures/vllm-serving-amd/405b2.png differ
diff --git a/assets/figures/vllm-serving-amd/70b1.png b/assets/figures/vllm-serving-amd/70b1.png
index 3e63586..589a93c 100644
Binary files a/assets/figures/vllm-serving-amd/70b1.png and b/assets/figures/vllm-serving-amd/70b1.png differ
diff --git a/assets/figures/vllm-serving-amd/70b2.png b/assets/figures/vllm-serving-amd/70b2.png
index 18d02c7..76c240d 100644
Binary files a/assets/figures/vllm-serving-amd/70b2.png and b/assets/figures/vllm-serving-amd/70b2.png differ
diff --git a/assets/figures/vllm-serving-amd/case01-chunked-prefill/Mean TPOT (ms).png b/assets/figures/vllm-serving-amd/case01-chunked-prefill/Mean TPOT (ms).png
index 55d4984..dcebe9c 100644
Binary files a/assets/figures/vllm-serving-amd/case01-chunked-prefill/Mean TPOT (ms).png and b/assets/figures/vllm-serving-amd/case01-chunked-prefill/Mean TPOT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case01-chunked-prefill/Mean TTFT (ms).png b/assets/figures/vllm-serving-amd/case01-chunked-prefill/Mean TTFT (ms).png
index 9a7d352..4abc967 100644
Binary files a/assets/figures/vllm-serving-amd/case01-chunked-prefill/Mean TTFT (ms).png and b/assets/figures/vllm-serving-amd/case01-chunked-prefill/Mean TTFT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case01-chunked-prefill/Requests Per Second.png b/assets/figures/vllm-serving-amd/case01-chunked-prefill/Requests Per Second.png
index 0beca3b..5a92452 100644
Binary files a/assets/figures/vllm-serving-amd/case01-chunked-prefill/Requests Per Second.png and b/assets/figures/vllm-serving-amd/case01-chunked-prefill/Requests Per Second.png differ
diff --git a/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Mean TPOT (ms).png b/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Mean TPOT (ms).png
index e6ba812..6f85474 100644
Binary files a/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Mean TPOT (ms).png and b/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Mean TPOT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Mean TTFT (ms).png b/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Mean TTFT (ms).png
index b3fbc65..9eb63f4 100644
Binary files a/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Mean TTFT (ms).png and b/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Mean TTFT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Requests per Second.png b/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Requests per Second.png
index 8494354..4e594c5 100644
Binary files a/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Requests per Second.png and b/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Requests per Second.png differ
diff --git a/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Mean TPOT (ms).png b/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Mean TPOT (ms).png
index d7b295d..3b6601a 100644
Binary files a/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Mean TPOT (ms).png and b/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Mean TPOT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Mean TTFT (ms).png b/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Mean TTFT (ms).png
index 055861d..78b0478 100644
Binary files a/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Mean TTFT (ms).png and b/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Mean TTFT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Requests per Second.png b/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Requests per Second.png
index d073d66..7edc49d 100644
Binary files a/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Requests per Second.png and b/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Requests per Second.png differ
diff --git a/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Mean TPOT (ms).png b/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Mean TPOT (ms).png
index 38e0f4a..bd5d30e 100644
Binary files a/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Mean TPOT (ms).png and b/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Mean TPOT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Mean TTFT (ms).png b/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Mean TTFT (ms).png
index eae2f8d..efff374 100644
Binary files a/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Mean TTFT (ms).png and b/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Mean TTFT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Requests per Second.png b/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Requests per Second.png
index 9ec4c58..68d43ba 100644
Binary files a/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Requests per Second.png and b/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Requests per Second.png differ
diff --git a/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Mean TPOT (ms).png b/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Mean TPOT (ms).png
index cd69306..47369ff 100644
Binary files a/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Mean TPOT (ms).png and b/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Mean TPOT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Mean TTFT (ms).png b/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Mean TTFT (ms).png
index 2014944..593ef85 100644
Binary files a/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Mean TTFT (ms).png and b/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Mean TTFT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Requests Per Second.png b/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Requests Per Second.png
index eee86c4..7fd7319 100644
Binary files a/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Requests Per Second.png and b/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Requests Per Second.png differ
diff --git a/assets/figures/vllm-serving-amd/case06-kvcache-type/Mean TPOT (ms).png b/assets/figures/vllm-serving-amd/case06-kvcache-type/Mean TPOT (ms).png
index 84e4e98..c34e5d1 100644
Binary files a/assets/figures/vllm-serving-amd/case06-kvcache-type/Mean TPOT (ms).png and b/assets/figures/vllm-serving-amd/case06-kvcache-type/Mean TPOT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case06-kvcache-type/Mean TTFT (ms).png b/assets/figures/vllm-serving-amd/case06-kvcache-type/Mean TTFT (ms).png
index d6f0969..ee6830c 100644
Binary files a/assets/figures/vllm-serving-amd/case06-kvcache-type/Mean TTFT (ms).png and b/assets/figures/vllm-serving-amd/case06-kvcache-type/Mean TTFT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case06-kvcache-type/Requests per Second.png b/assets/figures/vllm-serving-amd/case06-kvcache-type/Requests per Second.png
index eaf8cac..40f81bf 100644
Binary files a/assets/figures/vllm-serving-amd/case06-kvcache-type/Requests per Second.png and b/assets/figures/vllm-serving-amd/case06-kvcache-type/Requests per Second.png differ
diff --git a/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Mean TPOT (ms).png b/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Mean TPOT (ms).png
index 0588e24..863aca9 100644
Binary files a/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Mean TPOT (ms).png and b/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Mean TPOT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Mean TTFT (ms).png b/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Mean TTFT (ms).png
index e51969a..5319f2a 100644
Binary files a/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Mean TTFT (ms).png and b/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Mean TTFT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Requests per Second.png b/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Requests per Second.png
index bd74c38..22373ae 100644
Binary files a/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Requests per Second.png and b/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Requests per Second.png differ
diff --git a/assets/figures/vllm-serving-amd/case08-max-num-seq/Mean TPOT (ms).png b/assets/figures/vllm-serving-amd/case08-max-num-seq/Mean TPOT (ms).png
index 92df368..bec0828 100644
Binary files a/assets/figures/vllm-serving-amd/case08-max-num-seq/Mean TPOT (ms).png and b/assets/figures/vllm-serving-amd/case08-max-num-seq/Mean TPOT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case08-max-num-seq/Mean TTFT (ms).png b/assets/figures/vllm-serving-amd/case08-max-num-seq/Mean TTFT (ms).png
index 1d70cbc..7333a8c 100644
Binary files a/assets/figures/vllm-serving-amd/case08-max-num-seq/Mean TTFT (ms).png and b/assets/figures/vllm-serving-amd/case08-max-num-seq/Mean TTFT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/case08-max-num-seq/Request per Second.png b/assets/figures/vllm-serving-amd/case08-max-num-seq/Request per Second.png
index 707ddee..acd4c66 100644
Binary files a/assets/figures/vllm-serving-amd/case08-max-num-seq/Request per Second.png and b/assets/figures/vllm-serving-amd/case08-max-num-seq/Request per Second.png differ
diff --git a/assets/figures/vllm-serving-amd/introduction/Mean TTFT (ms).png b/assets/figures/vllm-serving-amd/introduction/Mean TTFT (ms).png
index 30bb547..e0bbc6a 100644
Binary files a/assets/figures/vllm-serving-amd/introduction/Mean TTFT (ms).png and b/assets/figures/vllm-serving-amd/introduction/Mean TTFT (ms).png differ
diff --git a/assets/figures/vllm-serving-amd/introduction/Throughput (Requests per Second).png b/assets/figures/vllm-serving-amd/introduction/Throughput (Requests per Second).png
index 094b467..15658df 100644
Binary files a/assets/figures/vllm-serving-amd/introduction/Throughput (Requests per Second).png and b/assets/figures/vllm-serving-amd/introduction/Throughput (Requests per Second).png differ
diff --git a/assets/figures/vllm-tpu/llama3-70b-throughput-progress.png b/assets/figures/vllm-tpu/llama3-70b-throughput-progress.png
index f941145..64aeee6 100644
Binary files a/assets/figures/vllm-tpu/llama3-70b-throughput-progress.png and b/assets/figures/vllm-tpu/llama3-70b-throughput-progress.png differ
diff --git a/assets/figures/vllm-tpu/llama3-8b-throughput-progress.png b/assets/figures/vllm-tpu/llama3-8b-throughput-progress.png
index 871eb43..f01fd41 100644
Binary files a/assets/figures/vllm-tpu/llama3-8b-throughput-progress.png and b/assets/figures/vllm-tpu/llama3-8b-throughput-progress.png differ
diff --git a/assets/figures/vllm-tpu/vllm-serve-model.png b/assets/figures/vllm-tpu/vllm-serve-model.png
index 7559f2f..af203a1 100644
Binary files a/assets/figures/vllm-tpu/vllm-serve-model.png and b/assets/figures/vllm-tpu/vllm-serve-model.png differ
diff --git a/assets/figures/vllm-tpu/vllm-tpu.png b/assets/figures/vllm-tpu/vllm-tpu.png
index 3e02628..81b1c8a 100644
Binary files a/assets/figures/vllm-tpu/vllm-tpu.png and b/assets/figures/vllm-tpu/vllm-tpu.png differ
diff --git a/assets/figures/vllm-tpu/whats-new.png b/assets/figures/vllm-tpu/whats-new.png
index 9e4b4ad..b068df8 100644
Binary files a/assets/figures/vllm-tpu/whats-new.png and b/assets/figures/vllm-tpu/whats-new.png differ
diff --git a/assets/logos/vllm-logo-only-light.png b/assets/logos/vllm-logo-only-light.png
index 7aaf174..1755b5d 100644
Binary files a/assets/logos/vllm-logo-only-light.png and b/assets/logos/vllm-logo-only-light.png differ
diff --git a/assets/logos/vllm-logo-text-dark.png b/assets/logos/vllm-logo-text-dark.png
index 959a42f..7e87a0e 100644
Binary files a/assets/logos/vllm-logo-text-dark.png and b/assets/logos/vllm-logo-text-dark.png differ
diff --git a/assets/logos/vllm-logo-text-light.png b/assets/logos/vllm-logo-text-light.png
index 1ead997..b92a889 100644
Binary files a/assets/logos/vllm-logo-text-light.png and b/assets/logos/vllm-logo-text-light.png differ