diff --git a/AGENTS.md b/AGENTS.md
index adb6e879..35e49628 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -193,13 +193,13 @@ The goal of this repository is to revamp this documentation repo so that it prov
│ │ ├── streaming
│ │ │ ├── async-iterators.md
│ │ │ ├── callback-handlers.md
-│ │ │ └── overview.md
+│ │ │ └── index.md
│ │ └── tools
│ │ ├── community-tools-package.md
│ │ ├── executors.md
│ │ ├── mcp-tools.md
│ │ ├── python-tools.md
-│ │ └── tools_overview.md
+│ │ └── index.md
│ ├── deploy
│ │ ├── deploy_to_amazon_ec2.md
│ │ ├── deploy_to_amazon_eks.md
diff --git a/docs/user-guide/concepts/agents/agent-loop.md b/docs/user-guide/concepts/agents/agent-loop.md
index 4bdb4a3a..4a8f6c30 100644
--- a/docs/user-guide/concepts/agents/agent-loop.md
+++ b/docs/user-guide/concepts/agents/agent-loop.md
@@ -110,7 +110,7 @@ Solutions:
### Inappropriate Tool Selection
-When the model consistently picks the wrong tool, the problem is usually ambiguous tool descriptions. Review the descriptions from the model's perspective. If two tools have overlapping descriptions, the model has no basis for choosing between them. See [Tools Overview](../tools/tools_overview.md) for guidance on writing effective descriptions.
+When the model consistently picks the wrong tool, the problem is usually ambiguous tool descriptions. Review the descriptions from the model's perspective. If two tools have overlapping descriptions, the model has no basis for choosing between them. See [Tools Overview](../tools/index.md) for guidance on writing effective descriptions.
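For illustration, a minimal sketch of two tools whose docstrings draw a clear boundary between them (this assumes the Strands `@tool` decorator, which uses the docstring as the tool description; the tool names and bodies here are hypothetical):

```python
from strands import Agent, tool


@tool
def search_orders(customer_id: str) -> str:
    """Look up a customer's past orders by customer ID.

    Use for questions about order history, not shipping status.
    """
    return f"Orders for {customer_id}: ..."  # stand-in for a real lookup


@tool
def track_shipment(tracking_number: str) -> str:
    """Report the current shipping status for a tracking number.

    Use only when the user supplies a tracking number.
    """
    return f"Shipment {tracking_number}: in transit"  # stand-in for a carrier API

agent = Agent(tools=[search_orders, track_shipment])
```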
### MaxTokensReachedException
diff --git a/docs/user-guide/concepts/experimental/bidirectional-streaming/agent.md b/docs/user-guide/concepts/experimental/bidirectional-streaming/agent.md
index 94021197..594b30f4 100644
--- a/docs/user-guide/concepts/experimental/bidirectional-streaming/agent.md
+++ b/docs/user-guide/concepts/experimental/bidirectional-streaming/agent.md
@@ -233,7 +233,7 @@ See [Model Providers](models/nova_sonic.md) for provider-specific options.
`BidiAgent` supports many of the same constructs as `Agent`:
-- **[Tools](../../tools/tools_overview.md)**: Function calling works identically
+- **[Tools](../../tools/index.md)**: Function calling works identically
- **[Hooks](hooks.md)**: Lifecycle event handling with bidirectional-specific events
- **[Session Management](session-management.md)**: Conversation persistence across sessions
- **[Tool Executors](../../tools/executors.md)**: Concurrent and custom execution patterns
diff --git a/docs/user-guide/concepts/experimental/bidirectional-streaming/events.md b/docs/user-guide/concepts/experimental/bidirectional-streaming/events.md
index 49167859..1fbbb953 100644
--- a/docs/user-guide/concepts/experimental/bidirectional-streaming/events.md
+++ b/docs/user-guide/concepts/experimental/bidirectional-streaming/events.md
@@ -6,7 +6,7 @@ Bidirectional streaming events enable real-time monitoring and processing of aud
## Event Model
-Bidirectional streaming uses a different event model than [standard streaming](../../streaming/overview.md):
+Bidirectional streaming uses a different event model than [standard streaming](../../streaming/index.md):
**Standard Streaming:**
@@ -322,7 +322,7 @@ Events for tool execution during conversations. Bidirectional streaming reuses t
#### ToolUseStreamEvent
-Emitted when the model requests tool execution. See [Tools Overview](../../tools/tools_overview.md) for details.
+Emitted when the model requests tool execution. See [Tools Overview](../../tools/index.md) for details.
```python
{
diff --git a/docs/user-guide/concepts/interrupts.md b/docs/user-guide/concepts/interrupts.md
index e9b1d02f..3bef3d61 100644
--- a/docs/user-guide/concepts/interrupts.md
+++ b/docs/user-guide/concepts/interrupts.md
@@ -156,7 +156,7 @@ agent = Agent(
```
-> ⚠️ Interrupts are not supported in [direct tool calls](./tools/tools_overview.md#direct-method-calls) (i.e., calls such as `agent.tool.my_tool()`).
+> ⚠️ Interrupts are not supported in [direct tool calls](./tools/index.md#direct-method-calls) (i.e., calls such as `agent.tool.my_tool()`).
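A short sketch of the distinction (the tool here is hypothetical):

```python
from strands import Agent, tool


@tool
def my_tool(query: str) -> str:
    """Hypothetical lookup tool."""
    return f"result for {query}"

agent = Agent(tools=[my_tool])

# Direct method call: runs the tool immediately, outside the agent loop,
# so no interrupt can fire here.
result = agent.tool.my_tool(query="example")

# Interrupts apply only when the model requests the tool during a turn:
agent("Use my_tool to look up 'example'")
```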
### Components
diff --git a/docs/user-guide/concepts/multi-agent/graph.md b/docs/user-guide/concepts/multi-agent/graph.md
index ca204e12..7e8baa1d 100644
--- a/docs/user-guide/concepts/multi-agent/graph.md
+++ b/docs/user-guide/concepts/multi-agent/graph.md
@@ -334,7 +334,7 @@ async for event in graph.stream_async("Research and analyze market trends"):
print(f"Graph completed: {result.status}")
```
-See the [streaming overview](../streaming/overview.md#multi-agent-events) for details on all multi-agent event types.
+See the [streaming overview](../streaming/index.md#multi-agent-events) for details on all multi-agent event types.
## Graph Results
diff --git a/docs/user-guide/concepts/multi-agent/swarm.md b/docs/user-guide/concepts/multi-agent/swarm.md
index 3f333d59..7bed43c8 100644
--- a/docs/user-guide/concepts/multi-agent/swarm.md
+++ b/docs/user-guide/concepts/multi-agent/swarm.md
@@ -225,7 +225,7 @@ async for event in swarm.stream_async("Design and implement a REST API"):
print(f"\nSwarm completed: {result.status}")
```
-See the [streaming overview](../streaming/overview.md#multi-agent-events) for details on all multi-agent event types.
+See the [streaming overview](../streaming/index.md#multi-agent-events) for details on all multi-agent event types.
## Swarm Results
diff --git a/docs/user-guide/concepts/streaming/async-iterators.md b/docs/user-guide/concepts/streaming/async-iterators.md
index 01914d0d..0cef378f 100644
--- a/docs/user-guide/concepts/streaming/async-iterators.md
+++ b/docs/user-guide/concepts/streaming/async-iterators.md
@@ -2,7 +2,7 @@
Async iterators provide asynchronous streaming of agent events, allowing you to process events as they occur in real-time. This approach is ideal for asynchronous frameworks where you need fine-grained control over async execution flow.
-For a complete list of available events including text generation, tool usage, lifecycle, and reasoning events, see the [streaming overview](./overview.md#event-types).
+For a complete list of available events including text generation, tool usage, lifecycle, and reasoning events, see the [streaming overview](./index.md#event-types).
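As a quick orientation, a minimal sketch (the `data` key for incremental text matches the event types listed in the overview; treat any other event fields as assumptions):

```python
import asyncio

from strands import Agent


async def main():
    agent = Agent()
    # stream_async yields event dictionaries as the agent produces them.
    async for event in agent.stream_async("Tell me about async streaming"):
        if "data" in event:
            print(event["data"], end="", flush=True)  # incremental model text

asyncio.run(main())
```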
## Basic Usage
diff --git a/docs/user-guide/concepts/streaming/callback-handlers.md b/docs/user-guide/concepts/streaming/callback-handlers.md
index 3ca9f791..d3421bf2 100644
--- a/docs/user-guide/concepts/streaming/callback-handlers.md
+++ b/docs/user-guide/concepts/streaming/callback-handlers.md
@@ -4,7 +4,7 @@
Callback handlers allow you to intercept and process events as they happen during agent execution in Python. This enables real-time monitoring, custom output formatting, and integration with external systems through function-based event handling.
-For a complete list of available events including text generation, tool usage, lifecycle, and reasoning events, see the [streaming overview](./overview.md#event-types).
+For a complete list of available events including text generation, tool usage, lifecycle, and reasoning events, see the [streaming overview](./index.md#event-types).
> **Note:** For asynchronous applications, consider [async iterators](./async-iterators.md) instead.
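A minimal sketch of a function-based handler (handlers receive events as keyword arguments; keys other than `data` are listed in the overview and treated as assumptions here):

```python
from strands import Agent


def printing_handler(**kwargs):
    # "data" carries incremental text from the model.
    if "data" in kwargs:
        print(kwargs["data"], end="", flush=True)

agent = Agent(callback_handler=printing_handler)
agent("Hello!")
```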
diff --git a/docs/user-guide/concepts/streaming/overview.md b/docs/user-guide/concepts/streaming/index.md
similarity index 100%
rename from docs/user-guide/concepts/streaming/overview.md
rename to docs/user-guide/concepts/streaming/index.md
diff --git a/docs/user-guide/concepts/tools/custom-tools.md b/docs/user-guide/concepts/tools/custom-tools.md
index 233f706a..9c6a3ff3 100644
--- a/docs/user-guide/concepts/tools/custom-tools.md
+++ b/docs/user-guide/concepts/tools/custom-tools.md
@@ -363,7 +363,7 @@ Tools can access their execution context to interact with the invoking agent, cu
=== "Python"
- Async tools can yield intermediate results to provide real-time progress updates. Each yielded value becomes a [streaming event](../streaming/overview.md), with the final value serving as the tool's return result:
+ Async tools can yield intermediate results to provide real-time progress updates. Each yielded value becomes a [streaming event](../streaming/index.md), with the final value serving as the tool's return result:
```python
from datetime import datetime
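# For comparison, a minimal sketch (assumed API shape, not part of the
# original example): an async tool that yields progress updates. Each
# yield becomes a streaming event; the final yield is the tool's result.
import asyncio

from strands import tool


@tool
async def process_records(count: int):
    """Process records while reporting progress."""
    for i in range(count):
        await asyncio.sleep(0.1)  # stand-in for real work
        yield f"Processed {i + 1}/{count}"  # intermediate streaming event
    yield f"Done: processed {count} records"  # final value = tool result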
diff --git a/docs/user-guide/concepts/tools/tools_overview.md b/docs/user-guide/concepts/tools/index.md
similarity index 100%
rename from docs/user-guide/concepts/tools/tools_overview.md
rename to docs/user-guide/concepts/tools/index.md
diff --git a/docs/user-guide/deploy/operating-agents-in-production.md b/docs/user-guide/deploy/operating-agents-in-production.md
index a6fc8aed..9a53360b 100644
--- a/docs/user-guide/deploy/operating-agents-in-production.md
+++ b/docs/user-guide/deploy/operating-agents-in-production.md
@@ -48,7 +48,7 @@ agent = Agent(
)
```
-See [Adding Tools to Agents](../concepts/tools/tools_overview.md/#adding-tools-to-agents) and [Auto reloading tools](../concepts/tools/tools_overview.md#auto-loading-and-reloading-tools) for more information.
+See [Adding Tools to Agents](../concepts/tools/index.md#adding-tools-to-agents) and [Auto-loading and reloading tools](../concepts/tools/index.md#auto-loading-and-reloading-tools) for more information.
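A minimal sketch of both approaches (this assumes the `load_tools_from_directory` option described in the tools overview; `query_database` and its module are hypothetical):

```python
from strands import Agent
from my_tools import query_database  # hypothetical @tool-decorated function

agent = Agent(
    tools=[query_database],          # explicitly registered tools
    load_tools_from_directory=True,  # also load and hot-reload tools in ./tools/
)
```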
### Security Considerations
@@ -150,6 +150,6 @@ Operating Strands agents in production requires careful consideration of configu
- [Conversation Management](../../user-guide/concepts/agents/conversation-management.md)
- [Streaming - Async Iterator](../../user-guide/concepts/streaming/async-iterators.md)
-- [Tool Development](../../user-guide/concepts/tools/tools_overview.md)
+- [Tool Development](../../user-guide/concepts/tools/index.md)
- [Guardrails](../../user-guide/safety-security/guardrails.md)
- [Responsible AI](../../user-guide/safety-security/responsible-ai.md)
diff --git a/docs/user-guide/evals-sdk/eval-sop.md b/docs/user-guide/evals-sdk/eval-sop.md
index 4680e167..4f4ffc43 100644
--- a/docs/user-guide/evals-sdk/eval-sop.md
+++ b/docs/user-guide/evals-sdk/eval-sop.md
@@ -231,47 +231,49 @@ Generates insights and recommendations:
The evaluation plan follows a comprehensive structured format with detailed analysis and implementation guidance:
- # Evaluation Plan for QA+Search Agent
-
- ## 1. Evaluation Requirements
- - **User Input:** "generate an evaluation plan for this qa agent..."
- - **Interpreted Evaluation Requirements:** Evaluate the QA agent's ability to answer questions using web search capabilities...
-
- ## 2. Agent Analysis
- | **Attribute** | **Details** |
- | :-------------------- | :---------------------------------------------------------- |
- | **Agent Name** | QA+Search |
- | **Purpose** | Answer questions by searching the web using Tavily API... |
- | **Core Capabilities** | Web search integration, information synthesis... |
-
- **Agent Architecture Diagram:**
- (Mermaid diagram showing User Query → Agent → WebSearchTool → Tavily API flow)
-
- ## 3. Evaluation Metrics
- ### Answer Quality Score
- - **Evaluation Area:** Final response quality
- - **Method:** LLM-as-Judge (using OutputEvaluator with custom rubric)
- - **Scoring Scale:** 0.0 to 1.0
- - **Pass Threshold:** 0.75 or higher
-
- ## 4. Test Data Generation
- - **Simple Factual Questions**: Questions requiring basic web search...
- - **Multi-Step Reasoning Questions**: Questions requiring synthesis...
-
- ## 5. Evaluation Implementation Design
- ### 5.1 Evaluation Code Structure
- ./ # Repository root directory
- ├── requirements.txt # Consolidated dependencies
- └── eval/ # Evaluation workspace
- ├── README.md # Running instructions
- ├── run_evaluation.py # Strands Evals SDK implementation
- └── results/ # Evaluation outputs
-
- ## 6. Progress Tracking
- ### 6.1 User Requirements Log
- | **Timestamp** | **Source** | **Requirement** |
- | :------------ | :--------- | :-------------- |
- | 2025-12-01 | eval sop | Generate evaluation plan... |
+```markdown
+# Evaluation Plan for QA+Search Agent
+
+## 1. Evaluation Requirements
+- **User Input:** "generate an evaluation plan for this qa agent..."
+- **Interpreted Evaluation Requirements:** Evaluate the QA agent's ability to answer questions using web search capabilities...
+
+## 2. Agent Analysis
+| **Attribute** | **Details** |
+| :-------------------- | :---------------------------------------------------------- |
+| **Agent Name** | QA+Search |
+| **Purpose** | Answer questions by searching the web using Tavily API... |
+| **Core Capabilities** | Web search integration, information synthesis... |
+
+**Agent Architecture Diagram:**
+(Mermaid diagram showing User Query → Agent → WebSearchTool → Tavily API flow)
+
+## 3. Evaluation Metrics
+### Answer Quality Score
+- **Evaluation Area:** Final response quality
+- **Method:** LLM-as-Judge (using OutputEvaluator with custom rubric)
+- **Scoring Scale:** 0.0 to 1.0
+- **Pass Threshold:** 0.75 or higher
+
+## 4. Test Data Generation
+- **Simple Factual Questions**: Questions requiring basic web search...
+- **Multi-Step Reasoning Questions**: Questions requiring synthesis...
+
+## 5. Evaluation Implementation Design
+### 5.1 Evaluation Code Structure
+./ # Repository root directory
+├── requirements.txt # Consolidated dependencies
+└── eval/ # Evaluation workspace
+ ├── README.md # Running instructions
+ ├── run_evaluation.py # Strands Evals SDK implementation
+ └── results/ # Evaluation outputs
+
+## 6. Progress Tracking
+### 6.1 User Requirements Log
+| **Timestamp** | **Source** | **Requirement** |
+| :------------ | :--------- | :-------------- |
+| 2025-12-01 | eval sop | Generate evaluation plan... |
+```
### Generated Test Cases
Test cases are generated in JSONL format with structured metadata:
@@ -288,58 +290,60 @@ Test cases are generated in JSONL format with structured metadata:
The evaluation report provides comprehensive analysis with actionable insights:
- # Agent Evaluation Report for QA+Search Agent
-
- ## Executive Summary
- - **Test Scale**: 2 test cases
- - **Success Rate**: 100%
- - **Overall Score**: 1.000 (Perfect)
- - **Status**: Excellent
- - **Action Priority**: Continue monitoring; consider expanding test coverage...
-
- ## Evaluation Results
- ### Test Case Coverage
- - **Simple Factual Questions (Geography)**: Questions requiring basic factual information...
- - **Simple Factual Questions (Sports/Time-sensitive)**: Questions requiring current event information...
-
- ### Results
- | **Metric** | **Score** | **Target** | **Status** |
- | :---------------------- | :-------- | :--------- | :--------- |
- | Answer Quality Score | 1.00 | 0.75+ | Pass ✅ |
- | Overall Test Pass Rate | 100% | 75%+ | Pass ✅ |
-
- ## Agent Success Analysis
- ### Strengths
- - **Perfect Accuracy**: The agent correctly answered 100% of test questions...
- - **Evidence**: Both test cases scored 1.0/1.0 (perfect scores)
- - **Contributing Factors**: Effective use of web search tool...
-
- ## Agent Failure Analysis
- ### No Failures Detected
- The evaluation identified zero failures across all test cases...
-
- ## Action Items & Recommendations
- ### Expand Test Coverage - Priority 1 (Enhancement)
- - **Description**: Increase the number and diversity of test cases...
- - **Actions**:
- - [ ] Add 5-10 additional test cases covering edge cases
- - [ ] Include multi-step reasoning scenarios
- - [ ] Add test cases for error conditions
-
- ## Artifacts & Reproduction
- ### Reference Materials
- - **Agent Code**: `qa_agent/qa_agent.py`
- - **Test Cases**: `eval/test-cases.jsonl`
- - **Results**: `eval/results/.../evaluation_report.json`
-
- ### Reproduction Steps
- source .venv/bin/activate
- python eval/run_evaluation.py
-
- ## Evaluation Limitations and Improvement
- ### Test Data Improvement
- - **Current Limitations**: Only 2 test cases, limited scenario diversity...
- - **Recommended Improvements**: Increase test case count to 10-20 cases...
+```markdown
+# Agent Evaluation Report for QA+Search Agent
+
+## Executive Summary
+- **Test Scale**: 2 test cases
+- **Success Rate**: 100%
+- **Overall Score**: 1.000 (Perfect)
+- **Status**: Excellent
+- **Action Priority**: Continue monitoring; consider expanding test coverage...
+
+## Evaluation Results
+### Test Case Coverage
+- **Simple Factual Questions (Geography)**: Questions requiring basic factual information...
+- **Simple Factual Questions (Sports/Time-sensitive)**: Questions requiring current event information...
+
+### Results
+| **Metric** | **Score** | **Target** | **Status** |
+| :---------------------- | :-------- | :--------- | :--------- |
+| Answer Quality Score | 1.00 | 0.75+ | Pass ✅ |
+| Overall Test Pass Rate | 100% | 75%+ | Pass ✅ |
+
+## Agent Success Analysis
+### Strengths
+- **Perfect Accuracy**: The agent correctly answered 100% of test questions...
+- **Evidence**: Both test cases scored 1.0/1.0 (perfect scores)
+- **Contributing Factors**: Effective use of web search tool...
+
+## Agent Failure Analysis
+### No Failures Detected
+The evaluation identified zero failures across all test cases...
+
+## Action Items & Recommendations
+### Expand Test Coverage - Priority 1 (Enhancement)
+- **Description**: Increase the number and diversity of test cases...
+- **Actions**:
+ - [ ] Add 5-10 additional test cases covering edge cases
+ - [ ] Include multi-step reasoning scenarios
+ - [ ] Add test cases for error conditions
+
+## Artifacts & Reproduction
+### Reference Materials
+- **Agent Code**: `qa_agent/qa_agent.py`
+- **Test Cases**: `eval/test-cases.jsonl`
+- **Results**: `eval/results/.../evaluation_report.json`
+
+### Reproduction Steps
+source .venv/bin/activate
+python eval/run_evaluation.py
+
+## Evaluation Limitations and Improvement
+### Test Data Improvement
+- **Current Limitations**: Only 2 test cases, limited scenario diversity...
+- **Recommended Improvements**: Increase test case count to 10-20 cases...
+```
## Best Practices
diff --git a/docs/user-guide/evals-sdk/evaluators/overview.md b/docs/user-guide/evals-sdk/evaluators/index.md
similarity index 99%
rename from docs/user-guide/evals-sdk/evaluators/overview.md
rename to docs/user-guide/evals-sdk/evaluators/index.md
index 7e0f27b8..262bed2a 100644
--- a/docs/user-guide/evals-sdk/evaluators/overview.md
+++ b/docs/user-guide/evals-sdk/evaluators/index.md
@@ -313,5 +313,5 @@ def compare_agent_versions(cases: list, agents: dict) -> dict:
## Related Documentation
- [Quickstart Guide](../quickstart.md): Get started with Strands Evals
-- [Simulators Overview](../simulators/overview.md): Learn about simulators
+- [Simulators Overview](../simulators/index.md): Learn about simulators
- [Experiment Generator](../experiment_generator.md): Generate test cases automatically
diff --git a/docs/user-guide/evals-sdk/simulators/overview.md b/docs/user-guide/evals-sdk/simulators/index.md
similarity index 100%
rename from docs/user-guide/evals-sdk/simulators/overview.md
rename to docs/user-guide/evals-sdk/simulators/index.md
diff --git a/docs/user-guide/evals-sdk/simulators/user_simulation.md b/docs/user-guide/evals-sdk/simulators/user_simulation.md
index 68a1419e..8ad609e7 100644
--- a/docs/user-guide/evals-sdk/simulators/user_simulation.md
+++ b/docs/user-guide/evals-sdk/simulators/user_simulation.md
@@ -662,7 +662,7 @@ while user_sim.has_next():
## Related Documentation
-- [Simulators Overview](overview.md): Learn about the ActorSimulator and simulator framework
+- [Simulators Overview](index.md): Learn about the ActorSimulator and simulator framework
- [Quickstart Guide](../quickstart.md): Get started with Strands Evals
- [Helpfulness Evaluator](../evaluators/helpfulness_evaluator.md): Evaluate conversation helpfulness
- [Goal Success Rate Evaluator](../evaluators/goal_success_rate_evaluator.md): Assess goal completion
\ No newline at end of file
diff --git a/mkdocs.yml b/mkdocs.yml
index de3cedb5..feef936c 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -34,6 +34,7 @@ theme:
- content.code.copy
- content.tabs.link
- content.code.select
+ - navigation.indexes
- navigation.instant
- navigation.instant.prefetch
- navigation.instant.progress
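For context, the `navigation.indexes` feature is what makes the `overview.md` → `index.md` renames above meaningful: Material for MkDocs attaches a page named `index.md` to its parent section when it is listed as the section's first item, so the section title itself becomes the landing page. A sketch of the convention (structure assumed from the Material documentation):

```yaml
nav:
  - Tools:
      - user-guide/concepts/tools/index.md  # becomes the section's landing page
      - Creating Custom Tools: user-guide/concepts/tools/custom-tools.md
```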
@@ -97,7 +98,7 @@ nav:
- Structured Output: user-guide/concepts/agents/structured-output.md
- Conversation Management: user-guide/concepts/agents/conversation-management.md
- Tools:
- - Overview: user-guide/concepts/tools/tools_overview.md
+ - Overview: user-guide/concepts/tools/index.md
- Creating Custom Tools: user-guide/concepts/tools/custom-tools.md
- Model Context Protocol (MCP): user-guide/concepts/tools/mcp-tools.md
- Executors: user-guide/concepts/tools/executors.md
@@ -121,7 +122,7 @@ nav:
- CLOVA Studio community: user-guide/concepts/model-providers/clova-studio.md
- FireworksAI community: user-guide/concepts/model-providers/fireworksai.md
- Streaming:
-      - Overview: user-guide/concepts/streaming/overview.md
+ - Overview: user-guide/concepts/streaming/index.md
- Async Iterators: user-guide/concepts/streaming/async-iterators.md
- Callback Handlers: user-guide/concepts/streaming/callback-handlers.md
- Multi-agent:
@@ -163,7 +164,7 @@ nav:
- Getting Started: user-guide/evals-sdk/quickstart.md
- Eval SOP: user-guide/evals-sdk/eval-sop.md
- Evaluators:
- - Overview: user-guide/evals-sdk/evaluators/overview.md
+ - Overview: user-guide/evals-sdk/evaluators/index.md
- Output: user-guide/evals-sdk/evaluators/output_evaluator.md
- Trajectory: user-guide/evals-sdk/evaluators/trajectory_evaluator.md
- Interactions: user-guide/evals-sdk/evaluators/interactions_evaluator.md
@@ -175,7 +176,7 @@ nav:
- Custom: user-guide/evals-sdk/evaluators/custom_evaluator.md
- Experiment Generator: user-guide/evals-sdk/experiment_generator.md
- Simulators:
- - Overview: user-guide/evals-sdk/simulators/overview.md
+ - Overview: user-guide/evals-sdk/simulators/index.md
- User Simulation: user-guide/evals-sdk/simulators/user_simulation.md
- How-To Guides:
- Experiment Management: user-guide/evals-sdk/how-to/experiment_management.md