diff --git a/backends/apple/mps/setup.md b/backends/apple/mps/setup.md
index c8fdfeb98e4..697d93ea659 100644
--- a/backends/apple/mps/setup.md
+++ b/backends/apple/mps/setup.md
@@ -111,7 +111,7 @@ python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp
```

### Profiling:
-1. [Optional] Generate an [ETRecord](./sdk-etrecord.rst) while you're exporting your model.
+1. [Optional] Generate an [ETRecord](./etrecord.rst) while you're exporting your model.
```bash
cd executorch
python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --generate_etrecord -b
diff --git a/docs/source/build-run-coreml.md b/docs/source/build-run-coreml.md
index 52755773eed..6e0cc802df6 100644
--- a/docs/source/build-run-coreml.md
+++ b/docs/source/build-run-coreml.md
@@ -87,7 +87,7 @@ cd executorch

Note that profiling is supported on [macOS](https://developer.apple.com/macos) >= 14.4.

-1. [Optional] Generate an [ETRecord](./sdk-etrecord.rst) when exporting your model.
+1. [Optional] Generate an [ETRecord](./etrecord.rst) when exporting your model.

```bash
cd executorch

@@ -108,7 +108,7 @@ cd executorch
./coreml_executor_runner --model_path mv3_coreml_all.pte --profile_model --etdump_path etdump.etdp
```

-4. Create an instance of the [Inspector API](./sdk-inspector.rst) by passing in the [ETDump](./sdk-etdump.md) you have sourced from the runtime along with the optionally generated [ETRecord](./sdk-etrecord.rst) from step 1 or execute the following command in your terminal to display the profiling data table.
+4. Create an instance of the [Inspector API](./sdk-inspector.rst) by passing in the [ETDump](./sdk-etdump.md) you have sourced from the runtime along with the optionally generated [ETRecord](./etrecord.rst) from step 1, or execute the following command in your terminal to display the profiling data table.
```bash
python examples/apple/coreml/scripts/inspector_cli.py --etdump_path etdump.etdp --etrecord_path mv3_coreml.bin
```
diff --git a/docs/source/bundled-io.md b/docs/source/bundled-io.md
new file mode 100644
index 00000000000..776c37a5da3
--- /dev/null
+++ b/docs/source/bundled-io.md
@@ -0,0 +1,554 @@
+# Bundled Program -- a Tool for ExecuTorch Model Validation
+
+## Introduction
+`BundledProgram` is a wrapper around the core ExecuTorch program designed to help users wrap test cases with the model they deploy. `BundledProgram` is not a core part of the program and is not needed for its execution, but it is particularly useful for other use cases, such as model correctness evaluation, including e2e testing during the model bring-up process.
+
+Overall, the procedure can be broken into two stages, and we support both:
+
+* **Emit stage**: Bundling the test I/O cases along with the ExecuTorch program and serializing them into a flatbuffer.
+* **Runtime stage**: Accessing, executing, and verifying the bundled test cases at runtime.
+
+## Emit stage
+This stage mainly focuses on creating a `BundledProgram` and dumping it out to disk as a flatbuffer file. The main procedure is as follows:
+1. Create a model and emit its ExecuTorch program.
+2. Construct a `List[MethodTestSuite]` to record all test cases that need to be bundled.
+3. Generate a `BundledProgram` from the emitted model and the `List[MethodTestSuite]`.
+4. Serialize the `BundledProgram` and dump it out to disk.
+
+### Step 1: Create a Model and Emit its ExecuTorch Program.
+
+An ExecuTorch program can be emitted from the user's model using the ExecuTorch APIs. Follow the [Generate Sample ExecuTorch program](./getting-started-setup.md) or [Exporting to ExecuTorch tutorial](./tutorials/export-to-executorch-tutorial).
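+
+For example, a minimal export sketch might look like the following (the model and inputs here are illustrative placeholders; the full flow appears in the Emit Example below):
+
+```python
+import torch
+from torch.export import export, export_for_training
+
+from executorch.exir import to_edge
+
+# Any eager PyTorch model and example inputs for graph capture.
+model = torch.nn.Linear(4, 2)
+example_inputs = (torch.randn(1, 4),)
+
+# Capture the FX graph and emit it as an ExecuTorch program.
+method_graph = export(
+    export_for_training(model, example_inputs).module(),
+    example_inputs,
+)
+et_program = to_edge(method_graph).to_executorch()
+```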
+
+### Step 2: Construct `List[MethodTestSuite]` to hold test info
+
+In `BundledProgram`, we create two new classes, `MethodTestCase` and `MethodTestSuite`, to hold the essential info for ExecuTorch program verification.
+
+`MethodTestCase` represents a single test case. Each `MethodTestCase` contains the inputs and expected outputs for a single execution.
+
+:::{dropdown} `MethodTestCase`
+
+```{eval-rst}
+.. autofunction:: executorch.devtools.bundled_program.config.MethodTestCase.__init__
+    :noindex:
+```
+:::
+
+`MethodTestSuite` contains all of the testing info for a single method, including a string representing the method name and a `List[MethodTestCase]` of all its test cases:
+
+:::{dropdown} `MethodTestSuite`
+
+```{eval-rst}
+.. autofunction:: executorch.devtools.bundled_program.config.MethodTestSuite
+    :noindex:
+```
+:::
+
+Since each model may have multiple inference methods, we need to generate a `List[MethodTestSuite]` to hold all of the essential info, as sketched below.
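+
+For instance, a single-suite construction might look like this minimal sketch (`model` here is assumed to be the eager module under test, with a `forward` method taking two 2x2 tensors):
+
+```python
+import torch
+
+from executorch.devtools.bundled_program.config import MethodTestCase, MethodTestSuite
+
+# One input set matching the method's signature, with the eager-mode output as the reference.
+test_input = [torch.rand(2, 2), torch.rand(2, 2)]
+test_case = MethodTestCase(
+    inputs=test_input,
+    expected_outputs=(model(*test_input),),
+)
+
+# A suite bundles all test cases for one method, identified by its name.
+method_test_suites = [
+    MethodTestSuite(method_name="forward", test_cases=[test_case]),
+]
+```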
+
+### Step 3: Generate `BundledProgram`
+
+We provide the `BundledProgram` class under `executorch/devtools/bundled_program/core.py` to bundle an `ExecutorchProgram`-like variable, including `ExecutorchProgram`, `MultiMethodExecutorchProgram`, or `ExecutorchProgramManager`, with the `List[MethodTestSuite]`:
+
+:::{dropdown} `BundledProgram`
+
+```{eval-rst}
+.. autofunction:: executorch.devtools.bundled_program.core.BundledProgram.__init__
+    :noindex:
+```
+:::
+
+The constructor of `BundledProgram` performs an internal sanity check to see whether the given `List[MethodTestSuite]` matches the given program's requirements. Specifically:
+1. The method name of each `MethodTestSuite` in the `List[MethodTestSuite]` should also exist in the program. Note that there is no need to set test cases for every method in the program.
+2. The metadata of each test case must meet the requirements of the corresponding inference method's inputs.
+
+### Step 4: Serialize `BundledProgram` to Flatbuffer.
+
+To serialize the `BundledProgram` so that the runtime APIs can use it, we provide two APIs, both under `executorch/devtools/bundled_program/serialize/__init__.py`.
+
+:::{dropdown} Serialize and Deserialize
+
+```{eval-rst}
+.. currentmodule:: executorch.devtools.bundled_program.serialize
+.. autofunction:: serialize_from_bundled_program_to_flatbuffer
+    :noindex:
+```
+
+```{eval-rst}
+.. currentmodule:: executorch.devtools.bundled_program.serialize
+.. autofunction:: deserialize_from_flatbuffer_to_bundled_program
+    :noindex:
+```
+:::
+
+### Emit Example
+
+Here is a flow highlighting how to generate a `BundledProgram` given a PyTorch model and the representative inputs we want to bundle with it.
+
+```python
+import torch
+
+from executorch.exir import to_edge
+from executorch.devtools import BundledProgram
+
+from executorch.devtools.bundled_program.config import MethodTestCase, MethodTestSuite
+from executorch.devtools.bundled_program.serialize import (
+    serialize_from_bundled_program_to_flatbuffer,
+)
+from torch.export import export, export_for_training
+
+
+# Step 1: ExecuTorch Program Export
+class SampleModel(torch.nn.Module):
+    """An example model with a single inference method that takes multiple inputs and returns a single output."""
+
+    def __init__(self) -> None:
+        super().__init__()
+        self.a: torch.Tensor = 3 * torch.ones(2, 2, dtype=torch.int32)
+        self.b: torch.Tensor = 2 * torch.ones(2, 2, dtype=torch.int32)
+
+    def forward(self, x: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
+        z = x.clone()
+        torch.mul(self.a, x, out=z)
+        y = x.clone()
+        torch.add(z, self.b, out=y)
+        torch.add(y, q, out=y)
+        return y
+
+
+# Inference method name of SampleModel we want to bundle test cases to.
+# Note that we do not need to bundle test cases for every inference method.
+method_name = "forward"
+model = SampleModel()
+
+# Inputs for graph capture.
+capture_input = (
+    (torch.rand(2, 2) - 0.5).to(dtype=torch.int32),
+    (torch.rand(2, 2) - 0.5).to(dtype=torch.int32),
+)
+
+# Export the method's FX Graph.
+method_graph = export(
+    export_for_training(model, capture_input).module(),
+    capture_input,
+)
+
+
+# Emit the traced method into an ET Program.
+et_program = to_edge(method_graph).to_executorch()
+
+# Step 2: Construct MethodTestSuite for Each Method
+
+# Prepare the Test Inputs.
+
+# Number of input sets to be verified
+n_input = 10
+
+# Input sets to be verified.
+inputs = [
+    # Each list below is an individual input set.
+    # The number of inputs, and the dtype and size of each input, follow the Program's spec.
+    [
+        (torch.rand(2, 2) - 0.5).to(dtype=torch.int32),
+        (torch.rand(2, 2) - 0.5).to(dtype=torch.int32),
+    ]
+    for _ in range(n_input)
+]
+
+# Generate Test Suites
+method_test_suites = [
+    MethodTestSuite(
+        method_name=method_name,
+        test_cases=[
+            MethodTestCase(
+                inputs=input,
+                expected_outputs=(getattr(model, method_name)(*input),),
+            )
+            for input in inputs
+        ],
+    ),
+]
+
+# Step 3: Generate BundledProgram
+bundled_program = BundledProgram(et_program, method_test_suites)
+
+# Step 4: Serialize BundledProgram to flatbuffer.
+serialized_bundled_program = serialize_from_bundled_program_to_flatbuffer(
+    bundled_program
+)
+save_path = "bundled_program.bpte"
+with open(save_path, "wb") as f:
+    f.write(serialized_bundled_program)
+
+```
+
+We can also regenerate a `BundledProgram` from its flatbuffer file if needed:
+
+```python
+from executorch.devtools.bundled_program.serialize import deserialize_from_flatbuffer_to_bundled_program
+
+save_path = "bundled_program.bpte"
+with open(save_path, "rb") as f:
+    serialized_bundled_program = f.read()
+
+regenerate_bundled_program = deserialize_from_flatbuffer_to_bundled_program(serialized_bundled_program)
+```
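+
+As a quick sanity check, we can compare the method names before and after the round trip (a minimal sketch, assuming `BundledProgram` exposes the `method_test_suites` it was constructed with):
+
+```python
+original_names = [suite.method_name for suite in bundled_program.method_test_suites]
+restored_names = [suite.method_name for suite in regenerate_bundled_program.method_test_suites]
+assert original_names == restored_names
+```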
+
+## Runtime Stage
+This stage mainly focuses on executing the model with the bundled inputs and comparing the model's output with the bundled expected output. We provide multiple APIs to handle the key parts of it.
+
+
+### Get the ExecuTorch Program Pointer from the `BundledProgram` Buffer
+We need a pointer to the ExecuTorch program to do the execution. To unify the process of loading and executing `BundledProgram` and Program flatbuffers, we provide an API:
+
+:::{dropdown} `GetProgramData`
+
+```{eval-rst}
+.. doxygenfunction:: torch::executor::bundled_program::GetProgramData
+```
+:::
+
+Here's an example of how to use the `GetProgramData` API:
+```c++
+// Assume that the user has read the contents of the file into file_data using
+// whatever method works best for their application. The file could contain
+// either BundledProgram data or Program data.
+void* file_data = ...;
+size_t file_data_len = ...;
+
+// If file_data contains a BundledProgram, GetProgramData() will return a
+// pointer to the Program data embedded inside it. Otherwise it will return
+// file_data, which already pointed to Program data.
+const void* program_ptr;
+size_t program_len;
+status = torch::executor::bundled_program::GetProgramData(
+    file_data, file_data_len, &program_ptr, &program_len);
+ET_CHECK_MSG(
+    status == Error::Ok,
+    "GetProgramData() failed with status 0x%" PRIx32,
+    status);
+```
+
+### Load a Bundled Input into a Method
+To execute the program on a bundled input, we need to load the bundled input into the method. Here we provide an API called `torch::executor::bundled_program::LoadBundledInput`:
+
+:::{dropdown} `LoadBundledInput`
+
+```{eval-rst}
+.. doxygenfunction:: torch::executor::bundled_program::LoadBundledInput
+```
+:::
+
+### Verify the Method's Output.
+We call `torch::executor::bundled_program::VerifyResultWithBundledExpectedOutput` to verify the method's output against the bundled expected outputs. Here are the details of this API:
+
+:::{dropdown} `VerifyResultWithBundledExpectedOutput`
+
+```{eval-rst}
+.. doxygenfunction:: torch::executor::bundled_program::VerifyResultWithBundledExpectedOutput
+```
+:::
+
+
+### Runtime Example
+
+Here we provide an example of how to run the bundled program step by step. Most of the code is borrowed from [executor_runner](https://github.com/pytorch/executorch/blob/main/examples/devtools/example_runner/example_runner.cpp); please review that file if you need more info and context:
+
+```c++
+// method_name is the name of the method we want to test.
+// memory_manager is the executor::MemoryManager variable for executor memory allocation.
+// program is the ExecuTorch program.
+Result<Method> method = program->load_method(method_name, &memory_manager);
+
+ET_CHECK_MSG(
+    method.ok(),
+    "load_method() failed with status 0x%" PRIx32,
+    method.error());
+
+// Load the testset_idx-th input in the buffer to the plan.
+status = torch::executor::bundled_program::LoadBundledInput(
+    *method,
+    program_data.bundled_program_data(),
+    FLAGS_testset_idx);
+ET_CHECK_MSG(
+    status == Error::Ok,
+    "LoadBundledInput failed with status 0x%" PRIx32,
+    status);
+
+// Execute the plan.
+status = method->execute();
+ET_CHECK_MSG(
+    status == Error::Ok,
+    "method->execute() failed with status 0x%" PRIx32,
+    status);
+
+// Verify the result.
+status = torch::executor::bundled_program::VerifyResultWithBundledExpectedOutput(
+    *method,
+    program_data.bundled_program_data(),
+    FLAGS_testset_idx,
+    FLAGS_rtol,
+    FLAGS_atol);
+ET_CHECK_MSG(
+    status == Error::Ok,
+    "Bundle verification failed with status 0x%" PRIx32,
+    status);
+
+```
+
+## Common Errors
+
+Errors will be raised if the `List[MethodTestSuite]` doesn't match the `Program`. Here are two common situations:
+
+### Test input doesn't match the model's requirements.
+
+Each inference method of a PyTorch model has its own requirements for its inputs, such as the number of inputs and the dtype of each input. `BundledProgram` will raise an error if the test inputs do not meet those requirements.
+
+Here's an example where the dtype of a test input does not meet the model's requirements:
+
+```python
+import torch
+
+from executorch.exir import to_edge
+from executorch.devtools import BundledProgram
+
+from executorch.devtools.bundled_program.config import MethodTestCase, MethodTestSuite
+from torch.export import export, export_for_training
+
+
+class Module(torch.nn.Module):
+    def __init__(self):
+        super().__init__()
+        self.a = 3 * torch.ones(2, 2, dtype=torch.float)
+        self.b = 2 * torch.ones(2, 2, dtype=torch.float)
+
+    def forward(self, x):
+        out_1 = torch.ones(2, 2, dtype=torch.float)
+        out_2 = torch.ones(2, 2, dtype=torch.float)
+        torch.mul(self.a, x, out=out_1)
+        torch.add(out_1, self.b, out=out_2)
+        return out_2
+
+
+model = Module()
+method_names = ["forward"]
+
+inputs = (torch.ones(2, 2, dtype=torch.float), )
+
+# Export the method, identified by its name, into an FX Graph.
+method_graph = export(
+    export_for_training(model, inputs).module(),
+    inputs,
+)
+
+# Emit the traced method into an ET Program.
+et_program = to_edge(method_graph).to_executorch()
+
+# Number of input sets to be verified
+n_input = 10
+
+# Input sets to be verified for each inference method.
+# To simplify, here we create the same inputs for all methods.
+inputs = {
+    # Inference method name corresponding to its test cases.
+    m_name: [
+        # NOTE: the executorch program needs torch.float, but here the input is torch.int
+        [
+            torch.randint(-5, 5, (2, 2), dtype=torch.int),
+        ]
+        for _ in range(n_input)
+    ]
+    for m_name in method_names
+}
+
+# Generate Test Suites
+method_test_suites = [
+    MethodTestSuite(
+        method_name=m_name,
+        test_cases=[
+            MethodTestCase(
+                inputs=input,
+                expected_outputs=(getattr(model, m_name)(*input),),
+            )
+            for input in inputs[m_name]
+        ],
+    )
+    for m_name in method_names
+]
+
+# Generate BundledProgram
+
+bundled_program = BundledProgram(et_program, method_test_suites)
+```
+
+:::{dropdown} Raised Error
+
+```
+The input tensor tensor([[-2, 0],
+        [-2, -1]], dtype=torch.int32) dtype shall be torch.float32, but now is torch.int32
+---------------------------------------------------------------------------
+AssertionError                            Traceback (most recent call last)
+Cell In[1], line 72
+     56 method_test_suites = [
+     57     MethodTestSuite(
+     58         method_name=m_name,
+   (...)
+     67     for m_name in method_names
+     68 ]
+     70 # Step 3: Generate BundledProgram
+---> 72 bundled_program = create_bundled_program(program, method_test_suites)
+File /executorch/devtools/bundled_program/core.py:276, in create_bundled_program(program, method_test_suites)
+    264 """Create bp_schema.BundledProgram by bundling the given program and method_test_suites together.
+    265
+    266 Args:
+   (...)
+    271     The `BundledProgram` variable contains given ExecuTorch program and test cases.
+    272 """
+    274 method_test_suites = sorted(method_test_suites, key=lambda x: x.method_name)
+--> 276 assert_valid_bundle(program, method_test_suites)
+    278 bundled_method_test_suites: List[bp_schema.BundledMethodTestSuite] = []
+    280 # Emit data and metadata of bundled tensor
+File /executorch/devtools/bundled_program/core.py:219, in assert_valid_bundle(program, method_test_suites)
+    215 # type of tensor input should match execution plan
+    216 if type(cur_plan_test_inputs[j]) == torch.Tensor:
+    217 # pyre-fixme[16]: Undefined attribute [16]: Item `bool` of `typing.Union[bool, float, int, torch._tensor.Tensor]`
+    218 # has no attribute `dtype`.
+--> 219 assert cur_plan_test_inputs[j].dtype == get_input_dtype(
+    220     program, program_plan_id, j
+    221 ), "The input tensor {} dtype shall be {}, but now is {}".format(
+    222     cur_plan_test_inputs[j],
+    223     get_input_dtype(program, program_plan_id, j),
+    224     cur_plan_test_inputs[j].dtype,
+    225 )
+    226 elif type(cur_plan_test_inputs[j]) in (
+    227     int,
+    228     bool,
+    229     float,
+    230 ):
+    231     assert type(cur_plan_test_inputs[j]) == get_input_type(
+    232         program, program_plan_id, j
+    233     ), "The input primitive dtype shall be {}, but now is {}".format(
+    234         get_input_type(program, program_plan_id, j),
+    235         type(cur_plan_test_inputs[j]),
+    236     )
+AssertionError: The input tensor tensor([[-2, 0],
+        [-2, -1]], dtype=torch.int32) dtype shall be torch.float32, but now is torch.int32
+
+```
+
+:::
+
+### Method name in `MethodTestSuite` does not exist.
+
+Another common error is a method name in one of the `MethodTestSuite`s that does not exist in the model. `BundledProgram` will raise an error and report the nonexistent method name:
+
+```python
+import torch
+
+from executorch.exir import to_edge
+from executorch.devtools import BundledProgram
+
+from executorch.devtools.bundled_program.config import MethodTestCase, MethodTestSuite
+from torch.export import export, export_for_training
+
+
+class Module(torch.nn.Module):
+    def __init__(self):
+        super().__init__()
+        self.a = 3 * torch.ones(2, 2, dtype=torch.float)
+        self.b = 2 * torch.ones(2, 2, dtype=torch.float)
+
+    def forward(self, x):
+        out_1 = torch.ones(2, 2, dtype=torch.float)
+        out_2 = torch.ones(2, 2, dtype=torch.float)
+        torch.mul(self.a, x, out=out_1)
+        torch.add(out_1, self.b, out=out_2)
+        return out_2
+
+
+model = Module()
+method_names = ["forward"]
+
+inputs = (torch.ones(2, 2, dtype=torch.float),)
+
+# Export the method, identified by its name, into an FX Graph.
+method_graph = export(
+    export_for_training(model, inputs).module(),
+    inputs,
+)
+
+# Emit the traced method into an ET Program.
+et_program = to_edge(method_graph).to_executorch()
+
+# Number of input sets to be verified
+n_input = 10
+
+# Input sets to be verified for each inference method.
+# To simplify, here we create the same inputs for all methods.
+inputs = {
+    # Inference method name corresponding to its test cases.
+    m_name: [
+        [
+            torch.randint(-5, 5, (2, 2), dtype=torch.float),
+        ]
+        for _ in range(n_input)
+    ]
+    for m_name in method_names
+}
+
+# Generate Test Suites
+method_test_suites = [
+    MethodTestSuite(
+        method_name=m_name,
+        test_cases=[
+            MethodTestCase(
+                inputs=input,
+                expected_outputs=(getattr(model, m_name)(*input),),
+            )
+            for input in inputs[m_name]
+        ],
+    )
+    for m_name in method_names
+]
+
+# NOTE: MISSING_METHOD_NAME is not an inference method in the above model.
+method_test_suites[0].method_name = "MISSING_METHOD_NAME"
+
+# Generate BundledProgram
+bundled_program = BundledProgram(et_program, method_test_suites)
+
+```
+
+:::{dropdown} Raised Error
+
+```
+All method names in bundled config should be found in program.execution_plan, but {'MISSING_METHOD_NAME'} does not include.
+---------------------------------------------------------------------------
+AssertionError                            Traceback (most recent call last)
+Cell In[3], line 73
+     70 method_test_suites[0].method_name = "MISSING_METHOD_NAME"
+     72 # Generate BundledProgram
+---> 73 bundled_program = create_bundled_program(program, method_test_suites)
+File /executorch/devtools/bundled_program/core.py:276, in create_bundled_program(program, method_test_suites)
+    264 """Create bp_schema.BundledProgram by bundling the given program and method_test_suites together.
+    265
+    266 Args:
+   (...)
+    271     The `BundledProgram` variable contains given ExecuTorch program and test cases.
+    272 """
+    274 method_test_suites = sorted(method_test_suites, key=lambda x: x.method_name)
+--> 276 assert_valid_bundle(program, method_test_suites)
+    278 bundled_method_test_suites: List[bp_schema.BundledMethodTestSuite] = []
+    280 # Emit data and metadata of bundled tensor
+File /executorch/devtools/bundled_program/core.py:141, in assert_valid_bundle(program, method_test_suites)
+    138 method_name_of_program = {e.name for e in program.execution_plan}
+    139 method_name_of_test_suites = {t.method_name for t in method_test_suites}
+--> 141 assert method_name_of_test_suites.issubset(
+    142     method_name_of_program
+    143 ), f"All method names in bundled config should be found in program.execution_plan, \
+    144      but {str(method_name_of_test_suites - method_name_of_program)} does not include."
+    146 # check if method_tesdt_suites has been sorted in ascending alphabetical order of method name.
+    147 for test_suite_id in range(1, len(method_test_suites)):
+AssertionError: All method names in bundled config should be found in program.execution_plan, but {'MISSING_METHOD_NAME'} does not include.
+```
+:::
diff --git a/docs/source/compiler-delegate-and-partitioner.md b/docs/source/compiler-delegate-and-partitioner.md
index c82af7d98fe..b5f7e0d3d8a 100644
--- a/docs/source/compiler-delegate-and-partitioner.md
+++ b/docs/source/compiler-delegate-and-partitioner.md
@@ -129,7 +129,7 @@ static auto success_with_compiler = register_backend(backend);

## Developer Tools Integration: Debuggability

-Providing consistent debugging experience, be it for runtime failures or performance profiling, is important. ExecuTorch employs native Developer Tools for this purpose, which enables correlating program instructions to original PyTorch code, via debug handles. You can read more about it [here](./sdk-etrecord).
+Providing a consistent debugging experience, be it for runtime failures or performance profiling, is important. ExecuTorch employs native Developer Tools for this purpose, which enable correlating program instructions to the original PyTorch code via debug handles. You can read more about it [here](./etrecord).

Delegated programs or subgraphs are opaque to the ExecuTorch runtime and appear as a special `call_delegate` instruction, which asks the corresponding backend to handle the execution of the subgraph or program. Due to the opaque nature of backend delegates, the native Developer Tools do not have visibility into delegated programs. Thus the debugging experience, functional or performance, of delegated execution suffers significantly compared to its non-delegated counterpart.
diff --git a/docs/source/devtools-overview.md b/docs/source/devtools-overview.md
index 771e8db6b95..e18b7f16c64 100644
--- a/docs/source/devtools-overview.md
+++ b/docs/source/devtools-overview.md
@@ -27,7 +27,7 @@ ETRecord (ExecuTorch Record) is an artifact generated during the export process

To draw a rough equivalence to conventional software development ETRecord can be considered as the binary built with debug symbols that is used for debugging in GNU Project debugger (gdb).

-More details are available in the [ETRecord documentation](sdk-etrecord.rst) on how to generate and store an ETRecord.
+More details are available in the [ETRecord documentation](etrecord.rst) on how to generate and store an ETRecord.

### ETDump
ETDump (ExecuTorch Dump) is the binary blob that is generated by the runtime after running a model. Similarly as above, to draw a rough equivalence to conventional software development, ETDump can be considered as the coredump of ExecuTorch, but in this case within ETDump we store all the performance and debug data that was generated by the runtime during model execution.
diff --git a/docs/source/etrecord.rst b/docs/source/etrecord.rst
new file mode 100644
index 00000000000..63546f43ca6
--- /dev/null
+++ b/docs/source/etrecord.rst
@@ -0,0 +1,40 @@
+Prerequisite | ETRecord - ExecuTorch Record
+===========================================
+
+Overview
+--------
+
+``ETRecord`` is intended to be the debug artifact that is generated by
+users ahead of time (when they export their model to run on ExecuTorch).
+To draw a rough equivalence to conventional software development,
+``ETRecord`` can be considered as the binary built with debug symbols
+that is used for debugging in GNU Debugger (gdb). It is expected that
+the user will supply this to the ExecuTorch Developer Tools in order for
+them to debug and visualize their model.
+
+``ETRecord`` contains numerous components, such as:
+
+* Edge dialect graph with debug handles
+* Delegate debug handle maps
+
+The ``ETRecord`` object itself is intended to be opaque to users, and they should not access any components inside it directly.
+It should be provided to the `Inspector API `__ to link performance and debug data sourced from the runtime back to the Python source code.
+
+Generating an ``ETRecord``
+--------------------------
+
+The user should use the following API to generate an ``ETRecord`` file. They
+will be expected to provide the Edge Dialect program (returned by the call to ``to_edge()``),
+the ExecuTorch program (returned by the call to ``to_executorch()``), and, optionally, the models that
+they are interested in working with via our tooling.
+
+.. warning::
+   Users should do a deepcopy of the output of ``to_edge()`` and pass in the deepcopy to the ``generate_etrecord`` API. This is needed because the subsequent call, ``to_executorch()``, does an in-place mutation and will lose debug data in the process.
+
+.. currentmodule:: executorch.devtools.etrecord._etrecord
+.. autofunction:: generate_etrecord
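+
+For example, the call might look like the following sketch (``model`` and
+``example_inputs`` are illustrative placeholders):
+
+.. code-block:: python
+
+   import copy
+
+   from executorch.devtools import generate_etrecord
+   from executorch.exir import to_edge
+   from torch.export import export
+
+   edge_program = to_edge(export(model, example_inputs))
+   # Deepcopy before to_executorch(), which mutates the edge program in place.
+   edge_program_copy = copy.deepcopy(edge_program)
+   et_program = edge_program.to_executorch()
+
+   generate_etrecord("etrecord.bin", edge_program_copy, et_program)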
+
+Using an ``ETRecord``
+---------------------
+
+Pass the ``ETRecord`` as an optional argument into the `Inspector API `__ to access this data and do post-run analysis.
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 20f0c944820..a5b2b4af2e7 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -201,8 +201,8 @@ Topics in this section will help you get started with ExecuTorch.
   :hidden:

   devtools-overview
-   sdk-bundled-io
-   sdk-etrecord
+   bundled-io
+   etrecord
   sdk-etdump
   sdk-profiling
   sdk-debugging
diff --git a/docs/source/llm/getting-started.md b/docs/source/llm/getting-started.md
index 771bf489a94..1cfeab6e5e6 100644
--- a/docs/source/llm/getting-started.md
+++ b/docs/source/llm/getting-started.md
@@ -752,7 +752,7 @@ Through the ExecuTorch Developer Tools, users are able to profile model executio

##### ETRecord generation (Optional)

-An ETRecord is an artifact generated at the time of export that contains model graphs and source-level metadata linking the ExecuTorch program to the original PyTorch model. You can view all profiling events without an ETRecord, though with an ETRecord, you will also be able to link each event to the types of operators being executed, module hierarchy, and stack traces of the original PyTorch source code. For more information, see [the ETRecord docs](../sdk-etrecord.md).
+An ETRecord is an artifact generated at the time of export that contains model graphs and source-level metadata linking the ExecuTorch program to the original PyTorch model. You can view all profiling events without an ETRecord, though with an ETRecord, you will also be able to link each event to the types of operators being executed, module hierarchy, and stack traces of the original PyTorch source code. For more information, see [the ETRecord docs](../etrecord.md).

In your export script, after calling `to_edge()` and `to_executorch()`, call `generate_etrecord()` with the `EdgeProgramManager` from `to_edge()` and the `ExecuTorchProgramManager` from `to_executorch()`. Make sure to copy the `EdgeProgramManager`, as the call to `to_backend()` mutates the graph in-place.
diff --git a/docs/source/sdk-bundled-io.md b/docs/source/sdk-bundled-io.md
index 776c37a5da3..488ade7bac8 100644
--- a/docs/source/sdk-bundled-io.md
+++ b/docs/source/sdk-bundled-io.md
@@ -1,554 +1,3 @@
 # Bundled Program -- a Tool for ExecuTorch Model Validation

-## Introduction
-`BundledProgram` is a wrapper around the core ExecuTorch program designed to help users wrapping test cases with the model they deploy. `BundledProgram` is not necessarily a core part of the program and not needed for its execution, but is particularly important for various other use-cases, such as model correctness evaluation, including e2e testing during the model bring-up process.
-
-Overall, the procedure can be broken into two stages, and in each stage we are supporting:
-
-* **Emit stage**: Bundling the test I/O cases along with the ExecuTorch program, serializing into flatbuffer.
-* **Runtime stage**: Accessing, executing, and verifying the bundled test cases during runtime.
-
-## Emit stage
-This stage mainly focuses on the creation of a `BundledProgram` and dumping it out to the disk as a flatbuffer file. The main procedure is as follow:
-1. Create a model and emit its ExecuTorch program.
-2. Construct a `List[MethodTestSuite]` to record all test cases that needs to be bundled.
-3. Generate `BundledProgram` by using the emited model and `List[MethodTestSuite]`.
-4. Serialize the `BundledProgram` and dump it out to the disk.
-
-### Step 1: Create a Model and Emit its ExecuTorch Program.
-
-ExecuTorch Program can be emitted from user's model by using ExecuTorch APIs. Follow the [Generate Sample ExecuTorch program](./getting-started-setup.md) or [Exporting to ExecuTorch tutorial](./tutorials/export-to-executorch-tutorial).
- -### Step 2: Construct `List[MethodTestSuite]` to hold test info - -In `BundledProgram`, we create two new classes, `MethodTestCase` and `MethodTestSuite`, to hold essential info for ExecuTorch program verification. - -`MethodTestCase` represents a single testcase. Each `MethodTestCase` contains inputs and expected outputs for a single execution. - -:::{dropdown} `MethodTestCase` - -```{eval-rst} -.. autofunction:: executorch.devtools.bundled_program.config.MethodTestCase.__init__ - :noindex: -``` -::: - -`MethodTestSuite` contains all testing info for single method, including a str representing method name, and a `List[MethodTestCase]` for all testcases: - -:::{dropdown} `MethodTestSuite` - -```{eval-rst} -.. autofunction:: executorch.devtools.bundled_program.config.MethodTestSuite - :noindex: -``` -::: - -Since each model may have multiple inference methods, we need to generate `List[MethodTestSuite]` to hold all essential infos. - - -### Step 3: Generate `BundledProgram` - -We provide `BundledProgram` class under `executorch/devtools/bundled_program/core.py` to bundled the `ExecutorchProgram`-like variable, including - `ExecutorchProgram`, `MultiMethodExecutorchProgram` or `ExecutorchProgramManager`, with the `List[MethodTestSuite]`: - -:::{dropdown} `BundledProgram` - -```{eval-rst} -.. autofunction:: executorch.devtools.bundled_program.core.BundledProgram.__init__ - :noindex: -``` -::: - -Construtor of `BundledProgram `will do sannity check internally to see if the given `List[MethodTestSuite]` matches the given Program's requirements. Specifically: -1. The method_names of each `MethodTestSuite` in `List[MethodTestSuite]` for should be also in program. Please notice that it is no need to set testcases for every method in the Program. -2. The metadata of each testcase should meet the requirement of the coresponding inference methods input. - -### Step 4: Serialize `BundledProgram` to Flatbuffer. - -To serialize `BundledProgram` to make runtime APIs use it, we provide two APIs, both under `executorch/devtools/bundled_program/serialize/__init__.py`. - -:::{dropdown} Serialize and Deserialize - -```{eval-rst} -.. currentmodule:: executorch.devtools.bundled_program.serialize -.. autofunction:: serialize_from_bundled_program_to_flatbuffer - :noindex: -``` - -```{eval-rst} -.. currentmodule:: executorch.devtools.bundled_program.serialize -.. autofunction:: deserialize_from_flatbuffer_to_bundled_program - :noindex: -``` -::: - -### Emit Example - -Here is a flow highlighting how to generate a `BundledProgram` given a PyTorch model and the representative inputs we want to test it along with. - -```python -import torch - -from executorch.exir import to_edge -from executorch.devtools import BundledProgram - -from executorch.devtools.bundled_program.config import MethodTestCase, MethodTestSuite -from executorch.devtools.bundled_program.serialize import ( - serialize_from_bundled_program_to_flatbuffer, -) -from torch.export import export, export_for_training - - -# Step 1: ExecuTorch Program Export -class SampleModel(torch.nn.Module): - """An example model with multi-methods. 
Each method has multiple input and single output""" - - def __init__(self) -> None: - super().__init__() - self.a: torch.Tensor = 3 * torch.ones(2, 2, dtype=torch.int32) - self.b: torch.Tensor = 2 * torch.ones(2, 2, dtype=torch.int32) - - def forward(self, x: torch.Tensor, q: torch.Tensor) -> torch.Tensor: - z = x.clone() - torch.mul(self.a, x, out=z) - y = x.clone() - torch.add(z, self.b, out=y) - torch.add(y, q, out=y) - return y - - -# Inference method name of SampleModel we want to bundle testcases to. -# Notices that we do not need to bundle testcases for every inference methods. -method_name = "forward" -model = SampleModel() - -# Inputs for graph capture. -capture_input = ( - (torch.rand(2, 2) - 0.5).to(dtype=torch.int32), - (torch.rand(2, 2) - 0.5).to(dtype=torch.int32), -) - -# Export method's FX Graph. -method_graph = export( - export_for_training(model, capture_input).module(), - capture_input, -) - - -# Emit the traced method into ET Program. -et_program = to_edge(method_graph).to_executorch() - -# Step 2: Construct MethodTestSuite for Each Method - -# Prepare the Test Inputs. - -# Number of input sets to be verified -n_input = 10 - -# Input sets to be verified. -inputs = [ - # Each list below is a individual input set. - # The number of inputs, dtype and size of each input follow Program's spec. - [ - (torch.rand(2, 2) - 0.5).to(dtype=torch.int32), - (torch.rand(2, 2) - 0.5).to(dtype=torch.int32), - ] - for _ in range(n_input) -] - -# Generate Test Suites -method_test_suites = [ - MethodTestSuite( - method_name=method_name, - test_cases=[ - MethodTestCase( - inputs=input, - expected_outputs=(getattr(model, method_name)(*input), ), - ) - for input in inputs - ], - ), -] - -# Step 3: Generate BundledProgram -bundled_program = BundledProgram(et_program, method_test_suites) - -# Step 4: Serialize BundledProgram to flatbuffer. -serialized_bundled_program = serialize_from_bundled_program_to_flatbuffer( - bundled_program -) -save_path = "bundled_program.bpte" -with open(save_path, "wb") as f: - f.write(serialized_bundled_program) - -``` - -We can also regenerate `BundledProgram` from flatbuffer file if needed: - -```python -from executorch.devtools.bundled_program.serialize import deserialize_from_flatbuffer_to_bundled_program -save_path = "bundled_program.bpte" -with open(save_path, "rb") as f: - serialized_bundled_program = f.read() - -regenerate_bundled_program = deserialize_from_flatbuffer_to_bundled_program(serialized_bundled_program) -``` - -## Runtime Stage -This stage mainly focuses on executing the model with the bundled inputs and and comparing the model's output with the bundled expected output. We provide multiple APIs to handle the key parts of it. - - -### Get ExecuTorch Program Pointer from `BundledProgram` Buffer -We need the pointer to ExecuTorch program to do the execution. To unify the process of loading and executing `BundledProgram` and Program flatbuffer, we create an API: - -:::{dropdown} `GetProgramData` - -```{eval-rst} -.. doxygenfunction:: torch::executor::bundled_program::GetProgramData -``` -::: - -Here's an example of how to use the `GetProgramData` API: -```c++ -// Assume that the user has read the contents of the file into file_data using -// whatever method works best for their application. The file could contain -// either BundledProgram data or Program data. -void* file_data = ...; -size_t file_data_len = ...; - -// If file_data contains a BundledProgram, GetProgramData() will return a -// pointer to the Program data embedded inside it. 
Otherwise it will return -// file_data, which already pointed to Program data. -const void* program_ptr; -size_t program_len; -status = torch::executor::bundled_program::GetProgramData( - file_data, file_data_len, &program_ptr, &program_len); -ET_CHECK_MSG( - status == Error::Ok, - "GetProgramData() failed with status 0x%" PRIx32, - status); -``` - -### Load Bundled Input to Method -To execute the program on the bundled input, we need to load the bundled input into the method. Here we provided an API called `torch::executor::bundled_program::LoadBundledInput`: - -:::{dropdown} `LoadBundledInput` - -```{eval-rst} -.. doxygenfunction:: torch::executor::bundled_program::LoadBundledInput -``` -::: - -### Verify the Method's Output. -We call `torch::executor::bundled_program::VerifyResultWithBundledExpectedOutput` to verify the method's output with bundled expected outputs. Here's the details of this API: - -:::{dropdown} `VerifyResultWithBundledExpectedOutput` - -```{eval-rst} -.. doxygenfunction:: torch::executor::bundled_program::VerifyResultWithBundledExpectedOutput -``` -::: - - -### Runtime Example - -Here we provide an example about how to run the bundled program step by step. Most of the code is borrowed from [executor_runner](https://github.com/pytorch/executorch/blob/main/examples/devtools/example_runner/example_runner.cpp), and please review that file if you need more info and context: - -```c++ -// method_name is the name for the method we want to test -// memory_manager is the executor::MemoryManager variable for executor memory allocation. -// program is the ExecuTorch program. -Result method = program->load_method(method_name, &memory_manager); - -ET_CHECK_MSG( - method.ok(), - "load_method() failed with status 0x%" PRIx32, - method.error()); - -// Load testset_idx-th input in the buffer to plan -status = torch::executor::bundled_program::LoadBundledInput( - *method, - program_data.bundled_program_data(), - FLAGS_testset_idx); -ET_CHECK_MSG( - status == Error::Ok, - "LoadBundledInput failed with status 0x%" PRIx32, - status); - -// Execute the plan -status = method->execute(); -ET_CHECK_MSG( - status == Error::Ok, - "method->execute() failed with status 0x%" PRIx32, - status); - -// Verify the result. -status = torch::executor::bundled_program::VerifyResultWithBundledExpectedOutput( - *method, - program_data.bundled_program_data(), - FLAGS_testset_idx, - FLAGS_rtol, - FLAGS_atol); -ET_CHECK_MSG( - status == Error::Ok, - "Bundle verification failed with status 0x%" PRIx32, - status); - -``` - -## Common Errors - -Errors will be raised if `List[MethodTestSuites]` doesn't match the `Program`. Here're two common situations: - -### Test input doesn't match model's requirement. - -Each inference method of PyTorch model has its own requirement for the inputs, like number of input, the dtype of each input, etc. `BundledProgram` will raise error if test input not meet the requirement. 
- -Here's the example of the dtype of test input not meet model's requirement: - -```python -import torch - -from executorch.exir import to_edge -from executorch.devtools import BundledProgram - -from executorch.devtools.bundled_program.config import MethodTestCase, MethodTestSuite -from torch.export import export - - -class Module(torch.nn.Module): - def __init__(self): - super().__init__() - self.a = 3 * torch.ones(2, 2, dtype=torch.float) - self.b = 2 * torch.ones(2, 2, dtype=torch.float) - - def forward(self, x): - out_1 = torch.ones(2, 2, dtype=torch.float) - out_2 = torch.ones(2, 2, dtype=torch.float) - torch.mul(self.a, x, out=out_1) - torch.add(out_1, self.b, out=out_2) - return out_2 - - -model = Module() -method_names = ["forward"] - -inputs = (torch.ones(2, 2, dtype=torch.float), ) - -# Find each method of model needs to be traced my its name, export its FX Graph. -method_graph = export( - export_for_training(model, inputs).module(), - inputs, -) - -# Emit the traced methods into ET Program. -et_program = to_edge(method_graph).to_executorch() - -# number of input sets to be verified -n_input = 10 - -# Input sets to be verified for each inference methods. -# To simplify, here we create same inputs for all methods. -inputs = { - # Inference method name corresponding to its test cases. - m_name: [ - # NOTE: executorch program needs torch.float, but here is torch.int - [ - torch.randint(-5, 5, (2, 2), dtype=torch.int), - ] - for _ in range(n_input) - ] - for m_name in method_names -} - -# Generate Test Suites -method_test_suites = [ - MethodTestSuite( - method_name=m_name, - test_cases=[ - MethodTestCase( - inputs=input, - expected_outputs=(getattr(model, m_name)(*input),), - ) - for input in inputs[m_name] - ], - ) - for m_name in method_names -] - -# Generate BundledProgram - -bundled_program = BundledProgram(et_program, method_test_suites) -``` - -:::{dropdown} Raised Error - -``` -The input tensor tensor([[-2, 0], - [-2, -1]], dtype=torch.int32) dtype shall be torch.float32, but now is torch.int32 ---------------------------------------------------------------------------- -AssertionError Traceback (most recent call last) -Cell In[1], line 72 - 56 method_test_suites = [ - 57 MethodTestSuite( - 58 method_name=m_name, - (...) - 67 for m_name in method_names - 68 ] - 70 # Step 3: Generate BundledProgram ----> 72 bundled_program = create_bundled_program(program, method_test_suites) -File /executorch/devtools/bundled_program/core.py:276, in create_bundled_program(program, method_test_suites) - 264 """Create bp_schema.BundledProgram by bundling the given program and method_test_suites together. - 265 - 266 Args: - (...) - 271 The `BundledProgram` variable contains given ExecuTorch program and test cases. - 272 """ - 274 method_test_suites = sorted(method_test_suites, key=lambda x: x.method_name) ---> 276 assert_valid_bundle(program, method_test_suites) - 278 bundled_method_test_suites: List[bp_schema.BundledMethodTestSuite] = [] - 280 # Emit data and metadata of bundled tensor -File /executorch/devtools/bundled_program/core.py:219, in assert_valid_bundle(program, method_test_suites) - 215 # type of tensor input should match execution plan - 216 if type(cur_plan_test_inputs[j]) == torch.Tensor: - 217 # pyre-fixme[16]: Undefined attribute [16]: Item `bool` of `typing.Union[bool, float, int, torch._tensor.Tensor]` - 218 # has no attribute `dtype`. 
---> 219 assert cur_plan_test_inputs[j].dtype == get_input_dtype( - 220 program, program_plan_id, j - 221 ), "The input tensor {} dtype shall be {}, but now is {}".format( - 222 cur_plan_test_inputs[j], - 223 get_input_dtype(program, program_plan_id, j), - 224 cur_plan_test_inputs[j].dtype, - 225 ) - 226 elif type(cur_plan_test_inputs[j]) in ( - 227 int, - 228 bool, - 229 float, - 230 ): - 231 assert type(cur_plan_test_inputs[j]) == get_input_type( - 232 program, program_plan_id, j - 233 ), "The input primitive dtype shall be {}, but now is {}".format( - 234 get_input_type(program, program_plan_id, j), - 235 type(cur_plan_test_inputs[j]), - 236 ) -AssertionError: The input tensor tensor([[-2, 0], - [-2, -1]], dtype=torch.int32) dtype shall be torch.float32, but now is torch.int32 - -``` - -::: - -### Method name in `BundleConfig` does not exist. - -Another common error would be the method name in any `MethodTestSuite` does not exist in Model. `BundledProgram` will raise error and show the non-exist method name: - -```python -import torch - -from executorch.exir import to_edge -from executorch.devtools import BundledProgram - -from executorch.devtools.bundled_program.config import MethodTestCase, MethodTestSuite -from torch.export import export - - -class Module(torch.nn.Module): - def __init__(self): - super().__init__() - self.a = 3 * torch.ones(2, 2, dtype=torch.float) - self.b = 2 * torch.ones(2, 2, dtype=torch.float) - - def forward(self, x): - out_1 = torch.ones(2, 2, dtype=torch.float) - out_2 = torch.ones(2, 2, dtype=torch.float) - torch.mul(self.a, x, out=out_1) - torch.add(out_1, self.b, out=out_2) - return out_2 - - -model = Module() -method_names = ["forward"] - -inputs = (torch.ones(2, 2, dtype=torch.float),) - -# Find each method of model needs to be traced my its name, export its FX Graph. -method_graph = export( - export_for_training(model, inputs).module(), - inputs, -) - -# Emit the traced methods into ET Program. -et_program = to_edge(method_graph).to_executorch() - -# number of input sets to be verified -n_input = 10 - -# Input sets to be verified for each inference methods. -# To simplify, here we create same inputs for all methods. -inputs = { - # Inference method name corresponding to its test cases. - m_name: [ - [ - torch.randint(-5, 5, (2, 2), dtype=torch.float), - ] - for _ in range(n_input) - ] - for m_name in method_names -} - -# Generate Test Suites -method_test_suites = [ - MethodTestSuite( - method_name=m_name, - test_cases=[ - MethodTestCase( - inputs=input, - expected_outputs=(getattr(model, m_name)(*input),), - ) - for input in inputs[m_name] - ], - ) - for m_name in method_names -] - -# NOTE: MISSING_METHOD_NAME is not an inference method in the above model. -method_test_suites[0].method_name = "MISSING_METHOD_NAME" - -# Generate BundledProgram -bundled_program = BundledProgram(et_program, method_test_suites) - -``` - -:::{dropdown} Raised Error - -``` -All method names in bundled config should be found in program.execution_plan, but {'MISSING_METHOD_NAME'} does not include. 
---------------------------------------------------------------------------- -AssertionError Traceback (most recent call last) -Cell In[3], line 73 - 70 method_test_suites[0].method_name = "MISSING_METHOD_NAME" - 72 # Generate BundledProgram ----> 73 bundled_program = create_bundled_program(program, method_test_suites) -File /executorch/devtools/bundled_program/core.py:276, in create_bundled_program(program, method_test_suites) - 264 """Create bp_schema.BundledProgram by bundling the given program and method_test_suites together. - 265 - 266 Args: - (...) - 271 The `BundledProgram` variable contains given ExecuTorch program and test cases. - 272 """ - 274 method_test_suites = sorted(method_test_suites, key=lambda x: x.method_name) ---> 276 assert_valid_bundle(program, method_test_suites) - 278 bundled_method_test_suites: List[bp_schema.BundledMethodTestSuite] = [] - 280 # Emit data and metadata of bundled tensor -File /executorch/devtools/bundled_program/core.py:141, in assert_valid_bundle(program, method_test_suites) - 138 method_name_of_program = {e.name for e in program.execution_plan} - 139 method_name_of_test_suites = {t.method_name for t in method_test_suites} ---> 141 assert method_name_of_test_suites.issubset( - 142 method_name_of_program - 143 ), f"All method names in bundled config should be found in program.execution_plan, \ - 144 but {str(method_name_of_test_suites - method_name_of_program)} does not include." - 146 # check if method_tesdt_suites has been sorted in ascending alphabetical order of method name. - 147 for test_suite_id in range(1, len(method_test_suites)): -AssertionError: All method names in bundled config should be found in program.execution_plan, but {'MISSING_METHOD_NAME'} does not include. -``` -::: +Please update your link to . This URL will be deleted after v0.4.0. diff --git a/docs/source/sdk-debugging.md b/docs/source/sdk-debugging.md index 4707b4a2f99..80358fe99a1 100644 --- a/docs/source/sdk-debugging.md +++ b/docs/source/sdk-debugging.md @@ -13,7 +13,7 @@ Currently, ExecuTorch supports the following debugging flows: ### Runtime For a real example reflecting the steps below, please refer to [example_runner.cpp](https://github.com/pytorch/executorch/blob/main/examples/devtools/example_runner/example_runner.cpp). -1. [Optional] Generate an [ETRecord](./sdk-etrecord.rst) while exporting your model. When provided, this enables users to link profiling information back to the eager model source code (with stack traces and module hierarchy). +1. [Optional] Generate an [ETRecord](./etrecord.rst) while exporting your model. When provided, this enables users to link profiling information back to the eager model source code (with stack traces and module hierarchy). 2. Integrate [ETDump generation](./sdk-etdump.md) into the runtime and set the debugging level by configuring the `ETDumpGen` object. Then, provide an additional buffer to which intermediate outputs and program outputs will be written. Currently we support two levels of debugging: - Program level outputs ```C++ diff --git a/docs/source/sdk-etrecord.rst b/docs/source/sdk-etrecord.rst index 63546f43ca6..ee8f9b2b2d2 100644 --- a/docs/source/sdk-etrecord.rst +++ b/docs/source/sdk-etrecord.rst @@ -1,40 +1,4 @@ Prerequisite | ETRecord - ExecuTorch Record =========================================== -Overview --------- - -``ETRecord`` is intended to be the debug artifact that is generated by -users ahead of time (when they export their model to run on ExecuTorch). 
-To draw a rough equivalent to conventional software development, -``ETRecord`` can be considered as the binary built with debug symbols -that is used for debugging in GNU Debugger (gdb). It is expected that -the user will supply this to the ExecuTorch Developer Tools in order for -them to debug and visualize their model. - -``ETRecord`` contains numerous components such as: - -* Edge dialect graph with debug handles -* Delegate debug handle maps - -The ``ETRecord`` object itself is intended to be opaque to users and they should not access any components inside it directly. -It should be provided to the `Inspector API `__ to link back performance and debug data sourced from the runtime back to the Python source code. - -Generating an ``ETRecord`` --------------------------- - -The user should use the following API to generate an ``ETRecord`` file. They -will be expected to provide the Edge Dialect program (returned by the call to ``to_edge()``), -the ExecuTorch program (returned by the call to ``to_executorch()``), and optional models that -they are interested in working with via our tooling. - -.. warning:: - Users should do a deepcopy of the output of ``to_edge()`` and pass in the deepcopy to the ``generate_etrecord`` API. This is needed because the subsequent call, ``to_executorch()``, does an in-place mutation and will lose debug data in the process. - -.. currentmodule:: executorch.devtools.etrecord._etrecord -.. autofunction:: generate_etrecord - -Using an ``ETRecord`` ---------------------- - -Pass the ``ETRecord`` as an optional argument into the `Inspector API `__ to access this data and do post-run analysis. +Please update your link to . This URL will be deleted after v0.4.0. diff --git a/docs/source/sdk-inspector.rst b/docs/source/sdk-inspector.rst index 4f55271b3fe..4d46915a8af 100644 --- a/docs/source/sdk-inspector.rst +++ b/docs/source/sdk-inspector.rst @@ -5,7 +5,7 @@ Overview -------- The Inspector APIs provide a convenient interface for analyzing the -contents of `ETRecord `__ and +contents of `ETRecord `__ and `ETDump `__, helping developers get insights about model architecture and performance statistics. It’s built on top of the `EventBlock Class <#eventblock-class>`__ data structure, which organizes a group of `Event <#event-class>`__\ s for easy access to details of profiling events. diff --git a/docs/source/sdk-profiling.md b/docs/source/sdk-profiling.md index e17fb1ae48e..945260721e7 100644 --- a/docs/source/sdk-profiling.md +++ b/docs/source/sdk-profiling.md @@ -13,7 +13,7 @@ We provide access to all the profiling data via the Python [Inspector API](./sdk ## Steps to Profile a Model in ExecuTorch -1. [Optional] Generate an [ETRecord](./sdk-etrecord.rst) while you're exporting your model. If provided this will enable users to link back profiling details to eager model source code (with stack traces and module hierarchy). +1. [Optional] Generate an [ETRecord](./etrecord.rst) while you're exporting your model. If provided this will enable users to link back profiling details to eager model source code (with stack traces and module hierarchy). 2. Build the runtime with the pre-processor flags that enable profiling. Detailed in the [ETDump documentation](./sdk-etdump.md). 3. Run your Program on the ExecuTorch runtime and generate an [ETDump](./sdk-etdump.md). 4. Create an instance of the [Inspector API](./sdk-inspector.rst) by passing in the ETDump you have sourced from the runtime along with the optionally generated ETRecord from step 1. 
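
For step 4, a minimal sketch might look like the following (the artifact paths are placeholders for whatever your runs produced):

```python
from executorch.devtools import Inspector

# Pair the runtime ETDump with the optional export-time ETRecord.
inspector = Inspector(etdump_path="etdump.etdp", etrecord="etrecord.bin")

# Display the profiling events as a table.
inspector.print_data_tabular()
```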
diff --git a/docs/source/tutorials_source/devtools-integration-tutorial.py b/docs/source/tutorials_source/devtools-integration-tutorial.py index b5e335b43d1..92d8e326004 100644 --- a/docs/source/tutorials_source/devtools-integration-tutorial.py +++ b/docs/source/tutorials_source/devtools-integration-tutorial.py @@ -20,7 +20,7 @@ # This tutorial will show a full end-to-end flow of how to utilize the Developer Tools to profile a model. # Specifically, it will: # -# 1. Generate the artifacts consumed by the Developer Tools (`ETRecord <../sdk-etrecord.html>`__, `ETDump <../sdk-etdump.html>`__). +# 1. Generate the artifacts consumed by the Developer Tools (`ETRecord <../etrecord.html>`__, `ETDump <../sdk-etdump.html>`__). # 2. Create an Inspector class consuming these artifacts. # 3. Utilize the Inspector class to analyze the model profiling result. @@ -124,7 +124,7 @@ def forward(self, x): # --------------- # # Next step is to generate an ``ETDump``. ``ETDump`` contains runtime results -# from executing a `Bundled Program Model <../sdk-bundled-io.html>`__. +# from executing a `Bundled Program Model <../bundled-io.html>`__. # # In this tutorial, a `Bundled Program` is created from the example model above. @@ -296,6 +296,6 @@ def forward(self, x): # ^^^^^^^^^^^^^^^ # # - `ExecuTorch Developer Tools Overview <../devtools-overview.html>`__ -# - `ETRecord <../sdk-etrecord.html>`__ +# - `ETRecord <../etrecord.html>`__ # - `ETDump <../sdk-etdump.html>`__ # - `Inspector <../sdk-inspector.html>`__ diff --git a/docs/website/docs/tutorials/bundled_program.md b/docs/website/docs/tutorials/bundled_program.md deleted file mode 100644 index e477d8e6a61..00000000000 --- a/docs/website/docs/tutorials/bundled_program.md +++ /dev/null @@ -1,162 +0,0 @@ -DEPRECATED: This document is moving to //executorch/docs/source/sdk-bundled-io.md - -# Bundled Program - -## Introduction -Bundled Program is a wrapper around the core ExecuTorch program designed to help users wrapping test cases and other related info with the models they deploy. Bundled Program is not necessarily a core part of the program and not needed for its execution but is more necessary for various other use-cases, especially for model correctness evaluation such as e2e testing during model bring-up etc. - -Overall procedure can be broken into two stages, and in each stage we are supporting: -* **Emit stage**: Bundling test I/O cases as well as other useful info in key-value pairs along with the ExecuTorch program. -* **Runtime stage**: Accessing, executing and verifying the bundled test cases during runtime. - -## Emit stage - - This stage mainly focuses on the creation of a BundledProgram, and dump it out to the disk as a flatbuffer file. Please refer to Bento notebook [N2744997](https://www.internalfb.com/intern/anp/view/?id=2744997) for details on how to create a bundled program. - -## Runtime Stage -This stage mainly focuses on executing the model with the bundled inputs and and comparing the model's output with the bundled expected output. We provide multiple APIs to handle the key parts of it. - -### Get executorch program ptr from BundledProgram buffer -We need the pointer to executorch program to do the execution. To unify the process of loading and executing BundledProgram and Program flatbuffer, we create an API: - ```c++ - -/** - * Finds the serialized ExecuTorch program data in the provided file data. - * - * The returned buffer is appropriate for constructing a - * torch::executor::Program. 
- * - * Calling this is only necessary if the file could be a bundled program. If the - * file will only contain an unwrapped ExecuTorch program, callers can construct - * torch::executor::Program with file_data directly. - * - * @param[in] file_data The contents of an ExecuTorch program or bundled program - * file. - * @param[in] file_data_len The length of file_data, in bytes. - * @param[out] out_program_data The serialized Program data, if found. - * @param[out] out_program_data_len The length of out_program_data, in bytes. - * - * @returns Error::Ok if the program was found, and - * out_program_data/out_program_data_len point to the data. Other values - * on failure. - */ -Error GetProgramData( - void* file_data, - size_t file_data_len, - const void** out_program_data, - size_t* out_program_data_len); -``` - -Here's an example of how to use the GetProgramData API: -```c++ - // Assume that the user has read the contents of the file into file_data using - // whatever method works best for their application. The file could contain - // either BundledProgram data or Program data. - void* file_data = ...; - size_t file_data_len = ...; - - // If file_data contains a BundledProgram, GetProgramData() will return a - // pointer to the Program data embedded inside it. Otherwise it will return - // file_data, which already pointed to Program data. - const void* program_ptr; - size_t program_len; - status = torch::executor::bundled_program::GetProgramData( - buff_ptr.get(), buff_len, &program_ptr, &program_len); - ET_CHECK_MSG( - status == Error::Ok, - "GetProgramData() failed with status 0x%" PRIx32, - status); -``` - -### Load bundled input to ExecutionPlan -To execute the program on the bundled input, we need to load the bundled input into the ExecutionPlan. Here we provided an API called `torch::executor::bundled_program::LoadBundledInput`: - -```c++ - -/** - * Load testset_idx-th bundled input of method_idx-th Method test in - * bundled_program_ptr to given Method. - * - * @param[in] method The Method to verify. - * @param[in] bundled_program_ptr The bundled program contains expected output. - * @param[in] testset_idx The index of input needs to be set into given Method. - * - * @returns Return Error::Ok if load successfully, or the error happens during - * execution. - */ -ET_NODISCARD Error LoadBundledInput( - Method& method, - serialized_bundled_program* bundled_program_ptr, - size_t testset_idx); -``` - -### Verify the plan's output. -We call `torch::executor::bundled_program::VerifyResultWithBundledExpectedOutput` to verify the method's output with bundled expected outputs. Here's the details of this API: - -```c++ -/** - * Compare the Method's output with testset_idx-th bundled expected - * output in method_idx-th Method test. - * - * @param[in] method The Method to extract outputs from. - * @param[in] bundled_program_ptr The bundled program contains expected output. - * @param[in] testset_idx The index of expected output needs to be compared. - * @param[in] rtol Relative tolerance used for data comparsion. - * @param[in] atol Absolute tolerance used for data comparsion. - * - * @returns Return Error::Ok if two outputs match, or the error happens during - * execution. - */ -ET_NODISCARD Error VerifyResultWithBundledExpectedOutput( - Method& method, - serialized_bundled_program* bundled_program_ptr, - size_t testset_idx, - double rtol = 1e-5, - double atol = 1e-8); - -``` - -### Example - -Here we provide an example about how to run the bundled program step by step. 
- -```c++ - // method_name is the name for the method we want to test - // memory_manager is the executor::MemoryManager variable for executor memory allocation. - // program is the executorch program. - Result method = program->load_method(method_name, &memory_manager); - ET_CHECK_MSG( - method.ok(), - "load_method() failed with status 0x%" PRIx32, - method.error()); - - // Load testset_idx-th input in the buffer to plan - status = torch::executor::bundled_program::LoadBundledInput( - *method, - program_data.bundled_program_data(), - FLAGS_testset_idx); - ET_CHECK_MSG( - status == Error::Ok, - "LoadBundledInput failed with status 0x%" PRIx32, - status); - - // Execute the plan - status = method->execute(); - ET_CHECK_MSG( - status == Error::Ok, - "method->execute() failed with status 0x%" PRIx32, - status); - - // Verify the result. - status = torch::executor::bundled_program::VerifyResultWithBundledExpectedOutput( - *method, - program_data.bundled_program_data(), - FLAGS_testset_idx, - FLAGS_rtol, - FLAGS_atol); - ET_CHECK_MSG( - status == Error::Ok, - "Bundle verification failed with status 0x%" PRIx32, - status); - -``` diff --git a/examples/devtools/README.md b/examples/devtools/README.md index 36cc746d3fe..c06d3eac3fc 100644 --- a/examples/devtools/README.md +++ b/examples/devtools/README.md @@ -11,7 +11,7 @@ examples/devtools ## BundledProgram -We will use an example model (in `torch.nn.Module`) and its representative inputs, both from [`models/`](../models) directory, to generate a [BundledProgram(`.bpte`)](../../docs/source/sdk-bundled-io.md) file using the [script](scripts/export_bundled_program.py). Then we will use [devtools/example_runner](example_runner/example_runner.cpp) to execute the `.bpte` model on the ExecuTorch runtime and verify the model on BundledProgram API. +We will use an example model (in `torch.nn.Module`) and its representative inputs, both from [`models/`](../models) directory, to generate a [BundledProgram(`.bpte`)](../../docs/source/bundled-io.md) file using the [script](scripts/export_bundled_program.py). Then we will use [devtools/example_runner](example_runner/example_runner.cpp) to execute the `.bpte` model on the ExecuTorch runtime and verify the model on BundledProgram API. 1. Sets up the basic development environment for ExecuTorch by [Setting up ExecuTorch from GitHub](https://pytorch.org/executorch/stable/getting-started-setup). diff --git a/extension/pybindings/pybindings.pyi b/extension/pybindings/pybindings.pyi index f3d3c1f8d1d..51c134de1ff 100644 --- a/extension/pybindings/pybindings.pyi +++ b/extension/pybindings/pybindings.pyi @@ -164,7 +164,7 @@ def _load_for_executorch_from_bundled_program( ) -> ExecuTorchModule: """Same as _load_for_executorch, but takes a bundled program instead of a file path. - See https://pytorch.org/executorch/stable/sdk-bundled-io.html for documentation. + See https://pytorch.org/executorch/stable/bundled-io.html for documentation. .. warning::