Merge pull request #3737 from AI-Hypercomputer:bvandermoon-uxr

Google-ML-Automation · Google-ML-Automation · commit b117f50cf048 · 2026-04-23T17:39:36.000-07:00
PiperOrigin-RevId: 904719578
diff --git a/docs/tutorials/first_run.md b/docs/tutorials/first_run.md
@@ -36,7 +36,7 @@ Local development is a convenient way to run MaxText on a single host. It doesn'
 multiple hosts but is a good way to learn about MaxText.
 
 1. [Create and SSH to the single host VM of your choice](https://cloud.google.com/tpu/docs/managing-tpus-tpu-vm). You can use any available single host TPU, such as `v5litepod-8`, `v5p-8`, or `v4-8`.
-2. For instructions on installing MaxText on your VM, please refer to the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html).
+2. For instructions on installing MaxText on your VM, please refer to the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html). For this tutorial on TPUs, install `maxtext[tpu]`.
 3. After installation completes, run training on synthetic data with the following command:
 
 ```sh
@@ -70,7 +70,7 @@ You can use [demo_decoding.ipynb](https://github.com/AI-Hypercomputer/maxtext/bl
 
 ### Run MaxText on NVIDIA GPUs
 
-1. For instructions on installing MaxText on your VM, please refer to the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html).
+1. For instructions on installing MaxText on your VM, please refer to the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html). For this tutorial on GPUs, install `maxtext[cuda12]`.
 2. After installation is complete, run training with the following command on synthetic data:
 
 ```sh
diff --git a/docs/tutorials/inference.md b/docs/tutorials/inference.md
@@ -25,7 +25,7 @@ We support inference of MaxText models on vLLM via an [out-of-tree](https://gith
 
 # Installation
 
-Follow the instructions in [install maxtext](https://maxtext.readthedocs.io/en/latest/install_maxtext.html) to install MaxText with post-training dependencies. We recommend installing from PyPI to ensure you have the latest stable versionset of dependencies.
+Follow the instructions in [install maxtext](https://maxtext.readthedocs.io/en/latest/install_maxtext.html) to install MaxText. For this inference tutorial on TPU (which uses vLLM), you must install `maxtext[tpu-post-train]`, as it includes the required adapter plugin. We recommend installing from PyPI to ensure you have the latest stable version of dependencies.
 
 After finishing the installation, ensure that the MaxText on vLLM adapter plugin has been installed. To do so, run the following command:
 
diff --git a/docs/tutorials/post_training_index.md b/docs/tutorials/post_training_index.md
@@ -1,5 +1,9 @@
 # Post-training
 
+```{note}
+Post-training workflows on TPU require specific dependencies. Please ensure you have installed MaxText with `maxtext[tpu-post-train]` as described in the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html).
+```
+
 ## What is MaxText post-training?
 
 MaxText provides performant and scalable LLM and VLM post-training across a variety of techniques such as SFT and GRPO.
diff --git a/docs/tutorials/pretraining.md b/docs/tutorials/pretraining.md
@@ -20,6 +20,10 @@
 
 In this tutorial, we introduce how to run pretraining with real datasets. While synthetic data is commonly used for benchmarking, we rely on real datasets to obtain meaningful weights. Currently, MaxText supports three dataset input pipelines: HuggingFace, Grain, and TensorFlow Datasets (TFDS). We will walk you through: setting up dataset, modifying the [dataset configs](https://github.com/AI-Hypercomputer/maxtext/blob/f11f5507c987fdb57272c090ebd2cbdbbadbd36c/src/maxtext/configs/base.yml#L631-L675) and [tokenizer configs](https://github.com/AI-Hypercomputer/maxtext/blob/f11f5507c987fdb57272c090ebd2cbdbbadbd36c/src/maxtext/configs/base.yml#L566) for training, and optionally enabling evaluation.
 
+```{note}
+Before starting this tutorial, ensure you have installed MaxText following the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html). For pretraining, install `maxtext[tpu]` for TPUs or `maxtext[cuda12]` for GPUs.
+```
+
 To start with, we focus on HuggingFace datasets for convenience.
 
 - Later on, we will give brief examples for Grain and TFDS. For a comprehensive guide, see the [Data Input Pipeline](../guides/data_input_pipeline.md) topic.