Merged
8 changes: 4 additions & 4 deletions PREFLIGHT.md
@@ -7,12 +7,12 @@ Before you run ML workload on Multihost with GCE or GKE, simply apply `bash pref

Here is an example for GCE:
```
-bash preflight.sh PLATFORM=GCE && python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=$YOUR_JOB_NAME
+bash preflight.sh PLATFORM=GCE && python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${YOUR_JOB_NAME?}
```

Here is an example for GKE:
```
-bash preflight.sh PLATFORM=GKE && python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=$YOUR_JOB_NAME
+bash preflight.sh PLATFORM=GKE && python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${YOUR_JOB_NAME?}
```

# Optimization 2: Numa binding (You can only apply this to v4 and v5p)
@@ -22,14 +22,14 @@ For GCE,
[preflight.sh](https://github.com/google/maxtext/blob/main/preflight.sh) will install the `numactl` dependency for you, so you can use it directly. Here is an example:

```
-bash preflight.sh PLATFORM=GCE && numactl --membind 0 --cpunodebind=0 python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=$YOUR_JOB_NAME
+bash preflight.sh PLATFORM=GCE && numactl --membind 0 --cpunodebind=0 python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${YOUR_JOB_NAME?}
```

For GKE,
`numactl` should be built into your docker image from [maxtext_tpu_dependencies.Dockerfile](https://github.com/google/maxtext/blob/main/dependencies/dockerfiles/maxtext_tpu_dependencies.Dockerfile), so you can use it directly if you built the MaxText docker image. Here is an example:

```
-bash preflight.sh PLATFORM=GKE && numactl --membind 0 --cpunodebind=0 python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=$YOUR_JOB_NAME
+bash preflight.sh PLATFORM=GKE && numactl --membind 0 --cpunodebind=0 python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${YOUR_JOB_NAME?}
```

1. `numactl`: This is the command-line tool used for controlling NUMA policy for processes or shared memory. It's particularly useful on multi-socket systems where memory locality can impact performance.
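Every hunk in this change set makes the same substitution: bare `$VAR` expansions become `${VAR?}`. Here is a minimal standalone sketch of what that buys (the variable name is borrowed from the examples above; the script itself is illustrative and not part of any MaxText file):

```shell
#!/usr/bin/env bash
# ${VAR?} is POSIX parameter expansion: if VAR is unset, the shell prints an
# error naming the variable and aborts the command, instead of silently
# substituting an empty string the way plain $VAR does.

unset YOUR_JOB_NAME

# Plain expansion: runs "successfully" with an empty run_name (easy to miss).
echo "run_name=$YOUR_JOB_NAME"

# ${VAR?} expansion: the subshell aborts before echo ever runs.
if ! (echo "run_name=${YOUR_JOB_NAME?}") 2>/dev/null; then
  echo "aborted: YOUR_JOB_NAME is unset"
fi

# ${VAR:?msg} additionally rejects set-but-empty values, with a custom message.
YOUR_JOB_NAME=""
(: "${YOUR_JOB_NAME:?must not be empty}") 2>/dev/null || echo "empty value rejected"
```

The change is purely fail-fast: once the variable is actually exported, `${VAR?}` expands exactly like `$VAR`.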
4 changes: 2 additions & 2 deletions benchmarks/Getting_Started_Benchmarking.md
@@ -14,7 +14,7 @@ Two approaches are here:
CLUSTER=my-cluster
ZONE=my-zone
PROJECT=my-project
-python3 -m benchmarks.benchmark_runner xpk --project $PROJECT --zone $ZONE --cluster_name $CLUSTER --device_type v6e-256 --base_output_directory gs://maxtext-experiments-tpem/ --num_steps=5
+python3 -m benchmarks.benchmark_runner xpk --project ${PROJECT?} --zone ${ZONE?} --cluster_name ${CLUSTER?} --device_type v6e-256 --base_output_directory gs://maxtext-experiments-tpem/ --num_steps=5
```

```shell
@@ -23,7 +23,7 @@ export RUNNER=us-docker.pkg.dev/path/to/maxtext_runner
export PROXY_IMAGE=us-docker.pkg.dev/cloud-tpu-v2-images/pathways/proxy_server
export SERVER_IMAGE=us-docker.pkg.dev/cloud-tpu-v2-images/pathways/server

-python3 -m benchmarks.benchmark_runner xpk --project $PROJECT --zone $ZONE --cluster_name $CLUSTER --device_type v6e-256 --base_output_directory gs://maxtext-experiments-tpem/ --num_steps=5 --pathways_server_image="${SERVER_IMAGE}" --pathways_proxy_server_image="${PROXY_IMAGE}" --pathways_runner_image="${RUNNER}"
+python3 -m benchmarks.benchmark_runner xpk --project ${PROJECT?} --zone ${ZONE?} --cluster_name ${CLUSTER?} --device_type v6e-256 --base_output_directory gs://maxtext-experiments-tpem/ --num_steps=5 --pathways_server_image="${SERVER_IMAGE?}" --pathways_proxy_server_image="${PROXY_IMAGE?}" --pathways_runner_image="${RUNNER?}"
```

```shell
28 changes: 14 additions & 14 deletions benchmarks/api_server/README.md
@@ -131,34 +131,34 @@ export ICI_EXPERT_PARALLELISM=2
# 2. Define the Command to Run on the Cluster
# ==============================================================================
# This command installs dependencies and then starts the server.
-CMD="export HF_TOKEN=${HF_TOKEN} && \
+CMD="export HF_TOKEN=${HF_TOKEN?} && \
pip install --upgrade pip && \
pip install -r benchmarks/api_server/requirements.txt && \
bash benchmarks/api_server/start_server.sh \
maxtext/configs/base.yml \
-model_name="${MODEL_NAME}" \
-tokenizer_path="${TOKENIZER_PATH}" \
-load_parameters_path="${LOAD_PARAMETERS_PATH}" \
-per_device_batch_size=${PER_DEVICE_BATCH_SIZE} \
-ici_tensor_parallelism=${ICI_TENSOR_PARALLELISM} \
-ici_expert_parallelism=${ICI_EXPERT_PARALLELISM} \
+model_name="${MODEL_NAME?}" \
+tokenizer_path="${TOKENIZER_PATH?}" \
+load_parameters_path="${LOAD_PARAMETERS_PATH?}" \
+per_device_batch_size=${PER_DEVICE_BATCH_SIZE?} \
+ici_tensor_parallelism=${ICI_TENSOR_PARALLELISM?} \
+ici_expert_parallelism=${ICI_EXPERT_PARALLELISM?} \
tokenizer_type=\"huggingface\" \
return_log_prob=True"


# ==============================================================================
# 3. Launch the Workload
# ==============================================================================
-echo "Launching workload ${RUNNAME}..."
-xpk workload create --workload "${RUNNAME}" \
---base-docker-image "${DOCKER_IMAGE}" \
---command "${CMD}" \
+echo "Launching workload ${RUNNAME?}..."
+xpk workload create --workload "${RUNNAME?}" \
+--base-docker-image "${DOCKER_IMAGE?}" \
+--command "${CMD?}" \
--num-slices=1 \
---cluster "${CLUSTER}" --device-type "${DEVICE_TYPE}" --project "${PROJECT}" --zone "${ZONE}"
+--cluster "${CLUSTER?}" --device-type "${DEVICE_TYPE?}" --project "${PROJECT?}" --zone "${ZONE?}"

-echo "Workload ${RUNNAME} created."
+echo "Workload ${RUNNAME?} created."
echo "Use the following command to connect:"
-echo "bash benchmarks/api_server/port_forward_xpk.sh job_name=${RUNNAME} project=${PROJECT} zone=${ZONE} cluster=${CLUSTER}"
+echo "bash benchmarks/api_server/port_forward_xpk.sh job_name=${RUNNAME?} project=${PROJECT?} zone=${ZONE?} cluster=${CLUSTER?}"
```

### 2. Launch the Workload
2 changes: 1 addition & 1 deletion benchmarks/maxtest/getting_started.md
@@ -55,7 +55,7 @@ If we want to pass custom flags this is also possible by specifying
Useful for checking for the existence of SDC (silent data corruption) on TPU hardware.

```
-bash maxtest.sh --project $TPU_PROJECT --cluster $CLUSTER --region $REGION --nodepool $NODEPOOL_NAME --num_workers $NUM_WORKERS --libtpu_args '--xla_tpu_enable_sdc_checker'
+bash maxtest.sh --project ${TPU_PROJECT?} --cluster ${CLUSTER?} --region ${REGION?} --nodepool ${NODEPOOL_NAME?} --num_workers ${NUM_WORKERS?} --libtpu_args '--xla_tpu_enable_sdc_checker'
```


16 changes: 8 additions & 8 deletions docs/guides/checkpointing_solutions/convert_checkpoint.md
@@ -37,8 +37,8 @@ First, make sure python3 virtual environment for MaxText is set up and enabled.
```bash
export VENV_NAME=<your virtual env name> # e.g., maxtext_venv
pip install uv
-uv venv --python 3.12 --seed $VENV_NAME
-source $VENV_NAME/bin/activate
+uv venv --python 3.12 --seed ${VENV_NAME?}
+source ${VENV_NAME?}/bin/activate
```

Second, ensure you have the necessary dependencies installed (PyTorch for the conversion script).
@@ -68,16 +68,16 @@ Finally, run below command to complete the conversion

```bash
python3 -m maxtext.checkpoint_conversion.to_maxtext maxtext/configs/base.yml \
-model_name=${HF_MODEL} \
-hf_access_token=${HF_TOKEN} \
-base_output_directory=${MODEL_CHECKPOINT_DIRECTORY} \
+model_name=${HF_MODEL?} \
+hf_access_token=${HF_TOKEN?} \
+base_output_directory=${MODEL_CHECKPOINT_DIRECTORY?} \
scan_layers=True \
use_multimodal=false \
hardware=cpu \
skip_jax_distributed_system=true \
-checkpoint_storage_use_zarr3=${USE_ZARR3} \
-checkpoint_storage_use_ocdbt=${USE_OCDBT} \
---lazy_load_tensors=${LAZY_LOAD_TENSORS}
+checkpoint_storage_use_zarr3=${USE_ZARR3?} \
+checkpoint_storage_use_ocdbt=${USE_OCDBT?} \
+--lazy_load_tensors=${LAZY_LOAD_TENSORS?}
```

**Key arguments:**
32 changes: 16 additions & 16 deletions docs/guides/checkpointing_solutions/emergency_checkpointing.md
@@ -75,8 +75,8 @@ In this scenario, you should configure each pod in that slice with a ramdisk of
```
2. **Configure gcloud:**
```bash
-gcloud config set project ${PROJECT_ID}
-gcloud config set compute/zone ${ZONE}
+gcloud config set project ${PROJECT_ID?}
+gcloud config set compute/zone ${ZONE?}
```
3. **Clone the XPK repository:**
```bash
@@ -85,15 +85,15 @@ In this scenario, you should configure each pod in that slice with a ramdisk of
4. **Run the cluster creation command:**
```bash
python3 xpk/xpk.py cluster create \
---cluster ${CLUSTER_NAME} \
---cluster-cpu-machine-type=${MACHINE_TYPE} \
---num-slices=${NUM_SLICES} \
---tpu-type=${TPU_TYPE} \
+--cluster ${CLUSTER_NAME?} \
+--cluster-cpu-machine-type=${MACHINE_TYPE?} \
+--num-slices=${NUM_SLICES?} \
+--tpu-type=${TPU_TYPE?} \
--enable-mtc \
--enable-gcsfuse-csi-driver \
---mtc-ramdisk-size=${RAMDISK_SIZE} \
---mtc-gcs-bucket=${OUTPUT_PATH} \
---gke-version=${GKE_VERSION}
+--mtc-ramdisk-size=${RAMDISK_SIZE?} \
+--mtc-gcs-bucket=${OUTPUT_PATH?} \
+--gke-version=${GKE_VERSION?}
```

## MaxText configuration
@@ -150,12 +150,12 @@ The flags below would give the user access to the ramdisk in their workload:

```bash
python3 xpk/xpk.py workload create \
---cluster ${CLUSTER_NAME} \
---docker-image ${DOCKER_IMAGE} \
---workload ${WORKLOAD_NAME} \
---tpu-type=${TPU_TYPE} \
---num-slices=${NUM_SLICES} \
---ramdisk-directory=${RAMDISK_DIRECTORY} \
+--cluster ${CLUSTER_NAME?} \
+--docker-image ${DOCKER_IMAGE?} \
+--workload ${WORKLOAD_NAME?} \
+--tpu-type=${TPU_TYPE?} \
+--num-slices=${NUM_SLICES?} \
+--ramdisk-directory=${RAMDISK_DIRECTORY?} \
--mtc-enabled \
---command "python3 src/maxtext/trainers/pre_train/train.py src/maxtext/configs/base.yml base_output_directory=$OUTPUT_PATH dataset_path=$DATA_PATH steps=120 per_device_batch_size=6 enable_checkpoint_cloud_logger=True checkpoint_period=${CHECKPOINT_PEROID} enable_emergency_checkpoint=True local_checkpoint_period=${LOCAL_CHECKPOINT_PERIOD} local_checkpoint_directory=/${RAMDISK_DIRECTORY}"
+--command "python3 src/maxtext/trainers/pre_train/train.py src/maxtext/configs/base.yml base_output_directory=${OUTPUT_PATH?} dataset_path=${DATA_PATH?} steps=120 per_device_batch_size=6 enable_checkpoint_cloud_logger=True checkpoint_period=${CHECKPOINT_PEROID?} enable_emergency_checkpoint=True local_checkpoint_period=${LOCAL_CHECKPOINT_PERIOD?} local_checkpoint_directory=/${RAMDISK_DIRECTORY?}"
```
32 changes: 16 additions & 16 deletions docs/guides/checkpointing_solutions/multi_tier_checkpointing.md
@@ -105,8 +105,8 @@ In this scenario, you should configure each pod in that slice with a ramdisk of
```
2. **Configure gcloud:**
```bash
-gcloud config set project ${PROJECT_ID}
-gcloud config set compute/zone ${ZONE}
+gcloud config set project ${PROJECT_ID?}
+gcloud config set compute/zone ${ZONE?}
```
3. **Clone the XPK repository:**
```bash
@@ -115,15 +115,15 @@ In this scenario, you should configure each pod in that slice with a ramdisk of
4. **Run the cluster creation command:**
```bash
python3 xpk/xpk.py cluster create \
---cluster ${CLUSTER_NAME} \
---cluster-cpu-machine-type=${MACHINE_TYPE} \
---num-slices=${NUM_SLICES} \
---tpu-type=${TPU_TYPE} \
+--cluster ${CLUSTER_NAME?} \
+--cluster-cpu-machine-type=${MACHINE_TYPE?} \
+--num-slices=${NUM_SLICES?} \
+--tpu-type=${TPU_TYPE?} \
--enable-mtc \
--enable-gcsfuse-csi-driver \
---mtc-ramdisk-size=${RAMDISK_SIZE} \
---mtc-gcs-bucket=${OUTPUT_PATH} \
---gke-version=${GKE_VERSION}
+--mtc-ramdisk-size=${RAMDISK_SIZE?} \
+--mtc-gcs-bucket=${OUTPUT_PATH?} \
+--gke-version=${GKE_VERSION?}
```

## MaxText configuration
@@ -179,12 +179,12 @@ The flags below would give the user access to the ramdisk in their workload:

```bash
python3 xpk/xpk.py workload create \
---cluster ${CLUSTER_NAME} \
---docker-image ${DOCKER_IMAGE} \
---workload ${WORKLOAD_NAME} \
---tpu-type=${TPU_TYPE} \
---num-slices=${NUM_SLICES} \
---ramdisk-directory=${RAMDISK_DIRECTORY} \
+--cluster ${CLUSTER_NAME?} \
+--docker-image ${DOCKER_IMAGE?} \
+--workload ${WORKLOAD_NAME?} \
+--tpu-type=${TPU_TYPE?} \
+--num-slices=${NUM_SLICES?} \
+--ramdisk-directory=${RAMDISK_DIRECTORY?} \
--mtc-enabled \
---command "python3 src/maxtext/trainers/pre_train/train.py src/maxtext/configs/base.yml base_output_directory=$OUTPUT_PATH dataset_path=$DATA_PATH steps=120 per_device_batch_size=6 enable_checkpoint_cloud_logger=True checkpoint_period=${CHECKPOINT_PEROID} enable_multi_tier_checkpointing=True local_checkpoint_period=${LOCAL_CHECKPOINT_PERIOD} local_checkpoint_directory=/${RAMDISK_DIRECTORY} multi_tier_checkpointing_backup_interval_minutes=${MULTI_TIER_CHECKPOINTING_BACKUP_INT_MIN}"
+--command "python3 src/maxtext/trainers/pre_train/train.py src/maxtext/configs/base.yml base_output_directory=${OUTPUT_PATH?} dataset_path=${DATA_PATH?} steps=120 per_device_batch_size=6 enable_checkpoint_cloud_logger=True checkpoint_period=${CHECKPOINT_PEROID?} enable_multi_tier_checkpointing=True local_checkpoint_period=${LOCAL_CHECKPOINT_PERIOD?} local_checkpoint_directory=/${RAMDISK_DIRECTORY?} multi_tier_checkpointing_backup_interval_minutes=${MULTI_TIER_CHECKPOINTING_BACKUP_INT_MIN?}"
```
6 changes: 3 additions & 3 deletions docs/guides/data_input_pipeline/data_input_grain.md
@@ -38,9 +38,9 @@ Grain ensures determinism in data input pipelines by saving the pipeline's state

```sh
bash tools/setup/setup_gcsfuse.sh \
-DATASET_GCS_BUCKET=$BUCKET_NAME \
-MOUNT_PATH=$MOUNT_PATH \
-[FILE_PATH=$MOUNT_PATH/my_dataset]
+DATASET_GCS_BUCKET=${BUCKET_NAME?} \
+MOUNT_PATH=${MOUNT_PATH?} \
+[FILE_PATH=${MOUNT_PATH?}/my_dataset]
```

Note that `FILE_PATH` is optional; when provided, the script runs `ls -R` for pre-filling the metadata cache (see ["Performance tuning best practices" on the Google Cloud documentation](https://cloud.google.com/storage/docs/cloud-storage-fuse/performance#improve-first-time-reads)).
12 changes: 6 additions & 6 deletions docs/guides/monitoring_and_debugging/monitor_goodput.md
@@ -89,17 +89,17 @@ Please use a unique workload name, unless you intend to monitor cumulative Goodp
MaxText enables Goodput recording and monitoring by default with `enable_goodput_recording=True` and `monitor_goodput=True`. You can configure the goodput upload frequency by setting `goodput_upload_interval_seconds`.

```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_output_directory=$OUTPUT_PATH \
-dataset_path=$DATA_PATH run_name=goodput-test-run steps=200 goodput_upload_interval_seconds=30
+python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_output_directory=${OUTPUT_PATH?} \
+dataset_path=${DATA_PATH?} run_name=goodput-test-run steps=200 goodput_upload_interval_seconds=30
```

#### How to monitor step time deviation

MaxText enables step time deviation monitoring by default with `monitor_step_time_deviation=True`. You can configure the upload frequency by setting `step_deviation_interval_seconds`.

```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_output_directory=$OUTPUT_PATH \
-dataset_path=$DATA_PATH run_name=goodput-test-run steps=200 step_deviation_interval_seconds=30
+python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_output_directory=${OUTPUT_PATH?} \
+dataset_path=${DATA_PATH?} run_name=goodput-test-run steps=200 step_deviation_interval_seconds=30
```

#### How to enable Pathways Goodput
@@ -111,7 +111,7 @@ Enabling `enable_pathways_goodput` turns on Goodput measurement for Pathways wor
```

```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_output_directory=$OUTPUT_PATH dataset_path=$DATA_PATH \
+python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_output_directory=${OUTPUT_PATH?} dataset_path=${DATA_PATH?} \
run_name=goodput-test-run steps=200 goodput_upload_interval_seconds=30 enable_pathways_goodput=True
```

@@ -168,7 +168,7 @@ and `enable_gcp_step_deviation_metrics` to `False` for disabling step deviation
metrics.

```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_output_directory=$OUTPUT_PATH dataset_path=$DATA_PATH \
+python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml base_output_directory=${OUTPUT_PATH?} dataset_path=${DATA_PATH?} \
run_name=goodput-test-run steps=200 goodput_upload_interval_seconds=30 enable_gcp_goodput_metrics=False \
enable_gcp_step_deviation_metrics=False
```
4 changes: 2 additions & 2 deletions docs/reference/core_concepts/quantization.md
@@ -87,7 +87,7 @@ Common options for the `quantization` flag when using Qwix include:
Here is an example of how to run a training job with int8 quantization enabled via Qwix:

```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=$YOUR_JOB_NAME base_output_directory=gs://<my-bucket> dataset_type=synthetic use_qwix_quantization=true quantization='int8'
+python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${YOUR_JOB_NAME?} base_output_directory=gs://<my-bucket> dataset_type=synthetic use_qwix_quantization=true quantization='int8'
```

#### The Qwix Interception API
@@ -142,7 +142,7 @@ When using AQT, you can pass one of the following values to the `quantization` f
#### Example command for AQT

```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=$YOUR_JOB_NAME base_output_directory=gs://<my-bucket> dataset_type=synthetic use_qwix_quantization=false quantization='int8'
+python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${YOUR_JOB_NAME?} base_output_directory=gs://<my-bucket> dataset_type=synthetic use_qwix_quantization=false quantization='int8'
```

Note that `use_qwix_quantization` is not set to `True`.
12 changes: 6 additions & 6 deletions docs/run_maxtext/run_maxtext_localhost.md
@@ -59,7 +59,7 @@ After the installation is complete, run a short training job using synthetic dat

```bash
python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
-run_name=$YOUR_JOB_NAME \
+run_name=${YOUR_JOB_NAME?} \
base_output_directory=gs://<my-bucket> \
dataset_type=synthetic \
steps=10
@@ -73,7 +73,7 @@ To demonstrate model output, run the following command:

```bash
python3 -m maxtext.inference.decode src/maxtext/configs/base.yml \
-run_name=$YOUR_JOB_NAME \
+run_name=${YOUR_JOB_NAME?} \
base_output_directory=gs://<my-bucket> \
per_device_batch_size=1
```
@@ -94,7 +94,7 @@ To use a pre-configured model for TPUs, you override the `model_name` parameter,
```bash
python3 -m maxtext.trainers.pre_train.train maxtext/configs/base.yml \
model_name=llama3-8b \
-run_name=$YOUR_JOB_NAME \
+run_name=${YOUR_JOB_NAME?} \
base_output_directory=gs://<my-bucket> \
dataset_type=synthetic \
steps=10
@@ -108,7 +108,7 @@
```bash
python3 -m maxtext.trainers.pre_train.train maxtext/configs/base.yml \
model_name=qwen3-4b \
-run_name=$YOUR_JOB_NAME \
+run_name=${YOUR_JOB_NAME?} \
base_output_directory=gs://<my-bucket> \
dataset_type=synthetic \
steps=10
@@ -125,7 +125,7 @@ To use a GPU-optimized configuration, you should specify the path to the model's

```bash
python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/gpu/models/mixtral_8x7b.yml \
-run_name=$YOUR_JOB_NAME \
+run_name=${YOUR_JOB_NAME?} \
base_output_directory=gs://<my-bucket> \
dataset_type=synthetic \
steps=10
@@ -140,7 +140,7 @@ This will load `gpu/mixtral_8x7b.yml`, which inherits from `base.yml`.

```bash
python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/gpu/models/llama3-8b.yml \
-run_name=$YOUR_JOB_NAME \
+run_name=${YOUR_JOB_NAME?} \
base_output_directory=gs://<my-bucket> \
dataset_type=synthetic \
steps=10