Sweep across KV cache layouts #662
base: main
Conversation
Thank you, Andy, for the great work! Overall this looks good, and I am happy to see the first-pass results.
Some comments/suggestions:
My overall goal is to get rid of MaxText/inference_microbenchmark_sweep.py and make MaxText/inference_microbenchmark.py self-contained.
On the ml_auto_solutions side, any sweep, now or later, can either use existing flags (from base.yml) or introduce new flags as part of the experiment. Do a manual test run first, then scale up to more experiments. It would be great if little or no extra code were required between the manual test and ml_auto_solutions.
# Manually update the config.
# Don't set key_value_axis_order_product_id; otherwise it will recompute
# ar_key_axis_order and ar_value_axis_order.
quant = 'bf16' if not config.quantization else config.quantization
run_name = (
    f"{inference_metadata['accelerator']}-{config.model_name}-"
    f"{quant}-{key_value_axis_order_product_id}-{prefill_key_axis_order}-"
    f"{ar_key_axis_order}"
)
tensorboard_dir = os.path.join(config.base_output_directory, run_name, "tensorboard", "")
checkpoint_dir = os.path.join(config.base_output_directory, run_name, "checkpoint", "")
metrics_dir = os.path.join(config.base_output_directory, run_name, "metrics", "")
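The directory layout built by the snippet above can be sketched on its own; a minimal sketch, where `base_output_directory` and `run_name` are placeholder values standing in for the config fields and the formatted sweep name:

```python
import os

# Illustrative stand-ins for config.base_output_directory and the run_name
# assembled from accelerator, model, quant, and axis-order values.
base_output_directory = "gs://my-bucket/sweeps"
run_name = "v5e-8-llama2-7b-bf16-0213-1203-1203"

# The trailing "" makes os.path.join emit a directory-style path ending in "/".
tensorboard_dir = os.path.join(base_output_directory, run_name, "tensorboard", "")
checkpoint_dir = os.path.join(base_output_directory, run_name, "checkpoint", "")
metrics_dir = os.path.join(base_output_directory, run_name, "metrics", "")

print(tensorboard_dir)
# → gs://my-bucket/sweeps/v5e-8-llama2-7b-bf16-0213-1203-1203/tensorboard/
```

Because `gs://` URIs use `/` separators, plain `os.path.join` produces the intended bucket paths here.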
There are quant and quantize_kvcache, and different combinations of the two. As discussed, we will create different test_configs in xlml, and the base_run_name should already carry all the information needed to differentiate the runs.
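As a sketch of the point above, a base_run_name can encode both settings so every (quant, quantize_kvcache) combination maps to a distinct run. The helper name and the tokens (`kvint8`, `kvbf16`) are illustrative assumptions, not the actual xlml config:

```python
# Hypothetical helper: builds a run name that distinguishes weight
# quantization ("quant") from KV-cache quantization ("quantize_kvcache").
def make_base_run_name(accelerator, model_name, quantization, quantize_kvcache):
  quant = quantization if quantization else "bf16"  # empty string means bf16
  kv = "kvint8" if quantize_kvcache else "kvbf16"
  return f"{accelerator}-{model_name}-{quant}-{kv}"

print(make_base_run_name("v5e-8", "llama2-7b", "", True))
# → v5e-8-llama2-7b-bf16-kvint8
```

With this shape, all four combinations of quantization and quantize_kvcache yield distinct names, so runs never collide in the output directory.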
pyconfig._config.keys['tensorboard_dir'] = tensorboard_dir  # pylint: disable=protected-access
pyconfig._config.keys['checkpoint_dir'] = checkpoint_dir  # pylint: disable=protected-access
pyconfig._config.keys['metrics_dir'] = metrics_dir  # pylint: disable=protected-access
I don't think checkpoint_dir and metrics_dir are used at all?
Please take a look
pyconfig._config.keys['ar_value_axis_order'] = ar_value_axis_order  # pylint: disable=protected-access
pyconfig._config.keys['tensorboard_dir'] = tensorboard_dir  # pylint: disable=protected-access
pyconfig._config.keys['run_name'] = run_name  # pylint: disable=protected-access
max_utils.write_config_raw_keys_for_gcs(pyconfig._config.keys)  # pylint: disable=protected-access
You can skip this, since there is already --save_config_to_gcs=True.
My understanding is that write_config_raw_keys_for_gcs is called once during initialize (https://github.com/google/maxtext/blob/main/MaxText/pyconfig.py#L225) and writes the default values of the prefill and ar axis orders to GCS. So each time we loop over a different prefill/ar axis order, we need to explicitly call max_utils.write_config_raw_keys_for_gcs again to make sure the updated values are written to GCS.
Did you take this out in your code, and did it work for you?
@morgandu should I do a test without max_utils.write_config_raw_keys_for_gcs?
@morgandu Can you take a final look? Anything else we need to add?
Sweep across different sharding configurations for the KV cache.
The user needs to set the config's inference_metadata_file, which is a path to a
JSON file. This JSON should contain the following keys:
be true
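The full key list is truncated in this excerpt, so only `accelerator` (which appears in the run_name code above) is known for certain. A minimal sketch of loading such a metadata file, with an illustrative value:

```python
import json
import os
import tempfile

# Write a sample inference_metadata_file; "accelerator" is the one key
# visible elsewhere in this diff, and "v5e-8" is an illustrative value.
metadata = {"accelerator": "v5e-8"}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
  json.dump(metadata, f)
  path = f.name

# Load it the way the benchmark would consume inference_metadata_file.
with open(path) as f:
  inference_metadata = json.load(f)

print(inference_metadata["accelerator"])
# → v5e-8
os.remove(path)
```

Any additional keys the sweep expects would be read from the same dictionary.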