@zhxchen17
Contributor

Summary:
Update the export API usage and remove some dead code.

Test Plan:

NGPU=4 CONFIG_FILE=./torchtitan/models/deepseek_v3/train_configs/debug_model.toml ./run_train.sh --model.name compiler_toolkit.deepseek_v3 --parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=2 --parallelism.expert_parallel_degree=2 --activation_checkpoint.mode none

NGPU=4 CONFIG_FILE=./torchtitan/models/deepseek_v3/train_configs/debug_model.toml ./run_train.sh --model.name compiler_toolkit.deepseek_v3 --parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=2 --parallelism.expert_parallel_degree=2 --activation_checkpoint.mode none --model.flavor=debugmodel_flex_attn

NGPU=8 CONFIG_FILE=./torchtitan/models/llama3/train_configs/debug_model.toml ./run_train.sh --model.name compiler_toolkit.llama3 --parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=4

NGPU=8 CONFIG_FILE=./torchtitan/models/llama3/train_configs/debug_model.toml ./run_train.sh --model.name compiler_toolkit.llama3 --parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=4 --model.flavor=debugmodel_flex_attn
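
The four commands above differ only in GPU count, model, and a few flags, so they can be generated from one helper. The following is a minimal sketch, not part of this PR: `RUNNER`, `run_case`, `DEEPSEEK_ARGS`, and `LLAMA_ARGS` are hypothetical names introduced for illustration, and the script defaults to printing each command rather than launching training.

```shell
#!/bin/sh
# Sketch: print the four test-plan commands instead of launching training.
# RUNNER, run_case, DEEPSEEK_ARGS and LLAMA_ARGS are illustrative names only.

# With RUNNER=echo (the default) each command line is only printed.
RUNNER="${RUNNER:-echo}"

# run_case <ngpu> <model> [extra flags...]
run_case() {
  ngpu="$1"
  model="$2"
  shift 2
  "$RUNNER" NGPU="$ngpu" \
    CONFIG_FILE="./torchtitan/models/${model}/train_configs/debug_model.toml" \
    ./run_train.sh --model.name "compiler_toolkit.${model}" "$@"
}

# Flags copied verbatim from the test plan.
DEEPSEEK_ARGS="--parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=2 --parallelism.expert_parallel_degree=2 --activation_checkpoint.mode none"
LLAMA_ARGS="--parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=4"

# The variables are left unquoted on purpose so each flag becomes its own argument.
run_case 4 deepseek_v3 $DEEPSEEK_ARGS
run_case 4 deepseek_v3 $DEEPSEEK_ARGS --model.flavor=debugmodel_flex_attn
run_case 8 llama3 $LLAMA_ARGS
run_case 8 llama3 $LLAMA_ARGS --model.flavor=debugmodel_flex_attn
```

Running the script with `RUNNER=env` would execute each command through `env`, which sets `NGPU` and `CONFIG_FILE` before invoking `./run_train.sh`.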

@meta-cla meta-cla bot added the CLA Signed label Nov 3, 2025
@zhxchen17 zhxchen17 marked this pull request as draft November 3, 2025 20:10
@zhxchen17
Contributor Author

Will wait for the nightly build to catch up

@zhxchen17 zhxchen17 force-pushed the zhxchen17/update_export_api branch 2 times, most recently from 314f20a to 47cc583 Compare November 5, 2025 14:11
@zhxchen17 zhxchen17 marked this pull request as ready for review November 5, 2025 14:13
@zhxchen17
Contributor Author

cc @yiming0416

@zhxchen17 zhxchen17 force-pushed the zhxchen17/update_export_api branch from 47cc583 to 70897e3 Compare November 5, 2025 15:19
@zhxchen17 zhxchen17 marked this pull request as draft November 5, 2025 15:43
@zhxchen17 zhxchen17 force-pushed the zhxchen17/update_export_api branch 2 times, most recently from 751702a to 5f8af78 Compare November 5, 2025 15:48
@zhxchen17 zhxchen17 marked this pull request as ready for review November 5, 2025 15:48
@zhxchen17
Contributor Author

The test failure doesn't reproduce on the latest PyTorch main branch.

@yiming0416 yiming0416 self-requested a review November 6, 2025 16:56
@zhxchen17 zhxchen17 force-pushed the zhxchen17/update_export_api branch from 5f8af78 to 1c8af0d Compare November 6, 2025 17:09
@zhxchen17 zhxchen17 marked this pull request as draft November 6, 2025 19:15
@zhxchen17 zhxchen17 force-pushed the zhxchen17/update_export_api branch from 1c8af0d to 3a5bea6 Compare November 6, 2025 23:12
@zhxchen17 zhxchen17 marked this pull request as ready for review November 6, 2025 23:13
@yiming0416 yiming0416 merged commit 268020d into pytorch:main Nov 6, 2025
5 checks passed
jquesnelle pushed a commit to NousResearch/torchtitan that referenced this pull request Nov 10, 2025