issues Search Results · repo:pytorch/torchtitan language:Python
280 results
Bug description
I am seeing Linear layer weights in float32 (wq.weight.dtype torch.float32) even after setting the following.
mixed_precision_param = bfloat16
mixed_precision_reduce = bfloat16
Is ...
question
githubsgi
- 11
- Opened yesterday
- #1027
Wondering if there is any plan to add the 1B and/or 3B models to the TorchTitan set of example models? It is probably
fairly straightforward to do that, if I am not missing anything. Another toml file ...
githubsgi
- 3
- Opened yesterday
- #1026
Bug description
CONFIG_FILE=./torchtitan/models/llama/train_configs/debug_model.toml ./run_train.sh --model.use_flex_attn --training.compile
I have a long stack trace with TORCHDYNAMO_VERBOSE=1:
...
module: flex attention
lkhphuc
- 1
- Opened 6 days ago
- #1005
I've been working on something similar, see https://github.com/PrimeIntellect-ai/CPUOptimizer, but I see a problem in
your implementation.
If you hide the optimizer step inside the backward pass, the ...
apaz-cli
- 7
- Opened 9 days ago
- #990
Currently TorchTitan supports PP, CP, FSDP, PP parallelisms. Is there a plan to support Expert Parallelism (EP)? Along
the same lines, I see some DeepSeek files in the repo. Is there a plan to support DeepSeek ...
question
githubsgi
- 4
- Opened 9 days ago
- #987
I would appreciate it if someone could share a toml file for running PP+FSDP+TP on the 405B model.
githubsgi
- 2
- Opened 9 days ago
- #986
Bug description
I've been playing around with the experimental DeepSeek code and in-progress PRs and have a simple SFT training loop
implemented. The results on the first few steps look pretty promising ...
EugenHotaj
- 1
- Opened 10 days ago
- #979
Today run.py is invoked directly. The model is expected to provide a TrainSpec so that it can be invoked by Titan's
train.py.
kwen2501
- Opened 11 days ago
- #975
Today run.py does not do checkpointing during training. We need to hook it to Titan's DCP.
kwen2501
- Opened 11 days ago
- #974
