@incredere
Contributor

Ubench run ID: nemo2_training-llama3-1-70b-seq8192-gbs1024-mbs1-gpus16-2025-11-01_010353-a39411ab-7bae-4f63-b944-b43244a180c0

@incredere incredere requested a review from junjieqian November 4, 2025 01:13
@junjieqian
Collaborator

The code looks good to me. One question, though: why do we need a duplicated setup for a nearly identical recipe that differs only in GBS? Is anything blocking us from reusing the same code and documenting the different parameters in the README?
I understand we want to pipe-stream the recipe code, but this duplication makes me concerned about code quality regressing in the repo.
CC @tonyjohnchen @bwuu, WDYT?

@tonyjohnchen tonyjohnchen merged commit cf687e7 into AI-Hypercomputer:main Nov 5, 2025
1 check passed
@incredere incredere deleted the 2node-bf16-seq8192-gbs1024 branch November 5, 2025 21:35
3 participants