feature/fp32 weight master copy #328

Merged May 1, 2024 (5 commits)

@karpathy (Owner) commented May 1, 2024

optionally keep a master copy of the params in fp32
the added flag is -w 0/1, where 1 is the default (i.e. by default we DO keep the fp32 copy)
this increases memory use by one additional float copy of the params (4 bytes per parameter), but the running time seems ~unaffected
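
For readers wondering why the extra copy is worth the memory, here is a minimal CPU-only sketch. It is not the code from this PR. It assumes the working params are stored in bf16, emulates bf16 rounding in software, and uses a plain additive step instead of the real optimizer update. `round_to_bf16` and `train_weight` are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Round an fp32 value to bf16 precision (round-to-nearest-even) and return it
// widened back to float. bf16 keeps the top 16 bits of the fp32 encoding.
// NaN/Inf handling is omitted to keep the sketch short.
static float round_to_bf16(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    uint32_t lsb = (bits >> 16) & 1u;   // lowest bit that survives truncation
    bits += 0x7FFFu + lsb;              // round to nearest, ties to even
    bits &= 0xFFFF0000u;                // drop the low 16 mantissa bits
    float y;
    memcpy(&y, &bits, sizeof y);
    return y;
}

// Hypothetical update loop (an assumption, not the PR's kernel): apply `steps`
// updates of size `delta` to one weight and return the low-precision weight.
// With use_master_weights = 1 the updates accumulate in an fp32 master copy and
// the bf16 weight is re-derived from it each step. With 0 the bf16 weight is
// updated in place, so each step is rounded away if it is too small.
static float train_weight(float init, float delta, int steps, int use_master_weights) {
    assert(steps >= 0);
    float master = init;                // fp32 master copy
    float param = round_to_bf16(init);  // low-precision working weight
    for (int i = 0; i < steps; i++) {
        if (use_master_weights) {
            master += delta;
            param = round_to_bf16(master);
        } else {
            param = round_to_bf16(param + delta);
        }
    }
    return param;
}
```

With `init = 1.0f`, `delta = 0.001f` and 1000 steps, the in-place variant never moves off 1.0, because the bf16 spacing just above 1.0 is 2^-7 (about 0.0078) and each step is below half of that. The master-copy variant ends at 2.0.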

@@ -2429,6 +2454,7 @@ int main(int argc, char *argv[]) {
int overfit_single_batch = 0; // useful for debugging, 1 = only load a single data batch once
int max_steps = -1;
int override_enable_tf32 = 1;
int use_master_weights = 1;
Contributor

+1 for this naming change, the name in my original PR was just confusing :)
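
As a rough illustration of how a `-w 0/1` flag could feed the `use_master_weights` default from the hunk above, here is a small sketch. The helper `parse_use_master_weights` is hypothetical and the loop shape is an assumption. The PR's actual argument handling is not shown in this excerpt.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical helper (not from the PR): scan argv as "-flag value" pairs and
// return the master-weights setting. Defaults to 1, matching the default
// stated in the PR description.
static int parse_use_master_weights(int argc, char *argv[]) {
    assert(argc >= 1);
    int use_master_weights = 1;  // default: keep the fp32 master copy
    for (int i = 1; i + 1 < argc; i += 2) {
        // match exactly "-w"; any other flag is skipped along with its value
        if (argv[i][0] == '-' && argv[i][1] == 'w' && argv[i][2] == '\0') {
            use_master_weights = (atoi(argv[i + 1]) != 0);
        }
    }
    return use_master_weights;
}
```

Passing `-w 0` would turn the master copy off. Omitting the flag leaves it on.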

@karpathy karpathy merged commit 4dd1ab4 into master May 1, 2024
6 checks passed