Move model to device #2

dariocazzani · 2025-08-22T21:06:23Z

What does this PR do?

It moves the model to the same device as the tensors.

ayeganov · 2025-08-22T21:19:33Z

scratchgpt/main.py

    latest_model_path = get_latest_model_weights_path(args.experiment)

    model = TransformerLanguageModel(NUM_HEADS, tokenizer.vocab_size, N_EMBED, BLOCK_SIZE, NUM_BLOCKS)
+    model = model.to(DEVICE)


Maybe we should fix it in load_model?

Unless you want to change the signature, we can't.
We need to assign model = model.to(DEVICE)

So we either just add that line, or we change

def load_model(model_path: str, model: nn.Module, device: torch.device) -> None

to

def load_model(model_path: str, model: nn.Module, device: torch.device) -> nn.Module:
    ...
    return model

But it's messy because we are doing:

else:
    print("No model path exists, proceeding with a new model")

In my experience, I always have a line like

model = model.to(DEVICE)

after creation
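For reference, a minimal sketch of what that line does, based on PyTorch's documented behavior: `nn.Module.to` moves the parameters in place and returns the same module, so for modules the reassignment is a safe habit rather than a strict requirement. `DEVICE` and the `nn.Linear` stand-in below are assumptions for illustration, not the values used in scratchgpt.

```python
import torch
from torch import nn

# Assumed device selection; the project's real DEVICE constant may differ.
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for TransformerLanguageModel, just to show the device handling.
model = nn.Linear(4, 2)

# Module.to modifies the module in place and returns self.
moved = model.to(DEVICE)
assert moved is model

# Inputs must live on the same device as the model's parameters.
x = torch.randn(3, 4, device=DEVICE)
out = model(x)
```

Tensors behave differently: `tensor.to(device)` returns a new tensor when a move is needed, which is why the reassignment pattern is worth keeping for consistency.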

Gotcha - then I think we can remove model.to from load_model and add this line.

But we could also do return model.to(DEVICE) from load_model, since we always provide a valid model pointer.
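A rough sketch of that second option, with `load_model` returning the model already on the target device. The signature and the "No model path exists" branch come from this thread; the body is an assumption, in particular that the checkpoint file holds a plain `state_dict`, which may not match what scratchgpt actually saves.

```python
import os

import torch
from torch import nn


def load_model(model_path: str, model: nn.Module, device: torch.device) -> nn.Module:
    """Load weights into `model` if a checkpoint exists, then return it on `device`."""
    if os.path.exists(model_path):
        # Assumption: the file is a state_dict written with torch.save.
        state_dict = torch.load(model_path, map_location=device)
        model.load_state_dict(state_dict)
    else:
        print("No model path exists, proceeding with a new model")
    # Both branches end up here, so the caller always gets a model on `device`.
    return model.to(device)
```

The caller would then write `model = load_model(latest_model_path, model, DEVICE)`, which keeps the device move in one place whether or not weights were found.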

Done. If you disagree, I don't care.

ayeganov

Ship it!

dariocazzani requested a review from ayeganov August 22, 2025 21:06

ayeganov reviewed Aug 22, 2025

dariocazzani force-pushed the hotfix/model_to_device branch from cf9bdf6 to fd1cde2 Compare August 22, 2025 23:55

dariocazzani requested a review from ayeganov August 22, 2025 23:55

Move model to device

cad5896

dariocazzani force-pushed the hotfix/model_to_device branch from fd1cde2 to cad5896 Compare August 22, 2025 23:56

ayeganov approved these changes Aug 23, 2025

dariocazzani merged commit 3536277 into main Aug 23, 2025

dariocazzani deleted the hotfix/model_to_device branch August 23, 2025 01:18
