Hi @EasternJournalist, could you confirm or deny some of the hyperparameters that were used for training? This is what I gathered from various sources. In particular, I suspect that 1 million training steps for MoGe (v1) is not actually correct? And what was the batch size for MoGe-2?
Thanks in advance!
| | MoGe | MoGe-2 |
|---|---|---|
| batch size | 256 (paper) | 128 (comment) |
| steps | ~160k (comment) | 120k (paper) |
| base LR | 1e-4/1e-5 (configs/train/v1.json) | 1e-4/1e-5 (paper) |
| LR schedule | "more conservative than 2" | half every 25k steps (paper) |
| GPUs | 64 V100 (#9) | 32 A100 (paper) |
| time | one week (#9) | 5 days (paper) |
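For concreteness, my reading of the MoGe-2 schedule ("halve the LR every 25k steps") would look like the step-decay sketch below. This is purely illustrative, not the authors' code; the function name, the base LR of 1e-4, and the 120k total steps are my assumptions from the table above.

```python
# Illustrative sketch only (not the repository's training code):
# a step-decay schedule that halves the learning rate every 25k steps,
# as I understand the MoGe-2 paper's description.
def lr_at_step(step, base_lr=1e-4, decay_every=25_000, factor=0.5):
    """Return the learning rate in effect at a given training step."""
    return base_lr * factor ** (step // decay_every)

# Under this reading, over 120k total steps the LR halves four times:
# step 0 -> 1e-4, step 25k -> 5e-5, step 50k -> 2.5e-5, step 100k -> 6.25e-6
```

Is this the schedule that was actually used, or was there warmup / a different decay curve on top?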