Fresh runs can save a checkpoint at step 0 and tensorboard writes unconditionally #3790

@janmarxen

Bug report

MaxText appears to perform checkpoint-related work even when the flags are set to disable it. In my case, the run used load_parameters_path, enable_checkpointing=True (to enable initial parameter loading), checkpoint_period=1000000, and save_checkpoint_on_completion=False, and it still attempted checkpoint-related work in a 50-step training run.
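
To illustrate how a step-0 save could happen despite a very large checkpoint_period, here is a minimal sketch. It assumes, without confirmation from the MaxText source, that the save decision is a plain `step % checkpoint_period == 0` test. Both helper names are hypothetical and are not MaxText functions.

```python
def naive_should_save(step: int, checkpoint_period: int) -> bool:
    """Hypothetical period check: 0 % n == 0, so step 0 always passes."""
    return step % checkpoint_period == 0


def guarded_should_save(step: int, checkpoint_period: int, start_step: int = 0) -> bool:
    """Hypothetical variant that never saves at the step the run started from."""
    if step == start_step:
        return False
    return step % checkpoint_period == 0


if __name__ == "__main__":
    period = 1_000_000  # same value as checkpoint_period in the report
    # Steps in a 50-step run at which each check would trigger a save.
    print([s for s in range(50) if naive_should_save(s, period)])
    print([s for s in range(50) if guarded_should_save(s, period)])
```

Under that assumption, the naive check fires exactly once in a 50-step run, at step 0, while the guarded variant never fires. Whether this is the actual cause would need to be checked against the training loop.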

The same issue applies to enable_tensorboard: write_setup_info_to_tensorboard ignores the flag and writes unconditionally.
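
The expected behavior for the TensorBoard flag can be sketched as a simple guard. This is only an illustration: `Config` and `maybe_write_setup_info` are hypothetical stand-ins, and the real signature of write_setup_info_to_tensorboard is not assumed here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Config:
    """Hypothetical stand-in for the run config; only the flag matters here."""
    enable_tensorboard: bool = False


def maybe_write_setup_info(config: Config, write_fn: Callable[[], None]) -> bool:
    """Call write_fn only when TensorBoard is enabled; return whether it ran."""
    if not config.enable_tensorboard:
        return False
    write_fn()
    return True


if __name__ == "__main__":
    calls = []
    # With the flag off, the writer must not be invoked.
    maybe_write_setup_info(Config(enable_tensorboard=False), lambda: calls.append("write"))
    # With the flag on, the writer runs once.
    maybe_write_setup_info(Config(enable_tensorboard=True), lambda: calls.append("write"))
    print(calls)
```

The guard could equally live inside the writer itself; placing it at the call site is just one option.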

@giusgal and @janmarxen

Logs/Output

No response

Environment Information

No response

Additional Context

No response

Labels: bug