The kat model training time problem #30

Hi, I read in the paper that you trained KAT on a single A5000 GPU. I am using a single A6000. When I train kat_base with the batch size raised to 512, one epoch takes up to a day. I then tried a smaller model, kat_tiny, with the batch size set to 1024, and one epoch still takes up to 10 hours, which is very time-consuming. Is this normal, or am I doing something wrong?

![Image](https://github.com/user-attachments/assets/d4b78719-073f-4446-9f88-d89a3a7350f6)
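To put the reported epoch times in perspective, here is a rough sketch that converts them into throughput. It assumes the ImageNet-1k training split (1,281,167 images), which the post does not state, and reads "up to a day" as 24 hours. The names `throughput` and `ASSUMED_TRAIN_IMAGES` are illustrative only and are not part of the KAT code.

```python
import math

# Assumption: ImageNet-1k training split; the post does not name the dataset.
ASSUMED_TRAIN_IMAGES = 1_281_167


def throughput(epoch_hours, batch_size, num_images=ASSUMED_TRAIN_IMAGES):
    """Return (images per second, seconds per training step) implied by one epoch's duration."""
    epoch_seconds = epoch_hours * 3600
    images_per_second = num_images / epoch_seconds
    # The last batch may be partial, so round the step count up.
    steps_per_epoch = math.ceil(num_images / batch_size)
    seconds_per_step = epoch_seconds / steps_per_epoch
    return images_per_second, seconds_per_step


if __name__ == "__main__":
    # Figures taken from the post: (model, hours per epoch, batch size).
    for name, hours, batch in [("kat_base", 24, 512), ("kat_tiny", 10, 1024)]:
        ips, sps = throughput(hours, batch)
        print(f"{name}: {ips:.1f} images/s, {sps:.1f} s/step")
```

If the images-per-second figure stays roughly the same when the batch size or model size changes, the bottleneck may be outside the model itself (for example data loading), but that is only a guess without profiling.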
