Add a 2-slice pallas training test in pre-submit CI #8850

@tengyifei

We should have a test that trains a very simple model with a Pallas kernel across two TPU v4 slices and checks that it doesn't hang.

Currently, our pre-submit CI runs only on a single TPU v4 slice, so it doesn't cover multi-slice training.

Post-submit CI relies on people diligently monitoring results and reverting bad changes, which has proven ineffective. As long as we can afford it, we should test in pre-submit rather than post-submit.
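One way the "doesn't hang" check could be wired into CI is to run the training script in a subprocess with a hard timeout and treat a timeout as a failure. The sketch below is only an illustration under that assumption: `run_with_hang_check` is a hypothetical helper, and the demo command is a stand-in for whatever 2-slice Pallas training script ends up being used. It uses only the Python standard library and does not launch anything on TPU.

```python
import subprocess
import sys


def run_with_hang_check(cmd, timeout_s):
    """Run `cmd` and classify the outcome as 'ok', 'failed', or 'hung'.

    Hypothetical helper: returns (status, returncode). A run that exceeds
    `timeout_s` seconds is reported as 'hung' with a returncode of None.
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising on timeout.
        return "hung", None
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.returncode


if __name__ == "__main__":
    # Placeholder command; a real test would invoke the multi-slice
    # training script here instead (assumption, not part of this issue).
    demo_cmd = [sys.executable, "-c", "print('training step finished')"]
    status, code = run_with_hang_check(demo_cmd, timeout_s=60)
    print(status, code)
```

A CI job would then fail whenever the status is anything other than `ok`. The timeout needs to be generous enough to cover compilation and slice startup, otherwise the check becomes flaky.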


Labels: testing (Testing and coverage related issues), xla:tpu (TPU specific issues and PRs)
