[CI] Determine which gpu tests, if any, can be parallelized, and strategy to do so #8675

Closed
areusch opened this issue Aug 6, 2021 · 1 comment
Labels
needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it

Comments

@areusch
Contributor

areusch commented Aug 6, 2021

#8576 is enabling pytest-xdist for CPU-targeted TVM tests. This is a good first step to parallelizing the TVM test suite, but the real benefits will come if we can find a way to do this for tests that target GPUs. It's not clear if this can be done.

I think:

  • we can probably safely run smaller GPU tests in parallel
  • we don't have a good list of which tests those are so we don't know how much benefit this will provide
  • we should probably write more smaller tests, though
  • we should probably build tvm.testing decorators to formally identify these in code
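
As a rough illustration of the last bullet, here is a minimal sketch of what such a decorator could look like. The names `gpu_parallel_safe` and `split_by_parallel_safety` are hypothetical and are not part of `tvm.testing`; the test bodies are placeholders.

```python
# Hypothetical sketch: none of these names exist in tvm.testing.
PARALLEL_SAFE_ATTR = "_gpu_parallel_safe"


def gpu_parallel_safe(func):
    """Tag a GPU test as small enough to share a device with other tests."""
    setattr(func, PARALLEL_SAFE_ATTR, True)
    return func


def split_by_parallel_safety(tests):
    """Partition test functions into (parallel-safe, serial-only) lists."""
    parallel, serial = [], []
    for test in tests:
        is_safe = getattr(test, PARALLEL_SAFE_ATTR, False)
        (parallel if is_safe else serial).append(test)
    return parallel, serial


@gpu_parallel_safe
def test_small_vector_add():
    pass  # placeholder; a real test would build and run a small kernel


def test_large_conv2d_tuning():
    pass  # untagged, so it stays in the serial group


parallel, serial = split_by_parallel_safety(
    [test_small_vector_add, test_large_conv2d_tuning]
)
print([t.__name__ for t in parallel])
print([t.__name__ for t in serial])
```

In a pytest-based suite the same tag could probably be expressed as a custom `pytest.mark` marker, so the tagged group can be selected with `-m` and run under pytest-xdist while the untagged group runs serially. Tagging tests explicitly would also give us the list of parallelizable tests that we are currently missing.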

Known unknowns that could break this approach:

  • Are there ordering problems with some tests?
  • I think we generate cached testdata for re-use between tests, and that needs to be broken out into pytest fixtures
  • Other driver-level issues that could break this.
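
For the cached testdata point, a minimal sketch of one way to make a shared cache tolerate parallel workers, assuming the data is JSON-serializable. The helper `load_or_create_testdata` and the sample data are hypothetical, not existing TVM code.

```python
import json
import os
import tempfile


def load_or_create_testdata(cache_dir, name, generate):
    """Return cached data for `name`, calling `generate` only on a cache miss."""
    path = os.path.join(cache_dir, name + ".json")
    if not os.path.exists(path):
        # Write to a temp file in the same directory, then rename it into
        # place, so a concurrent worker never reads a half-written file.
        fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
        with os.fdopen(fd, "w") as tmp:
            json.dump(generate(), tmp)
        os.replace(tmp_path, path)
    with open(path) as f:
        return json.load(f)


calls = []


def make_data():
    # Hypothetical stand-in for an expensive testdata generator.
    calls.append(1)
    return {"shape": [2, 3], "values": [0, 1, 2, 3, 4, 5]}


with tempfile.TemporaryDirectory() as cache_dir:
    first = load_or_create_testdata(cache_dir, "small_input", make_data)
    second = load_or_create_testdata(cache_dir, "small_input", make_data)

print(first == second, len(calls))
```

Two workers that miss the cache at the same moment may both generate the data, but each rename installs a complete file, so readers see consistent contents. Wrapped in a pytest fixture, this would also make the dependency on the data explicit instead of relying on test ordering.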

Feel free to generate actionable child issues as a result of this. If things are not actionable and require deliberation, it would be great to post up to discuss.tvm.ai instead--we like to keep issues in GH to only those with clear steps to resolve.

@mikepapadim to take a look at this
cc @Mousius @denise-k @driazati @gigiblender @jroesch @leandron

@areusch areusch added the needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it label Oct 19, 2022
@Lunderberg
Contributor

Closed, as the CI performance push was largely completed. Revisiting this would need re-timing to find the current bottlenecks.
