Flaky Tests in CircleCI #908

Open
anupambhatnagar opened this issue Jan 12, 2022 · 4 comments · Fixed by #917

anupambhatnagar commented Jan 12, 2022

Here is a list of flaky tests that we should fix in our next fix-a-thon.

  • test_shared_weight_mevo[optim_state-flat]
  • test_regnet[pytorch-flatten-mixed]
  • test_shared_weight_mevo[train-none]
  • test_shared_weight_mevo[train-nonflat]
  • test_fsdp_memory[fsdp_amp_default-ckpt]; fails due to timeout.
  • test1[no_flatten-full]; fails due to timeout.
@min-xu-ai
Contributor

Also, the layer memory tracker test seems to be flaky:

tests.experimental.tooling.test_layer_memory_tracker

>       assert summary.total_forward_allocations >= summary.total_activation_allocations
E       assert 77056000 >= 77070864
E        +  where 77056000 = LayerwiseMemoryTrackerSummary(max_memory_allocated=104022528, max_memory_cached=134217728, total_activation_allocation... is_forward=True, all_gathered=0, cumul_all_gathered=0, event=TraceForwardEvent(memory_diff=0, memory_activations=0))]).total_forward_allocations
E        +  and   77070864 = LayerwiseMemoryTrackerSummary(max_memory_allocated=104022528, max_memory_cached=134217728, total_activation_allocation... is_forward=True, all_gathered=0, cumul_all_gathered=0, event=TraceForwardEvent(memory_diff=0, memory_activations=0))]).total_activation_allocations

tests/experimental/tooling/test_layer_memory_tracker.py:65: AssertionError
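
To put the failure above in perspective, the following is a minimal sketch that computes how far the forward total fell short of the activation total, using the two numbers from the assertion message. The helper names `shortfall` and `within_slack`, and the idea of allowing a relative slack, are assumptions made purely for illustration; they are not part of the test suite and are not presented as the fix that was eventually merged.

```python
from typing import Tuple


def shortfall(forward_bytes: int, activation_bytes: int) -> Tuple[int, float]:
    """Return the (absolute, relative) amount by which forward is below activations."""
    gap = activation_bytes - forward_bytes
    return gap, gap / activation_bytes


def within_slack(forward_bytes: int, activation_bytes: int, rel_slack: float) -> bool:
    """Hypothetical check: forward >= activations once a relative slack is allowed."""
    return forward_bytes >= activation_bytes * (1.0 - rel_slack)


# Values copied from the failing assertion in the log above.
gap, rel = shortfall(77056000, 77070864)
print(f"gap = {gap} bytes, relative = {rel:.4%}")
```

The gap is 14864 bytes, roughly two hundredths of a percent of the activation total. Whether a tolerance is appropriate here, or whether the strict inequality should hold and the tracker itself needs fixing, is a judgment the sketch does not settle.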

Contributor

anj-s commented Feb 8, 2022

Another flaky test for the list:
tests/nn/data_parallel/test_fsdp.py: TestSerialization
