CI Test Failing: build_dockerfile_bazel_cache_prod #15332

Closed
shashidhar-patil opened this issue Oct 30, 2023 · 7 comments
Assignees
Labels
type: bug Something isn't working

Comments

@shashidhar-patil
Contributor

Your Environment

  • Version: v1.9
  • Affected Component: Access Gateway
  • Affected Subcomponent: AGW service
  • Deployment Environment: Vagrant (AGW)

Describe the Issue

The Magma CI job build_dockerfile_bazel_cache_prod is failing: the Bazel tests fail because LeakSanitizer reports memory leaks.
Screenshot from 2023-10-30 13-09-55

However, the memory leak appears to be in the liblsan library itself and not in the Magma codebase.
Screenshot from 2023-10-30 13-10-35

Issue Analysis

To Reproduce

  1. [HOST] git clone https://github.com/magma/magma.git
  2. [HOST] cd magma/lte/gateway
  3. [HOST] vagrant up magma
  4. [HOST] vagrant ssh magma
  5. [VM] cd magma/lte/gateway
  6. [VM] apt list -i | grep liblsan
    A fresh installation of Magma comes with the older 10.3.0 version of liblsan0, in which the issue doesn't exist.
  7. Run bazel test with leak sanitizer.
    [VM] bazel test --config=production --flaky_test_attempts=5 --test_tag_filters=-manual `bazel query "kind(cc_test, //...)"`
  8. Now upgrade the liblsan0 package to 10.5.0 (the issue exists with this version).
    [VM] sudo apt install --only-upgrade liblsan0
  9. [VM] bazel test --config=production --flaky_test_attempts=5 --test_tag_filters=-manual `bazel query "kind(cc_test, //...)"`
    The issue is reproduced.
    Screenshot from 2023-10-30 13-10-35
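
To tell quickly which side of the repro a VM is on, the installed version from step 6 can be compared against the versions named in this issue. This is a minimal sketch, assuming GNU `sort -V` is available. The function name `lsan_version_status` is hypothetical, and the cut-offs are assumptions: only 10.3.0 (not affected) and 10.5.0 (affected) were observed here, and 11+ is treated as fixed based on the gcc11+ note under Probable Fixes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a liblsan0 package version string.
# Assumptions: versions >= 10.5.0 and < 11 are treated as affected;
# only 10.3.0 (ok) and 10.5.0 (affected) are confirmed in this issue.
lsan_version_status() {
  local version="$1"
  local threshold="10.5.0"
  # Drop the Debian revision, e.g. "10.5.0-1ubuntu1~20.04" -> "10.5.0".
  local upstream="${version%%-*}"
  # Major version 11 or later is assumed to carry the gcc11+ fix.
  if [ "${upstream%%.*}" -ge 11 ]; then
    echo "ok"
    return 0
  fi
  # The threshold sorts first (or ties) when upstream >= threshold.
  local lowest
  lowest="$(printf '%s\n%s\n' "$threshold" "$upstream" | sort -V | head -n1)"
  if [ "$lowest" = "$threshold" ]; then
    echo "affected"
  else
    echo "ok"
  fi
}
```

On the VM it could be called as `lsan_version_status "$(dpkg-query -W -f='${Version}' liblsan0)"`.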

Expected behavior

Successful execution of the Magma CI: build_dockerfile_bazel_cache_prod

Probable Fixes

  1. Move to Ubuntu 22.04
    • Pro: This issue is fixed in gcc11+ through the reimplementation of the dlsym allocator.
    • Con: This will need additional effort to test and stabilize Magma on Ubuntu 22.04.
  2. Disable the Leak Sanitizer entirely from the Production build. (Preferred as it unblocks v1.9 PRs)
    • Pro: Removing the --config=lsan --copt=O3 from the production build fixes the issue.
    • Con: Some leaks may go unnoticed. However, Address Sanitizer still runs as part of CI and should catch most of the leaks.
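
To illustrate option 2, here is a small sketch that previews a bazelrc file with `--config=lsan` dropped from the production config. It rests on assumptions: that the production config is defined by lines of the form `build:production ...` or `test:production ...`, and that the file is named `.bazelrc`. The function name `preview_without_lsan` is hypothetical. It handles only the lsan flag, and the actual change is whatever the fix PR contains, not this script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a bazelrc with --config=lsan removed from
# the production config lines. Output goes to stdout; the file itself
# is not modified.
# Assumption: production options live on lines starting with
# "build:production" or "test:production".
preview_without_lsan() {
  local bazelrc="${1:-.bazelrc}"
  sed -E '/^(build|test):production[[:space:]]/ s/[[:space:]]+--config=lsan([[:space:]]|$)/\1/' "$bazelrc"
}
```

Piping the result through `diff` against the original file, for example `preview_without_lsan .bazelrc | diff .bazelrc -`, shows exactly which lines would change before anything is edited.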
@panyogesh
Contributor

It seems apt does not offer version 10.3.0; only version 10.5.0 is available (sudo apt-get install -y liblsan0=10.5.0-1ubuntu1~20.04).
10.3.0 seems to be available only in the archives. Forcing a downgrade to 10.3.0 leaves the system with broken dependencies (reference-link).
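
One way to check which versions the configured apt sources offer is `apt-cache madison liblsan0`, which prints one `package | version | source` row per candidate. The sketch below pulls out the unique version column. The function name `available_versions` is hypothetical, and it assumes the pipe-separated layout that `apt-cache madison` normally prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `apt-cache madison <pkg>` output on stdin
# and print each distinct version once.
# Assumption: rows look like "pkg | version | source".
available_versions() {
  awk -F'|' 'NF >= 2 { gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2 }' | sort -u
}
```

Usage on the VM would be `apt-cache madison liblsan0 | available_versions`.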

@lucasgonze
Contributor

For solution 1 "Move to Ubuntu 22.04", how much uncertainty does this create? How much new work is it?

@panyogesh
Contributor

> For solution 1 "Move to Ubuntu 22.04", how much uncertainty does this create? How much new work is it?

Based on our past experience, roughly 3 months with 3-4 engineers.

@lucasgonze
Contributor

> 3 months 3-4 eng

My guess is that we don't have that many hours available for an edge case like this. Do you agree, Yogesh?

@panyogesh
Contributor

> > 3 months 3-4 eng
>
> My guess is that we don't have that many hours available for an edge case like this. Do you agree, Yogesh?

Yes

@panyogesh
Contributor

Meanwhile, to unblock the CI, we have raised PR #15334.

@tapasmishra

We can consider the issue fixed for now. It seems the changes in PR #15334 are working, and we are no longer seeing the issue.
