Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v2] Fix e2e gRPC remote storage server instability #5459

Closed
wants to merge 1 commit into from

Conversation

james-ryans
Copy link
Contributor

Which problem is this PR solving?

Description of the changes

  • Require the remote storage to be healthy in 30 seconds before being tested.

How was this change tested?

  • Add a 1s sleep when starting the remote storage server and run STORAGE=grpc SPAN_STORAGE_TYPE=memory make jaeger-v2-storage-integration-test
    go func() {
    defer s.wg.Done()
    if err := s.grpcServer.Serve(s.grpcConn); err != nil {
    s.logger.Error("GRPC server exited", zap.Error(err))
    }
    s.healthcheck.Set(healthcheck.Unavailable)
    }()

Checklist

Signed-off-by: James Ryans <james.ryans2012@gmail.com>
@james-ryans james-ryans requested a review from a team as a code owner May 18, 2024 17:43
Copy link

codecov bot commented May 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.46%. Comparing base (14cbcea) to head (42e6905).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5459   +/-   ##
=======================================
  Coverage   95.46%   95.46%           
=======================================
  Files         331      331           
  Lines       16112    16112           
=======================================
  Hits        15382    15382           
  Misses        556      556           
  Partials      174      174           
Flag Coverage Δ
badger_v1 8.08% <ø> (ø)
badger_v2 ?
cassandra-3.x-v1 16.48% <ø> (ø)
cassandra-3.x-v2 1.86% <ø> (ø)
cassandra-4.x-v1 16.48% <ø> (ø)
cassandra-4.x-v2 1.86% <ø> (ø)
elasticsearch-7.x 1.77% <ø> (ø)
elasticsearch-8.x 1.77% <ø> (-0.02%) ⬇️
grpc_v1 9.11% <ø> (+0.01%) ⬆️
grpc_v2 7.49% <ø> (ø)
kafka 9.81% <ø> (ø)
opensearch-1.x 1.78% <ø> (+0.01%) ⬆️
opensearch-2.x 1.78% <ø> (ø)
unittests 93.98% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -25,7 +31,7 @@ type RemoteMemoryStorage struct {
}

func StartNewRemoteMemoryStorage(t *testing.T) *RemoteMemoryStorage {
logger := zaptest.NewLogger(t)
logger := zaptest.NewLogger(t, zaptest.Level(zap.DebugLevel))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debug is the default, don't think we need to specify it

conn, err := grpc.NewClient(fmt.Sprintf("localhost:%d", ports.RemoteStorageGRPC), dialOpts...)
require.NoError(t, err)
require.Eventually(t, func() bool {
s := conn.GetState()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the whole point of recent changes in grpc from Dial to NewClient was to avoid proactively creating a connection to the backend(s), so I am not sure if this will ever work.

I think a better solution would be to add "google.golang.org/grpc/health/grpc_health_v1" to the endpoint and query it (via real request, not via connection status) to verify that the server is up.

yurishkuro added a commit that referenced this pull request May 19, 2024
## Which problem is this PR solving?
- Remote storage server is missing a gRPC health check, the current use
case is for testing environment to check if the gRPC server is ready
before run the tests.

## Description of the changes
- Add "google.golang.org/grpc/health/grpc_health_v1" to remote-storage
gRPC server.
- Ping health check for remote storage server before proceeding with
tests
- Add deduping of spans by ID when retrieved from storage (when storage
is not idempotent)
- Make `testzap` loggers include source file

## How was this change tested?
- Tested with #5459 PR.

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [ ] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: James Ryans <james.ryans2012@gmail.com>
Signed-off-by: Yuri Shkuro <github@ysh.us>
Co-authored-by: Yuri Shkuro <github@ysh.us>
@yurishkuro
Copy link
Member

I think we can close this. I added pinging of the healthcheck service to #5461

@james-ryans
Copy link
Contributor Author

I think we can close this. I added pinging of the healthcheck service to #5461

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants