New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[agent] debug logs for session, node events on dispatcher, heartbeats #2486

Merged
merged 1 commit into from Jan 24, 2018

Conversation

Projects
None yet
4 participants
@anshulpundir
Contributor

anshulpundir commented Jan 17, 2018

Added info logs to the agent to track the manager its connecting to, timeouts, heartbeat from the dispatcher.

Signed-off-by: Anshul Pundir anshul.pundir@docker.com

backoff = initialSessionFailureBackoff + 2*backoff
if backoff > maxSessionFailureBackoff {
backoff = maxSessionFailureBackoff
}
log.G(ctx).WithError(err).Errorf("agent: session failed. Backoff period: %d", backoff)

This comment has been minimized.

@stevvooe

stevvooe Jan 17, 2018

Contributor

Use a field here: log.G(ctx).WithError(err).WithField("backoff", backoff).Errorf("agent: session failed").

Note this is the maximum backoff range for a random backoff. Actual backoff is rand([0, backoff)). ;)

@@ -1094,6 +1094,7 @@ func (d *Dispatcher) Heartbeat(ctx context.Context, r *api.HeartbeatRequest) (*a
}
period, err := d.nodes.Heartbeat(nodeInfo.NodeID, r.SessionID)
log.G(ctx).WithField("dispatcher", "heartbeat").Infof("agent heartbeat period %v", period)

This comment has been minimized.

@stevvooe

stevvooe Jan 17, 2018

Contributor

Might be better to use the module here, rather than a field.

@codecov

This comment has been minimized.

codecov bot commented Jan 17, 2018

Codecov Report

Merging #2486 into master will increase coverage by 0.19%.
The diff coverage is 72.41%.

@@            Coverage Diff             @@
##           master    #2486      +/-   ##
==========================================
+ Coverage   61.23%   61.42%   +0.19%     
==========================================
  Files          49      129      +80     
  Lines        6890    21313   +14423     
==========================================
+ Hits         4219    13092    +8873     
- Misses       2241     6812    +4571     
- Partials      430     1409     +979
@@ -58,6 +60,8 @@ func (b *Broker) Select(dialOpts ...grpc.DialOption) (*Conn, error) {
// connection.
func (b *Broker) SelectRemote(dialOpts ...grpc.DialOption) (*Conn, error) {
peer, err := b.remotes.Select()
log.G(context.Background()).Infof("Manager selected by agent for session: %v", peer)

This comment has been minimized.

@nishanttotla

nishanttotla Jan 17, 2018

Contributor

nit: all messages start with lowercase.

@anshulpundir anshulpundir force-pushed the anshulpundir:worker_logs branch 2 times, most recently from 5bf93bc to 8f1bb1b Jan 17, 2018

@anshulpundir anshulpundir requested review from fcrisciani and dperny Jan 17, 2018

@anshulpundir anshulpundir changed the title from [agent] Debug logs to [agent] debug logs for session, node events on dispatcher, heartbeats Jan 17, 2018

@anshulpundir anshulpundir force-pushed the anshulpundir:worker_logs branch 3 times, most recently from e0e280f to f822e10 Jan 17, 2018

@anshulpundir

This comment has been minimized.

Contributor

anshulpundir commented Jan 23, 2018

ping @stevvooe @dperny for review

@nishanttotla

This comment has been minimized.

Contributor

nishanttotla commented Jan 23, 2018

@anshulpundir CI is failing with a bunch of data races.

@anshulpundir

This comment has been minimized.

Contributor

anshulpundir commented Jan 23, 2018

Will check it out tomorrow @nishanttotla

resp, err := client.Heartbeat(heartbeatCtx, &api.HeartbeatRequest{
SessionID: s.sessionID,
})
cancel()
if err != nil {
if grpc.Code(err) == codes.NotFound {
log.G(ctx).WithFields(fields).WithError(err).Errorf("heartbeat failed")

This comment has been minimized.

@fcrisciani

fcrisciani Jan 23, 2018

Member

do you want to add here again the manager details to make it easier to identify it?

This comment has been minimized.

@anshulpundir

anshulpundir Jan 23, 2018

Contributor

Will do!

@@ -1094,6 +1108,8 @@ func (d *Dispatcher) Heartbeat(ctx context.Context, r *api.HeartbeatRequest) (*a
}
period, err := d.nodes.Heartbeat(nodeInfo.NodeID, r.SessionID)
log.G(ctx).WithField("method", "(*Dispatcher).Heartbeat").Infof("received heartbeat from worker %v, expect next heartbeat in %v", nodeInfo, period)

This comment has been minimized.

@fcrisciani

fcrisciani Jan 23, 2018

Member

maybe this one debug?

This comment has been minimized.

@anshulpundir

anshulpundir Jan 23, 2018

Contributor

Its every 5 seconds, I thought thats not too frequent for logging ?

This comment has been minimized.

@fcrisciani

fcrisciani Jan 23, 2018

Member

it's per node in the cluster, so can easily be hundred every 5 sec :D

This comment has been minimized.

@anshulpundir

anshulpundir Jan 23, 2018

Contributor

Ahh yes, I mistook this for the log on the agent.

@anshulpundir anshulpundir force-pushed the anshulpundir:worker_logs branch 3 times, most recently from f8af5f4 to fc5a878 Jan 23, 2018

resp, err := client.Heartbeat(heartbeatCtx, &api.HeartbeatRequest{
SessionID: s.sessionID,
})
cancel()
if err != nil {
if grpc.Code(err) == codes.NotFound {
log.G(ctx).WithFields(fields).WithError(err).Errorf("heartbeat to manager %v failed", s.conn.Peer())

This comment has been minimized.

@fcrisciani

fcrisciani Jan 23, 2018

Member

do you want to move this one line above, so it's printed no matter from the grpc code?

@anshulpundir anshulpundir force-pushed the anshulpundir:worker_logs branch from fc5a878 to bb17218 Jan 24, 2018

[agent] debug logs for session, node events on dispatcher, heartbeats.
Signed-off-by: Anshul Pundir <anshul.pundir@docker.com>

@anshulpundir anshulpundir force-pushed the anshulpundir:worker_logs branch from bb17218 to 6fa4dda Jan 24, 2018

@anshulpundir anshulpundir merged commit e3f09ac into docker:master Jan 24, 2018

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
codecov/project 61.42% (target 0%)
Details
dco-signed All commits are signed

@anshulpundir anshulpundir deleted the anshulpundir:worker_logs branch Jan 24, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment