New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Diagnostic client #2032

Merged
merged 2 commits into from Jan 29, 2018

Conversation

Projects
None yet
6 participants
@fcrisciani
Member

fcrisciani commented Dec 8, 2017

  1. decode internal state of service discovery and overlay
  2. remediate in case of orphan entries
  3. realign naming with moby/moby api
  4. GetEntry should return error if the key is under deletion (added unit test)

@fcrisciani fcrisciani requested review from vieux, trapier and mavenugo Dec 8, 2017

@ddebroy

Looks good. Had a couple of minor comments. One q: can the remediation be triggered automatically once in a while as part of a "garbage collection sweep" or is that too dangerous?

if err != nil {
logrus.WithError(err).Fatalf("The connection failed")
}
httpIsOk(resp.Body)

This comment has been minimized.

@ddebroy

ddebroy Dec 8, 2017

Contributor

You can directly check resp.StatusCode for 200/2xx codes which indicate HTTP OK rather than parsing resp.body for OK: https://golangcode.com/get-the-http-response-status-code/

This comment has been minimized.

@fcrisciani

fcrisciani Dec 13, 2017

Member

I think can be a good enhancement, right now the server side did not really set any specific error code, but it would be good to implement a more proper api with error codes

}
httpIsOk(resp.Body)
clsuterPeers := fetchNodePeers(*ipPtr, *portPtr, "")

This comment has been minimized.

@ddebroy

ddebroy Dec 8, 2017

Contributor

clsuster typo

This comment has been minimized.

@selansen

This comment has been minimized.

@fcrisciani
Makefile Outdated
@@ -52,6 +53,7 @@ cross-local:
@echo "🐳 $@"
go build -o "bin/dnet-$$GOOS-$$GOARCH" ./cmd/dnet
go build -o "bin/docker-proxy-$$GOOS-$$GOARCH" ./cmd/proxy
go build -o "bin/dignosticClient-$$GOOS-$$GOARCH" ./diagnose/client

This comment has been minimized.

@vieux

vieux Dec 8, 2017

Contributor

nit: why put it in /diagnose/ and not /cmd/ ? like the others ?

This comment has been minimized.

@fcrisciani

fcrisciani Dec 13, 2017

Member

yep moved

for _, k := range orphanKeys {
resp, err := http.Get(fmt.Sprintf(deleteEntry, ip, port, network, tableName, k))
if err != nil {
logrus.WithError(err).Fatalf("Failed deleting entry k:%s", k)

This comment has been minimized.

@vieux

vieux Dec 8, 2017

Contributor

do we really want to fatal here ? since it will stop the program, it won't delete the remaining entries.

This comment has been minimized.

@fcrisciani

fcrisciani Dec 13, 2017

Member

sure, changed in error

@fcrisciani fcrisciani force-pushed the fcrisciani:debug-client branch 2 times, most recently from 84493e3 to 989af94 Dec 14, 2017

@codecov-io

This comment has been minimized.

codecov-io commented Dec 14, 2017

Codecov Report

❗️ No coverage uploaded for pull request base (master@862df3a). Click here to learn what that means.
The diff coverage is 10.28%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #2032   +/-   ##
=========================================
  Coverage          ?   40.46%           
=========================================
  Files             ?      138           
  Lines             ?    22172           
  Branches          ?        0           
=========================================
  Hits              ?     8971           
  Misses            ?    11884           
  Partials          ?     1317
Impacted Files Coverage Δ
networkdb/networkdbdiagnostic.go 0% <0%> (ø)
agent.go 5.02% <0%> (ø)
networkdb/networkdb.go 67.8% <100%> (ø)
controller.go 36.74% <52.94%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 862df3a...be91c3e. Read the comment docs.

@thaJeztah

This comment has been minimized.

Member

thaJeztah commented Dec 19, 2017

@fcrisciani can you include a README / Markdown file in this PR? Misty started working on one in docker/docker.github.io#5558, and can be used as a starting point

@fcrisciani fcrisciani force-pushed the fcrisciani:debug-client branch 2 times, most recently from 3ba06ad to ef512ba Jan 4, 2018

@fcrisciani

This comment has been minimized.

Member

fcrisciani commented Jan 9, 2018

@thaJeztah @ddebroy @vieux can you guys give another pass?

@ddebroy

LGTM. Just a couple of minor string nits

A message like the following will appear in the Docker host logs:
```none
Starting the diagnose server listening on <port> for commands

This comment has been minimized.

@ddebroy

ddebroy Jan 10, 2018

Contributor

should the string diagnose be diagnostics ?

A message like the following will appear in the Docker host logs:
```none
Disabling the diagnose server

This comment has been minimized.

@ddebroy

ddebroy Jan 10, 2018

Contributor

should the string diagnose be diagnostics ?

@fcrisciani fcrisciani force-pushed the fcrisciani:debug-client branch from ef512ba to ed0721b Jan 11, 2018

@ddebroy

LGTM

@fcrisciani

This comment has been minimized.

Member

fcrisciani commented Jan 23, 2018

@thaJeztah can you also give the final blessing?

@fcrisciani fcrisciani changed the title from Diagnose client to Diagnostic client Jan 23, 2018

@fcrisciani fcrisciani force-pushed the fcrisciani:debug-client branch 3 times, most recently from 766b46b to 5b9614f Jan 24, 2018

@ddebroy

Looks good with a couple of minor comments.

// IsDebugEnable returns true when the debug is enabled
func (s *Server) IsDebugEnable() bool {
// IsDiagnosticEnable returns true when the debug is enabled
func (s *Server) IsDiagnosticEnable() bool {

This comment has been minimized.

@ddebroy

ddebroy Jan 24, 2018

Contributor

Minor nit: how about changing the name to IsDiagnosticEnable**d** like in controller
func (c *controller) IsDiagnosticEnabled() bool {

This comment has been minimized.

@fcrisciani
@@ -328,6 +328,9 @@ func (nDB *NetworkDB) GetEntry(tname, nid, key string) ([]byte, error) {
if err != nil {
return nil, err
}
if entry != nil && entry.deleting {
return nil, types.NotFoundErrorf("entry not found in table %s with network id %s and key %s", tname, nid, key)

This comment has been minimized.

@ddebroy

ddebroy Jan 24, 2018

Contributor

Masking the existence into a "not found" message may be confusing if this message makes into logs but say the actual deletion does not trigger for a while. How about making the message something along the lines of: "entry found but in deleting state. returning not found" to keep things super clear for support engineers?

This comment has been minimized.

@fcrisciani

fcrisciani Jan 24, 2018

Member

changing in: return nil, types.NotFoundErrorf("entry in table %s network id %s and key %s deleted and pending garbage collection", tname, nid, key)

@thaJeztah

Left some comments 🤗

@@ -0,0 +1,8 @@
FROM docker:17.12-rc-dind

This comment has been minimized.

@thaJeztah

thaJeztah Jan 24, 2018

Member

non-rc tag 17.12-dind is now available (https://hub.docker.com/_/docker/)

This comment has been minimized.

@thaJeztah

thaJeztah Jan 24, 2018

Member

I was also thinking why we needed the dind for this tool, but this is so that we can use it for older daemons, which do not have this functionality built-in, correct?

Should we have two versions of the image? one "minimal" (just the binary), and one dind?

RUN apk add --no-cache curl
WORKDIR /tool

This comment has been minimized.

@thaJeztah

thaJeztah Jan 24, 2018

Member

Instead of installing in /tool, you could just install in /usr/local/bin

**Standalone network:**
```bash
$ debugClient -c sd -v -net n8a8ie6tb3wr2e260vxj8ncy4

This comment has been minimized.

@thaJeztah

thaJeztah Jan 24, 2018

Member

s/debugClient/diagnosticClient/

This comment has been minimized.

@fcrisciani
**Overlay network:**
```bash
$ debugClient -port 2001 -c overlay -v -net n8a8ie6tb3wr2e260vxj8ncy4

This comment has been minimized.

@thaJeztah

thaJeztah Jan 24, 2018

Member

s/debugClient/diagnosticClient/

This comment has been minimized.

@fcrisciani
WORKDIR /tool
COPY daemon.json /etc/docker/daemon.json
COPY diagnosticClient /tool/diagnosticClient

This comment has been minimized.

@thaJeztah

thaJeztah Jan 24, 2018

Member

For a follow-up; we should make this Dockerfile a multi-stage build, and actually build the client in it (I had a branch with that, but looks like I didn't push, and it's not on my laptop 😊

Remember that table operations have ownership, so any `create entry` will be persistent till
the diagnostic container is part of the swarm.
1. Make sure that the node where the diagnostic client will run is not part of the swarm, if so do `docker swarm leave -f`

This comment has been minimized.

@thaJeztah

thaJeztah Jan 24, 2018

Member

We'll need different steps for 17.12+ daemons (as they don't have to leave the swarm, just the diagnostic client to connect to them)

This comment has been minimized.

@fcrisciani

fcrisciani Jan 24, 2018

Member

This is the container version so you will run the dind version and you will need to leave the swarm

This comment has been minimized.

@thaJeztah

thaJeztah Jan 25, 2018

Member

Understood; thinking if we should have instructions (also using a containerised version) to use for 17.12 daemons (just bind-mounting /var/run/docker.sock e.g. and connecting to the running daemon)

This comment has been minimized.

@fcrisciani

fcrisciani Jan 25, 2018

Member

added the containerized version of it, as dockereng/network-diagnostic:onlyclient that can be used as it with --net host, no need for the docker.sock

2. To run the container, use a command like the following:
```bash
$ docker container run --name net-diagnostic -d --privileged --network host fcrisciani/network-diagnostic

This comment has been minimized.

@thaJeztah

thaJeztah Jan 24, 2018

Member

Can we put the image under an official organization? (dockereng/ or dockercore/ e.g.)

This comment has been minimized.

@fcrisciani

@fcrisciani fcrisciani force-pushed the fcrisciani:debug-client branch 3 times, most recently from 198046b to 2ccffde Jan 25, 2018

@fcrisciani

This comment has been minimized.

Member

fcrisciani commented Jan 25, 2018

Note to myself, after this gets merged the moby vendoring requires code change due to changes in method names

fcrisciani added some commits Dec 6, 2017

Change diagnose module name to diagnostic
Align it to the moby/moby external api

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Diagnostic client
- the client allows to talk to the diagnostic server and
decode the internal values of the overlay and service discovery

- the tool also allows to remediate in case of orphans entries

- added README

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>

@fcrisciani fcrisciani force-pushed the fcrisciani:debug-client branch from 2ccffde to be91c3e Jan 26, 2018

@thaJeztah

LGTM, but left two suggestions 👍

`HUP` signal to the PID you found in the previous step.
```bash
kill -HUP <pid-of-dockerd>

This comment has been minimized.

@thaJeztah

thaJeztah Jan 26, 2018

Member

nit: this could be killall -HUP dockerd

`HUP` signal to the PID you found in the previous step.
```bash
kill -HUP <pid-of-dockerd>

This comment has been minimized.

@thaJeztah

thaJeztah Jan 26, 2018

Member

same here

@vieux

vieux approved these changes Jan 26, 2018

LGTM

@fcrisciani fcrisciani merged commit 20dd462 into docker:master Jan 29, 2018

2 checks passed

ci/circleci Your tests passed on CircleCI!
Details
dco-signed All commits are signed

@fcrisciani fcrisciani deleted the fcrisciani:debug-client branch Jan 29, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment