Issue when uploading endpoint TLS files #768

Closed
msgeissler opened this issue Apr 6, 2017 · 34 comments

Comments

@msgeissler

msgeissler commented Apr 6, 2017

Description
Adding a new TLS endpoint through the GUI is currently buggy: it either creates endpoints without the necessary TLS files or does not upload the files at all.

Steps to reproduce the issue:
Version A using wizard:

  1. Start the container and mount the data-volume and local socket: docker run -d -p 9000:9000 -v /var/run/docker.sock:/var/run/docker.sock -v D:\Docker\portainer-config:/data --name portainer portainer/portainer
  2. Use the wizard to add a new remote endpoint using TLS
  3. Upload fails with "Unable to upload TLS certs" (http error: mkdir /data/tls/: file exists (code=500))

Version B using endpoint-settings:

  1. Start the container and mount the data-volume and local socket: docker run -d -p 9000:9000 -v /var/run/docker.sock:/var/run/docker.sock -v D:\Docker\portainer-config:/data --name portainer portainer/portainer
  2. Select the local docker in the wizard
  3. Go to the endpoints-option in the portainer settings
  4. Add a new endpoint and try to add TLS
  5. Upload fails with "Unable to upload TLS certs" but will create new folders in /data/tls each time the add-button is clicked (http error: mkdir /data/tls/: file exists (code=500)), containing only ca.pem or ca.pem + cert.pem (no key.pem)
  6. Hitting the button a fourth time will somehow succeed and create a working endpoint (the folder does contain all files)
  7. But the endpoint overview will now show 4 endpoints, as the previous endpoints were still created (5 with the local endpoint)

Version C using endpoint-settings:

  1. Start the container and mount the data-volume and local socket: docker run -d -p 9000:9000 -v /var/run/docker.sock:/var/run/docker.sock -v D:\Docker\portainer-config:/data --name portainer portainer/portainer
  2. Select the local docker in the wizard
  3. Go to the endpoints-option in the portainer settings
  4. Add a new endpoint without TLS
  5. Edit the new endpoint and add the TLS option and all files
  6. Saving the changes will be successful, but no files will be added to the filesystem

Any other info e.g. Why do you consider this to be a bug? What did you expect to happen instead?

Technical details:

  • Portainer version: v1.12.4
  • Portainer Docker image tag (latest/arm/windows...): none (/ latest)
  • Target Docker version (the host/cluster you manage): 17.03.01-ce
  • Target Swarm version (if applicable):
  • Platform (windows/linux): windows (target: linux)
  • Browser: Chrome 57.0.2987.98 (64-bit)
@deviantony
Member

deviantony commented Apr 8, 2017

@msgeissler

Thanks for opening this issue. Unfortunately, I'm not able to reproduce all the cases.

There is currently a bug in the endpoint-init wizard when defining a remote endpoint with TLS, and I'm working on a fix. You can track progress here: #782

http error: mkdir /data/tls/: file exists (code=500): it looks like Portainer is trying to create the /data/tls directory even though it already exists. This is really strange, as Portainer only tries to create the /data/tls directory at startup if it does not exist (see https://github.com/portainer/portainer/blob/develop/api/file/file.go#L45).

As this error is raised after choosing to create a remote TLS endpoint, I'd say Portainer is not transmitting a correct endpoint ID with the HTTP query, which would lead to the creation of the /data/tls/ directory instead of /data/tls/<ID> (see https://github.com/portainer/portainer/blob/develop/api/file/file.go#L57).
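
To illustrate the hypothesis above, here is a minimal Go sketch. It is not Portainer's actual code: it assumes the folder path is built by appending the endpoint ID to the TLS base folder, and the helper name endpointTLSDir and the temporary folder standing in for /data are hypothetical. With an empty ID the target collapses to the base folder itself, and os.Mkdir on a folder that already exists fails with a "file exists" error.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// endpointTLSDir is a hypothetical helper (not Portainer's code): it appends
// the endpoint ID to the TLS base folder by plain concatenation.
func endpointTLSDir(base, endpointID string) string {
	return base + "/" + endpointID
}

func main() {
	// A temporary folder stands in for /data; "tls" is created up front,
	// as Portainer is described as doing at startup.
	data, err := os.MkdirTemp("", "data")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(data)

	base := filepath.Join(data, "tls")
	if err := os.Mkdir(base, 0o700); err != nil {
		panic(err)
	}

	// With a valid ID the per-endpoint folder is created normally.
	fmt.Println("with ID:", os.Mkdir(endpointTLSDir(base, "2"), 0o700))

	// With an empty ID the target is the base folder plus a trailing slash.
	// That folder already exists, so Mkdir returns an "exists" error.
	err = os.Mkdir(endpointTLSDir(base, ""), 0o700)
	fmt.Println("empty ID:", os.IsExist(err), err)
}
```

If the ID were lost between the browser and the API, this would be consistent with the trailing slash in the logged path (/data/tls/), though the sketch alone does not prove that this is what happens.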

I'm wondering if this could be related to the Windows platform as I've already seen filesystem issues with the Go file library before. Would it be possible for you to try it on another platform to confirm that?

I was able to reproduce the last issue (C) and will push a fix for that soon. You can track the evolution here: #781

@msgeissler
Author

I will install it on my VPS and see if the error occurs there as well. Once I know more I will let you know.

@msgeissler
Author

> I'm wondering if this could be related to the Windows platform as I've already seen filesystem issues with the Go file library before. Would it be possible for you to try it on another platform to confirm that?

Looks like you are right: on my Linux VPS, Version B is not reproducible. This might really be a Windows-exclusive bug.

@deviantony
Member

Interesting, what about Version A?

@msgeissler
Author

Sorry, forgot to mention that because of #782.
The init-wizard would add all the files correctly and after manually reloading portainer it does work (although the UI will not select the endpoint by default this way if I remember correctly).

@zcalusic

I'm sorry, but this doesn't look Windows-specific at all. I have the same problems on Linux when trying to add remote endpoints protected with TLS.

First this:

2017/05/21 14:22:25 http error: mkdir /data/tls/2: file exists (code=500)
2017/05/21 14:22:25 http error: mkdir /data/tls/2: file exists (code=500)

Then I tried to fool it by deleting the tls folder, only to get this:

2017/05/21 14:29:15 http error: mkdir /data/tls/4: no such file or directory (code=500)
2017/05/21 14:29:15 http error: mkdir /data/tls/4: no such file or directory (code=500)
2017/05/21 14:29:15 http error: mkdir /data/tls/4: no such file or directory (code=500)

Then I deleted everything in the data volume and restarted the container to try from scratch. After entering all the info again, it now blocks on clicking the "Connect" button. I clicked a few more times, gave up, clicked on the local endpoint again, and suddenly I'm in and have 4 endpoints (all those clicks that did nothing...), and I'm actually connected to the remote endpoint?!

So, with a lot of clicking, restarts, deletia - it worked! But, it's a mess...

@deviantony
Member

@zcalusic this bug has been fixed in #782

It will be released soon :)

@zcalusic

Oh, great news! Although I somehow managed to add 4 more endpoints in the meantime, no problem at all. :)

Thank you for this very neat utility @deviantony!

@deviantony
Member

Quick summary of this issue. On the Windows platform, the problem is the following:

Version B using endpoint-settings:

  1. Start the container and mount the data-volume and local socket: docker run -d -p 9000:9000 -v /var/run/docker.sock:/var/run/docker.sock -v D:\Docker\portainer-config:/data --name portainer portainer/portainer
  2. Select the local docker in the wizard
  3. Go to the endpoints-option in the portainer settings
  4. Add a new endpoint and try to add TLS
  5. Upload fails with "Unable to upload TLS certs" but will create new folders in /data/tls each time the add-button is clicked (http error: mkdir /data/tls/: file exists (code=500)), containing only ca.pem or ca.pem + cert.pem (no key.pem)
  6. Hitting the button a fourth time will somehow succeed and create a working endpoint (the folder does contain all files)
  7. But the endpoint overview will now show 4 endpoints, as the previous endpoints were still created (5 with the local endpoint)

@danieleagle

I'm having this exact same issue using portainer/portainer:1.14.3 running on Ubuntu 16.04 LTS. I'm getting the error below:

http error: mkdir /data/tls/14: file exists (code=500)

When I look in /data/tls/14, I see only two files created: ca.pem and key.pem. If I manually add the third missing file and then reload the GUI, it works.

@deviantony
Member

@GetchaDEAGLE could you paste the full container logs here?

Also, could you give us some reproduction steps?

@danieleagle

@deviantony, I will paste the logs when I get access to the server again and provide further details. Stay tuned.

@deviantony
Member

ping @GetchaDEAGLE

@danieleagle

@deviantony, I'm sorry to say that the server was redeployed and we lost the logs but the persistent data from the volume is still intact.

As mentioned in my previous comment, from within the UI, adding an endpoint and uploading the cert files is what causes the issue. The data is being persisted to a network share via NFS v4. I don't believe anything special has been done to the permissions on the files/folders for the volume.

After the error occurs, adding the files manually to the volume solves the problem.

If I gather anything else that could be useful, I'll let you know. Also, sorry for the delay: I just had a baby and was out for a while.

@deviantony
Member

Hey @GetchaDEAGLE, so you are still having this issue? What about using the latest version of Portainer?

PS: Congratulations! :-)

@danieleagle

I'm not having the issue anymore after performing the workaround. I'll report back if the problem happens again. Thanks for your help.

@r3pek

r3pek commented Mar 12, 2018

Having the exact same issue as @GetchaDEAGLE

@deviantony
Member

@r3pek could you give us some details about your env? Platform & docker version at least.

@r3pek

r3pek commented Mar 12, 2018

@deviantony sure.
CentOS 7.4

# docker --version
Docker version 18.02.0-ce, build fc4de44

Running in swarm mode with 2 nodes.

Portainer installed as a service with a custom made docker-compose.yml and /data of the container image pointing to an NFS share.

@deviantony
Member

Looks like both of you, @r3pek and @GetchaDEAGLE, are using an NFS share to store the /data folder. I wonder if it's related...

Also @r3pek any chance that you can provide a screenshot of the console and network tabs of your browser developer tool after the error is raised?

@r3pek

r3pek commented Mar 12, 2018

There you have it:

[Two screenshots attached]

@deviantony
Member

@r3pek thanks, will investigate and see if I can reproduce.

@r3pek

r3pek commented Mar 12, 2018

@deviantony btw, this is what's in the logs of the container:
2018/03/12 23:07:40 http error: mkdir /data/tls/18: file exists (code=500)
2018/03/12 23:07:40 http error: mkdir /data/tls/18: file exists (code=500)

Seems like you're trying to create the directory 18/ again on file upload, erroring out if mkdir fails.
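
The behaviour guessed at above can be sketched in Go as follows. This is an illustration only, not Portainer's implementation: storeTLSFileNaive and uploadAll are hypothetical names, and a temporary folder stands in for /data/tls. The sketch assumes each file upload creates the endpoint folder with os.Mkdir before writing, so only the first upload gets through.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// storeTLSFileNaive is a hypothetical stand-in for one upload step: it
// creates the endpoint folder with os.Mkdir on every call, then writes the
// file. It is not Portainer's actual implementation.
func storeTLSFileNaive(dir, name string, content []byte) error {
	if err := os.Mkdir(dir, 0o700); err != nil {
		return err // every upload after the first one stops here
	}
	return os.WriteFile(filepath.Join(dir, name), content, 0o600)
}

// uploadAll stores the three TLS files into a fresh temporary folder and
// returns the names of the files that were actually written.
func uploadAll() ([]string, error) {
	root, err := os.MkdirTemp("", "tls")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	dir := filepath.Join(root, "18")

	var stored []string
	for _, name := range []string{"ca.pem", "cert.pem", "key.pem"} {
		err := storeTLSFileNaive(dir, name, []byte("placeholder"))
		if err == nil {
			stored = append(stored, name)
		} else if !os.IsExist(err) {
			return nil, err // anything other than "file exists" is unexpected
		}
	}
	return stored, nil
}

func main() {
	stored, err := uploadAll()
	fmt.Println(stored, err) // [ca.pem] <nil>
}
```

The sketch uploads sequentially. If the real uploads run concurrently, which file lands first could differ, which might explain folders holding different subsets of the three files.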

@deviantony
Member

deviantony commented Mar 12, 2018

Thanks for that last comment, it's probably what I thought and related to NFS.

The HTTP handler calls the FileService.StoreTLSFile function here: https://github.com/portainer/portainer/blob/develop/api/http/handler/upload.go#L64

Which will try to create the 18/ folder only if it does not exist: https://github.com/portainer/portainer/blob/develop/api/filesystem/filesystem.go#L120

It seems that the directory existence check is failing when the underlying filesystem is NFS: https://github.com/portainer/portainer/blob/develop/api/filesystem/filesystem.go#L211-L222

Might be an issue with Go. I'll check whether there is an existing open issue related to os.Stat and NFS.

EDIT: I'm thinking that switching from os.Mkdir to os.MkdirAll might solve the problem.
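
For reference, a small Go sketch of why os.MkdirAll is the more forgiving call. The helper names ensureDir and demo are hypothetical, and a temporary folder stands in for /data. os.MkdirAll creates any missing parent folders and returns nil when the target already exists as a directory, so the result no longer depends on a separate existence check being accurate.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureDir is a hypothetical helper: os.MkdirAll creates missing parents
// and returns nil if dir already exists as a directory.
func ensureDir(dir string) error {
	return os.MkdirAll(dir, 0o700)
}

// demo calls ensureDir twice on the same nested path under a temporary
// folder and returns both results.
func demo() (first, second error) {
	root, err := os.MkdirTemp("", "data")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(root)

	dir := filepath.Join(root, "tls", "18") // the parent "tls" does not exist yet
	first = ensureDir(dir)                  // creates tls/ and tls/18
	second = ensureDir(dir)                 // folder already exists: still nil
	return first, second
}

func main() {
	first, second := demo()
	fmt.Println(first, second) // <nil> <nil>
}
```

The same call would also cover the "no such file or directory" case reported earlier in this thread, where the parent tls folder had been deleted.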

@ncresswell
Member

ncresswell commented Mar 12, 2018 via email

@r3pek

r3pek commented Mar 12, 2018

> Remember, if you are using NFS with Docker volumes, you MUST have "root squash" enabled on your NFS backend.

"MUST" ? For services running as root on the container?

@ncresswell
Member

ncresswell commented Mar 13, 2018 via email

@r3pek

r3pek commented Mar 13, 2018

> Depends on your actual config tho (is the docker daemon running as root user?)

Yeah, dockerd is running as root.

@ncresswell
Member

ncresswell commented Mar 13, 2018 via email

@deviantony
Member

@r3pek could you try with the image portainer/portainer:pr1719 and tell me if it solves your problem?

@r3pek

r3pek commented Mar 13, 2018

@deviantony confirmed working

@deviantony
Member

Thanks for the feedback @r3pek, will merge.

@deviantony deviantony added this to the 1.16.x milestone Mar 13, 2018
@deviantony deviantony changed the title from "Cannot add tls endpoint using GUI" to "Issue when uploading endpoint TLS files" Mar 13, 2018
@deviantony deviantony modified the milestones: 1.16.x, 1.16.5 Apr 1, 2018
@ianseyer

ianseyer commented May 9, 2018

@deviantony
I am having the same issue, even when using the above pull request image. Using portainer version 1.16.5.

Exact same scenario: create endpoint, upload tls, the creation of the endpoint hangs with:

2018/05/09 14:53:35 http: panic serving 70.115.151.248:51642: interface conversion: interface {} is map[string]interface {}, not []interface {}
goroutine 7088 [running]:
net/http.(*conn).serve.func1(0xc42050dcc0)
	/usr/local/go/src/net/http/server.go:1721 +0xd0
panic(0xa463e0, 0xc420113840)
	/usr/local/go/src/runtime/panic.go:489 +0x2cf
github.com/portainer/portainer/http/proxy.getResponseAsJSONArray(0xc4202f1a70, 0x0, 0x0, 0x0, 0x8000102, 0x0)
	/go/src/github.com/portainer/portainer/http/proxy/response.go:42 +0xa4
github.com/portainer/portainer/http/proxy.serviceListOperation(0xc420074c00, 0xc4202f1a70, 0xc42048c460, 0x0, 0x0)
	/go/src/github.com/portainer/portainer/http/proxy/services.go:22 +0x32
github.com/portainer/portainer/http/proxy.(*proxyTransport).executeRequestAndRewriteResponse(0xc420048840, 0xc420074c00, 0xb24580, 0xc42048c460, 0x0, 0x8c0000c4204e7600, 0xc4202e2400)
	/go/src/github.com/portainer/portainer/http/proxy/transport.go:391 +0x8d
github.com/portainer/portainer/http/proxy.(*proxyTransport).rewriteOperation(0xc420048840, 0xc420074c00, 0xb24580, 0x9, 0x1, 0xa0dda0)
	/go/src/github.com/portainer/portainer/http/proxy/transport.go:373 +0xca
github.com/portainer/portainer/http/proxy.(*proxyTransport).proxyServiceRequest(0xc420048840, 0xc420074c00, 0xb007d7, 0x9, 0xc42015ea01)
	/go/src/github.com/portainer/portainer/http/proxy/transport.go:143 +0x33c
github.com/portainer/portainer/http/proxy.(*proxyTransport).proxyDockerRequest(0xc420048840, 0xc420074c00, 0xf, 0xc4204be300, 0xe)
	/go/src/github.com/portainer/portainer/http/proxy/transport.go:66 +0x52c
github.com/portainer/portainer/http/proxy.(*proxyTransport).RoundTrip(0xc420048840, 0xc420074c00, 0xf, 0xc4204be300, 0xe)
	/go/src/github.com/portainer/portainer/http/proxy/transport.go:50 +0x35
net/http/httputil.(*ReverseProxy).ServeHTTP(0xc420112e40, 0xe13760, 0xc420582fc0, 0xc420074a00)
	/usr/local/go/src/net/http/httputil/reverseproxy.go:205 +0x3ce
net/http.StripPrefix.func1(0xe13760, 0xc420582fc0, 0xc420074a00)
	/usr/local/go/src/net/http/server.go:1977 +0xcf
net/http.HandlerFunc.ServeHTTP(0xc4200fa540, 0xe13760, 0xc420582fc0, 0xc420074a00)
	/usr/local/go/src/net/http/server.go:1942 +0x44
github.com/portainer/portainer/http/handler.(*DockerHandler).proxyRequestsToDockerAPI(0xc420276a80, 0xe13760, 0xc420582fc0, 0xc420074a00)
	/go/src/github.com/portainer/portainer/http/handler/docker.go:82 +0x294
github.com/portainer/portainer/http/handler.(*DockerHandler).(github.com/portainer/portainer/http/handler.proxyRequestsToDockerAPI)-fm(0xe13760, 0xc420582fc0, 0xc420074a00)
	/go/src/github.com/portainer/portainer/http/handler/docker.go:34 +0x48
net/http.HandlerFunc.ServeHTTP(0xc420271dc0, 0xe13760, 0xc420582fc0, 0xc420074a00)
	/usr/local/go/src/net/http/server.go:1942 +0x44
github.com/portainer/portainer/http/security.(*RequestBouncer).mwCheckAuthentication.func1(0xe13760, 0xc420582fc0, 0xc420074900)
	/go/src/github.com/portainer/portainer/http/security/bouncer.go:157 +0x105
net/http.HandlerFunc.ServeHTTP(0xc42028c000, 0xe13760, 0xc420582fc0, 0xc420074900)
	/usr/local/go/src/net/http/server.go:1942 +0x44
github.com/portainer/portainer/http/security.mwSecureHeaders.func1(0xe13760, 0xc420582fc0, 0xc420074900)
	/go/src/github.com/portainer/portainer/http/security/bouncer.go:78 +0xfd
net/http.HandlerFunc.ServeHTTP(0xc42028c020, 0xe13760, 0xc420582fc0, 0xc420074900)
	/usr/local/go/src/net/http/server.go:1942 +0x44
github.com/gorilla/mux.(*Router).ServeHTTP(0xc42027bc00, 0xe13760, 0xc420582fc0, 0xc420074900)
	/go/src/github.com/gorilla/mux/mux.go:159 +0x101
net/http.StripPrefix.func1(0xe13760, 0xc420582fc0, 0xc420267f00)
	/usr/local/go/src/net/http/server.go:1977 +0xcf
net/http.HandlerFunc.ServeHTTP(0xc4200111a0, 0xe13760, 0xc420582fc0, 0xc420267f00)
	/usr/local/go/src/net/http/server.go:1942 +0x44
github.com/portainer/portainer/http/handler.(*Handler).ServeHTTP(0xc42028fcb0, 0xe13760, 0xc420582fc0, 0xc420267f00)
	/go/src/github.com/portainer/portainer/http/handler/handler.go:56 +0x92b
net/http.serverHandler.ServeHTTP(0xc42008d290, 0xe13760, 0xc420582fc0, 0xc420267f00)
	/usr/local/go/src/net/http/server.go:2568 +0x92
net/http.(*conn).serve(0xc42050dcc0, 0xe14220, 0xc4202b3c80)
	/usr/local/go/src/net/http/server.go:1825 +0x612
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:2668 +0x2ce

But then I refresh the page and it is there; however, trying to connect to the endpoint gives me the tls/ no such file error.
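
A note on the panic message in the trace above: "interface conversion: interface {} is map[string]interface {}, not []interface {}" is what the Go runtime reports when a single-value type assertion to a slice meets a value that is actually a map, for example when a decoded JSON response is an object rather than the expected array. The sketch below is a generic illustration of the checked ("comma-ok") form, which returns an error instead of panicking. decodeJSONArray is a hypothetical helper, not Portainer's getResponseAsJSONArray, and the JSON payloads are made up.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// decodeJSONArray is a hypothetical helper. It decodes a body into a generic
// value and uses the comma-ok type assertion, so a non-array payload becomes
// an error rather than a runtime panic.
func decodeJSONArray(body []byte) ([]interface{}, error) {
	var v interface{}
	if err := json.Unmarshal(body, &v); err != nil {
		return nil, err
	}
	arr, ok := v.([]interface{})
	if !ok {
		return nil, errors.New("response is not a JSON array")
	}
	return arr, nil
}

func main() {
	// A list response decodes normally.
	list, err := decodeJSONArray([]byte(`[{"ID":"svc1"}]`))
	fmt.Println(len(list), err)

	// An object (made-up payload) yields an error instead of a panic.
	_, err = decodeJSONArray([]byte(`{"message":"example error"}`))
	fmt.Println(err)
}
```

Whether the endpoint in this report actually returned an object at that point cannot be told from the trace alone.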

@deviantony
Member

@ianseyer I'm unable to reproduce this.

Care to share more details so that we can try to reproduce?

7 participants