Unable to use traefik as reverse proxy with an Agent setup #1897

Closed
schemen opened this issue May 14, 2018 · 14 comments

Comments

@schemen

schemen commented May 14, 2018

Bug description

I cannot access Portainer through a Traefik-generated reverse proxy (say, mapped to "console.example.com").

I receive a 504 error; the request seems to time out.
This issue does not occur with the agentless setup.

Expected behavior
I can normally access Portainer through a reverse proxy and HTTPS.

Steps to reproduce the issue:

See technical details for a docker-compose.yml

Technical details:

  • Portainer version: 1.17.0
  • Docker version: 18.03.1-ce in a Docker Swarm setup
  • Platform: Linux
  • Command used to start Portainer: Compose
  • Browser: Chrome

Additional context

version: "3"

services:
  app:
    image: portainer/portainer
    volumes:
      - /var/data/portainer:/data
    networks:
      - traefik_public
      - default
    deploy:
      labels:
        - traefik.frontend.rule=Host:console.example.com
        - traefik.port=9000
      placement:
        constraints: [node.role == manager]
    command: -H tcp://tasks.agent:9001 --tlsskipverify

  agent:
    image: portainer/agent
    environment:
      AGENT_CLUSTER_ADDR: tasks.agent
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      mode: global

networks:
  traefik_public:
    external: true
  default:
    driver: overlay

@deviantony
Member

Hi @schemen

Could you copy/paste the Portainer logs? Can you access the authentication view of Portainer or not? Is it working in your environment without Traefik?

@schemen
Author

schemen commented May 14, 2018

Hi!

Thanks for the quick reply.

When I publish the port with 9000:9000, I can log into Portainer and create an admin user through the ingress IP or any IP belonging to the cluster.

The only log entry I receive on the portainer frontend is the following:
portainer_app.1.meq603hr06nc@app3 | 2018/05/14 14:03:39 Starting Portainer 1.17.0 on :9000

When accessed through the ingress network (i.e. "host1.example.com:9000") I can see everything and the agent works beautifully.
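
For reference, the port mapping described above could look like the fragment below. This is only a sketch, assuming the rest of the stack file is unchanged from the original report; the `ports` entry is the only addition. In swarm mode a port published this way goes through the routing mesh, which is why the UI answers on any node's address.

```yaml
services:
  app:
    image: portainer/portainer
    # Publish Portainer's UI port through the swarm routing mesh (ingress).
    # This bypasses Traefik entirely, so it is a connectivity check,
    # not a fix for the reverse-proxy problem.
    ports:
      - "9000:9000"
```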

@schemen
Author

schemen commented May 14, 2018

Maybe it's the involvement of multiple networks? Without the agents, in its standalone deployment, version 1.17.0 works fine.

The frontend is attached to two networks, the one for public access via portainer and the one for the agents.

@deviantony
Member

Does it work if you remove it from the agent network? If so, you could expose the port 9001 on the agent service and use any node IP as endpoint URL.

@schemen
Author

schemen commented May 14, 2018

This is currently not working. What I did:

version: "3"

services:
  app:
    image: portainer/portainer
    volumes:
      - /var/data/portainer:/data
    networks:
      - traefik_public
    deploy:
      labels:
        - traefik.frontend.rule=Host:swarm.example.com
        - traefik.port=9000
      placement:
        constraints: [node.role == manager]
    command: -H tcp://cluster.example.com:9001 --tlsskipverify
    #command: -H unix:///var/run/docker.sock

  agent:
    image: portainer/agent
    ports:
      - "9001:9001"
    environment:
      AGENT_CLUSTER_ADDR: cluster.example.com
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      mode: global

networks:
  traefik_public:
    external: true

This is one of the errors I receive:
portainer_app.1.1ohn2wq8pjiw@app1 | 2018/05/14 17:00:09 http: proxy error: Invalid Docker response

After throwing out everything, it seems that I can still click on the Swarm field and see active tasks and the cluster visualisation, but everything else throws that error and times out.

I've tried without TLS and with TLS but no verification.

@deviantony
Member

@schemen I believe that you still need to deploy the agent inside an overlay network and use tasks.agent as the value of the AGENT_CLUSTER_ADDR env var.

@schemen
Author

schemen commented May 14, 2018

Lemme try it out :)

@schemen
Author

schemen commented May 14, 2018

@deviantony
Mh, sadly this is a negative.

When I tried that, I received the following error logs:

portainer_agent.0.9mhhupwt2tl9@app2    | 2018/05/14 18:18:44 [ERR] memberlist: Failed to send ping: write udp [::]:7946->10.255.0.68:7946: sendto: operation not permitted
portainer_agent.0.j3nop71hf2cx@app3    | 2018/05/14 18:18:44 [ERR] memberlist: Failed to send ping: write udp [::]:7946->10.255.0.67:7946: sendto: operation not permitted
portainer_agent.0.9mhhupwt2tl9@app2    | 2018/05/14 18:18:44 http error: Get https://10.255.0.67:9001/containers/json?all=1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers) (code=500)
portainer_agent.0.8hvceadmf27i@app1    | 2018/05/14 18:18:44 http error: Get https://10.255.0.66:9001/images/json?all=0: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers) (code=500)
portainer_agent.0.9mhhupwt2tl9@app2    | 2018/05/14 18:18:44 http error: Get https://10.255.0.67:9001/volumes: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers) (code=500)
portainer_agent.0.8hvceadmf27i@app1    | 2018/05/14 18:18:44 http error: Get https://10.255.0.68:9001/networks: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers) (code=500)
portainer_agent.0.8hvceadmf27i@app1    | 2018/05/14 18:18:45 [ERR] memberlist: Failed to send ping: write udp [::]:7946->10.255.0.68:7946: sendto: operation not permitted

And a lot more like that as time went on.
Is there any verbose/debug option on the frontend side?

@deviantony
Member

deviantony commented May 14, 2018

Did you try something like this?

version: "3"

services:
  app:
    image: portainer/portainer
    volumes:
      - /var/data/portainer:/data
    networks:
      - traefik_public
    deploy:
      labels:
        - traefik.frontend.rule=Host:swarm.example.com
        - traefik.port=9000
      placement:
        constraints: [node.role == manager]
    command: -H tcp://NODE1_IP:9001 --tlsskipverify


  agent:
    image: portainer/agent
    ports:
      - "9001:9001"
    environment:
      AGENT_CLUSTER_ADDR: tasks.agent
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - agent_network
    deploy:
      mode: global

networks:
  traefik_public:
    external: true
  agent_network:
    driver: overlay

@schemen
Author

schemen commented May 14, 2018

Yeah, exactly like that. Although the above solution would definitely not be ideal, as it would expose the agents to any external access.

I have found a workaround just now:
If I add the agents to the traefik_public network, I can add the agents as an endpoint with the address tasks.agent:9001 again. This probably works because they are all in the same network.
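
A sketch of that workaround, assuming the `volumes`, `deploy` labels and top-level `networks` section stay as in the stack files earlier in this thread: the only change is the agent's `networks` entry. Note that this attaches every agent to the same network Traefik uses.

```yaml
services:
  app:
    image: portainer/portainer
    networks:
      - traefik_public
    # tasks.agent resolves because both services share traefik_public
    command: -H tcp://tasks.agent:9001 --tlsskipverify

  agent:
    image: portainer/agent
    environment:
      AGENT_CLUSTER_ADDR: tasks.agent
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      # Workaround: join the agents to the network Portainer is already on
      - traefik_public
    deploy:
      mode: global

networks:
  traefik_public:
    external: true
```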

I am really not sure what this issue might be here :S I hope I didn't bring a Traefik issue to you guys!

@deviantony
Member

With the setup above, you must use the public IP of one of the nodes as the endpoint URL. tasks.agent:9001 won't work because the Portainer instance is not located inside the same overlay network (that's why it works when you add the agents to traefik_public).

@schemen
Author

schemen commented May 14, 2018

Sigh... It was my own fault, I am very sorry to bring this up.
If a service is attached to multiple networks, you should tell Traefik which network to use (via the traefik.docker.network label) so that it doesn't fail at routing.
https://docs.traefik.io/configuration/backends/docker/#using-docker-with-swarm-mode

So, a complete Compose file for Traefik as an example would look like this:

version: "3"

services:
  app:
    image: portainer/portainer
    volumes:
      - /var/data/portainer:/data
    networks:
      - traefik_public
      - default
    deploy:
      labels:
        - traefik.docker.network=traefik_public
        - traefik.frontend.rule=Host:console.example.com
        - traefik.port=9000
      placement:
        constraints: [node.role == manager]
    command: -H tcp://tasks.agent:9001 --tlsskipverify

  agent:
    image: portainer/agent
    networks:
      - default
    environment:
      AGENT_CLUSTER_ADDR: tasks.agent
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      mode: global

networks:
  traefik_public:
    external: true
  default:
    driver: overlay

I think this issue can be closed. This was an issue in my Traefik configuration.

@deviantony
Member

Glad you solved it! Feel free to write a small post about it; it might help other users ;-)

@pascalandy

Here is my stack on docker-stack-this

xAt0mZ pushed a commit that referenced this issue Aug 25, 2022
)

* fix(migration): close the database before running backups

On certain filesystems, particularly NTFS when a network mounted windows
file server is used to store portainer's database, you are unable to
copy the database while it is open. To fix this we simply close the
database and then re-open it after a backup.

* handle close and open errors

* dont return error on nil

3 participants