Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plugins panic #8937

Closed
2 tasks done
jliebert opened this issue Apr 13, 2022 · 19 comments · Fixed by traefik/yaegi#1494
Closed
2 tasks done

Plugins panic #8937

jliebert opened this issue Apr 13, 2022 · 19 comments · Fixed by traefik/yaegi#1494
Assignees
Labels
area/plugins kind/bug/possible a possible bug that needs analysis before it is confirmed or fixed. status/5-frozen-due-to-age
Projects

Comments

@jliebert
Copy link

jliebert commented Apr 13, 2022

Welcome!

  • Yes, I've searched similar issues on GitHub and didn't find any.
  • Yes, I've searched similar issues on the Traefik community forum and didn't find any.

What did you do?

I use the plugins traefik/plugin-blockpath v0.2.1, the plugins is working, path are correctly blocked as I want.

But I get many "panic" error in the log files of Traefik.

What did you see instead?

Panic messages from Traefik:

Apr 13 14:14:44 swarm_manager_3 traefik-ha_V2--traefik[1140]: time="2022-04-13T12:14:44Z" level=error msg="plugins-storage/sources/gop-1248628458/src/github.com/traefik/plugin-blockpath/blockpath.go:48:17: panic" module=github.com/traefik/plugin-blockpath plugin=plugin-blockpath
Apr 13 14:14:44 swarm_manager_3 traefik-ha_V2--traefik[1140]: time="2022-04-13T12:14:44Z" level=error msg="plugins-storage/sources/gop-1248628458/src/github.com/traefik/plugin-blockpath/blockpath.go:48:17: panic" module=github.com/traefik/plugin-blockpath plugin=plugin-blockpath
Apr 13 14:14:50 swarm_manager_3 traefik-ha_V2--traefik[1140]: time="2022-04-13T12:14:50Z" level=error msg="plugins-storage/sources/gop-1248628458/src/github.com/traefik/plugin-blockpath/blockpath.go:48:17: panic" module=github.com/traefik/plugin-blockpath plugin=plugin-blockpath
Apr 13 14:14:50 swarm_manager_3 traefik-ha_V2--traefik[1140]: time="2022-04-13T12:14:50Z" level=error msg="plugins-storage/sources/gop-1248628458/src/github.com/traefik/plugin-blockpath/blockpath.go:48:17: panic" module=github.com/traefik/plugin-blockpath plugin=plugin-blockpath

What version of Traefik are you using?

Version: 2.6.1
Codename: rocamadour
Go version: go1.17.7
Built: 2022-02-14T16:50:25Z
OS/Arch: linux/amd64

What is your environment & configuration?

pilot:
  token: "secret"

entryPoints:
  http:
    address: :80
  https:
    address: :443
  
providers:
  providersThrottleDuration: 2s
  docker:
    watch: true
    network: webgateway
    exposedByDefault: false
    swarmMode: true
    swarmModeRefreshSeconds: 15s
  
api:
  insecure: true
  dashboard: true

log:
  level: INFO

accessLog: {}

certificatesResolvers:

experimental:
  plugins:
    blockpath:
      modulename: "github.com/traefik/plugin-blockpath"
      version: "v0.2.1"

dynamic_conf.yaml is:

http:
  middlewares:
    # Common bad path rules
    block-badpath:
      plugin:
        blockpath:
          regex:
            - '^/autodiscover/autodiscover.xml$'
            - '^/AutoDiscover/autodiscover.xml$'
            - '^/(w00tw00t|phppath/|pma/|phpMyAdmin|phpmyadmin)'
            - '^/%70%68%70%70%61%74%68/'
            - '/\.env(\?.*|)$'
            - '\.(bak|backup|conf|log|properties|sql|tar)(\?.*|)$'
            - '^/Private/'
            - 'typo3conf/ext/(.*)/Configuration/(TypoScript|FlexForms|TCA|ExtensionBuilder)/'
            - '\.inc\.php(\?.*|)$'

    # Rule only for Typo3 CMS
    block-badpath-typo3:
      plugin:
        blockpath:
          regex:
            - 'wp-(admin|login|content)'

    # Rule only for Drupal CMS
    block-badpath-drupal:
      plugin:
        blockpath:
          regex:
            - 'wp-(admin|login|content)'

tls:
  options:
    default: 
      minVersion: VersionTLS12
      sniStrict: true
      cipherSuites: 
        - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
        - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
        - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
        - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
        - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
        - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
        - TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
    
  stores:
    default: {}

If applicable, please paste the log output in DEBUG level

Apr 13 14:14:44 swarm_manager_3 traefik-ha_V2--traefik[1140]: time="2022-04-13T12:14:44Z" level=error msg="plugins-storage/sources/gop-1248628458/src/github.com/traefik/plugin-blockpath/blockpath.go:48:17: panic" module=github.com/traefik/plugin-blockpath plugin=plugin-blockpath
Apr 13 14:14:44 swarm_manager_3 traefik-ha_V2--traefik[1140]: time="2022-04-13T12:14:44Z" level=error msg="plugins-storage/sources/gop-1248628458/src/github.com/traefik/plugin-blockpath/blockpath.go:48:17: panic" module=github.com/traefik/plugin-blockpath plugin=plugin-blockpath
Apr 13 14:14:50 swarm_manager_3 traefik-ha_V2--traefik[1140]: time="2022-04-13T12:14:50Z" level=error msg="plugins-storage/sources/gop-1248628458/src/github.com/traefik/plugin-blockpath/blockpath.go:48:17: panic" module=github.com/traefik/plugin-blockpath plugin=plugin-blockpath
Apr 13 14:14:50 swarm_manager_3 traefik-ha_V2--traefik[1140]: time="2022-04-13T12:14:50Z" level=error msg="plugins-storage/sources/gop-1248628458/src/github.com/traefik/plugin-blockpath/blockpath.go:48:17: panic" module=github.com/traefik/plugin-blockpath plugin=plugin-blockpath
@rtribotte
Copy link
Member

Hello @jliebert,

Thanks for your interest in Traefik!

Can you share the configuration of the router for which the problem is happening?

@jliebert
Copy link
Author

jliebert commented Apr 14, 2022

Hello @rtribotte,

I have 144 routers, 14 use blockpath plugins like this example:

    labels:
        traefik.enable: 'true'
        traefik.http.middlewares.dummy-site--web--headers.headers.customresponseheaders.server: server
        traefik.http.middlewares.dummy-site--web--redirect-host.redirectregex.permanent: 'true'
        traefik.http.middlewares.dummy-site--web--redirect-host.redirectregex.regex: '^https://([^/]+)'
        traefik.http.middlewares.dummy-site--web--redirect-host.redirectregex.replacement: 'https://www.dummy.com'
        traefik.http.routers.dummy-site--web--secure.entrypoints: https
        traefik.http.routers.dummy-site--web--secure.middlewares: 'block-badpath@http,block-badpath-drupal@http,dummy-site--web--redirect-host,dummy-site--web--headers'
        traefik.http.routers.dummy-site--web--secure.priority: '100'
        traefik.http.routers.dummy-site--web--secure.rule: 'Host(`www.dummy.com`,`dummy.com`)'
        traefik.http.routers.dummy-site--web--secure.tls: 'true'
        traefik.http.services.dummy-site--web--secure.loadbalancer.server.port: '80'

I found no correlation between client http request and panic or http request not being logged at all.

@rtribotte
Copy link
Member

Hello @jliebert,

Thanks for your feedback,

We have everything to try to reproduce the problem, we'll let you know if we can reproduce it, or get back to you if we need more information.

@rtribotte rtribotte added kind/bug/possible a possible bug that needs analysis before it is confirmed or fixed. and removed status/0-needs-triage contributor/waiting-for-feedback labels Apr 15, 2022
@rtribotte rtribotte added this to issues in v2 via automation Apr 15, 2022
@rtribotte rtribotte self-assigned this Apr 15, 2022
@jliebert
Copy link
Author

jliebert commented May 19, 2022

I still have "panic" with version 2.6.6.

Our conclusion that was may be caused by a bug in Yaegi, reference from this discussion: traefik/plugin-blockpath#12 (comment)

@wzator
Copy link

wzator commented Nov 20, 2022

the same problems with version 2.9.4

{"level":"error","module":"github.com/traefik/plugin-blockpath","msg":"plugins-storage/sources/gop-3290823723/src/github.com/traefik/plugin-blockpath/blockpath.go:48:17: panic","plugin":"plugin-blockpath","time":"2022-11-20T17:13:37Z"}
{"level":"error","module":"github.com/traefik/plugin-blockpath","msg":"plugins-storage/sources/gop-3290823723/src/github.com/traefik/plugin-blockpath/blockpath.go:48:17: panic","plugin":"plugin-blockpath","time":"2022-11-20T17:13:38Z"}
{"level":"error","module":"github.com/traefik/plugin-blockpath","msg":"plugins-storage/sources/gop-3290823723/src/github.com/traefik/plugin-blockpath/blockpath.go:48:17: panic","plugin":"plugin-blockpath","time":"2022-11-20T17:13:50Z"}

@nmengin nmengin assigned mpl and unassigned rtribotte Nov 22, 2022
@mpl
Copy link
Collaborator

mpl commented Nov 22, 2022

FYI, there is nothing in the core of the plugin that is not supported by yaegi, i.e. there is no problem if you manually run with yaegi something like that:

package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"regexp"
)

type Middleware struct {
	regexps []*regexp.Regexp
	next    http.Handler
}

func (h *Middleware) ServeHTTP(rw http.ResponseWriter, r *http.Request) {
	currentPath := r.URL.EscapedPath()

	for _, re := range h.regexps {
		println(currentPath, "VS", re.String())
		if re.MatchString(currentPath) {
			rw.WriteHeader(http.StatusForbidden)
			return
		}
	}

	h.next.ServeHTTP(rw, r)
}

func main() {
	reg := regexp.MustCompile("/foo.*")
	dest, err := url.Parse("http://localhost:6060")
	if err != nil {
		log.Fatal(err)
	}
	next := httputil.NewSingleHostReverseProxy(dest)
	http.Handle("/", &Middleware{
		next:    next,
		regexps: []*regexp.Regexp{reg},
	})
	log.Fatal(http.ListenAndServe(":9090", nil))
}

Moreover, if there was something fundamental unsupported, then you'd get panics/errors for each request.
Can you think of anything specific about the requests that are related to the panic?
As an unrelated example, yaegi used to have a problem when the incoming request was about a websocket upgrade.

@jliebert
Copy link
Author

I get the panics randomly only on production and I don't have service with websocket.

@mpl
Copy link
Collaborator

mpl commented Nov 23, 2022

@jliebert just a detail, but why do I see the name "traefik-ha_V2" in your logs? where does it come from? is it the name of a docker image you built yourself?

@jliebert
Copy link
Author

"traefik-ha_V2" is the name of the stack on the Swarm cluster, I use the standard image for traefik service.
Logs of service are send by syslog with a label "traefik-ha_V2--traefik" it's why you see this in my log.

@Thor77
Copy link

Thor77 commented Dec 9, 2022

We're seeing the same problem, only in our production environment as well which makes this really hard to debug as there's quite a lot of traffic so we can't just enable debug mode and I don't see anything strange in the access log (if it would even be logged in that case). For us this occurs with a custom middleware in this line:

url, _ := url.Parse(req.RequestURI)

so I assume the request is null in that case as described above.
The only useful information I can add: we have this issue since at least Traefik 2.7.1 with this middleware.

@mathieuHa
Copy link

mathieuHa commented Dec 9, 2022

Hello,

I've observed the same issue with Traefik 2.9.6 and a basic plugin which only serves the requests.

func (bouncer *Bouncer) ServeHTTP(rw http.ResponseWriter, req *http.Request) {
	if !bouncer.enabled {
		bouncer.next.ServeHTTP(rw, req)
		return
	}
...
}

An issue #60 is open in a plugin repository, but I believe the error could come from Traefik.

I've found a way to reproduce the "panic" at each request.

The dozzle container triggers this error each time you change container logs.

Here is a following docker-compose that can be used:

version: "3.8"

services:
  traefik:
    image: "traefik:v2.9.6"
    container_name: "traefik"
    restart: unless-stopped
    command:
      - "--log.level=DEBUG"
      - "--accesslog"
      - "--accesslog.filepath=/var/log/traefik/access.log"
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"

      - "--experimental.plugins.bouncer.modulename=github.com/maxlerebourg/crowdsec-bouncer-traefik-plugin"
      - "--experimental.plugins.bouncer.version=v1.1.5"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "logs:/var/log/traefik"
    ports:
      - 8000:80
      - 8080:8080
    depends_on:
      - 'crowdsec'

  crowdsec:
    image: crowdsecurity/crowdsec:v1.4.1
    container_name: "crowdsec"
    restart: unless-stopped
    environment:
      COLLECTIONS: crowdsecurity/traefik
      CUSTOM_HOSTNAME: crowdsec
      # We need to register one api key per service we will use
      BOUNCER_KEY_TRAEFIK_1: FIXME-LAPI-KEY-1
    volumes:
      - ./acquis.yaml:/etc/crowdsec/acquis.yaml:ro
      - logs:/var/log/traefik:ro
      - crowdsec-db:/var/lib/crowdsec/data/
      - crowdsec-config:/etc/crowdsec/
    labels:
      - "traefik.enable=false"
  
  dozzle:
    container_name: dozzle
    image: amir20/dozzle:latest
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.router-doz.rule=Host(`localhost`)"
      - "traefik.http.routers.router-doz.entrypoints=web"
      - "traefik.http.routers.router-doz.middlewares=crowdsec-doz@docker" 
      - "traefik.http.services.service-doz.loadbalancer.server.port=8080"
      - "traefik.http.middlewares.crowdsec-doz.plugin.bouncer.enabled=false"
      - "traefik.http.middlewares.crowdsec-doz.plugin.bouncer.crowdseclapikey=FIXME-LAPI-KEY-1"
      - "traefik.http.middlewares.crowdsec-doz.plugin.bouncer.loglevel=DEBUG"

volumes:
  logs:
  crowdsec-db:
  crowdsec-config:

Navigate to http://localhost:8000/
Change container by switching from Crowdsec to Traefik to dozzle.

This will trigger the following logs:

time="2022-12-09T19:17:40Z" level=debug msg="Request has been aborted [10.0.XX.XX:57628 - /api/logs/stream?id=fd7b0e109a4c&lastEventId=]: net/http: abort Handler" middlewareName=traefik-internal-recovery middlewareType=Recovery
time="2022-12-09T19:17:51Z" level=error msg="plugins-storage/sources/gop-2381581510/src/github.com/maxlerebourg/crowdsec-bouncer-traefik-plugin/bouncer.go:176:6: panic" plugin=plugin-bouncer module=github.com/maxlerebourg/crowdsec-bouncer-traefik-plugin
time="2022-12-09T19:17:51Z" level=debug msg="Request has been aborted [10.0.XX.XX:57630 - /api/logs/stream?id=68b1c915e432&lastEventId=]: net/http: abort Handler" middlewareName=traefik-internal-recovery middlewareType=Recovery
time="2022-12-09T19:18:04Z" level=error msg="plugins-storage/sources/gop-2381581510/src/github.com/maxlerebourg/crowdsec-bouncer-traefik-plugin/bouncer.go:176:6: panic" module=github.com/maxlerebourg/crowdsec-bouncer-traefik-plugin plugin=plugin-bouncer
time="2022-12-09T19:18:04Z" level=debug msg="Request has been aborted [10.0.XX.XX:57631 - /api/logs/stream?id=fd7b0e109a4c&lastEventId=]: net/http: abort Handler" middlewareType=Recovery middlewareName=traefik-internal-recovery

It happens every time a new request is closed and open.
Requests appears to be websockets.

@mpl
Copy link
Collaborator

mpl commented Dec 10, 2022

Thank you, I'll see if I can reproduce as well with the new information.

@Thor77
Copy link

Thor77 commented Dec 12, 2022

I found an easier way to reproduce this using httpbin: https://github.com/Thor77/traefik-plugin-panic
It looks like there's an issue with streaming connections, code for the response generation in httpbin: https://github.com/postmanlabs/httpbin/blob/f8ec666b4d1b654e4ff6aedd356f510dcac09f83/httpbin/core.py

@mpl
Copy link
Collaborator

mpl commented Dec 13, 2022

@Thor77 Hey. I wanted to try your repro, but I haven't found github.com/thor77/forward . Is it a copy of another existing plugin that I can find elsewhere?

@Thor77
Copy link

Thor77 commented Dec 13, 2022

@mpl it's included in the repo - https://github.com/Thor77/traefik-plugin-panic/tree/master/forward and mounted to the correct location in the docker-compose file. If there's any difference between the local plugin mode and plugins fetched remotely I can also create a separate repo for it.

@mpl
Copy link
Collaborator

mpl commented Dec 16, 2022

Hi all,

TLDR: I believe there's nothing to worry about.

Longer story: I believe what is happening (and I am almost sure of it in the case of @mathieuHa , because he provided debug logs), is simply that you see the response copy error that is happening when a request is cancelled (for a variety of reasons). When such a thing happens, the stdlib reverse-proxy's handler sends a panic with the http.ErrAbortHandler sentinel error (which is the "net/http: abort Handler" message we see in @mathieuHa logs). This panic is then caught by the caller, and in that particular sentinel case, it is suppressed, which is why (among other things) you do not see any stack trace.

So I believe what you see is the intended behaviour, but if you want to make sure of it, you should enable the debug logs, and check whether you see the message with the abort Handler error, whenever you see the panic message from the plugin. If you see them together, then chances are there is nothing wrong going on.

Now, the last bit that I have to investigate is whether there's something wrong with yaegi regarding the fact that it does print the panic message. I believe that it should not have had to deal with the panic in the first place, as the panic should have been suppressed before the recover behaviour of yaegi takes control, but I need to make sure of that. In any case, aside from making the log disappear, I don't think anything would change in the behaviour, so I'm not really worried either way. I'll keep the issue open for now, until I can figure out that last part.

@jliebert
Copy link
Author

jliebert commented Dec 16, 2022

Hi, I don't panic about this panic, but how-to now when they are a real panic? "If all the boats send sos the rescue is lost".

@mpl
Copy link
Collaborator

mpl commented Dec 17, 2022

@jliebert that's what I already explained above. Enable debug level for logging, and if you see the "net/http: abort Handler" messages as well, then you're probably in the scenario I described.

@mathieuHa
Copy link

Hey @mpl,

Thanks for the detailed description of what is happening. Clear on my side.
I think it would be great if we did not see the panic message since it is not really an error :)

Mathieu

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/plugins kind/bug/possible a possible bug that needs analysis before it is confirmed or fixed. status/5-frozen-due-to-age
Projects
No open projects
v2
Done
Development

Successfully merging a pull request may close this issue.

8 participants