-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stabilize heredocs 🎉 #2589
Stabilize heredocs 🎉 #2589
Conversation
Thanks! Looks like this needs a rebase 😅 I guess we want to have #2442 merged as well before "stable", correct? (@tonistiigi ?) |
07a6011
to
8f38655
Compare
Woops, thanks - I did rebase, but only onto origin/master which wasn't pointing to the same as upstream/master 🤦 Definitely would argue for #2442 first (though ultimately up to you all) |
8f38655
to
e5c6c0e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other PR is merged now so this can be rebased.
You don't necessarily need to merge everything into the same files. Especially the tests where the files are already huge and hard to manage. Where additional complexity was needed before just to have this functionality under build tags merging is ok.
e5c6c0e
to
970f306
Compare
Have kept the separate test files, but merged in the parts that enabled the parsing of heredocs. |
CI error unrelated but somewhat interesting https://gist.github.com/tonistiigi/6edbcfbe9dd09488cf956b2df704a792 . @AkihiroSuda @crazy-max Possibly related to the containerd update? |
For the stack I'm not sure it's linked to containerd as it happens for the |
Hmm. It failed again the same way on this PR and I've never seen it anywhere else. I don't see how it could be related though. |
Looks like the panic might be somehow related to containerd/go-cni#76 . It is indeed a new codepath 39f6b4e#diff-2af9be1e48878c8f36db87230b5f840ffcc2f2f7cf5c753ecc081b8b4c9e2eb9 that we just vendored. Any ideas @mikebrow ? |
We encountered the same one while merging #2606 on master branch.
|
@dmcgowan @AkihiroSuda @mikebrow Ideas? Seems to be somehow related to containerd/go-cni#76 |
FWIW; I did a quick post on the CNCF slack to ask @mikebrow if he had any ideas https://cloud-native.slack.com/archives/CGEQHPYF4/p1644366840395359 From there;
I haven't looked closely myself, but if anyone on this PR has ideas, feel free to continue the chat there as well 😅 (I'm just trying to get some visibility on the issue) |
checking over other possible causes we updated to github.com/containernetworking/plugins v1.0.1 .. https://github.com/containerd/containerd/blob/main/go.mod#L23 (I should've mentioned the reason I noted this is your log showed you are still on plugins v0.9.1 https://github.com/moby/buildkit/runs/5044909831?check_suite_focus=true#step:6:260) after upgrading the cni api to v1.0.1.. (which is really still 1.0.0 as far as what the supported config version is) https://github.com/containerd/go-cni/blame/main/go.mod#L4 I also note the cni config you guys have here seems .. out of date and actually might not be supported anymore in v1.0.0 + I think that should be 1.0.0 now and read more like this: |
@jedevc we have pushed some changes, can you rebase? |
Signed-off-by: Justin Chadwell <me@jedevc.com>
Signed-off-by: Justin Chadwell <me@jedevc.com>
970f306
to
e1c334b
Compare
Before this patch, the package doesn't init CNIConfig.exec field. It is lazy init by CNIConfig.ensureExec() during AddNetworkList. After enhancement [1] , there are always more than two goroutines(one for loopback and other one for eth0) to call ensureExec(). Go runtime doesn't ensure that value-assignment is atomic, and then it is possible to use nil and panic, like [2]. > Programs that modify data being simultaneously accessed by multiple > goroutines must serialize such access. > > from https://golang.org/ref/mem Advice section. I think it should be fixed in github.com/containernetworking/cni library but we need to delivery containerd release 1.6 version in time. I would like to init CNIConfig like what the `ensureExec()` [3] does instead of lazy init. ------------ How to reproduce it: ```golang package main import "sync" type Exec interface { FindInPath(v int) } type rawExec struct { v int } func (r *rawExec) FindInPath(v int) { r.v += v } type cniConfig struct { exec Exec } func (c *cniConfig) addNetwork(v int) { if c.exec == nil { c.exec = &rawExec{} } c.exec.FindInPath(v) } func main() { c := &cniConfig{} for i := 0; i < 100000; i++ { c.exec = nil // reset var wg sync.WaitGroup for j := 0; j <= 10; j++ { wg.Add(1) go func(k int) { defer wg.Done() c.addNetwork(k) }(j) } wg.Wait() } } ``` And run it ```bash ➜ /tmp go version go version go1.17.3 linux/amd64 ➜ /tmp go run main.go panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x455a40] goroutine 264932 [running]: main.(*rawExec).FindInPath(0x0, 0x0) /tmp/main.go:14 main.(*cniConfig).addNetwork(...) /tmp/main.go:25 main.main.func1(0x0) /tmp/main.go:40 +0xb6 created by main.main /tmp/main.go:37 +0xb8 exit status 2 ➜ /tmp ``` Root Cause: The panic meesage is `main.(*rawExec).FindInPath(0x0, 0x0)` which means that the interface `exec` var has the underly type `rawExec` but its value is zero `FindInPath(0x0`, like what [2] mentioned. ```golang-assembly // go tool compile -S ./main.go // "".(*cniConfig).addNetwork STEXT size=148 args=0x10 locals=0x20 funcid=0x0 ... ... # (AX) is cniConfig address -> checkout c.exec == nil 0x0014 00020 (./main.go:22) CMPQ (AX), $0 0x0018 00024 (./main.go:22) JNE 95 # if c.exec == nil # # Since the go interface is struct iface # # https://github.com/golang/go/blob/release-branch.go1.17/src/runtime/runtime2.go#L202 # # type iface struct { # tab *itab # data unsafe.Pointer # } # # The value-assignment will be two instructions(one for tab and # the other one for data. 0x001a 00026 (./main.go:21) MOVQ AX, "".c+40(SP) 0x001f 00031 (./main.go:25) MOVQ BX, ""..autotmp_4+16(SP) # # AX will be rawExec type information # # type."".rawExec SRODATA size=120 # ... # rel 24+8 t=1 runtime.memequal64·f+0 # rel 32+8 t=1 runtime.gcbits.+0 # rel 40+4 t=5 type..namedata.*main.rawExec-+0 # rel 44+4 t=5 type.*"".rawExec+0 # rel 48+8 t=1 type..importpath."".+0 # rel 56+8 t=1 type."".rawExec+96 # rel 80+4 t=5 type..importpath."".+0 # rel 96+8 t=1 type..namedata.v-+0 # rel 104+8 t=1 type.int+0 # 0x0024 00036 (./main.go:23) LEAQ type."".rawExec(SB), AX 0x002b 00043 (./main.go:23) PCDATA $1, $0 0x002b 00043 (./main.go:23) CALL runtime.newobject(SB) # # CX will be itab and get FindInPath TEXT address. # # go.itab.*"".rawExec,"".Exec SRODATA dupok size=32 # ... # rel 0+8 t=1 type."".Exec+0 # rel 8+8 t=1 type.*"".rawExec+0 # rel 24+8 t=-32767 "".(*rawExec).FindInPath+0 # 0x0030 00048 (./main.go:23) LEAQ go.itab.*"".rawExec,"".Exec(SB), CX 0x0037 00055 (./main.go:23) MOVQ "".c+40(SP), DX 0x003c 00060 (./main.go:23) MOVQ CX, (DX) 0x003f 00063 (./main.go:23) PCDATA $0, $-2 0x003f 00063 (./main.go:23) CMPL runtime.writeBarrier(SB), $0 0x0046 00070 (./main.go:23) JNE 78 0x0048 00072 (./main.go:23) MOVQ AX, 8(DX) 0x004c 00076 (./main.go:23) JMP 87 0x004e 00078 (./main.go:23) LEAQ 8(DX), DI 0x0052 00082 (./main.go:23) CALL runtime.gcWriteBarrier(SB) 0x0057 00087 (./main.go:25) PCDATA $0, $-1 0x0057 00087 (./main.go:25) MOVQ DX, AX 0x005a 00090 (./main.go:25) MOVQ ""..autotmp_4+16(SP), BX # if c.exec != nil and try to call FindInPath # # (AX) is cniConfig # 0x005f 00095 (./main.go:25) MOVQ (AX), CX # # 8(AX) is rawExec address (receiver) # 0x0062 00098 (./main.go:25) MOVQ 8(AX), AX # # 24(CX) is the FindInPath TEXT address. # 0x0066 00102 (./main.go:25) MOVQ 24(CX), CX 0x006a 00106 (./main.go:25) PCDATA $1, $1 0x006a 00106 (./main.go:25) CALL CX 0x006c 00108 (./main.go:26) MOVQ 24(SP), BP 0x0071 00113 (./main.go:26) ADDQ $32, SP 0x0075 00117 (./main.go:26) RET ... ... ``` Since there are multiple goroutines to check `c.exec == nil`, if the lazy-init only assigns iface.tab and the FindInPath TEXT address, interface's value is still nil and the FindInPath will use the interface's value as method receiver. It will panic if try to read the value from receiver. Reference: [1] containerd#76 [2] moby/buildkit#2589 (comment) [3] https://github.com/containernetworking/cni/blob/1694fd7b57e0176a98a12823a5ffc03337fdc152/libcni/api.go#L182 Fixes: containerd#76 Signed-off-by: Wei Fu <fuweid89@gmail.com>
Try to fix it in go-cni containerd/go-cni#82 |
Whoop! And thanks, @fuweid! |
Fixes #2574.
The only relevant commits here are 351c15c and 07a6011 (the rest are the commits from #2442, since I didn't particularly enjoy the idea of trying to resolve merge conflicts after new tests got added there).
Essentially, we just remove the
dfheredoc
tag, and combine all the files in together now that we don't need the separation. Additionally, I've also updated the docs for what I assume will probably be a dockerfile 1.4.0 release? Not sure if that's needed, so I'm happy if that patch gets dropped.Let me know if there's anything else needed.