Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/link: incorrect GC bitmap when global's type is in another shared object #39927

Closed
aclements opened this issue Jun 29, 2020 · 10 comments
Closed
Labels
Milestone

Comments

@aclements
Copy link
Member

@aclements aclements commented Jun 29, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14 linux/amd64

Also reproduced on master and as far back at Go 1.10.

Does this issue reproduce with the latest release?

Yes.

What did you do?

When linking in shared mode, if the type of a global variable is imported from a package in a different shared object, the global variable is marked as containing no pointers, which can cause the garbage collector to free reachable memory.

Here's an example from @cherrymui:

p.go

package p

type T [10]*int

main.go

package main

import "p"
import "runtime"

var x p.T
var y [10]*uint

func main() {
	for i := range x {
		x[i] = new(int)
		*x[i] = 12345
	}
	runtime.GC()
	runtime.GC()
	runtime.GC()
	for i := range y {
		y[i] = new(uint)
		*y[i] = 54321
	}
	runtime.GC()
	runtime.GC()
	runtime.GC()
	for i := range x {
		println(x[i], *x[i], y[i], *y[i])
	}
}
$ GOPATH=/tmp go install -buildmode=shared runtime sync/atomic
$ GOPATH=/tmp go install -buildmode=shared -linkshared p
$ GOPATH=/tmp go build -linkshared main
$ ./main
0xc00004a080 54321 0xc00004a080 54321
0xc00004a088 54321 0xc00004a088 54321
0xc00004a090 54321 0xc00004a090 54321
0xc00004a098 54321 0xc00004a098 54321
0xc00004a0a0 54321 0xc00004a0a0 54321
0xc00004a0a8 54321 0xc00004a0a8 54321
0xc00004a0b0 54321 0xc00004a0b0 54321
0xc00004a0b8 54321 0xc00004a0b8 54321
0xc00004a0c0 54321 0xc00004a0c0 54321
0xc00004a0c8 54321 0xc00004a0c8 54321

What did you expect to see?

In the example program, the garbage collector should not have freed the objects reachable from x, so it should have outputted:

0xc00004a080 12345 0xc00004a080 54321

What did you see instead?

The garbage collector incorrectly freed the objects reachable from x because its type was declared in a package in a different shared object.

/cc @cherrymui @jeremyfaller @thanm

@aclements aclements added the NeedsFix label Jun 29, 2020
@aclements aclements added this to the Go1.15 milestone Jun 29, 2020
@aclements
Copy link
Member Author

@aclements aclements commented Jun 29, 2020

This happens when the linker is building the gcprog for the globals. It calls decodetypeGcmask (or decodetypeGcprog) for the global's type. Since the type is declared in another shared object, it has symbol type sym.SDYNIMPORT, so decodetypeGcmask attempts to read it out of the right section in the shared object. However, the shared object has already been closed, so sect.ReadAt returns a "file already closed error". decodetypeGcmask ignores this error and simply returns the freshly-allocated, zeroed bitmap, indicating that the type has no pointers.

There's a secondary issue on ARM64 on master where decodetypeGcprogShlib returns 0, which causes findShlibSection to return the NULL section (which has address 0 and length 0), causing decodetypeGcmask to look in the wrong section, in addition to trying to read from a closed file.

@aclements
Copy link
Member Author

@aclements aclements commented Jun 29, 2020

For the secondary issue, CL 215997 made decodetypeGcprogShlib return 0 on ARM64, where previously it would return the addend of the relocation on the gcdata field (these relocations were collected by ldshlibsyms). However, while returning 0 is definitely wrong, we no longer need any special case for this on ARM64 because this addend now gets applied in place in the shared object, meaning the contents of the gcdata field are already the correct pointer on ARM64, just like on all the other platforms.

@gopherbot
Copy link

@gopherbot gopherbot commented Jun 29, 2020

Change https://golang.org/cl/240462 mentions this issue: cmd/link: fix GC data reading from shared library

@gopherbot gopherbot closed this in 5779bb4 Jun 30, 2020
@networkimprov
Copy link

@networkimprov networkimprov commented Jun 30, 2020

Backport candidate?

gopherbot pushed a commit that referenced this issue Jun 30, 2020
This reverts CL 240462.

Reason for revert: test fails on PPC64LE.

Updates #39927.

Change-Id: I4f14fd0c36e604a80ae9f2f86d1e643e28945e93
Reviewed-on: https://go-review.googlesource.com/c/go/+/240616
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
@gopherbot
Copy link

@gopherbot gopherbot commented Jun 30, 2020

Change https://golang.org/cl/240616 mentions this issue: Revert "cmd/link: fix GC data reading from shared library"

@aclements
Copy link
Member Author

@aclements aclements commented Jun 30, 2020

Change reverted because it broke ppc64le.

@aclements aclements reopened this Jun 30, 2020
@aclements
Copy link
Member Author

@aclements aclements commented Jun 30, 2020

@gopherbot, please backport to Go 1.14 and 1.13.

@gopherbot
Copy link

@gopherbot gopherbot commented Jun 30, 2020

Backport issue(s) opened: #39955 (for 1.14), #39956 (for 1.15).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@gopherbot
Copy link

@gopherbot gopherbot commented Jun 30, 2020

Change https://golang.org/cl/240621 mentions this issue: cmd/link: fix GC data reading from shared library (attempt 2)

@gopherbot gopherbot closed this in 7799756 Jul 1, 2020
@gopherbot
Copy link

@gopherbot gopherbot commented Jul 1, 2020

Change https://golang.org/cl/240511 mentions this issue: [release-branch.go1.14] cmd/link: fix GC data reading from shared library (attempt 2)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.