Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

symbol garbling is not reproducible across platforms within the same versions of Go/Garble/GOOS. #449

Closed
kortschak opened this issue Jan 8, 2022 · 12 comments · Fixed by #496
Assignees
Labels
enhancement New feature or request

Comments

@kortschak
Copy link
Contributor

What version of Garble and Go are you using?

$ garble version
v0.5.0
$ go version                   
go version go1.17.6 linux/amd64
$ garble version
v0.5.0
$ go version                   
go version go1.17.6 darwin/amd64

What environment are you running Garble on?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN="/home/user/bin"
GOCACHE="/home/user/.cache/go-build"
GOENV="/home/user/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/user/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/user"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/user/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/user/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17.6"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1321203981=/tmp/go-build -gno-record-gcc-switches"

(similar on darwin)

What did you do?

Compile the following code on linux with GOOS=darwin and on darwin with GOOS unset (so implicitly darwin). Then run both executables on darwin.

package main

func main() {
	panic("panic message")
}

What did you expect to see?

The same output from the two executables.

What did you see instead?

From the executable built on linux:

panic: panic message

goroutine 1 [running]:
main.main()
	_VX2T3dM.go:1 +0x27

From the executable built on darwin:

panic: panic message

goroutine 1 [running]:
main.main()
	FYt7Rx68.go:1 +0x27

The impact of this is that when doing garble reverse the HOSTOS needs to be known which will impact users who make use of cross-compilation for packaging, say building for all GOOSs on linux, which is not uncommon.

@mvdan
Copy link
Member

mvdan commented Jan 8, 2022

I think it's reasonable that building for GOOS=darwin from both Linux and Mac should result in the same obfuscation, and ideally in the same binary (assuming all other variables are equal, and CGO_ENABLED=0 is consistent).

This is not the case today, because we make use of Build IDs to seed the internal hashing, and those build IDs are rather platform-specific. We even use garble's own Build ID (embedded in its binary) to seed some globally-scoped hashes, such as exported field obfuscations. This will definitely change depending on what GOOS/GOARCH garble is built and ran on.

Aside from the above - do we care about GOOS=linux garble build and GOOS=darwin garble build resulting in exactly the same obfuscated names? It's a similar scenario, but not the same as before. One upside is that reversing becomes easier, as one can reverse output from any binary given the same version, regardless of GOOS/GOARCH. One downside is that obfuscation quality might be worse by default; if one can figure out how to de-obfuscate some names for one GOOS, they can just copy-paste the learnings to others. Another downside is that builds across target platforms aren't meant to be identical and reproducible, so making obfuscation identical doesn't make a ton of sense.

So my intuition is to only do what the first paragraph of this comment says: make obfuscation reproducible as long as all the versions (Go, garble, source modules) and build parameters (GOOS, GOARCH, -tags, etc) are identical.

@mvdan
Copy link
Member

mvdan commented Jan 8, 2022

Having said the above, I'm not sure how we could do this at a technical level. I'm thinking about what makes sense from the user's perspective first.

@kortschak
Copy link
Contributor Author

kortschak commented Jan 8, 2022

Build parameters that are considered need to also include GOHOSTOS and GOHOSTARCH if the first paragraph does not become a goal.

I agree that it's not sensible to try to make cross GOOS builds obfuscate the same way, there are too many differences between the platforms already to make that a sensible thing to aim for.

@mvdan
Copy link
Member

mvdan commented Jan 8, 2022

Build parameters that are considered need to also include GOHOSTOS and GOHOSTARCH if the first paragraph does not become a goal.

If the first paragraph does not become a goal, then we have nothing to do; GOHOSTOS, GOHOSTARCH, and any other build parameters involved in building garble itself are already part of the seed used when obfuscating code. So I think we're in agreement that we want to remove those variables.

@kortschak
Copy link
Contributor Author

If the first paragraph does not become a goal, then we have nothing to do

I was thinking that documenting the requirement would be a worthwhile addition if it is not a goal.

@capnspacehook
Copy link
Member

If the user manually specifies a seed but compiles the same package with different GOOS/GOARCH values, will the obfuscation be the same? iirc it won't, and I think that striving for that would be helpful for cases like building a client and server that both use a package for serialization, and that package requires reflection. I've seen that use case come up a few times before.

@mvdan
Copy link
Member

mvdan commented Jan 14, 2022

would be helpful for cases like building a client and server that both use a package for serialization, and that package requires reflection

I'm not sure I follow - what we aim for is to instead not obfuscate those RPC types. Detecting reflection like that is not perfect, but it's our only shot at solving that issue.

The only alternative is to only use a user-supplied seed to obfuscate names in each package, and then the user would have to keep the same seed if they want multiple builds to be compatible in terms of RPC. I'm not a big fan of this solution because it doesn't work by default (the user needs to supply a seed), and it requires the user to rotate seeds correctly (e.g. if never rotated, this makes deobfuscation easier over time).

@mvdan
Copy link
Member

mvdan commented Jan 16, 2022

Thinking about this some more: we could assume that, if the user supplies a -seed, they know what they are doing. In that case, we would only use their seed for hashing, and we wouldn't use anything else like build IDs, which depend on both the host and target build parameters.

This way, if the author wants a program to obfuscate the same way across platforms, they could accomplish it by controlling -seed. If they want reproducible builds, they'll want to control the seed, anyway - otherwise even a rebuild of garble could break reproducibility.

One feature we could lose if we did this is that name N0 in packages P1 and P2 would be obfuscated the same way, as we would no longer use P1 and P2's different build ID hashes as part of N0's obfuscated name hash. We can solve this by instead including P1 and P2's import paths in the obfuscated name hash.

Edit: this idea comes from @capnspacehook and @moloch--, mostly :)

@capnspacehook
Copy link
Member

That sounds perfect, as long as we document that if -seed is used the user is in total control of the seed. That may seem obvious to many users given the name of the flag but isn't the case currently.

I'm personally a fan of designing systems that aim to 'just work' out of the box with opinionated defaults, but also let advanced users tweak some settings if they want.

@mvdan
Copy link
Member

mvdan commented Jan 16, 2022

@kortschak do you have thoughts on #449 (comment)? Would it be a good solution to your problem? I realise that needing to supply -seed to get deterministic builds across platforms is perhaps not ideal, but it's a good step forward that we can take rather easily.

@kortschak
Copy link
Contributor Author

kortschak commented Jan 17, 2022

It would solve my problem, but it makes me sort of sad (for aesthetic reasons more than anything — I can't provide any rational justification for this). Given that that my use in in CI to check that things are consistent, the weakening of obfuscation if seeds aren't rotated is not a real concern. I imagine that the approach could be made more secure in the future if another strategy is found.

@mvdan
Copy link
Member

mvdan commented Jan 17, 2022

@kortschak you're pretty spot on. The reason I'm not super happy about this solution is that it makes the default behavior less reproducible, akin to how go build isn't reproducible unless you remember to use the right flags like -trimpath.

I've actually been thinking about places to source a good "default seed" from for some time, see #156. I think that could be a way to improve on the default case, so that we'd effectively always have a seed that's constant across target and host platforms, but still impossible to figure out unless you have the source code.

@capnspacehook capnspacehook self-assigned this Jan 17, 2022
@capnspacehook capnspacehook added the enhancement New feature or request label Jan 17, 2022
capnspacehook added a commit to capnspacehook/garble that referenced this issue Jan 17, 2022
If -seed is passed, the user wishes to explicitly provide a specific seed
for some reason. It would make sense that building the same code for
different OSes/architecures providing the same seed would result in
names getting hashed the same way.

That is not the case currently, because Build IDs are always used to
salt the hashes of names, and Build IDs differ whenever any build
arguments differ. Instead, salt hashes with the package's import path
when -seed is passed, and use Build IDs as salts otherwise. This ensures
also ensures the same name in different packages will always be hashed
differently.

Fixes burrowers#449.
capnspacehook added a commit to capnspacehook/garble that referenced this issue Jan 17, 2022
If -seed is passed, the user wishes to explicitly provide a specific seed
for some reason. It would make sense that building the same code for
different OSes/architecures providing the same seed would result in
names getting hashed the same way.

That is not the case currently, because Build IDs are always used to
salt the hashes of names, and Build IDs differ whenever any build
arguments differ. Instead, salt hashes with the package's import path
when -seed is passed, and use Build IDs as salts otherwise. This ensures
also ensures the same name in different packages will always be hashed
differently.

Fixes burrowers#449.
@mvdan mvdan assigned mvdan and unassigned capnspacehook Mar 5, 2022
mvdan added a commit to mvdan/garble-fork that referenced this issue Mar 5, 2022
The default behavior of garble is to seed via the build inputs,
including the build IDs of the entire Go build of each package.
This works well as a default, and does give us determinism,
but it means that building for different platforms
will result in different obfuscation per platform.

Instead, when -seed is provided, don't use any other hash seed or salt.
This means that a particular Go name will be obfuscated the same way
as long as the seed, package path, and name itself remain constant.

In other words, when the user supplies a custom -seed,
we assume they know what they're doing in terms of storage and rotation.

Expand the README docs with more examples and detail.

Fixes burrowers#449.
mvdan added a commit to mvdan/garble-fork that referenced this issue Mar 12, 2022
The default behavior of garble is to seed via the build inputs,
including the build IDs of the entire Go build of each package.
This works well as a default, and does give us determinism,
but it means that building for different platforms
will result in different obfuscation per platform.

Instead, when -seed is provided, don't use any other hash seed or salt.
This means that a particular Go name will be obfuscated the same way
as long as the seed, package path, and name itself remain constant.

In other words, when the user supplies a custom -seed,
we assume they know what they're doing in terms of storage and rotation.

Expand the README docs with more examples and detail.

Fixes burrowers#449.
mvdan added a commit to mvdan/garble-fork that referenced this issue Mar 14, 2022
The default behavior of garble is to seed via the build inputs,
including the build IDs of the entire Go build of each package.
This works well as a default, and does give us determinism,
but it means that building for different platforms
will result in different obfuscation per platform.

Instead, when -seed is provided, don't use any other hash seed or salt.
This means that a particular Go name will be obfuscated the same way
as long as the seed, package path, and name itself remain constant.

In other words, when the user supplies a custom -seed,
we assume they know what they're doing in terms of storage and rotation.

Expand the README docs with more examples and detail.

Fixes burrowers#449.
mvdan added a commit that referenced this issue Mar 14, 2022
The default behavior of garble is to seed via the build inputs,
including the build IDs of the entire Go build of each package.
This works well as a default, and does give us determinism,
but it means that building for different platforms
will result in different obfuscation per platform.

Instead, when -seed is provided, don't use any other hash seed or salt.
This means that a particular Go name will be obfuscated the same way
as long as the seed, package path, and name itself remain constant.

In other words, when the user supplies a custom -seed,
we assume they know what they're doing in terms of storage and rotation.

Expand the README docs with more examples and detail.

Fixes #449.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

Successfully merging a pull request may close this issue.

3 participants