Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reflection issue with embedded struct #765

Closed
xuannv112 opened this issue Jun 20, 2023 · 15 comments
Closed

Reflection issue with embedded struct #765

xuannv112 opened this issue Jun 20, 2023 · 15 comments

Comments

@xuannv112
Copy link
Contributor

xuannv112 commented Jun 20, 2023

What version of Garble and Go are you using?

$ garble version
mvdan.cc/garble v0.10.0

$ go version
go version go1.20.5

What did you do?

Here is the sample code for obfuscation

func main() {
        type MyInt int
	var t struct {
		Field1 MyInt
		MyInt
	}
	_ = reflect.ValueOf(t)
	_ = reflect.ValueOf(t).FieldByName("Field1")
	_ = reflect.ValueOf(t).FieldByName("MyInt")
}

What did you expect to see?

MyInt type shouldn't be obfuscated.

What did you see instead?

func main() {
	type CZKZyYFezzoE int
	var doEHHO5 struct {
		Field1	CZKZyYFezzoE
		CZKZyYFezzoE
	}
	_ =  /*line b_0JQxZezU.go:1*/ reflect.CRS4V1ASWQ(doEHHO5)
	_ =  /*line XksHcrjPjnW.go:1*/ reflect.CRS4V1ASWQ(doEHHO5).FieldByName("Field1")
	_ =  /*line UDwbo_E.go:1*/ reflect.CRS4V1ASWQ(doEHHO5).FieldByName("MyInt")
}

Notes:

1/ It is the same issue when using garble with go spew package
go-spew

We obfuscate t0 field so vt0 returns 0 and panic later in the init function.

panic("reflect.Value read-only flag has changed semantics")

2/ Solution
I think when using reflection, we can record non top-level objects as well. So is it ok to remove this logic?
No need to compare pkg.Scope() vs obj.Parent()
Check top level objects

The tradeoff here: There is no obfuscation even if there is no embedded struct when using reflection

@theaog
Copy link

theaog commented Jun 20, 2023

what go/garble version could we use to avoid this issue?

@theaog
Copy link

theaog commented Jun 20, 2023

> $ garble run *.go                                                                                                                           [±master ●]
# command-line-arguments
/home/m/.cache/garble/tool/link: running gcc failed: exit status 1
/usr/bin/ld: /tmp/go-link-1189358754/go.o: in function `oUEfqo.xdOu7h.swNWWA':
go.go:(.text+0x125347): undefined reference to `oUEfqo.iNRMZd'
/usr/bin/ld: /tmp/go-link-1189358754/go.o: in function `oUEfqo.iU5VGNz74.aQw4UsJmR7':
go.go:(.text+0x1256eb): undefined reference to `oUEfqo.iNRMZd'
/usr/bin/ld: /tmp/go-link-1189358754/go.o: in function `oUEfqo.xGBWdJXQaj6v':
go.go:(.text+0x1257f2): undefined reference to `oUEfqo.iNRMZd'
/usr/bin/ld: /tmp/go-link-1189358754/go.o: in function `oUEfqo.(*pPSrIql).nOIY6ItdS':
go.go:(.text+0x12f9c1): undefined reference to `oUEfqo.vXtJ6QlKb'
collect2: error: ld returned 1 exit status

exit status 2
exit status 1

@xuannv112
Copy link
Contributor Author

xuannv112 commented Jun 20, 2023

There are some temporary solutions
1/ Use GOGARBLE to specify which package you want to obfuscate -> I prefer this way
2/ Downgrade to go1.20.x (x<5), garblev0.9.3 (actually you only need to use the master branch but before this commit)
3/ For go-spew package, you can build with the safe tag -tags safe

@lu4p
Copy link
Member

lu4p commented Jun 21, 2023

@xuannv112 sounds like you have identified an issue and have a rough idea how to fix it, happy to review your PR. See CONTRIBUTING.md to get started, if you get stuck somewhere feel free to ask for help in our slack channel.

@mvdan
Copy link
Member

mvdan commented Jun 21, 2023

Note that we can't simply remove the check at https://github.com/burrowers/garble/blob/v0.10.0/reflect.go#L432, because the following line records the type as pkg.Path() + "." + obj.Name(), which assumes top-level declarations.

@mvdan
Copy link
Member

mvdan commented Jun 21, 2023

My change in 0f2b59d was slightly flawed, as can be seen by this regression. When obfuscating a package, we still need to temporarily record what local types aren't obfuscated, even if those don't matter when obfuscating importers.

We could consider reverting, but the old code was also flawed and it would lead to bad obfuscation depending on the state of the cache.

I think it's probably time to rewrite recordedObjectString; it has a couple of hacks and TODOs, so we already knew that it wasn't great :) I'll take a crack at it tonight, since I think that's probably important enough for v0.10.1.

@xuannv112
Copy link
Contributor Author

xuannv112 commented Jun 21, 2023

Agree that we should check if the local types should be obfuscated or not. (using reflection and being used in an embedded struct).
For a simpler solution, I think we can return this (use file-level instead of package level (if remove))

        if pkg.Scope() != obj.Parent() {
		pos := fset.Position(obj.Pos())
		return fmt.Sprintf("%s.%s - %s", pkg.Path(), obj.Name(),
			filepath.Base(pos.Filename))
	}

It means in the same file, all local types with the same obj name will not be obfuscated. I think it is acceptable.

@xuannv112
Copy link
Contributor Author

xuannv112 commented Jun 22, 2023

Or we can check in the object's scope if there are any structs that used the object as an embedded struct.

          getObjId := func(pkg *types.Package, obj types.Object) string {
		          pos := fset.Position(obj.Pos())
		          return fmt.Sprintf("%s.%s - %s:%d", pkg.Path(), obj.Name(),
			          filepath.Base(pos.Filename), pos.Line)
	 }
	if pkg.Scope() != obj.Parent() {
		// check if obj is a field of an embedded struct by lookup it's scope
		if _, ok := obj.(*types.TypeName); ok {
			scope := obj.Parent()
			for _, objName := range scope.Names() {
				otherObj := scope.Lookup(objName)
				if structType, ok := otherObj.Type().(*types.Struct); ok {
					for i := 0; i < structType.NumFields(); i++ {
						field := structType.Field(i)
						if field.Name() == obj.Name() && field.Embedded() {
							return getObjId(pkg, obj)
						}
					}
				}
			}
		}

		return ""
	}

So we will ignore only local types used in embedded field (without impact to other obj)

@mvdan
Copy link
Member

mvdan commented Jun 22, 2023

Definitely don't do that - that sort of double loop linear search will be very expensive in this part of the codebase.

I think it's okay, for now, to change recordedObjectString to do the same for local types that it already does for struct fields. It's a hack, but it should work.

@xuannv112
Copy link
Contributor Author

xuannv112 commented Jun 22, 2023

I have moved the embedded field check logic to only ri.recordUsedForReflect function to save cost.
#767

Here is the benchmark result:

go test -run=- -bench=. -count=6 -benchtime=1x

goos: darwin
goarch: amd64
pkg: mvdan.cc/garble
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
         │  old.txt   │           new.txt           │
         │   sec/op   │   sec/op    vs base         │
Build-12   24.84 ± 9%   25.20 ± 6%  ~ (p=1.000 n=6)

         │   old.txt    │              new.txt               │
         │    bin-B     │    bin-B      vs base              │
Build-12   5.642Mi ± 0%   5.643Mi ± 0%  +0.02% (p=0.002 n=6)

         │    old.txt    │               new.txt                │
         │ cached-sec/op │ cached-sec/op  vs base               │
Build-12    611.3m ± 36%    723.0m ± 41%  +18.28% (p=0.041 n=6)

         │   old.txt   │              new.txt              │
         │ mallocs/op  │ mallocs/op   vs base              │
Build-12   34.54M ± 0%   36.59M ± 0%  +5.93% (p=0.002 n=6)

         │   old.txt   │           new.txt           │
         │ sys-sec/op  │ sys-sec/op  vs base         │
Build-12   17.25 ± 10%   17.13 ± 5%  ~ (p=0.818 n=6)

@xuannv112
Copy link
Contributor Author

I think i can optimize a litte bit by scanning only when there are 2 conditions: reflection + embedded field. It is better when trigger by only 1 condition: reflection

@xuannv112
Copy link
Contributor Author

xuannv112 commented Jun 23, 2023

I have updated the logic: checking the embedded fields. No need for linear search at all. Here is the benchmark result

goos: darwin
goarch: amd64
pkg: mvdan.cc/garble
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
         │  old.txt   │           new.txt           │
         │   sec/op   │   sec/op    vs base         │
Build-12   24.84 ± 9%   25.26 ± 8%  ~ (p=0.937 n=6)

         │   old.txt    │              new.txt               │
         │    bin-B     │    bin-B      vs base              │
Build-12   5.642Mi ± 0%   5.648Mi ± 0%  +0.11% (p=0.002 n=6)

         │    old.txt    │            new.txt             │
         │ cached-sec/op │ cached-sec/op  vs base         │
Build-12    611.3m ± 36%    592.4m ± 16%  ~ (p=0.937 n=6)

         │   old.txt   │              new.txt              │
         │ mallocs/op  │ mallocs/op   vs base              │
Build-12   34.54M ± 0%   36.60M ± 0%  +5.94% (p=0.002 n=6)

         │   old.txt   │           new.txt           │
         │ sys-sec/op  │ sys-sec/op  vs base         │
Build-12   17.25 ± 10%   17.52 ± 8%  ~ (p=0.485 n=6)

@mvdan
Copy link
Member

mvdan commented Jun 23, 2023

Hang on, but why do we need the extra logic? As far as I know, the %s.%s - %s:%d hack will work just fine for local types just like it already does for fields. We don't need the extra code, I'm fairly sure.

@xuannv112
Copy link
Contributor Author

For this use case

	type Tqw int
	var T11 struct {
		A Tqw
		y Tqw
		a Tqw
	}
	_ = reflect.ValueOf(T11).FieldByName("A")

Tqw is still obfuscated as the old logic.

1/ My PR will not obfuscate Tqw only when Tqw is used as an embedded field
2/ If there is no extra logic and simply return %s.%s - %s:%d, for normal type (not used as an embedded field), it will not be obfuscated as the old logic.

@xuannv112
Copy link
Contributor Author

It is a minor case, but I want to keep the old logic unchanged if the cost is acceptable.

@mvdan mvdan closed this as completed in 404b2ce Jun 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

4 participants