Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DiscardElements helper transformer #28

Closed
wants to merge 1 commit into from
Closed

Conversation

dsnet
Copy link
Collaborator

@dsnet dsnet commented Jul 26, 2017

In some situations users want to compare two maps where missing entries
are equal to those that have the zero value.
For example, the following maps would be considered equal:
x := map[string]int{"foo": 12345, "zero":0}
y := map[string]int{"foo": 12345}

To help with this, we add DiscardElements to cmpopts that transforms maps
and slices by stripping entries based on a user provided function.
To strip zero values, the user can provide:
cmpopts.DiscardElements(func(v int) bool { return v == 0 })

//
// DiscardMapZeros can be used in conjunction with EquateEmpty,
// but cannot be used with SortMaps.
func DiscardMapZeros() cmp.Option {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"zeros" and "zeroes" are both correct spellings for the plural of "zero". Perhaps it would be better to avoid that ambiguity if we can.

How about DiscardZeroElements or OmitZeroElements? (That could also apply to slices, e.g. for a slice of pointers with a nil pointer meaning "skip".)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DiscardZeroElements also operating on slices feels like it is over-stretching what it should do. Although, I admit that it probably would not break anything since zero-values of any type is typically equal to itself. I do like the fact that I wouldn't need to think about whether this applies to maps or slices.

Let me think about this.

@@ -94,7 +94,8 @@ func (ss sliceSorter) less(v reflect.Value, i, j int) bool {
// • Transitive: if !less(x, y) and !less(y, z), then !less(x, z)
// • Total: if x != y, then either less(x, y) or less(y, x)
//
// SortMaps can be used in conjuction with EquateEmpty.
// SortMaps can be used in conjunction with EquateEmpty,
// but cannot be used with DiscardMapZeros.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is an interesting data point for our "only one option may apply" heuristic. These options happen to be commutative, so it's safe (and not particularly confusing) to apply them both.

Perhaps this is another good use-case for the "first option that applies" combiner you mentioned at GopherCon.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this is another good use-case for the "first option that applies" combiner you mentioned at GopherCon.

Yep. I was thinking the exact same thing :)

It would be really cool to be able to do something like FilterPriority(DiscardMapZeros(), SortMaps(f)) to first discard the map zeros and then sort the map.

Let's leave the comments clean-up for the CL that adds in the priority filter idea.

@dsnet
Copy link
Collaborator Author

dsnet commented Jul 27, 2017

@alercah provided an interesting example where DiscardMapZeros and EquateEmpty does not compose well.

Consider the following:

x: map[string][]int{"foo": []int{}}
y: map[string][]int{"foo": nil}

If only EquateEmpty were used, then these two are equal. However, the moment DiscardMapZeros is used, these are no longer equal because DiscardMapZeros only applies on y producing map[string][]int{}, while keeping the x value the same.

The current API will need to be thought about more carefully.

@dsnet
Copy link
Collaborator Author

dsnet commented Jul 27, 2017

If the API for transformers allowed for func(T, T) (R, R) you could take an additive approach where all missing elements of the other map could be added as the zero value. However, I'm not particularly fond of adding another function signature for transformers.

A subtractive approach would have worked as well if it only discarded zero-values of keys that do not exist in the other map. The core problem is that Transformers were designed assuming they would never need to know about the other value.

I feel like a IgnoreMapZeros approach would be much better, but there current Filter API does not have a concept of a "missing" element.

@bcmills
Copy link

bcmills commented Jul 27, 2017

Hmm, that example suggests that DiscardMapZeros should discard all values that are equal to the zero value, not just values identical to the zero value.

@dsnet
Copy link
Collaborator Author

dsnet commented Jul 27, 2017

Hmm, that example suggests that DiscardMapZeros should discard all values that are equal to the zero value, not just values identical to the zero value.

That would also be an API change where you could take in the cmp.state itself and call comparisons with it, because the definition of equality is dependent on the set of Options that are currently being used. messsagediff takes this approach where you had access to the comparer. I'd like to avoid that since it gets messy.

@dsnet
Copy link
Collaborator Author

dsnet commented Jul 31, 2017

I see at least 4 possible options:

  • Add API to allow Options to operate on "non-existent" values
  • Add API to allow Options to allow functions to call cmp.Equal themselves with the set of Options already provided to the initial call to cmp.Equal
  • Add API to allow Transformers to also operate on func(T, T) (R, R)
  • Add no new API and just change how this helper is used.

Add API to allow Options to operate on "non-existent" values
This approach would involve altering the algorithm of cmp.Equal to still descend into two values even if one of them is missing. There would need to be new API to allow Options to have a concept of "non-existent" values. Non-existent values are possible for maps and slices. Arguably it is possible for interfaces and pointers to have "non-existent" values if we consider nil values to be "non-existent".

I am opposed to this approach since whether two elements of a slice are "non-existent" or not is dependent on the exact diffing algorithm used to compare two slices. Thus, the output of whether two objects are equal or not is now possibly dependent on the diffing algorithm, which causes the results of cmp.Equal to be tied to specific internal implementation details.

Add API to allow Options to allow functions to call cmp.Equal themselves with the set of Options already provided to the initial call to cmp.Equal
I don't have an idea of how the function signatures would change to be able to accommodate plumbing a custom comparator down to each Option. Regardless of the function signatures, I'm opposed to this approach since the result of cmp.Equal is dependent on the Path, and exposing the current custom equality operation to the user requires the user to properly update Path, which seems like a very brittle API and easy for users to get wrong.

Add API to allow Transformers to also operate on func(T, T) (R, R)
Early in the design of cmp, there were no Filter options, and transformers used to take in a signature of func(T, T, Path) (R, R, bool). The Path and bool types correspond with a FilterPath. The resulting func(T, T) (R, R) was simplified to simply func(T) R. When that happened, I noted that this prevents the use case where the transformation is a function of the other value.

Using func(T, T) (R, R), we can pre-emptively add the zero value for keys that exist in the other map. Thus, after transformation, both maps would be the same length.

Although I do see some utility in func(T, T) (R, R), I'm not convince that this use-case is a sufficient strong candidate for adding that as a valid Transformer to accept. (See option 4 below)

Add no new API and just change how this helper is used.
Instead of changing cmp, we can change the helper option being added to give more control to the user.

We can go with the following:

// DiscardElements transforms all slices and maps of type []V or map[K]V,
// where type V is assignable to type T, by removing elements where the
// remove function, f, reports true.
//
// As an example, all zero elements can be discarded with:
//	DiscardElements(func(v MyStruct) bool { return v == MyStruct{} })
//
// DiscardElements can be used in conjunction with EquateEmpty,
// but cannot be used with SortMaps.
func DiscardElements(f func(T) bool) cmp.Option

There are two benefits to this helper options:

  • Unlike DiscardMapZeros which operates on all maps, this scopes the operation to only work on maps where the value is type T, specified by the user.
  • For Alexis' example, it can now be handled with:
DiscardElements(func(v []int) bool {
    return cmp.Equal(v, []int(nil), cmpopts.EquateEmpty())
})

It is a little unfortunate, that the user would have to specify EquateEmpty again, but at least the control is in the user's hands, and it doesn't require exposing too much internal state of the comparison to the user.

@alercah
Copy link

alercah commented Jul 31, 2017

Hmm. If you provided helper functions for the zero and zero/empty cases, I think that would be good.

@dsnet
Copy link
Collaborator Author

dsnet commented Aug 1, 2017

@alercah, I'm going to hold off on adding sub-helper functions for those cases. It did not seem like the original problem was common enough to warrant more helpers. We can always add those in the future.

@alercah
Copy link

alercah commented Aug 1, 2017

SGTM

@dsnet dsnet force-pushed the discard-maps branch 3 times, most recently from b7c0d85 to d2e3457 Compare August 2, 2017 01:17
@dsnet dsnet changed the title Add DiscardMapZeros helper transformer Add DiscardElements helper transformer Aug 2, 2017
@dsnet
Copy link
Collaborator Author

dsnet commented Aug 2, 2017

PR updated with DiscardElements helper that operates on both slices and maps.

In some situations users want to compare two maps where missing entries
are equal to those that have the zero value.
For example, the following maps would be considered equal:
	x := map[string]int{"foo": 12345, "zero":0}
	y := map[string]int{"foo": 12345}

To help with this, we add DiscardElements to cmpopts that transforms maps
and slices by stripping entries based on a user provided function.
To strip zero values, the user can provide:
	cmpopts.DiscardElements(func(v int) bool { return v == 0 })
@dbinit
Copy link

dbinit commented Mar 13, 2018

I just ran into this issue in my first attempt at using cmp for some unit tests. In my particular case, one of the maps will always be a superset of the other, so I just want to set the missing keys in the smaller map to the zero value.

This is what I came up with:

func FillMaps(keys interface{}) cmp.Option {
	var mf mapFiller
	vk := reflect.ValueOf(keys)
	if vk.Kind() == reflect.Map {
		// Use the keys from an existing map.
		mf.keyType = vk.Type().Key()
		mf.keys = vk.MapKeys()
	} else {
		// Array or slice of keys.
		mf.keyType = vk.Type().Elem()
		for i, l := 0, vk.Len(); i < l; i++ {
			mf.keys = append(mf.keys, vk.Index(i))
		}
	}
	return cmp.FilterValues(mf.filter, cmp.Transformer("Fill", mf.fill))
}

type mapFiller struct {
	keyType reflect.Type
	keys    []reflect.Value
}

func (mf mapFiller) filter(x, y interface{}) bool {
	vx, vy := reflect.ValueOf(x), reflect.ValueOf(y)
	return (x != nil && y != nil && vx.Type() == vy.Type()) &&
		(vx.Kind() == reflect.Map && mf.keyType.AssignableTo(vx.Type().Key())) &&
		(mf.hasMissingKey(vx) || mf.hasMissingKey(vy))
}

func (mf mapFiller) hasMissingKey(v reflect.Value) bool {
	for _, k := range mf.keys {
		if !v.MapIndex(k).IsValid() {
			return true
		}
	}
	return false
}

func (mf mapFiller) fill(m interface{}) interface{} {
	vm := reflect.ValueOf(m)
	if !mf.hasMissingKey(vm) {
		return m
	}
	tm := vm.Type()
	vn := reflect.MakeMap(tm)
	vz := reflect.Zero(tm.Elem())
	for _, k := range mf.keys {
		vn.SetMapIndex(k, vz)
	}
	for _, k := range vm.MapKeys() {
		vn.SetMapIndex(k, vm.MapIndex(k))
	}
	return vn.Interface()
}

And I just use it like:

cmp.Diff(want, got, FillMaps(got))

@dsnet
Copy link
Collaborator Author

dsnet commented Mar 13, 2018

While this works for maps, it seems weird to me that the Option is a function of the values of the input, rather just the type.

@dsnet
Copy link
Collaborator Author

dsnet commented Feb 26, 2019

#121 provides the ground work for ignoring missing map and slice elements. I'm abandoning this PR since the other PR is a superior approach in my opinion.

@dsnet dsnet closed this Feb 26, 2019
@dsnet dsnet deleted the discard-maps branch February 26, 2019 20:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants