Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialization of marshaled types #64

Closed
williamgcampbell opened this issue Jul 26, 2022 · 4 comments
Closed

Initialization of marshaled types #64

williamgcampbell opened this issue Jul 26, 2022 · 4 comments

Comments

@williamgcampbell
Copy link
Contributor

williamgcampbell commented Jul 26, 2022

When processing structs that implement a marshaller, this library has typically left it up to the unmarshaling function to populate the object. For example, given this config.

type Config struct {
	URL *url.URL `env:"URL"`
}

The result of processing this configuration through env-config would be equivalent to marshaling and unmarshaling a url.URL struct. For example:

func MarshalTest(t *testing.T) {
	var cfg Config
	if err := envconfig.Process(context.Background(), &cfg); err != nil {
		t.Fatal(err)
	}

	u := &url.URL{}
	b, _ := u.MarshalBinary()
	_ = u.UnmarshalBinary(b)
	want := Config{
		URL: u,
	}

	if diff := cmp.Diff(want, cfg, cmp.AllowUnexported(Config{})); diff != "" {
		t.Errorf("Configuration() mismatch (-want +got):\n%s", diff)
	}
}

Since v0.8.0 this test would fail. It seems that the library is skipping the decoder of a struct field if there are no values set. This generally seems ok for fields that can be controlled with noinit, default tags but in this example, the initialization of fields internal to url.URL cannot be controlled with env tags.

I think introducing the concept of managed vs unmanaged fields in the configuration will help with this. If a configuration field has an env tag it should follow the rules laid out by the env-config library, which will only use the decoder if a conf value is set (default or otherwise). Any struct fields without the env tag should default to using their underlying decoder no matter what. This will prevent potentially initializing fields that were not intended to be initialized.

I've included a draft PR to demonstrate this.

@sethvargo
Copy link
Owner

Hi @williamgcampbell

Could you put up a PR that shows the failure you're talking about with url.URL?

@sethvargo
Copy link
Owner

Okay, this is a a tricky one. One of envconfig's promises is that it will deeply initialize pointer struct fields. In this case net/url.URL.UserInfo is a struct pointer, so it's being initialized to &net/url.UserInfo{}, but the binary unmarshaler leaves the value as nil.

I think url.URL might be a weird edge case here that accidentally had weird behavior in previous versions because of it's binary unmarshaler.

{
	// https://github.com/sethvargo/go-envconfig/issues/64
	name: "custom_decoder_uses_decoder_no_env",
	input: &struct {
		URL *url.URL
	}{},
	exp: &struct {
		URL *url.URL
	}{
		URL: &url.URL{},
	},
	lookuper: MapLookuper(nil),
},
{
	// https://github.com/sethvargo/go-envconfig/issues/64
	name: "custom_decoder_uses_decoder_env_no_value",
	input: &struct {
		URL *url.URL `env:"URL"`
	}{},
	exp: &struct {
		URL *url.URL `env:"URL"`
	}{
		URL: &url.URL{},
	},
	lookuper: MapLookuper(nil),
},
{
	// https://github.com/sethvargo/go-envconfig/issues/64
	name: "custom_decoder_uses_decoder_env_with_value",
	input: &struct {
		URL *url.URL `env:"URL"`
	}{},
	exp: &struct {
		URL *url.URL `env:"URL"`
	}{
		URL: &url.URL{
			Scheme: "https",
			Host:   "foo.bar",
		},
	},
	lookuper: MapLookuper(map[string]string{
		"URL": "https://foo.bar",
	}),
},

I think the only thing we could do is to always process the decoder (reverting part of #62), so that decoder interfaces are always processed regardless of whether an environment variable is present. However, that breaks some other things, so I'm not really sure the best path here. I'm open to ideas.

/cc @twmb

@twmb
Copy link

twmb commented Jul 27, 2022

I'm confused a little bit, why not add noinit to the URL field so that the config looks like:

type Config struct {
	URL *url.URL `env:"URL,noinit"`
}

This is almost the same behavior being desired, except rather than new(url.URL), you have nil. I'm not sure what the use case / purpose of a completely empty but non-nil URL could be -- I could only see this as a shortcut to set fields if they are not specified, but specifically relying on the input being non-nil. This is leveraging the deeply-initialize-pointers aspect of the library, but as @sethvargo points out, the library previously guaranteed that it would deeply initialize structs.

This doesn't solve the use case of not using pointers,

type Config struct {
    URL url.URL `env:URL`
}

In v0.7.0, this would initialize the URL with UnmarshalBinary, leaving UserInfo nil. In v0.8.0, initializes the URL deeply without calling UnmarshalBinary, which inadvertently initializes the internal UserInfo. In fact, I'm seeing that having one level of indirection also initializes even with noinit in v0.8.0: https://go.dev/play/p/t_jcxhjJx2r

I agree that not calling unmarshal functions is new behavior as of #64. The #64 fix does feel correct to me -- if I don't have an env var for a field, why would the field be processed with no input? However, this does somewhat conflict conceptually with the always-deep-initialize aspect.

AFAICT the only happy path to support both #62 and this issue is a new tag or two that supports one behavior or the other. My real preference is to not deeply initialize structs default, and only deeply initialize if some deep struct has a present env key. This is a breaking API change.

@williamgcampbell
Copy link
Contributor Author

why not add noinit to the URL field so that the config looks like...

Giving it some thought, this sounds like the best path forward. I like the idea of envconfig always initializing, and the noinit tag indicating a full stop on that level down.

In fact, I'm seeing that having one level of indirection also initializes even with noinit in v0.8.0

I believe this is an actual bug. Looking at the code the parent noinit tag is not being carried through multiple levels of indirection. Each run of processWith is only taking into consideration its direct parent's noinit tag. This test should pass but it currently fails in v0.8.0

		{
			name: "noinit/no_init_passed_from_parent",
			input: &struct {
				Electron *Electron `env:"FIELD,noinit"`
			}{},
			exp: &struct {
				Electron *Electron `env:"FIELD,noinit"`
			}{
				Electron: nil,
			},
			lookuper: MapLookuper(map[string]string{}),
		},

I think this should be easily fixable by using noInit := parentNoInit || opts.NoInit when calling processWith recursively.

This was referenced Jul 27, 2022
sethvargo added a commit that referenced this issue Dec 27, 2023
**:warning: Breaking!** Envconfig no longer runs decoders on unset values! To restore the old behavior, add the `decodeunset` struct field annotation or pass the `DefaultDecodeUnset` configuration option as `true`.

Prior to this change, envconfig would always call decoding functions, even for unset or empty values. This proved problematic for some libraries like `url` and `zap` which implement `TextUnmarshaller`, but error on the empty string (#61). #62 changed the behavior to only call the decoder if a value was explicitly provided, but that broke users unexpectedly (#64), so the change was reverted.

After much discussion, we decided to keep the existing behavior until the 1.0 release. However, after further reflection, I think we need to support a user-level configurable option. Some fields and structs may still want the decoder to run on empty values.

This PR changes envconfig to only process a field if any of the following are true:

- A value was provided (the value can be set to the empty string, there is a distinction between "unset" and "set to empty")
- A default value was provided
- The `decodeunset` struct field tag is set
- The `DefaultDecodeUnset` configuration option is true
sethvargo added a commit that referenced this issue Dec 27, 2023
**:warning: Breaking!** Envconfig no longer runs decoders on unset values! To restore the old behavior, add the `decodeunset` struct field annotation or pass the `DefaultDecodeUnset` configuration option as `true`.

Prior to this change, envconfig would always call decoding functions, even for unset or empty values. This proved problematic for some libraries like `url` and `zap` which implement `TextUnmarshaller`, but error on the empty string (#61). #62 changed the behavior to only call the decoder if a value was explicitly provided, but that broke users unexpectedly (#64), so the change was reverted.

After much discussion, we decided to keep the existing behavior until the 1.0 release. However, after further reflection, I think we need to support a user-level configurable option. Some fields and structs may still want the decoder to run on empty values.

This PR changes envconfig to only process a field if any of the following are true:

- A value was provided (the value can be set to the empty string, there is a distinction between "unset" and "set to empty")
- A default value was provided
- The `decodeunset` struct field tag is set
- The `DefaultDecodeUnset` configuration option is true
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants