@ghost ghost commented Feb 5, 2020

I tried using the default CRD-based operator configuration from here, but it failed:

time="2020-02-05T13:48:02Z" level=fatal msg="unable to read operator configuration: could not get operator configuration object \"postgres-operator-configuration\": v1.OperatorConfiguration.Configuration: v1.OperatorConfigurationData.Kubernetes: v1.KubernetesMetaConfiguration.MasterPodMoveTimeout: readUint64: unexpected character: \xff, error found in #10 byte of ...|timeout\":\"20m\",\"oaut|..., bigger context ...|\"enable_sidecars\":true,\"master_pod_move_timeout\":\"20m\",\"oauth_token_secret_name\":\"postgresql-operato|..." pkg=controller

From what I can tell, unmarshalling the MasterPodMoveTimeout field from a CRD-based operator configuration fails because time.Duration has no JSON unmarshaller of its own: it is decoded as a plain integer, so a string value such as "20m" is rejected.

The solution to this seems to be to use the Duration type instead:

//Duration shortens this frequently used name
type Duration time.Duration

That type seems to be what's consistently used for other fields, and it is unmarshallable:

// UnmarshalJSON convert to Duration from byte slice of json
func (d *Duration) UnmarshalJSON(b []byte) error {
	var (
		v   interface{}
		err error
	)
	if err = json.Unmarshal(b, &v); err != nil {
		return err
	}
	switch val := v.(type) {
	case string:
		t, err := time.ParseDuration(val)
		if err != nil {
			return err
		}
		*d = Duration(t)
		return nil
	case float64:
		t := time.Duration(val)
		*d = Duration(t)
		return nil
	default:
		return fmt.Errorf("could not recognize type %T as a valid type to unmarshal to Duration", val)
	}
}

I've tested locally and with this change the error disappeared.

@erthalion
Contributor

Good catch! I've checked, and it looks like you're right. I'm afraid the configmap-based configuration has the same problem, although it's sort of deprecated by now.

@erthalion
Contributor

👍

@Jan-M
Member

Jan-M commented Feb 11, 2020

👍

@erthalion erthalion merged commit 00f00af into zalando:master Feb 11, 2020
@pcornelissen

pcornelissen commented Feb 12, 2020

FYI: this is not fixed. I just tried it and the "... readUint64: unexpected character:" error still happens with the postgresql-operator-default-configuration.yaml from a few minutes ago:

time="2020-02-12T05:12:31Z" level=fatal msg="unable to read operator configuration: could not get operator configuration object \"postgresql-operator-default-configuration\": v1.OperatorConfiguration.Configuration: v1.OperatorConfigurationData.Kubernetes: v1.KubernetesMetaConfiguration.MasterPodMoveTimeout: readUint64: unexpected character: \xff, error found in #10 byte of ...|timeout\":\"20m\",\"oaut|..., bigger context ...|\"enable_sidecars\":true,\"master_pod_move_timeout\":\"20m\",\"oauth_token_secret_name\":\"postgresql-operato|..." pkg=controller

@tclass

tclass commented Feb 14, 2020

Is there a quick fix for this?

@pcornelissen

I just removed the value, but I don't know which value is used in that case ;)

@erthalion
Contributor

> FYI: this is not fixed. I just tried it and the "... readUint64: unexpected character:" error still happens with the postgresql-operator-default-configuration.yaml from a few minutes ago:

Strange, I'll check. @pcornelissen just to make sure, did you build the operator from 00f00af or later?

@pcornelissen

I checked out the repo on the 12th and used the files in Master, so the commit should be included.
The image used is: registry.opensource.zalan.do/acid/postgres-operator:v1.3.1

@erthalion
Contributor

> I checked out the repo on the 12th and used the files in Master, so the commit should be included.
> The image used is: registry.opensource.zalan.do/acid/postgres-operator:v1.3.1

I'm confused: are you using the prebuilt image postgres-operator:v1.3.1 (which doesn't include this fix yet), or did you build your own from the master branch?

@pcornelissen

I checked out master and used the config files from there, which reference the container image above. I don't know when the code is built, so I assumed that once this was fixed, the corresponding image would also be updated.
If that is not the case, could you trigger an image rebuild (and update the YAML files), so other people don't fall into this trap as well?

@FxKu
Member

FxKu commented Feb 25, 2020

@pcornelissen The fix is included in the new v1.4.0 release.
