issue setting empty values #19

willnorris · 2013-07-09T19:29:29Z

Depending on how our types are defined, we run into various interesting problems with setting empty values.

Option 1

This is the current behavior throughout the library.

As currently set up, we can't set empty values at all on Create or Update methods. For example, this won't work to remove the description from an existing repository:

// remove the repository description
r := &github.Repository{Description:""}
client.Repositories.Edit("user", "repo", r)

That is actually a no-op because the empty Description field gets dropped, since it is currently defined with omitempty. Fortunately, there are only a handful of mutable values where the zero value is actually meaningful, most notably bool values.
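A minimal sketch of why the call above is a no-op, assuming a cut-down Repository with value fields tagged omitempty (the encode helper is hypothetical and only stands in for whatever builds the request body):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Repository is a cut-down stand-in for the library's type, using plain
// value fields tagged with omitempty as described in Option 1.
type Repository struct {
	Name        string `json:"name,omitempty"`
	Description string `json:"description,omitempty"`
}

// encode is a hypothetical helper that returns the JSON body for r.
func encode(r *Repository) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// The empty Description looks the same as "not set", so it is dropped.
	fmt.Println(encode(&Repository{Description: ""}))  // {}
	fmt.Println(encode(&Repository{Description: "x"})) // {"description":"x"}
}
```

The first body is an empty JSON object, so the server sees nothing to change.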

Option 2

We could then instead drop the omitempty, but that has potentially really bad side-effects, particularly on edit methods. Take the above code sample. Without omitempty on any of our fields, this would work as intended. However, it would also have the unintended side-effect of wiping out the repo homepage, [and once we add the additional fields...] making it public, and disabling issues and the wiki for the repo, since all of those fields would be passing their zero values. The solution there is to first call Get, update the response, and then call Edit with the updated object:

// remove the repository description
r, _ := client.Repositories.Get("user", "repo")
r.Description = ""
client.Repositories.Edit("user", "repo", r)

If you forget to follow that flow, you're gonna have a bad time.
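To make the side effect concrete, here is a sketch of what gets serialized once omitempty is gone. The Homepage, Private, HasIssues, and HasWiki fields and their JSON keys are assumptions standing in for the "additional fields" mentioned above, and encode is a hypothetical helper:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Repository here has no omitempty tags, as in Option 2. The field set is
// illustrative, not the library's actual definition.
type Repository struct {
	Description string `json:"description"`
	Homepage    string `json:"homepage"`
	Private     bool   `json:"private"`
	HasIssues   bool   `json:"has_issues"`
	HasWiki     bool   `json:"has_wiki"`
}

// encode is a hypothetical helper that returns the JSON body for r.
func encode(r *Repository) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Only Description was meant to change, but every zero value is sent.
	fmt.Println(encode(&Repository{Description: ""}))
	// {"description":"","homepage":"","private":false,"has_issues":false,"has_wiki":false}
}
```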

Option 3

The third option is to do what goprotobuf does and actually use pointers for all non-repeated fields, since that allows a clear distinction between "unset" (nil) and "empty" (non-nil, zero value). That would result in types like:

type Repository struct {
    ID          *int        `json:"id,omitempty"`
    Owner       *User       `json:"owner,omitempty"`
    Name        *string     `json:"name,omitempty"`
    Description *string     `json:"description,omitempty"`
    CreatedAt   *time.Time  `json:"created_at,omitempty"`
    PushedAt    *time.Time  `json:"pushed_at,omitempty"`
    UpdatedAt   *time.Time  `json:"updated_at,omitempty"`
}

This is by far the safest approach, but does make working with the library a bit cumbersome, since creating pointers to primitive types takes a little extra work (multiplied by the number of fields you are setting). For example, the above code sample now becomes:

// remove the repository description
d := ""
r := &github.Repository{Description:&d}
client.Repositories.Edit("user", "repo", r)

The goprotobuf library makes this a little simpler by providing helper functions for creating pointers to primitives. Using those methods would result in:

// remove the repository description
r := &github.Repository{Description:proto.String("")}
client.Repositories.Edit("user", "repo", r)

Additionally, when working with these values, developers will always have to remember to dereference them where appropriate. While this is common for anyone used to working with protocol buffers in go, I'm not sure how unexpected it will be for the general community.
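A short sketch of how the pointer fields behave when encoded, assuming a reduced version of the struct above (encode is again a hypothetical helper). A nil pointer is omitted, while a non-nil pointer to the zero value is sent:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Repository is a reduced version of the pointer-based struct from Option 3.
type Repository struct {
	Name        *string `json:"name,omitempty"`
	Description *string `json:"description,omitempty"`
}

// encode is a hypothetical helper that returns the JSON body for r.
func encode(r *Repository) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	empty := ""
	fmt.Println(encode(&Repository{}))                    // {}
	fmt.Println(encode(&Repository{Description: &empty})) // {"description":""}
}
```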

willnorris · 2013-07-11T16:36:54Z

There is some (though not much, at the time of this writing) discussion of this on golang-nuts here. After talking with a couple of engineers at Google, I'm leaning toward option 3, and just updating the docs to recommend use of the helper methods in goprotobuf.

willnorris · 2013-08-01T18:20:52Z

ugh. So goprotobuf's Int() returns an int32 instead of a plain int, since protocol buffers requires int size to be explicit. So we can either declare all of our ints as int32 (not really a fan of that idea), add our own Int() convenience method (and then users of the library will need to remember to use goprotobuf for most types, but our function for ints... not great), or we just duplicate the convenience functions from goprotobuf and remove that dependency altogether.

Looking at things more closely, I'm pretty sure we really only need the Bool(), Int(), and String() functions, so this last option doesn't seem so bad... it's all of like 15 lines of code.
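For a sense of scale, the three helpers could look roughly like this. This is a sketch of the idea, not necessarily the code that ended up in the library:

```go
package main

import "fmt"

// Bool returns a pointer to a copy of v.
func Bool(v bool) *bool { return &v }

// Int returns a pointer to a copy of v. Unlike the goprotobuf helper
// mentioned above, it works with a plain int.
func Int(v int) *int { return &v }

// String returns a pointer to a copy of v.
func String(v string) *string { return &v }

func main() {
	d := String("")
	fmt.Println(*d == "", *Int(3), *Bool(true)) // true 3 true
}
```

Each call allocates a fresh value, so two calls never alias the same memory.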

willnorris · 2013-08-02T21:24:44Z

So yesterday I went through and updated all of our structs for GitHub resources to use pointers for singular fields, and was getting ready to write another "this makes me sad" commit message. Except that I realized that by doing this, it makes the output of String() completely useless, since it just outputs a bunch of pointer addresses for all the fields. Kinda wish I would have considered that before I changed everything. 😕

The goprotobuf library handles this by providing a custom String() for all ProtoMessage objects that uses reflection to create really nice output (among a ton of other really awesome things it does with protos). So at this point, I'm planning on just using protos for realzies instead of just trying to cherry-pick bits and pieces. This will certainly add a new barrier for contributors wanting to add new resource types, since they'll have to deal with protoc. Dealing with the structs generated by protoc is also a little different than normal Go structs, but it's not really too bad. This will do a number on our generated documentation, since protoc generates Get*() funcs for every field, but there's not much we can do about that.
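As a rough illustration of the reflection idea, and not goprotobuf's actual implementation, a String() method can walk the struct, follow non-nil pointers, and skip nil ones. The stringify helper and the reduced Repository are assumptions for this sketch:

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Repository is a reduced pointer-based struct used only for this sketch.
type Repository struct {
	ID          *int
	Name        *string
	Description *string
}

// stringify renders the exported fields of a struct, dereferencing non-nil
// pointers and leaving nil ones out of the output.
func stringify(v interface{}) string {
	val := reflect.Indirect(reflect.ValueOf(v))
	typ := val.Type()
	var parts []string
	for i := 0; i < val.NumField(); i++ {
		f := val.Field(i)
		if f.Kind() == reflect.Ptr {
			if f.IsNil() {
				continue // unset field
			}
			f = f.Elem()
		}
		parts = append(parts, fmt.Sprintf("%s:%#v", typ.Field(i).Name, f.Interface()))
	}
	return typ.Name() + "{" + strings.Join(parts, ", ") + "}"
}

// String makes Repository print its values instead of pointer addresses.
func (r Repository) String() string { return stringify(r) }

func main() {
	id, name := 1, "go-github"
	fmt.Println(Repository{ID: &id, Name: &name}) // Repository{ID:1, Name:"go-github"}
}
```

The sketch handles only flat structs with exported fields; nested structs, slices, and unexported fields would need more cases.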

For anyone interested in exactly what the generated structs will look like (and more importantly why they do what they do), read the goprotobuf README.

willnorris · 2013-08-05T20:56:15Z

So I've switched User and Repository to use protos in 8d2a1c9.

I've also gone ahead and merged that into my personal master branch just to see what the generated docs will look like (see here). They're certainly more verbose than what we had before, particularly because of all the new Get* funcs.

It is customary to have proto files in a separate package ending in "pb" or "proto", so we could move these to github.com/google/go-github/githubpb. That would at least clean up the generated docs a little bit, but I'm not sure that's really worth it. In general, I've really liked the simplicity of having everything in a single package.

The other sad part about moving to protos that I've seen is the lack of support for time.Time values. Since proto doesn't have a notion of a timestamp type, these just get encoded as strings. Converting that to a time.Time would have to be a separate step.

Given that the library simply doesn't work as-is (see original bug report), we have to do something, and this still seems like the best approach, despite the drawbacks.

/cc everyone who has contributed thus far in case anyone wants to weigh in before I move forward with this (@wlynch92, @imjasonh @sqs, @gedex, @stianeikeland, @beyang, @yml, @ktoso, @RC1140), since I'm not sure if you'll see this otherwise.

ktoso · 2013-08-06T00:20:15Z

Hi Will,
So I have to say I had been thinking about this a bit before... For starters it's clear we have to switch to option 3, no doubt about it.

When it comes to proto / no proto, I was initially leaning against using protos because as library devs it's always nice to "pull nothing in", then again when looking at the code without protobuf helpers today... "oh, that's pretty ugly". And it's so common anyway that I'd +1 just using protobuf just like you started already.

As for the pattern with "the pb repo", I've seen it around and think it makes sense to stick to this custom. An example of a project doing so is https://github.com/golang/groupcache (btw. it's awesome 👍), so even though it's overhead it seems the "right thing to do", I'd +1 that. :-)

Of course I'll try to help out again as much as I can - but argh, so much travel lately!
By the way, I'll be in SF / MTV next week, do you think we could meet up somehow? I'd love to do that :-)
If so, I'm @ktosopl on twitter.

PS: As for time.Time I think you meant serialize as int64 timestamps in proto, not strings?

sqs · 2013-08-06T00:44:17Z

I'd guess that most of the users of this library are only reading data from GitHub, not writing/updating data on GitHub. I think that any solution involving pointer fields for non-repeated values would increase the complexity of the API reader use case.

What about a 4th approach, where you must be explicit about the fields to include in the server request when writing/updating data? Then reading data from GitHub would remain as-is, and writing/updating might look like this:

newRepoInfo := &Repository{
  Description: "", // erase description
  Homepage: "https://example.com", // set new homepage
}
client.Repositories.Edit("user", "repo", newRepoInfo.WithFields("description", "homepage"))

Behind the scenes, func (r *Repository) WithFields(fields... string) map[string]interface{} would return a JSON object with only the named fields from the JSON representation of r. All of the API methods that write/update data would take map[string]interface{} (or similar), not the structs.
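One possible shape for that method, sketched under the assumption that the struct tags carry no omitempty (otherwise empty values would already be gone before filtering). It round-trips through encoding/json and keeps only the named keys:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Repository uses value fields without omitempty so that every field is
// present in the intermediate JSON. The field set is illustrative.
type Repository struct {
	Name        string `json:"name"`
	Description string `json:"description"`
	Homepage    string `json:"homepage"`
}

// WithFields returns only the named JSON fields of r. Unknown names are
// silently ignored in this sketch.
func (r *Repository) WithFields(fields ...string) map[string]interface{} {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	all := map[string]interface{}{}
	if err := json.Unmarshal(b, &all); err != nil {
		panic(err)
	}
	out := make(map[string]interface{}, len(fields))
	for _, f := range fields {
		if v, ok := all[f]; ok {
			out[f] = v
		}
	}
	return out
}

func main() {
	r := &Repository{Description: "", Homepage: "https://example.com"}
	body, err := json.Marshal(r.WithFields("description", "homepage"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body)) // {"description":"","homepage":"https://example.com"}
}
```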

The obvious downsides to this approach are that it's ad-hoc (protobuf is a superior general solution) and non-typesafe (the fields would be passed as strings). But it would retain the library's ease of use for API readers and may be simpler overall.

Just an idea...the protobuf solution would certainly work for us as well.

willnorris · 2013-08-06T02:36:44Z

> PS: As for time.Time I think you meant serialize as int64 timestamps in proto, not strings?

It depends on where in the API it is. Most of the timestamps are returned from GitHub as JSON strings (e.g. "2011-01-26T19:06:43Z"). Go's JSON encoder is smart enough to unmarshal those as native time.Time objects. However, there's no way to have protoc generate structs that include that, so they would have to be encoded simply as strings.

There are a couple of other places where times are returned as ints (notably, the new rate.reset), so yes you're absolutely right there... those would be stored in proto format as int64.

We could of course have some convenience methods for converting both of these (strings and ints) to time.Time, but it wouldn't be as seamless as what we have today.

gedex · 2013-08-06T18:12:27Z

@willnorris I can help with switching Gist and UserEmail to proto. Should I wait for your personal branch to be merged into the main repo first? Maybe we could create a branch in this repo, for instance switch_to_proto, for the proto migration?

willnorris · 2013-08-06T21:43:38Z

@gedex: hold off for now... another update coming soon.

willnorris · 2013-08-06T22:52:54Z

(continuing to update this bug with my progress, as this may be of use to others, or myself, in the future...)

So I'm running into more issues with how we handle Event payloads. Today we decode GitHub's "payload" field into a json.RawMessage, since the proper type really depends on what type of event it is. The Payload() func handles inspecting the event type and then unmarshalling the payload into the correct type. This works pretty well, but sadly we can't do this with protos because there is no way to express a "raw JSON message" type in our .proto file. The equivalent in proto-land would be the bytes type, but encoding/json won't unmarshal a JSON object into a byte slice.

In essence, using protoc to generate Go structs that we will only ever serialize as JSON gives us the worst of both worlds 😖. Or more accurately, it prevents us from using any available methods to solve this, because we're limited to the least common denominator between proto and JSON. In pure JSON, we'd handle this the way we are today with json.RawMessage, and this would be a non-issue. Conversely, if we were actually using the proto wire format, we could take advantage of proto's "unrecognized fields" support. Basically, we wouldn't declare a "payload" field on our Event message, and would then manually pluck the payload data out of the XXX_unrecognized field and parse that. That of course doesn't work because we're using encoding/json to unmarshal the API responses, and it knows nothing about XXX_unrecognized... instead it drops unrecognized fields on the floor.
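For reference, the pure-JSON approach with json.RawMessage can be sketched as below. The payload types are trimmed down, and this Payload also returns an error, which is a choice made for the sketch rather than the library's signature:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PushEvent and IssuesEvent are trimmed-down, illustrative payload types.
type PushEvent struct {
	Ref  string `json:"ref"`
	Size int    `json:"size"`
}

type IssuesEvent struct {
	Action string `json:"action"`
}

// Event keeps the payload as raw JSON until the event type is known.
type Event struct {
	Type       string          `json:"type"`
	RawPayload json.RawMessage `json:"payload"`
}

// Payload decodes the raw payload into the struct matching e.Type.
func (e *Event) Payload() (interface{}, error) {
	var p interface{}
	switch e.Type {
	case "PushEvent":
		p = &PushEvent{}
	case "IssuesEvent":
		p = &IssuesEvent{}
	default:
		return nil, fmt.Errorf("unknown event type %q", e.Type)
	}
	if err := json.Unmarshal(e.RawPayload, p); err != nil {
		return nil, err
	}
	return p, nil
}

func main() {
	data := []byte(`{"type":"PushEvent","payload":{"ref":"refs/heads/master","size":1}}`)
	var e Event
	if err := json.Unmarshal(data, &e); err != nil {
		panic(err)
	}
	p, err := e.Payload()
	if err != nil {
		panic(err)
	}
	fmt.Println(p.(*PushEvent).Ref) // refs/heads/master
}
```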

So two options I'll be exploring:

continue using protos, but with a custom UnmarshalJSON func on the Events type that handles the payload.
switch back over to my pointers branch, and continue investigating the use of pointer values in our hand-written structs. I may be able to pull over enough of the functionality from the goprotobuf library to make that work.

gedex · 2013-08-07T17:28:41Z

@willnorris This may be irrelevant to the current issue, but calling Payload() on an Event doesn't return the corresponding struct type. A type assertion is still needed to get the concrete event type. For instance, we can't do:

events, _, err := c.Activity.ListEventsPerformedByUser("gedex", true, nil)
if err != nil {
    fmt.Println(err)
    os.Exit(1)
}
for _, e := range events {
    if "PushEvent" == e.Type {
        ev := e.Payload()
        fmt.Printf("%+v\n", ev.Commits) // fails: ev is an interface value, so Commits is not accessible
    }
}

Using type assertion:

for _, e := range events {
    if "PushEvent" == e.Type {
        ev := e.Payload().(*github.PushEvent)
        fmt.Printf("%+v\n", ev.Commits)
    }
}

Maybe I used it improperly on first example?

willnorris · 2013-08-07T17:31:52Z

@gedex yeah, you're right that you still need to do a type assertion. But Payload() will at least unmarshal the JSON into the right struct so that all the data is there. I'd be interested in other ways we could make this even easier if you have any ideas.
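One common way to handle several payload types without repeating assertions is a type switch. A small sketch, using made-up, trimmed-down payload types and a hypothetical describe function:

```go
package main

import "fmt"

// PushEvent and IssuesEvent are trimmed-down, illustrative payload types.
type PushEvent struct {
	Commits []string
}

type IssuesEvent struct {
	Action string
}

// describe uses a type switch to recover the concrete payload type from the
// interface value that a Payload-style method returns.
func describe(payload interface{}) string {
	switch p := payload.(type) {
	case *PushEvent:
		return fmt.Sprintf("push with %d commit(s)", len(p.Commits))
	case *IssuesEvent:
		return "issue " + p.Action
	default:
		return "unhandled payload type"
	}
}

func main() {
	fmt.Println(describe(&PushEvent{Commits: []string{"abc123"}})) // push with 1 commit(s)
	fmt.Println(describe(&IssuesEvent{Action: "opened"}))          // issue opened
	fmt.Println(describe(42))                                      // unhandled payload type
}
```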

Like a51d6b4, this change makes me sad, mainly because it is a breaking change for all clients, and makes common tasks like reading data out of structs slightly more work, with no direct benefit. Notably, developers will need to make sure to check for nil values before trying to dereference these pointers. Sadly, the change is still necessary, as is more fully explained in issue #19. We can make the nil pointer checks a little easier by adding some Get* funcs like goprotobuf does.

I spent a lot of time over the last few weeks exploring this change (switching fields to pointers) versus the much larger change of using protocol buffers for all GitHub data types. While the goprotobuf library is very mature and feature-rich (it's used heavily inside of Google), it's the wrong tool for this task, since we're not actually using the proto wire format. While it does address the immediate concern in #19, it makes way too many other things terribly awkward.

One of the biggest drawbacks of this change is that it will make the string output from fmt.Printf("%v") next to useless, since all pointer values are displayed as their memory address. To handle that, I'll be writing a custom String() function for these structs that is heavily inspired by goprotobuf and internals from Go's fmt package.

willnorris · 2013-08-20T20:52:39Z

fixed in 3072d06 and 084b5991154b78abe559f04029a66d41a109cbd0. This will break all existing users of the library. Again. 😞

In the end, I decided to use simple pointers for struct fields, and then migrate over the convenience methods from goprotobuf that were helpful for us. I'll open a new bug to look into generating convenience Get* methods similar to goprotobuf. In the meantime, users of the library will need to do their own nil checks.
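A nil-safe accessor in the goprotobuf style could look like the sketch below. GetDescription is shown as an illustration of the pattern, not as a method the library provided at the time:

```go
package main

import "fmt"

// Repository is a reduced pointer-based struct used only for this sketch.
type Repository struct {
	Description *string `json:"description,omitempty"`
}

// GetDescription returns the description, or "" if r or the field is nil.
func (r *Repository) GetDescription() string {
	if r == nil || r.Description == nil {
		return ""
	}
	return *r.Description
}

func main() {
	var missing *Repository
	fmt.Printf("%q\n", missing.GetDescription()) // ""

	d := "GitHub API client"
	fmt.Printf("%q\n", (&Repository{Description: &d}).GetDescription()) // "GitHub API client"
}
```

Because the receiver itself is checked for nil, callers can chain through optional nested values without a separate check at each step.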

I did really like Quinn's suggestion, but don't want to lose the type safety. We could pass a separate fieldMask []string parameter, which would keep the type safety, but after talking to a number of other engineers here at Google, I decided to go with the pointer approach. It seems to be the most idiomatic way to address this.

I'm still not completely happy with the final result, but I think it's the least bad option, and this issue has stalled other progress for far too long. If you find new problems this introduces, please open new issues for them.

c4milo · 2014-05-27T14:25:36Z

Well, this certainly sucks. How about sending a patch upstream, to the JSON marshaller, so we can make the decision about omitting empty fields or not upon every marshaling?

omitempty := true
json.Marshal(foo, omitempty)

willnorris · 2014-05-27T14:59:44Z

I'm not sure I understand what you're suggesting. The issue isn't that we need the flexibility to specify whether empty fields are omitted or not at the time of marshaling. The issue is that for a given (non-pointer) field with a zero value, we don't know if it's the zero value because it was simply initialized that way, or if the developer explicitly set it to that. Pointers remove that ambiguity.

If anything, you would need the ability to specify whether empty fields should be omitted on a per-field basis, which is effectively what was suggested above.

Given how Go's zero values work, I actually don't know of a better way to handle this, so I don't think there is really anything to be patched upstream.
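The ambiguity can be shown directly. In this sketch, ValueRepo, PointerRepo, and explicitlySet are names invented for the comparison:

```go
package main

import "fmt"

// ValueRepo uses a plain value field; PointerRepo uses a pointer field.
type ValueRepo struct {
	Description string
}

type PointerRepo struct {
	Description *string
}

// explicitlySet reports whether the caller assigned Description at all.
func explicitlySet(r PointerRepo) bool {
	return r.Description != nil
}

func main() {
	// With a value field, "never touched" and "set to empty" are the same value.
	fmt.Println(ValueRepo{} == ValueRepo{Description: ""}) // true

	// With a pointer field, the two cases differ.
	empty := ""
	fmt.Println(explicitlySet(PointerRepo{}))                    // false
	fmt.Println(explicitlySet(PointerRepo{Description: &empty})) // true
}
```

No marshaler option can separate the first two values, because by the time they reach the encoder they are identical.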

heidsoft · 2016-04-09T03:04:33Z

good,

lbdremy · 2016-06-29T08:43:00Z

@willnorris @sqs

> I did really like Quinn's suggestion, but don't want to lose the type safety. We could pass a separate fieldMask []string parameter, which would keep the type safety,

Could you elaborate on this topic?

RussellLuo · 2022-05-31T09:52:55Z

Hi there, sorry for replying to an old issue.

I encountered the same problem as described in this thread. After some investigation, I think there might be a possible workaround for the problem with handling empty values from JSON:

Decode JSON into map[string]interface{} first.
Then use the above map[string]interface{} as a field mask (like what protobuf provides).
Finally, we can decode map[string]interface{} into a struct by leveraging some library (such as mapstructure).

See https://go.dev/play/p/aKDfn4HQLxM for a runnable example.
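The steps above can be approximated with the standard library alone. This sketch is not the linked playground code: it swaps mapstructure for a second json.Unmarshal call, and FieldMask, Has, and decode are names chosen here for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Repository keeps plain value fields; presence is tracked separately.
type Repository struct {
	Description string `json:"description"`
	Homepage    string `json:"homepage"`
}

// FieldMask records which top-level JSON keys were present in the input.
type FieldMask map[string]json.RawMessage

// Has reports whether key appeared in the decoded JSON.
func (m FieldMask) Has(key string) bool {
	_, ok := m[key]
	return ok
}

// decode parses data twice: once into a FieldMask to capture presence, and
// once into the struct for typed values.
func decode(data []byte) (Repository, FieldMask, error) {
	var mask FieldMask
	if err := json.Unmarshal(data, &mask); err != nil {
		return Repository{}, nil, err
	}
	var r Repository
	if err := json.Unmarshal(data, &r); err != nil {
		return Repository{}, nil, err
	}
	return r, mask, nil
}

func main() {
	r, mask, err := decode([]byte(`{"description":""}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(mask.Has("description"), r.Description == "") // true true
	fmt.Println(mask.Has("homepage"))                         // false
}
```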

Advantages:

No need for developers to break the struct definitions by using pointers
No need for library users to do nil checks

Disadvantages:

Need to decode twice (JSON -> map[string]interface{} -> struct)
Introduces a new dependency on a third-party library (i.e. mapstructure)

What do you think?

gmlewis · 2022-05-31T14:55:35Z

> What do you think?

I'm concerned that this might be quite disruptive at this point with so many users of this repo, making a change like this 9 years later. It seems to me that this would be a major retrofit to users of this client library and a completely different style of usage (using the Has method instead of using nil checks).

It might make sense to create a fork and try out these ideas and see how things go.

RussellLuo · 2022-06-01T03:11:52Z

Thank you for the kind reply! I agree with you that it's unwise to try to change this repo.

I have just turned the idea into a little library called fieldmask. As you suggested, I plan to try this library in real-world REST APIs and see how it will go.

Thanks again!

gmlewis · 2022-06-01T03:26:32Z

Excellent! Feel free to report back here and let us know how the experiment goes. Thanks.

willnorris mentioned this issue Jul 31, 2013

Implemented HooksService (w/tests) #23

Closed

willnorris mentioned this issue Aug 19, 2013

Add Privacy, Homepage, Default Branch to Repository Model #38

Closed

willnorris added a commit that referenced this issue Aug 19, 2013

move package docs to doc.go

f4dde7d

our package docs are too long yet to really need to be in their own file, but I'd like to flesh them out a bit more, particularly once #19 is resolved.

willnorris closed this as completed Aug 20, 2013

willnorris mentioned this issue Aug 20, 2013

add Getter methods for data structures #45

Closed

ktoso pushed a commit to ktoso/go-github that referenced this issue Aug 25, 2013

move package docs to doc.go

a59a635

our package docs are too long yet to really need to be in their own file, but I'd like to flesh them out a bit more, particularly once google#19 is resolved.

willnorris mentioned this issue Jan 31, 2014

Using pointers all over the places? #79

Closed

bradfordcp mentioned this issue Jun 5, 2014

Consider switching Channel struct to use pointer fields bradfordcp/teamspeak#5

Open

metakeule mentioned this issue Dec 8, 2014

proposal: spec: create builtin interfaces, let builtin types fulfill them golang/go#8303

Closed

yisiper mentioned this issue Mar 28, 2015

Struct field json:"xxx,omitempty" return nil when reference #173

Closed

jasdel mentioned this issue Sep 2, 2015

aws.String("xxx") ? aws/aws-sdk-go#363

Closed

ma6174 mentioned this issue Nov 21, 2015

【翻译】Go语言, REST APIs 和指针 ma6174/blog#18

Open

timoreimann mentioned this issue Apr 4, 2016

Depointerize maps and slices gambol99/go-marathon#138

Merged

neilotoole mentioned this issue Apr 13, 2016

[golang] Default to pointers for non-primitive types swagger-api/swagger-codegen#2330

Open

dmitshur mentioned this issue Jun 19, 2016

add support for new repository invitations #373

Merged

willnorris mentioned this issue Jul 6, 2016

Add support to update tasks. tambet/go-asana#3

Merged

ojongerius mentioned this issue Aug 25, 2016

Consider updating library to use pointers for optional fields zorkian/go-datadog-api#56

Closed

wanghq mentioned this issue Sep 2, 2016

How to differentiate unset value and zero value in protobuf3? golang/protobuf#225

Closed

ojongerius mentioned this issue Jan 31, 2017

WIP GH-56 Refactor to use pointers for all fields. zorkian/go-datadog-api#83

Merged

dmitshur mentioned this issue Jan 31, 2017

Policy on using pointers/values in structs for receiving data from GitHub servers. #537

Closed

eginez mentioned this issue Nov 30, 2017

Primitives should not be pointers oracle/oci-go-sdk#9

Closed

morlay mentioned this issue Dec 22, 2017

空值检查和非 string 值的 toString() aliyun/alibaba-cloud-sdk-go#6

Closed

ckeyes88 mentioned this issue Jul 23, 2018

Unable to nullify string attribute using the update function... getconversio/go-shopify#104

Closed

3 tasks

andrewhoff mentioned this issue Sep 7, 2018

Unable to nullify string attribute using the update function... bold-commerce/go-shopify#4

Open

3 tasks

This was referenced Feb 13, 2020

Why are all the responses using pointers #1429

Closed

Implement support for actions workflow jobs #1421

Merged

gmlewis mentioned this issue Feb 22, 2020

Switch Workflow to use pointers #1437

Merged

john-m-liu mentioned this issue Mar 2, 2021

Not differentiating between blank string and null causes difficulties in merging pull requests #1815

Closed

adrien-barret pushed a commit to adrien-barret/go-github that referenced this issue Jan 15, 2024

Merge pull request google#19 from bradleyfalzon/apache20

bf9fc26

license: change from Unlicense to Apache 2.0
