New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: encoding/json: add "readonly" tag #28143

Open
Carpetsmoker opened this Issue Oct 11, 2018 · 5 comments

Comments

Projects
None yet
5 participants
@Carpetsmoker

Carpetsmoker commented Oct 11, 2018

Previous discussion: #19423

Note: this proposal focuses on the encoding/json package, but the same could be
applied to other encoding/* packages, especially encoding/xml.

Problem

It is currently hard to marshal json while preventing those fields being set to
arbitrary values by outside input. The - tag prevents both unmarshalling and
marshalling.

This is a common requirement in for example REST APIs:

type User struct {
	ID        int       `json:"id"`
	Name      string    `json:"name"`
	Email     string    `json:"email"`
	CreatedBy int       `json:"createdBy"`
	CreatedAt time.Time `json:"createdAt"`

	DefaultUnmarshal bool `json:"-"
}

When showing the data to the user (e.g. GET /user/1) we want to marshal the
ID, CreatedBy, and CreatedAt fields; but we don't want to allow users to
set these fields on create or update endpoints (e.g. PUT /user/1).

As far as I can think of, right now it's hard to write a generic fully correct
solution for this:

  1. It's possible to implement the json.Unmarshaler interface like so:

     func (u *User) UnmarshalJSON(data []byte) error {
     	type Alias User
     	var alias Alias
     	err := json.Unmarshal(data, &alias)
     	if err != nil {
     		return err
     	}
    
     	if u.DefaultUnmarshal {
     		*u = User(alias)
     		return nil
     	}
    
     	// Unmodifiable fields
     	alias.ID = u.ID
     	alias.CreatedAt = u.CreatedAt
     	alias.CreatedBy = u.CreatedBy
    
     	*u = User(alias)
     	return nil
     }
    

    While this works, it's verbose and difficult to generalize without
    reflection complexity. Every type needs to have its own UnmarshalJSON()
    (and/or UnmarshalXML(), etc.)

    A second problem is that in some cases you do want to unmarshal the readonly
    fields, for example from tests:

     func TestUser(t *testing.T) {
     	body := callAPI()
    
     	var u user
     	u.DefaultUnmarshal = true
     	json.Unmarshal(body, &u)
    
     	if !expectedUser(u) {
     		t.Error()
     	}
     }
    

    Hence the DefaultUnmarshal parameter, which needs to be exported to allow
    setting them in other packages.

  2. rsc proposed creating a custom ReadOnlyString with a no-op
    UnmarshalJSON() in #19423. this works well and is not unreasonable, but if
    you want to use e.g. sql.NullString; you would need to add a custom
    ReadOnlyNullString as well. Using it in combination with the
    github.com/guregu/null package means
    creating two extra types (readOnlyNullString and readOnlyZeroString).

    It also doesn't allow easy setting readonly fields from tests.

  3. A third option is to unmarshal the JSON to interface{}, modify the
    data based on the struct tags, marshal back to JSON, and then unmarshal in to
    the type you want.

    I have a working implementation of this, and it works for the simple cases.
    But the "Unmarshal -> Marshal -> Unmarshal" dance seems needlessly
    complicated and inefficient, and dealing with arbitrary JSON is rather
    tricky/verbose in Go.

  4. Another solution (which is probably the most common) is to simply use
    different structs for different purposes. I am not hugely in favour of this,
    as it leads to a lot of duplication.

Proposed solution: readonly struct tag

With the readonly tags added, the above struct would look like:

type User struct {
	ID        int       `json:"id,readonly"`
	Name      string    `json:"name"`
	Email     string    `json:"email"`
	CreatedBy int       `json:"createdBy,readonly"`
	CreatedAt time.Time `json:"createdAt,readonly"`
}

Regular Unmarshal() will not set any of the readonly fields; they are
silently ignored:

json.Unmarshal(data, &u)

An option for json.Decoder can be added to allow setting the readonly
fields, similar to DisallowUnknownFields():

d := json.NewDecoder(r)
d.AllowReadonlyFields()
d.Decode(&u)

The DisallowUnknownFields() option can be used to error out when readonly
fields are attempted to be set (although this could potentially also be a new
option).


In the previous discussion rsc mentioned that this feature would not "pull its
weight" as it's not common enough of a use case. It's true that it's less
common of a use case, but I don't believe it's terrible uncommon, either.

Implementing this correctly by users is fairly complex, and adding this feature
to the encoding/json package seems – unless I am mistaken – quite simple to the
point of being almost trivial. It will of course increase maintenance burden in
the future, but personally, I think it's a fair trade-off.

Prototype implementation

To test the feasibility and complexity of this change I wrote an implementation
of this proposal, which seems to work well. I can make a CL with an expanded
version of this if this proposal is received well.

diff --git i/src/encoding/json/decode.go w/src/encoding/json/decode.go
index fd2bf92dc2..e8e2cd7486 100644
--- i/src/encoding/json/decode.go
+++ w/src/encoding/json/decode.go
@@ -273,6 +273,7 @@ type decodeState struct {
	savedError            error
	useNumber             bool
	disallowUnknownFields bool
+	allowReadonlyFields   bool
 }
 
 // readIndex returns the position of the last byte read.
@@ -695,6 +696,7 @@ func (d *decodeState) object(v reflect.Value) error {
		// Figure out field corresponding to key.
		var subv reflect.Value
		destring := false // whether the value is wrapped in a string to be decoded first
+		readOnly := false // ,readonly tag
 
		if v.Kind() == reflect.Map {
			elemType := t.Elem()
@@ -719,6 +721,9 @@ func (d *decodeState) object(v reflect.Value) error {
			if f != nil {
				subv = v
				destring = f.quoted
+				if !d.allowReadonlyFields {
+					readOnly = f.readOnly
+				}
				for _, i := range f.index {
					if subv.Kind() == reflect.Ptr {
						if subv.IsNil() {
@@ -757,7 +762,9 @@ func (d *decodeState) object(v reflect.Value) error {
		}
		d.scanWhile(scanSkipSpace)
 
-		if destring {
+		if readOnly {
+			_ = d.value(reflect.Value{})
+		} else if destring {
			q, err := d.valueQuoted()
			if err != nil {
				return err
diff --git i/src/encoding/json/decode_test.go w/src/encoding/json/decode_test.go
index b84bbabfcd..74b5c7eccc 100644
--- i/src/encoding/json/decode_test.go
+++ w/src/encoding/json/decode_test.go
@@ -2239,3 +2239,46 @@ func TestUnmarshalPanic(t *testing.T) {
	Unmarshal([]byte("{}"), &unmarshalPanic{})
	t.Fatalf("Unmarshal should have panicked")
 }
+
+func TestReadonly(t *testing.T) {
+	type nested struct {
+		RO string `json:"ro,readonly"`
+		RW string `json:"rw"`
+	}
+
+	type foo struct {
+		RO     string `json:"ro,readonly"`
+		RW     string `json:"rw"`
+		Nested nested `json:"nested"`
+	}
+
+	f := foo{"hello", "hello", nested{"hello", "hello"}}
+	data := `{"ro": "XXXXX", "rw": "XXXXX", "nested": {"ro": "XXXXX", "rw": "XXXXX"}}`
+
+	t.Run("unmarshal", func(t *testing.T) {
+		want := foo{"hello", "XXXXX", nested{"hello", "XXXXX"}}
+		err := Unmarshal([]byte(data), &f)
+		if err != nil {
+			t.Fatal(err)
+		}
+
+		if !reflect.DeepEqual(f, want) {
+			t.Errorf("\ngot:  %#v\nwant: %#v", f, want)
+		}
+	})
+
+	t.Run("allowReadonlyFields", func(t *testing.T) {
+		want := foo{"XXXXX", "XXXXX", nested{"XXXXX", "XXXXX"}}
+		d := NewDecoder(strings.NewReader(data))
+		d.AllowReadonlyFields()
+		err := d.Decode(&f)
+
+		if err != nil {
+			t.Fatal(err)
+		}
+
+		if !reflect.DeepEqual(f, want) {
+			t.Errorf("\ngot:  %#v\nwant: %#v", f, want)
+		}
+	})
+}
diff --git i/src/encoding/json/encode.go w/src/encoding/json/encode.go
index f10124e67d..944be253eb 100644
--- i/src/encoding/json/encode.go
+++ w/src/encoding/json/encode.go
@@ -1040,6 +1040,7 @@ type field struct {
	typ       reflect.Type
	omitEmpty bool
	quoted    bool
+	readOnly  bool
 
	encoder encoderFunc
 }
@@ -1156,6 +1157,7 @@ func typeFields(t reflect.Type) []field {
						index:     index,
						typ:       ft,
						omitEmpty: opts.Contains("omitempty"),
+						readOnly:  opts.Contains("readonly"),
						quoted:    quoted,
					}
					field.nameBytes = []byte(field.name)
diff --git i/src/encoding/json/stream.go w/src/encoding/json/stream.go
index 7d5137fbc7..14463f6842 100644
--- i/src/encoding/json/stream.go
+++ w/src/encoding/json/stream.go
@@ -41,6 +41,8 @@ func (dec *Decoder) UseNumber() { dec.d.useNumber = true }
 // non-ignored, exported fields in the destination.
 func (dec *Decoder) DisallowUnknownFields() { dec.d.disallowUnknownFields = true }
 
+func (dec *Decoder) AllowReadonlyFields() { dec.d.allowReadonlyFields = true }
+
 // Decode reads the next JSON-encoded value from its
 // input and stores it in the value pointed to by v.
 //

@gopherbot gopherbot added this to the Proposal milestone Oct 11, 2018

@gopherbot gopherbot added the Proposal label Oct 11, 2018

@bontibon

This comment has been minimized.

Contributor

bontibon commented Oct 13, 2018

Similar #26636

@deanveloper

This comment has been minimized.

deanveloper commented Oct 15, 2018

Quickly skimming this without reading into details, it's confusing if the readonly applies to the struct attribute (Do not write this into JSON) or the JSON tag (Do not write this into the struct). Perhaps it'd be good to call it something like no-unmarshal or marshalonly? (Is a no-marshal/unmarshalonly also worth adding?)

@Carpetsmoker

This comment has been minimized.

Carpetsmoker commented Oct 16, 2018

Intuitively, a json:"foo,readonly" tag seems to communicate intent fairly clear to me, whereas the no-unmarshal or marshalonly don't really. I'm not sure if I follow what the confusing part is? The tag is json:"foo,readonly", so it seems to me it follows naturally that it only applies to the JSON (un)marshalling?

I don't really have very string opinions at any rate. At this point I think it's more important to discus the general concept and behaviour; the exact name can be changed easily.

Is a no-marshal/unmarshalonly also worth adding?

Are there good use cases for that @deanveloper? I can't really think of any myself, but that could be a failure of my imagination.

I think it's important to identify practical real-world use cases that can't easily be implemented using exciting features, instead of just saying "hey, it might be nice to have ..."

For a readonly tag, I can think of a few possible use cases. My original post described one (reading user input). A similar one might be reading data from an API:

foo := Foo{PrivateField: "dont-change"}
readFromAPI(&foo)

For reasons of future-proofing and security it would be desirable to tag PrivateField as readonly.

@deanveloper

This comment has been minimized.

deanveloper commented Oct 16, 2018

Intuitively, a json:"foo,readonly" tag seems to communicate intent fairly clear to me

I already explained my communication issue, remember that JSON is used in multiple areas, not just in data transmission but also in data storage and data display (which are both arguably "data transmission" but with the JSON ending up in a file/screen rather than a structure)

The communication issue in a "readonly" tag comes in when I see it outside the context of a data transmission. A simple example where "readonly" is confusing would be where JSON is used as a storage system, where "reading" is the Unmarshal step, you may think it's safe to read config values into "readonly" tags, especially if you have no intentions of changing your config during execution. This would result in no values being read, as what "readonly" really does is block the Unmarshal step.

Although you are correct that naming can be judged later.

Are there good use cases for that @deanveloper? I can't really think of any myself, but that could be a failure of my imagination.

This was more meant of a random thought and not really meant to add to our much to the conversation, although it may be useful for a similar case, but where the Unmarshal/Marshal mean Write/Read rather than vice versa.

I think it's important to identify practical real-world use cases that can't easily be implemented using exciting features, instead of just saying "hey, it might be nice to have ..."

I definitely agree here. If there's no use, then by all means we don't need a "no-marshal"

@rsc rsc added the Proposal-Hold label Oct 17, 2018

@rsc

This comment has been minimized.

Contributor

rsc commented Oct 17, 2018

On hold for JSON sweep.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment