Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix nested Message objects in to_h #4962

Closed
wants to merge 1 commit into from
Closed

Fix nested Message objects in to_h #4962

wants to merge 1 commit into from

Conversation

oggy
Copy link

@oggy oggy commented Jul 24, 2018

For example, for this proto file:

syntax = "proto3";
import "google/protobuf/struct.proto";

message Parent {
  map<string, Child> children = 1;
}

message Child {
  string name = 1;
}

We used to have:

irb(main):001:0> Parent.new(children: {'a' => Child.new(name: 'a')}).to_h
=> {:children=>{"a"=>{:name=>"a"}}}

Now:

irb(main):001:0> Parent.new(children: {'a' => Child.new(name: 'a')}).to_h
=> {:children=>{"a"=><Child: name: "a">}}

It appears this changed in #2847 / #396, but it's unclear what the intent
was with these lines, so I'm assuming this was an unintended side-effect
(it's not needed to solve #1761).

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

@oggy
Copy link
Author

oggy commented Jul 24, 2018

CLA signed.

@googlebot
Copy link

CLAs look good, thanks!

@TeBoring
Copy link
Contributor

@oggy Does your code depends on the actual content of to_h? Is there anything broken?

@haberman
Copy link
Member

The current behavior is intentional, there is even a test for it:

https://github.com/google/protobuf/blob/master/ruby/tests/basic.rb#L1026

What is the rationale for changing this? Is there prior art in Ruby that says this should be shallow instead of deep?

@oggy
Copy link
Author

oggy commented Jul 25, 2018

Thanks for the quick responses!

@TeBoring We do have code that uses to_h, and expected the values of the hash to contain nested Message objects. We can rewrite it without too much difficulty, but I found the change in behavior surprising, and saw no mention of backwards-incompatible behavior in CHANGES.txt, so I assumed it wasn't intentional.

@haberman Thanks for the test pointer. I would say there is prior art. Ruby's built-in Struct, e.g., has a shallow to_h.

irb(main):001:0> Parent = Struct.new(:child)
=> Parent
irb(main):002:0> Child = Struct.new(:name)
=> Child
irb(main):003:0> parent = Parent.new(Child.new(name: 'a'))
=> #<struct Parent child=#<struct Child name={:name=>"a"}>>
irb(main):004:0> parent.to_h
=> {:child=>#<struct Child name={:name=>"a"}>}

If deep to_h is desired, I would have gone for either adding a deep to_h under a different name, or support an option like to_h(deep: true). Happy to push code for either of these or something else -- let me know.

@TeBoring
Copy link
Contributor

TeBoring commented Jul 25, 2018

@oggy Thanks for the explanation. I thought to_h is for usage like hash map element. So we juist need to guarantee two messages have the same to_h result if and only if these two messages are the same. Does that make sense to you?

For example, for this proto file:

```
syntax = "proto3";
import "google/protobuf/struct.proto";

message Parent {
  map<string, Child> children = 1;
}

message Child {
  string name = 1;
}
```

We used to have:

```
irb(main):001:0> Parent.new(children: {'a' => Child.new(name: 'a')}).to_h
=> {:children=>{"a"=>{:name=>"a"}}}
```

Now:

```
irb(main):001:0> Parent.new(children: {'a' => Child.new(name: 'a')}).to_h
=> {:children=>{"a"=><Child: name: "a">}}
```

It appears this changed in #2847/#396, but it's unclear what the intent
was with these lines, so I'm assuming this was an unintended side-effect
(it's not needed to solve #1761).
@oggy
Copy link
Author

oggy commented Jul 30, 2018

@TeBoring This is almost true. Hashes compare elementwise, and a Message#== is defined that compares the type & fields, so any child Messages need to match according to ==. The "almost" is because to_h discards the type of the receiving Message, so the "only if" part doesn't quite hold. But this is ok, right?

I've updated the test that @haberman mentioned.

@haberman
Copy link
Member

@oggy I like the idea of having a parameter to control whether it is deep or not.

For the parameter default, I would be in favor of keeping it as it is, unless there is an argument that the shallow behavior is clearly better. I hear your point that we changed the behavior of this a year ago. But that was an intentional change that was a part of a bugfix. The current behavior has been in place for over a year now, so I don't think we should change it now unless it is obviously wrong.

I hear your point about Struct. But to me a key difference is that Struct doesn't know when some of its members might also be Struct. Protobuf does know this. So the comparison doesn't seem exact.

@TeBoring
Copy link
Contributor

TeBoring commented Aug 6, 2018

Closed this for now, since previous change was intended.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants