Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FieldMask utilities to Message types #1505

Open
wants to merge 32 commits into
base: main
Choose a base branch
from

Conversation

pouyayarandi
Copy link

@pouyayarandi pouyayarandi commented Nov 18, 2023

Summary

This PR adds a bunch of utilities that enable message types to modify some of their fields (by Google well-known type, FieldMask). Here are added APIs:

extension Message {

  /// Checks whether the given path is valid for Message type.
  public static func isPathValid(_ path: String) -> Bool

  /// Merges fields specified in a FieldMask into another message.
  public mutating func merge(with source: Self, fieldMask: Google_Protobuf_FieldMask) throws

  /// Removes from 'message' any field that is not represented in the given FieldMask.
  public mutating func trim(fieldMask: Google_Protobuf_FieldMask) -> Bool
}

extension Google_Protobuf_FieldMask {

  /// Initiates a field mask with all fields of the message type.
  public init<M>(allFieldsOf messageType: M.Type)

  /// Initiates a field mask from some particular field numbers of a message.
  public init<M>(fieldNumbers: [Int], of messageType: M.Type) throws

  /// Adds a path to FieldMask after checking whether the given path is valid.
  public mutating func addPath<M>(_ path: String, of messageType: M.Type) throws

  /// Converts a FieldMask to the canonical form.
  public var canonical: Google_Protobuf_FieldMask { get }

  /// Creates a union of two FieldMasks.
  public func union(_ mask: Google_Protobuf_FieldMask) -> Google_Protobuf_FieldMask

  /// Creates an intersection of two FieldMasks.
  public func intersect(_ mask: Google_Protobuf_FieldMask) -> Google_Protobuf_FieldMask

  /// Creates a FieldMasks with paths of the original FieldMask that does not included in mask.
  public func subtract(_ mask: Google_Protobuf_FieldMask) -> Google_Protobuf_FieldMask

  /// Returns true if the given path is covered by the FieldMask.
  public func contains(_ path: String) -> Bool

  /// Checks whether the given FieldMask is valid for type M.
  public func isValid<M>(for messageType: M.Type) -> Bool
}

Implementation

This feature is empowered by _ProtoNameProviding to map proto field numbers to names. This enables us to have the aforementioned inits of FieldMask. Also, for masking a message two Decoders have been implemented: GetPathDecoder and SetPathDecoder. The first one gets the value of a field by its path, the latter sets a value to a field of the message by the path. In override function, value will be captured by GetPathDecoder and then, it will be set by the SetPathDecoder.

Use case

Assuming we have a caching system which prevent some fields of a response from recalculation. Backend gives us a proto message and tells that some of its fields should be cached (i.e. field number 1). Next time, when we are going to make a request, we evaluate our client-side's cache and send the cached field numbers in the header of the request (i.e. we send back 1 as a field number that its value is already cached). As a result, backend response with leaving those fields empty (or filled with default values). And in client-side, we need to modify those field numbers with cached values.

For this purpose, first of all, we can store the original backend response somewhere, and in the rest of requests we can override the responses with the cached proto message:

// Defines a mask field which includes particular paths based on field numbers
let mask = Google_Protobuf_FieldMask(fieldNumbers: cachedFieldNumbers, of: MyMessage.self)

// Get response by telling backend not to recalculate those fields and backend leave those fields unset
var response: MyMessage = await network.getResponse(exclude: cachedFieldNumbers)

// Get cached response from storage
let cachedResponse: MyMessage = cache.getCache()

// Merge those fields of response with the cached one.
try response.merge(to: cachedResponse, fieldMask: mask)

Testing

Unit tests have been added for the new feature, and all tests have passed.

Considerations

  • Merge options and trim options have not been implemented as they require some generated code.
  • Dot pattern is supported in paths. Path a.b refers to field b of the field a in the message.
  • Merges will occur in order of FieldMask's paths.

@thomasvl
Copy link
Collaborator

At quick glance you seem to have two distinct proposals in here (they don't seem connected in implementation), if that's the case, I'd suggest splitting this into two PRs so there can be easier discussion on them independently. It likely would help to also provide some examples of what usecases you are trying to enable.

@pouyayarandi pouyayarandi changed the title Add proto reflection Add applying methods generator Nov 20, 2023
@pouyayarandi
Copy link
Author

Hi @thomasvl, the protobuf mirroring feature has been deleted from this PR, so the current one is a proposal just for applying. Also the use case section is added to the description.

@thomasvl
Copy link
Collaborator

Handing off to Apple folks.

Idle thought - Since Decoder is public, could this entirely implemented via a custom decoder that feeds field numbers and then the values via the decoding apis? Then there wouldn't have to be any more codegen to support it? The library code itself would likely be bigger. But if that is possible, this could be entirely done as a layer on top of SwiftProtobuf and not actually need library additions.

Copy link
Contributor

@FranzBusch FranzBusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far we have not exposed any of the actual underlying proto definition in our generated code such as proto message name, field numbers or similar AFAIK. This would be diverting from this. Doesn't mean that I am against that but @thomasvl and @tbkka you did most of the work on SwiftProtobuf and are probably better at judging this.

I am also wondering if this is something that other languages do provide.

@pouyayarandi pouyayarandi changed the title Add applying methods generator Add applying methods Nov 27, 2023
@tbkka
Copy link
Contributor

tbkka commented Nov 27, 2023

I just saw the new implementation as a standalone decoder. That has the big advantage of not requiring any additional code generation, so it won't increase generated code size and can therefore be available by default everywhere. This is the right way to implement this functionality.

The real question, then, is whether this new functionality is right for this particular library. As a rule, we do try to stick closely to the protobuf specification and precedents set by other implementations. So I have a couple of questions to help better understand how this proposal relates to other prior art:

  • How would you do this in the C++ protobuf implementation? Java? Python? Go? C#?

  • Can the FieldMask concept that's in the protobuf specification be used for this kind of application? (We have never fully implemented FieldMask in SwiftProtobuf; if that could provide this support, then it might be worth implementing that instead.)

Related: Your ProtobufMirror idea should be implementable as a new encoder, again without requiring any new code generation.

@pouyayarandi
Copy link
Author

@tbkka Thanks for your response.

  • How would you do this in the C++ protobuf implementation? Java? Python? Go? C#?

The concept is the same in all the languages you've mentioned but when it comes to implementation, we would see different APIs based on the nature of each language. As I found in most of them, they use Descriptor and Reflection. Here are some examples:

  • Can the FieldMask concept that's in the protobuf specification be used for this kind of application? (We have never fully implemented FieldMask in SwiftProtobuf; if that could provide this support, then it might be worth implementing that instead.)

Maybe, but AFAIK, it just helps us to tell the backend which fields should be excluded, when the client receives the response we have to replace somehow a bunch of values with those ones we've cached before. Here we need an API to set values by field numbers.

Related: Your ProtobufMirror idea should be implementable as a new encoder, again without requiring any new code generation.

The ProtobufMirror feature had been implemented by visitors and it had no generated code.

@thomasvl
Copy link
Collaborator

I believe FieldMask usually also has helpers that work with copying things from one message to another message - "Copy just these fields". And by having it within the library it would have access to the field names data that currently isn't exposed.

To really make these Apply methods useful would likely be figuring out how to include the field numbers in the generated files for folks. My initial thought would be an enum doesn't really help because you want Int values and you don't want to keep having to do rawValue. So likely just Int constants scoped in some way. But unlike a C based languages where an enum of just using #define is zero runtime cost, I'm not sure what would work for Swift to make the field number constant not add to binary size since would be public and thus if things get linked into a framework they would all become exported symbols.

@tbkka
Copy link
Contributor

tbkka commented Nov 28, 2023

Here we need an API to set values by field numbers.

There are a number of inter-related features here that I have in mind. Since many of these can be implemented in terms of the others, it's worth thinking about them together to figure out how we want to provide them:

  • Descriptors. SwiftProtobuf does not currently embed descriptors into generated messages, mostly due to concerns about binary size.
  • Setting values by field number and/or name. You've shown how we could implement each of these directly with the current decoder APIs.
  • FieldMask utilities. We currently only have support for constructing FieldMask messages. We also want someday to have utilities that use a FieldMask to: remove certain fields from a message; keep only certain fields; copy certain fields from one message to another. (This would be another approach for implementing some of the field caching you describe.). We'd also like utilities that can construct and/or modify FieldMasks: build a FieldMask with every field in a particular message, build a FieldMask from a list of field numbers, intersect or union two FieldMasks.

If we had APIs to query, set, and clear fields by name and by number, those could be used to implement the FieldMask utilities. They could also be used as building blocks for Descriptor-based utilities. I think those APIs could all be built using the techniques you've been exploring.

(And coincidentally, in #1504, Max is proposing a different way to organize the traversal logic that might make the above more efficient.)

@pouyayarandi
Copy link
Author

pouyayarandi commented Nov 28, 2023

To sum up, we'd better implement some utilities as extensions on the FieldMask and Messages. Here are some of those utils I found helpful:

extension Message {

  /// Returns a field mask with all fields of the message type.
  static var fieldMask: Google_Protobuf_FieldMask

  /// Builds a field mask from some particular field numebers of a message
  /// - Parameter fieldNumbers: Field numbers of paths
  /// - Returns: Field mask that include paths of corresponding field numbers
  static func fieldMask(fieldNumbers: [Int]) -> Google_Protobuf_FieldMask

  /// Builds a field mask by excluding some paths
  /// - Parameter paths: Paths to be excluded
  /// - Returns: Field mask that does not include the paths
  static func fieldMask(excludedPaths paths: [String]) -> Google_Protobuf_FieldMask

  /// Builds a field mask by reversing another mask
  /// - Parameter mask: Mask to be reversed
  /// - Returns: Field mask that does not include the `mask` paths
  static func fieldMask(reverse mask: Google_Protobuf_FieldMask) -> Google_Protobuf_FieldMask

  /// Clears masked fields and keep the other fields unchanged
  /// - Parameter mask: Field mask which determines what fields shoud be cleared
  mutating func mask(by mask: Google_Protobuf_FieldMask)

  /// Clears all fields of a message except particular fields refered in mask
  /// - Parameter mask: Fields should be ignored in clearance
  mutating func mask(except mask: Google_Protobuf_FieldMask)

  /// Overrides value of masked fields in original message with the input message
  /// - Parameters:
  ///   - message: Message which overrides some fields of the original message
  ///   - mask: Field mask which determines what fields shoud be overriden
  mutating func override(with message: Self, by mask: Google_Protobuf_FieldMask)
 
  func masked(by mask: Google_Protobuf_FieldMask) -> Self
  func masked(except mask: Google_Protobuf_FieldMask) -> Self
  func overriden(with message: Self, by mask: Google_Protobuf_FieldMask) -> Self
}

extension Google_Protobuf_FieldMask {
  func union(_ mask: Google_Protobuf_FieldMask) -> Self
  func intersection(_ mask: Google_Protobuf_FieldMask) -> Self
}

If you agree with these APIs, I could implement it. Please let me know if you have any other idea. @tbkka @thomasvl

@thomasvl
Copy link
Collaborator

I haven't had a chance to really think about these or compare to other apis; but one general question that does come to mind:

Would the static functions make more sense to be in an extension on Google_Protobuf_FieldMask and have them take a Message.Type (instead of being on Message)?

I'm not sure which makes for a more swifty api, but maybe something to think about. They may actually make more sense to developers as convenience initializers?

@tbkka
Copy link
Contributor

tbkka commented Nov 29, 2023

Can these all be implemented with the information we generate today? It seems like the name map might have everything you need for this? Or does this require something like #1504 first, in order to expose the field information in a more easily-digestible form?

@pouyayarandi
Copy link
Author

pouyayarandi commented Nov 29, 2023

Can these all be implemented with the information we generate today?

Yes it could be implemented already by _ProtoNameProviding. But obviously something like #1504 makes it much easier

@pouyayarandi
Copy link
Author

@tbkka @thomasvl New implementation is added for some field mask utils. Note that some other utils like (intersecting, union, etc.) would better to get implemented separately.

@pouyayarandi pouyayarandi changed the title Add applying methods Add FieldMask utilities Dec 3, 2023
@pouyayarandi
Copy link
Author

Just for follow up, build issues with swift 5.8 and later should be fixed now. @thomasvl

@thomasvl
Copy link
Collaborator

thomasvl commented Apr 1, 2024

Just for follow up, build issues with swift 5.8 and later should be fixed now. @thomasvl

Thanks, triggered the CI again to run on things.

Since this is some new functionality, I'd like @tbkka (or some of the other Apple folks) to sign off on it before it's merged. Given the time of year, the Apple folks might be a little busy with WWDC prep, so it make take some more time before they get a chance to look.

@pouyayarandi
Copy link
Author

Thank @thomasvl, BTW I also found an issue in decoding oneOf fields. First for a better vision, when a non one of field is getting decoded it passes its current value to the decoder as an inout parameter. Like this:

case 11: try { try decoder.decodeSingularFloatField(value: &_storage._singularFloat) }()

But in one of fields it happens completely different. It's a sample of code that generated for decoding a one-of field:

case 71: try {
  var v: Float?
  try decoder.decodeSingularFloatField(value: &v)
  if let v = v {
    if _storage._o != nil {try decoder.handleConflictingOneOf()}
    _storage._o = .oneofFloat(v)
  }
}()

In the above code the inout parameter that is passing to decode method is always nil (v variable is never set before passing to decoder method). My suggestion would be to add a line of code setting the current value to variable if it's possible:

case 71: try {
  var v: Float?
  if case .oneofFloat(let _v)? = _storage._o {v = _v}    // add this line
  try decoder.decodeSingularFloatField(value: &v)
  if let v = v {
    if _storage._o != nil {try decoder.handleConflictingOneOf()}
    _storage._o = .oneofFloat(v)
  }
}()

This line of code can be added with this PR (as it's directly affecting field masks) or could be added during another PR. If you prefer second one, we'd better drop supporting for one-of fields in this PR. What's your thought?

@thomasvl
Copy link
Collaborator

thomasvl commented Apr 2, 2024

@pouyayarandi – Can you open a new issue for that so we can track it and the fix? It should be pretty easy to make a test case using the Merging api to show the problem and also do a solution, but rather than flutter this PR up with that process, a new issue will make it easier to discuss (and provide an easy to follow history if need be).

@pouyayarandi
Copy link
Author

pouyayarandi commented Apr 3, 2024

Can you open a new issue for that so we can track it and the fix?

I opened the issue #1608 and will fix it in a new PR, ASAP.

@pouyayarandi
Copy link
Author

I've made some changes which fixes the problem with decoding oneof fields. I used visitor for getting values of a message instead of GetPathDecoder. @thomasvl can you please check it again?

@thomasvl
Copy link
Collaborator

Sorry, been swapped with a few things of late, will try to look again soon.

Copy link
Collaborator

@thomasvl thomasvl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read back through everything to try and catch back up on the code.

Sources/SwiftProtobuf/Message+FieldMask.swift Show resolved Hide resolved
Sources/SwiftProtobuf/Message+FieldMask.swift Outdated Show resolved Hide resolved
Sources/SwiftProtobuf/NameMap.swift Show resolved Hide resolved
Sources/SwiftProtobuf/PathDecoder.swift Outdated Show resolved Hide resolved
Sources/SwiftProtobuf/PathDecoder.swift Outdated Show resolved Hide resolved
Sources/SwiftProtobuf/PathVisitor.swift Outdated Show resolved Hide resolved
@pouyayarandi
Copy link
Author

Comments are fixed

Copy link
Collaborator

@thomasvl thomasvl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good, kicked off the CI on the current state of things.

@tbkka can you take a look again to provide final decision since it is a new addition to the library.

@thomasvl
Copy link
Collaborator

@pouyayarandi looks like atleast one unittest test is failing.

@pouyayarandi
Copy link
Author

I changed my code a little bit, but I couldn't reproduce the failing test on my local machine and all tests passed. So it may fail again. Can you please re-run pipeline? @thomasvl

@pouyayarandi
Copy link
Author

pouyayarandi commented May 16, 2024

It failed again 😢
Do you have any idea why it's happening? I run make test on my machine and everything seems okay. In addition, in some checks the failing test passes
Is there any way to run the exact check on my machine to see the error?

@thomasvl
Copy link
Collaborator

It failed again 😢 Do you have any idea why it's happening? I run make test on my machine and everything seems okay. In addition, in some checks the failing test passes Is there any way to run the exact check on my machine to see the error?

So it's only failing on linux?

You could try using Docker to use do a local linux build (can't say I have any experience with that).

The failure seems to be around group merging? You could try making a temporary test case out of it to try and zero in on things. i.e. - make a test that just merges two messages with the fields in question, but doesn't use the FieldMask support, does that work as intended? Ensure the calculated mask has what you expect, etc.

You can also tweak .github/workflows/build.yml to have it run on your branch name (and disable all but one of the checks needed), and enable CI on your own repo, then, then you can simply upload and let github test for you. If you do this, maybe clone this branch into another one while you test things out, so each upload doesn't also update this PR, and go back to this branch when you've been able to figure it out.

@pouyayarandi
Copy link
Author

Bug fixed @thomasvl

Problem was about singularGroup field of the message. Since this field name is .unique which means its proto name and json name differs, it caused the bug. I made some changes which just allow using proto names for merging purposes. Please re-run the tests they must fully passed now same as my forked repo.

@thomasvl
Copy link
Collaborator

Bug fixed @thomasvl

Problem was about singularGroup field of the message. Since this field name is .unique which means its proto name and json name differs, it caused the bug. I made some changes which just allow using proto names for merging purposes. Please re-run the tests they must fully passed now same as my forked repo.

Make sure you are synced up to HEAD on main, there as a change around groups as part of the Protobuf Editions work a few weeks ago, so more forms are accepted in TextFormat. So that could impact things.

@pouyayarandi
Copy link
Author

Make sure you are synced up to HEAD on main

I fixed the bug on another branch which I rebased with synced fork of your main and then, cherry-picked the fix commit here. So I guess it should be synced with the latest changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants