Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support unrecognised fields #1

Merged
merged 35 commits into from Apr 21, 2022
Merged

Conversation

VysotskiVadim
Copy link
Collaborator

Companion PR of mapbox/mapbox-java#1394

The problem

I want to process a json file, for example:

{
    "a": "a",
    "b": "b"
}

having a model which isn't aware of all the properties from the json file, for example:

@AutoValue
public class Model {
  
  public abstract String a();
  
  public abstract Builder toBuilder();

  public static Builder builder() {
    return new AutoValue_Model.Builder();
  }
 
  public static TypeAdapter<Model> typeAdapter(Gson gson) {
    return new AutoValue_Model.GsonTypeAdapter(gson);
  }

  @AutoValue.Builder
  public class Builder {
     public abstract Builder a(String value);
     public abstract Model build();
  }
  
}

but when I serialize the model back to json after processing I want the unrecognized property b to be present in json.

For example, when I deserialize, update property a to a value "c", and serialize json back I want the following output:

{
    "a": "c",
    "b": "b"
}

See more mapbox/mapbox-java#1344

Desired behaviour in code

String json = "{\"a\": \"a\", \"b\": \"b\"}"; // the same source JSON as in the example ☝️ 
// serializing json to Model
GsonBuilder gson = new GsonBuilder();
JsonObject jsonObject = gson.create().fromJson(json, JsonObject.class);
gson.registerTypeAdapterFactory(GeneratedAdaptersFactory.create());
Model model = gson.create().fromJson(jsonObject, Model.class);
// updating the model
Model updatedModel = model.toBuilder().a("c").build();
// deserializing back to json
String outputJson =  gson.create().toJson(updatedModel);

assertEquals( "{\"a\": \"c\", \"b\": \"b\"}", outputJson);

Idea

I can manually add an unrecognized property to the model and mark it with @UnrecognisedJsonProperties annotation:

@AutoValue
public class Model {
  
  public abstract String a();

  @UnrecognisedJsonProperties
  abstract Map<String, Object> unrecognized();
  
  public abstract Builder toBuilder();

  public static Builder builder() {
    return new AutoValue_Model.Builder();
  }
 
  public static TypeAdapter<Model> typeAdapter(Gson gson) {
    return new AutoValue_Model.GsonTypeAdapter(gson);
  }

  @AutoValue.Builder
  public class Builder {
     public abstract Builder a(String value);
     abstract Builder unrecognized(Map<String, Object> value);
     public abstract Model build();
  }
  
}

If the generated type adapters were aware of the unrecognized property and put all the unrecognized fields from a parsed json there during reading, and wrote unrecognized properties from back to json file during writing I would get desired behavior.

@VysotskiVadim
Copy link
Collaborator Author

@LukasPaczos , @RingerJK , because of repo recreation old comments were removed. I will transfer unresolved ones manually

import java.io.Serializable;
import java.util.Objects;

public class SerializableWrapper implements Serializable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why do we need this class?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JsonObject and other classes from gson aren't serialisable. If your model is Serializable, unrecognised fields breaks java serialisation. Serializable wrapper is a workaround to let unrecognised fields to be serialised.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to SerializableJsonElement

.findFirst()
.orElse(null);
if (unrecognisedJsonPropertiesContainer != null) {
TypeName mapOfObjects = ParameterizedTypeName.get(LinkedHashMap.class, String.class, Object.class);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LinkedHashMap - why is the order important so that we need to use a linked map?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's convenient for testing when an order of unrecognised fields preserves during deserialisation/serialisation. See UnrecognisedFieldsTest.
Do you see any problem with LinkedHashMap?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capturing from docs:

Performance is likely to be just slightly below that of HashMap

That's my only consideration - we don't take real advantage of the ordering in production, so we could probably drop it. If it significantly improves debugging and testing then let's leave it in place, with the amount of data we're dealing with here it's probably an insignificant difference anyway.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first answer from stack overflow says that it takes 30% more time to create LinkedHashMap than HashMap but iteration over LinkedHashMap is 50% faster (BTW I don't trust house test in terms of access, LinkedHashMap can't be faster in access). For round trip we need both creation and iteration.

.findFirst()
.orElse(null);
if (unrecognisedJsonPropertiesContainer != null) {
TypeName mapOfObjects = ParameterizedTypeName.get(LinkedHashMap.class, String.class, Object.class);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capturing from docs:

Performance is likely to be just slightly below that of HashMap

That's my only consideration - we don't take real advantage of the ordering in production, so we could probably drop it. If it significantly improves debugging and testing then let's leave it in place, with the amount of data we're dealing with here it's probably an insignificant difference anyway.

String.format("Only one method can be annotated with %s", UnrecognizedJsonProperties.class.getSimpleName()),
unrecognisedProperties.get(1).element
);
return unrecognisedProperties.get(0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not throw here and stop processing? It's an obvious mistake that we shouldn't really continue with.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't dig deep in this questions, but here's what I've learned about annotation processors so far:

The doc recommends to prefer error reporting to uncaught exceptions.

processor should throw an uncaught exception only in situations where no error recovery or reporting is feasible.

I guess it's because users would like to see an error like "Only one method can be annotated with ..." rather than "Annotation processor X failed with an exception".

I also noticed that this processor doesn't throw exceptions in case of other errors. It logs error(logged error fails compilation for user) and carries on with generation. I decided to just stick to existing approach and don't spend time for digging deeper. Any way, if you find this question very important I'm ready to dig deeper and learn more about best practises in annotation processing.


@Nullable
@UnrecognizedJsonProperties
abstract Map<String, Object> unknownProperties();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
abstract Map<String, Object> unknownProperties();
abstract Map<String, JsonElement> unknownProperties();

Is that always true? I understand yes, if that's the case we can even enforce it as part of the annotation processor (that the annotated type is always Map<String, JsonElement>). That could be the next step though, not a blocker.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be if we didn't need java serialization. Right now it's always SerializableWrapper

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, is transition later from Map<String, Object> to Map<String, JsonElement> can be considered as a breaking change? Java's Map isn't covariant like Kotlin's one. Scenarios like object.unknownProperties().put("new key", "new value") could be broken. I will check if I can make the map immutable under the hood

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to abstract Map<String, SerializableJsonElement> unknownProperties();

Copy link
Member

@LukasPaczos LukasPaczos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only a request to update the docs.

import java.io.Serializable;
import java.util.Objects;

public class SerializableJsonElement implements Serializable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a short doc entry for this class and its public methods.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

import static java.lang.annotation.RetentionPolicy.SOURCE;

/**
* Indicates that a filed of type Map<String, Object> is a container for unrecognised JSON properties.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Indicates that a filed of type Map<String, Object> is a container for unrecognised JSON properties.
* Indicates that the annotated field is a container for unrecognised JSON properties. The field has to be of type Map<String, SerializableJsonElement>.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated, thanks

@VysotskiVadim VysotskiVadim merged commit de44ad8 into main Apr 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants