-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Polymorphic [de]serialization #147
Comments
+1, we have this kind of implementation in Apache Johnzon as well - it is portable: https://github.com/apache/johnzon/blob/master/johnzon-json-extras/src/test/java/org/apache/johnzon/jsonb/extras/polymorphism/PolymorphicTest.java#L84 |
Yes, thats helpfull. Many implementations have a concepts with annotations or other strategies. I'd prefer also one without annotations, as this would enable a transformation without changing legacy classes. Also interesting, why did the expert group omit this feature in 1.0? |
@mdzaebel What is likely not possible is to make it transparent without any config (type=fully qualified name) so it requires to have explicit aliases for all children and whitelist the allowed children otherwise you get immediately a security issue. Therefore it is not strictly omitted in 1.0 since it is trivial to handle it in a (de)serializer. Hope it helps. |
At first, thanks so much for your help, which is really important for us (big insurance company). E.g. I did not know about the security aspect. I'm not an expert, so could you show me a serializer, that inserts a $type attribute and a deserializer that creates the instances out of $type? Currently, the security aspect could be solved by testing an instanceof relation in our case. |
May be JSON Binding 1.0 rather focus on the wrapping strategy, documented in https://javaee.github.io/javaee-spec/javadocs/javax/json/bind/serializer/JsonbDeserializer.html. So the usecase above seems to be difficult, but should be supported in future. |
@mdzaebel all the code is contained in this class: https://github.com/apache/johnzon/blob/master/johnzon-json-extras/src/main/java/org/apache/johnzon/jsonb/extras/polymorphism/Polymorphic.java - you can also import johnzon-extras module in your project and not fork it. Then just configure it as mentionned in previous comment (with the example link). |
Great stuff. That should be sufficient. Thank you Romain! |
With Yasson, your serializer doesn't work (Yasson Issue) e.g.
leads to "Recursive reference has been found in class ...", but isn't this correct, as the identical object has to be serialized in a deeper level recursively? |
Roman Grigoriadi now explained, that Yasson has to be used slighty different (see Issue) |
This feature is definitely a must to have so I thought about it and these are my suggestions which should be covered in terms of polymorphic (de)serialization.
It would also be possible to specify the name of the new property in each option mentioned above. I would also suggest two levels of how strict it should be. For example aliases are set for some set of classes and during serialization is "not mentioned" class found.
What do you guys think about it? Any suggestions? |
@Verdent -10000 for your "1", this is exactly the 0-day vulnerability java serialization got and jackson is still trying to fix. 2 requires to use aliases - and it is not an option - known from the server (and not class names) for the same reason. Isn't 3 a custom deserializer - so already built-in? |
Yes I know about this. This would be just for those who want to use it internally. And there should be also warning that this might be potential security problem. But I wrote it here because it could be fairly simple to set up for a user. In case of that 2. I would say that aliases are not that bad way to go. You have only several possible values and those are not invoked on the server. They are matched with corresponding class. What is wrong with that option? :-) The 3. option was more or less 2. but in configuration way. You can create deserializer, but I was trying to find some way how to do it easier and more "automatic". |
My point on 2 was that only aliases are an option (being required and not just one option) since classname is a no go. |
@rmannibucau I agree that option 1 adds a vulnerability. But if it's switched off by default and users know that enabling it opens a security hole it's fine. Basic HTTP Authentication is an even bigger security hole, but still used. The same I can say for option 2. It's more secured, but opens a vulnerability in case alias is not specified. Again, users should know about it before enabling it. Do we have other options to make it more secured? |
We dont have to provide it at all, basic is still there cause it had been possible if you follow me. |
What about defining a list of classes which we allow to serialize/deserialize in configuration. If JSONB tries to deserialize a class which is not in configuration it will throw an exception. It should solve the security issue, is it?
|
toJson can pass but the fromJson is the one which must be controlled. Your solution works. |
@m0mus serializable(...) is an approach to handle security problems for polymorphic instantiation. It's only one aspect of it and should be defined within a polymorphic configuration. serializable(...) could be missunderstood as a general constraint, rather than a special one for polymorphic instantiation. I would drop duplicate methods like serializableAndDeserializable and use one method in both directions. Differences can always be defined with new /** Polymorphic [de]serialization */
public JsonbConfig polymorphic(Polymorphic polymorphic); Where Polymorphic could be defined like e.g.: class Polymorphic {
String typeAttribute="@type";
Function<Class<?>, String>stringResolver; // String alias of typeAttribute
Function<String, Class<?>>classResolver; // Deserialzation of alias value
Strategy strategy; // How to serialize class types
/** Classes/interfaces of which subclasses/implementations are handled:
* Serialization: typeAttribute added.
* Deserialization: Instantiation by typeAttribute */
Class<?>[] classes;
} This solves the security issue and the polymorphic configuration. The proposals of @Verdent have to be decided. I would support them completely. It might be helpful to analyse the concepts already used by other implementations. |
jsonio used kinf of that and it failed in some of our project cause the lack of evolutivity in time (too rigid) + payloads where too big. Adapters/deserializers enable to be more clever and bypass type attributes IMHO since they can identify the type per attributes directly and not change the shape of the serialization which breaks js clients for example. So I'd prefer a lighter solution/abstraction if generic and not adapter based and not something against js. So at the end it is just providing a default serializer/adapter impl so let s keep it simple without new concept maybe? The whitelist is still needed though. |
Automatic type detection via attributes could also be a candidate for the "strategy". However, in many cases it's not sufficient, as attributes/values might be the same for different instance types. |
@mdzaebel there is no interoperability standard at all (in the sense "commonly used standard") so if we change the shape of the payload then the clients will not consume it right (missing wrapper, custom attribute). It is likel producing a record following a schema without control. Therefore it must only be activable but never activated without a stringly explicit flag (annotation, no config probably). |
@rmannibucau Why no config? This would be more dynamic, while annotations are fixed at classes/methods/parameters. Each strategy has it's own settings, that have to be configurable by preferably one polymorphic configuration. |
@rmannibucau Just like Yasson, also Johnzon 1.1.13 fails at the same location:
What actually happens is a Stack Overflow due to endless recursion:
Apparently even latest Johnzon tries to wrap again and again and again as soon once it serializes the wrapper itself but then sees the wrapper's payload and jumps into the polymorphic deserializer again. Romain, can you please decide whether this is a bug in Johnzon or simply a missing rule in the JSON-B spec? |
@mkarg it looks expected and normal with current spec - thought throwing a JsonbException("invalid") could have been saner. This code works using annotations and not global converters which leads to that. We likely don't want to skip serializer/deserializer/adapter in SerializationContext. If you set your impl globally (on jsonbconfig), it must likely have the standard threadlocal to prevent the loop if possible. That said, this use case does not fit serializer API which is really about how to serializer, it fits adapter API which is about changing the shape of a model then the new shape can have a serializer if you want I think. |
@rmannibucau |
@amoscatelli that's what I thought so let's explicit it in the spec then we should be cleaner :) |
Ok, can we also be more explicit regarding the JsonParser instance in the spec ? Also, I would like to give Johnzon a try (I am using Wildfly). |
@amoscatelli agree for jsonparser but can be worth another issue/PR. about johnzon: it has some typevariable support even if not 100% complete and parameterized type support yes. |
@amoscatelli I am currently rewriting and improving Yasson serialization and deserialization more or less from scratch and I am also adding support for polymorphism (so far implementation specific, but it will change later possibly) You will be able to use config and also annotations to enable polymorphism and customize it as desired, so you do not need to write anything extra by yourself and Yasson will do that for you. Hopefully I will be able to create the first RC soon. |
@Verdent we still need a portable solution since org.yasson internal imports are rarely desired when you use a spec which is primarly intended to write vendor independent apps. Can you ensure yasson works with previous code? |
@rmannibucau By previous code you mean code which users created before my rework? Absolutely, only classes from internal packages are changed/removed. And I am currently even passing all of the tests (without the need to change them) and TCKs so I would say it will not break any previously created user code. I completely agree here that implementation specific imports are usually not what users what to have and as soon as it is described in the spec, it will be changed to support annotations and interfaces from the API. If you want, I can describe here in more details how I am implementing it there and we can have a talk here if it is how we want to add it into spec or what needs to be changed. |
@Verdent this one #147 (comment) |
@rmannibucau Yes I have to test that, but I would say it will work as expected. I see no reason why it shouldn't with the new version. From my point of view it might make sense to even fix that in the current version, but I will have to go though the code first. Perhaps only line |
@Verdent it is one of the challenge if deserializers, if used at root it must be at start_object but then in nested objects it is not (but it is another issue than this one). |
@rmannibucau But your deserializer is used at root object in this case so parser should be at START_OBJECT upon entering your deserializer. That is why I am mentioning that calling ensureNext with START_OBJECT is probably incorrect there in this particular case. Anyway with that mentioned polymorphism support, we do support handling of the mentioned |
@Verdent thank you. Please let's just stick with official interfaces, please don't assume the use of any internal implementation. Frankly, I don't think Yasson should include its own JSONP implementation, let's not reinvent the wheel. Even if you have to, do not assume its utilization. Jsonb and JsonP should talk to each other just using the spec interfaces. Imho org.eclipse.yasson.internal.JsonbParser should never been there, if you have a state, it should have been kept in the Yasson DeserializationContext implementation. DeserializationContext should accept ANY Parser created via JsonParserFactory#createParser(JsonObject). This alone would be a step further. Now, if you have in mind something to support Polymorphism please describe it here. I created the first topic about supporting polymorphism and it was three years ago. I believe it's important to bring it to the spec as soon as possible as we don't want to use any implementation specific API. Johnzon has already its own Polymoprphism support so maybe we can extract a good abstraction from both of them and make it for the next spec release. As I already stressed out, other specifications are going to use Jsonb also, and they'll need good polymorphism support. Again, thank you. |
@Verdent also, is the 2.x branch of Yasson the rework you are talking about ? |
@amoscatelli I am not adding anything like JSON-P implementation. Why would I do that? :-) Yeah I have removed org.eclipse.yasson.internal.JsonbParser, I could not get my head around why it was there in the first place. Actually I have already described it here. But I changed it a bit in form of structure now, but the idea is the same. I guess I can make a new full description of how I feel it should look like :-) No the reworked version will be 3.x. It is not there yet. |
Folks, first of all thanks for your feedback and interest in evolving JSONB spec. It's great and I appreciate it. Let me clarify what's going on. As @Verdent mentioned already, he is working on new version of Yasson. It's a huge rework. The new version has better internal design and it's much faster than the current version. This work is in progress, but he is close to present a draft. When it's ready, he will create a new branch in Yasson project and push his work there, so all of you can review it. The new version has deserialisation support and some other new features such as externalised configuration, etc. We are trying to use implementation first approach. If these features are well accepted by the community, we will add them to the next version of JSONB spec. We plan to do it for Jakarta EE 10. |
@Verdent how does a deserializer know it is in an array or object when set on a list then? Thisbis the main issue skipping events, deserializers handle the struct behind the equivalent struct. |
@rmannibucau What I am trying to say is that you are making the check which would fail in your example since parser should be on that event already. I am talking here about this particular JSON But I feel like we are getting out of the scope of this issue here. |
@Verdent I understood - and agree we can move it to another issue - but checking null does not mean not serving the full model to the deserializer (pushback logic), other impl lead to broken deserializers. |
Is there already a public proposal how to user's view on polymorphism is like, e. g. what annotations to set etc? |
Not yet. But I am currently making one :-) |
OpenJDK 17 comes with sealed classes, including an extension to the Reflection API which allows to compute a complete hierarchy of possible subclasses at runtime. It would be great if JSON-B follows the "feels-like-java" approach and automatically provides polymorphism for all sealed classes without the need for any additional annotations in that case (CoC). That would be terrific! |
-1, polymorphism is about modifying the serialization shape to add a discriminator with wrapping the payload or not. Sealed classes are not only about this use case - most other trivial one is about pattern matching only, no subattribute so no discriminator in terms of shape. I'd stick on the standard solution which enables the user to decide. |
+1 Sealed classes lend themselves perfectly for polymorphic (de)serialization. We'd need to answer a number of questions. How do we derive the discriminator from the class name? Maybe just the simple class name? How can I override it? Maybe with an annotation And I think that users may wonder why it works when serializing some classes but not on others, and there's no error message that could explain this (only the If we later choose to add support also for non-sealed classes, we can reuse all the annotations and conventions defined for the sealed classes. |
@t1 the opposite will happen, polymorphism is under work, sealed classes will just work as others. |
Using the full class name from the (Serializer) JSON is a serious security issue. An attacker could make you deserialize an arbitrary class, and some classes can cause remote code execution. [ I usesd switch but not working. can someone help me |
Please ask on Gitter, stack overflow etc (or open a question or discussion). Hijacking issues with new topics is not the way to go. |
I'd like to repeat the feature proposal from eclipse-ee4j/yasson#133 here. There should at least be a description of how to implement Serializer/Deserializer that maintain runtime types. As there are more possible strategies, they should be defined too.
The text was updated successfully, but these errors were encountered: