Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization by interface type rather than implementation #8805

Closed
dankurka opened this issue Jun 10, 2015 · 29 comments
Closed

Serialization by interface type rather than implementation #8805

dankurka opened this issue Jun 10, 2015 · 29 comments
Assignees

Comments

@dankurka
Copy link
Member

Originally reported on Google Code with ID 8844

The main problem I'd like to solve is that Guava's ImmutableMap and friends have many
implementations, each of which would require a custom field serializer (mainly because
of issue 1054, complicated by issue 3315 and perhaps issue 1452). In practice, we don't
write serializers for each, and users discover that *most* ImmutableMaps serialize,
but others do not, even though all look like plain ImmutableMaps to the user. (That's
because all the implementation classes are package-private.) For example: https://code.google.com/p/guava-libraries/issues/detail?id=1821
Additionally, even if we were to write serializers for each class, that would require
that any ImmutableMap-transferring GWT application include every ImmutableMap implementation
in its compiled JavaScript, even if some implementations would never be used. This
leads to increases in code size.

What we'd like to do is somehow tell GWT that every ImmutableMap subclass should be
serialized and deserialized in the same way. I thought that I remembered hearing about
a possible "serialization by interface" feature back in early 2013, but I can't find
a reference now. Is this something that is in your plans? Thanks.

Reported by cpovirk@google.com on 2014-08-01 15:40:55

@dankurka
Copy link
Member Author

Can't comment on the future plans for GWT-RPC, just assigning the appropriate category.

Reported by t.broyer on 2014-08-03 19:33:22

  • Labels added: Category-RPC

@dankurka dankurka self-assigned this Jun 10, 2015
@dankurka
Copy link
Member Author

I think I must have talked about serializing all lists or maps the same way but it was
just an idea, not actually a plan. It would be tricky. Besides the private implementations,
there are also public subclasses like ImmutableBiMap and users could declare fields
of this type, so it's not safe in general to lose the type on deserialization. Also,
though it's not a good idea, they could even create a subtype of ImmutableMap with
an extra field that needs to be serialized.

The safest way to transfer data over the wire is to copy the data into an object of
known (final) type, rather than serializing whatever subtype the caller gives you:

final class MyTransferObject implements Serialiable {
  private TransferMap<Foo, Bar> items;
  MyTransferObject(Map<Foo, Bar> items) {
    this.items = TransferMap.copyOf(items);
  }

  ImmutableMap<Foo, Bar> getItems() { return items; }
}

final class TransferMap<K, V> extends ImmutableMap<K, V> {
  ...
}

But since ImmutableMap is already GWT-serializable, I don't know how we could get there
from here.

Reported by skybrian@google.com on 2014-08-04 19:20:28

@dankurka
Copy link
Member Author

I vaguely recall speculating about something like this:

class MyTransferObject implements Serializable {
  @GwtRpc(deserializeAs=ImmutableMap.class)
  Map<Foo, Bar> items;
  ...
}

This would save a copy on transfer object construction and reduce code bloat, but it
wouldn't be automatic. The person writing the transfer object would have to be aware
of the issue.

Reported by skybrian@google.com on 2014-08-04 19:30:51

@dankurka
Copy link
Member Author

@Brian, can you explain it more?

Let's say, if we have functionality to say "subtypes of ImmutableMap can use ImmutableMap_Serializer
if it is missing a serializer"; i.e.:

<extend-property name="gwt.rpc.genericSerializers="c.g.c.colloect.ImmutableMap" />

class MyTransferObject implements Serializable {

 ImmutableMap items1; // generates serializers for all subtypes or uses generic base
                      // ImmutableMap serializer if a given type is missing the serializer

 ASerializableImmutableMap items2; // generates serializers for all subtypes, cannot
use
                                   // generic ImmutableMap serializer
 ....
}

Then, what would be the problem?

Reported by goktug@google.com on 2014-08-04 19:59:47

@dankurka
Copy link
Member Author

The question is which subtype to construct on deserialization? If we try to preserve
the subtype, then the generic custom serializer needs to somehow know which constructor
to call. When there are final fields, there is usually not a no-arg constructor, so
the custom serializer must call the constructor and it's unclear how.

Alternately, we could construct a different subtype on deserialization, since for a
type like ImmutableMap, the subtype probably doesn't need to be preserved. But this
is tricky.

Reported by skybrian@google.com on 2014-08-04 20:59:47

@dankurka
Copy link
Member Author

The ImmutableBiMap (or ImmutableSortedMap) example is a good one that I had completely
forgotten. We could certainly end up making mistakes. On the plus side, I think that
those mistakes won't break things any worse than they already are: Either ImmutableBiMap
has a serializer, in which case the new scheme would detect and use it like before,
or it doesn't, in which case the new scheme will fail in some of the cases in which
the originally did, namely where an ImmutableBiMap is required (albeit with a ClassCastException
rather than outright failure). A bonus is that it now works if the class happens to
need only an ImmutableMap but had an ImmutableBiMap. And of course we still have the
main benefit, the reduction in the number of FooImmutableMap serializers we have to
write.

This sounds good to me as a naive outsider. But I'm sure I'm overlooking other tricky
things.

Reported by cpovirk@google.com on 2014-08-04 21:16:02

@dankurka
Copy link
Member Author

Yes, the alternative option is what I'm proposing. I think it is OK for the designer
of the API (in this case guava team) to say that the subtype doesn't matter and can
return a generic implementation in deserialization. If somebody doesn't like it, they
can define a more specific serializer/deserializer.

Reported by goktug@google.com on 2014-08-04 21:19:40

@dankurka
Copy link
Member Author

Ugh, Firefox / issue tracker ate my other response. In short, to answer your specific
question, the generic ImmutableMap serializer would return a "normal" ImmutableMap
like the one you get from ImmutableMap.of(...). The generic serializer would be used
only, as goktug notes, when the runtime type doesn't have its own serializer (as ImmutableBiMap
would). For us, we would set things up so that every publicly accessible type has its
own serializer.

Reported by cpovirk@google.com on 2014-08-04 21:21:40

@dankurka
Copy link
Member Author

re#4, currently the serializer chosen is not context-senstive. It's just a mapping from
runtime types to serializers. Making the serialization context-sensitive would be tricky
because a single object can be stored in many places. For example, suppose all of the
following fields contain references to the same object, so it will only be serialized
and deserialized once when they are sent over the wire together. Which type should
be used for deserialization?

ImmutableMap field1;
ASerializableImmutableMap field2;
List<? extends ImmutableMap> field3;
Serializable field4;

class Foo<T> {
  T value;
}

Foo<ImmutableMap> field5;
Foo<ASerializableImmutableMap> field6;
Foo<Serializable> field7;

I think to avoid confusion, the serializer shouldn't care what references there are
to an object when serializing it. But then substituting a different subtype of ImmutableMap
on deserialization breaks type safety for some fields.

Also, in JavaScript, we wouldn't get a ClassCastException unless we explicitly generated
the code to do a runtime check after deserialization for each field assignment.

(That means my proposal using an annotation has problems too.)

Reported by skybrian@google.com on 2014-08-04 22:17:45

@dankurka
Copy link
Member Author

I think making final fields work might be easier, actually, since it would remove the
need to write most custom serializers.

I'm doubtful that we can make final fields work by default (the migration is tricky)
but we could at least turn them on using an annotation:

@GwtRpc(finalFields=true) 
class Foo {
  private final String first;
  ...
}

Reported by skybrian@google.com on 2014-08-04 22:27:00

@dankurka
Copy link
Member Author

 > Which type should be used for deserialization?

Most specific deserializer associated with runtime type? (If any reason it is ambiguous
then the type is not serializable without a custom serializer)

 > substituting a different subtype of ImmutableMap on deserialization breaks type
safety for some fields

I don't understand this one. Can you elaborate more? If a sub-type of ImmutableMap
is declared in a field and we don't have a serializer for it then it is a compilation
error. Just to be clear, I'm not proposing using ImmutableMap serializer in place of
serializing a field declared as SomeOtherImmutableMap.

Reported by goktug@google.com on 2014-08-04 22:46:06

@dankurka
Copy link
Member Author

We discussed this offline. Goktug's idea is that the generic custom serializer would
be used for subtypes that never appear explicitly as the type of a field in a serializable
type. Once you declare a field with that subtype, it has to have its own custom serializer
or it's a compile error.

I think this would avoid breaking type safety but it could lead to weird runtime errors.
For example, does this code work?

class Foo extends Serializable {
  ImmutableMap<A,B> items;
  ...
}

... elsewhere ...

MyImmutableMap x = (MyImmutableMap) fooFromServer.items;

This cast will succeed if MyImmutableMap appears in a serializable type anywhere in
the program, pulling in the custom serializer. Otherwise it will fail, because foo.items
was deserialized using the generic custom serializer. So changing code in one file
could result in a runtime error in a seemingly unrelated file.

(However, I suspect there are similar cases already.)

Reported by skybrian@google.com on 2014-08-04 23:17:55

@dankurka
Copy link
Member Author

 > the generic custom serializer would be used for subtypes that never appear explicitly
as the type of a field in a serializable type.

I think there is a misunderstanding, what I meant is;

Generic custom serializer will be used for subtypes that doesn't have a custom field
serializer (the type may appear in a serializable field or may not - doesn't matter).
If the subtype appears in type of a field in a serializable type, it or at least one
of its subtypes *must* be serializable; otherwise it is a compilation error. You can
still try to send a non-serializable subtype and that is where we change the behavior
(fail on serialization (old behavior) vs fail on downcasting (new behavior)) - see
the example below...


 > This cast will succeed if MyImmutableMap appears in a serializable type anywhere
in the program

Appearing in a serializable type doesn't change the behavior here. We will still discover
MyImmutableMap's serializer as it is a subtype of ImmutableMap.

To understand the behavior change, let me try to go through different combinations
of scenarios based on your example (i.e. ImmutableMap appears in a serializable type
and later downcasted to MyImmutableMap):

Old behavior:
a) MyImmutableMap and all its subtypes are serializable:
This will just work fine.
b) MyImmutableMap has non-serializable subtypes:
This might or might not work depending on the actual subtype that ends on the runtime
- will potentially throw serialization exceptions regardless of any downcasting exists
in program or not.
c) MyImmutableMap and all its subtypes are non-serializable:
This will always throw serialization exception on runtime regardless of any downcast
exists in program or not.

New behavior:
a) MyImmutableMap and all its subtypes are serializable:
This will just work fine.
b) MyImmutableMap has non-serializable subtypes:
This will only throw exception if somebody downcasts to non-serializable subtype; otherwise
works fine.
c) MyImmutableMap and all its subtypes are non-serializable:
This will only throw exception if somebody downcasts to MyImmutableMap; otherwise works
fine.

There is one gotcha we need to handle specifically and wasn't a problem earlier:

 class Foo extends Serializable {
   NonSerializableImmutableMap<A,B> items;
   ...
 }

This is not an error if a serializable subtype of NonSerializableImmutableMap exists
in program.
With the old behavior, if we send any of NonSerializableImmutableMap's subtypes that
is not serializable, it will be a serialization exception. With the new behavior as
we will not have a serialization exception (because we will match to ImmutableMap's
generic serializer), we need to ensure that it doesn't escape type checking with JSNI.

So in summary, I think what needs to be done is:
1. Define a configuration property for introducing generic customer serializers.
2. Modify runtime serialization/deserialization selection logic to match generic custom
serializers (Perhaps we need to generate more runtime information to make this performant
on client side?).
3. Ensure that we have explicit type assertion when we write a field with JSNI in generator.

Reported by goktug@google.com on 2014-08-05 02:46:42

@bjrke
Copy link

bjrke commented Oct 27, 2016

I think this is still a problem, I can't find any solution for the Problems, especially since the nested class ImmutableList.SubList has become a problem since guava-20.

@cpovirk
Copy link
Contributor

cpovirk commented Oct 31, 2016

My impression (not that I'm a GWT contributor) has been that GWT-RPC may eventually be deprecated. Here's a slide that suggests as much.

@dankurka
Copy link
Member Author

dankurka commented Nov 2, 2016

GWT RPC is deprecated and we have communicated this for years. I even wrote a blog post about why it needs to go away:
http://blog.daniel-kurka.de/2016/07/gwt-rpcs-future.html

@TDesjardins
Copy link

I think then it would be helpful to set GWT RPC deprecated in the code (com.google.gwt.user.client.rpc.RemoteService) and the documentation (http://www.gwtproject.org/).

@branflake2267
Copy link
Contributor

RPC is currently the bread and butter of most for communication choice. I suspect the easiest way to put it down is provide an easy alternative to use. That said I know where this leads next...

@TDesjardins
Copy link

I fully agree with you, Brandon. We also make heavy use of GWT RPC in our applications. So we need a comfortable solution for replacement in future. May be https://github.com/intendia-oss/autorest is a candidate.

@tbroyer
Copy link
Member

tbroyer commented Nov 7, 2016

AFAICT, nobody has yet expressed an interest into somehow "porting" GWT-RPC to J2Cl (GWT 3), and Google has explicitly said they wouldn't invest any more time on it.
That leaves us, for GWT 2.x, with "community contributions" ("community" here meaning non-Google, thus including Steering Committee members such as Sencha or Vaadin), and I don't think anyone is ready to start the work outlined above (let them prove me wrong!)

All that being said, IMO, the Guava regression in 20.0 should be somehow fixed on the Guava side (because it's caused by a Guava change).

@jonl-percsolutions-com
Copy link
Contributor

@tbroyer, I see a lot of expressed interest for it to continue as far as porting it goes, I personally would be waiting on to see what that would take, but would be willing and interested in participating in it. As far as GWT-RPC goes, I don't personally think the mechanism is flawed, I think the Serialization Policy generation is unnecessarily coupled with the RPC mechanism and a modular mechanism would solve the majority of issues with it and allow for different serialization mechanisms under the hood and make GWT-RPC a powerful part of GWT for the long term. I would be willing to contribute to the cause of separating and modularizing it as well, however, hesitation arises because there is no clear picture as to what is going to happen with GWT going forward. My understanding was the GWT 2.8.x branch would be a long lived branch, but I've seen recent discussion that it would be going into maintenance mode instead of accepting new features in the long run, which will be worrying to some people who may need to be able an ever diversifying set of browsers. I think before we get large buy in from the community on taking up pieces of what core GWT is, many of us out here in the community may need to see a plan on what that means as far as both the GWT 2.8.x and 3.x branches are concerned.

@gkdn
Copy link
Contributor

gkdn commented Nov 7, 2016

Just for clarification since this might be misunderstood;
GWT 3 is not J2CL. J2CL is Google's alternative transpiler technology that GWT may choose to use as the compiler of GWT3 after it gets open-sourced.

@gkdn
Copy link
Contributor

gkdn commented Nov 7, 2016

hesitation arises because there is no clear picture as to what is going to happen with GWT going forward.

If you stick to JsInterop and APT, there is no reason to hesitate. It will work with current GWT compiler and will work with J2CL as well if it comes to that. See related talks.

@jonl-percsolutions-com
Copy link
Contributor

@gkdn, we've been developing in GWT since 2.5.x, we have generators, RPC w/ custom serializers, super source and custom Widgets, all which have to be replaced or converted in order for us to stick to JsInterop and APT all while continuing to move our current work forward.

@gkdn
Copy link
Contributor

gkdn commented Nov 7, 2016

The discussion/bug is around GWT-RPC. So I basically pointing you don't need to hesitate improving GWT-RPC (assuming that's what the community wants) if you go with APT and JsInterop.

@jonl-percsolutions-com
Copy link
Contributor

Ah, I see what you are saying now, perhaps I'll have to try in that regard to take a look!

@bjrke
Copy link

bjrke commented Nov 9, 2016

All that being said, IMO, the Guava regression in 20.0 should be somehow fixed on the Guava side (because it's caused by a Guava change).

It was not a problem in older versions because the most used class RegularImmutableList.sublist created a RegularImmutableList, but the other Lists had this problem all the time. So it was broken but simply not a problem.
I tried to build a serializer for the class ImmutableList.SubList, but because it is nested I was unable to name it correctly. The gwt compiler does not accept ImmutableList$SubList_CustomFieldSerializer. Using ImmutableList.SubList_CustomFieldSerializer, like it is done for Arrays.ArrayList was also not possible because it would also override the Implementation of ImmutableList in the generated code.
So I don't see, how the guava devs or any other could fix that without changing their implementation completely.
Anyway I currently don't see us migrating from GWT-RPC to any other technology in our project.

tbroyer referenced this issue in google/guava Nov 9, 2016
These seemed to be designed to create a faster sublist(), but our base
implementation of sublist seems likely to be quite similar in performance.
Plus by eliminating these fields we:

* eliminate a preconditions check and extra field reads on each call to  get(int)
* save 8 bytes for the fields (and maybe more for alignment on some platforms)

This design seems to be from the very first version of RegularImmutableList and there don't appear to be benchmarks covering ImmutableList.get(),  so this change is somewhat speculative.  I'm curious what your thoughts are.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=133794845
@tbroyer
Copy link
Member

tbroyer commented Nov 9, 2016

Changing GWT-RPC to look for an ImmutableList_CustomFieldSerializer.SubList would be much easier than what had been originally proposed in this issu (should be a few lines in https://github.com/gwtproject/gwt/blob/2.8.0/user/src/com/google/gwt/user/rebind/rpc/SerializableTypeOracleBuilder.java#L361 looping over enclosing types, and possibly a small change in https://github.com/gwtproject/gwt/blob/2.8.0/user/src/com/google/gwt/user/rebind/rpc/CachedRpcTypeInformation.java#L115 to get the JType back from the TypeOracle).

And in the mean time, Guava could revert cb3a29fb67936c2bc52b1cfee08cedab62950282 if the serializability was expected and somehow "guaranteed per-contract", and not just pure coincidence (in other words: if they care about the breaking change or not). Though it also depends which one of GWT or Guava will cut a release first, so maybe not worth it.

(note: I think one could use com.google.gwt.user.client.rpc.core.com.google.common.collect.ImmutableList.SubList_CustomFieldSerializer, it's undocumented but could probably be used as a workaround)

@tbroyer
Copy link
Member

tbroyer commented Nov 10, 2016

FWIW, @cpovirk's comment on the offending Guava commit:

I'd be happy to review a pull request. You'd probably need to make SubList a top-level type.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants