Skip to content
This repository has been archived by the owner on Jan 22, 2019. It is now read-only.

Add support for Table<R, C, V> #11

Closed
nezda opened this issue Nov 2, 2012 · 19 comments
Closed

Add support for Table<R, C, V> #11

nezda opened this issue Nov 2, 2012 · 19 comments

Comments

@nezda
Copy link

nezda commented Nov 2, 2012

The Guava Table<R, C, V> data structure is a convenience wrapper around a Map<R, Map<C, V>>. This form can be easily serialized and deserialized. Here's the basic code I'm using for that:

    public static class TableSerializer extends JsonSerializer<Table<?, ?, ?>> {
        @Override
        public void serialize(final Table<?, ?, ?> value, final JsonGenerator jgen, final SerializerProvider provider) throws IOException,
                    JsonProcessingException {
            jgen.writeObject(value.rowMap());
        }


        @Override
        public Class<Table<?, ?, ?>> handledType() {
            return (Class)Table.class;
        }
    } // end class TableSerializer

    public static class TableDeserializer extends JsonDeserializer<Table<?, ?, ?>> {
        @Override
        public Table<?, ?, ?> deserialize(final JsonParser jp, final DeserializationContext ctxt) throws IOException, JsonProcessingException {
            final ImmutableTable.Builder<Object, Object, Object> tableBuilder = ImmutableTable.builder();
            final Map<Object, Map<Object, Object>> rowMap = jp.readValueAs(Map.class);
            for (final Map.Entry<Object, Map<Object, Object>> rowEntry : rowMap.entrySet()) {
                final Object rowKey = rowEntry.getKey();
                for (final Map.Entry<Object, Object> cellEntry : rowEntry.getValue().entrySet()) {
                    final Object colKey = cellEntry.getKey();
                    final Object val = cellEntry.getValue();
                    tableBuilder.put(rowKey, colKey, val);
                }
            }
            return tableBuilder.build();
        }
    } // end class TableDeserializer

I have not tested if this correctly supports contextual keys or type (de)serialization. I looked at MultimapSerializer and MultimapDeserializer, and they are much more complicated even though a Multimap<K, V> is just a convenience wrapper around Map<K, Collection<V>>, so maybe this is too simplistic?

Notably, this implementation always deserializes to an ImmutableTable. Heuristics could be used to determine what Table implementation to reconstruct, e.g., if every row, column value is non-null, could create ArrayTable. HashBasedTable would be a good mutable default.

@pgelinas
Copy link
Member

pgelinas commented Nov 2, 2012

The MultimapDeserializer is a bit convoluted for what it achieve, I've been wanting to rewrite it but haven't got the time right now to do that. You could instead take a look a GuavaMultisetDeserializer to have a better example of what a "normal" deserializer looks like (and GuavaDeserializers for how to register deserializer).

And, as far as I can tell, your Deserializer will not work properly for type deserialization and custom key deserialization, as it's missing type information for the values and keys. If you could provide some basic unit test with your code in a pull request I might take a look at how to include support for Table. I didn't need Table support so that's why I haven't included that yet ;)

@maxkremer
Copy link

+1 for table support. While the above implementation is missing code for type handling I think a variation can be used for the simple case of Table<String, String, String>

@cowtowncoder
Copy link
Member

Code really needs to consider type information. Type parameters are to be handled by Serializers/Deserializers implementation, similar to what is done with other structured types.

But what is really needed is a PR. :)

@michaelhixson
Copy link
Contributor

I have an implementation of this locally. Before I open up a PR, are you guys ok with these details?

  • The serializer is a StdDelegatingSerializer whose input type is the JavaType of the table, and whose output type is a raw Map type constructed by the provided TypeFactory. (The generics don't seem to matter in serialization.)
  • Serialization omits null cell values, which may be present in ArrayTable instances. Edit: Nevermind, this is bad because then ArrayTable can lose information in serialization.
  • There are four deserializers for different kinds of tables, all of which are StdDelegatingDeserializer. Their output types are the JavaType of the table. Their input types are basically LinkedHashMap<R, LinkedHashMap<C, V>>, constructed by the provided TypeFactory, where R, C, V are read from the table's JavaType.
  • In deserialization, you'll get a HashBasedTable, ArrayTable, or ImmutableTable only if you explicitly ask for one of those as your type.
  • In deserialization, the default (if you just ask for a Table) is a custom mutable table that uses LinkedHashMap to preserve the ordering of entries in the JSON, unlike HashBasedTable:
Table<R, C, V> table = Tables.newCustomTable(
    new LinkedHashMap<>(),
    LinkedHashMap::new);

I'm on the fence about that last part. It wouldn't take much to convince me that it's better to default to ImmutableTable. Edit: I've convinced myself, it should treat plain Table as ImmutableTable.

Any thoughts?

@cowtowncoder
Copy link
Member

I am fine with use of delegating serializer and think it is a great starting point; if improvements are needed, it should be easier to do that incrementally.

I am ok with mapping of types, although do not know enough to necessarily know possible issues.

My main concern is with actual JSON structure used, since once we choose to use certain structure, it becomes our "standard" so to speak; more difficult to change in future. So I do not have concerns with specific scheme used, which sounds acceptable, but rather just the concern that we may not have enough time to get enough developers to review it.

But so: looks good to me; happy to merge unless others have concerns, as soon as we have CLA.
If that happens quickly, I may consider this to go in 2.6.0, although timing is bit tight.
Will probably mention this on dev mailing list.

@cowtowncoder
Copy link
Member

@stevenschlansker WDYT?

@michaelhixson
Copy link
Contributor

@cowtowncoder I want to go through my employer for the CLA, so it may take a while.

@cowtowncoder
Copy link
Member

@michaelhixson note that you can choose whether to do individual CLA or corporate CLA, depending on your specific circumstances. Corporate CLA makes most sense when there are multiple contributors.
At any rate, if you do end up going CCLA route, you typically want to use this one:

https://github.com/FasterXML/jackson/blob/master/contributor-agreement-corporate.txt

and not the individual one:

https://github.com/FasterXML/jackson/blob/master/contributor-agreement.pdf

This based on my previous experiences; former has been used by multiple big (fortune-500) companies; latter has caused heartburn for lawyers of the same (for reasons that are not entirely clear to me :) ).

No problem with respect to time it takes: I can merge changes at a later point and will not assume it will get done within day or two. Thank you for the update, it's just good to know what to expect, at this point in release cycle.

@stevenschlansker
Copy link
Contributor

I'm a fan of immutable table over a "custom mutable table", FWIW. Don't have too many other strong opinions.

@cowtowncoder
Copy link
Member

Thank you @stevenschlansker. Mostly just wanted a sanity check, easier to make changes upfront, had some bad experiences with date/time lib support where I merged in changes that proved controversial.

Choice of immutable/mutable is difficult just because it completely depends on intent (whether usage requires modifications), but as long as more specific type may be used to change behavior that should be fine either way.

@fgaule
Copy link

fgaule commented Aug 23, 2015

Is it planned to be supported?

@michaelhixson
Copy link
Contributor

I made a pull request for this and sent in a signed CLA, but I never heard back. I imagine they've been busy with the 2.6 release. It might be about time to ping though...

@cowtowncoder Do you know the status of that PR or CLA?

@codyaray
Copy link

+1 waiting for this feature too.

@smullins7
Copy link

+1 I am also waiting for this feature like the gentlemen in the comment above

@stevenschlansker
Copy link
Contributor

I just did a brief review of the code in the linked PR and it looks very nice. Thank you for the contribution, I'm sure the CLA / merging stuff will get sorted soon.

@cowtowncoder
Copy link
Member

I was on vacation for past 2 weeks, back now, and will follow up.

@cowtowncoder
Copy link
Member

As per comment, 2.6.2 contains serialization support. Still need to implement deserialization, based on some combination two PRs we have.

@melonhead901
Copy link

Actually 2.6.2. does not contain serialization support. This on the schedule for 2.7.0 https://github.com/FasterXML/jackson-datatype-guava/blob/master/release-notes/VERSION

@cowtowncoder
Copy link
Member

Quick note: moved to FasterXML/jackson-datatypes-collections#1
due to move of Guava module itself, goal being to streamline release process (Jackson has 3 dozen official modules now, and releases take too much time)

Not sure how PRs can be moved over; if anyone has time to redo one or both, that would be great.

Apologies for hassle.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants