Avoiding mixed types in structured objects #553

Open
simonrenoult opened this issue May 15, 2015 · 6 comments

@simonrenoult

Hi there !

As stated in the documentation on structured objects, elasticsearch-jdbc can merge rows that share the same _id into arrays, which is great. However, as the documentation example shows, the resulting documents have mixed types. Here is what I mean (shamelessly copied from the docs):

mysql> select "relations" as "_index", orders.customer as "_id", orders.customer as "contact.customer", employees.name as "contact.employee"  from orders left join employees on employees.department = orders.department;
+-----------+-------+------------------+------------------+
| _index    | _id   | contact.customer | contact.employee |
+-----------+-------+------------------+------------------+
| relations | Big   | Big              | Smith            |
| relations | Large | Large            | Müller           |
| relations | Large | Large            | Meier            |
| relations | Large | Large            | Schulze          |
| relations | Huge  | Huge             | Müller           |
| relations | Huge  | Huge             | Meier            |
| relations | Huge  | Huge             | Schulze          |
| relations | Good  | Good             | Müller           |
| relations | Good  | Good             | Meier            |
| relations | Good  | Good             | Schulze          |
| relations | Bad   | Bad              | Jones            |
+-----------+-------+------------------+------------------+
11 rows in set (0.00 sec)

This results in:

index=relations id=Good {"contact":{"employee":["Müller","Meier","Schulze"],"customer":"Good"}}
index=relations id=Bad {"contact":{"employee":"Jones","customer":"Bad"}}

As you can see, employee can contain either a string ("Jones") or an array (["Müller","Meier","Schulze"]).
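To make the inconsistency concrete, here is a minimal sketch that models the behavior described above. It is not elasticsearch-jdbc's actual code: the class RowGrouping and its methods groupById and collapse are hypothetical names, and the rows are a subset of the example table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowGrouping {

    /** Collects the employee column per _id, keeping first-seen order of ids. */
    static Map<String, List<String>> groupById(String[][] rows) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String[] row : rows) {
            // row[0] is the _id, row[1] is contact.employee
            grouped.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return grouped;
    }

    /** Models the observed output: a single value stays a scalar, several become a list. */
    static Object collapse(List<String> values) {
        return values.size() == 1 ? values.get(0) : values;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"Good", "Meier"}, {"Good", "Schulze"}, {"Bad", "Jones"}
        };
        // The type of the collapsed value depends on how many rows shared the id
        groupById(rows).forEach((id, employees) ->
            System.out.println(id + " -> " + collapse(employees).getClass().getSimpleName()));
    }
}
```

In this model the type of the field depends only on the row count per _id, which is why a consumer cannot know in advance whether it will see a string or an array.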

Is there a way to get a consistent result (e.g. an array containing "Jones")? This behavior can be a real headache, especially when working with JSON parsers that require a specific type (say String[]) and will shout at you when they read an unexpected type.

Thanks !

@simonrenoult simonrenoult changed the title Avoid mixed types Avoiding mixed types in structured objects May 15, 2015
@jprante
Owner

jprante commented May 15, 2015

Can you give more details about which JSON parsers fail? It works perfectly here.

If you work with NEST, there are known issues, and you have to work around them in the parser, as shown in elastic/elasticsearch-net#227 (comment)

@simonrenoult
Author

I work with Boon, and it raises a ClassCastException when it tries to assign a CharSequenceValue to the Collection. Here is the exact error:

SoftenedException: fieldName idList of class class elasticsearch.beans.MyBean had issues 
for value 1154004984494 for field FieldInfo [name=idList, type=class [Ljava.lang.String;, 
parentType=class elasticsearch.MyBean]

CAUSE java.lang.ClassCastException :: org.boon.core.value.CharSequenceValue cannot be 
cast to java.util.Collection]

CharSequenceValue is the type Boon uses to read a string. It obviously fails when Boon tries to assign it to a Collection.

@jprante
Owner

jprante commented May 15, 2015

This looks like a bug in your code.

@simonrenoult
Author

Well, yes and no.

I have a class that accepts an array of strings. The JSON produced by the elasticsearch river can be either a string or an array, depending on the number of rows found in the database. The parser fails because it can only parse an array, not a single string, and raises this ClassCastException. I see two workarounds: either I fix the parser with a custom serializer (which, as far as I know, is not possible with Boon), or I normalize the JSON output. Which is why I'm here.
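For illustration, consumer-side normalization could look like the sketch below. It assumes the document has already been parsed into plain Java objects (a String for a scalar, a Collection for an array), uses no Boon API, and the names ValueNormalizer and toStringList are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class ValueNormalizer {

    /** Returns a list for any of: null, a single scalar, or a collection. */
    static List<String> toStringList(Object value) {
        if (value == null) {
            // A missing field becomes an empty list
            return Collections.emptyList();
        }
        if (value instanceof Collection) {
            // An array is copied; each item is converted with String.valueOf
            List<String> result = new ArrayList<>();
            for (Object item : (Collection<?>) value) {
                result.add(String.valueOf(item));
            }
            return result;
        }
        // A scalar is wrapped in a one-element list
        return Collections.singletonList(String.valueOf(value));
    }

    public static void main(String[] args) {
        System.out.println(toStringList("Jones"));
        System.out.println(toStringList(Arrays.asList("Meier", "Schulze")));
        System.out.println(toStringList(null));
    }
}
```

The bean field would then be filled from toStringList(...) instead of being bound directly by the parser, so the string-versus-array difference never reaches the typed class.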

@jprante
Owner

jprante commented May 15, 2015

"You can completely customize serialization and deserialization." boonproject/boon#268

Anyway, I recommend Jackson, because it ships with a deserialization feature for exactly this case, which makes array coercion much easier. Example:

public class Person {

    String name;

    String[] attributes;

    public Person() {
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setAttributes(String... attributes) {
        this.attributes = attributes;
    }

    public String[] getAttributes() {
        return attributes;
    }
}

import java.util.Arrays;

import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectReader;

public class CoercionDemo {

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();

        // Serialize: attributes is written as a JSON array
        Person person = new Person();
        person.setName("Jörg");
        person.setAttributes("foo", "bar");
        String s = mapper.writeValueAsString(person);
        System.err.println(s);

        // Deserialize: here attributes is a single string, not an array
        String json = "{\"name\":\"J\\u00f6rg\",\"attributes\":\"foobar\"}";
        // Older Jackson versions use mapper.reader(Person.class) instead
        ObjectReader objectReader = mapper.readerFor(Person.class);
        Person p = objectReader
                .with(DeserializationFeature.ACCEPT_SINGLE_VALUE_AS_ARRAY)
                .readValue(json);
        System.err.println(p.getName());
        System.err.println(Arrays.toString(p.getAttributes()));
    }
}

@simonrenoult
Author

Jackson is not an option, since my project has been using Boon for quite some time now and relies on it. I'll look into the custom Boon serializer, thanks.

Back to my original question, though: is there a way to enforce a consistent data type?
