Take id property from the source when deserializing an entity [DATAES-936] #1511

Closed
spring-projects-issues opened this issue Sep 23, 2020 · 1 comment


@spring-projects-issues spring-projects-issues commented Sep 23, 2020

Kevin Ullrich opened DATAES-936 and commented

When I search for data through spring-data-elasticsearch (see below for an example) and the data class it is supposed to instantiate has an id field, spring-data-elasticsearch appears to fill that field from the _id of the document instead of the id field of the source. In this example that results in a NumberFormatException (see attachment for exception), because the id field of the class is an Integer while the Elasticsearch "_id" is not numeric (the "id" field in the Elasticsearch mapping is an integer as well). I'm not entirely sure this is what actually happens, but it very much seems like it.

Setup:

This is the class Spring should deserialize the search hits from Elasticsearch into (yes, it combines JPA with Elasticsearch; I haven't had an issue with that so far).

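@Entity
@Document(createIndex = false, indexName = "exampleIndex")
@Table(name = "someTable")
public class Person implements Serializable {

    // Marked as the identifier for both JPA and Elasticsearch
    @Id
    @Field
    @Column
    private Integer id;

    @Field
    @Column
    private String name;

    // remainder of the class not shown in the report
}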

This is how I request the data from elasticsearch through spring:

NativeSearchQuery nativeQuery = new NativeSearchQueryBuilder().withFilter(someFilter()).withPageable(somePageable()).build();
SearchHits<Person> result = restTemplate.search(nativeQuery, Person.class);

However, I get an exception (longer version as attachment):

java.lang.NumberFormatException: For input string: "CaFHtHMBfczCkwGFm0zd" 

Thankfully, Spring can log the Elasticsearch request to the console, so I can easily execute the request in Kibana and check its result:

{
    "took": 0,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 10000,
            "relation": "gte"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "exampleIndex",
                "_type": "_doc",
                "_id": "CaFHtHMBfczCkwGFm0zd",
                "_score": 1.0,
                "_source": {
                    "id": 123,
                    "name": "Example"
                }
            }
        ]
    }
}

As you can see, _id contains the value spring-data-elasticsearch fails on, whereas id has an integer value, as the data class expects.

This makes spring-data-elasticsearch basically useless for my case. It worked great with spring-boot 2.2.0, but with the 2.3.4 release (and probably earlier releases as well) it stopped working.


Affects: 4.0.4 (Neumann SR4)

Attachments:

Referenced from: pull request #523

Backported to: 4.0.5 (Neumann SR5)


@spring-projects-issues spring-projects-issues commented Sep 23, 2020

sothawo commented

Basically this error comes from a misuse (sorry to say that) of the id property of the entity. When you define a property as the id property (in Spring Data Elasticsearch this is done by annotating the property with @Id or naming it "id"), this property is marked to be used as the identifier in Elasticsearch when indexing, deleting or updating a document.

The program that stores the entries in Elasticsearch should have used the value of the id property (123) to index this document. Instead, no id was set, so Elasticsearch assigned the String value "CaFHtHMBfczCkwGFm0zd" as the identifier.

In Spring Data Elasticsearch 3.2.x, when reading the search results, the _id was not used to fill the entity properties; rather, the code relied on the id property being in the source. That is why you could read these objects.

In Spring Data Elasticsearch 4.0 the value of the Elasticsearch _id is set into the id-property, because this is the property that is explicitly marked to contain this id. And this leads to this error, because the types don't match.
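The type mismatch can be reproduced in isolation. The following is a minimal sketch and not the library's actual conversion code: `IdTypeMismatchDemo` and `toEntityId` are hypothetical names, and the sketch assumes the conversion boils down to parsing the `_id` string as an `Integer`, which is consistent with the exception message in the report.

```java
// Illustration only: a generated Elasticsearch _id is not a valid Integer,
// while the "id" value found in _source is.
final class IdTypeMismatchDemo {

    // Hypothetical helper: converts a document id string to the Integer type
    // declared on the entity's id property.
    static Integer toEntityId(String documentId) {
        return Integer.valueOf(documentId);
    }

    public static void main(String[] args) {
        // A numeric string converts without problems.
        System.out.println(toEntityId("123")); // prints 123

        try {
            // The auto-generated _id from the report cannot be parsed.
            toEntityId("CaFHtHMBfczCkwGFm0zd");
        } catch (NumberFormatException e) {
            // prints: For input string: "CaFHtHMBfczCkwGFm0zd"
            System.out.println(e.getMessage());
        }
    }
}
```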

I can, and will, change the code so that the value of the property is taken from the _source (in your case the numeric value). Only if that is not available is the _id taken, and then the type must match.
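The resolution order described above could be sketched roughly as follows. This is not the code of the actual fix: `IdResolutionSketch` and `resolveId` are hypothetical names, the `_source` is modelled as a plain `Map`, and the entity id type is assumed to be `Integer` as in the reported `Person` class.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Sketch of "take the id from _source, fall back to _id" for an Integer id.
final class IdResolutionSketch {

    // Prefer the "id" value stored in _source. Fall back to the document _id
    // only when _source has no id; in that case the _id must parse as Integer.
    static Integer resolveId(Map<String, Object> source, String documentId) {
        Object fromSource = source.get("id");
        if (fromSource instanceof Number) {
            return ((Number) fromSource).intValue();
        }
        if (fromSource != null) {
            return Integer.valueOf(fromSource.toString());
        }
        // Throws NumberFormatException when the types do not match.
        return Integer.valueOf(documentId);
    }

    public static void main(String[] args) {
        Map<String, Object> source = new HashMap<>();
        source.put("id", 123);
        source.put("name", "Example");

        // The id in _source wins over the generated _id.
        System.out.println(resolveId(source, "CaFHtHMBfczCkwGFm0zd")); // prints 123

        // Without an id in _source, a numeric _id is used as a fallback.
        System.out.println(resolveId(Collections.emptyMap(), "42")); // prints 42
    }
}
```

With this order, the document from the report deserializes to id 123, while a document without an id in its _source and with a non-numeric _id would still fail.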

So while this will solve your issue, please keep in mind that your setup is an accident waiting to happen (and it did already). You don't have the Elasticsearch id in your entities, you cannot update or delete them and you can even get multiple documents from a search that have the same id value.

The good thing with Spring Data Elasticsearch is that you get instances of SearchHit from the queries, and these return both the object and the Elasticsearch id. In my sample program this looks like the following (after committing my fix):

{
    "aggregations": null,
    "empty": false,
    "maxScore": 1.0,
    "scrollId": null,
    "searchHits": [
        {
            "content": {
                "firstName": "James",
                "id": 42,
                "lastName": "Bond"
            },
            "highlightFields": {},
            "id": "0oAou3QBn_1j3GC-BjB1",
            "index": "person",
            "innerHits": {},
            "nestedMetaData": null,
            "score": 1.0,
            "sortValues": []
        }
    ],
    "totalHits": 1,
    "totalHitsRelation": "EQUAL_TO"
}
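Because each hit carries the Elasticsearch id next to the entity, the risk mentioned earlier (several documents sharing the same entity id) can be checked on the client side. The sketch below is only an illustration under assumptions: `Hit` is a hypothetical stand-in for a search hit holding the document id and the entity's id, not the Spring Data `SearchHit` class, and the second document id in the usage example is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Groups hits by entity id and reports entity ids used by several documents.
final class DuplicateIdCheck {

    // Hypothetical stand-in for a search hit: Elasticsearch _id plus entity id.
    static final class Hit {
        final String documentId;
        final Integer entityId;

        Hit(String documentId, Integer entityId) {
            this.documentId = documentId;
            this.entityId = entityId;
        }
    }

    // Returns entity id -> document ids, only for entity ids seen more than once.
    static Map<Integer, List<String>> duplicates(List<Hit> hits) {
        Map<Integer, List<String>> byEntityId = new LinkedHashMap<>();
        for (Hit hit : hits) {
            byEntityId.computeIfAbsent(hit.entityId, k -> new ArrayList<>())
                      .add(hit.documentId);
        }
        // Drop entity ids that map to exactly one document.
        byEntityId.values().removeIf(ids -> ids.size() < 2);
        return byEntityId;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("0oAou3QBn_1j3GC-BjB1", 42),
                new Hit("made-up-second-id", 42), // hypothetical duplicate
                new Hit("made-up-third-id", 7));
        System.out.println(duplicates(hits));
    }
}
```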

 

 
