IllegalStateException on indexing with completion suggester #10987

Closed
Duetting opened this issue May 5, 2015 · 1 comment

Comments

@Duetting

commented May 5, 2015

I'm getting a java.lang.IllegalStateException: from state (0) already had transitions added exception when I try to index certain documents with a mapping that has fields with "type": "completion".

This happens in at least versions 1.5.1 and 1.5.2; it seems to work correctly up to 1.3.9.

First I create a new index:
curl -XPUT localhost:9200/ses_firma/
Then I add the mapping with:
curl -s -XPUT localhost:9200/ses_firma/_mapping/firma --data-binary @MappingFirma.txt

{
    "firma": {
        "_source": {
            "enabled": false
        },
        "_id": {
            "path": "oid",
            "store": true
        },
        "properties": {
            "adressen": {
                "properties": {
                    "ort": {
                        "type": "string"
                    },
                    "postleitzahl": {
                        "type": "string"
                    },
                    "strasse": {
                        "type": "string"
                    },
                    "bundesland": {
                        "include_in_all": false,
                        "properties": {
                            "name": {
                                "include_in_all": false,
                                "type": "string"
                            }
                        }
                    },
                    "postfach": {
                        "type": "string"
                    },
                    "land": {
                        "include_in_all": false,
                        "properties": {
                            "iso": {
                                "include_in_all": false,
                                "type": "string"
                            },
                            "archiv": {
                                "type": "boolean"
                            },
                            "name": {
                                "include_in_all": false,
                                "type": "string"
                            }
                        }
                    },
                    "oid": {
                        "type": "string"
                    },
                    "postfachplz": {
                        "type": "string"
                    }
                }
            },
            "bemerkung": {
                "type": "string",
                "fields": {
                    "suggest": {
                        "max_input_length": 50,
                        "payloads": false,
                        "analyzer": "simple",
                        "context": {
                            "type_context": {
                                "path": "_type",
                                "default": [
                                    "*",
                                    "firma"
                                ],
                                "type": "category"
                            }
                        },
                        "preserve_position_increments": true,
                        "type": "completion",
                        "preserve_separators": true
                    }
                }
            },
            "standardadresse": {
                "properties": {
                    "ort": {
                        "type": "string"
                    },
                    "postleitzahl": {
                        "type": "string"
                    },
                    "strasse": {
                        "type": "string"
                    },
                    "bundesland": {
                        "include_in_all": false,
                        "properties": {
                            "name": {
                                "include_in_all": false,
                                "type": "string"
                            }
                        }
                    },
                    "postfach": {
                        "type": "string"
                    },
                    "land": {
                        "include_in_all": false,
                        "properties": {
                            "iso": {
                                "include_in_all": false,
                                "type": "string"
                            },
                            "archiv": {
                                "type": "boolean"
                            },
                            "name": {
                                "include_in_all": false,
                                "type": "string"
                            }
                        }
                    },
                    "oid": {
                        "type": "string"
                    },
                    "postfachplz": {
                        "type": "string"
                    }
                }
            },
            "abkuerzung": {
                "type": "string",
                "fields": {
                    "suggest": {
                        "max_input_length": 50,
                        "payloads": false,
                        "analyzer": "simple",
                        "context": {
                            "type_context": {
                                "path": "_type",
                                "default": [
                                    "*",
                                    "firma"
                                ],
                                "type": "category"
                            }
                        },
                        "preserve_position_increments": true,
                        "type": "completion",
                        "preserve_separators": true
                    }
                }
            },
            "telefonnummern": {
                "properties": {
                    "nummer": {
                        "type": "string"
                    }
                }
            },
            "oid": {
                "type": "string"
            },
            "email": {
                "type": "string",
                "fields": {
                    "suggest": {
                        "max_input_length": 50,
                        "payloads": false,
                        "analyzer": "keyword",
                        "context": {
                            "type_context": {
                                "path": "_type",
                                "default": [
                                    "*",
                                    "firma"
                                ],
                                "type": "category"
                            }
                        },
                        "preserve_position_increments": true,
                        "type": "completion",
                        "preserve_separators": true
                    }
                }
            },
            "firmenname": {
                "type": "string",
                "fields": {
                    "suggest": {
                        "max_input_length": 50,
                        "payloads": false,
                        "analyzer": "simple",
                        "context": {
                            "type_context": {
                                "path": "_type",
                                "default": [
                                    "*",
                                    "firma"
                                ],
                                "type": "category"
                            }
                        },
                        "preserve_position_increments": true,
                        "type": "completion",
                        "preserve_separators": true
                    }
                }
            }
        }
    }
}

Then I index the following documents:
curl -s -XPOST localhost:9200/_bulk --data-binary @BulkError.txt

{"index":{"_index":"ses_firma","_type":"firma","_version_type":"external_gte","_version":3}}
{"firmenname":"Alfred Reiter Bau GmbH","abkuerzung":"ROTTMEIER","email":"","adressen":[{"strasse":"Salvatorbergstr. 21","postleitzahl":"84048","ort":"Mainburg","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Bayern"},"oid":"452d44ff-b34d-4f40-b5be-5169b2621960"}],"standardadresse":{"strasse":"Salvatorbergstr. 21","postleitzahl":"84048","ort":"Mainburg","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Bayern"},"oid":"452d44ff-b34d-4f40-b5be-5169b2621960"},"telefonnummern":[{"nummer":"08751/5171"},{"nummer":"08751/9400"},{"nummer":"0170/7369223"},{"nummer":"08751/9400"},{"nummer":"0170/2847356 Alfred"}],"bemerkung":"info@reiter-bau.de","oid":"40d50149-aafa-4727-9c9b-0b95ae41d4bb"}
{"index":{"_index":"ses_firma","_type":"firma","_version_type":"external_gte","_version":3}}
{"firmenname":"Volkswagen Bankdirect","abkuerzung":"VOLKSWAGEN","email":"","adressen":[{"strasse":"Gifthorner Str. 57","postleitzahl":"38112","ort":"Braunschweig","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Niedersachsen"},"oid":"60672aff-dbb4-4291-9b3f-fd2ebc177cb6"}],"standardadresse":{"strasse":"Gifthorner Str. 57","postleitzahl":"38112","ort":"Braunschweig","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Niedersachsen"},"oid":"60672aff-dbb4-4291-9b3f-fd2ebc177cb6"},"telefonnummern":[{"nummer":"0531/2121732"},{"nummer":"0531/2122836"},{"nummer":"0531/2122880"}],"bemerkung":"VOLKSWAGEN","oid":"2c1dc58b-e5a2-4513-aea6-1327714b07b8"}
{"index":{"_index":"ses_firma","_type":"firma","_version_type":"external_gte","_version":3}}
{"firmenname":"Die Bayerische","abkuerzung":"BBV","email":"","adressen":[{"strasse":"Thomas-Dehler-Str. 25","postleitzahl":"81737","ort":"München","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Bayern"},"oid":"1482071b-c049-4802-8a40-e37aba535317"}],"standardadresse":{"strasse":"Thomas-Dehler-Str. 25","postleitzahl":"81737","ort":"München","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Bayern"},"oid":"1482071b-c049-4802-8a40-e37aba535317"},"telefonnummern":[{"nummer":"Kfz 089/6787-2222"},{"nummer":"089/6787-0"},{"nummer":"089/6787-9150"}],"bemerkung":"BBV","oid":"2786c191-b882-4c2d-a61f-1f962f74cdcd"}

I get the aforementioned error on all three documents. Other documents index correctly, but even if I empty every field of the documents above except the oid field (e.g. "firmenname":""), I still get the exception.

Here is the call stack for one of the errors from the log file:

[2015-05-05 11:05:32,818][DEBUG][action.bulk              ] [Amina Synge] [ses_firma][0] failed to execute bulk item (index) index {[ses_firma][firma][40d50149-aafa-4727-9c9b-0b95ae41d4bb], source[{"firmenname":"Alfred","abkuerzung":"","email":"","adressen":[{"strasse":"Salvatorbergstr. 21","postleitzahl":"84048","ort":"Mainburg","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Bayern"},"oid":"452d44ff-b34d-4f40-b5be-5169b2621960"}],"standardadresse":{"strasse":"Salvatorbergstr. 21","postleitzahl":"84048","ort":"Mainburg","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Bayern"},"oid":"452d44ff-b34d-4f40-b5be-5169b2621960"},"telefonnummern":[{"nummer":"08751/5171"},{"nummer":"08751/9400"},{"nummer":"0170/7369223"},{"nummer":"08751/9400"},{"nummer":"0170/2847356 Alfred"}],"bemerkung":"","oid":"40d50149-aafa-4727-9c9b-0b95ae41d4bb"}
]}
org.elasticsearch.index.engine.IndexFailedEngineException: [ses_firma][0] Index failed for [firma#40d50149-aafa-4727-9c9b-0b95ae41d4bb]
    at org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:368)
    at org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:498)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:427)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:149)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:515)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:422)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.IllegalStateException: from state (0) already had transitions added
    at org.apache.lucene.util.automaton.Automaton.addTransition(Automaton.java:158)
    at org.apache.lucene.search.suggest.analyzing.XAnalyzingSuggester.replaceSep(XAnalyzingSuggester.java:302)
    at org.apache.lucene.search.suggest.analyzing.XAnalyzingSuggester.toFiniteStrings(XAnalyzingSuggester.java:932)
    at org.elasticsearch.search.suggest.completion.AnalyzingCompletionLookupProvider.toFiniteStrings(AnalyzingCompletionLookupProvider.java:371)
    at org.elasticsearch.search.suggest.completion.CompletionTokenStream.incrementToken(CompletionTokenStream.java:63)
    at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:618)
    at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:359)
    at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:318)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:241)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:465)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1526)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1252)
    at org.elasticsearch.index.engine.InternalEngine.innerIndex(InternalEngine.java:431)
    at org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:362)
    ... 8 more

I reproduced this on three different machines, including a freshly installed Elasticsearch instance.

Best Regards,
Markus Dütting

@areek

Contributor

commented May 14, 2015

Hi @Duetting,

Thanks for reporting this. The error appears to be caused by the empty string in the email field. To avoid it, you could omit completion fields whose value is an empty string when indexing.
Ideally, this should be handled by Elasticsearch; I have opened a PR for this (#11158).
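
To check which documents are affected, one could list the fields that carry a completion sub-field in the mapping and see which of them hold an empty string in a given document. The following is only a minimal Python sketch, not part of Elasticsearch: the helper names completion_parents and empty_completion_fields are made up for illustration, and the mapping is trimmed to the parts the check reads.

```python
import json


def completion_parents(mapping_properties):
    """Return names of top-level fields that have a sub-field of type "completion"."""
    names = []
    for name, spec in mapping_properties.items():
        sub_fields = spec.get("fields", {})
        if any(sub.get("type") == "completion" for sub in sub_fields.values()):
            names.append(name)
    return names


def empty_completion_fields(doc, parents):
    """Return the completion-backed fields whose value in doc is an empty string."""
    return [name for name in parents if doc.get(name) == ""]


# Trimmed copy of the "properties" section of the mapping from the report.
properties = {
    "firmenname": {"type": "string", "fields": {"suggest": {"type": "completion"}}},
    "email": {"type": "string", "fields": {"suggest": {"type": "completion"}}},
    "oid": {"type": "string"},
}

# Shortened version of one of the failing documents.
doc = json.loads(
    '{"firmenname":"Die Bayerische","email":"",'
    '"oid":"2786c191-b882-4c2d-a61f-1f962f74cdcd"}'
)

print(empty_completion_fields(doc, completion_parents(properties)))
```

The check only looks at top-level fields, which matches the mapping in this report; nested completion fields would need a recursive walk.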

Indexing the following works around this issue:

{"index":{"_index":"ses_firma","_type":"firma","_version_type":"external_gte","_version":3}}
{"firmenname":"Alfred Reiter Bau GmbH","abkuerzung":"ROTTMEIER","adressen":[{"strasse":"Salvatorbergstr. 21","postleitzahl":"84048","ort":"Mainburg","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Bayern"},"oid":"452d44ff-b34d-4f40-b5be-5169b2621960"}],"standardadresse":{"strasse":"Salvatorbergstr. 21","postleitzahl":"84048","ort":"Mainburg","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Bayern"},"oid":"452d44ff-b34d-4f40-b5be-5169b2621960"},"telefonnummern":[{"nummer":"08751/5171"},{"nummer":"08751/9400"},{"nummer":"0170/7369223"},{"nummer":"08751/9400"},{"nummer":"0170/2847356 Alfred"}],"bemerkung":"info@reiter-bau.de","oid":"40d50149-aafa-4727-9c9b-0b95ae41d4bb"}
{"index":{"_index":"ses_firma","_type":"firma","_version_type":"external_gte","_version":3}}
{"firmenname":"Volkswagen Bankdirect","abkuerzung":"VOLKSWAGEN","adressen":[{"strasse":"Gifthorner Str. 57","postleitzahl":"38112","ort":"Braunschweig","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Niedersachsen"},"oid":"60672aff-dbb4-4291-9b3f-fd2ebc177cb6"}],"standardadresse":{"strasse":"Gifthorner Str. 57","postleitzahl":"38112","ort":"Braunschweig","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Niedersachsen"},"oid":"60672aff-dbb4-4291-9b3f-fd2ebc177cb6"},"telefonnummern":[{"nummer":"0531/2121732"},{"nummer":"0531/2122836"},{"nummer":"0531/2122880"}],"bemerkung":"VOLKSWAGEN","oid":"2c1dc58b-e5a2-4513-aea6-1327714b07b8"}
{"index":{"_index":"ses_firma","_type":"firma","_version_type":"external_gte","_version":3}}
{"firmenname":"Die Bayerische","abkuerzung":"BBV","adressen":[{"strasse":"Thomas-Dehler-Str. 25","postleitzahl":"81737","ort":"München","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Bayern"},"oid":"1482071b-c049-4802-8a40-e37aba535317"}],"standardadresse":{"strasse":"Thomas-Dehler-Str. 25","postleitzahl":"81737","ort":"München","land":{"name":"Deutschland","iso":"DEU","archiv":false},"bundesland":{"name":"Bayern"},"oid":"1482071b-c049-4802-8a40-e37aba535317"},"telefonnummern":[{"nummer":"Kfz 089/6787-2222"},{"nummer":"089/6787-0"},{"nummer":"089/6787-9150"}],"bemerkung":"BBV","oid":"2786c191-b882-4c2d-a61f-1f962f74cdcd"}
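
The same cleanup can be applied to a whole bulk file before sending it. This is a rough Python sketch under a few assumptions: the file contains only index actions (so lines alternate between action metadata and document source), the completion-backed fields are the four top-level ones from the mapping in the report, and the function name strip_empty_strings is hypothetical.

```python
import json

# Top-level fields that have a "suggest" completion sub-field in the reported mapping.
COMPLETION_FIELDS = ["bemerkung", "abkuerzung", "email", "firmenname"]


def strip_empty_strings(bulk_text, fields=COMPLETION_FIELDS):
    """Drop the given fields from each source line of a bulk body when their value is ""."""
    out = []
    lines = [line for line in bulk_text.splitlines() if line.strip()]
    # Assumption: only "index" actions, so every action line is followed by one source line.
    for action, source in zip(lines[0::2], lines[1::2]):
        doc = json.loads(source)
        for name in fields:
            if doc.get(name) == "":
                del doc[name]
        out.append(action)
        # Compact separators keep one document per line; non-ASCII text is kept as-is.
        out.append(json.dumps(doc, ensure_ascii=False, separators=(",", ":")))
    return "\n".join(out) + "\n"  # a bulk body must end with a newline


bulk = (
    '{"index":{"_index":"ses_firma","_type":"firma"}}\n'
    '{"firmenname":"Die Bayerische","abkuerzung":"BBV","email":"","bemerkung":"BBV"}\n'
)
print(strip_empty_strings(bulk), end="")
```

The cleaned text can be written to a file and posted with the same curl --data-binary call as before.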

I confirmed that empty string values do work up to v1.3.9. I will have to dig deeper to figure out what has changed since then.

areek added a commit to areek/elasticsearch that referenced this issue May 14, 2015
@areek areek closed this in af6b69e May 14, 2015
areek added a commit that referenced this issue May 14, 2015
areek added a commit that referenced this issue May 14, 2015