It is possible to index a document that contains two discrete JSON objects #7299

Closed
polyfractal opened this issue Aug 15, 2014 · 2 comments

Comments

@polyfractal
Contributor

This is most noticeable with bulk requests because of the bulk format. Since the newline is used as a separator, omitting the newline between a document and the next action line indexes a single document that contains two JSON objects. The first JSON object is indexed properly, while the second is completely ignored (its fields are not found in the mapping, etc.).

However, this causes problems with source retrieval, search, GET, etc., because the stored source is invalid JSON.
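
To illustrate why this breaks consumers, here is a minimal Python sketch. Python's standard `json` module stands in for whatever client parses the response, the helper name `is_single_json_value` is hypothetical, and the string is copied from the reproduction below. A strict parser treats the second object as trailing data and rejects the whole source:

```python
import json

# Two discrete JSON objects with no separator, as they end up in _source
# in the reproduction below.
stored_source = (
    '{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }'
    '{ "index": {"_id":2}}'
)

def is_single_json_value(text):
    """Return True if `text` parses as exactly one JSON value (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        # json.loads raises when non-whitespace data follows the first value.
        return False

print(is_single_json_value(stored_source))  # False
```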

There should either be an error returned when trying to index a document containing two (or more) discrete objects, or the non-indexed objects should be removed from the source. The exception seems preferable so the user knows something went wrong.
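
Until the server rejects such input, a client could catch it before sending. The following is only a sketch of a client-side pre-flight check, not how Elasticsearch itself validates requests; the function name `find_malformed_bulk_lines` is hypothetical, and it assumes every non-blank line of a bulk body should hold exactly one JSON object:

```python
import json

def find_malformed_bulk_lines(body):
    """Return 1-based numbers of lines that are not exactly one JSON object."""
    decoder = json.JSONDecoder()
    bad = []
    for lineno, line in enumerate(body.splitlines(), start=1):
        text = line.strip()
        if not text:
            continue  # skip blank lines, e.g. the newline right after -d'
        try:
            # raw_decode parses the first JSON value and reports where it ended.
            value, end = decoder.raw_decode(text)
        except json.JSONDecodeError:
            bad.append(lineno)
            continue
        # Leftover text after the first value means two values share one line.
        if not isinstance(value, dict) or text[end:].strip():
            bad.append(lineno)
    return bad

# Same body as the bulk reproduction below (missing newline before the second action).
bulk_body = """
{ "index": {"_id":1}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }{ "index": {"_id":2}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {"_id":3}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
"""

print(find_malformed_bulk_lines(bulk_body))  # [3]
```

The line numbers count the leading blank line of the string, so line 3 here is the line where the document and the next action were run together.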

Reproduction (Bulk):

curl -XPOST "http://localhost:9200/cars/transactions/_bulk" -d'
{ "index": {"_id":1}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }{ "index": {"_id":2}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {"_id":3}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
'

{
   "took": 72,
   "errors": false,
   "items": [
      {
         "index": {
            "_index": "cars",
            "_type": "transactions",
            "_id": "1",
            "_version": 1,
            "status": 201
         }
      }
   ]
}

curl -XGET "http://localhost:9200/cars/transactions/1"

{
   "_index":"cars",
   "_type":"transactions",
   "_id":"1",
   "_version":1,
   "found":true,
   "_source":{
      "price":10000,
      "color":"red",
      "make":"honda",
      "sold":"2014-10-28"
   }   {
      "index":{
         "_id":2
      }
   }
}

Reproduction (Single Doc):

curl -XPOST "http://localhost:9200/cars/transactions/5" -d'
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }{ "index": {"_id":2}}'

{
   "_index": "cars",
   "_type": "transactions",
   "_id": "5",
   "_version": 1,
   "created": true
}

curl -XGET "http://localhost:9200/cars/transactions/5"

{
   "_index":"cars",
   "_type":"transactions",
   "_id":"5",
   "_version":1,
   "found":true,
   "_source":{
      "price":10000,
      "color":"red",
      "make":"honda",
      "sold":"2014-10-28"
   }   {
      "index":{
         "_id":2
      }
   }
}

@clintongormley

This will probably be fixed by #2315

@polyfractal
Contributor Author

Closing this, resolved by #11414. If you attempt to index a "double" JSON object, ES throws an exception up-front now instead of silently accepting it:

{
   "error": {
      "root_cause": [
         {
            "type": "illegal_argument_exception",
            "reason": "Malformed action/metadata line [3], expected START_OBJECT or END_OBJECT but found [VALUE_NUMBER]"
         }
      ],
      "type": "illegal_argument_exception",
      "reason": "Malformed action/metadata line [3], expected START_OBJECT or END_OBJECT but found [VALUE_NUMBER]"
   },
   "status": 400
}
