Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistencies in response codes for _bulk API request body JSON errors #92443

Open
tommyers-elastic opened this issue Dec 19, 2022 · 5 comments
Labels
:Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. >feature Team:Distributed Meta label for distributed team

Comments

@tommyers-elastic
Copy link

Elasticsearch Version

8.7.0-SNAPSHOT

Installed Plugins

No response

Java Version

bundled

OS Version

macos

Problem Description

I was investigating issues relating to invalid JSON in _bulk requests. I noticed that depending on where in the request body the JSON error is found, the response code is different:

the following returns a 400 with {"error":{"root_cause":[{"type":"x_content_parse_exception","reason":"[1:27] Unrecognized token 'test': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')\n at ...

(note the missing quote around test)

{"index": {"_index": test"}}
{"fields": {"name": "tom"}}

whereas the following returns a 200 with {"took":3,"errors":true,"items":[{"index":{"_index":"test","_id":"y3rbKYUBDnKSxzO74aQ8","status":400,"error":{"type":"mapper_parsing_exception","reason":"failed to parse","caused_by":{"type":"x_content_parse_exception","reason":"[1:25] Unrecognized token 'tom': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')\n at

(note the missing quite around tom)

{"index": {"_index": "test"}}
{"fields": {"name": tom"}}

Why should these equivalent parsing errors return different response codes?

Steps to Reproduce

Call the _bulk api with the data detailed above.

Logs (if relevant)

No response

@tommyers-elastic tommyers-elastic added >bug needs:triage Requires assignment of a team area label labels Dec 19, 2022
@DaveCTurner
Copy link
Contributor

The difference is that Elasticsearch must read every metadata line (e.g. the {"index"... one) to be able to even understand what you're trying to ask it to do, but it knows the payload document is terminated by a newline so an unparseable document don't render the whole request invalid. If you did something like this ...

{"index": {"_index": "test"}}
{"fields": {"name": tom"}}
{"index": {"_index": "test"}}
{"fields": {"name": "harry"}}

... then it would index the harry document and return an error for the tom document. There's no sensible way to encode this kind of partial failure as a HTTP status code, so the best ES can do is to send a 200 OK to indicate that the request was successfully received and the coordination phase of its processing worked, but the individual items each have their own status code too.

As this is more of a user question than something needing action from the Elasticsearch team, we'd like to direct these kinds of things to the Elasticsearch forum. If you can stop by there, we'd appreciate it. This allows us to use GitHub for verified bug reports, feature requests, and pull requests.

There's an active community in the forum that should be able to help get an answer to your question. As such, I hope you don't mind that I close this.

@DaveCTurner DaveCTurner removed the needs:triage Requires assignment of a team area label label Dec 19, 2022
@ruflin
Copy link
Member

ruflin commented Dec 20, 2022

We need to separate 2 errors modes here:

  1. Errors with potential solutions
  2. Errors with no solutions

When ingesting documents into Elasticsearch, errors can happen like mapping conflict, ingest pipeline fails or cluster is overloaded. But all these errors can be resolved by changing configs / assets of the Elasticsearch cluster. In general we are trying to become more lenient with the mappings to reject less data and hit case 1 much less frequent. See #89743 for related discussions.

Then there is 2 where there is no solution out of it. No change to mappings, ingest pipeline, retry can fix this issue, is is invalid json. On the one hand I like that all the other docs which have valid json are still ingested and I expect in most production environments, it can be assumed the shipped json docs are mostly valid. But it makes debugging really hard without having custom error implementations on the client side. This becomes even more tricky if the client side is not owned.

As a potential solution (feature request ;-) ), what if we could pass a param as part of the bulk request ?validate=true that would ensure all json docs are valid and would return an error (for everything) if it is not the case?

@ruflin ruflin reopened this Dec 20, 2022
@ruflin ruflin added >feature and removed >bug labels Dec 20, 2022
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Dec 20, 2022
@DaveCTurner
Copy link
Contributor

As a potential solution (feature request ;-) ), what if we could pass a param as part of the bulk request ?validate=true that would ensure all json docs are valid and would return an error (for everything) if it is not the case?

I think it would make more sense to check for well-formedness of the JSON on the sending client(s). Today there's no need for the coordinating node to even parse the JSON it receives, it just passes the raw bytes straight through to the various primaries. If we were to validate the docs on the coordinating node first we'd need to do this additional parsing work, imposing extra load in the cluster and potentially introducing an indexing bottleneck.

@DaveCTurner DaveCTurner added :Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. team-discuss and removed needs:triage Requires assignment of a team area label labels Dec 20, 2022
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Meta label for distributed team label Dec 20, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@tommyers-elastic
Copy link
Author

related: #60442

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. >feature Team:Distributed Meta label for distributed team
Projects
None yet
Development

No branches or pull requests

5 participants