Mapping updates should be synchronous #8688
+1 I have issues with this every now and then; in the worst case, when it happens, I lose an index. Not good.
+1 - Pretty sure this is the issue we have every day. Screenshot: https://twitter.com/s_mcleod/status/560237698758111232
@sammcj can you post the failure from the logs for the shards being allocated? It might be related, and might not, depending on the actual failure.
@kimchy it looks like one of the situations where we've seen this might be caused by the timestamp field in logstash: https://gist.githubusercontent.com/sammcj/66c212311d99a9ca93fe/raw/de99f6ae149745c7b42262d6c2c26c46458e253f/logstash_shard_logs
@sammcj it looks like you are using a timestamp field, which is correctly recognized as a date but has the data in the wrong format:
This shouldn't fail the recovery (as it should not have been accepted by the primary), but I don't see (yet) how async dynamic mapping creation could have caused this failure. Do you use any custom index templates, the default type, or dynamic templates?
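To illustrate the kind of failure in the gist above: once a field's format has been detected from the first document, later values in a different format are rejected. This is a minimal Python analogy (not Elasticsearch code; the format strings are assumptions for illustration):

```python
from datetime import datetime

# Hypothetical illustration: a field first seen as an ISO-8601-style date
# locks the mapping to that format; a later value in a different format
# then fails to parse, analogous to a MapperParsingException.
DETECTED_FORMAT = "%Y-%m-%dT%H:%M:%S"  # format guessed from the first document

def parse_timestamp(value: str) -> datetime:
    return datetime.strptime(value, DETECTED_FORMAT)

ok = parse_timestamp("2015-01-27T15:04:05")        # matches the detected format

try:
    parse_timestamp("27/Jan/2015:15:04:05 +1100")  # Apache-style log timestamp
    rejected = False
except ValueError:
    rejected = True                                # refused, as ES refuses it
```

The point is that the rejection happens at parse time on the shard, which is why a document in the wrong format should never have been accepted by the primary in the first place.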
Hi @bleskes - I can see the error around the date and am going to track down which app is logging it to logstash. I was also thinking that this shouldn't cause ES to fail recovery (as you said). The index type for logstash is created and managed by logstash itself; for our other indexes we just use the default type. This sounds like it is unrelated, so if required I'll log a separate bug for the recovery handling rather than pollute this ticket.
@sammcj it may be unrelated. When you say the recovery failed - did you check using
@bleskes It looks like ES is permanently stuck initializing shard 2. Index info: https://gist.github.com/sammcj/5128f1479df8754c1bc5 Resulting in:
@sammcj thx. The index info doesn't show any initializing shard. Can you say which command you used? Also, can you post the output of
I'm seeing this as well on a dynamic field mapping that according to
…ing from the API.

We have two completely different code paths for mapping updates, depending on whether they come from the API or are guessed based on the parsed documents. This commit makes dynamic mapping updates execute like updates from the API. The only change in behaviour is that a document that fails parsing can no longer modify mappings (useful to prevent issues such as elastic#9851). Other than that, this change should be fairly transparent to users, but working this way opens the door to other changes such as validating dynamic mapping updates on the master node (elastic#8688).

The way it works internally is that Mapper.parse now returns a Mapper instead of being void. The returned Mapper represents a mapping update that has been performed in order to parse the document. Mapping updates are propagated recursively back to the root mapper, and once parsing is finished we check that the mapping update can be applied, and either fail the parsing if the update cannot be merged (e.g. because of a concurrent mapping update from the API) or merge the update into the mappings.

However, not all mapping updates can be applied recursively; `copy_to`, for instance, can add mappings at totally different places in the tree. Because of this I added ParseContext.rootMapperUpdates, which `copy_to` fills when the field to copy data to does not exist in the mappings yet. These mapping updates are merged with the ones generated by regular parsing.

One particular mapping update was the `auto_boost` setting on the `all` root mapper. Being tricky to work on, I removed it in favour of search-time checks that payloads have been indexed. One interesting side-effect of the change is that concurrency on ObjectMapper is greatly simplified, since we no longer have to care about having concurrent dynamic mapping and API updates.
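The core idea of the commit (parsing returns a mapping *update*, which is applied atomically only if it merges cleanly) can be sketched in a few lines of Python. All names here are invented for illustration, not the actual Elasticsearch classes:

```python
# Toy sketch of "parse returns a mapping update, then merge-or-fail".
# Not Elasticsearch code; types are reduced to number/boolean/string.

class MappingConflict(Exception):
    pass

def parse_document(doc: dict) -> dict:
    """Guess a field type for every field, like dynamic mapping detection."""
    update = {}
    for field, value in doc.items():
        if isinstance(value, bool):          # bool before int: bool is an int subclass
            update[field] = "boolean"
        elif isinstance(value, (int, float)):
            update[field] = "number"
        else:
            update[field] = "string"
    return update

def merge_update(mappings: dict, update: dict) -> dict:
    """Apply the update atomically; reject it if any field type conflicts."""
    for field, ftype in update.items():
        if field in mappings and mappings[field] != ftype:
            raise MappingConflict(f"{field}: {mappings[field]} vs {ftype}")
    merged = dict(mappings)
    merged.update(update)
    return merged

mappings = {}
mappings = merge_update(mappings, parse_document({"status": 200}))
try:
    mappings = merge_update(mappings, parse_document({"status": "OK"}))
    conflicted = False
except MappingConflict:
    conflicted = True  # the document is rejected instead of corrupting mappings
```

The change in behaviour the commit message describes falls out naturally: a document whose update cannot be merged fails, and the existing mappings are left untouched.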
This commit changes dynamic mappings updates so that they are synchronous on the entire cluster and their validity is checked by the master node. There are some important consequences of this commit:
- a failing index request on a non-existing type does not implicitly create the type anymore
- dynamic mappings updates cannot create inconsistent mappings on different shards
- indexing requests that introduce new fields might induce latency spikes because of the overhead to update the mappings on the master node

Close elastic#8688
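The synchronous flow can be sketched as follows: a shard that detects a new field sends the update to the master, and only indexes the document after the master has validated and acknowledged it. This is a toy model with invented names, under the assumption of a single authoritative mapping copy on the master:

```python
# Minimal sketch (invented names, not Elasticsearch internals) of
# master-validated, synchronous dynamic mapping updates.

def guess_type(value) -> str:
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "number"
    return "string"

class Master:
    def __init__(self):
        self.mappings = {}            # single authoritative copy

    def apply_update(self, update: dict) -> bool:
        for field, ftype in update.items():
            if self.mappings.get(field, ftype) != ftype:
                return False          # validation failed: conflicting type
        self.mappings.update(update)
        return True

class Shard:
    def __init__(self, master: Master):
        self.master = master
        self.docs = []

    def index(self, doc: dict) -> bool:
        update = {}
        for field, value in doc.items():
            ftype = guess_type(value)
            existing = self.master.mappings.get(field)
            if existing is None:
                update[field] = ftype          # new field: needs a mapping update
            elif existing != ftype:
                return False                   # doc conflicts with existing mapping
        if update and not self.master.apply_update(update):
            return False                       # master rejected: nothing indexed
        self.docs.append(doc)                  # index only after the master's ack
        return True

master = Master()
s1, s2 = Shard(master), Shard(master)
ok_first = s1.index({"count": 1})      # creates the "count": number mapping
ok_second = s2.index({"count": "x"})   # rejected: mapping already says number
```

Because every shard waits on the same master copy, the two shards can no longer end up with different types for `count`; the conflicting request simply fails.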
Does this problem still exist in ES 1.5.2?
Yes, this fix will be available in 2.0. |
So, is the "dynamic": "unsafe" setting available?
@chenryn no - that was never added. We're always safe.
Today, a new field can be added to the local mapping of two shards simultaneously. If the detected field type is different, then each shard can end up with a different field type. These shards then send their new mapping to the master, but only one of the mappings will win.
This can result in incorrect results and even data loss, e.g.: one shard thinks that the field is a string, and the other shard (and the master) thinks that it is a number. In this case, sorting and aggregations will be wrong (#8485). Then, when the string-shard is allocated to a new node, that node will receive the "number" mapping from the master, and replaying the transaction log can cause shard failures (e.g. #8684). Or new numeric doc values are written but, when we try to merge segments, it fails with a doc values exception (#8009).
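The race described above can be sketched in a few lines. This is toy code, not Elasticsearch: it only shows how independent local type detection plus last-write-wins publication leaves one shard diverged from the master:

```python
# Sketch of the pre-fix race: two shards guess different types for the same
# new field from their local documents, both publish to the master, and only
# the last update "wins".

def guess_type(value) -> str:
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "number"
    return "string"

shard_a_mapping = {"price": guess_type(9.99)}   # this shard saw a number first
shard_b_mapping = {"price": guess_type("N/A")}  # this shard saw a string first

master_mapping = {}
master_mapping.update(shard_a_mapping)  # shard A's update arrives first...
master_mapping.update(shard_b_mapping)  # ...then shard B's overwrites it

# The master now says "string", while shard A still holds "number" locally:
diverged = shard_a_mapping["price"] != master_mapping["price"]
```

It is exactly this divergence that produces the wrong sorts/aggregations and the recovery failures cited above, since a relocated copy of shard A would be rebuilt from the master's conflicting mapping.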
The only way to ensure that dynamic fields are the same on all shards is to wait for the master to acknowledge the mapping changes before indexing the document.
This will potentially slow down indexing when many dynamic fields are being added, but it is the only way to automatically protect against data loss due to mapping conflicts.
Should a user wish to disable waiting for the master (and they are certain that their dynamic mapping rules are good enough to prevent these problems), we should allow them to opt out by setting `"dynamic": "unsafe"`.
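For concreteness, the proposed opt-out would presumably have looked something like the following mapping fragment. Note this is hypothetical: a later comment in this thread confirms the setting was never actually added.

```json
{
  "mappings": {
    "my_type": {
      "dynamic": "unsafe"
    }
  }
}
```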