Refactor state format to use incremental state IDs #10316

Merged
merged 1 commit into from Mar 31, 2015

Projects

None yet

3 participants

@s1monw
Contributor
s1monw commented Mar 30, 2015

Today there is a chance that the state version for shard, index or cluster
state goes backwards or is reset on a full restart etc. depending on
several factors not related to the state. To prevent any collisions
with already existing state files and to maintain write-once properties
this change introductes an incremental state ID instead of using the plain
state version. This also fixes a bug when the previous legacy state had a
greater version than the current state which causes an exception on node
startup or if left-over files are present.

@bleskes bleskes and 1 other commented on an outdated diff Mar 30, 2015
...va/org/elasticsearch/gateway/MetaDataStateFormat.java
this.format = format;
- this.deleteOldFiles = deleteOldFiles;
+ this.prefix = prefix;
+ this.stateFilePattern = Pattern.compile(prefix + "(\\d+)(" + MetaDataStateFormat.STATE_FILE_EXTENSION + ")?");
@bleskes
bleskes Mar 30, 2015 Member

can we use Pattern.qoute for the prefix? just being paranoid..

@s1monw
s1monw Mar 30, 2015 Contributor

sure

@s1monw
Contributor
s1monw commented Mar 30, 2015

@bleskes pushed a new commit

@bleskes bleskes commented on an outdated diff Mar 30, 2015
...va/org/elasticsearch/gateway/MetaDataStateFormat.java
@@ -210,25 +210,47 @@ public boolean accept(Path entry) throws IOException {
}
}
+ long findMaxStateId(final String prefix, Path... locations) throws IOException {
+ final DirectoryStream.Filter<Path> filter = new DirectoryStream.Filter<Path>() {
+ @Override
+ public boolean accept(Path entry) throws IOException {
+ final String entryFileName = entry.getFileName().toString();
+ return Files.isRegularFile(entry)
+ && entryFileName.startsWith(prefix); // only state files
+ }
+ };
+ long maxId = -1;
+ // now clean up the old files
@bleskes
bleskes Mar 30, 2015 Member

left over comment.

@bleskes bleskes and 1 other commented on an outdated diff Mar 30, 2015
...va/org/elasticsearch/gateway/MetaDataStateFormat.java
Preconditions.checkArgument(locations != null, "Locations must not be null");
Preconditions.checkArgument(locations.length > 0, "One or more locations required");
- String fileName = prefix + version + STATE_FILE_EXTENSION;
+ final long maxStateId = Math.max(findMaxStateId(prefix, locations)+1, version);
@bleskes
bleskes Mar 30, 2015 Member

I don't feel comfortable with putting the version in the mix. I think I see why - the file name will be the version most of the time, but it is not guaranteed which makes me wonder if it's worth it. Maybe we should have prefix.version.id.state where id is always incremented?

@s1monw
s1monw Mar 30, 2015 Contributor

I really want to prevent yet another file name format. I think one ID is fine we can also just use -1 as the base and increment it on each new state. but not another format please. I think it really doesn't matter since the ID is only used to find the latest nothing todo with the actual version

@bleskes
bleskes Mar 30, 2015 Member

OK. so let's just remove the version part?

@s1monw
s1monw Mar 31, 2015 Contributor

+1

@bleskes
Member
bleskes commented Mar 30, 2015

I left one comment regarding the use of versions in determining the counter. Another thing I was wondering about is how we deal with the scenario 1.5.0 can live us in - a higher id legacy file, with a lower id and non legacy file.

@s1monw
Contributor
s1monw commented Mar 30, 2015

I left one comment regarding the use of versions in determining the counter. Another thing I was wondering about is how we deal with the scenario 1.5.0 can live us in - a higher id legacy file, with a lower id and non legacy file.

we don't deal with that at all. IMO this requires user interaction - no way to resolve it automatically and I don't think we should unless it's evident that there is a real problem here that happens regularly.

@bleskes
Member
bleskes commented Mar 30, 2015

I don't think we should unless it's evident that there is a real problem here that happens regularly.

Fair enough. Let's wait and see.

@s1monw
Contributor
s1monw commented Mar 31, 2015

@bleskes can you take another look?

@bleskes bleskes and 1 other commented on an outdated diff Mar 31, 2015
...va/org/elasticsearch/gateway/MetaDataStateFormat.java
* the given one.
*/
- private static final class VersionAndLegacyPredicate implements Predicate<PathAndVersion> {
- private final long version;
+ private static final class VersionAndLegacyPredicate implements Predicate<PathAndStateId> {
@bleskes
bleskes Mar 31, 2015 Member

rename this to StateIdAndLegacyPredicate?

@s1monw
s1monw Mar 31, 2015 Contributor

yeah good catch

@bleskes
Member
bleskes commented Mar 31, 2015

LGTM

@s1monw s1monw [STATE] Refactor state format to use incremental state IDs
Today there is a chance that the state version for shard, index or cluster
state goes backwards or is reset on a full restart etc. depending on
several factors not related to the state. To prevent any collisions
with already existing state files and to maintain write-once properties
this change introductes an incremental state ID instead of using the plain
state version. This also fixes a bug when the previous legacy state had a
greater version than the current state which causes an exception on node
startup or if left-over files are present.

Closes #10316
78d86bc
@s1monw s1monw merged commit 78d86bc into elastic:master Mar 31, 2015

1 check passed

CLA Commit author is a member of Elasticsearch
Details
@s1monw s1monw deleted the s1monw:refactor_state_format branch Mar 31, 2015
@s1monw s1monw added a commit to s1monw/elasticsearch that referenced this pull request Apr 1, 2015
@s1monw s1monw [STATE] Refactor state format to use incremental state IDs
Today there is a chance that the state version for shard, index or cluster
state goes backwards or is reset on a full restart etc. depending on
several factors not related to the state. To prevent any collisions
with already existing state files and to maintain write-once properties
this change introductes an incremental state ID instead of using the plain
state version. This also fixes a bug when the previous legacy state had a
greater version than the current state which causes an exception on node
startup or if left-over files are present.

Closes #10316
f6cc8d9
@s1monw s1monw added a commit to s1monw/elasticsearch that referenced this pull request Apr 1, 2015
@s1monw s1monw [STATE] Refactor state format to use incremental state IDs
Today there is a chance that the state version for shard, index or cluster
state goes backwards or is reset on a full restart etc. depending on
several factors not related to the state. To prevent any collisions
with already existing state files and to maintain write-once properties
this change introductes an incremental state ID instead of using the plain
state version. This also fixes a bug when the previous legacy state had a
greater version than the current state which causes an exception on node
startup or if left-over files are present.

Closes #10316
07079a4
@clintongormley clintongormley changed the title from [STATE] Refactor state format to use incremental state IDs to Refactor state format to use incremental state IDs May 29, 2015
@mute mute pushed a commit to mute/elasticsearch that referenced this pull request Jul 29, 2015
@s1monw s1monw [STATE] Refactor state format to use incremental state IDs
Today there is a chance that the state version for shard, index or cluster
state goes backwards or is reset on a full restart etc. depending on
several factors not related to the state. To prevent any collisions
with already existing state files and to maintain write-once properties
this change introductes an incremental state ID instead of using the plain
state version. This also fixes a bug when the previous legacy state had a
greater version than the current state which causes an exception on node
startup or if left-over files are present.

Closes #10316
1e918c4
@clintongormley clintongormley removed the review label Aug 7, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment