Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make index uuid available in Index, ShardRouting & ShardId #16217

Closed
wants to merge 11 commits into from

Conversation

Projects
None yet
3 participants
@bleskes
Copy link
Member

commented Jan 25, 2016

In the early days Elasticsearch used to use the index name as the index identity. Around 1.0.0 we introduced a unique index uuid which is stored in the index setting. Since then we used that uuid in a few places but it is by far not the main identifier when working with indices, partially because it's not always readily available in all places.

This PR start to make a move in the direction of using uuids instead of name by making sure that the uuid is available on the Index class (currently just a wrapper around the name) and as such also available via ShardRouting and ShardId.

Note that this is by no means an attempt to do the right thing with the uuid in all places. In almost all places it falls back to the name based comparison that was done before. It is meant as a first step towards slowly improving the situation.

PS. This came from starting to think about implementing storing indices on disk using folders NAME_UUID pattern (instead of just name as we do today) and thus being able to avoid collision between indices that are named the same (but deleted and recreated). The hope is that this will als remove the need for the in memory shard locks we have now. The problem was that I didn't have easy access to the uuid in all places.

While the tests are not all passing ( 99% of them are). I think this is ready to get initial feedback.

@bleskes bleskes changed the title Make index uuid available on ShardRouting, ShardId Make index uuid available in Index, ShardRouting & ShardId Jan 26, 2016

@s1monw

View changes

core/src/main/java/org/elasticsearch/index/Index.java Outdated

private Index() {

}

public Index(String name) {
public Index(String name, String uuid) {

This comment has been minimized.

Copy link
@s1monw

s1monw Jan 26, 2016

Contributor

can we please get rid of the interning? I think we should do this in a sep PR before this get's merged

This comment has been minimized.

Copy link
@bleskes

bleskes Jan 26, 2016

Author Member

sure - didn't want to make things controversial. I'll make a PR to remove the name.intern()

@s1monw

View changes

core/src/main/java/org/elasticsearch/index/Index.java Outdated
@@ -31,21 +32,23 @@
public class Index implements Streamable {

private String name;

This comment has been minimized.

Copy link
@s1monw

s1monw Jan 26, 2016

Contributor

also make sure these are both final and impl Writeable as we are at it? I did that before it's not too much of an overhead

This comment has been minimized.

Copy link
@bleskes

bleskes Jan 26, 2016

Author Member

sure - wanted to keep this as small as possible (it's big enough :)). I'll tackle writable too.

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Jan 26, 2016

I love this - infact I started the same effort last week but didn't get too far time wise. I am +100 on this. I really think we have to try to get rid of the _na_ as much as possible or at least use assertions to ensure we never get it when we really need a real uuid! please get this in ASAP :)

@bleskes bleskes referenced this pull request Jan 26, 2016

Closed

remove string interning #16227

bleskes added some commits Jan 24, 2016

@bleskes bleskes force-pushed the bleskes:index_uuid branch to 7d05214 Jan 26, 2016

@bleskes

This comment has been minimized.

Copy link
Member Author

commented Jan 26, 2016

@s1monw I rebased on master and addressed your comments. All test pass now. Can you take another look?

bleskes added some commits Jan 26, 2016

}
} else {
minimumCompatibleLuceneVersion = null;
}

return new IndexMetaData(index, version, state, numberOfShards, numberOfReplicas, tmpSettings, mappings.build(),
final String uuid = settings.get(SETTING_INDEX_UUID, INDEX_UUID_NA_VALUE);

This comment has been minimized.

Copy link
@s1monw

s1monw Jan 27, 2016

Contributor

I am not sure but shouldn't this one be a generated one if it's not in the settings?

This comment has been minimized.

Copy link
@bleskes

bleskes Jan 27, 2016

Author Member

At least at the moment the uuid generation is done on MetaDataCreateIndexService#createIndex - here we just create a IndexMetaData object, not knowing if it's new or old. The way I did it mimics what the old code did (by return this from the settings directly). I want to keep the semantic change as small as possible.

if (!shardId.getIndex().equals(restoreSource.index())) {
snapshotShardId = new ShardId(restoreSource.index(), shardId.id());
if (!shardId.getIndexName().equals(restoreSource.index())) {
snapshotShardId = new ShardId(restoreSource.index(), IndexMetaData.INDEX_UUID_NA_VALUE, shardId.id());

This comment has been minimized.

Copy link
@s1monw

s1monw Jan 27, 2016

Contributor

should we preserve the UUID from the incoming shard?

This comment has been minimized.

Copy link
@bleskes

bleskes Jan 27, 2016

Author Member

Good question. I wasn’t sure of the exact semantics - we do change the index name, at least as far as identity is now, it’s different for many things so I figured it’s “safer” to indicate this as an I don’t know UUID…

On 27 Jan 2016, at 14:05, Simon Willnauer notifications@github.com wrote:

In core/src/main/java/org/elasticsearch/index/shard/StoreRecovery.java:

@@ -254,8 +255,8 @@ private void restore(final IndexShard indexShard, final IndexShardRepository ind
translogState.totalOperationsOnStart(0);
indexShard.prepareForIndexRecovery();
ShardId snapshotShardId = shardId;

  •        if (!shardId.getIndex().equals(restoreSource.index())) {
    
  •            snapshotShardId = new ShardId(restoreSource.index(), shardId.id());
    
  •        if (!shardId.getIndexName().equals(restoreSource.index())) {
    
  •            snapshotShardId = new ShardId(restoreSource.index(), IndexMetaData.INDEX_UUID_NA_VALUE, shardId.id());
    

should we preserve the UUID from the incoming shard?


Reply to this email directly or view it on GitHub.

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Jan 27, 2016

@bleskes LGTM I left minor comments in the code, can we maybe also add an assertion that the uuid is a real UUID when we create IndexService? that would be awesome. No need for another review!!! thanks so much for doing this

@bleskes bleskes closed this in 2a137b5 Jan 28, 2016

@@ -76,7 +76,7 @@ public boolean equals(Object o) {
if (this == o) return true;
if (o == null) return false;
ShardId shardId1 = (ShardId) o;
return shardId == shardId1.shardId && index.name().equals(shardId1.index.name());
return shardId == shardId1.shardId && index.getName().equals(shardId1.index.getName());

This comment has been minimized.

Copy link
@jasontedor

jasontedor Jan 29, 2016

Member

I think this should be:

        return shardId == shardId1.shardId && index.equals(shardId1.index);
@@ -112,7 +112,7 @@ public void writeTo(StreamOutput out) throws IOException {
@Override
public int compareTo(ShardId o) {
if (o.getId() == shardId) {
return index.name().compareTo(o.getIndex());
return index.getName().compareTo(o.getIndex().getName());

This comment has been minimized.

Copy link
@jasontedor

jasontedor Jan 29, 2016

Member

I think this should be:

            int compare = index.getName().compareTo(o.getIndex().getName());
            if (compare != 0) {
                return compare;
            }
            return index.getUUID().compareTo(o.getIndex().getUUID());
@@ -250,7 +250,7 @@ public void testDeleteShardState() throws IOException {
ShardStateMetaData shardStateMetaData = load(logger, env.availableShardPaths(shard.shardId));
assertEquals(shardStateMetaData, getShardStateMetadata(shard));

routing = TestShardRouting.newShardRouting(shard.shardId.index().getName(), shard.shardId.id(), routing.currentNodeId(), null, routing.primary(), ShardRoutingState.INITIALIZING, shard.shardRouting.allocationId(), shard.shardRouting.version() + 1);
routing = TestShardRouting.newShardRouting(shard.shardId.getIndexName(), shard.shardId.id(), routing.currentNodeId(), null, routing.primary(), ShardRoutingState.INITIALIZING, shard.shardRouting.allocationId(), shard.shardRouting.version() + 1);

This comment has been minimized.

Copy link
@jasontedor

jasontedor Jan 29, 2016

Member

I think this should be

        routing = TestShardRouting.newShardRouting(shard.shardId.getIndex(), shard.shardId.id(), routing.currentNodeId(), null, routing.primary(), ShardRoutingState.INITIALIZING, shard.shardRouting.allocationId(), shard.shardRouting.version() + 1);
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.