Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed every single index operation to not replace the index within the original request #7223

Conversation

Projects
None yet
3 participants
@javanna
Copy link
Member

javanna commented Aug 11, 2014

An anti-pattern that we have in our code, noticeable if you use java API, is that we modify incoming requests by replacing the index or alias with the concrete index. This way not only the original user request has changed, but all following communications that use that request will lose the information on whether the original request was performed against an alias or an index, and its name.

Refactored the following base classes: TransportShardReplicationOperationAction, TransportShardSingleOperationAction, TransportSingleCustomOperationAction, TransportInstanceSingleOperationAction and all subclasses by introducing an InternalRequest object that holds the original request plus additional info (e.g. the concrete index). This internal request doesn't get sent over the transport but rebuilt on each node on demand (not different to what currently happens anyway, as concrete index gets re-set on each node). When the request becomes a shard level request, instead of using the only int shardId we serialize the ShardId that contains both concrete index name (which might then differ from the original one within the request) and shard id.

Using this pattern we can move get, multi_get, explain, analyze, term_vector, multi_term_vector, index, delete, update, bulk to not replace the index name with the concrete one within the request. The index name within the original request will stay the same.

Made it also clearer within the different transport actions when the index needs to be resolved (user facing requests) and when that's not needed (e.g. shard level request), by exposing resolveIndex method. Moved check block methods to parent classes as their content was always the same on every subclass.

Improved existing tests by randomly introducing the use of an alias, and verifying that the responses always contain the concrete index name and not the original one, as that's the expected behaviour.

Added backwards compatibility tests to make sure that the change is applied in a backwards compatible manner.

@javanna javanna self-assigned this Aug 11, 2014

@jpountz

View changes

src/main/java/org/elasticsearch/action/bulk/TransportBulkAction.java Outdated
this.metaData = metaData;
}

String put(String index, IndicesOptions indicesOptions) {

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 11, 2014

Contributor

This class confuses me a bit since it is not really a hash table and this method is not really a put. Maybe not extend HashMap and have more explicit method name?

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 11, 2014

Contributor

(but it could wrap a hashmap)

This comment has been minimized.

Copy link
@javanna

javanna Aug 11, 2014

Author Member

sure, I agree, will change

@jpountz

View changes

...a/org/elasticsearch/action/support/replication/TransportShardReplicationOperationAction.java Outdated
@@ -786,4 +796,25 @@ public Object payload() {
return payload;
}
}

protected class InternalRequest {

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 11, 2014

Contributor

can it be static and have some doc?

This comment has been minimized.

Copy link
@javanna

javanna Aug 11, 2014

Author Member

don't think it can be static, cause it uses the Request type. Will add some javadocs

@@ -323,7 +329,6 @@ public void writeTo(StreamOutput out) throws IOException {
}

public void start() {
observer = new ClusterStateObserver(clusterService, request.timeout(), logger);

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 11, 2014

Contributor

Is this change right? (the move to the constructor) start can never be called several times?

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 11, 2014

Contributor

If it can be done as part of another PR I would like it better :)

This comment has been minimized.

Copy link
@javanna

javanna Aug 11, 2014

Author Member

right...it was needed on a previous version, not needed anymore, will revert

@jpountz

View changes

...ava/org/elasticsearch/action/support/single/custom/TransportSingleCustomOperationAction.java Outdated
@@ -46,6 +48,8 @@
*/
public abstract class TransportSingleCustomOperationAction<Request extends SingleCustomOperationRequest, Response extends ActionResponse> extends TransportAction<Request, Response> {

protected static final ShardId NO_SHARD_ID = new ShardId("_none", -1);

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 11, 2014

Contributor

I am a little worried that such a variable could leak. Would null be suitable instead? Or should it extend writeTo and readFrom to throw an assertion error?

This comment has been minimized.

Copy link
@javanna

javanna Aug 11, 2014

Author Member

I used this to actually avoid using the null invariant....but I could use null, sure

This comment has been minimized.

Copy link
@javanna

javanna Aug 11, 2014

Author Member

What do you mean by leaking in this context though?

@jpountz

View changes

...ava/org/elasticsearch/action/support/single/custom/TransportSingleCustomOperationAction.java Outdated
}
}

protected class InternalRequest {

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 11, 2014

Contributor

Can it be static?

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 11, 2014

Contributor

And there are several similar classes like this, should there be a single one?

This comment has been minimized.

Copy link
@javanna

javanna Aug 11, 2014

Author Member

good point, not sure, the small difference between the different variants is whether concreteIndex is final or not, which depends on when we call concreteIndices in the corresponding transport action. The other matter is visibility...in order to share it we would need to make it public. Shall we do it or not?

this.listener = listener;
}

public void start() {
observer = new ClusterStateObserver(clusterService, request.timeout(), logger);

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 11, 2014

Contributor

is it correct?

This comment has been minimized.

Copy link
@javanna

javanna Aug 11, 2014

Author Member

again, not needed, will revert. I'd love to have a failing test though if I break something like this without knowing :) must be hard to write though

@jpountz

This comment has been minimized.

Copy link
Contributor

jpountz commented Aug 11, 2014

I left some questions, but big +1 on the proposed refactoring!

@jpountz jpountz removed the review label Aug 11, 2014

@javanna

This comment has been minimized.

Copy link
Member Author

javanna commented Aug 12, 2014

Updated according to feedback I got

@jpountz

This comment has been minimized.

Copy link
Contributor

jpountz commented Aug 12, 2014

LGTM

@jpountz jpountz removed the review label Aug 12, 2014

Internal: changed every single index operation to not replace the ind…
…ex within the original request

An anti-pattern that we have in our code, noticeable for java API users, is that we modify incoming requests by replacing the index or alias with the concrete index. This way not only the request has changed, but all following communications that use that request will lose the information on whether the original request was performed against an alias or an index.

Refactored the following base classes: `TransportShardReplicationOperationAction`, `TransportShardSingleOperationAction`, `TransportSingleCustomOperationAction`, `TransportInstanceSingleOperationAction` and all subclasses by introduced an InternalRequest object that contains the original request plus additional info (e.g. the concrete index). This internal request doesn't get sent over the transport but rebuilt on each node on demand (not different to what currently happens anyway, as concrete index gets set on each node). When the request becomes a shard level request, instead of using the only int shardId we serialize the ShardId that contains both concrete index name (which might then differ ffrom the original one within the request) and shard id.

Using this pattern we can move get, multi_get, explain, analyze, term_vector, multi_term_vector, index, delete, update, bulk to not replace the index name with the concrete one within the request. The index name within the original request will stay the same.

Made it also clearer within the different transport actions when the index needs to be resolved and when that's not needed (e.g. shard level request), by exposing `resolveIndex` method. Moved check block methods to parent classes as their content was always the same on every subclass.

Improved existing tests by randomly introducing the use of an alias, and verifying that the responses always contain the concrete index name and not the original one, as that's the expected behaviour.

Added backwards compatibility tests to make sure that the change is applied in a backwards compatible manner.

Closes #7223

@javanna javanna closed this in 5d987ad Aug 12, 2014

javanna added a commit that referenced this pull request Aug 14, 2014

javanna added a commit that referenced this pull request Aug 14, 2014

javanna added a commit that referenced this pull request Sep 8, 2014

Internal: changed every single index operation to not replace the ind…
…ex within the original request

An anti-pattern that we have in our code, noticeable for java API users, is that we modify incoming requests by replacing the index or alias with the concrete index. This way not only the request has changed, but all following communications that use that request will lose the information on whether the original request was performed against an alias or an index.

Refactored the following base classes: `TransportShardReplicationOperationAction`, `TransportShardSingleOperationAction`, `TransportSingleCustomOperationAction`, `TransportInstanceSingleOperationAction` and all subclasses by introduced an InternalRequest object that contains the original request plus additional info (e.g. the concrete index). This internal request doesn't get sent over the transport but rebuilt on each node on demand (not different to what currently happens anyway, as concrete index gets set on each node). When the request becomes a shard level request, instead of using the only int shardId we serialize the ShardId that contains both concrete index name (which might then differ ffrom the original one within the request) and shard id.

Using this pattern we can move get, multi_get, explain, analyze, term_vector, multi_term_vector, index, delete, update, bulk to not replace the index name with the concrete one within the request. The index name within the original request will stay the same.

Made it also clearer within the different transport actions when the index needs to be resolved and when that's not needed (e.g. shard level request), by exposing `resolveIndex` method. Moved check block methods to parent classes as their content was always the same on every subclass.

Improved existing tests by randomly introducing the use of an alias, and verifying that the responses always contain the concrete index name and not the original one, as that's the expected behaviour.

Added backwards compatibility tests to make sure that the change is applied in a backwards compatible manner.

Closes #7223

javanna added a commit that referenced this pull request Sep 8, 2014

@clintongormley clintongormley changed the title Internal: changed every single index operation to not replace the index within the original request Changed every single index operation to not replace the index within the original request Jun 7, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.