Unnecessary data copy on index add #13964

Closed
TomaszGaweda opened this Issue Oct 17, 2018 · 0 comments

TomaszGaweda commented Oct 17, 2018

We have the following situation:

  • the cluster contains 16 nodes
  • the cluster is filled with many gigabytes of data
  • one node joins after some time
    -> its code is the same as on the other nodes, so addIndex is called.

The index already exists (it was created previously by the other nodes), so addIndex is a de facto no-op. However, addIndex does a full copy of the entries, which can take a long time. Note that the same is done by MapMigrationAwareService, so the logic is duplicated.

@mmedenjak mmedenjak added this to the 3.12 milestone Oct 17, 2018

taburet added a commit to taburet/hazelcast that referenced this issue Oct 24, 2018

Track indexed partitions to avoid unnecessary reindexing
This change fixes the following issues:

1. When a user invokes IMap.addIndex, AddIndexOperation is sent to all
members and the index in question is fully rebuilt even if it's already
populated. Typically, such addIndex invocations are performed from user
code for each member joining a cluster during the member initialization.
If the cluster is storing a lot of data, rebuilding all of the indexes
may be a very costly operation.

2. When a new member joins a cluster, a local map proxy object is
created on that new member. During the proxy initialization,
AddIndexOperation is sent to all cluster members for each index
configured by the new member XML configuration to ensure the indexes
are present on all members (see MapProxySupport.initializeIndexes).
Basically, we are rebuilding all of the indexes on all of the members.

This change tracks a set of partitions indexed by each index. If a
partition is already indexed by a certain index, the reindexing is
skipped.
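
The tracking idea described above could be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the class `PartitionTrackingIndex` and its methods are hypothetical names, and the index is modeled as a plain in-memory map from attribute value to keys.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for one index on one member; NOT a Hazelcast class.
final class PartitionTrackingIndex {
    // IDs of the partitions whose entries are already in this index.
    private final Set<Integer> indexedPartitions = new HashSet<>();
    // Simplified index structure: attribute value -> keys holding that value.
    private final Map<Object, Set<Object>> valueToKeys = new HashMap<>();
    // Counts how many partitions were actually scanned and indexed.
    private int rebuildCount;

    /**
     * Indexes the given partition's entries unless the partition is already
     * marked as indexed. Returns true if any indexing work was performed.
     */
    boolean indexPartition(int partitionId, Map<?, ?> entries) {
        if (!indexedPartitions.add(partitionId)) {
            return false; // already indexed: skip the costly full scan
        }
        for (Map.Entry<?, ?> e : entries.entrySet()) {
            valueToKeys.computeIfAbsent(e.getValue(), v -> new HashSet<>())
                       .add(e.getKey());
        }
        rebuildCount++;
        return true;
    }

    /** Returns the keys indexed under the given value (empty if none). */
    Set<Object> keysFor(Object value) {
        return valueToKeys.getOrDefault(value, Collections.emptySet());
    }

    int rebuildCount() {
        return rebuildCount;
    }
}
```

With this shape, a repeated add-index request for a partition that is already indexed returns immediately instead of rescanning its entries. The sketch leaves out what happens when a partition migrates away from a member, which a real implementation would need to handle.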

Fixes: hazelcast#13964

@taburet taburet self-assigned this Oct 24, 2018

@dbrimley dbrimley modified the milestones: 3.12, 3.13 Nov 8, 2018

@taburet taburet modified the milestones: 3.13, 3.12 Dec 10, 2018

taburet added a commit that referenced this issue Dec 21, 2018

Track indexed partitions to avoid unnecessary reindexing (#13984)
Fixes: #13964