[Do not merge] First cut at dynamic sharding support #392

Closed


emmanuelbernard
Member

I need to write tests to finish this up, but @Sanne could you give me your first impression of the proposal? I did not have to break any existing public interface. I even reuse the IndexShardingStrategy.

If you have some advice on making the structure more scalable concurrency-wise, I'm interested too.

@ghost ghost assigned Sanne Apr 6, 2013
@Sanne
Member

Sanne commented Apr 8, 2013

+1 very clever approach :)

Generally speaking, don't you think the ShardIdentifierProvider should supersede the IndexShardingStrategy?
It's great we have a backwards compatible migration path with the dual-design but I'm wondering if we shouldn't deprecate IndexShardingStrategy or if it makes sense to keep both around.

I added some more details in the code as comments.

@Sanne Sanne closed this Apr 8, 2013
IndexManagerHolder indexManagerHolder,
IndexManagerFactory indexManagerFactory) {
if ( !isDynamicSharding && providers.length == 0 ) {
throw log.entityWithNoShard( type );
Member

Why is a zero-shards entity invalid?
I'm wondering about the multitenancy use case, if the application should not be able to deploy with an initial state of zero tenants.

Queries would return zero elements, while on an add operation the ShardIdentifierProvider has to return an id anyway, which would trigger creation of an IndexManager.

Member Author

Zero shards is not legal for static sharding, as it will always stay zero :). In the case of dynamic sharding (value set to dynamic), we don't throw the exception.

Member

Right :)
I misinterpreted the condition.
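
For readers following this thread, the condition under discussion can be sketched in isolation. This is only an illustration, not the PR's code: the class name `ShardCountCheckSketch`, the `validate` method, and the use of `IllegalStateException` in place of `log.entityWithNoShard( type )` are all assumptions made for the example.

```java
import java.util.List;

public class ShardCountCheckSketch {

    /**
     * Hypothetical stand-in for the check in the diff above.
     * Static sharding with zero shards can never gain a shard later, so it is rejected.
     * Dynamic sharding may legitimately start with zero shards (for example, zero tenants).
     */
    static void validate(boolean isDynamicSharding, List<String> staticShards, String entityType) {
        if (!isDynamicSharding && staticShards.isEmpty()) {
            // The real code throws log.entityWithNoShard( type ); this exception is a placeholder.
            throw new IllegalStateException("Entity " + entityType + " has no shard");
        }
    }
}
```

With this shape, an application using dynamic sharding can deploy with an empty initial state, which is the multitenancy case raised above.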

@Sanne
Member

Sanne commented Apr 8, 2013

I've added many more comments on the code - this is just to make sure you see them as I'm not sure it will notify you as I closed the issue initially.

@Sanne Sanne reopened this Apr 8, 2013
@Sanne Sanne closed this Apr 8, 2013
@emmanuelbernard
Member Author

I had not thought about deprecating IndexShardingStrategy, but looking at it now, I cannot find a good reason to keep it, except maybe to eagerly initialize the IndexManagers. If the lazy code is good enough, that might not be necessary.

@Sanne
Member

Sanne commented Apr 9, 2013

We could use ShardIdentifierProvider#getAllShardIdentifiers for that?

Since we validate IndexManager configuration options only when it's started, it would be good to eagerly initialize those for which it's possible.
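
A rough sketch of that idea, assuming Java 8 or later: initialize eagerly whatever the provider already knows about, and fall back to lazy, thread-safe creation for identifiers that appear later. Only `getAllShardIdentifiers` is named in this thread; the simplified `ShardIdentifierProvider` interface, `FakeIndexManager`, and `ShardRegistry` below are hypothetical stand-ins, not Hibernate Search types.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class EagerShardInitSketch {

    /** Hypothetical, simplified view of the provider discussed in this thread. */
    interface ShardIdentifierProvider {
        Set<String> getAllShardIdentifiers();
    }

    /** Hypothetical stand-in for an IndexManager; construction is where configuration would be validated. */
    static final class FakeIndexManager {
        final String indexName;

        FakeIndexManager(String indexName) {
            this.indexName = indexName;
        }
    }

    /** Holds one manager per shard identifier. */
    static final class ShardRegistry {
        private final ConcurrentMap<String, FakeIndexManager> managers = new ConcurrentHashMap<>();
        private final String baseName;

        ShardRegistry(String baseName, ShardIdentifierProvider provider) {
            this.baseName = baseName;
            // Eager part: start every shard the provider already knows, so bad configuration fails early.
            for (String id : provider.getAllShardIdentifiers()) {
                getOrCreate(id);
            }
        }

        /** Lazy part: computeIfAbsent creates at most one manager per identifier, even under contention. */
        FakeIndexManager getOrCreate(String shardId) {
            return managers.computeIfAbsent(shardId, id -> new FakeIndexManager(baseName + "." + id));
        }

        int size() {
            return managers.size();
        }
    }
}
```

Using `ConcurrentHashMap.computeIfAbsent` is one possible answer to the concurrency question in the PR description; it avoids a global lock on the read path.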

@Sanne
Member

Sanne commented Apr 9, 2013

Different thought: looks like the user code should be able to interact directly with the ShardIdentifierProvider instance, right?

I mean in the multi-tenant use case you want to be able to let it know you are creating a new tenant.

another nice use case coming to mind is per-language independent indexes.. would it make sense to combine two levels of dynamic sharding?

@emmanuelbernard
Member Author

Different thought: looks like the user code should be able to interact directly with the ShardIdentifierProvider instance, right?
I mean in the multi-tenant use case you want to be able to let it know you are creating a new tenant.

Yes, in an ideal world ShardIdentifierProvider could have CDI injection points, but in the worst case thread-locals could be used. You're saying we should provide access to the ShardIdentifierProvider instance so that a user could call specific methods on it:

MyImpl provider = (MyImpl) searchFactory.getMetadata().getEntityMetadata(User.class).getShardIdentifierProvider();

Not entirely satisfactory; it would be nice to enlist a callback.

another nice use case coming to mind is per-language independent indexes.. would it make sense to combine two levels of dynamic sharding?

Do we need to have built-in layers of sharding, or leave that to the ShardIdentifierProvider implementor? It seems to be the latter.
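
To illustrate leaving it to the implementor, here is a minimal sketch of how a single provider could fold two levels (tenant and language) into one shard identifier. Everything here is hypothetical: the class `TwoLevelShardingSketch`, its method names, and the `tenant_language` naming scheme are assumptions for the example, and it assumes tenant ids never contain an underscore.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

public class TwoLevelShardingSketch {

    // Thread-safe set of every identifier handed out so far.
    private final Set<String> known = ConcurrentHashMap.newKeySet();

    /** Combines both levels into one flat identifier and remembers it. */
    String shardFor(String tenantId, String language) {
        String id = tenantId + "_" + language;
        known.add(id);
        return id;
    }

    /** Shards a query restricted to one tenant would need to target, across all languages. */
    Set<String> shardsForTenant(String tenantId) {
        String prefix = tenantId + "_";
        Set<String> result = new TreeSet<>();
        for (String id : known) {
            if (id.startsWith(prefix)) {
                result.add(id);
            }
        }
        return result;
    }

    /** Snapshot of all identifiers seen so far. */
    Set<String> allShards() {
        return new TreeSet<>(known);
    }
}
```

The framework only ever sees flat identifiers in this sketch, so no built-in notion of layers would be required.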

@emmanuelbernard
Member Author

I could not find a good reason to keep it except maybe to eagerly initialize the IndexManagers

It turns out that when building the SF, we call EIB.getIndexManagers() and thus eagerly initialize the indexes, so we are good.

@Sanne
Member

Sanne commented Apr 9, 2013

What you called

searchFactory.getMetadata().getEntityMetadata(User.class)

is roughly available today as

org.hibernate.search.spi.SearchFactoryIntegrator.getIndexBindingForEntity(Class<?>)

but I agree that getMetadata() looks better; we just don't have that yet.

@Sanne
Member

Sanne commented Apr 10, 2013

From today's forum posts, I'm convinced it would be awesome to have a dynamic sharding option working out of the box for multiple languages, extending the use case we documented for dynamic analyzers:

http://docs.jboss.org/hibernate/search/4.2/reference/en-US/html_single/#d0e3840

The tricky part for the above link is actually running queries on the appropriate index: that needs a shard-sensitive filter.
http://docs.jboss.org/hibernate/search/4.2/reference/en-US/html_single/#query-filter-shard
