ISPN-11129 High Availability for non-shared indexes on DIST caches #7792

Merged

gustavocoding

@yrodiere
Contributor

yrodiere commented Feb 4, 2020

This should not prevent you from merging this PR at all, but I was wondering: in Hibernate Search 6, you might want to do the "segment id" filtering in a more transparent fashion?

In Elasticsearch, there is the concept of "routing". Essentially a routing key is similar to your concept of segment ID.
The same concept of routing key is available in Search 6, and I think we should be able to change a few things to do exactly what you need. You'd then be able to get rid of all the code that adds custom fields in Infinispan. You may also be able to get rid of the changes in the KeyTransformationHandler.

Off the top of my head, here are the changes we would need in Hibernate Search 6. They're actually quite reasonable:

  • For the Lucene backend, we'd need to index the routing key: currently it's just used for routing, not indexed.
  • For the Lucene and Elasticsearch backends, we'd need to automatically add a filter to the query when routing keys are specified.
  • For the Lucene and Elasticsearch backends, we'd need to offer a way to retrieve the routing key of a particular search hit... maybe? Not sure you need this.
  • For the Lucene backend, we'd need to introduce a new sharding strategy where routing keys are enabled but only used as discriminators, not for actual sharding. Or you could just use the "hash" sharding strategy and set the number of shards to 1.
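
To make the last bullet concrete, here is a minimal sketch of a routing key that acts only as a discriminator. Everything in it is a hypothetical stand-in written in plain Java (`RoutingSketch`, `Doc`, `shardFor`, `search`); none of it is Hibernate Search API. It assumes a "hash" style strategy that picks a shard from the key's hash code, and shows that with a single shard the key no longer affects placement but can still be used to filter hits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in types; none of these are Hibernate Search classes.
class RoutingSketch {

    // A document that keeps its routing key as an indexed, filterable value.
    static final class Doc {
        final String id;
        final String routingKey;

        Doc(String id, String routingKey) {
            this.id = id;
            this.routingKey = routingKey;
        }
    }

    // Assumed "hash" sharding: with numberOfShards == 1 every document lands
    // in shard 0, so the routing key no longer decides placement.
    static int shardFor(String routingKey, int numberOfShards) {
        return Math.floorMod(routingKey.hashCode(), numberOfShards);
    }

    // The automatic filter: keep only hits whose routing key was requested.
    // An empty set of routing keys means "no filtering".
    static List<Doc> search(List<Doc> shard, Set<String> routingKeys) {
        List<Doc> hits = new ArrayList<>();
        for (Doc doc : shard) {
            if (routingKeys.isEmpty() || routingKeys.contains(doc.routingKey)) {
                hits.add(doc);
            }
        }
        return hits;
    }
}
```

In this sketch, calling `search` with the routing keys of the locally owned segments would play the role of the segment ID filter discussed in this PR.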

@gustavocoding
Author

in Hibernate Search 6, you might want to do this filtering in a more transparent fashion?

Certainly, but as this fix is supposed to be supported in HSearch 5 (we are going to backport it to Infinispan 10.1), it was done with what HSearch 5 supported.

@yrodiere
Contributor

yrodiere commented Feb 4, 2020

Sure. I was just fishing to see if it makes sense to work on it in Search 6.

Created HSEARCH-3824.

@gustavocoding
Author

gustavocoding commented Feb 4, 2020

The requirements for this extra field are:

  • Numeric integer field with low cardinality [0..256]
  • This field is not part of any explicit mapping and applies to every entity type
  • I'd need to be able to define its value for each indexed document, which is calculated from the Infinispan key in its Object form (this same object is indexed as a string documentId after being passed through the KeyTransformationHandler). The calculation involves some internal Infinispan components.
  • Any query may optionally be filtered by one or more values of this field
  • There should be a way to delete from the index entries that match one or more values of this field

I am not familiar with the routing field. Would it be able to do all of the above?
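
As a rough illustration of the requirements listed above, here is a small in-memory sketch in plain Java. All names (`SegmentIndexSketch`, `segmentOf`, `deleteSegments`) are hypothetical, the segment count of 256 is an assumption, and `hashCode()` modulo the segment count is only a placeholder for the real calculation, which involves internal Infinispan components. `String.valueOf` likewise stands in for the KeyTransformationHandler.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of an index whose documents each carry a segment value.
class SegmentIndexSketch {

    // Assumed segment count; the field therefore has low cardinality.
    static final int NUM_SEGMENTS = 256;

    // documentId -> segment. The segment is attached to every document,
    // whatever its entity type, and is not part of any explicit mapping.
    private final Map<String, Integer> segmentByDocId = new HashMap<>();

    // Placeholder: the segment is derived from the key in its Object form.
    static int segmentOf(Object key) {
        return Math.floorMod(key.hashCode(), NUM_SEGMENTS);
    }

    void index(Object key) {
        // String.valueOf stands in for the key-to-documentId transformation.
        segmentByDocId.put(String.valueOf(key), segmentOf(key));
    }

    // Optional filtering: an empty set of segments means "match everything".
    Set<String> query(Set<Integer> segments) {
        Set<String> ids = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : segmentByDocId.entrySet()) {
            if (segments.isEmpty() || segments.contains(entry.getValue())) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    // Delete every document that belongs to one of the given segments.
    void deleteSegments(Set<Integer> segments) {
        segmentByDocId.values().removeIf(segments::contains);
    }
}
```

The sketch keeps the segment outside the document's own fields on purpose, to mirror the requirement that it is not part of any explicit mapping.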

@yrodiere
Contributor

yrodiere commented Feb 4, 2020

Numeric integer field with low cardinality [0..256]

It'll be a string field, but as far as I can see that's already what you are using.

This field is not part of any explicit mapping and applies to every entity type
I'd need to be able to define its value for each indexed document, which is calculated from the Infinispan key in its Object form (this same object is indexed as a string documentId after being passed through the KeyTransformationHandler). The calculation involves some internal Infinispan components.

That can be done in Search 6 by applying a custom RoutingKeyBridge to Object.class. Everything listed above is already possible.

Any query may optionally be filtered by one or more values of this field

Yep.

There should be a way to delete from the index entries that match one or more values of this field

This is not currently exposed, but I suppose we can add it. Do you mean delete all entries for a given list of segments, or all entries with a given ID in a given list of segments?

@gustavocoding
Author

gustavocoding commented Feb 4, 2020

This is not currently exposed, but I suppose we can add it. Do you mean delete all entries for a given list of segments, or all entries with a given ID in a given list of segments?

Delete all entries for a given list of segments for sure, but I will know better soon if any other deletion is required. Regarding HSearch 5, is it currently doable to delete by query on this field I added in this PR?

EDIT: I can probably use class DeleteByQueryWork extends Work

@yrodiere
Contributor

yrodiere commented Feb 4, 2020

Delete all entries for a given list of segments for sure, but I will know better soon if any other deletion is required. Regarding HSearch 5, is it currently doable to delete them in this PR?

In both Search 5 and 6 you can do a purge to delete all documents of the index. I suppose this is not enough for you.

If you need to delete all documents for a given list of segments, then your only solution right now in Hibernate Search 5 is to use an experimental SPI: org.hibernate.search.backend.spi.DeleteByQueryWork. See for example org.hibernate.search.elasticsearch.test.deletebyquery.DeleteByQueryIT#canDeleteByQuery.

We haven't restored this SPI in Search 6 yet: HSEARCH-3304

@gustavocoding
Author

gustavocoding commented Feb 4, 2020

Regarding DeleteByQuery support in Hibernate Search 5, I ran into some issues. When I use a SingularTermDeletionQuery in Work deleteWork = new DeleteByQueryWork(type, deletionQuery); it fails, saying that my custom segment field is not mapped.

I tried to extend SingularTermDeletionQuery, but it is final. I then tried to provide my own implementation of org.hibernate.search.backend.spi.DeletionQuery, but it is not accepted, since org.hibernate.search.backend.impl.DeleteByQuerySupport only allows a single type:

public static boolean isSupported(Class<? extends DeletionQuery> type) {
    if ( SingularTermDeletionQuery.class == type ) {
        return true;
    }
    return false;
}

Is there a workaround for it?

EDIT: Maybe get hold of the IndexManager or IndexWriter?

The other minor issue is that the Work SPI operates per entity type, so I need to iterate over searchFactory.getIndexBindings() and perform the deletion for each type.
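
To illustrate the per-type iteration, here is a minimal sketch that plans one deletion per entity type and per segment, on the assumption that a single-term deletion query matches exactly one field value. `PerTypeDeletionSketch`, `DeleteWork`, `planDeletion` and the `SEGMENT_FIELD` value are hypothetical stand-ins in plain Java, not the Hibernate Search SPI; in real code the types would come from searchFactory.getIndexBindings().

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-ins; these are not the Hibernate Search Work SPI.
class PerTypeDeletionSketch {

    // Hypothetical field name for the segment field.
    static final String SEGMENT_FIELD = "segment";

    // Models a single-term delete-by-query work bound to one entity type.
    static final class DeleteWork {
        final Class<?> entityType;
        final String field;
        final String value;

        DeleteWork(Class<?> entityType, String field, String value) {
            this.entityType = entityType;
            this.field = field;
            this.value = value;
        }
    }

    // One work per (entity type, segment) pair, because each work targets a
    // single type and each single-term query matches a single value.
    static List<DeleteWork> planDeletion(Collection<? extends Class<?>> indexedTypes,
                                         Set<Integer> segments) {
        List<DeleteWork> works = new ArrayList<>();
        for (Class<?> type : indexedTypes) {
            // Sorted copy so the plan is deterministic.
            for (int segment : new TreeSet<>(segments)) {
                works.add(new DeleteWork(type, SEGMENT_FIELD, String.valueOf(segment)));
            }
        }
        return works;
    }
}
```

The number of planned works grows as types times segments, which is one reason a query type accepting several values at once would be preferable if the SPI allowed it.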

@yrodiere
Contributor

yrodiere commented Feb 4, 2020

Is there a workaround for it?

Maybe try to use .classBridge(SegmentFieldBridge.class).name(SegmentFieldBridge.SEGMENT_FIELD) when you apply your bridge, so that Hibernate Search knows which fields your bridge contributes. You will probably need to implement StringBridge in your bridge, but that should be straightforward.

EDIT: Maybe get hold of the IndexManager or IndexWriter?

I'm not sure you can get hold of the IndexWriter directly? At least that seems dangerous.

@tristantarrant
Member

Merged

@gustavocoding deleted the ISPN-11129_ha_non_shared branch February 10, 2020 10:43
3 participants