Skip to content

DATAES-233 Utility class to support rolling index strategy #184

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

zzt93
Copy link
Contributor

@zzt93 zzt93 commented Jul 5, 2017

  • You have read the Spring Data contribution guidelines.
  • There is a ticket in the bug tracker for the project in our JIRA.
  • You use the code formatters provided here and have them applied to your changes. Don’t submit any formatting related changes.
  • You submit test cases (unit or integration tests) that back your changes.
  • You added yourself as author in the headers of the classes you touched. Amend the date range in the Apache license header if needed. For new types, add the license header (copy from another file and set the current year only).

zzt93 added 5 commits July 5, 2017 10:23
…annotations for a set of similar indices.

The utility class should be registered as a bean (or spring boot auto config) to be used in index name expression language, which is a wrapper of thread local variable to avoid multi-thread issues.
…annotations for a set of similar indices.

Add test cases for utility class.
…annotations for a set of similar indices.

Update bugs in test cases and add comments in `Documents` annotation.
…annotations for a set of similar indices.

Add ticket in test.
…annotations for a set of similar indices.

Remove deprecated constructor.
@edwardcapriolo
Copy link

Wouldnt using aliases accomplish the same thing?

@zzt93
Copy link
Contributor Author

zzt93 commented May 6, 2019

It seems not same.

  • Alias use different names point to a single index;
  • Index sharding using same clazz with suffix to point to multiple indicies, to use different index at different situation;

@edwardcapriolo
Copy link

The complex part will be how to decide what index to writer to and read from fo a given search.You end up trying to build a table-by-day like soluttion

Page<Chat> s1 = repository.findBySender("s1", PageRequest.of(0, 10));
// then
assertThat(s1.getTotalElements() == 2, is(true));
System.out.println(s1.getContent());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

println here

Page<Chat> s1 = repository.findBySender("s1", PageRequest.of(0, 10));
// then
assertThat(s1.getTotalElements() == 1, is(true));
System.out.println(s1.getContent());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

println here

@Getter
@NoArgsConstructor
@Builder
@Documents(indexPattern = "chat_#{indexNameVar.toString()}", type = "chat")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this reference a field in the document that contains date information?It feels like the reference should be to something in the object to be persisted

Copy link

@edwardcapriolo edwardcapriolo May 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@field(name="createDate" store=false index= false)
private Date createDate

or @IndexVar
String getIndexName(){
return //date based logic based on pojo
}

@documents(indexPattern = "chat_#{getIndexName()}", type = "chat")

@edwardcapriolo
Copy link

Can one of the contributers comment on the approach and the code. table by day is very popular and if there is no support many of my applications can not use spring-data-elastic

@zzt93
Copy link
Contributor Author

zzt93 commented Jul 16, 2019

@edwardcapriolo This PR is opened two years ago, I will try to recall the thought and consider your suggestions to adjust the implementation.

@sothawo
Copy link
Collaborator

sothawo commented Jul 17, 2019

I just wrote a comment in the jira issue, I would like to first challenge this solution, if it the best way to solve a problem like rolling indices.

@zzt93
Copy link
Contributor Author

zzt93 commented Jul 17, 2019

@sothawo I read your comment in jira. As far as I can see, the best practice is to put the index template in ES at the beginning, so we don't need to do any check for index existence and let ES do other thing. Then, we can just save entity to ES easily.

As far as about the index name binding mechanism, @edwardcapriolo provides a suggestions: using a field in that entity, which is more intuitive. In that case, we can't use a global bean for root of Spring EL. We need to set the root object (Again, using thread local context to parse index name) before saving this kind of document, however, in that case, It seems only making index obj a little easier, and not for all other operation. Besides, it break the backward compatibility, so I wonder whether it is a better solution.

@sothawo How do you think?

@sothawo
Copy link
Collaborator

sothawo commented Jul 17, 2019

why a ThreadLocal? What if some process submits tasks to a threadpool that insert into ES? Then these tasks run on different threads but should probably use the same index, although there are different name reolvers?

Having the option to define a field on the object to provide the indexname - perhaps together with a @Transient annotation to not store it in the index - might be a solution which could override the indexname set in the @Document annotation. Then the indexname creation is completely out of Spring Data Elasticsearch and up to the code creating the objects to be stored in the index.

But we need to have control over the creation of the index, because if we let Elasticsearch just create an index and store some document in it, then Elasticsearch will use the default mapping for the properties of the document, and we loose all the information that might be annotated with @Field or the other relevant configurations that must be put into the mapping.

@zzt93
Copy link
Contributor Author

zzt93 commented Jul 18, 2019

Maybe I not express clearly, but ThreadLocal is not the core of index suffix. The suffix must be decided by biz logic (may be time or id etc), then use a threadlocal bean to store and parse. ThreadLocal is just to avoid thread issue in a web service, any other way to solve concurrent bean access is fine.

When it comes to index creation, index template is more convenient and suitable for this kind index partitioning as its official guide said. We can use @Field and relevant conf to create right index template, rather than manually create index.

@sothawo
Copy link
Collaborator

sothawo commented Jul 18, 2019

The creation of a name for the index is business logic, that's right. The question is, how is the index name for the current document passed into Spring Data Elasticsearch?

This can be for example by a bean resolved via SpEL. If this is not feasible, we could add an annotation like @IndexName which can be put on a property of the document designating the value of this property being the index name.

But there is the problem that for new indices, the ElasticsearchOperations.createIndex() and ElasticsearchOperations.putMapping() methods need to be called to have a consistent mapping across the different indices. Applications using ElasticsearchOperations implementations (*Template classes) directly, they can handle this by themselves.

But when the *Repository classes are used, this must be handled be the Repository implementations and this is currently not supported. (1)

A different solution would be to support index templates like you mentioned, but this is not implemented yet. (2)

I will create two tickets later for (1) and (2)

Edit:
when implementing (2) we would not need any custom index creation and mapping

@sothawo
Copy link
Collaborator

sothawo commented Nov 7, 2020

DATAES-233 is closed as we now support index templates, and have an improved alias management. For how to get an index name dynamically from an entity I wrote a short example (https://www.sothawo.com/2020/08/use-an-index-name-defined-by-the-entity-to-store-data-in-spring-data-elasticsearch-4-0/).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants