Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom Lucene Extensions Blocked #74150

Open
mattweber opened this issue Jun 15, 2021 · 5 comments
Open

Custom Lucene Extensions Blocked #74150

mattweber opened this issue Jun 15, 2021 · 5 comments
Labels
>enhancement :Search Foundations/Search Catch all for Search Foundations Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@mattweber
Copy link
Contributor

It appears that the introduction of centralized lucene extensions (#71416) we are no longer able to use non-default lucene codecs that introduce additional file extensions. For example, uniform split that introduces the ustd and ustb extensions or uniform split shared terms which introduces stustd and stustb extensions.

Can we support custom extensions via plugins? Maybe just remove the assert on unknown extension which is tripping in my plugin tests.

cc @tlrx @jpountz

@mattweber mattweber added >bug needs:triage Requires assignment of a team area label labels Jun 15, 2021
@nik9000 nik9000 added :Search/Search Search-related issues that do not fall into other categories team-discuss and removed needs:triage Requires assignment of a team area label labels Jun 15, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Jun 15, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@tlrx
Copy link
Member

tlrx commented Jun 16, 2021

Thanks for reporting @mattweber and sorry to hear it breaks your own tests. We added this assertion to catch file extensions that were not explicitly declared so that we are forced to declare them and to configure some options about them: should the file be memory mapped, should it be cached differently than other files etc

I think this assertion is valuable because it prevents us from missing some Lucene files. If the file extension is provided by a codec maintained in Lucene I think it is perfectly fine to update the list of file extensions and just add them. If users who maintain their own Lucene codecs are blocked by this assertion we can add a system prop to disable the assertion for their tests. Providing a support for custom extensions via plugins is possible but looks too heavy to me for now.

I'm deferring the decision to the es-search team but I'd be glad to help implementing the solution.

@ywelsch
Copy link
Contributor

ywelsch commented Jun 16, 2021

+1 on all points here. Ideally all file extensions are known to the system so that it can make the right decision when it comes to caching / memory mapping the files and provide optimal performance. I'm inclined to further lock this down in the future (not only assert, but also check at runtime when assertions are disabled). At the same time it's important to allow plugins to use codecs that introduce additional file extensions. While adding an explicit plugin extension point for new file extensions is certainly an option (and a requirement if we further lock this down), there's some plumbing required to make that happen. To get @mattweber unblocked as quickly as possible, a system property for opting out of this check sounds like a good short-term solution to me. @mattweber WDYT?

@mattweber
Copy link
Contributor Author

@tlrx @ywelsch thanks for the quick review and discussion. I think a system property would be perfectly fine. In the future, I would love better support for custom codecs (and stores) which could have hooks into this extension map. Thank you!

ywelsch added a commit that referenced this issue Jun 21, 2021
Allows plugin developers of custom codecs to opt out of the assertion in LuceneFilesExtensionsTests that checks that all
encountered Lucene file extensions are known to this class. In the future, we would like to add a proper plugin extension
point for this.

Relates #74150
ywelsch added a commit that referenced this issue Jun 21, 2021
Allows plugin developers of custom codecs to opt out of the assertion in LuceneFilesExtensionsTests that checks that all
encountered Lucene file extensions are known to this class. In the future, we would like to add a proper plugin extension
point for this.

Relates #74150
@javanna javanna added >enhancement and removed >bug labels May 3, 2023
@javanna javanna added :Search Foundations/Search Catch all for Search Foundations and removed :Search/Search Search-related issues that do not fall into other categories labels Jul 17, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jul 17, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine elasticsearchmachine removed the Team:Search Meta label for search team label Jul 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search Foundations/Search Catch all for Search Foundations Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch
Projects
None yet
Development

No branches or pull requests

8 participants