Add Flint Index Purging Logic #2372

kaituo · 2023-10-26T01:43:53Z

Description

- Introduce dynamic settings for enabling/disabling purging and controlling index TTL.
- Reuse default result index name as a common prefix for all result indices.
- Change result index to a non-hidden index for better user experience.
- Allow custom result index specification in the data source.
- Move default result index name from spark to core package to avoid cross-package references.
- Add validation for provided result index name in the data source.
- Use pattern prefix + data source name for default result index naming.
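
A minimal sketch of the naming and validation items above, written as a standalone helper rather than the PR's actual code. The prefix value, the `_` separator, the length limit, the allowed character set, and the rule that a custom name must start with the common prefix are all assumptions made for illustration; `ResultIndexNames` and its methods are hypothetical names.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical helper: illustrates default naming and validation of result index names. */
final class ResultIndexNames {
  // Assumed common prefix shared by every result index.
  static final String DEFAULT_PREFIX = "query_execution_result";
  // Assumed upper bound on the index name length.
  static final int MAX_LENGTH = 255;
  // Assumed allowed characters: lower-case letters, digits, underscore, hyphen.
  private static final Pattern ALLOWED = Pattern.compile("[a-z0-9_-]+");

  private ResultIndexNames() {}

  /** Default name: prefix + "_" + data source name, lower-cased, disallowed characters dropped. */
  static String defaultName(String dataSourceName) {
    String suffix = dataSourceName.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9_-]", "");
    return DEFAULT_PREFIX + "_" + suffix;
  }

  /** Returns an error message when the name is rejected, or empty when it is accepted. */
  static Optional<String> validate(String resultIndex) {
    if (resultIndex == null || resultIndex.isEmpty()) {
      return Optional.of("Result index name must not be empty");
    }
    if (resultIndex.length() > MAX_LENGTH) {
      return Optional.of("Result index name exceeds " + MAX_LENGTH + " characters");
    }
    if (!ALLOWED.matcher(resultIndex).matches()) {
      return Optional.of("Result index name contains characters outside [a-z0-9_-]");
    }
    if (!resultIndex.startsWith(DEFAULT_PREFIX)) {
      return Optional.of("Result index name must start with " + DEFAULT_PREFIX);
    }
    return Optional.empty();
  }
}
```

Returning an error message instead of throwing lets the caller decide how to surface the problem, which fits a constructor that checks for a non-null error message before accepting the value.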

Testing:

- Verified old documents are purged in a cluster setup.
- Checked result index naming with and without custom names, ensuring validation is applied.

Note: Tests will be added in a subsequent PR.

Issues Resolved

#2331

Check List

New functionality includes testing.
- All tests pass, including unit test, integration test and doctest
New functionality has been documented.
- New functionality has javadoc added
- New functionality has user manual doc added
Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

vamsi-amazon · 2023-10-26T01:59:10Z

opensearch/src/main/java/org/opensearch/sql/opensearch/setting/OpenSearchSettings.java

+          Setting.Property.NodeScope,
+          Setting.Property.Dynamic);
+
+  public static final Setting<Boolean> AUTO_INDEX_MANAGEMENT_ENABLED_SETTING =


Is this for both the indices?

vamsi-amazon · 2023-10-26T02:00:00Z

core/src/main/java/org/opensearch/sql/datasource/model/DataSourceMetadata.java

    this.properties = properties;
    this.allowedRoles = allowedRoles;
-    this.resultIndex = resultIndex;
+
+    if (errorMessage != null) {


Minor nit: can we move this up, in case there is a new revision?

yes, will do

codecov · 2023-10-26T02:06:46Z

Codecov Report

Merging #2372 (003fe92) into main (88b1f03) will decrease coverage by 0.91%.
The diff coverage is 14.00%.

@@             Coverage Diff              @@
##               main    #2372      +/-   ##
============================================
- Coverage     96.46%   95.55%   -0.91%     
  Complexity     4918     4918              
============================================
  Files           465      468       +3     
  Lines         13522    13668     +146     
  Branches        913      915       +2     
============================================
+ Hits          13044    13061      +17     
- Misses          458      587     +129     
  Partials         20       20
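
The totals in the coverage diff can be checked by hand: hits, misses, and partials sum to the line count (13044 + 458 + 20 = 13522 on main; 13061 + 587 + 20 = 13668 with this PR), and the reported percentages are consistent with hits divided by lines, truncated to two decimals. The sketch below reproduces them. The class and method names are illustrative, and the truncation rule is inferred from these numbers rather than taken from Codecov's documentation.

```java
import java.util.Locale;

/** Hypothetical helper: recomputes the coverage percentages from the hit and line counts. */
final class CoverageMath {
  private CoverageMath() {}

  /** Coverage in hundredths of a percent, truncated by integer division. */
  static long hundredths(long hits, long lines) {
    return hits * 10_000 / lines;
  }

  /** Formats hundredths of a percent as a signed percentage with two decimals. */
  static String format(long hundredths) {
    String sign = hundredths < 0 ? "-" : "";
    long abs = Math.abs(hundredths);
    return String.format(Locale.ROOT, "%s%d.%02d%%", sign, abs / 100, abs % 100);
  }
}
```

With these definitions, `format(hundredths(13044, 13522))` gives `96.46%`, `format(hundredths(13061, 13668))` gives `95.55%`, and the difference of the two truncated values formats as `-0.91%`.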

| Flag | Coverage Δ | |
|---|---|---|
| sql-engine | `95.55% <14.00%> (-0.91%)` | ⬇️ |

| Files | Coverage Δ |
|---|---|
| ...rch/sql/opensearch/setting/OpenSearchSettings.java | `100.00% <100.00%> (ø)` |
| ...org/opensearch/sql/spark/client/EmrClientImpl.java | `100.00% <ø> (ø)` |
| ...arch/sql/spark/client/EmrServerlessClientImpl.java | `100.00% <100.00%> (ø)` |
| ...earch/sql/spark/data/constants/SparkConstants.java | `0.00% <ø> (ø)` |
| ...sql/spark/response/JobExecutionResponseReader.java | `100.00% <100.00%> (ø)` |
| ...g/opensearch/sql/spark/response/SparkResponse.java | `100.00% <100.00%> (ø)` |
| ...org/opensearch/sql/spark/cluster/IndexCleanup.java | `0.00% <0.00%> (ø)` |
| ...nsearch/sql/spark/cluster/FlintIndexRetention.java | `0.00% <0.00%> (ø)` |
| ...sql/spark/cluster/ClusterManagerEventListener.java | `0.00% <0.00%> (ø)` |

penghuo · 2023-10-26T02:49:49Z

related to #2331

vamsi-amazon · 2023-10-26T02:55:01Z

spark/src/main/java/org/opensearch/sql/spark/cluster/IndexCleanup.java

+   * @param queryForDeleteByQueryRequest query request
+   * @param listener action listener
+   */
+  public void deleteDocsBasedOnShardSize(


Are we using this?

we don't, will remove

kaituo · 2023-10-26T03:47:30Z

related to #2331

added the issue in pr description

penghuo · 2023-10-26T04:22:08Z

spark/src/main/java/org/opensearch/sql/spark/cluster/FlintIndexRetention.java

+        this::handleSessionPurgeError);
+  }
+
+  private void handleSessionPurgeResponse(Long response) {


purgeStatementIndex() is independent of purgeSessionIndex(), right?

Right. I run them in sequence because delete-by-query is not a cheap operation and our purging is not time-sensitive. I want to purge without too much performance impact.
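
The sequencing described in this reply can be sketched with plain callbacks: the statement purge starts only from inside the session purge's response or failure handler, so at most one delete-by-query is in flight at a time. This is an illustration under stated assumptions, not the code in `FlintIndexRetention`. The `Purger` interface, the index names, and the choice to continue with the statement purge after a session purge failure are all hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: purge two indices one after the other instead of in parallel. */
final class SequentialPurge {
  /** Stand-in for an asynchronous delete-by-query call; not a real OpenSearch API. */
  interface Purger {
    void purgeOlderThan(
        String index, Instant cutoff, Consumer<Long> onResponse, Consumer<Exception> onFailure);
  }

  private final Purger purger;
  private final Duration ttl;
  // Collected messages, standing in for real logging.
  final List<String> log = new ArrayList<>();

  SequentialPurge(Purger purger, Duration ttl) {
    this.purger = purger;
    this.ttl = ttl;
  }

  /** Purges the session index first; the statement purge is triggered from its callbacks. */
  void run(Instant now) {
    Instant cutoff = now.minus(ttl); // documents older than the TTL are eligible
    purger.purgeOlderThan(
        "session-index", // assumed index name
        cutoff,
        deleted -> {
          log.add("session purge deleted " + deleted);
          purgeStatements(cutoff);
        },
        error -> {
          // Assumption: the two purges are independent, so a failure here does not block the next.
          log.add("session purge failed: " + error.getMessage());
          purgeStatements(cutoff);
        });
  }

  private void purgeStatements(Instant cutoff) {
    purger.purgeOlderThan(
        "statement-index", // assumed index name
        cutoff,
        deleted -> log.add("statement purge deleted " + deleted),
        error -> log.add("statement purge failed: " + error.getMessage()));
  }
}
```

Chaining through callbacks trades total purge time for lower peak load, which matches the stated goal that purging is not time-sensitive.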

* Revert "Add more metrics and handle emr exception message (#2422) (#2426)" This reverts commit b57f7cc.
* Revert "Block settings in sql query settings API and add more unit tests (#2407) (#2412)" This reverts commit 3024737.
* Revert "Added session, statement, emrjob metrics to sql stats api (#2398) (#2400)" This reverts commit 6e17ae6.
* Revert "Redefine Drop Index as logical delete (#2386) (#2397)" This reverts commit e939bb6.
* Revert "add concurrent limit on datasource and sessions (#2390) (#2395)" This reverts commit deb3ccf.
* Revert "Add Flint Index Purging Logic (#2372) (#2389)" This reverts commit dd48b9b.
* Revert "Refactoring for tags usage in test files and also added explicit denly list setting. (#2383) (#2385)" This reverts commit 37e010f.
* Revert "Enable session by default (#2373) (#2375)" This reverts commit 7d95e4c.
* Revert "Create new session if client provided session is invalid (#2368) (#2371)" This reverts commit 5ab7858.
* Revert "Add where clause support in create statement (#2366) (#2370)" This reverts commit b620a56.
* Revert "create new session if current session not ready (#2363) (#2365)" This reverts commit 5d07281.
* Revert "Handle Describe,Refresh and Show Queries Properly (#2357) (#2362)" This reverts commit 16e2f30.
* Revert "Add Session limitation (#2354) (#2359)" This reverts commit 0f334f8.
* Revert "Bug Fix, support cancel query in running state (#2351) (#2353)" This reverts commit 9a40591.
* Revert "Fix bug, using basic instead of basicauth (#2342) (#2355)" This reverts commit e4827a5.
* Revert "Add missing tags and MV support (#2336) (#2346)" This reverts commit 8791bb0.
* Revert "[Backport 2.x] deprecated job-metadata-index (#2340) (#2343)" This reverts commit bea432c.
* Revert "Integration with REPL Spark job (#2327) (#2338)" This reverts commit 58a5ae5.
* Revert "Implement patch API for datasources (#2273) (#2329)" This reverts commit 4c151fe.
* Revert "Add sessionId parameters for create async query API (#2312) (#2324)" This reverts commit 3d1a376.
* Revert "Add Statement (#2294) (#2318) (#2319)" This reverts commit b3c2e94.
* Revert "Upgrade json (#2307) (#2314)" This reverts commit 6c65bb4.
* Revert "Minor Refactoring (#2308) (#2317)" This reverts commit 051cc4f.
* Revert "add InteractiveSession and SessionManager (#2290) (#2293) (#2315)" This reverts commit 6ac197b.

---------

Co-authored-by: Vamsi Manohar <reddyvam@amazon.com>

kaituo requested review from pjfitzgibbons, ps48, kavithacm, derek-ho, joshuali925, dai-chen, YANG-DB, rupal-bq, mengweieric, vamsi-amazon, Swiddis, penghuo, seankao-az, MaxKsyunz, Yury-Fridlyand, anirudha, forestmvey, acarbonetto and GumpacG as code owners October 26, 2023 01:43

vamsi-amazon added the backport 2.x, backport 2.11, and enhancement labels Oct 26, 2023

vamsi-amazon assigned kaituo Oct 26, 2023

vamsi-amazon reviewed Oct 26, 2023

kaituo force-pushed the purge branch from 6263a30 to af9217a Compare October 26, 2023 05:44

kaituo force-pushed the purge branch 2 times, most recently from ed77c86 to 1ee8c03 Compare October 26, 2023 16:11

kaituo added 2 commits October 26, 2023 21:44

address comments

003fe92

Signed-off-by: Kaituo Li <kaituo@amazon.com>

kaituo force-pushed the purge branch from 1ee8c03 to 003fe92 Compare October 27, 2023 04:55

penghuo reviewed Oct 27, 2023

penghuo approved these changes Oct 27, 2023

ps48 approved these changes Oct 27, 2023

penghuo merged commit 1bcacd1 into opensearch-project:main Oct 27, 2023
19 of 21 checks passed

opensearch-trigger-bot bot mentioned this pull request Oct 27, 2023

[Backport 2.x] Add Flint Index Purging Logic #2388

Merged

opensearch-trigger-bot bot mentioned this pull request Oct 27, 2023

[Backport 2.11] Add Flint Index Purging Logic #2389

Merged

mengweieric added a commit to mengweieric/sql that referenced this pull request Nov 8, 2023

Revert "Add Flint Index Purging Logic (opensearch-project#2372) (open…

ad39e55

…search-project#2389)" This reverts commit dd48b9b. Signed-off-by: Eric <menwe@amazon.com>

mengweieric added a commit to mengweieric/sql that referenced this pull request Nov 8, 2023

Revert "Add Flint Index Purging Logic (opensearch-project#2372) (open…

5706600

…search-project#2389)" This reverts commit dd48b9b. Signed-off-by: Eric <menwe@amazon.com>

vamsi-amazon added a commit to mengweieric/sql that referenced this pull request Nov 13, 2023

Revert "Add Flint Index Purging Logic (opensearch-project#2372) (open…

6f7dd96

…search-project#2389)" This reverts commit dd48b9b.

This pull request was closed.
