Truncate API added #5573

Open
Sreemanth wants to merge 10 commits into apache:master from Sreemanth:truncate_command

@Sreemanth
Contributor

Description

#5559: Truncate API Added.

Offline: the table is disabled and all segments are deleted.
Realtime: the table is disabled and all segments are deleted.

@Jackie-Jiang
Implemented as per your comment. (PinotSegmentRestletResource.deleteAllSegments())

@codecov-commenter

codecov-commenter commented Jun 16, 2020

Codecov Report

Merging #5573 into master will decrease coverage by 0.17%.
The diff coverage is 68.03%.

@@            Coverage Diff             @@
##           master    #5573      +/-   ##
==========================================
- Coverage   66.44%   66.27%   -0.18%     
==========================================
  Files        1075     1123      +48     
  Lines       54773    57490    +2717     
  Branches     8168     8609     +441     
==========================================
+ Hits        36396    38101    +1705     
- Misses      15700    16560     +860     
- Partials     2677     2829     +152     
Flag Coverage Δ
#integrationtests 45.18% <45.96%> (?)
#unittests 56.88% <60.23%> (?)
Impacted Files Coverage Δ
...quota/HelixExternalViewBasedQueryQuotaManager.java 67.87% <0.00%> (ø)
...org/apache/pinot/common/function/FunctionInfo.java 73.33% <ø> (ø)
...not/common/lineage/SegmentLineageAccessHelper.java 0.00% <0.00%> (ø)
...ache/pinot/common/metadata/ZKMetadataProvider.java 66.85% <0.00%> (ø)
...java/org/apache/pinot/common/segment/ReadMode.java 66.66% <ø> (ø)
...org/apache/pinot/common/utils/CommonConstants.java 39.02% <0.00%> (+0.92%) ⬆️
...oller/api/resources/PinotTableRestletResource.java 51.48% <0.00%> (-5.06%) ⬇️
...troller/helix/core/retention/RetentionManager.java 80.28% <0.00%> (+1.11%) ⬆️
...he/pinot/controller/util/AutoAddInvertedIndex.java 0.00% <0.00%> (ø)
.../org/apache/pinot/core/common/BaseBlockValSet.java 3.03% <0.00%> (-1.32%) ⬇️
... and 478 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b3a8368...e74a129.

kishoreg requested a review from npawar on June 16, 2020 16:43
return new SuccessResponse("Table config updated for " + tableName);
}

@POST
Contributor

Can we please add documentation highlighting the differences between this and table drop?

Generally, the standard implemented by other databases has the following distinction:

DROP deletes table data, indexes, metadata, etc., so any statement (to fetch data or query metadata) will fail stating "table does not exist".

TRUNCATE deletes table data and indexes, but the metadata is kept around. So you can still see the table's schema, table config, etc., and queries will return empty results, AFAIK.

Are we following the same distinction? If so, we should update the docs to state when the user should choose DROP vs TRUNCATE.

Also, we should add tests to ensure that both behave as expected. If TRUNCATE is not expected to delete config and schema, then we should add a test for that.
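
To make the DROP vs TRUNCATE distinction concrete, here is a minimal sketch using a toy in-memory catalog. The class ToyCatalog and all of its methods are hypothetical and exist only for illustration; this is not Pinot controller code and says nothing about how the PR implements either operation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory model of a catalog; not Pinot's controller code.
class ToyCatalog {
  // Table metadata (schema + table config), collapsed into one string for brevity.
  private final Map<String, String> metadata = new HashMap<>();
  // Table data, modelled as a list of segment names per table.
  private final Map<String, List<String>> segments = new HashMap<>();

  void createTable(String table, String config) {
    metadata.put(table, config);
    segments.put(table, new ArrayList<>());
  }

  void addSegment(String table, String segment) {
    requireTable(table);
    segments.get(table).add(segment);
  }

  // TRUNCATE: remove the data only; the metadata stays in place.
  void truncate(String table) {
    requireTable(table);
    segments.get(table).clear();
  }

  // DROP: remove both the data and the metadata.
  void drop(String table) {
    requireTable(table);
    segments.remove(table);
    metadata.remove(table);
  }

  Optional<String> config(String table) {
    return Optional.ofNullable(metadata.get(table));
  }

  // Stand-in for a query: returns the table's segments, or fails if the table is gone.
  List<String> query(String table) {
    requireTable(table);
    return new ArrayList<>(segments.get(table));
  }

  private void requireTable(String table) {
    if (!metadata.containsKey(table)) {
      throw new IllegalStateException("table does not exist: " + table);
    }
  }
}
```

In this model, query after truncate returns an empty list and config is still present, while query after drop throws "table does not exist". Those are the two behaviours the requested tests would need to pin down.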

Member

+1. drop and truncate behavior should follow the standards.

@ApiOperation(value = "Truncate table")
public SuccessResponse truncateTable(
@ApiParam(value = "Name of the table to update", required = true) @PathParam("tableName") String tableName,
@ApiParam(value = "realtime|offline") @QueryParam("type") String tableTypeStr) throws Exception {
Contributor

We still have to define what truncate means for realtime. Typically, when you create a realtime table, it immediately starts consumption. So on truncate, after dropping all segments, we should restart consumption based on the configured offset.
That will not happen with these changes: consumption will be restarted only on the next run of the RealtimeSegmentValidationManager, which I don't think is an acceptable flow.

Member

3 options

  • this is ok as part of the version
  • we can invoke the validation manager at the end of the call
  • we can add a separate call to pause/resume Kafka consumption

Contributor

Can we take the discussion to the Issue? I believe we need to agree on a spec first before we start publishing PRs.

@npawar
Contributor

npawar commented Jun 24, 2020

I think this task is still pending discussion about race conditions between this API and segment completion/uploads. The issue has more details.

@mcvsubbu
Contributor

Can we take the conversation to the Issue instead of this PR?

@mcvsubbu
Contributor

@kishoreg we cannot invoke the validation manager unless we get to the lead controller for that particular table. With controller separation using Helix controller management, all controllers take part in validation, etc.

@Path("/tables/{tableName}/truncate")
@Produces(MediaType.APPLICATION_JSON)
@ApiOperation(value = "Truncate table")
public SuccessResponse truncateTable(
Contributor

See #5559 (comment)

Let us first agree that adding a new API (that we need to keep backward compatible, and so on) is the best answer to the problem.

@ApiOperation(value = "Truncate table")
public SuccessResponse truncateTable(
@ApiParam(value = "Name of the table to update", required = true) @PathParam("tableName") String tableName,
@ApiParam(value = "realtime|offline") @QueryParam("type") String tableTypeStr) throws Exception {
Contributor

Can we take the discussion to the Issue? I believe we need to agree on a spec first before we start publishing PRs.

mcvsubbu self-requested a review on June 25, 2020 00:01
toggleTableState(tableName, tableType, StateType.DISABLE);

// Get all segment names for table and delete all segments
String tableNameWithType = getExistingTableNamesWithType(tableName, tableType).get(0);
Contributor

How about moving the line String tableNameWithType = getExistingTableNamesWithType(tableName, tableType).get(0); before the toggle, so that we validate that the table exists before calling disable?

Then you can also directly call _pinotHelixResourceManager.toggleTableState(tableNameWithType, status);

// Get all segment names for table and delete all segments
String tableNameWithType = getExistingTableNamesWithType(tableName, tableType).get(0);
List<String> segmentNames = _pinotHelixResourceManager.getSegmentsFor(tableNameWithType);
PinotResourceManagerResponse response = _pinotHelixResourceManager.deleteSegments(tableNameWithType, segmentNames);
Contributor

Before deleting segments, you need to check that all segments went into the OFFLINE state in the external view. Pasting the steps Kishore had suggested in the issue:

  1. disable the table
  2. wait for all segments to go to offline state
  3. delete all the segments
  4. enable the table
  5. call the setup table method
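
The ordering of these steps can be sketched as follows. This is only an illustration under stated assumptions: TableOps is a hypothetical interface invented for this sketch, its method names are not Pinot or Helix APIs, and the polling parameters (maxPolls, pollMillis) are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction over the controller operations named in the steps above.
// These method names are illustrative assumptions, not Pinot APIs.
interface TableOps {
  void disableTable(String tableNameWithType);
  // Segment name -> current state as reported by the external view.
  Map<String, String> externalViewStates(String tableNameWithType);
  List<String> segmentNames(String tableNameWithType);
  void deleteSegments(String tableNameWithType, List<String> segments);
  void enableTable(String tableNameWithType);
  void setupTable(String tableNameWithType);
}

class TruncateSketch {
  static void truncate(TableOps ops, String table, int maxPolls, long pollMillis) {
    // Step 1: disable the table.
    ops.disableTable(table);
    // Step 2: wait for every segment to report OFFLINE before touching any data.
    if (!waitForOffline(ops, table, maxPolls, pollMillis)) {
      throw new IllegalStateException("Segments did not go OFFLINE for " + table);
    }
    // Step 3: delete all segments.
    ops.deleteSegments(table, new ArrayList<>(ops.segmentNames(table)));
    // Step 4: enable the table again.
    ops.enableTable(table);
    // Step 5: run the table setup.
    ops.setupTable(table);
  }

  // Polls the external view up to maxPolls times; an empty view counts as "all offline".
  static boolean waitForOffline(TableOps ops, String table, int maxPolls, long pollMillis) {
    for (int i = 0; i < maxPolls; i++) {
      boolean allOffline =
          ops.externalViewStates(table).values().stream().allMatch("OFFLINE"::equals);
      if (allOffline) {
        return true;
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;
  }
}
```

Note that in this sketch a timeout leaves the table disabled. Whether to re-enable it on failure, and how this interacts with concurrent segment completion or uploads, is exactly the kind of question the thread defers to the issue.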

toggleTableState(tableName, tableType, StateType.ENABLE);

// Setup table for real-time : (ensureRealtimeClusterIsSetUp)
if (tableType != TableType.OFFLINE && _pinotHelixResourceManager.hasRealtimeTable(tableName)) {
Contributor

Why not just if (tableType == REALTIME)?
Also, you already have the tableNameWithType variable above, so there is no need to construct the table name again.

}
}

private List<String> getExistingTableNamesWithType(String tableName, @Nullable TableType tableType) {
Contributor

tableType doesn't need the @Nullable annotation; you're already checking for non-null before this call.

@@ -1240,7 +1240,7 @@ private void verifyIndexingConfig(String tableNameWithType, IndexingConfig index
}
}

Contributor

Since we made this a public method, can you add some javadocs explaining what this method does?

}

/**
* Truncate table, delete all contents of table without removing table configuration and schema.
Contributor

Let's add here that Truncate for REALTIME includes creating new CONSUMING segments in segment metadata and ideal state.

@Produces(MediaType.APPLICATION_JSON)
@ApiOperation(value = "Truncate table")
public SuccessResponse truncateTable(
@ApiParam(value = "Name of the table to update", required = true) @PathParam("tableName") String tableName,
Contributor

Name of table to "truncate" (not update)

Let's make "type" also a required param

}

@Test
public void testTruncateTable() throws IOException {
Contributor

We need some more testing:

  1. Create an OFFLINE table and schema, add some segments, then call truncate. Check that segments/idealstate/external view are deleted, but the table config and schema are intact.
  2. Create a REALTIME table and schema, let it create some CONSUMING segments, then call truncate. Check that the table config and schema are intact. Also check that idealstate/external view/segments now have the newly created segments.
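
The two scenarios above can be expressed as assertions against a toy model. Everything here is hypothetical: ToyTable, TruncateChecks, and the segment naming scheme (including the generation counter used to tell old segments from new ones) are assumptions made for illustration, and a real test would go through the controller test utilities instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a table's controller-side state; not Pinot code.
class ToyTable {
  final String tableConfig;
  final String schema;
  private final boolean realtime;
  private final int partitions;
  private final List<String> segments = new ArrayList<>();
  private int generation = 0; // assumption: only used to make new segment names distinct

  ToyTable(String tableConfig, String schema, boolean realtime, int partitions) {
    this.tableConfig = tableConfig;
    this.schema = schema;
    this.realtime = realtime;
    this.partitions = partitions;
    if (realtime) {
      createConsumingSegments(); // a realtime table starts consuming on creation
    }
  }

  void addSegment(String name) {
    segments.add(name);
  }

  List<String> segments() {
    return new ArrayList<>(segments);
  }

  // Truncate: drop every segment, then recreate consuming segments for realtime tables.
  void truncate() {
    segments.clear();
    if (realtime) {
      createConsumingSegments();
    }
  }

  private void createConsumingSegments() {
    for (int p = 0; p < partitions; p++) {
      segments.add("consuming_p" + p + "_g" + generation);
    }
    generation++;
  }
}

class TruncateChecks {
  static void run() {
    // Scenario 1: OFFLINE table keeps config and schema, loses all segments.
    ToyTable offline = new ToyTable("offlineConfig", "schema", false, 0);
    offline.addSegment("seg_0");
    offline.addSegment("seg_1");
    offline.truncate();
    check(offline.segments().isEmpty(), "offline segments should be gone");
    check(offline.tableConfig.equals("offlineConfig"), "offline config should be intact");
    check(offline.schema.equals("schema"), "offline schema should be intact");

    // Scenario 2: REALTIME table keeps config and schema, gets fresh consuming segments.
    ToyTable realtimeTable = new ToyTable("realtimeConfig", "schema", true, 2);
    List<String> before = realtimeTable.segments();
    realtimeTable.truncate();
    List<String> after = realtimeTable.segments();
    check(after.size() == 2, "one new consuming segment per partition");
    check(after.stream().noneMatch(before::contains), "old segments should be gone");
    check(realtimeTable.tableConfig.equals("realtimeConfig"), "realtime config should be intact");
  }

  static void check(boolean condition, String message) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }
}
```

Calling TruncateChecks.run() completes silently when every expectation holds and throws an AssertionError naming the first one that does not.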

7 participants