CDAP-8745 Fix search total #8158

Denton-L · 2017-02-24T01:36:20Z

This commit fixes the bug where the total returned is incorrect when
doing a sorted metadata search. Prior to this, the total returned was
the actual total. Now, the total returned may be less than the actual
total but the search is more efficient.

JIRA: https://issues.cask.co/browse/CDAP-8745
Bamboo: http://builds.cask.co/browse/CDAP-RUT701

bdmogal · 2017-02-24T04:58:01Z

cdap-data-fabric/src/main/java/co/cask/cdap/data2/metadata/dataset/MetadataDataset.java

@@ -608,7 +608,7 @@ private SearchResults searchByCustomIndex(String namespaceId, Set<EntityTypeSimp
    String indexColumn = getIndexColumn(sortInfo.getSortBy(), sortInfo.getSortOrder());
    // we want to return the first chunk of 'limit' elements after offset
    // in addition, we want to pre-fetch 'numCursors' chunks of size 'limit'
-    int fetchSize = offset + ((numCursors + 1) * limit);
+    int fetchSize = (int) Math.min(offset + ((numCursors + 1) * (long) limit), Integer.MAX_VALUE);


why do we need all this casting?

for when offset is really large (larger than Integer.MAX_VALUE)?

this seems wrong. just int fetchSize = offset + ((numCursors + 1) * limit); should work?

It's possible for limit to be the Integer.MAX_VALUE. If that's the case, then this line won't run properly.

ok. Is this easier to read then?

if (Integer.MAX_VALUE == limit) { fetchSize = limit; } else { fetchSize = offset + ((numCursors + 1) * limit); }

It's ok to keep it the way it is, but please add a comment explaining why this is done.

Well, in the off chance that limit is very large, it would still not work for the code above. I'll add a comment documenting it, though.
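The overflow discussed in this thread can be reproduced in isolation. The sketch below is illustrative only: the class and method names are hypothetical and are not part of MetadataDataset. It compares the original expression, the special-cased variant suggested in review, and the widened-and-clamped expression from the diff.

```java
// Hypothetical demo class; only the three expressions come from the review thread.
final class FetchSizeDemo {

  // Original expression: every operation is 32-bit int arithmetic and can wrap around.
  static int naiveFetchSize(int offset, int limit, int numCursors) {
    return offset + ((numCursors + 1) * limit);
  }

  // Variant suggested in review: only guards the exact value Integer.MAX_VALUE.
  static int specialCasedFetchSize(int offset, int limit, int numCursors) {
    if (Integer.MAX_VALUE == limit) {
      return limit;
    }
    return offset + ((numCursors + 1) * limit);
  }

  // Expression from the diff: widen to long before multiplying, then clamp to int range.
  static int clampedFetchSize(int offset, int limit, int numCursors) {
    return (int) Math.min(offset + ((numCursors + 1) * (long) limit), Integer.MAX_VALUE);
  }

  public static void main(String[] args) {
    // Small inputs: all three agree (2 + 3 * 2 = 8).
    System.out.println(naiveFetchSize(2, 2, 2));
    System.out.println(clampedFetchSize(2, 2, 2));

    // limit = Integer.MAX_VALUE with one cursor: the int product wraps to -2.
    System.out.println(naiveFetchSize(0, Integer.MAX_VALUE, 1));
    System.out.println(clampedFetchSize(0, Integer.MAX_VALUE, 1));

    // limit just below Integer.MAX_VALUE slips past the special case and wraps to -4.
    System.out.println(specialCasedFetchSize(0, Integer.MAX_VALUE - 1, 1));
    System.out.println(clampedFetchSize(0, Integer.MAX_VALUE - 1, 1));
  }
}
```

The cast to long has to be applied to an operand of the multiplication rather than to the finished result; otherwise the product has already wrapped before it is widened.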

bdmogal · 2017-02-24T05:05:10Z

cdap-data-fabric/src/test/java/co/cask/cdap/data2/metadata/dataset/MetadataDatasetTest.java

@@ -1137,42 +1137,34 @@ public void apply() throws Exception {
        searchResults = dataset.search(namespaceId, "*", targets, nameAsc, 0, 2, 0, null, false,
                                       EnumSet.allOf(EntityScope.class));
        Assert.assertEquals(ImmutableList.of(flowEntry, dsEntry), searchResults.getResults());
-        Assert.assertEquals(ImmutableList.of(flowEntry, dsEntry, appEntry), searchResults.getAllResults());


what is searchResults.getAllResults() and why is it needed?

getAllResults() should get all search results, including the ones in the offset. This way, we'll be able to count the total.

you mean if offset = 2, limit = 2, numCursors = 2,
results will have (numCursors + 1) * limit = 6 elements, but all results will have offset + results = 8 elements?

To count the total, you don't actually need to return the results, right? You can just add the offset?

If we do that, then we reintroduce the bug that was fixed in the last PR: https://issues.cask.co/browse/CDAP-7930.

bdmogal

@Denton-L, a couple of comments. However, is there a test that covers this scenario somewhere? Would be good to ensure that there's a test which makes sure that a full scan is not done when not required.

bdmogal · 2017-02-24T05:44:07Z

@Denton-L I meant a test that does the following:

Has 5 possible search results
Accepts limit = 1 and numCursors = 3
Ensures that only 4 search results, 3 cursors and total as 4 are returned.
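The scenario above can be modelled with a plain in-memory list. This is a simplified sketch, not the CDAP implementation: SortedSearchSketch, Page, and search are hypothetical names, the entries are plain strings, and a cursor is assumed to be the first entry of each pre-fetched chunk. Only the fetchSize expression is taken from the diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model of a bounded, sorted search.
final class SortedSearchSketch {

  static final class Page {
    final List<String> results;  // entries starting at offset, at most 'limit' of them
    final List<String> cursors;  // assumed: first entry of each pre-fetched chunk
    final int total;             // number of entries scanned, not the true match count

    Page(List<String> results, List<String> cursors, int total) {
      this.results = results;
      this.cursors = cursors;
      this.total = total;
    }
  }

  static Page search(List<String> sortedEntries, int offset, int limit, int numCursors) {
    if (offset < 0 || limit <= 0 || numCursors < 0) {
      throw new IllegalArgumentException("offset and numCursors must be >= 0, limit must be > 0");
    }
    // Same expression as in the diff: widen to long, then clamp.
    int fetchSize = (int) Math.min(offset + ((numCursors + 1) * (long) limit), Integer.MAX_VALUE);

    // Stop after fetchSize entries instead of scanning every match.
    List<String> fetched = sortedEntries.subList(0, Math.min(fetchSize, sortedEntries.size()));

    int from = Math.min(offset, fetched.size());
    int to = (int) Math.min((long) offset + limit, fetched.size());
    List<String> results = new ArrayList<>(fetched.subList(from, to));

    // One cursor per pre-fetched chunk that follows the returned page.
    List<String> cursors = new ArrayList<>();
    for (long i = (long) offset + limit; i < fetched.size(); i += limit) {
      cursors.add(fetched.get((int) i));
    }
    return new Page(results, cursors, fetched.size());
  }

  public static void main(String[] args) {
    // 5 possible results, limit = 1, numCursors = 3
    Page page = search(Arrays.asList("a", "b", "c", "d", "e"), 0, 1, 3);
    System.out.println(page.results);  // [a]
    System.out.println(page.cursors);  // [b, c, d]
    System.out.println(page.total);    // 4
  }
}
```

In this model the fifth entry is never read, so the reported total is 4 even though 5 entries match, which is the trade-off the PR description accepts.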

Denton-L · 2017-02-24T06:04:21Z

@bdmogal, in the unit tests, I introduced a check for getAllResults().size(). This is essentially the final total which is returned. Any line where I'm asserting a number < 3 should be testing the behaviour you've described in your comment. Could you give it a quick lookover?

bdmogal · 2017-02-24T06:12:57Z

@Denton-L ok. Final comment: getResults and getAllResults are a little ambiguous. Any better names? getResultsFromOffset and getResultsFromBeginning, perhaps? It's not getAllResults, because we may still have more than what's returned. Other than that, LGTM so long as this scenario has been tested. If you have this pushed to a cluster, perhaps @tonybach can verify the behavior too.

Denton-L · 2017-02-24T06:13:53Z

Yeah, good idea, I'll change the naming. And unfortunately, I've only tested this locally using StandaloneMain.

bdmogal · 2017-02-24T06:18:30Z

Ok. I'd recommend pushing this to a cluster, and having @tonybach verify the fix as well, since it is so late in the release cycle :-)

sreevatsanraman requested a review from bdmogal February 24, 2017 04:44

bdmogal reviewed Feb 24, 2017

CDAP-8745 Fix search total

4845d65

Denton-L force-pushed the bugfix/cdap-8745-fix-total branch from e3a6b86 to 4845d65 Compare February 24, 2017 06:25

Denton-L merged commit 4e57fad into release/4.1 Feb 24, 2017

Denton-L deleted the bugfix/cdap-8745-fix-total branch February 24, 2017 06:25

Denton-L mentioned this pull request Feb 24, 2017

CDAP-8745 Rename variables for clarity #8167

Merged
