[CARBONDATA-3345] A growing streaming ROW_V1 carbondata file would have some InputSplits ignored

Looking at carbondata segments, once a streaming file grows beyond about 150 MB (possibly 128 MB),
Presto splits it into several smaller pieces when it runs a query, including files in ROW_V1 format.
A bug causes some of these ROW_V1 pieces to be ignored, so queries return inaccurate results.
To keep every required inputSplit, the map key (Java) for ROW_V1 inputSplits now also concatenates 'carbonInput.getStart()'.

This closes #3186
junyan-zg authored and kumarvishal09 committed May 7, 2019
1 parent eb7a833 commit 0ab2412
Showing 1 changed file with 8 additions and 1 deletion.
@@ -46,6 +46,7 @@
 import org.apache.carbondata.core.metadata.schema.table.TableInfo;
 import org.apache.carbondata.core.reader.ThriftReader;
 import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.statusmanager.FileFormat;
 import org.apache.carbondata.core.statusmanager.LoadMetadataDetails;
 import org.apache.carbondata.core.statusmanager.SegmentStatusManager;
 import org.apache.carbondata.core.util.CarbonProperties;
@@ -291,7 +292,13 @@ public List<CarbonLocalMultiBlockSplit> getInputSplits2(CarbonTableCacheModel ta
 // Use block distribution
 List<List<CarbonLocalInputSplit>> inputSplits = new ArrayList(
 result.stream().map(x -> (CarbonLocalInputSplit) x).collect(Collectors.groupingBy(
-carbonInput -> carbonInput.getSegmentId().concat(carbonInput.getPath()))).values());
+carbonInput -> {
+if (FileFormat.ROW_V1.equals(carbonInput.getFileFormat())) {
+return carbonInput.getSegmentId().concat(carbonInput.getPath())
+.concat(carbonInput.getStart() + "");
+}
+return carbonInput.getSegmentId().concat(carbonInput.getPath());
+})).values());
 if (inputSplits != null) {
 for (int j = 0; j < inputSplits.size(); j++) {
 multiBlockSplitList.add(new CarbonLocalMultiBlockSplit(inputSplits.get(j),
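The effect of the key change on the grouping step can be seen in isolation with a small sketch. It is not the CarbonData code: `Split` is a hypothetical stand-in for `CarbonLocalInputSplit`, the `rowV1` flag replaces the `FileFormat.ROW_V1` check, and the segment IDs, paths, and offsets are made up. It only shows how `Collectors.groupingBy` buckets splits under the old and new keys, not how Presto consumes the resulting groups.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class SplitGroupingSketch {

  // Hypothetical stand-in for CarbonLocalInputSplit: only the fields the key uses.
  static final class Split {
    final String segmentId;
    final String path;
    final long start;
    final boolean rowV1; // assumption: replaces FileFormat.ROW_V1.equals(getFileFormat())

    Split(String segmentId, String path, long start, boolean rowV1) {
      this.segmentId = segmentId;
      this.path = path;
      this.start = start;
      this.rowV1 = rowV1;
    }
  }

  // Key before the change: every split of the same file gets the same key.
  static String oldKey(Split s) {
    return s.segmentId.concat(s.path);
  }

  // Key after the change: ROW_V1 splits also carry their start offset.
  static String newKey(Split s) {
    if (s.rowV1) {
      return s.segmentId.concat(s.path).concat(s.start + "");
    }
    return s.segmentId.concat(s.path);
  }

  // Number of groups that groupingBy produces for the given key function.
  static int groupCount(List<Split> splits, Function<Split, String> keyFn) {
    Map<String, List<Split>> groups = splits.stream().collect(Collectors.groupingBy(keyFn));
    return groups.size();
  }

  public static void main(String[] args) {
    // Two splits of one growing ROW_V1 file plus one split of a non-ROW_V1 file.
    List<Split> splits = Arrays.asList(
        new Split("0", "/seg0/part-0.carbondata", 0L, true),
        new Split("0", "/seg0/part-0.carbondata", 134217728L, true),
        new Split("1", "/seg1/part-0.carbondata", 0L, false));

    System.out.println(groupCount(splits, SplitGroupingSketch::oldKey)); // prints 2
    System.out.println(groupCount(splits, SplitGroupingSketch::newKey)); // prints 3
  }
}
```

With the old key the two ROW_V1 splits of the same file fall into one group; with the new key each keeps its own group, which is the behaviour the commit message describes as keeping the required inputSplit.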
