Reorder location check in bulkv2 to avoid unneeded work #5469

@keith-turner

The following code checks whether a tablet has a location and then works out which files the tablet still needs to load. When a tablet already has all of its load markers and has no location, there is no need to wait on it. However, the code always waits on tablets without a location, even when they have nothing left to load. This can cause a bulk import into many tablets that are moving around to wait unnecessarily.

    Location location = tablet.getLocation();
    HostAndPort server = null;
    if (location == null) {
      locationLess++;
      continue;
    } else {
      server = location.getHostAndPort();
    }
    Set<TabletFile> loadedFiles = tablet.getLoaded().keySet();
    Map<String,MapFileInfo> thriftImports = new HashMap<>();
    for (final Bulk.FileInfo fileInfo : files) {
      Path fullPath = new Path(bulkDir, fileInfo.getFileName());
      TabletFile bulkFile = new TabletFile(fullPath);
      if (!loadedFiles.contains(bulkFile)) {
        thriftImports.put(fileInfo.getFileName(), new MapFileInfo(fileInfo.getEstFileSize()));
      }
    }

The code could be changed to increment the locationLess counter only when at least one of the files to load is not already in the loadedFiles set.
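
A minimal sketch of that reordering, for illustration only. It does not use the real Accumulo classes: `ReorderSketch`, its nested `Tablet` type, and `countLocationLess` are hypothetical stand-ins, with file names as plain strings and the location as a nullable string. The idea is to build the list of files still to load first, skip the tablet when that list is empty, and only then consult the location.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for the Accumulo types; not the real API.
class ReorderSketch {

  // Simplified tablet: a location (null when unhosted) and the files already loaded.
  static final class Tablet {
    final String location;
    final Set<String> loaded;

    Tablet(String location, Set<String> loaded) {
      this.location = location;
      this.loaded = loaded;
    }
  }

  // Counts tablets that have no location AND still have at least one file to load.
  static int countLocationLess(List<Tablet> tablets, List<String> files) {
    int locationLess = 0;
    for (Tablet tablet : tablets) {
      // Step 1: work out which files are still missing, before looking at the location.
      List<String> toLoad = new ArrayList<>();
      for (String file : files) {
        if (!tablet.loaded.contains(file)) {
          toLoad.add(file);
        }
      }
      // Step 2: nothing left to load, so the location is irrelevant.
      if (toLoad.isEmpty()) {
        continue;
      }
      // Step 3: only now does a missing location mean the import has to wait.
      if (tablet.location == null) {
        locationLess++;
        continue;
      }
      // A real implementation would queue toLoad for the server at tablet.location here.
    }
    return locationLess;
  }

  public static void main(String[] args) {
    List<String> files = List.of("f1.rf", "f2.rf");
    List<Tablet> tablets = List.of(
        new Tablet(null, Set.of("f1.rf", "f2.rf")), // fully loaded, unhosted: not counted
        new Tablet(null, Set.of("f1.rf")),          // missing f2.rf, unhosted: counted
        new Tablet("host1:9997", Set.of()));        // hosted: not counted
    System.out.println(countLocationLess(tablets, files)); // prints 1
  }
}
```

With the original ordering, the first tablet in this example would also have been counted, so the import would have waited on a tablet that had nothing left to load.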
