Skip to content

Conversation

@FineAndDandy
Copy link
Contributor

No description provided.

"If a thread blocks more than this period of time waiting to get file permits,"
+ " debugging information will be written."),

TSERV_BULK_MAX_TABLET_OVERLAP("tserver.bulk.max.overlap", "0", PropertyType.COUNT,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think just the name "tserver.bulk.max.tablets" is more appropriate since it is just the max number of tablets a single bulk file can produce.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking ahead, for the new bulk import code in the 2.x branch I think this check would actually execute in the master. Therefore the prefix tserver is not a good prefix. This is not a problem with this PR, more of a problem with the current property name conventions. I searched for bulk in this file and found master.bulk. and tserver.bulk. prefixes. The user does not care where the check is done. Could possibly use the prefix general instead of tserver, but this creates a third prefix for bulk properties which could make this property harder to find.

Not sure there is a good solution to this problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I don't think users care whether the prefix is tserver or master. I don't think its worth going through a deprecation cycle to migrate all of the tserver properties to master or general. That just adds extra unnecessary work for users. I think the name master.bulk.max.tablets is OK.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name master.bulk.max.tablets is OK.

I like that name and would be in favor of using it in 1.9 and 2.x. The name will make more sense in 2.x than in 1.9 but that is better for the long term.

iterator.next();
Assert.assertTrue(iterator.hasNext());
iterator.next();
Assert.assertFalse(iterator.hasNext());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done in a loop, like you did above to create the file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better yet, just use Guava's Iterables.size(s)

// create a bulk load file
final FileSystem fs = getCluster().getFileSystem();
final Path basePath = getCluster().getTemporaryPath();
CachedConfiguration.setInstance(fs.getConf());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class was dropped in 2.x. and I don't think is needed in the test.

Suggested change
CachedConfiguration.setInstance(fs.getConf());

RemoteIterator<LocatedFileStatus> status = fs.listFiles(bulkFailures, false);
Assert.assertTrue(status.hasNext());
Path newPath = status.next().getPath();
Assert.assertFalse(status.hasNext());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bulkImport may not have run yet so you need to verify that the files were moved first before checking the failure directory. The same with the second try below.

Copy link
Member

@ctubbsii ctubbsii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't had a chance to look at the PR in detail. However, I wanted to point out that this modifies the behavior of an API that is deprecated in the latest code. I'm not sure it makes sense to add new behavior features to deprecated APIs, when we haven't yet discussed or come to consensus on how it should be implemented in the replacement API. Addressing it in the latest API first, will ensure we make sure any backported change leads in the same direction, and doesn't do something wildly different (assuming we backport to a deprecated API at all).

@milleruntime
Copy link
Contributor

I am in the process of adding this check to the new bulk import for 2.1. I think this would be useful to have in the old bulk import process as well, but will be different for 2.x since some of the old code was refactored for reuse with the new code. Adding a new check to the deprecated API in 2.x is debatable and back porting this check to 1.9/1.10 is also debatable. I think the maintenance cost of something like this is very minimal and would benefit users who don't want to risk switching to the newer API. The new bulk API should have significant improvements yet may come with some bugs initially as it has yet to be tested in production at a large scale.

@ctubbsii ctubbsii closed this Aug 17, 2020
@ctubbsii ctubbsii reopened this Aug 17, 2020
@ctubbsii ctubbsii changed the base branch from 1.9 to 1.10 August 17, 2020 01:14
@ctubbsii
Copy link
Member

PR auto-closed when 1.9 branch was removed. Re-opened and set base branch to 1.10 instead.

@milleruntime
Copy link
Contributor

Closing since this didn't make it into 1.10. It was added to 2.1 in #1614. The only thing left that could be done would be to add the property to the deprecated bulk import code and I wouldn't recommend doing that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Consider implementing failsafe to prevent large number of tablet assignments in bulk import

4 participants