Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FilePredicates for jobs #17525

Merged
merged 1 commit into from
Jun 7, 2023
Merged

Add FilePredicates for jobs #17525

merged 1 commit into from
Jun 7, 2023

Conversation

ssyssy
Copy link
Contributor

@ssyssy ssyssy commented May 30, 2023

What changes are proposed in this pull request?

Adding the predicates for move jobs in order to be triggered by the policy engine.

Why are the changes needed?

Adding two predicates:

  1. unmodifiedFor
  2. dateFromFileNameOlderThan

Does this PR introduce any user facing changes?

NA

@ssyssy ssyssy force-pushed the addPredicates branch 2 times, most recently from 7b2b659 to 17740e5 Compare May 31, 2023 00:12
@ssyssy ssyssy requested review from jja725 and lucyge2022 May 31, 2023 00:12
@ssyssy ssyssy force-pushed the addPredicates branch 5 times, most recently from bf92dae to 0d02bf7 Compare May 31, 2023 02:11
@jja725 jja725 requested a review from Kai-Zhang May 31, 2023 02:38
@ssyssy ssyssy force-pushed the addPredicates branch 3 times, most recently from 02afc2b to 873f1ab Compare June 1, 2023 09:14
Copy link
Contributor

@jja725 jja725 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM, some style comments, thanks for the contribution! The PR looks a lot better this time, keep up the good work!

return predicate;
}
}
return new NoPredicate();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the default semantic should be if no filter then all files should pass instead of no files pass

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jja725 I think we should block the move job, because think about if user puts in an invalid filter, if we move all the files from source to destination. User would be frustrated to put everything back.

Copy link
Contributor

@jja725 jja725 Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case If they put a unknown filter, we should throw an exception then. And if they didn't put filter, we should have all things passed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gotcha

Iterable<FileInfo> fileIterator =
new UfsFileIterable(ufs, mJobEntry.getSrc(), user, mJobEntry.getPartialListing(),
FileInfo::isCompleted);
filePredicate.get());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think we can combine create & get into one function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but I think it would be more flexible if we have separate functions.

@@ -116,7 +119,8 @@ public void testProgressReport() throws Exception {
FileIterable files =
new FileIterable(fileSystemMaster, srcPath, user, false, MoveJob.QUALIFIED_FILE_FILTER);
MoveJob job = spy(new MoveJob(srcPath, dstPath, false, user, "1",
OptionalLong.empty(), false, false, false, files));
OptionalLong.empty(), false, false, false, files,
FileFilter.getDefaultInstance()));
when(job.getDurationInSec()).thenReturn(0L);
job.setJobState(JobState.RUNNING, false);
List<Route> nextRoutes = job.getNextRoutes(25);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mind adding some tests for the move job with predicate? To make sure we filter the current files. Also it's nice to cover what happened if we have an exception when findTimestampFromFileName

@ssyssy ssyssy force-pushed the addPredicates branch 2 times, most recently from 55dab17 to 336c503 Compare June 3, 2023 02:40
Copy link
Contributor

@jja725 jja725 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there, leave one comment

Copy link
Contributor

@jja725 jja725 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ssyssy ssyssy added the type-feature This issue is a feature request label Jun 6, 2023
@ssyssy
Copy link
Contributor Author

ssyssy commented Jun 7, 2023

alluxio-bot, merge this please

@alluxio-bot alluxio-bot merged commit dd6ad6d into main Jun 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-feature This issue is a feature request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants