New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIXED JENKINS-40252] Add an Iterable<T> that returns all items unsorted #2665

Merged
merged 5 commits into from Dec 16, 2016

Conversation

4 participants
@reviewbybees

This comment has been minimized.

Show comment
Hide comment
@reviewbybees

reviewbybees Dec 14, 2016

This pull request originates from a CloudBees employee. At CloudBees, we require that all pull requests be reviewed by other CloudBees employees before we seek to have the change accepted. If you want to learn more about our process please see this explanation.

reviewbybees commented Dec 14, 2016

This pull request originates from a CloudBees employee. At CloudBees, we require that all pull requests be reviewed by other CloudBees employees before we seek to have the change accepted. If you want to learn more about our process please see this explanation.

Show outdated Hide outdated core/src/main/java/hudson/model/Items.java
@@ -386,6 +387,35 @@ String name(Item i) {
}
/**
* Gets all the {@link Item}s recursively in the {@link ItemGroup} tree visible to
* {@link Jenkins#getAuthentication()} without concern for the order in which items are returned.
*

This comment has been minimized.

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

Makes sense to mention the iterator is read-inly

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

Makes sense to mention the iterator is read-inly

* @param type the type.
* @param <T> the type.
* @return An {@link Iterable} for all items.
* @since FIXME

This comment has been minimized.

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

Makes sense to mention the iterator is read-inly

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

Makes sense to mention the iterator is read-inly

This comment has been minimized.

@oleg-nenashev
@oleg-nenashev
Show outdated Hide outdated core/src/main/java/hudson/model/Items.java
*/
private final Stack<ItemGroup> stack = new Stack<>();
/**
* The iterator of the current {@link ItemGroup} we

This comment has been minimized.

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

Comment is not complete

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

Comment is not complete

if (stack.isEmpty()) {
return false;
}
ItemGroup group = stack.pop();

This comment has been minimized.

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

I have a concern about the following case:

  • ItemGroup A contains folders D, C, and B
  • C and D are Item groups containing items
  • On the first cycle we go through A => D => (elements of D)
  • Once there is no Elements of D left, the code pops back to the ItemGroup A
  • The code takes the new iterator, so the delegate should get D as a next item again

Maybe I'm reading the code wrong since the test seems to cover this case

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

I have a concern about the following case:

  • ItemGroup A contains folders D, C, and B
  • C and D are Item groups containing items
  • On the first cycle we go through A => D => (elements of D)
  • Once there is no Elements of D left, the code pops back to the ItemGroup A
  • The code takes the new iterator, so the delegate should get D as a next item again

Maybe I'm reading the code wrong since the test seems to cover this case

This comment has been minimized.

@stephenc

stephenc Dec 15, 2016

Member

No it will exhaust the current iteration before popping the stack, not strictly breath first

@stephenc

stephenc Dec 15, 2016

Member

No it will exhaust the current iteration before popping the stack, not strictly breath first

This comment has been minimized.

@stephenc

stephenc Dec 15, 2016

Member

On the first cycle if D C and B are of type T, you will get D, C and B in any order, then it will pop one of those off the stack and process them, A is done when it starts its children

@stephenc

stephenc Dec 15, 2016

Member

On the first cycle if D C and B are of type T, you will get D, C and B in any order, then it will pop one of those off the stack and process them, A is done when it starts its children

// slower path because the caller has switched authentication
// we need to keep the original authentication so that allItems() can be used
// like getAllItems() without the cost of building the entire list up front
try (ACLContext ctx = ACL.as(authentication)) {

This comment has been minimized.

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

Why don't you impersonate once and only once on the top level of hasNext()? I see not so much reason to potentially reauthenticate it multiple times within the method (depends on the three structure).

Calling the impersonation on the allItems() level could also improve the performance a bit

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

Why don't you impersonate once and only once on the top level of hasNext()? I see not so much reason to potentially reauthenticate it multiple times within the method (depends on the three structure).

Calling the impersonation on the allItems() level could also improve the performance a bit

This comment has been minimized.

@stephenc

stephenc Dec 15, 2016

Member

Because the usage pattern may be different. With getAllItems() you get the list using current auth, you can then switch auth and do whatever you need...

With this, if you change auth after getting the Iterable it will behave the same as getting the List. Without the impersonation, we have an unpredictable iterable.

The fast case where you just use the same auth throughout will not impersonate.

for (Item i: Items.allItems(Jenkins.getInstance(), Item.class)) {
  doSomething(i); // has side-effect of changing auth
}

Now the side-effect may be legitimate, eg authorise project related... or it may be a bug...

But if we don't guard against it then it is not safe to replace getAllItems with allItems.

The slow part will be the call to getItems() as that does the permission checks, impersonation around that should be relatively cheap

@stephenc

stephenc Dec 15, 2016

Member

Because the usage pattern may be different. With getAllItems() you get the list using current auth, you can then switch auth and do whatever you need...

With this, if you change auth after getting the Iterable it will behave the same as getting the List. Without the impersonation, we have an unpredictable iterable.

The fast case where you just use the same auth throughout will not impersonate.

for (Item i: Items.allItems(Jenkins.getInstance(), Item.class)) {
  doSomething(i); // has side-effect of changing auth
}

Now the side-effect may be legitimate, eg authorise project related... or it may be a bug...

But if we don't guard against it then it is not safe to replace getAllItems with allItems.

The slow part will be the call to getItems() as that does the permission checks, impersonation around that should be relatively cheap

This comment has been minimized.

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

I do not feel it exactly answers my questions, but I agree the permission checks will be more expensive for non-System users. So it's rather NIT

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

I do not feel it exactly answers my questions, but I agree the permission checks will be more expensive for non-System users. So it's rather NIT

This comment has been minimized.

@stephenc

stephenc Dec 15, 2016

Member

For 99.9% of use cases, people will not be changing authentication while iterating, and hence

if (Jenkins.getAuthentication() == authentication) {
    delegate = group.getItems().iterator();
}

Will be the path taken... which will just be inlined to a call to SecurityContextHolder.getContext().getAuthentication()... given that the getItems() will be making this call anyway, the only side-effect in terms of performance is that we will have pre-seeded the CPU cache with the object chain for the SecurityContextHolder.getContext().getAuthentication() call anyway.

There are some cases where the authentication used to enumerate the items does not match to the current authentication.

In Core there is one example: https://github.com/jenkinsci/jenkins/pull/2665/files#diff-46dd855f38f94f25c68cbdae7e6ba64bL243 the example is correct:

  • User decides to rename / move a job
  • The operation runs as the user's permission
  • The item's renameTo does escalation to ACL.SYSTEM for the parts that require this escalation, but otherwise is running as the user who did the rename.
  • Thus, while iterating the iterable, the items should be the items visible to ACL.SYSTEM but the current authentication is not ACL.SYSTEM...

The second half of the branch is to cover this exact case.

HTH

@stephenc

stephenc Dec 15, 2016

Member

For 99.9% of use cases, people will not be changing authentication while iterating, and hence

if (Jenkins.getAuthentication() == authentication) {
    delegate = group.getItems().iterator();
}

Will be the path taken... which will just be inlined to a call to SecurityContextHolder.getContext().getAuthentication()... given that the getItems() will be making this call anyway, the only side-effect in terms of performance is that we will have pre-seeded the CPU cache with the object chain for the SecurityContextHolder.getContext().getAuthentication() call anyway.

There are some cases where the authentication used to enumerate the items does not match to the current authentication.

In Core there is one example: https://github.com/jenkinsci/jenkins/pull/2665/files#diff-46dd855f38f94f25c68cbdae7e6ba64bL243 the example is correct:

  • User decides to rename / move a job
  • The operation runs as the user's permission
  • The item's renameTo does escalation to ACL.SYSTEM for the parts that require this escalation, but otherwise is running as the user who did the rename.
  • Thus, while iterating the iterable, the items should be the items visible to ACL.SYSTEM but the current authentication is not ACL.SYSTEM...

The second half of the branch is to cover this exact case.

HTH

@stephenc

This comment has been minimized.

Show comment
Hide comment
@stephenc

stephenc Dec 15, 2016

Member

Given that the tests pass, I may see about replacing getAllItems where the order is not important to give core the perf improvements

Member

stephenc commented Dec 15, 2016

Given that the tests pass, I may see about replacing getAllItems where the order is not important to give core the perf improvements

stephenc added some commits Dec 15, 2016

[JENKINS-40252] Switch to allItems where traversal order is not impor…
…tant

- Also switch in cases where we have a subset that is likely significantly smaller and hence quicker to sort
@stephenc

This comment has been minimized.

Show comment
Hide comment
@stephenc

stephenc Dec 15, 2016

Member

@oleg-nenashev round 2? the reliable CI agrees that the tests pass... just Flaky McFlaky with its

screen shot 2016-12-15 at 14 21 38

Member

stephenc commented Dec 15, 2016

@oleg-nenashev round 2? the reliable CI agrees that the tests pass... just Flaky McFlaky with its

screen shot 2016-12-15 at 14 21 38

Show outdated Hide outdated core/src/main/java/hudson/util/RunList.java
@@ -76,7 +76,14 @@ public RunList(Collection<? extends Job> jobs) {
this.base = combine(runLists);
}
private Iterable<R> combine(Iterable<Iterable<R>> runLists) {
public static <J extends Job<J,R>, R extends Run<J,R>> RunList<R> fromJobs(Iterable<? extends J> jobs) {

This comment has been minimized.

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

Javadoc is missing

@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

Javadoc is missing

@oleg-nenashev

This comment has been minimized.

Show comment
Hide comment
@oleg-nenashev

oleg-nenashev Dec 15, 2016

Member

Looks good (excepting one minor comment) 🐝

Member

oleg-nenashev commented Dec 15, 2016

Looks good (excepting one minor comment) 🐝

@oleg-nenashev

This comment has been minimized.

Show comment
Hide comment
@oleg-nenashev

oleg-nenashev Dec 16, 2016

Member

re-🐝

Member

oleg-nenashev commented Dec 16, 2016

re-🐝

@daniel-beck

This comment has been minimized.

Show comment
Hide comment
@daniel-beck

daniel-beck Dec 16, 2016

Member

@stephenc This would also be the preferred approach for plugins that perform long-running operations on each item, as it would drastically lessen the chance of operating on since deleted items, right? Disk Usage Plugin comes to mind.

Member

daniel-beck commented Dec 16, 2016

@stephenc This would also be the preferred approach for plugins that perform long-running operations on each item, as it would drastically lessen the chance of operating on since deleted items, right? Disk Usage Plugin comes to mind.

@stephenc

This comment has been minimized.

Show comment
Hide comment
@stephenc

stephenc Dec 16, 2016

Member

@daniel-beck it would be the preferred approach everywhere unless you want the full list of items in sorted order.

If you only want a partial list, sorting a smaller list will in general be faster than building the full list in sorted order

If you do not care about the order, not sorting will be faster than sorting

If you only want to find the first, lazy iteration will be faster than full collection

If you have many jobs (like say somebody who has 100's of folders each with 1000's of jobs) then lazy iteration will have less memory pressure than getting the full list

Member

stephenc commented Dec 16, 2016

@daniel-beck it would be the preferred approach everywhere unless you want the full list of items in sorted order.

If you only want a partial list, sorting a smaller list will in general be faster than building the full list in sorted order

If you do not care about the order, not sorting will be faster than sorting

If you only want to find the first, lazy iteration will be faster than full collection

If you have many jobs (like say somebody who has 100's of folders each with 1000's of jobs) then lazy iteration will have less memory pressure than getting the full list

@@ -79,6 +80,44 @@
return false;
}
};
/**
* A comparator of {@link Item} instances that uses a case-insensitive comparison of {@link Item#getName()}.

This comment has been minimized.

@daniel-beck

daniel-beck Dec 16, 2016

Member

Wouldn't a natural sort be better? IIRC a user recently requested this.

I.e.

item1
item2
item10
@daniel-beck

daniel-beck Dec 16, 2016

Member

Wouldn't a natural sort be better? IIRC a user recently requested this.

I.e.

item1
item2
item10

This comment has been minimized.

@stephenc

stephenc Dec 16, 2016

Member

You can do whatever sorts you want, this is the existing sort that you need to restore the sort order from getAllItems() if you use allItems() to build a filtered sub-list

@stephenc

stephenc Dec 16, 2016

Member

You can do whatever sorts you want, this is the existing sort that you need to restore the sort order from getAllItems() if you use allItems() to build a filtered sub-list

@stephenc

This comment has been minimized.

Show comment
Hide comment
@stephenc

stephenc Dec 16, 2016

Member

I have not seen any objections to this... merging

Member

stephenc commented Dec 16, 2016

I have not seen any objections to this... merging

@stephenc stephenc merged commit 673cc13 into jenkinsci:master Dec 16, 2016

2 checks passed

Jenkins This pull request looks good
Details
continuous-integration/jenkins/pr-head This commit looks good
Details

oleg-nenashev added a commit that referenced this pull request Dec 17, 2016

@stephenc stephenc deleted the stephenc:jenkins-40252 branch Feb 15, 2017

@stephenc stephenc restored the stephenc:jenkins-40252 branch Feb 15, 2017

@stephenc stephenc deleted the stephenc:jenkins-40252 branch Feb 15, 2017

@oleg-nenashev

This comment has been minimized.

Show comment
Hide comment
@oleg-nenashev

oleg-nenashev Nov 9, 2017

Member

This method was used in https://github.com/jenkinsci/maven-plugin/blob/master/src/test/java/hudson/model/MockHelper.java. CC @olamy @aheritier . Mock Helper does not work correctly anymore in 2.37+. Will cleanup the tests

This method was used in https://github.com/jenkinsci/maven-plugin/blob/master/src/test/java/hudson/model/MockHelper.java. CC @olamy @aheritier . Mock Helper does not work correctly anymore in 2.37+. Will cleanup the tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment