Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Diagnostics pages #275

Merged
merged 11 commits into from Nov 11, 2016

Conversation

Projects
None yet
7 participants
@rsandell
Copy link
Member

commented Mar 1, 2016

Some management pages to get some diagnostics views into the internals of the trigger. Usable to troubleshoot why some strange behaviours are happening that I used to have system groovy scripts for.

@rsandell

This comment has been minimized.

Copy link
Member Author

commented Mar 1, 2016

@reviewbybees

@scoheb do you happen to have a somewhat more busy Jenkins environment than my laptop and possibly take this change for a spin there to see if it isn't too intrusive or breaks anything? I doubt it will, but I don't have enough resources to test it.

@scoheb

This comment has been minimized.

Copy link
Contributor

commented Mar 1, 2016

I will back at work tomorrow. I will take a look then!

@jenkinsadmin

This comment has been minimized.

Copy link
Member

commented Mar 1, 2016

Thank you for this pull request! Please check this document for how the Jenkins project handles pull requests.

*
* @return the list of GerritServer.
*/
* Gets the list of GerritServer.

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 2, 2016

Author Member

Darn! I never asked for these to be changed. must have been a slip when I optimized the imports.

}
menu.add(new MenuItem()
.withUrl(url.toString())
.withStockIcon("clipboard.png")

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 2, 2016

Author Member

If anyone could conjure up any better icons I would be so happy.

l.layout(title: _("${report.getDisplayName()} - Gerrit Trigger Diagnostics"), norefresh: false, permission: Diagnostics.requiredPermission) {
l.header {
style {
raw("""

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 2, 2016

Author Member

It's not the ugliest thing I've managed to produce at least ;)

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 2, 2016

Author Member

But close...

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 10, 2016

Member

Mmm, I'm going to go for the inline bash script in the cje-war pipeline. This is not so bad to read, for me.

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 10, 2016

Author Member

I was more referring to the actually rendered page ;)

@rsandell rsandell changed the title WiP Diagnostics pages Diagnostics pages Mar 2, 2016

@rsandell

This comment has been minimized.

Copy link
Member Author

commented Mar 2, 2016

Needs some tests, but other than that I'm ready for review.

@reviewbybees

This comment has been minimized.

Copy link

commented Mar 2, 2016

This pull request originates from a CloudBees employee. At CloudBees, we require that all pull requests be reviewed by other CloudBees employees before we seek to have the change accepted. If you want to learn more about our process please see this explanation.

@scoheb

This comment has been minimized.

Copy link
Contributor

commented Mar 2, 2016

Some comments:

  1. The /gerrit-trigger/diagnostics/eventListeners/ page is nice but it can get VERY large and cumbersome when dealing with thousands of jobs. Not quite sure if a "paged" concept is possible with an additional ALL link.
  2. The /gerrit-trigger/diagnostics/buildMemory/ page does not seem to work for me. I had one non-silent job that was triggered and reported back to Gerrit. The above page was still empty.
@rsandell

This comment has been minimized.

Copy link
Member Author

commented Mar 3, 2016

It should be possible to "paginate" that page, or whatever the term is :) Perhaps some info can be removed as well to limit the size of each row? I just somewhat arbitrarily put some values in that I thought would be useful.

I haven't tested the build memory page enough/close to not at all, so there might be something I missed in the implementation.

@rsandell

This comment has been minimized.

Copy link
Member Author

commented Mar 3, 2016

@scoheb I found some potential NPE's that might have been hidden in your environment but showed up when I did some more testing on the buildMemory page. Fixes are in the last commit.

List<Map.Entry<GerritTriggeredEvent, List<BuildMemory.MemoryImprint.Entry>>> entries =
report.getSortedEntrySet();

for (Map.Entry<GerritTriggeredEvent, List<BuildMemory.MemoryImprint.Entry>> entry : entries) {

This comment has been minimized.

Copy link
@stephenc

stephenc Mar 8, 2016

Member

how many of these do you envision being in the bundle? are we talking MB's of output? If lots of output it might be better to replace with a summary plus the average stddev and the worst min(25,10%) followed by a list of all the job names

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 10, 2016

Member

Agree about this, but for a different reason: I'm wondering if something more selective & pre-filtered will be easier to interpret for diagnostic purposes.

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 10, 2016

Author Member

But how would I know which 25% to add? This list is produced to find something that might indicate a bug; stray entries etc.

The number of items, as I wrote in the conversation, is based on how many builds that are triggered at this time in the system. So even on a large install that would only be in the hundreds of entries.


@Override
public void addContents(@NonNull Container container) {
container.add(new PrintedContent("gerrit/event-listeners.md") {

This comment has been minimized.

Copy link
@stephenc

stephenc Mar 8, 2016

Member

if this is going to be producing lots of content it should not be enabled by default... or else find some better way to summarize

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 10, 2016

Member

Huge, huge +1

@stephenc

This comment has been minimized.

Copy link
Member

commented Mar 8, 2016

my main concern is that you are potentially generating a lot of data for the support bundles...

@rsandell

This comment has been minimized.

Copy link
Member Author

commented Mar 8, 2016

@stephenc Yes the size maybe is a concern. The list where you commented basically contains one line per running build of a job that was triggered and one line per "event" that triggered that build.
At my old place we could see maybe 200 to 400 as an average max, so I don't think that is too large. Unless there is something broken somewhere so you start seeing stale builds hanging around (basically what you would like to catch in this report) then we are talking about potential MBs.

The other report will contain a few lines per job that is configured, so in @scoheb 's case that's a couple of thousand jobs and can easily pass a couple of MBs.

@rsandell

This comment has been minimized.

Copy link
Member Author

commented Mar 9, 2016

But they should compress well

*/
* Gets the list of GerritServer.
*
* @return the list of GerritServer.

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 10, 2016

Member

Did you mean to go in and add a single additional space to all of these?

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 10, 2016

Author Member

Nope, see my comment above.

@@ -281,7 +295,11 @@ public GerritServer doAddNewServer(StaplerRequest req, StaplerResponse rsp) thro

@Override
public Object getTarget() {
Hudson.getInstance().checkPermission(Hudson.ADMINISTER);
Jenkins jenkins = Jenkins.getInstance();

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 10, 2016

Member

Why did we switch from Hudson instance to Jenkins here?

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 10, 2016

Author Member

Inherital sin

*
* @return the name
*/
String getDisplayName(); //TODO Replace with a default method in GerritEventListener when moving to Java 8

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 10, 2016

Member

Is this allowed/intended to return null? This one would be very helpful to add the appropriate Findbugs annotation to.

* @param name the url
* @return an url, relative or not.
*/
private String makeRelativeUrl(String context, String name) {

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 10, 2016

Member

I feel like this logic must exist somewhere else, with less potential for incorrect handling. Perhaps java.net.URI?

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 10, 2016

Author Member

java.net.URI seems to require absolute url with schema and all.
Jenkins has a Functions.joinPath(String... components) but it doesn't handle null values for path components so that if statement would need to be here in this case anyway.

Run run = en.getBuild();
String name = "<unknown>";
if (job != null) {
name = job.getFullName(); //Or should it be fullDisplayName?

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 10, 2016

Member

Yes, get fullDisplayName

* @author Robert Sandell &lt;rsandell@cloudbees.com&gt;.
*/
@Extension(optional = true)
public class EventListenersComponent extends Component {

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 10, 2016

Member

This is quite useful. +1

public synchronized BuildMemoryReport report() {
BuildMemoryReport report = new BuildMemoryReport();
for (Map.Entry<GerritTriggeredEvent, MemoryImprint> entry : memory.entrySet()) {
List<Entry> triggered = new LinkedList<Entry>();

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 10, 2016

Member

Why a LinkedList here and not an arraylist? Easier to follow if you call the 'for...' variable here 'triggering'

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 10, 2016

Author Member

Because an ArrayList preallocates memory that might not be needed, and when that memory is full it needs to allocate even more new memory and move that over.
According to what basic algorithm theory taught me in school; an ArrayList is only effective to use when you need to access specific indices in the list. Everywhere else it is ineffective.
Since this list will only be used to iterate over from start to end once or twice, a LinkedList is more effective to use.

This comment has been minimized.

Copy link
@reviewbybees

reviewbybees Mar 10, 2016

An ArrayList will have better cache locality when iterating and the many nodes of a LinkedList make it worse in terms of wasted memory.

Unlearning needs of you, young padawan.

Use a LinkedList when many inserts you expect. For append-only ArrayList is much better than you think

Sent from my iPhone

On 10 Mar 2016, at 11:47, Robert Sandell notifications@github.com wrote:

In src/main/java/com/sonyericsson/hudson/plugins/gerrit/trigger/gerritnotifier/model/BuildMemory.java:

@@ -406,6 +408,24 @@ public void setEntryFailureMessage(GerritTriggeredEvent event, Run r, String fai
}

 /**
  • \* Creates a snapshot clone of the current coordination memory status.
    
  • *
    
  • \* @return the report
    
  • */
    
  • @nonnull
  • public synchronized BuildMemoryReport report() {
  •    BuildMemoryReport report = new BuildMemoryReport();
    
  •    for (Map.Entry<GerritTriggeredEvent, MemoryImprint> entry : memory.entrySet()) {
    
  •        List<Entry> triggered = new LinkedList<Entry>();
    
    Because an ArrayList preallocates memory that might not be needed, and when that memory is full it needs to allocate even more new memory and move that over.
    According to what basic algorithm theory taught me in school; an ArrayList is only effective to use when you need to access specific indices in the list. Everywhere else it is ineffective.
    Since this list will only be used to iterate over from start to end once or twice, a LinkedList is more effective to use.


Reply to this email directly or view it on GitHub.

You received this message because you are subscribed to the Google Groups "engineering-code-reviews" group.
To unsubscribe from this group and stop receiving emails from it, send an email to engineering-code-reviews+unsubscribe@cloudbees.com.
To post to this group, send email to engineering-code-reviews@cloudbees.com.
To view this discussion on the web visit https://groups.google.com/a/cloudbees.com/d/msgid/engineering-code-reviews/jenkinsci/gerrit-trigger-plugin/pull/275/r55668111%40github.com.

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 10, 2016

Member

@rsandell True at an algorithms level, but it ignores the impact of locality of reference on the cache, and the extra object allocation + memory fragmentation for LinkedLists. That impact is quite gigantic, it turns out. The ArrayList can do a direct memory copy of the backing array contents (via System.arrayCopy) on array resize, which is memory bandwidth-limited and extremely fast (plus cache-friendly). Then each append operation is simply writing the pointer to the added object on the last free array element.

In comparison, LinkedList has to lookup the tail node, do heap allocation for a new node object (which comes with a larger overhead for each entry, due to created objects), and set the pointers for that node and the previous.

The memory structure also impacts iteration performance to some extent.

ArrayLists generally use considerably less memory than LinkedLists, because you're only storing an Object pointer per array slot, vs an object + a couple pointers (IIRC there's a stack overflow question out there on this with specific numbers). Even with 30-50% of your arraylist empty, they're still smaller.

Basically, even in many cases where LinkedList should be much faster than ArrayList, the difference is often un-noticeable due to memory effects. LinkedList only tends to be more performant with tiny arrays (where the performance is largely irrelevant), or special cases like removing elements via ListIterator from large lists. With big arrays, where it matters, ArrayList can end up being much faster as the collection grows (which is where performance matters).

http://stackoverflow.com/questions/5346039/java-linkedlist-slower-than-arraylist-when-adding-elements

And a basic benchmark and results here:
https://gist.github.com/svanoort/4256d1801d97a2b7eef3

There are actually articles out there now that boil down to "never use LinkedList" for this reason.
https://kjellkod.wordpress.com/2012/02/25/why-you-should-never-ever-ever-use-linked-list-in-your-code-again/

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 11, 2016

Author Member

That will take some time to unlearn :)

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 11, 2016

Author Member

Otoh the article that @svanoort linked to had a big disclaimer in the beginning that the title should be "Why you should never, ever, EVER use linked-list for number crunching" Where you need to do some back and forth in the list, and I've never disputed that ;)

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 12, 2016

Member

@rsandell But then they show straightforward benchmarks of common use cases.

The only cases where you see a major performance benefit for LinkedLists vs ArrayLists are where you repeatedly trigger large ArrayCopy operations (for example, insertion at the middle or beginning, and inserts at beginning may be reversed).

There's a reason the ArrayDeque (circular array impl) is recommended over Stack/Queue/LinkedList versions and some sources describe it as a bug to use the older classes in that way. But then again I'm an optimization nerd...

@svanoort

This comment has been minimized.

Copy link
Member

commented Mar 10, 2016

Overall:

  • Good:
    • the approach makes sense here, for dumping running state to use in debugging.
    • General implementation appears correct, from what I can tell, but concerned I might be missing something due to overall size here.
  • Bad:
    • It feels like we may be dumping a lot of information that may not be pertinent. I'd ask if we can reduce the amount of log produced here, to generate a better signal-to-noise ratio for analysis. Also, it might be helpful to offer separate options to dump for BuildMemory and EventListeners (with the former off by default, and the latter on).
    • I would consider if it might be better to annotate the appropriate classes for serialization to a dump, rather than writing custom logging here?
  • BuildMemoryReport object has a lot of deeply nested types and collections. Could we simplify this in some way to be easier to follow?
  • CheckMes: the null handling here feels iffy, I'm not 100% sure what intent is in some cases for proper behavior.

Unfortunately I'm not familiar enough with the groovy views here to follow those components well enough for proper review yet.

*
* @return a sorted list of this report's entries.
*/
public List<Map.Entry<GerritTriggeredEvent, List<BuildMemory.MemoryImprint.Entry>>> getSortedEntrySet() {

This comment has been minimized.

Copy link
@svanoort

svanoort Mar 10, 2016

Member

My eyes are bleeding about now. There's got to be a more straightforward way to do this.

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 10, 2016

Author Member

Not really, this is one of the more straight forward ways there are. But I guess you mean "better looking with less generics"? Then yes, but those are imho less straight forward ;)

@rsandell

This comment has been minimized.

Copy link
Member Author

commented Mar 10, 2016

@svanoort @stephenc Thanks for the review. I'll make some adjustments.

@stephenc

This comment has been minimized.

Copy link
Member

commented Oct 5, 2016

🐝 AIUI but I suspect you need a rebase before you can call this done

rsandell added some commits Mar 1, 2016

Moved the reports and main page into a separate package
Will make it a bit more clear where they are and make future
improvements easier.
Support Bundle component for the BuildMemoryReport
Support Core plugin usage is optional
Support Bundle component for EventListenersReport
Some minor refactorings as well to better reuse some things for the report.
Added a way to fake trigge some events in a development environment.
Fixed some issues in the diagnostic pages found while triggering some events.

@rsandell rsandell force-pushed the diagnostics branch from a800db2 to 830d288 Oct 7, 2016

@abayer

This comment has been minimized.

Copy link
Member

commented Oct 10, 2016

🐝

@rsandell rsandell merged commit 166b26c into master Nov 11, 2016

1 check passed

Jenkins This pull request looks good
Details

@rsandell rsandell deleted the diagnostics branch Nov 11, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.