Skip to content

Conversation

@DomGarguilo
Copy link
Member

In #1099 there are some lists of methods that have been reported as "hot method too big".

This PR refactors some of those methods to make them smaller. After these changes, the following methods are no longer reported as "hot method too big":

  • org.apache.accumulo.core.file.rfile.RelativeKey::<init>
  • org.apache.accumulo.core.file.rfile.RelativeKey::readFields
  • org.apache.accumulo.server.fs.FileManager$ScanFileManager::openFiles

The following have been refactored to reduce size, but are still being reported:

  • org.apache.accumulo.core.iteratorsImpl.system.LocalityGroupIterator::_seek
  • org.apache.accumulo.tserver.scan.LookupTask::run

@DomGarguilo
Copy link
Member Author

I have marked this as WIP since there is still work to be done here:

  • Look into refactoring more methods appearing in the list of methods in Examine Hot Methods #1099
  • Rename new methods. I am not sure what to name the new methods so suggestions are very welcome.
    • Namely, the new methods in RelativeKey: extracted() and getBytes(). Both of which I have left as thier IDE generated names

@DomGarguilo DomGarguilo marked this pull request as ready for review July 18, 2022 19:12
@DomGarguilo DomGarguilo changed the title WIP: First pass at refactoring hot methods Refactor hot methods Jul 18, 2022
@EdColeman
Copy link
Contributor

EdColeman commented Jul 19, 2022

Sorry - this comment applies to #2812

Looking at the method in 'Key' the method that performs the equality check on a byte array private static boolean isEqual(byte[] a1, byte[] a2) is already optimized for Accumulo row comparisons, checking the last few bytes first and does not seem to need additional optimization.

@DomGarguilo
Copy link
Member Author

Looking at the method in 'Key' the method that performs the equality check on a byte array private static boolean isEqual(byte[] a1, byte[] a2) is already optimized for Accumulo row comparisons, checking the last few bytes first and does not seem to need additional optimization.

Is this in relation to #2812? If so, do you think its worth investigating rearranging some of the comparisons made as mentioned in your comments here, or do you think #2812 should be closed?

@DomGarguilo
Copy link
Member Author

A few more methods were found (details). I'll look into those now, before this is merged.

@DomGarguilo
Copy link
Member Author

A few more methods were found (details). I'll look into those now, before this is merged.

Never mind. Took a look and I don't think those methods will be changed in this PR. This PR is ready for review.

@DomGarguilo
Copy link
Member Author

I have refactored Key.equals() in a way that makes it not show up in the hot methods list any more. Here is the method after refactoring:

  public boolean equals(Key other, PartialKey part) {
    boolean result = isEqual(row, other.row);
    if (part == ROW)
      return result;
    result &= isEqual(colFamily, other.colFamily);
    if (part == ROW_COLFAM)
      return result;
    result &= isEqual(colQualifier, other.colQualifier);
    if (part == ROW_COLFAM_COLQUAL)
      return result;
    result &= isEqual(colVisibility, other.colVisibility);
    if (part == ROW_COLFAM_COLQUAL_COLVIS)
      return result;
    result &= (timestamp == other.timestamp);
    if (part == ROW_COLFAM_COLQUAL_COLVIS_TIME)
      return result;
    result &= (deleted == other.deleted);
    if (part == ROW_COLFAM_COLQUAL_COLVIS_TIME_DEL)
      return result;

    throw new IllegalArgumentException("Unrecognized partial key specification " + part);
  }

It will fall through and compare each PartialKey until it reaches part and returns. Here is the current code to compare:

public boolean equals(Key other, PartialKey part) {
switch (part) {
case ROW:
return isEqual(row, other.row);
case ROW_COLFAM:
return isEqual(row, other.row) && isEqual(colFamily, other.colFamily);
case ROW_COLFAM_COLQUAL:
return isEqual(row, other.row) && isEqual(colFamily, other.colFamily)
&& isEqual(colQualifier, other.colQualifier);
case ROW_COLFAM_COLQUAL_COLVIS:
return isEqual(row, other.row) && isEqual(colFamily, other.colFamily)
&& isEqual(colQualifier, other.colQualifier)
&& isEqual(colVisibility, other.colVisibility);
case ROW_COLFAM_COLQUAL_COLVIS_TIME:
return isEqual(row, other.row) && isEqual(colFamily, other.colFamily)
&& isEqual(colQualifier, other.colQualifier)
&& isEqual(colVisibility, other.colVisibility) && timestamp == other.timestamp;
case ROW_COLFAM_COLQUAL_COLVIS_TIME_DEL:
return isEqual(row, other.row) && isEqual(colFamily, other.colFamily)
&& isEqual(colQualifier, other.colQualifier)
&& isEqual(colVisibility, other.colVisibility) && timestamp == other.timestamp
&& deleted == other.deleted;
default:
throw new IllegalArgumentException("Unrecognized partial key specification " + part);
}
}

While this refactored method is smaller and no longer "hot", it should probably be determined if this has any significant performance hits which is what is being invstigated in #2812

@milleruntime
Copy link
Contributor

Thanks for taking a look at these methods. I will take a look at this today.

Copy link
Contributor

@milleruntime milleruntime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if it makes a difference with the compiler but we have a bad habit of jamming static classes within another class. A lot of these could be pulled out into their own class.

@DomGarguilo
Copy link
Member Author

In 89a2f32 I added the equals method described above. This updated method does not appear on the hot methods list.

# Conflicts:
#	server/tserver/src/main/java/org/apache/accumulo/tserver/scan/LookupTask.java
# Conflicts:
#	server/tserver/src/main/java/org/apache/accumulo/tserver/scan/LookupTask.java
@milleruntime
Copy link
Contributor

@DomGarguilo is this PR ready to merge? If so, I can take another close look at it.

@DomGarguilo
Copy link
Member Author

@DomGarguilo is this PR ready to merge? If so, I can take another close look at it.

Yes, I think I am done making changes here and was just waiting for someone to verify these changes.

Co-authored-by: Keith Turner <kturner@apache.org>
Copy link
Member

@ctubbsii ctubbsii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't finish looking at all of this. I realized it may be easier to try to fix one thing at a time, rather than try to fix all hot methods in a single big PR. Smaller, more narrowly scoped PRs get reviewed and merged faster, and you can build on them incrementally with lower risk. And, there's less drift and merge conflicts when smaller PRs get merged faster.

DomGarguilo and others added 3 commits September 29, 2022 11:59
Co-authored-by: Keith Turner <kturner@apache.org>
Co-authored-by: Christopher Tubbs <ctubbsii@apache.org>
# Conflicts:
#	server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/CompactableImpl.java
@ctubbsii ctubbsii merged commit cd0113e into apache:main Sep 29, 2022
@DomGarguilo DomGarguilo deleted the hotMethods branch September 29, 2022 18:21
@ctubbsii ctubbsii added this to the 2.1.0 milestone Jul 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants