Skip to content

Comments

[DRAFT][SPARK-23607][CORE] Use HDFS extended attributes to store application summary information in SHS#40949

Closed
thejdeep wants to merge 1 commit intoapache:masterfrom
thejdeep:SPARK-23607-1
Closed

[DRAFT][SPARK-23607][CORE] Use HDFS extended attributes to store application summary information in SHS#40949
thejdeep wants to merge 1 commit intoapache:masterfrom
thejdeep:SPARK-23607-1

Conversation

@thejdeep
Copy link
Contributor

What changes were proposed in this pull request?

Previously, #34829 was raised which was auto closed due to staleness. This is a rework of that by addressing comments raised.

This PR seeks to improve the performance of serving the application list in History Server by storing the required information of the application as part of HDFS extended attributes instead of parsing the log file each time.

Why are the changes needed?

Improves the performance of the History Server listing page

Below is the comparison of the time taken to complete mergeApplicationListing call in FsHistoryProvider that is called for every log file that is updated in the event log directory :

Event log size Extended Attributes disabled (in ms) Extended Attributes enabled (in ms)
122MB 1340 137
10.2MB 866 135
5.5MB 645 136
0.6MB 505 134
0.8MB 525 137

As we can see in the comparison above, irrespective of the event log size, the time to build the application listing for the app remains the same. This is hugely beneficial in clusters that have very large event log sizes.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Will add unit tests

 This PR seeks to improve the performance of serving the application list in History Server by storing the required information of the application as part of HDFS extended attributes instead of parsing the log file each time.

 ### Why are the changes needed?

 Improves the performance of the History Server listing page

 ### Does this PR introduce _any_ user-facing change?

 No.

 ### How was this patch tested?
 Unit tests to be added
@github-actions github-actions bot added the CORE label Apr 25, 2023
Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took one pass through the changes.
We need to add unit tests to validate these changes are working as expected.

case _: IOException =>
logWarning(s"Failed to set extended attribute ${attrName}")
case _: UnsupportedOperationException =>
logWarning("setXAttr not supported by filesystem")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make these INFO ?

.createOptional

val EVENT_LOG_XATTR_ENABLED = ConfigBuilder("spark.history.fs.eventLog.xattr.enabled")
.version("3.3.0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

version 3.5.0

None, reader.completed))
count = count + 1
reader.fileSizeForLastIndex > 0
handleNewLogFile(reader, newLastScanTime)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

increment count

var xAttrStatus = LogXAttrStatus.XATTR_DISABLED
if(HISTORY_LOG_USE_XATTR) {
val isXAttrEnabled = getXAttr(reader.rootPath, EventLoggingListener.USER_XATTR_ENABLED)
if (isXAttrEnabled.isDefined && new String(isXAttrEnabled.get) == "true") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encode/decode bytes explicitly using UTF-8, and not platform default.

nit: Also, use something like
xattrEnabled.exists(data => new String(data, StandardCharsets.UTF_8) == "true") instead of the current condition

private def handleNewLogFile(reader: EventLogFileReader, scanTime: Long): Boolean = {
var xAttrStatus = LogXAttrStatus.XATTR_DISABLED
if(HISTORY_LOG_USE_XATTR) {
val isXAttrEnabled = getXAttr(reader.rootPath, EventLoggingListener.USER_XATTR_ENABLED)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: isXAttrEnabled -> xattrEnabled

case _: NoSuchElementException =>
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly duplicated in doMergeApplicationListingInternal - and there is some changes between both - please update this method and ensure doMergeApplicationListingInternal delegates to this method as well.

try {
val valueMap = fs.getXAttrs(path, nameList.asJava)
if(valueMap != null) {
Some(valueMap.asScala.toMap.map(value => value._1 -> new String(value._2)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned earlier, use utf-8 when converting bytes to String

private def getXAttr(path: Path, name: String): Option[String] = {
val valueMap = getXAttrs(path, List(name))
if (!valueMap.isEmpty) {
Some(valueMap.get(name))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Option instead of Some

var name: String = null

def toView(attempts: List[AttemptInfoWrapper]): ApplicationInfoWrapper = {
val apiInfo = ApplicationInfo(id.get, name, None, None, None, None, Nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we assuming id wont be None ?
If that is not a valid assumption, replace with id.orNull

viewAcls,
adminAclsGroups,
viewAclsGroups)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What changed in MutableAttemptInfo ?
Also, why did we pull it out ?

@github-actions
Copy link

github-actions bot commented Aug 8, 2023

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants