[SPARK-29720][CORE] Add linux condition to make ProcfsMetricsGetter more complete #26365
Can one of the admins verify this patch?
Also cc @maropu @dongjoon-hyun
Away from keyboard now, so I will check tonight. cc: @viirya
This is not my area, but is there any harm without this PR? It seems these metrics are turned off when an exception is caught:
This catch just confirm
@@ -26,6 +26,8 @@ import scala.collection.mutable
 import scala.collection.mutable.ArrayBuffer
 import scala.util.Try

+import org.apache.hadoop.util.Shell
How about checking os.name instead of using hadoop-common?
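A minimal sketch of what an os.name check without hadoop-common could look like. The object and method names are hypothetical and are not part of Spark or Hadoop; the prefix match on "linux" is an assumption about how the property is usually reported.

```scala
import java.util.Locale

object OsNameCheckSketch {
  // Hypothetical helper: treats any os.name value starting with "linux"
  // (case-insensitive) as Linux. A null or unexpected value yields false.
  def isLinux(osName: String): Boolean =
    osName != null && osName.toLowerCase(Locale.ROOT).startsWith("linux")

  // Reads the standard JVM system property "os.name".
  def currentOsIsLinux: Boolean = isLinux(System.getProperty("os.name"))
}
```

A prefix test like this would return false on any system whose os.name does not start with "Linux", which is the compatibility concern raised later in this thread.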
@@ -62,13 +64,14 @@ private[spark] class ProcfsMetricsGetter(procfsDir: String = "/proc/") extends L
 SparkEnv.get.conf.get(config.EVENT_LOG_STAGE_EXECUTOR_METRICS)
 val shouldLogStageExecutorProcessTreeMetrics =
 SparkEnv.get.conf.get(config.EVENT_LOG_PROCESS_TREE_METRICS)
-procDirExists.get && shouldLogStageExecutorProcessTreeMetrics && shouldLogStageExecutorMetrics
+procDirExists.get && shouldLogStageExecutorProcessTreeMetrics &&
+shouldLogStageExecutorMetrics && Shell.LINUX
Let's move the OS check to the first condition.
OK.
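The suggested reordering could be sketched roughly as below. This is an illustration only, not the code in the PR: the object, the method, and its by-name parameters are hypothetical stand-ins for the Shell.LINUX check, the /proc existence check, and the two config flags.

```scala
object ConditionOrderSketch {
  // All names here are hypothetical. By-name parameters (=> Boolean) are
  // evaluated only if the && chain actually reaches them, so putting the
  // OS check first skips the remaining checks on non-Linux systems.
  def isProcfsAvailableSketch(
      isLinux: => Boolean,
      procDirExists: => Boolean,
      processTreeMetricsEnabled: => Boolean,
      executorMetricsEnabled: => Boolean): Boolean = {
    isLinux && procDirExists && processTreeMetricsEnabled && executorMetricsEnabled
  }
}
```

In the hunk above the config values are read into vals before the && chain, so there the ordering mainly affects readability rather than which lookups run.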
Why do we make a /proc?
Are we sure we want to support only Linux? What about other Unix-like systems that have /proc?
Hadoop's ProcfsBasedProcessTree performs this check, and it has been there for a long time.
It looks like in isProcfsAvailable we check for the existence of /proc instead of checking for Linux as Hadoop does. Isn't that enough? And what about other Unix-like systems that have /proc? We do not need to follow ProcfsBasedProcessTree.
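For reference, a directory-existence check of the kind described here might look like the sketch below. The object and method names are hypothetical, and this is not claimed to be the exact code in ProcfsMetricsGetter.

```scala
import java.nio.file.{Files, Paths}
import scala.util.Try

object ProcDirCheckSketch {
  // Hypothetical helper: true if the given path exists. Any non-fatal
  // failure while probing the filesystem is treated as "not available".
  def procDirExists(procfsDir: String): Boolean =
    Try(Files.exists(Paths.get(procfsDir))).getOrElse(false)
}
```

Because the check only tests that the path exists, an empty directory created by hand at that path would also pass it. That is the scenario the PR description raises with mkdir /proc.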
AFAIC procfs is a Linux feature; other Unix-like OSes may try to be compatible with it. Checking for the /proc directory is not a better idea than checking for the Linux OS. One concern is that on an OS that does not support procfs natively, someone can change the /proc information, and that would be a vulnerability.
If /proc exists on other Unix-like systems, this change actually removes support for those systems. Unless we are sure the current code does not work on such systems, I think we should not simply remove their support.
Yeah, I don't see the point here. If you've tried to fake /proc then that's the problem. I'm slightly concerned this will make it stop working on an OS that is UNIX-like but without "Linux" in the name.
I agree with others here. procfs-based systems are not just Linux systems (I have a vague memory that this was discussed when the PR for procfs metrics was opened), and this feature isn't enabled by default. But I think we need to add documentation for the config saying that the feature only works on procfs-based systems. We missed that doc at the time of adding the feature. I mean adding a ".doc" for the config in package.scala.
Actually, I am not sure whether we will report the correct metrics on other procfs-based systems, since our computation is based on http://man7.org/linux/man-pages/man5/proc.5.html, which is Linux-dependent. I haven't tested this feature on non-Linux systems. So I think it makes sense to check whether the OS is actually Linux. Thanks for the PR.
Thanks for the review.
@ulysses-you In my last comment I said this PR makes sense. However, I just checked Solaris. There is no stat file there, so the procfs metrics will raise an exception and return all zeros because of the check in this line. It won't give wrong info back to the user in the case of Solaris. The same may hold for other OSes. Still, there is no harm in having your change if others also agree.
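The fallback behaviour described here (a missing stat file leads to zeros rather than wrong numbers) can be sketched as follows. This is an assumption-laden illustration, not Spark's implementation: all names are hypothetical, and the field positions (vsize as field 23 and rss as field 24 of /proc/[pid]/stat) are taken from the Linux proc(5) man page linked above.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}
import scala.util.Try

object StatFileSketch {
  // Hypothetical container for two values from /proc/[pid]/stat.
  final case class MemSample(vsizeBytes: Long, rssPages: Long)
  val Zero: MemSample = MemSample(0L, 0L)

  // Parses one stat line. The command name (field 2) is wrapped in
  // parentheses and may contain spaces, so we split only after the last ')'.
  def parseStatLine(line: String): Option[MemSample] = {
    val close = line.lastIndexOf(')')
    if (close < 0) None
    else {
      val rest = line.substring(close + 1).trim.split("\\s+")
      // rest(0) is field 3 (state), so field 23 is rest(20) and field 24 is rest(21).
      Try(MemSample(rest(20).toLong, rest(21).toLong)).toOption
    }
  }

  // Returns zeros when the stat file is missing, unreadable, or malformed.
  def readOrZero(procfsDir: String, pid: Int): MemSample = {
    val statFile = Paths.get(procfsDir, pid.toString, "stat")
    Try(new String(Files.readAllBytes(statFile), StandardCharsets.UTF_8))
      .toOption
      .flatMap(parseStatLine)
      .getOrElse(Zero)
  }
}
```

On a system whose /proc lacks a Linux-style stat file, readOrZero falls through to Zero, which mirrors the "all zeros, no wrong info" outcome described for Solaris.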
@rezasafi Oh I see... pending the last review.
The problem isn't Solaris, but rather, Linux distros that for whatever reason don't begin with "Linux" in os.name. I think it's not worth possibly breaking this.
I agree with Sean, I'm not really sure what the value is of adding this, and it'll potentially prevent the feature from being used in cases it should be fine. Suppose somebody does create
Thanks for the review, I will close this.
Is this why I keep getting the following warning on Windows?
What changes were proposed in this pull request?
Add a Shell.LINUX condition. Related PR: 22612.
Why are the changes needed?
procfs is only supported on Linux. We can just mkdir /proc on macOS, and Spark cannot recognize what it is. So a Linux condition should be added.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Existing UT.