-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDFS-16979. RBF: Add proxyuser port in hdfsauditlog #5552
Conversation
...s-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
Outdated
Show resolved
Hide resolved
...s-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
Outdated
Show resolved
Hide resolved
...s-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we need something more specific logic for dfsrouterPort.
As of today, Router rpc client is the only place that adds clientPort
and hence this PR logic would work fine and add dfsrouterPort
during audit log, but in future if someone else also adds clientPort
in the context, they would also automatically have dfsrouterPort
printed in audit logs, correct?
We need something very specific to router here.
...s-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
Outdated
Show resolved
Hide resolved
IMHO we could do something like this: add a new key/value pair in the CallerContext at RouterRpcClient level, e.g. |
💔 -1 overall
This message was automatically generated. |
30acb1e
to
3ceae91
Compare
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
...s-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
Outdated
Show resolved
Hide resolved
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropped some comments, the use case sounds fair enough to me, we should log the ip & port of the proxying user, here it is router.
@@ -47,6 +47,7 @@ public final class CallerContext { | |||
// field names | |||
public static final String CLIENT_IP_STR = "clientIp"; | |||
public static final String CLIENT_PORT_STR = "clientPort"; | |||
public static final String DFSROUTER_PORT_STR = "dfsrouterPort"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now it is only router that proxies request and in "Hadoop" code, anyone can pass the CLIENT_IP_PORT, lets keep the name generic, something likePROXY_USER_PORT
, maybe? or something better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LiuGuH if this comment is addressed, then also things look good IMO.
The point is both dfsrouterPort
and using isClientPortInfoAbsent()
to set it in the audit logs, together seems fragile because anyone in future passes clientPort in the context and will get dfsrouterPort printed, which is not correct for non-router usecases.
public class TestNamenodeAuditlogWithDFSRouterPort { | ||
private static final Logger LOG = LoggerFactory.getLogger( | ||
TestNamenodeAuditlogWithDFSRouterPort.class); | ||
private static final Pattern AUDIT_WITH_PORT_PATTERN = Pattern.compile( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need a new class for this test? No, right?
Explore adding the TestRouterRpc
, can derive pointers from testRealUserPropagationInCallerContext
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my environment, there is a maven conflict if I add dependence hadoop-hdfs-rbf test jar in hadoop-hdfs . The error detail is { java: Modules hadoop-yarn-server-tests and hadoop-yarn-server-resourcemanager must have the same 'additional command line parameters' specified because of cyclic dependencies between them } . I confused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestRouterRpc is in hadoop-rbf only, why you need to add hdfs-rbf jar hadoop-hdfs?
//When the request comes from the dfsrouter, origContext must contains CLIENT_PORT_STR. See RouterRpcClient.addClientInfoToCallerContext | ||
String clientPort = isClientPortInfoAbsent(ctx) ? CallerContext.CLIENT_PORT_STR : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't find this a problem, Even if someone else sends Client port, we are just logging his actual port in the audit log,
Though I don't like the name of the config mentioning dfsrouter, let it be any client who sends you the client port, if someone sends you a port via client port, we just added the actual client port.
At worst if someone finds serious objections on having this in, We can consider doing this only if the guy sending the request is configured under dfs.namenode.ip-proxy-users
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this all in the Namenode code?
Using Server.getRemotePort(). https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java#L394
Then it can be more generic. We can print the port of the proxyUser if it is different from the clientPort. Or even just always print the user port since we already print the IP.
...a/org/apache/hadoop/hdfs/server/federation/router/TestNamenodeAuditlogWithDFSRouterPort.java
Outdated
Show resolved
Hide resolved
09f0212
to
3632f46
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no isDfsRouter
or dfsRouterPort
, namenode in general shouldn't be doing router specific checks the code should be generic
change those callercontext values I mentioned above as something like PROXY_USER_PORT
#5552 (comment)
the test can be adjusted in existing class TestRouterRpc is a good candidate, ICYMI
FWIW. It is not the router which only passes Client details via CallerContext, it is a common practice in many downstream projects.
One I know is Hive (because I did it)
2023-03-30 13:25:10,899 INFO FSNamesystem.audit: allowed=true ugi=hive/cdp.3.210.175.29.nip.io@WORKSHOP.COM (auth:KERBEROS) ip=/10.0.1.16 cmd=open src=/warehouse/tablespace/managed/hive/abc/delta_0000001_0000001_0000/bucket_00000_0 dst=null perm=null proto=rpc callerContext=hive_sessionId_96572167-cb5e-4615-89ef-a542caa33148_User:ImpersonatedTestUser
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java
Outdated
Show resolved
Hide resolved
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor stuff, else lgtm
...s-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
Outdated
Show resolved
Hide resolved
...op-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Jenkins doesn't have any complains, the changes LGTM
@simbadzina have been around this code, any comments/suggestions?
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A checkstyle warning around space after , needs to be fixed
Once fixed I don't have anything further to add.
@goiri does this look fair enough, any comments/suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A checkstyle warning around space after , needs to be fixed
Once fixed I don't have anything further to add.
@goiri does this look fair enough, any comments/suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new code looks good to me.
@LiuGuH for future changes, please keep the commit history so people can see the changes between reviews.
OK,I'll pay attention to that. Thank you. |
💔 -1 overall
This message was automatically generated. |
Why all checks have failed ? Any suggestion for me to do with it? |
I'd try rebasing on the latest trunk. |
Those failing tests doesn't look related but have retriggered the build. Let's see... |
💔 -1 overall
This message was automatically generated. |
LGTM. |
Description of PR
when client is using proxyuser via realuser, the hdfs aduilg log is lack of dfsrouter port infomation.
client (using proxyuser)-> dfsrouter -> namenode
clientport dfsrouterport
hdfsauditlog should record dfsrouterport
How was this patch tested?
I add a test case for this.