[SPARK-49090][CORE] Support JWSFilter
#47575
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-jackson</artifactId>
<version>0.12.6</version>
<scope>test</scope>
I added this as a test dependency for now because the user may want to use GSON instead of this.
Could you review this PR about Spark UI (including Spark Cluster), @viirya?
val claims = Jwts.parser().verifyWith(key).build().parseSignedClaims(token)
chain.doFilter(req, res)
case _ =>
  hres.sendError(HttpServletResponse.SC_FORBIDDEN, s"Malformed ${AUTHORIZATION} header.")
- hres.sendError(HttpServletResponse.SC_FORBIDDEN, s"Malformed ${AUTHORIZATION} header.")
+ hres.sendError(HttpServletResponse.SC_FORBIDDEN, s"Malformed JWT ${AUTHORIZATION} header.")
Thank you, but actually, the previous one is better because `Bearer` is just one of the possible `<type>` values. The `Authorization: <type> <credentials>` pattern was introduced by the W3C in the HTTP 1.0 spec; it is not specific to JWT.
Hmm, but the current one also doesn't have `Bearer`.
Yes, and the missing `Bearer` is an issue with the `Authorization` header, not with a JWT token. At this point, the JWT token itself doesn't exist yet.
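The distinction discussed in this thread (a missing header, a header that does not follow the `<type> <credentials>` pattern, and a well-formed `Bearer` header) can be sketched as follows. This is a minimal illustration, not the actual `JWSFilter` code: the class name `AuthorizationHeaderSketch`, the `Result` enum, and the exact-match check on `Bearer` are all assumptions made for the example.

```java
public class AuthorizationHeaderSketch {
    /** Hypothetical outcome of inspecting a raw Authorization header value. */
    enum Result { MISSING, MALFORMED, BEARER }

    // Classifies a header value against the `<type> <credentials>` pattern.
    // Assumption: only the exact type "Bearer" followed by non-empty credentials is accepted.
    static Result classify(String headerValue) {
        if (headerValue == null) {
            return Result.MISSING;            // no Authorization header at all
        }
        String[] parts = headerValue.trim().split("\\s+", 2);
        if (parts.length == 2 && parts[0].equals("Bearer") && !parts[1].isEmpty()) {
            return Result.BEARER;             // credentials can now be checked as a JWS
        }
        return Result.MALFORMED;              // header problem; no token has been parsed yet
    }

    public static void main(String[] args) {
        System.out.println(classify(null));
        System.out.println(classify("abc.def.ghi"));
        System.out.println(classify("Bearer abc.def.ghi"));
    }
}
```

In this sketch the `MALFORMED` case is decided before any token parsing happens, which mirrors the argument above that the error belongs to the `Authorization` header rather than to the JWT.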
<dependency>
  <groupId>io.jsonwebtoken</groupId>
  <artifactId>jjwt-api</artifactId>
  <version>0.12.6</version>
</dependency>
If users don't use the JWSFilter feature, do we still need to include this new dependency?
Yes, for now. Of course, we can make this a `profile`.
Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Thank you, @viirya!
jettison/1.5.4//jettison-1.5.4.jar
jetty-util-ajax/11.0.21//jetty-util-ajax-11.0.21.jar
jetty-util/11.0.21//jetty-util-11.0.21.jar
jjwt-api/0.12.6//jjwt-api-0.12.6.jar
Do we need to update our NOTICE-binary?
Thanks. Sure, @yaooqinn ! It's Apache License. Let me add this item.
Thank you for the update! @dongjoon-hyun
Thank you, @yaooqinn!
Merged to master for Apache Spark 4.0.0-preview2.
### What changes were proposed in this pull request?

This PR aims to support the `spark.master.rest.filters` configuration, like the existing `spark.ui.filters` configuration.

Apache Spark recently started to support `JWSFilter`. We can take advantage of `JWSFilter` to protect the Spark Master REST API.
- #47575

### Why are the changes needed?

As with `Spark UI`, we should provide the same capability for the Apache Spark Master REST API.
For example, we can protect the `Spark Master REST API` with `JWSFilter` as follows.

**MASTER REST API WITH JWSFilter**
```
$ build/sbt package
$ cp jjwt-impl-0.12.6.jar assembly/target/scala-2.13/jars
$ cp jjwt-jackson-0.12.6.jar assembly/target/scala-2.13/jars
$ SPARK_NO_DAEMONIZE=1 \
SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true -Dspark.master.rest.filters=org.apache.spark.ui.JWSFilter -Dspark.org.apache.spark.ui.JWSFilter.param.key=VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4=" \
sbin/start-master.sh
```

**AUTHORIZATION FAILURE**
```
$ curl -v -XPOST http://localhost:6066/v1/submissions/clear
* Host localhost:6066 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:6066...
* connect to ::1 port 6066 from ::1 port 51705 failed: Connection refused
* Trying 127.0.0.1:6066...
* Connected to localhost (127.0.0.1) port 6066
> POST /v1/submissions/clear HTTP/1.1
> Host: localhost:6066
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 403 Forbidden
< Date: Sat, 03 Aug 2024 22:18:03 GMT
< Cache-Control: must-revalidate,no-cache,no-store
< Content-Type: text/html;charset=iso-8859-1
< Content-Length: 590
< Server: Jetty(11.0.21)
<
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 403 Authorization header is missing.</title>
</head>
<body><h2>HTTP ERROR 403 Authorization header is missing.</h2>
<table>
<tr><th>URI:</th><td>/v1/submissions/clear</td></tr>
<tr><th>STATUS:</th><td>403</td></tr>
<tr><th>MESSAGE:</th><td>Authorization header is missing.</td></tr>
<tr><th>SERVLET:</th><td>org.apache.spark.deploy.rest.StandaloneClearRequestServlet-7f171159</td></tr>
</table>
<hr/><a href="https://eclipse.org/jetty">Powered by Jetty:// 11.0.21</a><hr/>

</body>
</html>
* Connection #0 to host localhost left intact
```

**SUCCESS**
```
$ curl -v -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw" -XPOST http://localhost:6066/v1/submissions/clear
* Host localhost:6066 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:6066...
* connect to ::1 port 6066 from ::1 port 51697 failed: Connection refused
* Trying 127.0.0.1:6066...
* Connected to localhost (127.0.0.1) port 6066
> POST /v1/submissions/clear HTTP/1.1
> Host: localhost:6066
> User-Agent: curl/8.7.1
> Accept: */*
> Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw
>
* Request completely sent off
< HTTP/1.1 200 OK
< Date: Sat, 03 Aug 2024 22:16:51 GMT
< Content-Type: application/json;charset=utf-8
< Content-Length: 113
< Server: Jetty(11.0.21)
<
{
  "action" : "ClearResponse",
  "message" : "",
  "serverSparkVersion" : "4.0.0-SNAPSHOT",
  "success" : true
* Connection #0 to host localhost left intact
}%
```

### Does this PR introduce _any_ user-facing change?

No, this is a new feature which is not loaded by default.

### How was this patch tested?

Pass the CIs with the newly added test case.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #47595 from dongjoon-hyun/SPARK-49103.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
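For a programmatic client, the same request as the `curl` call above could be built with the JDK's `java.net.http` API. This is only a sketch: the class name `RestRequestSketch` and the helper `clearRequest` are hypothetical, and the example only constructs the request without sending it, since sending requires a running Master.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class RestRequestSketch {
    // Builds a POST to the clear endpoint shown in the curl example,
    // carrying the JWS in an `Authorization: Bearer <jws>` header.
    static HttpRequest clearRequest(String baseUrl, String jws) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v1/submissions/clear"))
            .header("Authorization", "Bearer " + jws)
            .POST(HttpRequest.BodyPublishers.noBody())
            .build();
    }

    public static void main(String[] args) {
        HttpRequest req = clearRequest(
            "http://localhost:6066",
            "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw");
        // To actually send it (needs a running Master with the REST API enabled):
        // HttpClient.newHttpClient().send(req, HttpResponse.BodyHandlers.ofString());
        System.out.println(req.method() + " " + req.uri());
    }
}
```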
…REST API and rename parameter to `secretKey`

### What changes were proposed in this pull request?

This PR aims to do the following.
- Document `JWSFilter` and its usage in `Spark UI` and `REST API`
  - `Spark UI` section of `Configuration` page
  - `Spark Security` page
  - `Spark Standalone` page
- Rename the parameter `key` to `secretKey` so that it is redacted in the Spark Driver UI and Spark Master UI.

### Why are the changes needed?

To apply the recent new security features
- #47575
- #47595

### Does this PR introduce _any_ user-facing change?

No, because this is a new feature of Apache Spark 4.0.0.

### How was this patch tested?

Pass the CIs and manual review.
- `spark-standalone.html`
- `security.html`
- `configuration.html`

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #47596 from dongjoon-hyun/SPARK-49104.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?

This PR aims to support `JWSFilter`, a servlet filter that requires a `JWS` (a cryptographically signed JSON Web Token) in the header. It is enabled via the `spark.ui.filters` configuration.
- spark.ui.filters=org.apache.spark.ui.JWSFilter
- spark.org.apache.spark.ui.JWSFilter.param.key=YOUR-BASE64URL-ENCODED-KEY

Simply put, `JWSFilter` checks the following for all requests.
- The HTTP request should have an `Authorization: Bearer <jws>` header.
- `<jws>` is a string with three fields, `<header>.<payload>.<signature>`.
- `<header>` is supposed to be a base64url-encoded string of `{"alg":"HS256","typ":"JWT"}`.
- `<payload>` is a base64url-encoded string of fully user-defined content.
- `<signature>` is a signature based on `<header>.<payload>` and a user-provided key parameter.

For example, the value of `<header>` will always be `eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9`, and the value of `<payload>` can be `e30` if the payload is empty, `{}`. The `<signature>` part depends on the value of `spark.org.apache.spark.ui.JWSFilter.param.key` shared between the server and the client.
```
jshell> java.util.Base64.getUrlEncoder().encodeToString("{\"alg\":\"HS256\",\"typ\":\"JWT\"}".getBytes())
$2 ==> "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"

jshell> java.util.Base64.getUrlEncoder().encodeToString("{}".getBytes())
$3 ==> "e30="
```

Note that a JWS uses base64url without padding, so the trailing `=` printed by `getUrlEncoder()` is dropped and the payload field is `e30`.

### Why are the changes needed?

To provide slightly better security for the Web UI in a consistent way, including Spark Standalone clusters.

For example,

**SETTING**
```
$ jshell
| Welcome to JShell -- Version 17.0.12
| For an introduction type: /help intro

jshell> java.util.Base64.getUrlEncoder().encodeToString("Visit https://spark.apache.org to download Apache Spark.".getBytes())
$1 ==> "VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4="
```

```
$ cat conf/spark-defaults.conf
spark.ui.filters org.apache.spark.ui.JWSFilter
spark.org.apache.spark.ui.JWSFilter.param.key VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4=
```

**SPARK-SHELL**
```
$ build/sbt package
$ cp jjwt-impl-0.12.6.jar assembly/target/scala-2.13/jars
$ cp jjwt-jackson-0.12.6.jar assembly/target/scala-2.13/jars
$ bin/spark-shell
```

Without JWS (ErrorCode: 403 Forbidden)
```
$ curl -v http://localhost:4040/
* Host localhost:4040 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:4040...
* connect to ::1 port 4040 from ::1 port 61313 failed: Connection refused
* Trying 127.0.0.1:4040...
* Connected to localhost (127.0.0.1) port 4040
> GET / HTTP/1.1
> Host: localhost:4040
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 403 Forbidden
< Date: Fri, 02 Aug 2024 01:27:23 GMT
< Cache-Control: must-revalidate,no-cache,no-store
< Content-Type: text/html;charset=iso-8859-1
< Content-Length: 472
<
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 403 Authorization header is missing.</title>
</head>
<body><h2>HTTP ERROR 403 Authorization header is missing.</h2>
<table>
<tr><th>URI:</th><td>/</td></tr>
<tr><th>STATUS:</th><td>403</td></tr>
<tr><th>MESSAGE:</th><td>Authorization header is missing.</td></tr>
<tr><th>SERVLET:</th><td>org.apache.spark.ui.JettyUtils$$anon$2-3b39bee2</td></tr>
</table>

</body>
</html>
* Connection #0 to host localhost left intact
```

With JWS
```
$ curl -v -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw" http://localhost:4040/
* Host localhost:4040 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:4040...
* connect to ::1 port 4040 from ::1 port 61311 failed: Connection refused
* Trying 127.0.0.1:4040...
* Connected to localhost (127.0.0.1) port 4040
> GET / HTTP/1.1
> Host: localhost:4040
> User-Agent: curl/8.7.1
> Accept: */*
> Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw
>
* Request completely sent off
< HTTP/1.1 302 Found
< Date: Fri, 02 Aug 2024 01:27:01 GMT
< Cache-Control: no-cache, no-store, must-revalidate
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< Location: http://localhost:4040/jobs/
< Content-Length: 0
<
* Connection #0 to host localhost left intact
```

**SPARK MASTER**

Without JWS (ErrorCode: 403 Forbidden)
```
$ curl -v http://localhost:8080/json/
* Host localhost:8080 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:8080...
* connect to ::1 port 8080 from ::1 port 61331 failed: Connection refused
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080
> GET /json/ HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 403 Forbidden
< Date: Fri, 02 Aug 2024 01:34:03 GMT
< Cache-Control: must-revalidate,no-cache,no-store
< Content-Type: text/html;charset=iso-8859-1
< Content-Length: 477
<
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 403 Authorization header is missing.</title>
</head>
<body><h2>HTTP ERROR 403 Authorization header is missing.</h2>
<table>
<tr><th>URI:</th><td>/json/</td></tr>
<tr><th>STATUS:</th><td>403</td></tr>
<tr><th>MESSAGE:</th><td>Authorization header is missing.</td></tr>
<tr><th>SERVLET:</th><td>org.apache.spark.ui.JettyUtils$$anon$1-6c52101f</td></tr>
</table>

</body>
</html>
* Connection #0 to host localhost left intact
```

With JWS
```
$ curl -v -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw" http://localhost:8080/json/
* Host localhost:8080 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:8080...
* connect to ::1 port 8080 from ::1 port 61329 failed: Connection refused
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080
> GET /json/ HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.7.1
> Accept: */*
> Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw
>
* Request completely sent off
< HTTP/1.1 200 OK
< Date: Fri, 02 Aug 2024 01:33:10 GMT
< Cache-Control: no-cache, no-store, must-revalidate
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< Content-Type: text/json;charset=utf-8
< Vary: Accept-Encoding
< Content-Length: 320
<
{
  "url" : "spark://M3-Max.local:7077",
  "workers" : [ ],
  "aliveworkers" : 0,
  "cores" : 0,
  "coresused" : 0,
  "memory" : 0,
  "memoryused" : 0,
  "resources" : [ ],
  "resourcesused" : [ ],
  "activeapps" : [ ],
  "completedapps" : [ ],
  "activedrivers" : [ ],
  "completeddrivers" : [ ],
  "status" : "ALIVE"
* Connection #0 to host localhost left intact
}%
```

### Does this PR introduce _any_ user-facing change?

No, this is a new filter.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#47575 from dongjoon-hyun/SPARK-49090.

Lead-authored-by: Dongjoon Hyun <dhyun@apple.com>
Co-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
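The `<header>.<payload>.<signature>` layout described in this PR can be reproduced with only the JDK, which may help when preparing a token for the `curl` calls. The following is a minimal sketch, not the filter's own code: the class name `JwsSketch` and the method `sign` are hypothetical, and it assumes that the filter base64url-decodes the configured key and uses the resulting bytes as the HMAC-SHA256 key.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class JwsSketch {
    // JWS fields use base64url WITHOUT '=' padding.
    private static final Base64.Encoder B64URL = Base64.getUrlEncoder().withoutPadding();

    // Returns "<header>.<payload>.<signature>" signed with HMAC-SHA256 (alg HS256).
    static String sign(String headerJson, String payloadJson, byte[] key) {
        String signingInput =
            B64URL.encodeToString(headerJson.getBytes(StandardCharsets.UTF_8))
            + "."
            + B64URL.encodeToString(payloadJson.getBytes(StandardCharsets.UTF_8));
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            byte[] signature = mac.doFinal(signingInput.getBytes(StandardCharsets.US_ASCII));
            return signingInput + "." + B64URL.encodeToString(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is unavailable or the key is invalid", e);
        }
    }

    public static void main(String[] args) {
        // Assumption: these raw bytes are what the base64url-encoded config value decodes to.
        byte[] key = "Visit https://spark.apache.org to download Apache Spark."
            .getBytes(StandardCharsets.UTF_8);
        System.out.println(sign("{\"alg\":\"HS256\",\"typ\":\"JWT\"}", "{}", key));
    }
}
```

The printed value can be passed as `Authorization: Bearer <jws>`; whether the filter accepts it depends on the key-handling assumption stated above.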
### What changes were proposed in this pull request? This PR aims to support `spark.master.rest.filters` configuration like the existing `spark.ui.filters` configuration. Recently, Apache Spark starts to support `JWSFilter`. We can take advantage of `JWSFilter` to protect Spark Master REST API. - apache#47575 ### Why are the changes needed? Like `Spark UI`, we had better provide the same capability to Apache Spark Master REST API . For example, we can protect `JWSFilter` to `Spark Master REST API` like the following. **MASTER REST API WITH JWSFilter** ``` $ build/sbt package $ cp jjwt-impl-0.12.6.jar assembly/target/scala-2.13/jars $ cp jjwt-jackson-0.12.6.jar assembly/target/scala-2.13/jars $ SPARK_NO_DAEMONIZE=1 \ SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true -Dspark.master.rest.filters=org.apache.spark.ui.JWSFilter -Dspark.org.apache.spark.ui.JWSFilter.param.key=VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4=" \ sbin/start-master.sh ``` **AUTHORIZATION FAILURE** ``` $ curl -v -XPOST http://localhost:6066/v1/submissions/clear * Host localhost:6066 was resolved. * IPv6: ::1 * IPv4: 127.0.0.1 * Trying [::1]:6066... * connect to ::1 port 6066 from ::1 port 51705 failed: Connection refused * Trying 127.0.0.1:6066... 
* Connected to localhost (127.0.0.1) port 6066 > POST /v1/submissions/clear HTTP/1.1 > Host: localhost:6066 > User-Agent: curl/8.7.1 > Accept: */* > * Request completely sent off < HTTP/1.1 403 Forbidden < Date: Sat, 03 Aug 2024 22:18:03 GMT < Cache-Control: must-revalidate,no-cache,no-store < Content-Type: text/html;charset=iso-8859-1 < Content-Length: 590 < Server: Jetty(11.0.21) < <html> <head> <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/> <title>Error 403 Authorization header is missing.</title> </head> <body><h2>HTTP ERROR 403 Authorization header is missing.</h2> <table> <tr><th>URI:</th><td>/v1/submissions/clear</td></tr> <tr><th>STATUS:</th><td>403</td></tr> <tr><th>MESSAGE:</th><td>Authorization header is missing.</td></tr> <tr><th>SERVLET:</th><td>org.apache.spark.deploy.rest.StandaloneClearRequestServlet-7f171159</td></tr> </table> <hr/><a href="https://eclipse.org/jetty">Powered by Jetty:// 11.0.21</a><hr/> </body> </html> * Connection #0 to host localhost left intact ``` **SUCCESS** ``` $ curl -v -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw" -XPOST http://localhost:6066/v1/submissions/clear * Host localhost:6066 was resolved. * IPv6: ::1 * IPv4: 127.0.0.1 * Trying [::1]:6066... * connect to ::1 port 6066 from ::1 port 51697 failed: Connection refused * Trying 127.0.0.1:6066... 
* Connected to localhost (127.0.0.1) port 6066 > POST /v1/submissions/clear HTTP/1.1 > Host: localhost:6066 > User-Agent: curl/8.7.1 > Accept: */* > Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw > * Request completely sent off < HTTP/1.1 200 OK < Date: Sat, 03 Aug 2024 22:16:51 GMT < Content-Type: application/json;charset=utf-8 < Content-Length: 113 < Server: Jetty(11.0.21) < { "action" : "ClearResponse", "message" : "", "serverSparkVersion" : "4.0.0-SNAPSHOT", "success" : true * Connection #0 to host localhost left intact }% ``` ### Does this PR introduce _any_ user-facing change? No, this is a new feature which is not loaded by default. ### How was this patch tested? Pass the CIs with newly added test case. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47595 from dongjoon-hyun/SPARK-49103. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
…REST API and rename parameter to `secretKey` ### What changes were proposed in this pull request? This PR aims the following. - Document `JWSFilter` and its usage in `Spark UI` and `REST API` - `Spark UI` section of `Configuration` page - `Spark Security` page - `Spark Standalone` page - Rename the parameter `key` to `secretKey` to redact it in Spark Driver UI and Spark Master UI. ### Why are the changes needed? To apply recent new security features - apache#47575 - apache#47595 ### Does this PR introduce _any_ user-facing change? No because this is a new feature of Apache Spark 4.0.0. ### How was this patch tested? Pass the CIs and manual review. - `spark-standalone.html`  - `security.html`   - `configuration.html`  ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47596 from dongjoon-hyun/SPARK-49104. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
This PR aims to support `JWSFilter` which is a servlet filter that requires `JWS`, a cryptographically signed JSON Web Token, in the header via `spark.ui.filters` configuration. - spark.ui.filters=org.apache.spark.ui.JWSFilter - spark.org.apache.spark.ui.JWSFilter.param.key=YOUR-BASE64URL-ENCODED-KEY To simply put, `JWSFilter` will check the following for all requests. - The HTTP request should have `Authorization: Bearer <jws>` header. - `<jws>` is a string with three fields, `<header>.<payload>.<signature>`. - `<header>` is supposed to be a base64url-encoded string of `{"alg":"HS256","typ":"JWT"}`. - `<payload>` is a base64url-encoded string of fully-user-defined content. - `<signature>` is a signature based on `<header>.<payload>` and a user-provided key parameter. For example, the value of `<header>` will be `eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9` always and the value of `payload` can be `e30` if the payload is empty, `{}`. The `<signature>` part is changed by the shared value of `spark.org.apache.spark.ui.JWSFilter.param.key` between the server and client. ``` jshell> java.util.Base64.getUrlEncoder().encodeToString("{\"alg\":\"HS256\",\"typ\":\"JWT\"}".getBytes()) $2 ==> "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9" jshell> java.util.Base64.getUrlEncoder().encodeToString("{}".getBytes()) $3 ==> "e30=" ``` To provide a little better security on WebUI consistently including Spark Standalone Clusters. 
For example, **SETTING** ``` $ jshell | Welcome to JShell -- Version 17.0.12 | For an introduction type: /help intro jshell> java.util.Base64.getUrlEncoder().encodeToString("Visit https://spark.apache.org to download Apache Spark.".getBytes()) $1 ==> "VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4=" ``` ``` $ cat conf/spark-defaults.conf spark.ui.filters org.apache.spark.ui.JWSFilter spark.org.apache.spark.ui.JWSFilter.param.key VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4= ``` **SPARK-SHELL** ``` $ build/sbt package $ cp jjwt-impl-0.12.6.jar assembly/target/scala-2.13/jars $ cp jjwt-jackson-0.12.6.jar assembly/target/scala-2.13/jars $ bin/spark-shell ``` Without JWS (ErrorCode: 403 Forbidden) ``` $ curl -v http://localhost:4040/ * Host localhost:4040 was resolved. * IPv6: ::1 * IPv4: 127.0.0.1 * Trying [::1]:4040... * connect to ::1 port 4040 from ::1 port 61313 failed: Connection refused * Trying 127.0.0.1:4040... * Connected to localhost (127.0.0.1) port 4040 > GET / HTTP/1.1 > Host: localhost:4040 > User-Agent: curl/8.7.1 > Accept: */* > * Request completely sent off < HTTP/1.1 403 Forbidden < Date: Fri, 02 Aug 2024 01:27:23 GMT < Cache-Control: must-revalidate,no-cache,no-store < Content-Type: text/html;charset=iso-8859-1 < Content-Length: 472 < <html> <head> <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/> <title>Error 403 Authorization header is missing.</title> </head> <body><h2>HTTP ERROR 403 Authorization header is missing.</h2> <table> <tr><th>URI:</th><td>/</td></tr> <tr><th>STATUS:</th><td>403</td></tr> <tr><th>MESSAGE:</th><td>Authorization header is missing.</td></tr> <tr><th>SERVLET:</th><td>org.apache.spark.ui.JettyUtils$$anon$2-3b39bee2</td></tr> </table> </body> </html> * Connection #0 to host localhost left intact ``` With JWS, ``` $ curl -v -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw" 
http://localhost:4040/ * Host localhost:4040 was resolved. * IPv6: ::1 * IPv4: 127.0.0.1 * Trying [::1]:4040... * connect to ::1 port 4040 from ::1 port 61311 failed: Connection refused * Trying 127.0.0.1:4040... * Connected to localhost (127.0.0.1) port 4040 > GET / HTTP/1.1 > Host: localhost:4040 > User-Agent: curl/8.7.1 > Accept: */* > Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw > * Request completely sent off < HTTP/1.1 302 Found < Date: Fri, 02 Aug 2024 01:27:01 GMT < Cache-Control: no-cache, no-store, must-revalidate < X-Frame-Options: SAMEORIGIN < X-XSS-Protection: 1; mode=block < X-Content-Type-Options: nosniff < Location: http://localhost:4040/jobs/ < Content-Length: 0 < * Connection #0 to host localhost left intact ``` **SPARK MASTER** Without JWS (ErrorCode: 403 Forbidden) ``` $ curl -v http://localhost:8080/json/ * Host localhost:8080 was resolved. * IPv6: ::1 * IPv4: 127.0.0.1 * Trying [::1]:8080... * connect to ::1 port 8080 from ::1 port 61331 failed: Connection refused * Trying 127.0.0.1:8080... 
* Connected to localhost (127.0.0.1) port 8080 > GET /json/ HTTP/1.1 > Host: localhost:8080 > User-Agent: curl/8.7.1 > Accept: */* > * Request completely sent off < HTTP/1.1 403 Forbidden < Date: Fri, 02 Aug 2024 01:34:03 GMT < Cache-Control: must-revalidate,no-cache,no-store < Content-Type: text/html;charset=iso-8859-1 < Content-Length: 477 < <html> <head> <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/> <title>Error 403 Authorization header is missing.</title> </head> <body><h2>HTTP ERROR 403 Authorization header is missing.</h2> <table> <tr><th>URI:</th><td>/json/</td></tr> <tr><th>STATUS:</th><td>403</td></tr> <tr><th>MESSAGE:</th><td>Authorization header is missing.</td></tr> <tr><th>SERVLET:</th><td>org.apache.spark.ui.JettyUtils$$anon$1-6c52101f</td></tr> </table> </body> </html> * Connection #0 to host localhost left intact ``` With JWS ``` $ curl -v -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw" http://localhost:8080/json/ * Host localhost:8080 was resolved. * IPv6: ::1 * IPv4: 127.0.0.1 * Trying [::1]:8080... * connect to ::1 port 8080 from ::1 port 61329 failed: Connection refused * Trying 127.0.0.1:8080... 
* Connected to localhost (127.0.0.1) port 8080 > GET /json/ HTTP/1.1 > Host: localhost:8080 > User-Agent: curl/8.7.1 > Accept: */* > Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw > * Request completely sent off < HTTP/1.1 200 OK < Date: Fri, 02 Aug 2024 01:33:10 GMT < Cache-Control: no-cache, no-store, must-revalidate < X-Frame-Options: SAMEORIGIN < X-XSS-Protection: 1; mode=block < X-Content-Type-Options: nosniff < Content-Type: text/json;charset=utf-8 < Vary: Accept-Encoding < Content-Length: 320 < { "url" : "spark://M3-Max.local:7077", "workers" : [ ], "aliveworkers" : 0, "cores" : 0, "coresused" : 0, "memory" : 0, "memoryused" : 0, "resources" : [ ], "resourcesused" : [ ], "activeapps" : [ ], "completedapps" : [ ], "activedrivers" : [ ], "completeddrivers" : [ ], "status" : "ALIVE" * Connection #0 to host localhost left intact }% ``` No, this is a new filter. Pass the CIs. No. Closes apache#47575 from dongjoon-hyun/SPARK-49090. Lead-authored-by: Dongjoon Hyun <dhyun@apple.com> Co-authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
This PR aims to support `spark.master.rest.filters` configuration like the existing `spark.ui.filters` configuration. Recently, Apache Spark starts to support `JWSFilter`. We can take advantage of `JWSFilter` to protect Spark Master REST API. - apache#47575 Like `Spark UI`, we had better provide the same capability to Apache Spark Master REST API . For example, we can protect `JWSFilter` to `Spark Master REST API` like the following. **MASTER REST API WITH JWSFilter** ``` $ build/sbt package $ cp jjwt-impl-0.12.6.jar assembly/target/scala-2.13/jars $ cp jjwt-jackson-0.12.6.jar assembly/target/scala-2.13/jars $ SPARK_NO_DAEMONIZE=1 \ SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true -Dspark.master.rest.filters=org.apache.spark.ui.JWSFilter -Dspark.org.apache.spark.ui.JWSFilter.param.key=VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4=" \ sbin/start-master.sh ``` **AUTHORIZATION FAILURE** ``` $ curl -v -XPOST http://localhost:6066/v1/submissions/clear * Host localhost:6066 was resolved. * IPv6: ::1 * IPv4: 127.0.0.1 * Trying [::1]:6066... * connect to ::1 port 6066 from ::1 port 51705 failed: Connection refused * Trying 127.0.0.1:6066... 
* Connected to localhost (127.0.0.1) port 6066 > POST /v1/submissions/clear HTTP/1.1 > Host: localhost:6066 > User-Agent: curl/8.7.1 > Accept: */* > * Request completely sent off < HTTP/1.1 403 Forbidden < Date: Sat, 03 Aug 2024 22:18:03 GMT < Cache-Control: must-revalidate,no-cache,no-store < Content-Type: text/html;charset=iso-8859-1 < Content-Length: 590 < Server: Jetty(11.0.21) < <html> <head> <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/> <title>Error 403 Authorization header is missing.</title> </head> <body><h2>HTTP ERROR 403 Authorization header is missing.</h2> <table> <tr><th>URI:</th><td>/v1/submissions/clear</td></tr> <tr><th>STATUS:</th><td>403</td></tr> <tr><th>MESSAGE:</th><td>Authorization header is missing.</td></tr> <tr><th>SERVLET:</th><td>org.apache.spark.deploy.rest.StandaloneClearRequestServlet-7f171159</td></tr> </table> <hr/><a href="https://eclipse.org/jetty">Powered by Jetty:// 11.0.21</a><hr/> </body> </html> * Connection #0 to host localhost left intact ``` **SUCCESS** ``` $ curl -v -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw" -XPOST http://localhost:6066/v1/submissions/clear * Host localhost:6066 was resolved. * IPv6: ::1 * IPv4: 127.0.0.1 * Trying [::1]:6066... * connect to ::1 port 6066 from ::1 port 51697 failed: Connection refused * Trying 127.0.0.1:6066... 
* Connected to localhost (127.0.0.1) port 6066 > POST /v1/submissions/clear HTTP/1.1 > Host: localhost:6066 > User-Agent: curl/8.7.1 > Accept: */* > Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw > * Request completely sent off < HTTP/1.1 200 OK < Date: Sat, 03 Aug 2024 22:16:51 GMT < Content-Type: application/json;charset=utf-8 < Content-Length: 113 < Server: Jetty(11.0.21) < { "action" : "ClearResponse", "message" : "", "serverSparkVersion" : "4.0.0-SNAPSHOT", "success" : true * Connection #0 to host localhost left intact }% ``` No, this is a new feature which is not loaded by default. Pass the CIs with newly added test case. No. Closes apache#47595 from dongjoon-hyun/SPARK-49103. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
…REST API and rename parameter to `secretKey` This PR aims the following. - Document `JWSFilter` and its usage in `Spark UI` and `REST API` - `Spark UI` section of `Configuration` page - `Spark Security` page - `Spark Standalone` page - Rename the parameter `key` to `secretKey` to redact it in Spark Driver UI and Spark Master UI. To apply recent new security features - apache#47575 - apache#47595 No because this is a new feature of Apache Spark 4.0.0. Pass the CIs and manual review. - `spark-standalone.html`  - `security.html`   - `configuration.html`  No. Closes apache#47596 from dongjoon-hyun/SPARK-49104. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request? This PR aims to support `JWSFilter` which is a servlet filter that requires `JWS`, a cryptographically signed JSON Web Token, in the header via `spark.ui.filters` configuration. - spark.ui.filters=org.apache.spark.ui.JWSFilter - spark.org.apache.spark.ui.JWSFilter.param.key=YOUR-BASE64URL-ENCODED-KEY To simply put, `JWSFilter` will check the following for all requests. - The HTTP request should have `Authorization: Bearer <jws>` header. - `<jws>` is a string with three fields, `<header>.<payload>.<signature>`. - `<header>` is supposed to be a base64url-encoded string of `{"alg":"HS256","typ":"JWT"}`. - `<payload>` is a base64url-encoded string of fully-user-defined content. - `<signature>` is a signature based on `<header>.<payload>` and a user-provided key parameter. For example, the value of `<header>` will be `eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9` always and the value of `payload` can be `e30` if the payload is empty, `{}`. The `<signature>` part is changed by the shared value of `spark.org.apache.spark.ui.JWSFilter.param.key` between the server and client. ``` jshell> java.util.Base64.getUrlEncoder().encodeToString("{\"alg\":\"HS256\",\"typ\":\"JWT\"}".getBytes()) $2 ==> "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9" jshell> java.util.Base64.getUrlEncoder().encodeToString("{}".getBytes()) $3 ==> "e30=" ``` ### Why are the changes needed? To provide a little better security on WebUI consistently including Spark Standalone Clusters. 
For example,

**SETTING**
```
$ jshell
| Welcome to JShell -- Version 17.0.12
| For an introduction type: /help intro

jshell> java.util.Base64.getUrlEncoder().encodeToString("Visit https://spark.apache.org to download Apache Spark.".getBytes())
$1 ==> "VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4="
```

```
$ cat conf/spark-defaults.conf
spark.ui.filters org.apache.spark.ui.JWSFilter
spark.org.apache.spark.ui.JWSFilter.param.key VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4=
```

**SPARK-SHELL**
```
$ build/sbt package
$ cp jjwt-impl-0.12.6.jar assembly/target/scala-2.13/jars
$ cp jjwt-jackson-0.12.6.jar assembly/target/scala-2.13/jars
$ bin/spark-shell
```

Without JWS (ErrorCode: 403 Forbidden)
```
$ curl -v http://localhost:4040/
* Host localhost:4040 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:4040...
* connect to ::1 port 4040 from ::1 port 61313 failed: Connection refused
* Trying 127.0.0.1:4040...
* Connected to localhost (127.0.0.1) port 4040
> GET / HTTP/1.1
> Host: localhost:4040
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 403 Forbidden
< Date: Fri, 02 Aug 2024 01:27:23 GMT
< Cache-Control: must-revalidate,no-cache,no-store
< Content-Type: text/html;charset=iso-8859-1
< Content-Length: 472
<
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 403 Authorization header is missing.</title>
</head>
<body><h2>HTTP ERROR 403 Authorization header is missing.</h2>
<table>
<tr><th>URI:</th><td>/</td></tr>
<tr><th>STATUS:</th><td>403</td></tr>
<tr><th>MESSAGE:</th><td>Authorization header is missing.</td></tr>
<tr><th>SERVLET:</th><td>org.apache.spark.ui.JettyUtils$$anon$2-3b39bee2</td></tr>
</table>
</body>
</html>
* Connection #0 to host localhost left intact
```

With JWS
```
$ curl -v -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw" http://localhost:4040/
* Host localhost:4040 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:4040...
* connect to ::1 port 4040 from ::1 port 61311 failed: Connection refused
* Trying 127.0.0.1:4040...
* Connected to localhost (127.0.0.1) port 4040
> GET / HTTP/1.1
> Host: localhost:4040
> User-Agent: curl/8.7.1
> Accept: */*
> Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw
>
* Request completely sent off
< HTTP/1.1 302 Found
< Date: Fri, 02 Aug 2024 01:27:01 GMT
< Cache-Control: no-cache, no-store, must-revalidate
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< Location: http://localhost:4040/jobs/
< Content-Length: 0
<
* Connection #0 to host localhost left intact
```

**SPARK MASTER**

Without JWS (ErrorCode: 403 Forbidden)
```
$ curl -v http://localhost:8080/json/
* Host localhost:8080 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:8080...
* connect to ::1 port 8080 from ::1 port 61331 failed: Connection refused
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080
> GET /json/ HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 403 Forbidden
< Date: Fri, 02 Aug 2024 01:34:03 GMT
< Cache-Control: must-revalidate,no-cache,no-store
< Content-Type: text/html;charset=iso-8859-1
< Content-Length: 477
<
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 403 Authorization header is missing.</title>
</head>
<body><h2>HTTP ERROR 403 Authorization header is missing.</h2>
<table>
<tr><th>URI:</th><td>/json/</td></tr>
<tr><th>STATUS:</th><td>403</td></tr>
<tr><th>MESSAGE:</th><td>Authorization header is missing.</td></tr>
<tr><th>SERVLET:</th><td>org.apache.spark.ui.JettyUtils$$anon$1-6c52101f</td></tr>
</table>
</body>
</html>
* Connection #0 to host localhost left intact
```

With JWS
```
$ curl -v -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw" http://localhost:8080/json/
* Host localhost:8080 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:8080...
* connect to ::1 port 8080 from ::1 port 61329 failed: Connection refused
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080
> GET /json/ HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.7.1
> Accept: */*
> Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw
>
* Request completely sent off
< HTTP/1.1 200 OK
< Date: Fri, 02 Aug 2024 01:33:10 GMT
< Cache-Control: no-cache, no-store, must-revalidate
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< Content-Type: text/json;charset=utf-8
< Vary: Accept-Encoding
< Content-Length: 320
<
{
  "url" : "spark://M3-Max.local:7077",
  "workers" : [ ],
  "aliveworkers" : 0,
  "cores" : 0,
  "coresused" : 0,
  "memory" : 0,
  "memoryused" : 0,
  "resources" : [ ],
  "resourcesused" : [ ],
  "activeapps" : [ ],
  "completedapps" : [ ],
  "activedrivers" : [ ],
  "completeddrivers" : [ ],
  "status" : "ALIVE"
* Connection #0 to host localhost left intact
}%
```

### Does this PR introduce _any_ user-facing change?

No, this is a new filter.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#47575 from dongjoon-hyun/SPARK-49090.

Lead-authored-by: Dongjoon Hyun <dhyun@apple.com>
Co-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
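
As a rough illustration of the `<header>.<payload>.<signature>` layout described in this commit message, the sketch below builds and checks an HS256 token using only the JDK. It is not the filter's implementation (the PR relies on the `jjwt` library for parsing and verification). The class name `JwsSketch` and its methods are hypothetical, and it assumes the configured key is base64url-decoded into the raw HMAC secret. Whether the result equals the token in the `curl` examples depends on the filter decoding the key the same way, which this sketch assumes rather than confirms.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, not part of Spark or jjwt.
public class JwsSketch {
    // JWS fields use base64url WITHOUT '=' padding.
    private static final Base64.Encoder B64URL = Base64.getUrlEncoder().withoutPadding();

    // HMAC-SHA256 over "<header>.<payload>", keyed by the base64url-decoded secret (assumption).
    static byte[] hmac(String signingInput, String base64UrlKey) {
        try {
            byte[] key = Base64.getUrlDecoder().decode(base64UrlKey);
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(signingInput.getBytes(StandardCharsets.US_ASCII));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds "<header>.<payload>.<signature>" from the two JSON strings.
    static String sign(String headerJson, String payloadJson, String base64UrlKey) {
        String signingInput =
            B64URL.encodeToString(headerJson.getBytes(StandardCharsets.UTF_8))
            + "."
            + B64URL.encodeToString(payloadJson.getBytes(StandardCharsets.UTF_8));
        return signingInput + "." + B64URL.encodeToString(hmac(signingInput, base64UrlKey));
    }

    // Checks only the Bearer prefix, the three-field shape, and the signature.
    // A real verifier would also inspect the header's "alg" value.
    static boolean verify(String authorizationHeader, String base64UrlKey) {
        if (authorizationHeader == null || !authorizationHeader.startsWith("Bearer ")) {
            return false;
        }
        String[] parts = authorizationHeader.substring("Bearer ".length()).split("\\.");
        if (parts.length != 3) {
            return false;
        }
        byte[] actual;
        try {
            actual = Base64.getUrlDecoder().decode(parts[2]);
        } catch (IllegalArgumentException e) {
            return false; // signature field is not valid base64url
        }
        byte[] expected = hmac(parts[0] + "." + parts[1], base64UrlKey);
        return MessageDigest.isEqual(expected, actual); // constant-time comparison
    }
}
```

Calling `JwsSketch.sign("{\"alg\":\"HS256\",\"typ\":\"JWT\"}", "{}", key)` yields a token whose first two fields are the `eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9` and `e30` values shown above, and `verify("Bearer " + token, key)` accepts it.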
### What changes were proposed in this pull request?

This PR aims to support the `spark.master.rest.filters` configuration, like the existing `spark.ui.filters` configuration.

Apache Spark recently started to support `JWSFilter`. We can take advantage of `JWSFilter` to protect the Spark Master REST API.
- apache#47575

### Why are the changes needed?

As with `Spark UI`, we should provide the same capability for the Apache Spark Master REST API.

For example, we can protect the `Spark Master REST API` with `JWSFilter` as follows.

**MASTER REST API WITH JWSFilter**
```
$ build/sbt package
$ cp jjwt-impl-0.12.6.jar assembly/target/scala-2.13/jars
$ cp jjwt-jackson-0.12.6.jar assembly/target/scala-2.13/jars
$ SPARK_NO_DAEMONIZE=1 \
SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true -Dspark.master.rest.filters=org.apache.spark.ui.JWSFilter -Dspark.org.apache.spark.ui.JWSFilter.param.key=VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4=" \
sbin/start-master.sh
```

**AUTHORIZATION FAILURE**
```
$ curl -v -XPOST http://localhost:6066/v1/submissions/clear
* Host localhost:6066 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:6066...
* connect to ::1 port 6066 from ::1 port 51705 failed: Connection refused
* Trying 127.0.0.1:6066...
* Connected to localhost (127.0.0.1) port 6066
> POST /v1/submissions/clear HTTP/1.1
> Host: localhost:6066
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 403 Forbidden
< Date: Sat, 03 Aug 2024 22:18:03 GMT
< Cache-Control: must-revalidate,no-cache,no-store
< Content-Type: text/html;charset=iso-8859-1
< Content-Length: 590
< Server: Jetty(11.0.21)
<
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 403 Authorization header is missing.</title>
</head>
<body><h2>HTTP ERROR 403 Authorization header is missing.</h2>
<table>
<tr><th>URI:</th><td>/v1/submissions/clear</td></tr>
<tr><th>STATUS:</th><td>403</td></tr>
<tr><th>MESSAGE:</th><td>Authorization header is missing.</td></tr>
<tr><th>SERVLET:</th><td>org.apache.spark.deploy.rest.StandaloneClearRequestServlet-7f171159</td></tr>
</table>
<hr/><a href="https://eclipse.org/jetty">Powered by Jetty:// 11.0.21</a><hr/>
</body>
</html>
* Connection #0 to host localhost left intact
```

**SUCCESS**
```
$ curl -v -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw" -XPOST http://localhost:6066/v1/submissions/clear
* Host localhost:6066 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:6066...
* connect to ::1 port 6066 from ::1 port 51697 failed: Connection refused
* Trying 127.0.0.1:6066...
* Connected to localhost (127.0.0.1) port 6066
> POST /v1/submissions/clear HTTP/1.1
> Host: localhost:6066
> User-Agent: curl/8.7.1
> Accept: */*
> Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw
>
* Request completely sent off
< HTTP/1.1 200 OK
< Date: Sat, 03 Aug 2024 22:16:51 GMT
< Content-Type: application/json;charset=utf-8
< Content-Length: 113
< Server: Jetty(11.0.21)
<
{
  "action" : "ClearResponse",
  "message" : "",
  "serverSparkVersion" : "4.0.0-SNAPSHOT",
  "success" : true
* Connection #0 to host localhost left intact
}%
```

### Does this PR introduce _any_ user-facing change?

No, this is a new feature which is not loaded by default.

### How was this patch tested?

Pass the CIs with the newly added test case.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#47595 from dongjoon-hyun/SPARK-49103.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
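
For JVM clients, the `curl` call in the **SUCCESS** example above could be expressed with the JDK's `java.net.http` API. This is a minimal sketch: the class `BearerRequestSketch` and its method are hypothetical names, and the base URL and token are supplied by the caller rather than taken from Spark.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper, not part of Spark.
public class BearerRequestSketch {
    // Builds a body-less POST to /v1/submissions/clear carrying the JWS as a Bearer token.
    static HttpRequest clearSubmissions(String baseUrl, String jws) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v1/submissions/clear"))
            .header("Authorization", "Bearer " + jws)
            .POST(HttpRequest.BodyPublishers.noBody())
            .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Per the outputs above, a missing header yields `403 Forbidden` and a valid token yields `200 OK`.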
…REST API and rename parameter to `secretKey`

### What changes were proposed in this pull request?

This PR aims to do the following.
- Document `JWSFilter` and its usage in `Spark UI` and `REST API`
  - `Spark UI` section of `Configuration` page
  - `Spark Security` page
  - `Spark Standalone` page
- Rename the parameter `key` to `secretKey` so that it is redacted in the Spark Driver UI and the Spark Master UI.

### Why are the changes needed?

To apply the recent new security features.
- apache#47575
- apache#47595

### Does this PR introduce _any_ user-facing change?

No, because this is a new feature of Apache Spark 4.0.0.

### How was this patch tested?

Pass the CIs and manual review.
- `spark-standalone.html`
- `security.html`
- `configuration.html`

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#47596 from dongjoon-hyun/SPARK-49104.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
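
To illustrate why the rename matters for redaction, the sketch below matches configuration names against a sensitive-name pattern. The pattern is an assumption modeled loosely on the kind of regex configured through `spark.redaction.regex`, so check the default in your Spark version. The class `RedactionSketch` is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration, not Spark's redaction code.
public class RedactionSketch {
    // Assumed pattern: names containing "secret", "password", or "token" count as sensitive.
    static final Pattern SENSITIVE = Pattern.compile("(?i)secret|password|token");

    // True when the configuration name would be treated as sensitive under the assumed pattern.
    static boolean isRedacted(String confName) {
        return SENSITIVE.matcher(confName).find();
    }
}
```

Under this assumed pattern, `spark.org.apache.spark.ui.JWSFilter.param.key` does not match, while `spark.org.apache.spark.ui.JWSFilter.param.secretKey` does.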
### What changes were proposed in this pull request?

This PR aims to support `JWSFilter`, a servlet filter that requires `JWS`, a cryptographically signed JSON Web Token, in the header. It is enabled via the `spark.ui.filters` configuration.

Simply put, `JWSFilter` checks the following for all requests.
- The HTTP request should have an `Authorization: Bearer <jws>` header.
- `<jws>` is a string with three fields, `<header>.<payload>.<signature>`.
- `<header>` is supposed to be a base64url-encoded string of `{"alg":"HS256","typ":"JWT"}`.
- `<payload>` is a base64url-encoded string of fully user-defined content.
- `<signature>` is a signature based on `<header>.<payload>` and a user-provided key parameter.

For example, the value of `<header>` is always `eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9`, and the value of `<payload>` can be `e30` if the payload is empty, `{}`. The `<signature>` part depends on the value of `spark.org.apache.spark.ui.JWSFilter.param.key`, which is shared between the server and the client.

### Why are the changes needed?

To provide somewhat better security on the Web UI consistently, including Spark Standalone clusters.

For example,

**SETTING**

**SPARK-SHELL**

Without JWS (ErrorCode: 403 Forbidden)

With JWS

**SPARK MASTER**

Without JWS (ErrorCode: 403 Forbidden)

With JWS

### Does this PR introduce _any_ user-facing change?

No, this is a new filter.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.