@dongjoon-hyun dongjoon-hyun commented Aug 3, 2024

What changes were proposed in this pull request?

This PR aims to support spark.master.rest.filters configuration like the existing spark.ui.filters configuration.

Apache Spark recently added JWSFilter (#47575). We can take advantage of JWSFilter to protect the Spark Master REST API.

Why are the changes needed?

The Apache Spark Master REST API should offer the same filtering capability as the Spark UI.

For example, we can protect the Spark Master REST API with JWSFilter as follows.

MASTER REST API WITH JWSFilter

$ build/sbt package
$ cp jjwt-impl-0.12.6.jar assembly/target/scala-2.13/jars
$ cp jjwt-jackson-0.12.6.jar assembly/target/scala-2.13/jars
$ SPARK_NO_DAEMONIZE=1 \
SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true -Dspark.master.rest.filters=org.apache.spark.ui.JWSFilter -Dspark.org.apache.spark.ui.JWSFilter.param.key=VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4=" \
sbin/start-master.sh
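
The `param.key` value in the command above is Base64 text. As a minimal sketch, assuming the filter decodes that value into the raw HMAC key (an assumption; this thread does not state it), the following Python decodes the example key and checks that it is long enough for HS256, which requires a key of at least 256 bits. `decode_key` and `is_long_enough_for_hs256` are hypothetical helper names, not part of Spark.

```python
import base64

# Key copied from the SPARK_MASTER_OPTS example above.
EXAMPLE_KEY_B64 = (
    "VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4="
)


def decode_key(key_b64: str) -> bytes:
    """Decode a Base64-encoded key into raw bytes (hypothetical helper)."""
    return base64.b64decode(key_b64)


def is_long_enough_for_hs256(raw_key: bytes) -> bool:
    """HS256 needs a key of at least 256 bits (32 bytes)."""
    return len(raw_key) >= 32


if __name__ == "__main__":
    raw = decode_key(EXAMPLE_KEY_B64)
    print(raw.decode("ascii"))
    print(len(raw) * 8, "bits; long enough:", is_long_enough_for_hs256(raw))
```

The example key is readable text, which is fine for a demo; a real deployment would presumably use random bytes instead.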

AUTHORIZATION FAILURE

$ curl -v -XPOST http://localhost:6066/v1/submissions/clear
* Host localhost:6066 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:6066...
* connect to ::1 port 6066 from ::1 port 51705 failed: Connection refused
*   Trying 127.0.0.1:6066...
* Connected to localhost (127.0.0.1) port 6066
> POST /v1/submissions/clear HTTP/1.1
> Host: localhost:6066
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 403 Forbidden
< Date: Sat, 03 Aug 2024 22:18:03 GMT
< Cache-Control: must-revalidate,no-cache,no-store
< Content-Type: text/html;charset=iso-8859-1
< Content-Length: 590
< Server: Jetty(11.0.21)
<
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 403 Authorization header is missing.</title>
</head>
<body><h2>HTTP ERROR 403 Authorization header is missing.</h2>
<table>
<tr><th>URI:</th><td>/v1/submissions/clear</td></tr>
<tr><th>STATUS:</th><td>403</td></tr>
<tr><th>MESSAGE:</th><td>Authorization header is missing.</td></tr>
<tr><th>SERVLET:</th><td>org.apache.spark.deploy.rest.StandaloneClearRequestServlet-7f171159</td></tr>
</table>
<hr/><a href="https://eclipse.org/jetty">Powered by Jetty:// 11.0.21</a><hr/>

</body>
</html>
* Connection #0 to host localhost left intact
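
The 403 above comes from the filter rejecting a request without an `Authorization` header. The following Python is a rough sketch of that kind of check for an HS256-signed bearer token; it is not the JWSFilter implementation. `check_authorization` is a hypothetical function, and every message except "Authorization header is missing." is made up for illustration.

```python
import base64
import hashlib
import hmac
import json


def b64url_decode(segment: str) -> bytes:
    """Decode an unpadded base64url segment."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def check_authorization(header_value, key: bytes):
    """Return (status, message) for an Authorization header value.

    Hypothetical sketch: only the 'missing header' message is taken from
    the transcript above; the other messages are illustrative.
    """
    if header_value is None:
        return 403, "Authorization header is missing."
    scheme, _, token = header_value.partition(" ")
    parts = token.split(".")
    if scheme != "Bearer" or len(parts) != 3:
        return 403, "Malformed bearer token."
    try:
        header = json.loads(b64url_decode(parts[0]))
        signature = b64url_decode(parts[2])
        signing_input = (parts[0] + "." + parts[1]).encode("ascii")
    except ValueError:
        return 403, "Malformed bearer token."
    # Pin the algorithm so a token cannot choose its own verification method.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return 403, "Unsupported algorithm."
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, signature):
        return 403, "Invalid signature."
    return 200, "OK"


if __name__ == "__main__":
    print(check_authorization(None, b"k" * 32))
```

A production filter would normally also validate claims such as expiry; this sketch checks only the header presence, token shape, algorithm, and signature.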

SUCCESS

$ curl -v -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw" -XPOST http://localhost:6066/v1/submissions/clear
* Host localhost:6066 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:6066...
* connect to ::1 port 6066 from ::1 port 51697 failed: Connection refused
*   Trying 127.0.0.1:6066...
* Connected to localhost (127.0.0.1) port 6066
> POST /v1/submissions/clear HTTP/1.1
> Host: localhost:6066
> User-Agent: curl/8.7.1
> Accept: */*
> Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw
>
* Request completely sent off
< HTTP/1.1 200 OK
< Date: Sat, 03 Aug 2024 22:16:51 GMT
< Content-Type: application/json;charset=utf-8
< Content-Length: 113
< Server: Jetty(11.0.21)
<
{
  "action" : "ClearResponse",
  "message" : "",
  "serverSparkVersion" : "4.0.0-SNAPSHOT",
  "success" : true
* Connection #0 to host localhost left intact
}%
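
The bearer token in the request above has the three-part JWS compact form: its first two segments decode to the header `{"alg":"HS256","typ":"JWT"}` and the empty claims object `{}`. As a sketch of how such a token could be produced with only the Python standard library, assuming the signing key is the Base64-decoded `param.key` value (an assumption; the output has not been checked against the token above), consider the following. `mint_hs256_token` is a hypothetical helper.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Encode bytes as unpadded base64url, as used in JWS compact form."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def mint_hs256_token(key: bytes, claims: dict) -> str:
    """Build header.payload.signature signed with HMAC-SHA256 (hypothetical helper)."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))
        for obj in (header, claims)
    ]
    signing_input = ".".join(segments)
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


if __name__ == "__main__":
    # Assumption: the key is the Base64-decoded value passed to the master.
    key = base64.b64decode(
        "VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4="
    )
    print("Authorization: Bearer " + mint_hs256_token(key, {}))
```

The printed line can be passed to `curl -H` in place of the hard-coded header.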

Does this PR introduce any user-facing change?

No, this is a new feature which is not loaded by default.

How was this patch tested?

Pass the CIs with the newly added test case.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the CORE label Aug 3, 2024
@dongjoon-hyun

cc @mridulm , @viirya , @yaooqinn

.version("4.0.0")
.stringConf
.toSequence
.createWithDefault(Nil)

Do we have any user-facing documentation for this config?

@dongjoon-hyun

Thank you, @viirya .

Regarding the question below, I'm currently preparing a separate documentation PR that covers the recent changes, and I will include this part too.

> Do we have any user-facing documentation for this config?

@dongjoon-hyun dongjoon-hyun deleted the SPARK-49103 branch August 4, 2024 02:40
HyukjinKwon pushed a commit that referenced this pull request Aug 4, 2024
…REST API and rename parameter to `secretKey`

### What changes were proposed in this pull request?

This PR aims to do the following.
- Document `JWSFilter` and its usage in `Spark UI` and `REST API`
    - `Spark UI` section of `Configuration` page
    - `Spark Security` page
    - `Spark Standalone` page
- Rename the parameter `key` to `secretKey` to redact it in Spark Driver UI and Spark Master UI.

### Why are the changes needed?

To apply the recently added security features:
- #47575
- #47595

### Does this PR introduce _any_ user-facing change?

No, because this is a new feature of Apache Spark 4.0.0.

### How was this patch tested?

Pass the CIs and manual review.

- `spark-standalone.html`
![Screenshot 2024-08-03 at 22 40 53](https://github.com/user-attachments/assets/f1b95a01-c14b-4f14-96b6-3181afaf6f9f)

- `security.html`
![Screenshot 2024-08-03 at 22 39 00](https://github.com/user-attachments/assets/8413f6a3-47df-4d71-87ee-25ab32171c6c)
![Screenshot 2024-08-03 at 22 39 51](https://github.com/user-attachments/assets/01546724-d5b5-40d5-a980-236f9d13ae81)

- `configuration.html`
![Screenshot 2024-08-03 at 22 38 07](https://github.com/user-attachments/assets/c0845a7f-6ae1-4194-b98a-68d7442c9785)

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #47596 from dongjoon-hyun/SPARK-49104.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
fusheng9399 pushed a commit to fusheng9399/spark that referenced this pull request Aug 6, 2024
fusheng9399 pushed a commit to fusheng9399/spark that referenced this pull request Aug 6, 2024
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Aug 7, 2024
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Aug 7, 2024
attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024
attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
dongjoon-hyun added a commit that referenced this pull request Feb 12, 2025
### What changes were proposed in this pull request?

This PR aims to enable `spark.master.rest.enabled` by default for Apache Spark 4.1.0.

### Why are the changes needed?

Apache Spark is ready to enable this feature by default.
- Since Apache Spark 1.3.0, `spark.master.rest.enabled` has been used stably.
- Since Apache Spark 4.0.0, `spark.master.rest.filters` provides a way to serve it securely.
  - #47595

### Does this PR introduce _any_ user-facing change?

Yes, the migration guide is updated.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49894 from dongjoon-hyun/SPARK-51165.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>