
[SPARK-46704][CORE][UI] Fix MasterPage to sort Running Drivers table by Duration column correctly #44711

Closed
wants to merge 1 commit into `master` from `dongjoon-hyun:SPARK-46704`

Conversation

dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented Jan 12, 2024

### What changes were proposed in this pull request?

This PR aims to fix `MasterPage` to sort the `Running Drivers` table by the `Duration` column correctly.

### Why are the changes needed?

Since Apache Spark 3.0.0, `MasterPage` shows the `Duration` column of `Running Drivers`.

**BEFORE**
<img width="111" src="https://github.com/apache/spark/assets/9700541/50276e34-01be-4474-803d-79066e06cb2c">

**AFTER**
<img width="111" src="https://github.com/apache/spark/assets/9700541/a427b2e6-eab0-4d73-9114-1d8ff9d052c2">

### Does this PR introduce _any_ user-facing change?

Yes, this is a UI bug fix.

### How was this patch tested?

Manual.

Run a Spark standalone cluster.

```
$ SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true -Dspark.deploy.maxDrivers=2" sbin/start-master.sh
$ sbin/start-worker.sh spark://$(hostname):7077
```

Submit multiple jobs via REST API.

```
$ curl -s -k -XPOST http://localhost:6066/v1/submissions/create \
    --header "Content-Type:application/json;charset=UTF-8" \
    --data '{
      "appResource": "",
      "sparkProperties": {
        "spark.master": "spark://localhost:7077",
        "spark.app.name": "Test 1",
        "spark.submit.deployMode": "cluster",
        "spark.jars": "/Users/dongjoon/APACHE/spark-merge/examples/target/scala-2.13/jars/spark-examples_2.13-4.0.0-SNAPSHOT.jar"
      },
      "clientSparkVersion": "",
      "mainClass": "org.apache.spark.examples.SparkPi",
      "environmentVariables": {},
      "action": "CreateSubmissionRequest",
      "appArgs": [ "10000" ]
    }'
```

### Was this patch authored or co-authored using generative AI tooling?

No.

@dongjoon-hyun
Member Author

Could you review this Spark Master UI bug fix when you have some time, @viirya ?

Comment on lines +378 to +380
```
<td sorttable_customkey={(-driver.startTime).toString}>
  {UIUtils.formatDuration(System.currentTimeMillis() - driver.startTime)}
</td>
```
Member

Why don't we use the duration as the key? Currently it uses the negative driver start time as the key.

Member

For example, for apps we use the duration directly:

```
<td sorttable_customkey={app.duration.toString}>
  {UIUtils.formatDuration(app.duration)}
</td>
```

Member Author

Driver duration is different from app duration because we can have driver-only Spark jobs.

Member

Oh, I see. Because for all durations, the end time is the same (the current time).

Member Author

A single-pod Spark job is a driver-only Spark job without an application.

Member Author

> Oh, I see. Because for all durations, the end time is the same (the current time).

Yes, it's because this is the `Running Drivers` table.
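The exchange above rests on a simple invariant: for running drivers the end time is always the current time, so ordering rows by `-startTime` is equivalent to ordering them by duration. A minimal Scala sketch of that equivalence (the start times below are hypothetical, not taken from the PR):

```scala
// Hypothetical start times (epoch millis) of three running drivers.
val startTimes = Seq(1000L, 3000L, 2000L)
val now = 10000L // the shared "current time" for every running driver

// Sorting by duration (now - start), ascending ...
val byDuration = startTimes.sortBy(t => now - t)
// ... gives the same order as sorting by the negative start time,
// because the constant `now` does not change the relative ordering.
val byNegStart = startTimes.sortBy(t => -t)

assert(byDuration == byNegStart) // both: Seq(3000L, 2000L, 1000L)
```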

@dongjoon-hyun
Member Author

Thank you, @viirya! Let me merge this because I verified it manually.
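For context on why the cell needs a numeric `sorttable_customkey` at all: without one, the sorter typically falls back to comparing the rendered cell text, and formatted duration strings do not sort correctly as plain strings. A small Scala illustration with hypothetical rendered values:

```scala
// Durations as they might be rendered in the UI (hypothetical samples).
val rendered = Seq("9 s", "10 s", "2.0 min")

// Lexicographic comparison orders the strings character by character,
// not by the durations they represent: "10 s" sorts before "9 s".
assert(rendered.sorted == Seq("10 s", "2.0 min", "9 s"))
```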

dongjoon-hyun added a commit that referenced this pull request Jan 12, 2024
…ble by `Duration` column correctly

### What changes were proposed in this pull request?

This PR aims to fix `MasterPage` to sort `Running Drivers` table by `Duration` column correctly.

### Why are the changes needed?

Since Apache Spark 3.0.0, `MasterPage` shows `Duration` column of `Running Drivers`.

**BEFORE**
<img width="111" src="https://github.com/apache/spark/assets/9700541/50276e34-01be-4474-803d-79066e06cb2c">

**AFTER**
<img width="111" src="https://github.com/apache/spark/assets/9700541/a427b2e6-eab0-4d73-9114-1d8ff9d052c2">

### Does this PR introduce _any_ user-facing change?

Yes, this is a bug fix of UI.

### How was this patch tested?

Manual.

Run a Spark standalone cluster.
```
$ SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true -Dspark.deploy.maxDrivers=2" sbin/start-master.sh
$ sbin/start-worker.sh spark://$(hostname):7077
```

Submit multiple jobs via REST API.
```
$ curl -s -k -XPOST http://localhost:6066/v1/submissions/create \
    --header "Content-Type:application/json;charset=UTF-8" \
    --data '{
      "appResource": "",
      "sparkProperties": {
        "spark.master": "spark://localhost:7077",
        "spark.app.name": "Test 1",
        "spark.submit.deployMode": "cluster",
        "spark.jars": "/Users/dongjoon/APACHE/spark-merge/examples/target/scala-2.13/jars/spark-examples_2.13-4.0.0-SNAPSHOT.jar"
      },
      "clientSparkVersion": "",
      "mainClass": "org.apache.spark.examples.SparkPi",
      "environmentVariables": {},
      "action": "CreateSubmissionRequest",
      "appArgs": [ "10000" ]
    }'
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #44711 from dongjoon-hyun/SPARK-46704.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 25c680c)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
dongjoon-hyun added a commit that referenced this pull request Jan 12, 2024
@dongjoon-hyun dongjoon-hyun deleted the SPARK-46704 branch January 12, 2024 20:57
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Feb 7, 2024