Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-10873] Support column sort and search for History Server. #10648

Closed
wants to merge 12 commits into from

Conversation

zhuoliu
Copy link

@zhuoliu zhuoliu commented Jan 7, 2016

[SPARK-10873] Support column sort and search for History Server using jQuery DataTable and REST API. Before this commit, the history server was generated hard-coded html and can not support search, also, the sorting was disabled if there is any application that has more than one attempt. Supporting search and sort (over all applications rather than the 20 entries in the current page) in any case will greatly improve user experience.

  1. Create the historypage-template.html for displaying application information in datables.
  2. historypage.js uses jQuery to access the data from /api/v1/applications REST API, and use DataTable to display each application's information. For application that has more than one attempt, the RowsGroup is used to merge such entries while at the same time supporting sort and search.
  3. "duration" and "lastUpdated" rest API are added to application's "attempts".
  4. External javascirpt and css files for datatables, RowsGroup and jquery plugins are added with licenses clarified.

Snapshots for how it looks like now:

History page view:
historypage

Search:
search

Sort by started time:
sort-by-started-time

@vanzin
Copy link
Contributor

vanzin commented Jan 7, 2016

ok to test

@vanzin
Copy link
Contributor

vanzin commented Jan 7, 2016

Haven't looked at the code, but the zebra striping is messed up when you have multiple attempts.

@tgravescs
Copy link
Contributor

the striping is the same as the current history server (1.5) when you have multiple attempts. The only difference I see if the striping on the first 2 columns is disabled on the current version. I'm fine with changing it if its not to much hassle.

@SparkQA
Copy link

SparkQA commented Jan 7, 2016

Test build #48961 has finished for PR 10648 at commit 808558f.

  • This patch fails RAT tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

search: 'applied',
}

GroupedColumnsOrderDir = 'desc'; // change
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This external js file fro RowsGroup is using MIT license. I modified this line to set the default order from "asc" to "desc". This change is necessary that by default the latest applications will be displayed on the first page.
Also, I commented line 181out so that it will not disable sorting on the last 6 columns.
Do we need to add an additional license for this two minor changes?

@SparkQA
Copy link

SparkQA commented Jan 7, 2016

Test build #48966 has finished for PR 10648 at commit f1f8feb.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 7, 2016

Test build #48968 has finished for PR 10648 at commit 2abe73c.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhuoliu
Copy link
Author

zhuoliu commented Jan 7, 2016

Hi @vanzin , I added two new fields (duration Long, lastUpdated Date) into ApplicationAttemptInfo in api.scala.
The jenkins compile is complaining about mima binary check failures. Did you have a clue for this? Thank you!

[info] Compiling 16 Scala sources and 2 Java sources to /home/jenkins/workspace/SparkPullRequestBuilder@3/sql/core/target/scala-2.10/classes...
[info] spark-core: found 2 potential binary incompatibilities (filtered 621)
[error]  * synthetic method <init>$default$5()Boolean in object org.apache.spark.status.api.v1.ApplicationAttemptInfo does not have a correspondent in new version
[error]    filter with: ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.status.api.v1.ApplicationAttemptInfo.<init>$default$5")
[error]  * method this(scala.Option,java.util.Date,java.util.Date,java.lang.String,Boolean)Unit in class org.apache.spark.status.api.v1.ApplicationAttemptInfo does not have a correspondent in new version
[error]    filter with: ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.status.api.v1.ApplicationAttemptInfo.this")
...
...
[error] (core/*:mimaReportBinaryIssues) spark-core: Binary compatibility check failed!
[error] Total time: 111 s, completed Jan 7, 2016 1:13:24 PM
[error] running /home/jenkins/workspace/SparkPullRequestBuilder@3/dev/mima ; received return code 1

@vanzin
Copy link
Contributor

vanzin commented Jan 7, 2016

Since the constructor is private anyway, it's fine to just do what the error message says and add the exclusions to MimaExcludes.scala.

@zhuoliu
Copy link
Author

zhuoliu commented Jan 7, 2016

Hi @vanzin , thanks for the info.
BTW, I have fixed the issue you mentioned about messed zebra striping. Now the zebra striping is exactly as what we have originally (the first two columns display pure white if any application has more than one attempt). Please check the updated snapshot figures above.

@SparkQA
Copy link

SparkQA commented Jan 7, 2016

Test build #48976 has finished for PR 10648 at commit 4322851.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 8, 2016

Test build #49025 has finished for PR 10648 at commit d7cc895.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Copy link
Contributor

@vanzin @srowen @JoshRosen @steveloughran @andrewor14 @rxin I want to get input on what people think of using the datatables? If everyone is ok with it we can review in more detail. I think starting out with history page and then perhaps changing all the UI to use datatables to its consistent across the board.

@vanzin
Copy link
Contributor

vanzin commented Jan 13, 2016

I don't have a problem with it if the license are ok; I just don't have the cycles right now to review the code. :-(

@andrewor14
Copy link
Contributor

Screenshot looks amazing, thanks @zhuoliu!

I think @JoshRosen has some opinions on using a database for the history server.

@steveloughran
Copy link
Contributor

I have some opinions too, as with YARN timeline service integration, it's essentially hooked up to a database, both for publishing and retrieval. It might be interesting to consider a dual mode, where things are published to HDFS/s3/WASB during a run (availability will match that of the analytics work itself), and for retrieval, after any initial load & parse, persist in a DB. You could do that with the current FS records, extracting useful summary data "completed" "incompatible-version" in the process.

@tgravescs
Copy link
Contributor

The use of a database seems like it could be separate from this pr. This pr does change it to use the rest api but the main change here is the use of datatables to allow for sorting, searching, etc.

I haven't heard any issue with using that so I'll go ahead with a more thorough review. If I missed someone concern with that please let me know ASAP.

@tgravescs
Copy link
Contributor

Jenkins, test this please

@SparkQA
Copy link

SparkQA commented Jan 21, 2016

Test build #49878 has finished for PR 10648 at commit ad6ce01.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* See the License for the specific language governing permissions and
* limitations under the License.
*/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment saying this is same as UIUtils.formatDuration

@SparkQA
Copy link

SparkQA commented Jan 22, 2016

Test build #49899 has finished for PR 10648 at commit b62d77b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -291,3 +291,7 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(MIT License) dagre-d3 (https://github.com/cpettitt/dagre-d3)
(MIT License) sorttable (https://github.com/stuartlangridge/sorttable)
(MIT License) boto (https://github.com/boto/boto/blob/develop/LICENSE)
(MIT/GPL License) datatables (http://datatables.net/license)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the version we pull in is MIT licenses so lets removed GPL license

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't matter much since it's MIT or GPL, and I think this 'notation' was started by the automated output of a Maven plugin.

This is sufficient for MIT-licensed code. I'm a bit concerned about pulling in a lot of third party source just for this feature, but hey.

It looks like you're pulling in an inconsistent set of jquery files though. 1.11.1 vs 1.10.4? or maybe I'm misreading this. They show up as 1-line diffs.

@@ -291,3 +291,7 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(MIT License) dagre-d3 (https://github.com/cpettitt/dagre-d3)
(MIT License) sorttable (https://github.com/stuartlangridge/sorttable)
(MIT License) boto (https://github.com/boto/boto/blob/develop/LICENSE)
(MIT/GPL License) datatables (http://datatables.net/license)
(MIT/GPL License) blockUI (http://jquery.malsup.com/block/)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar here since it allows user to choose MIT or GPL lets just put MIT.

@SparkQA
Copy link

SparkQA commented Jan 26, 2016

Test build #50109 has finished for PR 10648 at commit 638a903.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


historySummary = $("#history-summary");
searchString = JSON.stringify(historySummary["context"]["location"]["search"]);
requestedIncomplete = searchString.indexOf("true") >= 0 ? true : false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be better if we specifically look for the paramter showIncomplete=true or false here

@SparkQA
Copy link

SparkQA commented Jan 27, 2016

Test build #50155 has finished for PR 10648 at commit d7e679e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}

function getParameterByName(name, searchString) {
name = name.replace(/[\[]/, "\\[").replace(/[\]]/, "\\]");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need this.

@tgravescs
Copy link
Contributor

The paging doesn't seem to be working can you take a look.

Then could you also line up the table indentation with the other text for "Event log directory"

@SparkQA
Copy link

SparkQA commented Jan 27, 2016

Test build #50196 has finished for PR 10648 at commit 1d3343f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhuoliu
Copy link
Author

zhuoliu commented Jan 28, 2016

Hi @tgravescs , finally fixed the paging stuff in RowsGrouping. :)

@SparkQA
Copy link

SparkQA commented Jan 28, 2016

Test build #50248 has finished for PR 10648 at commit 430d283.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Copy link
Contributor

rxin commented Jan 28, 2016

@tgravescs @zhuoliu are you guys interested in more UI work? I have some ideas that I never found time / people to work on ... I think they will make the UI a lot more useful.

@tgravescs
Copy link
Contributor

@rxin I would be interested in hearing them if they make debugging easier and usability better. We'll have to prioritize it with the rest of the work we have. We would like to change most of the pages to use the datatables to keep things consistent so if its things we can do at the same time it might be easier.

@tgravescs
Copy link
Contributor

mostly looks good a few more minor things.

  • can you highlight the page you are on
  • file a jira to have attemptid always show up in rest api so spark shell reports attempt id
  • when you click an application id with multiple attempts, open the latest (ie 3 if 3 attempts).

@zhuoliu
Copy link
Author

zhuoliu commented Jan 28, 2016

JIRA filed here: https://issues.apache.org/jira/browse/SPARK-13064

@tgravescs
Copy link
Contributor

Jenkins, test this please

@SparkQA
Copy link

SparkQA commented Jan 29, 2016

Test build #50306 has finished for PR 10648 at commit 0c55712.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Copy link
Contributor

Jenkins, test this please

@tgravescs
Copy link
Contributor

+1 for the latest changes, the unit tests failures are unrelated. Going to give this a little while but if they are hung again commit this anyway since the latest changes made don't affect unit tests.

@asfgit asfgit closed this in e4c1162 Jan 29, 2016
superbobry pushed a commit to criteo-forks/spark that referenced this pull request Feb 15, 2017
[SPARK-10873] Support column sort and search for History Server using jQuery DataTable and REST API. Before this commit, the history server was generated hard-coded html and can not support search, also, the sorting was disabled if there is any application that has more than one attempt. Supporting search and sort (over all applications rather than the 20 entries in the current page) in any case will greatly improve user experience.

1. Create the historypage-template.html for displaying application information in datables.
2. historypage.js uses jQuery to access the data from /api/v1/applications REST API, and use DataTable to display each application's information. For application that has more than one attempt, the RowsGroup is used to merge such entries while at the same time supporting sort and search.
3. "duration" and "lastUpdated" rest API are added to application's "attempts".
4. External javascirpt and css files for datatables, RowsGroup and jquery plugins are added with licenses clarified.

Snapshots for how it looks like now:

History page view:
![historypage](https://cloud.githubusercontent.com/assets/11683054/12184383/89bad774-b55a-11e5-84e4-b0276172976f.png)

Search:
![search](https://cloud.githubusercontent.com/assets/11683054/12184385/8d3b94b0-b55a-11e5-869a-cc0ef0a4242a.png)

Sort by started time:
![sort-by-started-time](https://cloud.githubusercontent.com/assets/11683054/12184387/8f757c3c-b55a-11e5-98c8-577936366566.png)

Author: zhuol <zhuol@yahoo-inc.com>

Closes apache#10648 from zhuoliu/10873.
superbobry pushed a commit to criteo-forks/spark that referenced this pull request Mar 20, 2017
[SPARK-10873] Support column sort and search for History Server using jQuery DataTable and REST API. Before this commit, the history server was generated hard-coded html and can not support search, also, the sorting was disabled if there is any application that has more than one attempt. Supporting search and sort (over all applications rather than the 20 entries in the current page) in any case will greatly improve user experience.

1. Create the historypage-template.html for displaying application information in datables.
2. historypage.js uses jQuery to access the data from /api/v1/applications REST API, and use DataTable to display each application's information. For application that has more than one attempt, the RowsGroup is used to merge such entries while at the same time supporting sort and search.
3. "duration" and "lastUpdated" rest API are added to application's "attempts".
4. External javascirpt and css files for datatables, RowsGroup and jquery plugins are added with licenses clarified.

Snapshots for how it looks like now:

History page view:
![historypage](https://cloud.githubusercontent.com/assets/11683054/12184383/89bad774-b55a-11e5-84e4-b0276172976f.png)

Search:
![search](https://cloud.githubusercontent.com/assets/11683054/12184385/8d3b94b0-b55a-11e5-869a-cc0ef0a4242a.png)

Sort by started time:
![sort-by-started-time](https://cloud.githubusercontent.com/assets/11683054/12184387/8f757c3c-b55a-11e5-98c8-577936366566.png)

Author: zhuol <zhuol@yahoo-inc.com>

Closes apache#10648 from zhuoliu/10873.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
8 participants