[SPARK-12324][MLLIB][DOC] Fixes the sidebar in the ML documentation #10297

thunterdb · 2015-12-14T20:55:16Z

This fixes the sidebar, using a pure CSS mechanism to hide it when the browser's viewport is too narrow.
Credit goes to the original author @Titan-C (mentioned in the NOTICE).

Note that I am not a CSS expert, so I can address comments only to a limited extent.

Default view:

When collapsed manually by the user:

Disappears when column is too narrow:

Can still be opened by the user if necessary:
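
For readers unfamiliar with the technique, here is a minimal sketch of a pure-CSS collapsible sidebar (the "checkbox hack"). It is not the stylesheet from this patch. The `nav-trigger` and `content-with-sidebar` names come from the markup in this PR; the `.left-menu-wrapper` selector, the 200px width, and the 780px breakpoint are assumptions chosen only for illustration.

```css
/* Sidebar sits underneath the content layer (assumed class name and width). */
.left-menu-wrapper {
  position: fixed;
  top: 0;
  left: 0;
  width: 200px;
  height: 100%;
  z-index: 0;
}

/* Hide the checkbox itself; its <label> acts as the toggle button. */
.nav-trigger {
  position: absolute;
  clip: rect(0, 0, 0, 0);
}

label[for="nav-trigger"] {
  position: fixed;
  z-index: 2;        /* stays clickable above both layers */
  cursor: pointer;
}

/* Opaque content layer drawn above the sidebar.
   Unchecked state: no margin, so the content covers the sidebar. */
.content-with-sidebar {
  position: relative;
  z-index: 1;
  background-color: #FFF;
  margin-left: 0;
}

/* Checked state (the default in the markup): shift the content right
   so the sidebar underneath becomes visible. */
.nav-trigger:checked ~ .content-with-sidebar {
  margin-left: 200px;
}

/* Narrow viewport (assumed breakpoint): invert the two states so the
   sidebar is hidden by default but a click on the label still opens it. */
@media (max-width: 780px) {
  .nav-trigger:checked ~ .content-with-sidebar {
    margin-left: 0;
  }
  .nav-trigger:not(:checked) ~ .content-with-sidebar {
    margin-left: 200px;
  }
}
```

The general sibling combinator `~` only matches later siblings, so this approach relies on the checkbox appearing before the content `<div>` in the markup. No JavaScript is involved: the checkbox holds the open/closed state and the media query handles the narrow-viewport case.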

thunterdb · 2015-12-14T20:56:03Z

@jkbradley can you take a look at this fix?

SparkQA · 2015-12-14T22:54:26Z

Test build #47677 has finished for PR 10297 at commit b091b42.

This patch fails SparkR unit tests.
This patch merges cleanly.
This patch adds no public classes.

jkbradley · 2015-12-14T23:05:50Z

Tested locally. Functionally, this seems fine. I'll rely on someone else to check the CSS.

SparkQA · 2015-12-15T00:55:33Z

Test build #2215 has finished for PR 10297 at commit b091b42.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

parano · 2015-12-15T21:27:01Z

docs/_layouts/global.html

@@ -128,19 +128,31 @@

            {% if page.url contains "/ml" %}
              {% include nav-left-wrapper-ml.html nav-mllib=site.data.menu-mllib nav-ml=site.data.menu-ml %}
+              <input id="nav-trigger" class="nav-trigger" checked="" type="checkbox">
+              <label for="nav-trigger"></label>
+                <div class="content-with-sidebar" id="content">


parano · 2015-12-15T21:48:38Z

@thunterdb @jkbradley I left some comments on the CSS part; otherwise it looks good to me.

jkbradley · 2015-12-16T01:02:24Z

docs/css/main.css

+.content {
+  z-index: 1;
+  position: relative;
+  background-color: #FFF;


2 instances of background-color here

oops thanks!
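
For context on why the duplicate is worth removing: when the same property is declared twice in one rule, the later declaration wins and the earlier one is dead code. A minimal illustration, using a hypothetical `.example` rule and made-up colors rather than the actual lines from `main.css`:

```css
.example {
  background-color: #EEE;  /* overridden by the later declaration */
  z-index: 1;
  background-color: #FFF;  /* this one takes effect */
}
```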

jkbradley · 2015-12-16T01:31:44Z

Functionally, it seems good to me. (tested locally)

SparkQA · 2015-12-16T03:35:59Z

Test build #47764 has finished for PR 10297 at commit 464dcb5.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2015-12-16T04:07:00Z

Test build #47773 has finished for PR 10297 at commit 19b92a2.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

jkbradley · 2015-12-16T18:10:52Z

LGTM. I'll merge this with master and branch-1.6

This fixes the sidebar, using a pure CSS mechanism to hide it when the browser's viewport is too narrow.
Credit goes to the original author Titan-C (mentioned in the NOTICE).

Note that I am not a CSS expert, so I can only address comments up to some extent.

Default view:
<img width="936" alt="screen shot 2015-12-14 at 12 46 39 pm" src="https://cloud.githubusercontent.com/assets/7594753/11793597/6d1d6eda-a261-11e5-836b-6eb2054e9054.png">

When collapsed manually by the user:
<img width="1004" alt="screen shot 2015-12-14 at 12 54 02 pm" src="https://cloud.githubusercontent.com/assets/7594753/11793669/c991989e-a261-11e5-8bf6-aecf3bdb6319.png">

Disappears when column is too narrow:
<img width="697" alt="screen shot 2015-12-14 at 12 47 22 pm" src="https://cloud.githubusercontent.com/assets/7594753/11793607/7754dbcc-a261-11e5-8b15-e0d074b0e47c.png">

Can still be opened by the user if necessary:
<img width="651" alt="screen shot 2015-12-14 at 12 51 15 pm" src="https://cloud.githubusercontent.com/assets/7594753/11793612/7bf82968-a261-11e5-9cc3-e827a7a6b2b0.png">

Author: Timothy Hunter <timhunter@databricks.com>

Closes #10297 from thunterdb/12324.

(cherry picked from commit a6325fc)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>


thunterdb added 3 commits December 14, 2015 11:39

menu

dc8d41b

colors fixed

94e8274

add the original author to the notice

b091b42

parano reviewed Dec 15, 2015

thunterdb added 2 commits December 15, 2015 16:36

comments from @parano

6b8e9cf

changes

464dcb5

jkbradley reviewed Dec 16, 2015

duplicate css code

19b92a2

asfgit closed this in a6325fc Dec 16, 2015
