[ZEPPELIN-1965] Livy SQL Interpreter: Should use df.show(1000, false)… #2201

benoyantony · 2017-03-29T06:33:15Z

… to display results

What is this PR for?

The Livy SQL interpreter truncates result strings longer than 20 characters. In some cases, we would like to see the full string. We are adding an interpreter property, zeppelin.livy.spark.sql.field.truncate, to control whether strings are truncated. By default, zeppelin.livy.spark.sql.field.truncate is set to true.
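
As a rough illustration of the idea (a sketch, not the interpreter's actual code), the property can be thought of as selecting the second argument of the `show` call that is sent to the remote Spark session. The helper name `build_show_statement`, the dict of settings, and the `sqlContext.sql(...)` wrapper are assumptions made for illustration; only the property name, its default of true, and the `show(1000, false)` form come from this PR.

```python
# Hypothetical sketch: how a truncate property could select the show() call.
TRUNCATE_PROPERTY = "zeppelin.livy.spark.sql.field.truncate"


def build_show_statement(sql, max_result, properties):
    """Return the Scala snippet that would display the query result.

    `properties` is a plain dict standing in for the interpreter settings.
    A missing property falls back to "true", the default named in this PR.
    """
    raw = properties.get(TRUNCATE_PROPERTY, "true")
    truncate = raw.strip().lower() == "true"
    # Scala boolean literals are lowercase, unlike Python's True/False.
    scala_bool = "true" if truncate else "false"
    # Escape backslashes and double quotes so the SQL fits in a Scala string literal.
    escaped = sql.replace("\\", "\\\\").replace('"', '\\"')
    return 'sqlContext.sql("{}").show({}, {})'.format(escaped, max_result, scala_bool)


if __name__ == "__main__":
    settings = {TRUNCATE_PROPERTY: "false"}
    print(build_show_statement("select * from logs", 1000, settings))
```

Falling back to "true" when the property is absent keeps the previous truncating behavior for settings that predate the new property.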

What type of PR is it?

Improvement

What is the Jira issue?

https://issues.apache.org/jira/browse/ZEPPELIN-1965

How should this be tested?

Set zeppelin.livy.spark.sql.field.truncate to true or false
Run a SQL query which produces string values of length greater than 20.
Depending on the value of zeppelin.livy.spark.sql.field.truncate, the strings will either get truncated or not.
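
The difference a tester should expect between the two settings can be sketched as below. This imitates the behavior of Spark's `show` as I understand it (a cell longer than 20 characters is cut to its first 17 characters followed by `...`); the function name `render_cell` is made up for illustration, and the exact output should be confirmed against the Spark version in use.

```python
# Hypothetical sketch of what a single result cell looks like under each setting.
TRUNCATE_WIDTH = 20  # the 20-character limit mentioned in this PR


def render_cell(value, truncate):
    """Return the cell text, shortened only when truncation is enabled."""
    text = str(value)
    if truncate and len(text) > TRUNCATE_WIDTH:
        # Keep room for the three-character ellipsis inside the limit.
        return text[:TRUNCATE_WIDTH - 3] + "..."
    return text


if __name__ == "__main__":
    long_value = "abcdefghijklmnopqrstuvwxyz"  # 26 characters
    print(render_cell(long_value, True))
    print(render_cell(long_value, False))
```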

Questions:

Do the license files need an update? No
Are there breaking changes for older versions? No
Does this need documentation? No


zjffdu · 2017-03-29T07:06:52Z

Thanks @benoyantony for the contribution. Would you mind adding this configuration to livy.md as well?

zjffdu · 2017-03-29T07:36:28Z

One minor thing left: could you add a test case for it?

benoyantony · 2017-03-30T16:27:52Z

There seems to be a problem with Jenkins, as the error is "java.io.IOException: Remote call on ubuntu-2 failed". Any idea how to overcome this?

The Travis build passed all checks.

zjffdu · 2017-03-31T00:58:04Z

It is caused by "java.lang.OutOfMemoryError: Java heap space". Could you retrigger the CI?


benoyantony · 2017-03-31T01:36:12Z

I had retriggered it once. Doing it again.

benoyantony · 2017-03-31T02:59:04Z

That seems to have worked. Thanks, @zjffdu.

benoyantony · 2017-03-31T03:17:03Z

@felixcheung, could you please help commit this patch?

zjffdu · 2017-03-31T03:17:59Z

Thanks @benoyantony, LGTM

benoyantony · 2017-03-31T04:02:11Z

Thanks for the prompt review, @zjffdu.

felixcheung · 2017-03-31T04:00:32Z

docs/interpreter/livy.md

@@ -61,6 +61,11 @@ Example: `spark.driver.memory` to `livy.spark.driver.memory`
    <td>Max number of Spark SQL result to display.</td>
  </tr>
  <tr>
+    <td>zeppelin.livy.spark.sql.truncate</td>
+    <td>true</td>
+    <td>Whether to truncate strings or not</td>


Instead of "strings", should we say "output" or "results"?

Also, is this only for the livy.sql interpreter? Should it apply to the other Livy interpreters?

I wrote the description to match the documentation of the corresponding truncate argument of the Spark Dataset show method:
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Dataset@show(numRows:Int,truncate:Boolean):Unit

truncate -
Whether truncate long strings. If true, strings more than 20 characters will be truncated and all cells will be aligned right

This should be applicable only to the LivySparkSQLInterpreter. Only this interpreter invokes the above method as part of its own code. For other interpreters, the user-supplied code can call the appropriate show() method.

Should I say

Whether to truncate long strings in results or not

right, to consider:
https://github.com/apache/zeppelin/blob/master/spark/src/main/java/org/apache/zeppelin/spark/SparkSqlInterpreter.java#L56

felixcheung · 2017-03-31T04:02:25Z

livy/src/main/resources/interpreter-setting.json

+      "zeppelin.livy.spark.sql.truncate": {
+        "propertyName": "zeppelin.livy.spark.sql.truncate",
+        "defaultValue": "true",
+        "description": "If true, truncate strings greater than 20 characters."


btw, I think this could be more useful if it were the number of lines to truncate at (e.g. 20) instead of true/false.
I think I recall the Spark interpreter has it that way too.

I tried this, but I get the following error while running unit tests: " found : Int(20)\n"," required: Boolean\n".

zeppelin.livy.spark.sql.maxResult is the property for limiting the number of lines.

Oh, got it, sorry. So there is already one for the max number of lines.
Perhaps rename this one to clarify? zeppelin.livy.spark.sql.truncate.string.value or something?

How about zeppelin.livy.spark.sql.field.truncate? This is for truncating each field in the table.

Shall I make the following change?

zeppelin.livy.spark.sql.field.truncate
If true, truncate strings greater than 20 characters. (interpreter-setting.json)

Whether to truncate fields longer than 20 characters or not (in livy.md)
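
To make the distinction between the two settings concrete, here is a minimal sketch under stated assumptions: the function `format_rows` and its parameters are hypothetical, with `max_result` standing in for zeppelin.livy.spark.sql.maxResult (how many rows are shown) and `truncate` standing in for zeppelin.livy.spark.sql.field.truncate (whether each field is shortened). It is not the interpreter's code, which delegates the display to Spark.

```python
# Hypothetical sketch: one setting limits rows, the other shortens fields.
def format_rows(rows, max_result, truncate, width=20):
    """Keep at most `max_result` rows, then optionally shorten each field."""

    def shorten(cell):
        text = str(cell)
        if truncate and len(text) > width:
            # Shortened fields end in "..." and stay within `width` characters.
            return text[:width - 3] + "..."
        return text

    return [[shorten(cell) for cell in row] for row in rows[:max_result]]


if __name__ == "__main__":
    rows = [["x" * 30, 1], ["short", 2], ["y" * 30, 3]]
    for row in format_rows(rows, max_result=2, truncate=True):
        print(row)
```

The two knobs are independent: changing the row limit never affects field width, which is why a separate, clearly named property seems preferable to overloading maxResult.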

felixcheung · 2017-04-01T03:47:36Z

docs/interpreter/livy.md

@@ -56,14 +56,14 @@ Example: `spark.driver.memory` to `livy.spark.driver.memory`
    <td>URL where livy server is running</td>
  </tr>
  <tr>
-    <td>zeppelin.livy.spark.maxResult</td>
+    <td>zeppelin.livy.spark.sql.maxResult</td>


I love this, but it might break existing users. Perhaps we don't change this for now?

benoyantony · 2017-04-01T04:23:13Z

It should not break anything, as the property name remains the same as before. The documentation had a typo: it was missing 'sql'. If it's confusing, I can do it as a separate fix.


felixcheung · 2017-04-01T07:19:51Z

Ah, got it. LGTM then. Merging if no more comments.

benoyantony · 2017-04-03T22:24:20Z

Thank you @zjffdu and @felixcheung

… to display results

Livy SQL interpreter truncate result strings of size greater than 20. In some cases, we like to see the full string. We are adding a interpreter property **zeppelin.livy.spark.sql.field.truncate** to control whether to truncate strings or not. By default, **zeppelin.livy.spark.sql.field.truncate** is set to **true**.

Improvement

https://issues.apache.org/jira/browse/ZEPPELIN-1965

Set zeppelin.livy.spark.sql.field.truncate to true or false
Run a SQL query which produces string values of length greater than 20.
Depending on the value of zeppelin.livy.spark.sql.field.truncate, the strings will either get truncated or not.

* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Benoy Antony <benoy@apache.org>

Closes apache#2201 from benoyantony/master and squashes the following commits:

bb006c0 [Benoy Antony] changed field name and description
9eae68b [Benoy Antony] added a null check to avoid testcase failures, another nullcheck for backward compatibility and added two new testcases
ab1ead2 [Benoy Antony] documented zeppelin.livy.spark.sql.truncate
b6252be [Benoy Antony] [ZEPPELIN-1965] Livy SQL Interpreter: Should use df.show(1000, false) to display results

(cherry picked from commit 1135fb6)

Change-Id: Iee5341a7890bcdee386d1c22fc36d12692adccc3


asfgit closed this in 1135fb6 Apr 2, 2017
