New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ZEPPELIN-1965] Livy SQL Interpreter: Should use df.show(1000, false)… #2201
Conversation
… to display results
Thanks @benoyantony for contribution, would you mind to add this configuration into livy.md as well ? |
One minor thing left, could you add test case for it ? |
There seems to be a problem with Jenkins as the error is " java.io.IOException: Remote call on ubuntu-2 failed". Any idea on how to overcome this ? The travis build has all checks passed. |
It is caused by |
…backward compatibility and added two new testcases
I had retriggered it once. Doing it again. |
That seems to have worked. thanks @zjffdu . |
@felixcheung , Could you please help commit this patch ? |
Thanks @benoyantony , LGTM |
Thanks for the prompt review, @zjffdu . |
docs/interpreter/livy.md
Outdated
@@ -61,6 +61,11 @@ Example: `spark.driver.memory` to `livy.spark.driver.memory` | |||
<td>Max number of Spark SQL result to display.</td> | |||
</tr> | |||
<tr> | |||
<td>zeppelin.livy.spark.sql.truncate</td> | |||
<td>true</td> | |||
<td>Whether to truncate strings or not</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of strings
should we say "output" or "results"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and is this only for livy.sql interpreter? should it apply to other livy interpreter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote the description to match the documentation of the corresponding truncate argument in the spark dataset method
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Dataset@show(numRows:Int,truncate:Boolean):Unit
truncate -
Whether truncate long strings. If true, strings more than 20 characters will be truncated and all cells will be aligned right
This should be applicable only to the LivySparkSQLInterpreter. Only in this interpreter, we invoke the above method as part of the code. For other interpreters, the user supplied code can call the appropriate show() method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I say
Whether to truncate long strings in results or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"zeppelin.livy.spark.sql.truncate": { | ||
"propertyName": "zeppelin.livy.spark.sql.truncate", | ||
"defaultValue": "true", | ||
"description": "If true, truncate strings greater than 20 characters." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, I think perhaps this could be more useful if this is the truncate line numbers (eg. 20), instead of true/false.
I think I recall the Spark interpreter has it that way too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tired this, but I get the following error while running unit tests: " found : Int(20)\n"," required: Boolean\n".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zeppelin.livy.spark.sql.maxResult
is for truncating line numbers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh got it, sorry, so there's one for max line number.
perhaps to rename this to clarify? zeppelin.livy.spark.sql.truncate.string.value
or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about zeppelin.livy.spark.sql.field.truncate
? This is for truncating each field in table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall I make the following change ?
zeppelin.livy.spark.sql.field.truncate
If true, truncate strings greater than 20 characters. (interpreter-setting.json)
Whether to truncate fields longer than 20 characters or not (in livy.md)
@@ -56,14 +56,14 @@ Example: `spark.driver.memory` to `livy.spark.driver.memory` | |||
<td>URL where livy server is running</td> | |||
</tr> | |||
<tr> | |||
<td>zeppelin.livy.spark.maxResult</td> | |||
<td>zeppelin.livy.spark.sql.maxResult</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this, but it might break existing users. Perhaps we don't change this for now?
It should not break anything as the property name remains same as before.
The documentation had a typo in it. It was missing 'sql' in it.
If it's confusing, I can do it as a separate fix.
…On Mar 31, 2017 8:48 PM, "Felix Cheung" ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In docs/interpreter/livy.md
<#2201 (comment)>:
> @@ -56,14 +56,14 @@ Example: `spark.driver.memory` to `livy.spark.driver.memory`
<td>URL where livy server is running</td>
</tr>
<tr>
- <td>zeppelin.livy.spark.maxResult</td>
+ <td>zeppelin.livy.spark.sql.maxResult</td>
I love this, but it might break existing users. Perhaps we don't change
this for now?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2201 (review)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AB1uyu_U3xz4WnpFAGISqFZMWZSy3q8Jks5rrckIgaJpZM4Msjp1>
.
|
Ah got it. LGTM then
Merging if no more comments.
|
Thank you @zjffdu and @felixcheung |
… to display results Livy SQL interpreter truncate result strings of size greater than 20. In some cases, we like to see the full string. We are adding a interpreter property **zeppelin.livy.spark.sql.field.truncate** to control whether to truncate strings or not. By default, **zeppelin.livy.spark.sql.field.truncate** is set to **true**. Improvement https://issues.apache.org/jira/browse/ZEPPELIN-1965 Set zeppelin.livy.spark.sql.field.truncate to true or false Run a SQL query which produces string values of length greater than 20. Depending on the value of zeppelin.livy.spark.sql.field.truncate, the strings will either get truncated or not. * Does the licenses files need update? No * Is there breaking changes for older versions? No * Does this needs documentation? No Author: Benoy Antony <benoy@apache.org> Closes apache#2201 from benoyantony/master and squashes the following commits: bb006c0 [Benoy Antony] changed field name and description 9eae68b [Benoy Antony] added a null check to avoid testcase failures, another nullcheck for backward compatibility and added two new testcases ab1ead2 [Benoy Antony] documented zeppelin.livy.spark.sql.truncate b6252be [Benoy Antony] [ZEPPELIN-1965] Livy SQL Interpreter: Should use df.show(1000, false) to display results (cherry picked from commit 1135fb6) Change-Id: Iee5341a7890bcdee386d1c22fc36d12692adccc3
… to display results Livy SQL interpreter truncate result strings of size greater than 20. In some cases, we like to see the full string. We are adding a interpreter property **zeppelin.livy.spark.sql.field.truncate** to control whether to truncate strings or not. By default, **zeppelin.livy.spark.sql.field.truncate** is set to **true**. Improvement https://issues.apache.org/jira/browse/ZEPPELIN-1965 Set zeppelin.livy.spark.sql.field.truncate to true or false Run a SQL query which produces string values of length greater than 20. Depending on the value of zeppelin.livy.spark.sql.field.truncate, the strings will either get truncated or not. * Does the licenses files need update? No * Is there breaking changes for older versions? No * Does this needs documentation? No Author: Benoy Antony <benoy@apache.org> Closes apache#2201 from benoyantony/master and squashes the following commits: bb006c0 [Benoy Antony] changed field name and description 9eae68b [Benoy Antony] added a null check to avoid testcase failures, another nullcheck for backward compatibility and added two new testcases ab1ead2 [Benoy Antony] documented zeppelin.livy.spark.sql.truncate b6252be [Benoy Antony] [ZEPPELIN-1965] Livy SQL Interpreter: Should use df.show(1000, false) to display results (cherry picked from commit 1135fb6) Change-Id: Iee5341a7890bcdee386d1c22fc36d12692adccc3
… to display results
What is this PR for?
Livy SQL interpreter truncate result strings of size greater than 20. In some cases, we like to see the full string. We are adding a interpreter property zeppelin.livy.spark.sql.field.truncate to control whether to truncate strings or not. By default, zeppelin.livy.spark.sql.field.truncate is set to true.
What type of PR is it?
Improvement
What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1965
How should this be tested?
Set zeppelin.livy.spark.sql.field.truncate to true or false
Run a SQL query which produces string values of length greater than 20.
Depending on the value of zeppelin.livy.spark.sql.field.truncate, the strings will either get truncated or not.
Questions: