Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ZEPPELIN-1965] Livy SQL Interpreter: Should use df.show(1000, false)… #2201

Closed
wants to merge 4 commits into from
Closed

Conversation

benoyantony
Copy link

@benoyantony benoyantony commented Mar 29, 2017

… to display results

What is this PR for?

Livy SQL interpreter truncate result strings of size greater than 20. In some cases, we like to see the full string. We are adding a interpreter property zeppelin.livy.spark.sql.field.truncate to control whether to truncate strings or not. By default, zeppelin.livy.spark.sql.field.truncate is set to true.

What type of PR is it?

Improvement

What is the Jira issue?

https://issues.apache.org/jira/browse/ZEPPELIN-1965

How should this be tested?

Set zeppelin.livy.spark.sql.field.truncate to true or false
Run a SQL query which produces string values of length greater than 20.
Depending on the value of zeppelin.livy.spark.sql.field.truncate, the strings will either get truncated or not.

Questions:

  • Does the licenses files need update? No
  • Is there breaking changes for older versions? No
  • Does this needs documentation? No

@zjffdu
Copy link
Contributor

zjffdu commented Mar 29, 2017

Thanks @benoyantony for contribution, would you mind to add this configuration into livy.md as well ?

@zjffdu
Copy link
Contributor

zjffdu commented Mar 29, 2017

One minor thing left, could you add test case for it ?

@benoyantony
Copy link
Author

benoyantony commented Mar 30, 2017

There seems to be a problem with Jenkins as the error is " java.io.IOException: Remote call on ubuntu-2 failed". Any idea on how to overcome this ?

The travis build has all checks passed.

@zjffdu
Copy link
Contributor

zjffdu commented Mar 31, 2017

It is caused by Caused by: java.lang.OutOfMemoryError: Java heap space, could you retrigger the CI ?

…backward compatibility and added two new testcases
@benoyantony
Copy link
Author

I had retriggered it once. Doing it again.

@benoyantony
Copy link
Author

That seems to have worked. thanks @zjffdu .

@benoyantony
Copy link
Author

@felixcheung , Could you please help commit this patch ?

@zjffdu
Copy link
Contributor

zjffdu commented Mar 31, 2017

Thanks @benoyantony , LGTM

@benoyantony
Copy link
Author

Thanks for the prompt review, @zjffdu .

@@ -61,6 +61,11 @@ Example: `spark.driver.memory` to `livy.spark.driver.memory`
<td>Max number of Spark SQL result to display.</td>
</tr>
<tr>
<td>zeppelin.livy.spark.sql.truncate</td>
<td>true</td>
<td>Whether to truncate strings or not</td>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of strings should we say "output" or "results"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and is this only for livy.sql interpreter? should it apply to other livy interpreter?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote the description to match the documentation of the corresponding truncate argument in the spark dataset method
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Dataset@show(numRows:Int,truncate:Boolean):Unit

truncate -
Whether truncate long strings. If true, strings more than 20 characters will be truncated and all cells will be aligned right

This should be applicable only to the LivySparkSQLInterpreter. Only in this interpreter, we invoke the above method as part of the code. For other interpreters, the user supplied code can call the appropriate show() method.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I say

Whether to truncate long strings in results or not

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"zeppelin.livy.spark.sql.truncate": {
"propertyName": "zeppelin.livy.spark.sql.truncate",
"defaultValue": "true",
"description": "If true, truncate strings greater than 20 characters."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, I think perhaps this could be more useful if this is the truncate line numbers (eg. 20), instead of true/false.
I think I recall the Spark interpreter has it that way too.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tired this, but I get the following error while running unit tests: " found : Int(20)\n"," required: Boolean\n".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zeppelin.livy.spark.sql.maxResult is for truncating line numbers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh got it, sorry, so there's one for max line number.
perhaps to rename this to clarify? zeppelin.livy.spark.sql.truncate.string.value or something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about zeppelin.livy.spark.sql.field.truncate ? This is for truncating each field in table.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall I make the following change ?

zeppelin.livy.spark.sql.field.truncate
If true, truncate strings greater than 20 characters. (interpreter-setting.json)

Whether to truncate fields longer than 20 characters or not (in livy.md)

@@ -56,14 +56,14 @@ Example: `spark.driver.memory` to `livy.spark.driver.memory`
<td>URL where livy server is running</td>
</tr>
<tr>
<td>zeppelin.livy.spark.maxResult</td>
<td>zeppelin.livy.spark.sql.maxResult</td>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this, but it might break existing users. Perhaps we don't change this for now?

@benoyantony
Copy link
Author

benoyantony commented Apr 1, 2017 via email

@felixcheung
Copy link
Member

felixcheung commented Apr 1, 2017 via email

@asfgit asfgit closed this in 1135fb6 Apr 2, 2017
@benoyantony
Copy link
Author

Thank you @zjffdu and @felixcheung

prabhjyotsingh added a commit to prabhjyotsingh/zeppelin that referenced this pull request Sep 1, 2017
… to display results

Livy SQL interpreter truncate result strings of size greater than 20. In some cases, we like to see the full string. We are adding a interpreter property **zeppelin.livy.spark.sql.field.truncate** to control whether to truncate strings or not. By default, **zeppelin.livy.spark.sql.field.truncate** is set to **true**.

Improvement

https://issues.apache.org/jira/browse/ZEPPELIN-1965

Set zeppelin.livy.spark.sql.field.truncate to true or false
Run a SQL query which produces string values of length greater than 20.
Depending on the value of zeppelin.livy.spark.sql.field.truncate, the strings will either get truncated or not.

* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Benoy Antony <benoy@apache.org>

Closes apache#2201 from benoyantony/master and squashes the following commits:

bb006c0 [Benoy Antony] changed field name and description
9eae68b [Benoy Antony] added a null check to avoid testcase failures, another nullcheck for backward compatibility and added two new testcases
ab1ead2 [Benoy Antony] documented zeppelin.livy.spark.sql.truncate
b6252be [Benoy Antony] [ZEPPELIN-1965] Livy SQL Interpreter: Should use df.show(1000, false) to display results

(cherry picked from commit 1135fb6)

Change-Id: Iee5341a7890bcdee386d1c22fc36d12692adccc3
prabhjyotsingh added a commit to prabhjyotsingh/zeppelin that referenced this pull request Oct 23, 2017
… to display results

Livy SQL interpreter truncate result strings of size greater than 20. In some cases, we like to see the full string. We are adding a interpreter property **zeppelin.livy.spark.sql.field.truncate** to control whether to truncate strings or not. By default, **zeppelin.livy.spark.sql.field.truncate** is set to **true**.

Improvement

https://issues.apache.org/jira/browse/ZEPPELIN-1965

Set zeppelin.livy.spark.sql.field.truncate to true or false
Run a SQL query which produces string values of length greater than 20.
Depending on the value of zeppelin.livy.spark.sql.field.truncate, the strings will either get truncated or not.

* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Benoy Antony <benoy@apache.org>

Closes apache#2201 from benoyantony/master and squashes the following commits:

bb006c0 [Benoy Antony] changed field name and description
9eae68b [Benoy Antony] added a null check to avoid testcase failures, another nullcheck for backward compatibility and added two new testcases
ab1ead2 [Benoy Antony] documented zeppelin.livy.spark.sql.truncate
b6252be [Benoy Antony] [ZEPPELIN-1965] Livy SQL Interpreter: Should use df.show(1000, false) to display results

(cherry picked from commit 1135fb6)

Change-Id: Iee5341a7890bcdee386d1c22fc36d12692adccc3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants