Updated the Spark SQL Programming guide with Custom object encoding for Dataset and unsupported operation error handling #16997
Can one of the admins verify this patch?
docs/sql-programming-guide.md
Outdated
@@ -297,6 +297,9 @@ reflection and become the names of the columns. Case classes can also be nested
types such as `Seq`s or `Array`s. This RDD can be implicitly converted to a DataFrame and then be
registered as a table. Tables can be used in subsequent SQL statements.

Spark Encoders are used to convert a JVM object to Spark SQL representation. When we want to make a datase, Spark requires an encoder which takes the form Encoder[T] where T is the type we want to be encoded. When we try to create dataset with a custom type of object, then may result into <b>java.lang.UnsupportedOperationException: No Encoder found for Object-Name</b>.
It's minor, but there are enough problems with the text to call it out. Please match the voice of the other text and avoid 'we'. Typos: "datase", "spark sql" and "kryo" for example. Use back-ticks to consistently format code if you're going to. What is Object-Name?
Hello srowen,
I have updated the content to match the voice of the existing text; you can have another look at it.
docs/sql-programming-guide.md
Outdated
@@ -297,6 +297,9 @@ reflection and become the names of the columns. Case classes can also be nested
types such as `Seq`s or `Array`s. This RDD can be implicitly converted to a DataFrame and then be
registered as a table. Tables can be used in subsequent SQL statements.

Spark Encoders are used to convert a JVM object to Spark SQL representation. To create dataset, spark requires an encoder which takes the form of <b>Encoder[T]</b> where <b>T</b> is the type which has to be encoded. Creation of a dataset with a custom type of object, may result into <b>java.lang.UnsupportedOperationException: No Encoder found for Object-Name</b>.
It is trivial, but maybe `spark` -> `Spark`? I am not an expert in grammar, but to my knowledge, capitalizing a proper noun is correct.
Yes, @HarshSharma8 this still doesn't address the comments. Use back-ticks for code, not bold, too. What is Object-Name?
BTW, could we maybe make the title complete (not |
Hello Sean,
I apologize for using bold instead of back-ticks; I'm updating the content to fix this.
Thank You
Best Regards
*Harsh Sharma*
Sr. Software Consultant
…stead of bold tags
Hello Sean,
I have updated the content with back-ticks. Can you have a look at this?
Also, I don't understand which object name you are asking about.
Thank You
You are still bold-facing code elements, and have now back-ticked a string, which isn't code. There are still typos like "create dataset" instead of "create a Dataset". Do you mean to indicate that a class name will appear in the message? Then write something like "[class name]". There is no object name here. Please review carefully before you ask for another review.
I updated the content with a demo object. I would appreciate it if anyone could have a look at this.
Could you maybe fix the PR title too while you are online? It would be nice to have a good title, both for the commit log and for those who like to track down the history.
Hello HyukjinKwon,
Did anyone get a chance to verify it, or are any changes required from me?
This still has formatting and text problems. I'm sorry, I don't think I can go around again on this when it's not an important change, and I'd like to close this.
Sure, and thanks for your kind attention to this pull request.
Thank You
Closes apache#16819 Closes apache#13467 Closes apache#16083 Closes apache#17135 Closes apache#8785 Closes apache#16278 Closes apache#16997 Closes apache#17073 Closes apache#17220
What changes were proposed in this pull request?
Updated the SQL programming guide to explain encoding custom objects for a Dataset with Kryo.
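For context, a minimal sketch of the kind of Kryo-based encoding this description refers to. It is not the text or code from the patch itself. The `Person` class, the object name, and the local-mode session settings are hypothetical and chosen only for illustration; `Encoders.kryo`, `Encoders.STRING`, and `SparkSession.createDataset` are existing Spark SQL APIs. It assumes a Spark 2.x or later dependency on the classpath.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

// Hypothetical custom type: a plain class, not a case class, so Spark
// has no built-in encoder for it.
class Person(val name: String, val age: Int)

object KryoEncoderSketch {
  def main(args: Array[String]): Unit = {
    // Local-mode session, assumed here only to keep the sketch self-contained.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("kryo-encoder-sketch")
      .getOrCreate()

    // Supply an Encoder[Person] explicitly; Encoders.kryo serializes each
    // object with Kryo instead of mapping its fields to columns.
    implicit val personEncoder: Encoder[Person] = Encoders.kryo[Person]

    val people = spark.createDataset(Seq(new Person("Ann", 30), new Person("Bob", 25)))

    // Typed operations still work; the result type needs its own encoder,
    // passed explicitly here.
    val names = people.map(_.name)(Encoders.STRING).collect()
    names.foreach(println)

    spark.stop()
  }
}
```

A Kryo-encoded Dataset stores each object as a single binary column, so column-level operations on `name` or `age` are not available. That trade-off is why a case class is usually preferred when the type can be expressed as one.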
How was this patch tested?
Just updated the docs.
Please review http://spark.apache.org/contributing.html before opening a pull request.