[SPARK-34760][EXAMPLES] Replace favorite_color with age in JavaSQLDataSourceExample #31851
ok to test
Looks fine.
Can one of the admins verify this patch?
Thank you for the fix, @zengruios.
However, this PR makes the Java example inconsistent with the Scala/Python examples.
Please see the Scala/Python examples.
# $example on:write_partition_and_bucket$
df = spark.read.parquet("examples/src/main/resources/users.parquet")
(df
    .write
    .partitionBy("favorite_color")
    .bucketBy(42, "name")
    .saveAsTable("people_partitioned_bucketed"))
# $example off:write_partition_and_bucket$
I guess we need to replace peopleDF with usersDF instead at line 207.
@dongjoon-hyun, thanks for your suggestion. Maybe the table's name should be changed to user_partitioned_bucketed; I will try to fix it like this.
Could you merge the #31852 fix into this PR? These issues are similar and minor, so merging them looks okay.
@maropu, OK, I will merge them.
Thank you, @zengruios
@maropu, @dongjoon-hyun, I have updated it. Can you review it again? Thanks!
@yaooqinn, maybe you can try merging this as a brand new committer :-)?
Thanks, @HyukjinKwon. I will merge this to master only. Is that okay?
Improvements are not backported in general, but this looks like a bug fix in the example (reading from the JIRA), which is usually backported. The JIRA states the affected versions are 3.0.1 and 3.1.1, so I would merge this to branch-3.1 and branch-3.0.
Yea, LGTM~
My network is not in good condition at the moment. It took years to fetch and push this PR to master :( Now, it's fighting for branch-3.1.
…LDataSourceExample
### What changes were proposed in this pull request?
In JavaSparkSQLExample when excecute 'peopleDF.write().partitionBy("favorite_color").bucketBy(42,"name").saveAsTable("people_partitioned_bucketed");'
throws Exception: 'Exception in thread "main" org.apache.spark.sql.AnalysisException: partition column favorite_color is not defined in table people_partitioned_bucketed, defined table columns are: age, name;'
Change the column favorite_color to age.
### Why are the changes needed?
Run JavaSparkSQLExample successfully.
### Does this PR introduce _any_ user-facing change?
NO
### How was this patch tested?
test in JavaSparkSQLExample .
Closes #31851 from zengruios/SPARK-34760.
Authored-by: zengruios <578395184@qq.com>
Signed-off-by: Kent Yao <yao@apache.org>
(cherry picked from commit 5570f81)
Signed-off-by: Kent Yao <yao@apache.org>
@zengruios, thanks for your first contribution to Apache Spark. I have added you as a contributor on the JIRA side, and SPARK-34760 has been assigned to you. Thanks for the review, @dongjoon-hyun, @maropu, @HyukjinKwon. Merged to master/3.1/3.0.
👏
Congratulations, @zengruios and @yaooqinn.
Late LGTM 👏
What changes were proposed in this pull request?
In JavaSparkSQLExample, executing 'peopleDF.write().partitionBy("favorite_color").bucketBy(42,"name").saveAsTable("people_partitioned_bucketed");'
throws an exception: 'Exception in thread "main" org.apache.spark.sql.AnalysisException: partition column favorite_color is not defined in table people_partitioned_bucketed, defined table columns are: age, name;'
peopleDF only has the columns age and name, so this PR changes the partition column from favorite_color to age.
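To illustrate why the original call fails, here is a minimal Python sketch of the check, written without Spark so that it runs on its own. The helper names check_partition_columns and try_check are hypothetical, the message only imitates the one quoted above, and the column list for users.parquet is an assumption limited to the two columns the Python example uses. This is a toy model, not Spark's implementation.

```python
def check_partition_columns(table_columns, partition_by):
    """Toy stand-in for the validation that rejects unknown partition columns.

    Hypothetical helper for illustration only; not part of the Spark API.
    """
    missing = [c for c in partition_by if c not in table_columns]
    if missing:
        raise ValueError(
            "partition column {} is not defined in table, "
            "defined table columns are: {}".format(
                ", ".join(missing), ", ".join(table_columns)
            )
        )
    return True


def try_check(table_columns, partition_by):
    """Return True on success, or the error message as a string on failure."""
    try:
        return check_partition_columns(table_columns, partition_by)
    except ValueError as err:
        return str(err)


# Columns of peopleDF, taken from the error message in this PR.
people_columns = ["age", "name"]
# Assumption: users.parquet has at least the columns used in the Python example.
users_columns = ["name", "favorite_color"]

# The original Java call: peopleDF partitioned by favorite_color.
failing = try_check(people_columns, ["favorite_color"])
# Fix proposed in this PR: partition peopleDF by age.
fix_by_column = try_check(people_columns, ["age"])
# Fix suggested in review: write from usersDF and keep favorite_color.
fix_by_dataframe = try_check(users_columns, ["favorite_color"])
```

Both remedies discussed in this PR pass the check: partitioning peopleDF by age, or writing from usersDF, which keeps the Java example in line with the Scala/Python ones.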
Why are the changes needed?
So that JavaSparkSQLExample runs successfully.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Tested by running JavaSparkSQLExample.