[SPARK-34760][EXAMPLES] Replace `favorite_color` with `age` in JavaSQLDataSourceExample #31851

zengruios · 2021-03-16T13:01:01Z

What changes were proposed in this pull request?

In JavaSparkSQLExample, executing 'peopleDF.write().partitionBy("favorite_color").bucketBy(42,"name").saveAsTable("people_partitioned_bucketed");'
throws an exception: 'Exception in thread "main" org.apache.spark.sql.AnalysisException: partition column favorite_color is not defined in table people_partitioned_bucketed, defined table columns are: age, name;'
This PR changes the partition column from favorite_color to age.
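
For illustration only, here is a minimal pure-Python sketch of why the write is rejected, assuming the check amounts to requiring every partition column to be one of the table's columns. The helper name `missing_partition_columns` is hypothetical and this is not Spark's actual validation code; the column list is taken from the exception message above.

```python
def missing_partition_columns(table_columns, partition_columns):
    """Return the partition columns that are not among the table's columns."""
    known = set(table_columns)
    return [col for col in partition_columns if col not in known]


# Columns listed in the exception message for people_partitioned_bucketed.
people_columns = ["age", "name"]

# Partitioning by a column the table does not have reports that column.
print(missing_partition_columns(people_columns, ["favorite_color"]))

# Partitioning by "age", as this PR proposes, reports nothing missing.
print(missing_partition_columns(people_columns, ["age"]))
```

Under that assumption, any column named in partitionBy has to exist in the DataFrame being written, which is why either changing the column or writing a DataFrame that actually contains favorite_color would avoid the error.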

Why are the changes needed?

So that JavaSparkSQLExample runs successfully.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Tested by running JavaSparkSQLExample.

maropu · 2021-03-16T14:47:07Z

ok to test

maropu · 2021-03-16T14:47:12Z

Looks fine.

AmplabJenkins · 2021-03-16T16:43:15Z

Can one of the admins verify this patch?

dongjoon-hyun

Thank you for the fix, @zengruios.

However, this PR introduces an inconsistency with the Scala/Python examples.

Please see the Scala/Python examples.

    # $example on:write_partition_and_bucket$
    df = spark.read.parquet("examples/src/main/resources/users.parquet")
    (df
        .write
        .partitionBy("favorite_color")
        .bucketBy(42, "name")
        .saveAsTable("people_partitioned_bucketed"))
    # $example off:write_partition_and_bucket$

I guess we need to replace peopleDF with usersDF instead at line 207.

zengruios · 2021-03-16T23:00:26Z

@dongjoon-hyun, thanks for your suggestion. Maybe the table's name should also be changed to user_partitioned_bucketed; I will try to fix it that way.

maropu · 2021-03-16T23:07:16Z

Could you merge the #31852 fix into this PR? These issues are similar and minor, so merging them looks okay.

zengruios · 2021-03-16T23:14:50Z

@maropu, OK, I will merge them.

maropu · 2021-03-16T23:16:26Z

Thank you, @zengruios

zengruios · 2021-03-17T00:27:08Z

@maropu, @dongjoon-hyun, I have updated it. Could you review it again? Thanks!

HyukjinKwon · 2021-03-18T14:08:54Z

@yaooqinn maybe you can try merging this as a brand new committer :-)?

yaooqinn · 2021-03-18T14:12:19Z

Thanks, @HyukjinKwon. I will merge this to master only. Is that okay?

HyukjinKwon · 2021-03-18T14:15:23Z

Improvements are not backported in general but looks like this is a bug fix in the example (reading from JIRA) which is usually backported. The JIRA states the affected versions are 3.0.1 and 3.1.1 so I would merge this to branch-3.1 and branch-3.0.

yaooqinn · 2021-03-18T14:17:49Z

Yea, LGTM~

yaooqinn · 2021-03-18T14:56:59Z

My network is not in good condition at the moment. It took years to fetch and push this PR to master :(.. Now, it's fighting for branch-3.1

…LDataSourceExample

### What changes were proposed in this pull request?

In JavaSparkSQLExample when execute 'peopleDF.write().partitionBy("favorite_color").bucketBy(42,"name").saveAsTable("people_partitioned_bucketed");' throws Exception: 'Exception in thread "main" org.apache.spark.sql.AnalysisException: partition column favorite_color is not defined in table people_partitioned_bucketed, defined table columns are: age, name;' Change the column favorite_color to age.

### Why are the changes needed?

Run JavaSparkSQLExample successfully.

### Does this PR introduce _any_ user-facing change?

NO

### How was this patch tested?

test in JavaSparkSQLExample.

Closes #31851 from zengruios/SPARK-34760.

Authored-by: zengruios <578395184@qq.com>
Signed-off-by: Kent Yao <yao@apache.org>
(cherry picked from commit 5570f81)
Signed-off-by: Kent Yao <yao@apache.org>

yaooqinn · 2021-03-18T15:17:00Z

@zengruios Thanks for your first contribution to Apache Spark.

I have added you as a contributor at the JIRA side, and SPARK-34760 has been assigned to you.

Thanks for the review, @dongjoon-hyun @maropu @HyukjinKwon

Merged to master/3.1/3.0.

HyukjinKwon · 2021-03-18T15:44:06Z

👏

dongjoon-hyun · 2021-03-18T16:14:24Z

Congratulations, @zengruios and @yaooqinn.

maropu · 2021-03-18T23:41:22Z

late lgtm 👏

[BugFix]fix the bug in issue SPARK-34759.

5cbe603

github-actions bot added EXAMPLES SQL labels Mar 16, 2021

maropu changed the title ~~[BugFix]fix the bug in issue SPARK-34760.~~ [MINOR] Correct an example error in JavaSQLDataSourceExample Mar 16, 2021

maropu approved these changes Mar 16, 2021

dongjoon-hyun changed the title ~~[MINOR] Correct an example error in JavaSQLDataSourceExample~~ [MINOR][EXAMPLES] Replace favorite_color with age in JavaSQLDataSourceExample Mar 16, 2021

dongjoon-hyun requested changes Mar 16, 2021

maropu changed the title ~~[MINOR][EXAMPLES] Replace favorite_color with age in JavaSQLDataSourceExample~~ [SPARK-34760][EXAMPLES][MINOR] Replace favorite_color with age in JavaSQLDataSourceExample Mar 16, 2021

[BugFix]fix the bug in issue SPARK-34759 and SPARK-34760.

fbc6a73

zengruios force-pushed the SPARK-34760 branch from 4377a9f to fbc6a73 Compare March 17, 2021 00:23

github-actions bot added the PYTHON label Mar 17, 2021

HyukjinKwon approved these changes Mar 18, 2021

yaooqinn approved these changes Mar 18, 2021

yaooqinn changed the title ~~[SPARK-34760][EXAMPLES][MINOR] Replace favorite_color with age in JavaSQLDataSourceExample~~ [SPARK-34760][EXAMPLES] Replace favorite_color with age in JavaSQLDataSourceExample Mar 18, 2021

yaooqinn closed this in 5570f81 Mar 18, 2021
