[SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite #32420

Closed
wants to merge 3 commits into from

maropu
Member

@maropu maropu commented May 3, 2021

What changes were proposed in this pull request?

This PR replaces maropu/spark-tpcds-datagen with databricks/tpcds-kit in order to use a newer dsdgen, and updates the golden files in tpcds-query-results accordingly.

Why are the changes needed?

For better testing.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

GA passed.

@maropu
Member Author

maropu commented May 3, 2021

I'm checking the results query-by-query...

@@ -3,4 +3,4 @@
-- !query schema
struct<i_item_id:string,i_item_desc:string,s_state:string,store_sales_quantitycount:bigint,store_sales_quantityave:double,store_sales_quantitystdev:double,store_sales_quantitycov:double,as_store_returns_quantitycount:bigint,as_store_returns_quantityave:double,as_store_returns_quantitystdev:double,store_returns_quantitycov:double,catalog_sales_quantitycount:bigint,catalog_sales_quantityave:double,catalog_sales_quantitystdev:double,catalog_sales_quantitycov:double>
-- !query output
AAAAAAAAKPFEAAAA Recently right TN 1 99.0 NULL NULL 1 66.0 NULL NULL 1 32.0 NULL NULL
Member Author

I've checked that the result is the same when using StringType instead of CharType. It looks like sf=1 is not enough to output valid rows; e.g., with sf=20, this result would be:

+AAAAAAAAAAMHBAAA       English, labour shares put to a sanctions. Central, following opinions will wreck workers; curious, real qualities could not involve nearly most national parties. Unable, detailed services    SD      1       49.0    NULL    NULL    1       8.0     NULL    NULL    1       68.0    NULL    NULL
+AAAAAAAAGBKDAAAA       Other men can keep in a customers. Surprised premises might not allow. Technical, british cler  SD      1       68.0    NULL    NULL    1       11.0    NULL    NULL    1       30.0    NULL    NULL
+AAAAAAAAGEOBBAAA       Safe, rough minutes shall find again famous orders. British     SD      1       38.0    NULL    NULL    1       22.0    NULL    NULL    1       88.0    NULL    NULL
+AAAAAAAAIBLFBAAA       Communications love with a texts. Powerful proceedings should feel players. Full names should not cover various, full spaces. Clubs must intervene in the ho    AL      1       2.0     NULL    NULL    1       2.0     NULL    NULL    1       49.0    NULL    NULL
+AAAAAAAAKOJIAAAA       Areas inform empirical scientists. Old  TN      1       68.0    NULL    NULL    1       21.0    NULL    NULL    1       87.0    NULL    NULL
+AAAAAAAAKPMNAAAA       Generally great holidays must keep separately very domestic cases; doctors explain then nuclear friends. Systemat       AL      1       4.0     NULL    NULL    1       4.0     NULL    NULL    1       64.0    NULL    NULL
+AAAAAAAAONIMAAAA       Tempera AL      1       58.0    NULL    NULL    1       14.0    NULL    NULL    1       73.0    NULL    NULL
+AAAAAAAAOPAJAAAA       Scots will not hang children. Long groups should not need also ol       TN      1       59.0    NULL    NULL    1       9.0     NULL    NULL    1       32.0    NULL    NULL
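
The NULL stdev and cov columns in these rows line up with every group having a count of 1. Below is a minimal Python sketch of why, assuming the stdev columns are sample standard deviations and the cov columns are stdev divided by the average (neither definition is shown in this thread). `quantity_stats` is a hypothetical helper, and `None` stands in for SQL NULL.

```python
import math
from typing import Optional, Sequence, Tuple

Stats = Tuple[int, Optional[float], Optional[float], Optional[float]]


def quantity_stats(values: Sequence[float]) -> Stats:
    """Return (count, average, sample stdev, stdev / average).

    Assumption: this mirrors the count/ave/stdev/cov columns in the
    schema above; None plays the role of SQL NULL.
    """
    n = len(values)
    if n == 0:
        return 0, None, None, None
    mean = sum(values) / n
    if n < 2:
        # The sample standard deviation divides by (n - 1), so it is
        # undefined for a single value, and so is anything derived from it.
        return n, mean, None, None
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    stdev = math.sqrt(variance)
    cov = stdev / mean if mean != 0 else None
    return n, mean, stdev, cov


# A group with one matching row, like the rows listed above.
print(quantity_stats([99.0]))  # (1, 99.0, None, None)

# A group with two rows yields non-NULL stdev and cov.
print(quantity_stats([2.0, 4.0]))
```

Under these assumptions, a row with non-NULL stdev and cov needs at least two matching rows per group, which may only show up at a larger scale factor still.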

@@ -3,4 +3,4 @@
-- !query schema
struct<sum(sales):decimal(28,2)>
-- !query output
17030.91
NULL
Member Author

@maropu maropu May 3, 2021

ditto

213230904.79

NULL Robert 598.86
Brown Monika 6031.52
Collins Gordon 727.57
Green Jesse 9672.96
Member Author

@maropu maropu May 3, 2021

ditto

+NULL   NULL    1211190.10
+NULL   NULL    2162596.43
+NULL   Amber                   26523.81
+NULL   Amelia                  39837.88
+NULL   Angela                  19000.56
+NULL   Ann                     9771.24
+NULL   Anthony                 33085.99
+NULL   Antonia                 9686.34
+NULL   Ashley                  44888.09
+NULL   Beatrice                42874.69
+NULL   Bernadette              33897.93
+NULL   Brandon                 36527.85
+NULL   Brian                   14629.00
+NULL   Bryan                   8213.72
+NULL   Bryce                   37041.48
+NULL   Camille                 3828.89
+NULL   Carl                    39187.08
+NULL   Chang                   13063.80
+NULL   Charity                 54843.39
+NULL   Charlene                49054.79
+NULL   Charles                 4771.75
+NULL   Charles                 62057.20
+NULL   Charlotte               16385.30
+NULL   Cherie                  27987.22
+NULL   Christopher             36433.14
+NULL   Clarence                25626.84
+NULL   Corey                   23496.48
+NULL   Dale                    5110.33
+NULL   Daniel                  22358.95
+NULL   Darlene                 57923.08
+NULL   Derek                   49167.25
+NULL   Dexter                  10170.43
+NULL   Donald                  24886.59
+NULL   Donna                   52864.24
+NULL   Edith                   61615.29
+NULL   Elizabeth               23627.73
+NULL   Elmer                   19619.60
+NULL   Eloise                  18450.24
+NULL   Erick                   38025.12
+NULL   Erika                   27798.27
+NULL   Evan                    46451.24
+NULL   Fay                     29404.94
+NULL   Frank                   38593.81
+NULL   Gary                    26419.34
+NULL   Giuseppina              23983.27
+NULL   Grace                   10196.36
+NULL   Gretchen                14514.03
+NULL   Guadalupe               25678.72
+NULL   Helena                  25266.03
+NULL   Howard                  19108.80
+NULL   Irene                   21181.11
+NULL   Iris                    20347.29
+NULL   Isaac                   21409.78
+NULL   Isabel                  10981.56
+NULL   James                   20208.41
+NULL   James                   45018.22
+NULL   Janet                   23444.60
+NULL   Janice                  11866.23
+NULL   Janice                  23576.67
+NULL   Jason                   3513.97
+NULL   Jason                   13569.16
+NULL   Jayne                   32662.64
+NULL   Jeffery                 12079.50
+NULL   Jeffrey                 23556.44
+NULL   Jennifer                33641.35
+NULL   Jessica                 21085.21
+NULL   John                    27934.22
+NULL   Jose                    27518.94
+NULL   Joseph                  105492.82
+NULL   Josephine               31305.22
+NULL   Julio                   33649.91
+NULL   Kaleigh                 26772.97
+NULL   Karoline                17438.02
+NULL   Keith                   11594.46
+NULL   Kenneth                 8982.40
+NULL   Kenneth                 37143.32
+NULL   Kerri                   14295.48
+NULL   Kimberly                37517.97
+NULL   Lara                    7898.06
+NULL   Laura                   10704.81
+NULL   Lillian                 27638.75
+NULL   Lina                    30780.78
+NULL   Linda                   21437.87
+NULL   Lori                    40834.58
+NULL   Lucas                   34701.86
+NULL   Lucille                 31975.22
+NULL   Luis                    64716.14
+NULL   Lynette                 39160.74
+NULL   Marcus                  65630.87
+NULL   Marjorie                32153.42
+NULL   Marlene                 28037.14
+NULL   Melanie                 46728.54
+NULL   Michael                 37021.70
+NULL   Michael                 44109.43
+NULL   Naomi                   15056.09
+NULL   Nicholas                30684.17
+NULL   Norman                  36903.91
+NULL   Oscar                   96584.20
+NULL   Pamela                  39836.80
+NULL   Paul                    18862.79

Pettit Richard able 3930.52
Townsend Franklin able 68983.20
Winchester Margaret bar 14269.20

Member Author

ditto

+Alexander                      Amy                     ese     687924.00
+Anderson                       John                    ese     210781.00
+Anderson                       Patrick                 n st    244511.48
+Anderson                       Scotty                  ese     206994.12
+Arreola                        Kathy                   bar     880629.75
+Barker                         Thomas                  able    1722853.44
+Bertram                        Mario                   n st    682633.56
+Boatwright                     Michael                 bar     34856.25
+Camacho                        Michael                 cally   242151.00
+Chambers                       Robert                  able    274590.12
+Chavis                         Erma                    n st    15720.96
+Christensen                    Enrique                 eing    1936078.32
+Coffey                         John                    cally   196462.50
+Durant                         Tonya                   bar     53667.25
+Estrada                        Irene                   ese     33461.12
+Everett                        Stephen                 ese     681824.64
+Fenton                         Manuel                  ese     117187.64
+Ferguson                       Shelia                  ese     4842298.20
+Graves                         NULL    cally   113356.75
+Greer                          Elisa                   ese     324325.32
+Hanna                          Martha                  n st    535420.44
+Hardy                          Tina                    cally   115713.00
+Huffman                        James                   able    336208.32
+James                          Geraldine               eing    53295.00
+Kelly                          Vilma                   cally   128271.00
+Khan                           David                   able    750553.23
+Knowles                        Linda                   ese     268852.96
+Livingston                     Elvira                  bar     262899.00
+Locke                          Debra                   able    996629.76
+Love                           Cathy                   bar     157316.25
+Lynch                          Eugene                  eing    12775.84
+Miranda                        Margaret                ese     3503151.12
+Nelson                         John                    n st    2079844.56
+Nixon                          Luis                    n st    182798.88
+Petty                          Molly                   ese     519369.04
+Pogue                          Trisha                  able    1444821.84
+Reed                           George                  n st    45361.52
+Ruiz                           Steven                  cally   1064927.50
+Schaeffer                      Toby                    eing    72466.24
+Schmitz                        Angel                   eing    318618.08
+Seals                          Sheila                  n st    193290.20
+Smith                          Debbie                  able    926086.56
+Snyder                         Fredrick                eing    140384.64
+Stephenson                     James                   able    52743.18
+Travis                         Ann                     n st    149890.24
+Wilkins                        Keith                   cally   128271.00
+Yang                           Ralph                   ese     9901.76

@@ -3,4 +3,4 @@
-- !query schema
struct<c_last_name:string,c_first_name:string,s_store_name:string,paid:decimal(27,2)>
-- !query output
Griffith Ray able 161564.48

Member Author

ditto

+Briggs                         Megan                   bar     598939.25
+Gonzales                       Vanessa                 eing    2221122.42
+Lacey                          Lillian                 ese     301165.60
+Rogers                         Kayla                   n st    604520.04
+Tillman                        Eugene                  eing    6144.82
+Tucker                         Clarence                able    6254.64
+Williams                       Bennie                  ese     33150.72

@@ -3,4 +3,4 @@
-- !query schema
struct<i_item_id:string,i_item_desc:string,s_store_id:string,s_store_name:string,store_sales_profit:decimal(17,2),store_returns_loss:decimal(17,2),catalog_sales_profit:decimal(17,2)>
-- !query output
AAAAAAAADPMBAAAA Things know alone letters. Flights should tend even jewish fees. Civil plans could not cry also social days; other losses might not pay walls; still able signs should not remove too human AAAAAAAAHAAAAAAA ation 12.84 91.41 -1329.46

Member Author

ditto

+AAAAAAAAEHGGAAAA       Very minute techniques should hang local soldiers. New, illegal crops   AAAAAAAALFAAAAAA        ought   -603.54 71.56   54.25
+AAAAAAAAGDPHAAAA       Large accounts buy legal        AAAAAAAAHAAAAAAA        ation   3025.05 219.52  401.82
+AAAAAAAANDDFBAAA       Different, happy children convey so at a jobs; animals ought to leave well children. Most big professionals will matter operational doctors. Very terrible rights shall continue no     AAAAAAAAHDAAAAAA        anti    -1349.76        164.49  -44.54
+AAAAAAAANEGKBAAA       Applications include    AAAAAAAAPBAAAAAA        ought   1867.80 388.05  38.94

@@ -3,4 +3,4 @@
-- !query schema
struct<segment:int,num_customers:bigint,segment_base:int>
-- !query output
11860 1 593000

Member Author

ditto

+13730  1       686500
+15401  1       770050
+16754  1       837700
+17479  1       873950
+22426  1       1121300
+33319  1       1665950

@@ -3,4 +3,4 @@
-- !query schema
struct<Call_Center:string,Call_Center_Name:string,Manager:string,Returns_Loss:decimal(17,2)>
-- !query output
AAAAAAAACAAAAAAA Mid Atlantic Felipe Perkins 109.74

Member Author

ditto

+AAAAAAAAEBAAAAAA       California_1    Jason Brito     15163.13
+AAAAAAAAGBAAAAAA       Hawaii/Alaska_1 Travis Wilson   5924.49
+AAAAAAAAHAAAAAAA       Pacific Northwest       Alden Snyder    5307.40
+AAAAAAAACAAAAAAA       Mid Atlantic    Felipe Perkins  2862.65
+AAAAAAAAIAAAAAAA       California      Wayne Ray       2193.49
+AAAAAAAAABAAAAAA       North Midwest_1 Timothy Bourgeois       2130.49
+AAAAAAAAKAAAAAAA       Hawaii/Alaska   Gregory Altman  1897.11
+AAAAAAAAEAAAAAAA       North Midwest   Larry Mccray    1754.93
+AAAAAAAAEAAAAAAA       North Midwest   Larry Mccray    1509.00
+AAAAAAAAOAAAAAAA       Mid Atlantic_1  Clyde Scott     1454.33
+AAAAAAAANAAAAAAA       NY Metro_1      Jack Little     909.15
+AAAAAAAAKAAAAAAA       Hawaii/Alaska   Gregory Altman  769.59
+AAAAAAAAEBAAAAAA       California_1    Jason Brito     627.28
+AAAAAAAAABAAAAAA       North Midwest_1 Timothy Bourgeois       551.91
+AAAAAAAAGBAAAAAA       Hawaii/Alaska_1 Travis Wilson   444.54
+AAAAAAAANAAAAAAA       NY Metro_1      Jack Little     419.47
+AAAAAAAABAAAAAAA       NY Metro        Bob Belcher     353.92
+AAAAAAAAHAAAAAAA       Pacific Northwest       Alden Snyder    335.22
+AAAAAAAADBAAAAAA       Pacific Northwest_1     Roderick Walls  216.94
+AAAAAAAABAAAAAAA       NY Metro        Bob Belcher     165.20
+AAAAAAAADBAAAAAA       Pacific Northwest_1     Roderick Walls  116.61

@maropu maropu changed the title [WIP][SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite [SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite May 3, 2021
@maropu
Member Author

maropu commented May 3, 2021

The failures in GA are not related to this PR. cc: @HyukjinKwon @dongjoon-hyun @yaooqinn

@HyukjinKwon
Member

cc @wangyum too

@SparkQA

SparkQA commented May 5, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42674/

@SparkQA

SparkQA commented May 5, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42674/

@SparkQA

SparkQA commented May 5, 2021

Test build #138153 has finished for PR 32420 at commit c21463f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented May 6, 2021

Thank you, @HyukjinKwon. Merged to master.

@maropu maropu closed this in 5c67d0c May 6, 2021
@dongjoon-hyun
Member

FYI, master branch seems to be broken at this commit, @maropu and @HyukjinKwon ~

@maropu
Member Author

maropu commented May 6, 2021

Oh, I didn't update the hash key of the cache, so forked GA jobs probably refer to an old cache. I'll make a follow-up PR to fix it.
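
To illustrate the stale-cache problem described here: the workflow's real key expression is not shown in this thread, so the following is only a Python sketch with hypothetical names (`cache_key`, the `tpcds` prefix, the placeholder input strings). It shows that a key which does not depend on the data generator stays the same after the generator is swapped, so previously cached data keeps being restored.

```python
import hashlib
from typing import Iterable


def cache_key(prefix: str, parts: Iterable[str]) -> str:
    """Build a cache key from a prefix and a digest of its inputs.

    Hypothetical helper: anything that should invalidate the cache
    has to be listed in `parts`.
    """
    digest = hashlib.sha256()
    for part in parts:
        digest.update(part.encode("utf-8"))
        digest.update(b"\0")  # separator, so ("ab", "c") != ("a", "bc")
    return f"{prefix}-{digest.hexdigest()[:16]}"


# Placeholder inputs; these are not values from the real workflow.
suite_source = "suite-source-unchanged"
old_generator = "generator-rev-A"
new_generator = "generator-rev-B"

# Key that ignores the generator: identical before and after the swap,
# so data produced by the old generator is restored from the cache.
stale_before = cache_key("tpcds", [suite_source])
stale_after = cache_key("tpcds", [suite_source])
print(stale_before == stale_after)  # True

# Key that includes the generator: changes with the swap,
# so the cached data is regenerated.
fresh_before = cache_key("tpcds", [suite_source, old_generator])
fresh_after = cache_key("tpcds", [suite_source, new_generator])
print(fresh_before == fresh_after)  # False
```

Updating the key whenever the generator or the generated data changes is one way to force a refresh; which inputs the real key hashes is left to the follow-up PR.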

dongjoon-hyun pushed a commit that referenced this pull request May 6, 2021
…C-DS cache data in forked GA jobs

### What changes were proposed in this pull request?

This is a follow-up PR of #32420 that updates the hash key to refresh the TPC-DS cache data in forked GA jobs.

### Why are the changes needed?

To recover GA jobs.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

GA passed.

Closes #32460 from maropu/SPARK-35293-FOLLOWUP.

Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
@HyukjinKwon
Member

👍
