Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-39349] Add a centralized CheckError method for QA of error path #36693

Closed
wants to merge 20 commits into from

Conversation

srielau
Copy link
Contributor

@srielau srielau commented May 26, 2022

What changes were proposed in this pull request?

Pulling error messages out of the code base into error-classes.json solves only one half of the problem.
This change aims to lay the infrastructure to pull error messages out of QA.
We do this by adding an central checkError() method in SparkFunSuite which is geared towards verifying the payload of an error only:

  • ERROR_CLASS
  • Optional ERROR_SUBCLASS
  • Optional SQLSTATE (derived from error-classes.json, so debatable)
  • Parameter values (with optional parameter names for extra points)

The method allows regex matching of parameter values.

Why are the changes needed?

Pulling error-messages out of code and QA makes for a central place to fine tune error-messages for language and formatting.

Does this PR introduce any user-facing change?

No

How was this patch tested?

A subset of QA tests has been rewritten to exercise the code.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@srielau srielau changed the title CheckError [SPARK-39349] CheckError Jun 1, 2022
Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You definitely need to provide more than zero info about this change! please edit the PR template

@srielau srielau changed the title [SPARK-39349] CheckError [SPARK-39349] Add a centralized CheckError method for QA of error path Jun 1, 2022
@srielau srielau force-pushed the textless-error-check branch 4 times, most recently from 05d8407 to 8002417 Compare June 1, 2022 23:22
@srielau srielau force-pushed the textless-error-check branch 2 times, most recently from 44e36bd to b425e45 Compare June 3, 2022 18:59
@srielau
Copy link
Contributor Author

srielau commented Jun 6, 2022

@gengliangwang @cloud-fan Hoping for a review.

srielau and others added 3 commits June 6, 2022 11:13
Co-authored-by: Gengliang Wang <ltnwgl@gmail.com>
Co-authored-by: Gengliang Wang <ltnwgl@gmail.com>
Co-authored-by: Gengliang Wang <ltnwgl@gmail.com>
srielau and others added 5 commits June 7, 2022 08:00
@srielau srielau requested a review from cloud-fan June 7, 2022 16:18
srielau referenced this pull request Jun 8, 2022
### What changes were proposed in this pull request?
`REGR_INTERCEPT` is an ANSI aggregate functions

**Syntax**: REGR_INTERCEPT(y, x)
**Arguments**:
- **y**:The dependent variable. This must be an expression that can be evaluated to a numeric type.
- **x**:The independent variable. This must be an expression that can be evaluated to a numeric type.

**Examples**:
`select k, regr_intercept(v, v2) from aggr group by k;`

|  k |    regr_intercept(v, v2) |
|---|--------------------|
| 1  |          [NULL]            |
| 2 |     1.154734411       |

The algorithm refers https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance

The mainstream database supports `regr_intercept` show below:
**Teradata**
https://docs.teradata.com/r/kmuOwjp1zEYg98JsB8fu_A/MpkZBV~MSTZ~I84I~ezxNg
**Snowflake**
https://docs.snowflake.com/en/sql-reference/functions/regr_intercept.html
**Oracle**
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/REGR_-Linear-Regression-Functions.html#GUID-A675B68F-2A88-4843-BE2C-FCDE9C65F9A9
**DB2**
https://www.ibm.com/docs/en/db2/11.5?topic=af-regression-functions-regr-avgx-regr-avgy-regr-count
**H2**
http://www.h2database.com/html/functions-aggregate.html#regr_intercept
**Postgresql**
https://www.postgresql.org/docs/8.4/functions-aggregate.html
**Sybase**
https://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.help.sqlanywhere.12.0.0/dbreference/regr-intercept-function.html
**Presto**
https://prestodb.io/docs/current/functions/aggregate.html
**Exasol**
https://docs.exasol.com/sql_references/functions/alphabeticallistfunctions/regr_function.htm

### Why are the changes needed?
`REGR_INTERCEPT` is very useful.

### Does this PR introduce _any_ user-facing change?
'No'. New feature.

### How was this patch tested?
New tests.

Closes #36708 from beliefer/SPARK-37623_new.

Authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
@cloud-fan
Copy link
Contributor

The python doc issue is being fixed by #36813

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in d8e9ac0 Jun 9, 2022
ulysses-you pushed a commit to apache/kyuubi that referenced this pull request Jun 28, 2022
### _Why are the changes needed?_

to close #2949

Unquoted the function name  in the error `SECOND_FUNCTION_ARGUMENT_NOT_INTEGER` since Spark-3.4.0, for more details, see apache/spark#36693

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #2950 from cfmcgrady/kyuubi-2949.

Closes #2949

4d4da0b [Fu Chen] fix

Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: ulysses-you <ulyssesyou@apache.org>
MaxGekk added a commit that referenced this pull request Jul 29, 2022
…ror()` instead

### What changes were proposed in this pull request?
Replace all invokes of `checkErrorClass()` by `checkError()` and remove `checkErrorClass()`.

### Why are the changes needed?
1. To prepare test infra for testing of query contexts.
2. To check message parameters instead of entire text message. This PR is some kind of follow up of #36693.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
By running the modified test suites:
```
$ build/sbt "sql/testOnly *DatasetUnpivotSuite"
$ build/sbt "sql/testOnly *QueryCompilationErrorsSuite"
$ build/sbt "test:testOnly *QueryExecutionErrorsSuite"
```

Closes #37322 from MaxGekk/test-query-context.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
MaxGekk added a commit that referenced this pull request Aug 2, 2022
…Error()`

### What changes were proposed in this pull request?
1. Re-implemented `validateParsingError()` using `checkError()`.
2. Removed `checkParsingError()` and replaced by `checkError()`.

### Why are the changes needed?
1. To prepare test infra for testing of query contexts.
3. To check message parameters instead of entire text message. This PR is some kind of follow up of #36693 and #37322.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
By running the modified test suites:
```
$ build/sbt -Phive -Phive-thriftserver "test:testOnly *TruncateTableParserSuite"
$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *ShowPartitionsParserSuite"
$ build/sbt "sql/testOnly *QueryParsingErrorsSuite"
```

Closes #37363 from MaxGekk/checkParsingError-to-checkError.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants