feat: optimize tidb data integration #3839

Merged

Conversation

yht520100
Collaborator

  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
    feature

  • What is the current behavior? (You can also link to an open issue here)
    tispark data type #3808

  • What is the new behavior (if this is a feature change)?
    Adds support for the TiDB data types smallint, int, and datetime.
    Supported functionality: offline import (soft copy) and online import.
    Online import: type checking is skipped and asymmetric type conversion is supported.
    Data export: type checking is skipped and asymmetric type conversion is supported, for example for unsigned data.
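
To make "asymmetric type conversion" concrete, here is a minimal sketch of one way to read it: the import side and the export side use different rules for the same column, so an unsigned TiDB integer is widened on import and range-checked when written back. This is an illustration only and not code from this PR. The mapping table, the function names `import_type` and `export_to_unsigned_int`, and the choices of `bigint` for unsigned int and `timestamp` for datetime are all assumptions.

```python
"""Illustrative sketch of asymmetric type conversion; every name here is hypothetical."""

# Largest value a 32-bit unsigned integer column can hold.
UINT32_MAX = 2**32 - 1

# Assumed import-side mapping (TiDB column type -> OpenMLDB column type).
# The unsigned and datetime rows are guesses, not taken from the PR.
IMPORT_TYPE_MAP = {
    "smallint": "smallint",
    "int": "int",
    "int unsigned": "bigint",  # widened so values above 2**31 - 1 still fit
    "datetime": "timestamp",
}


def import_type(tidb_type: str) -> str:
    """Return the assumed OpenMLDB type for a TiDB column type."""
    return IMPORT_TYPE_MAP[tidb_type.strip().lower()]


def export_to_unsigned_int(value: int) -> int:
    """Narrow a widened value back to an unsigned 32-bit column on export.

    The conversion is asymmetric: import widens without loss, while export
    must reject values that the narrower target column cannot represent.
    """
    if not 0 <= value <= UINT32_MAX:
        raise ValueError(f"{value} does not fit in an unsigned 32-bit column")
    return value
```

Under these assumptions, `import_type("INT UNSIGNED")` yields `bigint`, and `export_to_unsigned_int(-1)` raises `ValueError` instead of silently wrapping around.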

@github-actions github-actions bot added the labels documentation (Improvements or additions to documentation) and batch-engine (openmldb batch(offline) engine) on Mar 28, 2024
Contributor

github-actions bot commented Mar 28, 2024

SDK Test Report

102 files  +1  102 suites  +1   2m 13s ⏱️ +8s
357 tests +8  343 ✅ +8  14 💤 ±0  0 ❌ ±0 
483 runs  +8  469 ✅ +8  14 💤 ±0  0 ❌ ±0 

Results for commit d8dff65. ± Comparison against base commit 86d640a.

This pull request removes 48 and adds 35 tests. Note that renamed tests count towards both.
  PARTITION BY db1.t1.col2 ORDER BY db1.t1.col1
  PARTITION BY t1.col2 ORDER BY t1.col1
  ROWS_RANGE BETWEEN 3 PRECEDING AND CURRENT ROW
 ) limit 10;](1)
 ) limit 10;](2)
 ) limit 10;](3)
 FROM db1.t1
 FROM t1
 WINDOW w1 AS (
 last join db2.t2 order by db2.t2.col1
…
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlLastJoinWithMultipleDB[,  SELECT sum(db1.t1.col1) over w1 as sum_t1_col1, db2.t2.str1 as t2_str1
 FROM db1.t1
 last join db2.t2 order by db2.t2.col1
 on db1.t1.col1 = db2.t2.col1 and db1.t1.col2 = db2.t2.col0
 WINDOW w1 AS (
  PARTITION BY db1.t1.col2 ORDER BY db1.t1.col1
  ROWS_RANGE BETWEEN 3 PRECEDING AND CURRENT ROW
 ) limit 10;](2)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlLastJoinWithMultipleDB[db1,  SELECT sum(t1.col1) over w1 as sum_t1_col1, db2.t2.str1 as t2_str1
 FROM t1
 last join db2.t2 order by db2.t2.col1
 on t1.col1 = db2.t2.col1 and t1.col2 = db2.t2.col0
 WINDOW w1 AS (
  PARTITION BY t1.col2 ORDER BY t1.col1
  ROWS_RANGE BETWEEN 3 PRECEDING AND CURRENT ROW
 ) limit 10;](1)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlLastJoinWithMultipleDB[null,  SELECT sum(db1.t1.col1) over w1 as sum_t1_col1, db2.t2.str1 as t2_str1
 FROM db1.t1
 last join db2.t2 order by db2.t2.col1
 on db1.t1.col1 = db2.t2.col1 and db1.t1.col2 = db2.t2.col0
 WINDOW w1 AS (
  PARTITION BY db1.t1.col2 ORDER BY db1.t1.col1
  ROWS_RANGE BETWEEN 3 PRECEDING AND CURRENT ROW
 ) limit 10;](3)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlMultipleDBErrorTest[, SELECT db2.t2.str1 as t2_str1
 FROM t1
 last join db2.t2 order by db2.t2.col1
 on t1.col1 = db2.t2.col1 and t1.col2 = db2.t2.col0;
, SQL parse error: Fail to transform data provider op: table t1 not exists in database []](4)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlMultipleDBErrorTest[db1, SELECT db1.t2.str1 as t2_str1
 FROM t1
 last join db2.t2 order by db2.t2.col1
 on t1.col1 = db2.t2.col1 and t1.col2 = db2.t2.col0;
, SQL parse error: Column Not found: db1.t2.str1](2)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlMultipleDBErrorTest[db1, SELECT db2.t2.str1 as t2_str1
 FROM t1
 last join db2.t2 order by db2.t2.col1
 on t1.col1 = t2.col1 and t1.col2 = db2.t2.col0;
, SQL parse error: Column Not found: .t2.col1](3)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlMultipleDBErrorTest[db1, SELECT t2.str1 as t2_str1
 FROM t1
 last join db2.t2 order by db2.t2.col1
 on t1.col1 = db2.t2.col1 and t1.col2 = db2.t2.col0;
, SQL parse error: Column Not found: .t2.str1](1)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlMultipleDBErrorTest[null, SELECT db2.t2.str1 as t2_str1
 FROM t1
 last join db2.t2 order by db2.t2.col1
 on t1.col1 = db2.t2.col1 and t1.col2 = db2.t2.col0;
, SQL parse error: Fail to transform data provider op: table t1 not exists in database []](5)
com._4paradigm.hybridse.sdk.SqlEngineTest ‑ sqlWindowLastJoin[ SELECT sum(t1.col1) over w1 as sum_t1_col1, t2.str1 as t2_str1
 FROM t1
 last join t2 order by t2.col1
 on t1.col1 = t2.col1 and t1.col2 = t2.col0
 WINDOW w1 AS (
  PARTITION BY t1.col2 ORDER BY t1.col1
  ROWS_RANGE BETWEEN 3 PRECEDING AND CURRENT ROW
 ) limit 10;](1)
com._4paradigm.openmldb.batch.TestInsertPlan ‑ Test column with default value
…

♻️ This comment has been updated with latest results.

codecov bot commented Mar 28, 2024

Codecov Report

Attention: Patch coverage is 39.47368% with 46 lines in your changes are missing coverage. Please review.

Project coverage is 40.75%. Comparing base (86d640a) to head (3e67365).
Report is 10 commits behind head on main.

Files Patch % Lines
...paradigm/openmldb/batch/utils/DataSourceUtil.scala 37.31% 42 Missing ⚠️
..._4paradigm/openmldb/batch/nodes/LoadDataPlan.scala 25.00% 3 Missing ⚠️
...paradigm/openmldb/batch/nodes/SelectIntoPlan.scala 50.00% 1 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##               main    #3839       +/-   ##
=============================================
- Coverage     74.87%   40.75%   -34.12%     
  Complexity      658      658               
=============================================
  Files           742      195      -547     
  Lines        133925    11679   -122246     
  Branches       1387     1412       +25     
=============================================
- Hits         100277     4760    -95517     
+ Misses        33344     6615    -26729     
  Partials        304      304               

Contributor

github-actions bot commented Mar 28, 2024

Linux Test Report

    57 files  +     4     244 suites  +184   1h 43m 59s ⏱️ + 1h 15m 31s
12 631 tests +11 960  12 624 ✅ +11 960  7 💤 ±0  0 ❌ ±0 
17 908 runs  +17 236  17 901 ✅ +17 236  7 💤 ±0  0 ❌ ±0 

Results for commit d8dff65. ± Comparison against base commit 86d640a.

♻️ This comment has been updated with latest results.

Collaborator

@tobegit3hub tobegit3hub left a comment

LGTM

@vagetablechicken vagetablechicken changed the title from "Feat/optimize tidb data integration" to "feat: optimize tidb data integration" on Apr 2, 2024
@yht520100
Collaborator Author

The CI/CD failures are not caused by the commits in this PR, so I am requesting a manual merge.

@yht520100 yht520100 merged commit bfe5c1c into 4paradigm:main Apr 15, 2024
21 of 23 checks passed
Labels
batch-engine openmldb batch(offline) engine documentation Improvements or additions to documentation
4 participants