Add multi-row/batch insert for Spark SQL #1284

Merged (2 commits) on Jul 7, 2024

gatear (Contributor) commented Jul 3, 2024

I was mistaken in the first PR, #1261.

Spark SQL supports multi-row inserts:
https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-dml-insert-into.html

I validated on Databricks that the following statement is accepted:

INSERT INTO `MY_TABLE` (`field1`, `field2`)
VALUES ('value1', 'value2'),
       ('value1', 'value2'),
       ('value1', 'value2'),
       ...
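
For readers who want to reproduce a statement of this shape, here is a minimal Java sketch. It is not Datafaker's implementation: the class `MultiRowInsertSketch` and its methods are hypothetical names made up for illustration, and the sketch assumes that identifiers and values contain no quote characters (real code would need dialect-specific escaping).

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Datafaker: renders one multi-row INSERT statement.
final class MultiRowInsertSketch {

    // Wraps an identifier in backticks. Assumes the name contains no backtick.
    static String quoteIdentifier(String name) {
        return "`" + name + "`";
    }

    // Wraps a value in single quotes. Assumes the value contains no quote or backslash.
    static String quoteLiteral(String value) {
        return "'" + value + "'";
    }

    static String build(String table, List<String> columns, List<List<String>> rows) {
        if (rows.isEmpty()) {
            throw new IllegalArgumentException("at least one row is required");
        }
        // (`field1`, `field2`)
        String columnList = columns.stream()
                .map(MultiRowInsertSketch::quoteIdentifier)
                .collect(Collectors.joining(", ", "(", ")"));
        // One ('v1', 'v2') tuple per row, separated by a comma and a newline.
        String valueList = rows.stream()
                .map(row -> row.stream()
                        .map(MultiRowInsertSketch::quoteLiteral)
                        .collect(Collectors.joining(", ", "(", ")")))
                .collect(Collectors.joining(",\n       "));
        return "INSERT INTO " + quoteIdentifier(table) + " " + columnList
                + "\nVALUES " + valueList;
    }

    public static void main(String[] args) {
        List<String> row = List.of("value1", "value2");
        System.out.println(build("MY_TABLE", List.of("field1", "field2"), List.of(row, row, row)));
    }
}
```

Running `main` prints a three-row statement in the same layout as the one validated above.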

what-the-diff bot commented Jul 3, 2024

PR Summary

  • Modification of SqlDialect.java
    The update to SqlDialect.java removes a limitation for SparkSQL. Previously, the dialect could not use batch operations (handling multiple rows in a single statement, which is faster and more efficient). That limitation has been removed.

  • Changes in SqlTest.java
    The changes in SqlTest.java were made to test the new batch insert capability for SparkSQL. A test method now generates a batch insert statement with 10 rows to verify that the feature works correctly.
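
One way to picture the change summarized above is a per-dialect capability flag that decides between one multi-row statement and one statement per row. The sketch below is an assumption made for illustration only: the enum `Dialect`, its constants, and the `statements` method are hypothetical and do not mirror the actual contents of SqlDialect.java or SqlTest.java.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a batch-capability flag; names do not come from Datafaker.
final class BatchCapabilitySketch {

    enum Dialect {
        SPARK_LIKE(true),        // accepts INSERT ... VALUES (...), (...)
        SINGLE_ROW_ONLY(false);  // stand-in for a dialect limited to one row per statement

        final boolean supportsBatch;

        Dialect(boolean supportsBatch) {
            this.supportsBatch = supportsBatch;
        }
    }

    // rowTuples are already-rendered tuples such as "('a', 'b')".
    static List<String> statements(Dialect dialect, String insertPrefix, List<String> rowTuples) {
        List<String> out = new ArrayList<>();
        if (rowTuples.isEmpty()) {
            return out; // nothing to insert
        }
        if (dialect.supportsBatch) {
            // All rows go into a single statement.
            out.add(insertPrefix + " VALUES " + String.join(", ", rowTuples));
        } else {
            // Fall back to one statement per row.
            for (String tuple : rowTuples) {
                out.add(insertPrefix + " VALUES " + tuple);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tuples = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            tuples.add("('value" + i + "')");
        }
        String prefix = "INSERT INTO `MY_TABLE` (`field1`)";
        System.out.println(statements(Dialect.SPARK_LIKE, prefix, tuples).size());
        System.out.println(statements(Dialect.SINGLE_ROW_ONLY, prefix, tuples).size());
    }
}
```

With 10 rows, the batch-capable dialect yields a single statement while the single-row dialect yields 10, which is the kind of difference a 10-row test can check.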

codecov-commenter commented Jul 3, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.96%. Comparing base (b37c566) to head (bf40361).
Report is 206 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1284      +/-   ##
============================================
- Coverage     92.35%   91.96%   -0.39%     
- Complexity     2821     3085     +264     
============================================
  Files           292      310      +18     
  Lines          5609     6025     +416     
  Branches        599      628      +29     
============================================
+ Hits           5180     5541     +361     
- Misses          275      332      +57     
+ Partials        154      152       -2     


kingthorin (Collaborator) commented

Some tests are failing.

gatear (Contributor, Author) commented Jul 7, 2024

@kingthorin it's ready for review

kingthorin (Collaborator) left a comment

LGTM

snuyanzin merged commit ca568a0 into datafaker-net:main on Jul 7, 2024
10 checks passed