Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Use bcp as a user option #587

Merged
merged 2 commits into from
Aug 31, 2023
Merged

Conversation

krithikasatish
Copy link
Contributor

@krithikasatish krithikasatish commented Aug 29, 2023

Added user option that enables bulk copy program (bcp) utility to speed data table insertion. This replaces the existing implementation of SQL "insert into" statements. Created additional methods for Customer, Orders and Stock tables that use bcp (bulk copy program) to load data into tables as to not interfere with existing implementation.

@sm-shaw
Copy link
Contributor

sm-shaw commented Aug 30, 2023

I have tested and made some additional changes to the pull request with this commit bcp option updates and fix

  1. On my first test, I saw this error:
Error in Virtual User 2: Error: SQLState = 28000, NativeError = 18456
Error = [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Login failed for user 'sa'.
SQLState = IM006, NativeError = 0
Error = [Microsoft][ODBC Driver 13 for SQL Server]Packet size change not supported by server, default used
child process exited abnormally

I assume that is because you need to use SQL Authentication and not Windows authentication for this feature? So I have updated the GUI options so that the BCP option it is greyed out (and unselected if selected) if Windows authentication is selected.

I have also set the default option to false in the XML assuming that not everyone with have SQL authentication enabled by default. Therefore anyone using hammerdb for the first time will use the slower way but it is likely to work out-of-the-box.

  1. Initially it was not possible to not choose the BCP Option ie unclicking it and closing the window it was then reselected if you opened the window again. This was because it was not listed in the local tpcc variables. I have added it in so the local change is then persisted.

  2. Deleting the schema would error with "Error in Virtual User 1: Error: Failed to load csv functions" - I'm not entirely sure where this package is referenced as packages should be in the lib directory or modules. There is a CSV package here https://wiki.tcl-lang.org/page/Tcllib+Csv although this doesn't look like it has been added. Therefore the schema delete would fail with the message above. for this reason I have removed the line in the delete schema that tries to load the CSV package.

I've tested (on what is now a relatively slow PC) and the build time with the BCP option is at least 2X so the functionality does what is intended.

USE BCP 30 WH with 8 VU
"Start:Wed Aug 30 17:12:25 BST 2023"
"End:Wed Aug 30 17:15:35 BST 2023"
3 minutes	10 seconds
ORIGINAL 30 WH with 8 VU
"Start:Wed Aug 30 17:16:49 BST 2023"
"End:Wed Aug 30 17:25:18 BST 2023"
8 minutes	29 seconds

I have also checked the consistency and the schema looks as it should and also tested performance after the build compared to the original and the performance looks the same, so approving the PR from my side.

Copy link
Contributor

@sm-shaw sm-shaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have tested and added some changes with bcp option updates and fix and approved.

@abondvt89
Copy link
Contributor

Merging after reviews and approval of the three members of the code maintenance team.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve load performance of HammerDB for MS SQL Server for TPCC workloads using bcp utility
5 participants