Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Significant performance degradation in SQLServerDatabaseMetadata in 6.4.0 #665

Closed
woehrl01 opened this issue Mar 26, 2018 · 19 comments
Closed
Assignees
Projects

Comments

@woehrl01
Copy link

Driver version or jar name

I'm using the 6.4.0.jre8 in comparison to the 6.2.2.jre8.

SQL Server version

Microsoft SQL Server 2012 - 11.0.5343.0 (X64)

Client operating system

Microsoft Windows 10 Pro (10.0.16299)jav

Java/JVM version

Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

Problem description

After upgrading the driver from 6.2.2.jre8 to 6.4.0.jre8 I encounter a serious performance degradation in querying the metadata of the database.

The following screenshot is from the opensource tool symmetricds with the 6.2.2.jre8 driver, where you see a total execution time of 44 seconds for a method.

image

In the second screenshot you see the same code just with using the 6.4.0.jre8 driver, you can see that the total execution time is now 2 times slower (128 seconds).

image

If you tear down the callstack you'll see that the difference is in the sql server jdbc part where in the later screenshot the getImportedKeys is a lot slower than in the 6.2.2.jre8 version.

Expected behavior and actual behavior

I'd expect that the two drivers should behave equally. At least I don't expect a performance penalty of 200%.

Repro code

https://github.com/JumpMind/symmetric-ds you have to replace the driver with mssql-jdbc.

@woehrl01
Copy link
Author

After some investigation, it looks like this is due to the change of #490 to fix #467 @v-xiangs

@woehrl01
Copy link
Author

I think one improvement here, could be made that the routine does an early return if the sp_fkeys stored procedure doesn't return anything. Some heavy liftings should be moved to a later step. As if there are no foreign keys, a lot of unnecessary things are done (e.g. creating the temp table, running the join, doing the update etc.).

Any suggestions?

@peterbae peterbae added this to To do in MSSQL JDBC Mar 27, 2018
@rene-ye rene-ye self-assigned this Apr 3, 2018
@rene-ye rene-ye added the Work in Progress The pull request is a work in progress label Apr 3, 2018
@rene-ye rene-ye moved this from To do to Under Investigation in MSSQL JDBC Apr 3, 2018
@rene-ye
Copy link
Member

rene-ye commented Apr 4, 2018

Hi @woehrl01, thanks for submitting the issue. I've confirmed there is significant performance degradation when executing getImportedKeys(). I looked through the old pull request and it seems to address a sp_fkeys deficiency regarding limited return values. However, despite the thread mentioning that the stored procedure won't be fixed, this msdn document claims otherwise.

I have confirmed that sp_fkeys can successfully return values ranging from 0-3. In lieu of these changes, I propose we attempt to revert the PR as it is no longer necessary (but I see some of the values are flipped compared to sys.foreign_keys so it might take some adjusting). Does this solution sound reasonable?

@woehrl01
Copy link
Author

woehrl01 commented Apr 4, 2018

Hi @rene-ye, thanks for your feedback!

After thinking a while about this, I think that a more optimal solution would be, by using only the sys tables + a case when and some aliases, to create a single sql statement to return the expected output. This means we drop the temp table, usage of sp_keys and the multiple update statements.

This should result in a much better preforming statement, which is a lot more readable, too.

If you're unsure what I mean, I could provide the statement inlined into this issue or a full a PR in the next days. How does this sound to you?

@woehrl01 woehrl01 closed this as completed Apr 4, 2018
MSSQL JDBC automation moved this from Under Investigation to Closed Issues Apr 4, 2018
@woehrl01 woehrl01 reopened this Apr 4, 2018
MSSQL JDBC automation moved this from Closed Issues to In progress Apr 4, 2018
@woehrl01
Copy link
Author

woehrl01 commented Apr 4, 2018

Sorry for closing and reopen. I accidentally hit the wrong button 🙈

@rene-ye
Copy link
Member

rene-ye commented Apr 4, 2018

Hi @woehrl01, please feel free to add information/submit a PR if you have time. The team is happy to review and consider all possibilities when deciding on a solution to satisfy everyone.

@woehrl01
Copy link
Author

woehrl01 commented Apr 4, 2018

Hi @rene-ye , I'll provide a PR, due to some busy times, this will take a few days.

@rene-ye
Copy link
Member

rene-ye commented Apr 10, 2018

Hi @woehrl01, I know you're in the process of a PR, but I'd like to share some of my findings on this issue with you and get some of your opinions on possible changes. I created a temporary PR on my fork to make this easier, you can click the above or this link.

After thinking a while about this, I think that a more optimal solution would be, by using only the sys tables + a case when and some aliases, to create a single sql statement to return the expected output. This means we drop the temp table, usage of sp_keys and the multiple update statements.

Only using sys tables
This may be difficult/impossible to implement. The sp_fkeys procedure does more than just pull data from a bunch of system tables. There's some very valuable logic in the procedure such as parameter sanitation/verification which I think we should keep. I don't believe the procedure itself is the cause of the performance degradation, rather multiple statement executes.
create/execute a single sql statement
This is indeed the way to go, and should be the optimal solution. I've created the PR with this goal in mind but it runs into problems when executing the statement from the driver, mostly to do with error handling. E.g instead of throwing a specific message thrown by the SQL Server, it'll just throw "result set has no current row". This is caused by dropping the temp table as the last statement. The alternative is to split the drop into another execute, but that option degrades performance.
drop the temp table
I've looked around for ways to interact with the Resultset returned from sp_fkeys, but it seems we're limited in our options. The best recommended approach is using a temporary table. That being said, I tried to trim down on the use and the new PR uses only 1 temp table instead of 2.

All in all, I think this issue is important and a fix is necessary, so thank you for creating this thread. Performance degradation is to be expected as the previous PR was meant to align the driver with JDBC Specs. I'm just not sure at this point how much we can do to minimize the impact. If you could take a look at some of the changes and get back to me on your thoughts, that'd be great.

@woehrl01
Copy link
Author

woehrl01 commented Apr 10, 2018

Hi @rene-ye !

Thanks for your detailed response, I already created the sql which should output the desired values, I haven't gotten the time yet to make it a fully PR.

Just in case we are aligned:

create table #fkeys_results ( --or variable to don't need a drop: declare @fkeys_result table
 PKTABLE_QUALIFIER sysname, 
 PKTABLE_OWNER sysname, 
 PKTABLE_NAME sysname, 
 PKCOLUMN_NAME sysname,
 FKTABLE_QUALIFIER sysname, 
 FKTABLE_OWNER sysname, 
 FKTABLE_NAME sysname, 
 FKCOLUMN_NAME sysname, 
 KEY_SEQ smallint, 
 UPDATE_RULE smallint,
 DELETE_RULE smallint, 
 FK_NAME sysname, 
 PK_NAME sysname, 
 DEFERRABILITY smallint 
);

insert into #fkeys_results
exec sp_fkeys ?, ?, ?, ?, ?, ?;

select fkr.PKTABLE_QUALIFIER, fkr.PKTABLE_OWNER, fkr.PKTABLE_NAME, fkr.PKCOLUMN_NAME, 
fkr.FKTABLE_QUALIFIER, fkr.FKTABLE_OWNER, fkr.FKTABLE_NAME, fkr.FKCOLUMN_NAME,
fkr.KEY_SEQ, fkr.FK_NAME, fkr.PK_NAME, fkr.DEFERRABILITY,

case fkr.UPDATE_RULE 
	when 0 then 3
	when 1 then 0
	when 2 then 2
	when 3 then 4
	else null
end as UPDATE_RULE, 

case fkr.DELETE_RULE 
	when 0 then 3
	when 1 then 0
	when 2 then 2
	when 3 then 4
	else null
end as DELETE_RULE

from #fkeys_result fkr

If we pass in the parameter array to that statement instead of the single call to sp_keys we should have a working result. I think it should be still a good way to use a GUID for the table. Or we use a table variable instead, so we don't even have to drop anything. What do you think?

@rene-ye
Copy link
Member

rene-ye commented Apr 10, 2018

Hi @woehrl01, I looked through the SQL code and in general it's what I'm using. It's worth mentioning that there's currently a deficiency with sp_fkeys when executed with fktable parameters that cause it to only return 0 or 1 on UPDATE_RULE/DELETE_RULE columns. That's why I don't think we can avoid joining the sys.foreign_keys table. This behavior also extends to pktable parameters when we use an older version of SQL Server, such as 2008, which we have to support.

@woehrl01
Copy link
Author

Yes, it looks similar to your SQL code, just wanted to share my thoughts. Interesting that sp_keys returns only 0 or 1, as according to the MSDN it should differently.
What are your thoughts about using a table variable instead of a temp table?

@rene-ye
Copy link
Member

rene-ye commented Apr 10, 2018

I'm going to try that right now and see if it resolves some issues I had with drop. It seems to work just fine on SSMS and I'm going to use it over creating a temp table if I can.

Edit: It seems to work just fine, going forward with the implementation. It's probably going to be awhile before I can create a PR for this issue as it has brought some problems into light, and I'd like to address all of them before proceeding.

@rene-ye
Copy link
Member

rene-ye commented Apr 11, 2018

Hi @woehrl01, I've put together a fix and was wondering if you'd be interested in testing its performance. It's worth noting that I'm using a PreparedStatement instead of a regular Statement to execute the SQL. A regular Statement has security problems, but is noticeably faster. I'll attach the jars in this post.

Jars for testing.zip

@woehrl01
Copy link
Author

Hi @rene-ye , wow that was fast! I'll give those a try and will reach back to you. Thanks!

Just out of interest, which kind of security concerns do you have with a regular statement?

@rene-ye
Copy link
Member

rene-ye commented Apr 11, 2018

There have been cases where a user is able to perform an SQL injection through getImportedKeys(String,String,String)/getExportedKeys/getCrossReference.

@woehrl01
Copy link
Author

woehrl01 commented Apr 12, 2018

Thanks, that's reasonable. I wasn't aware that in jdbc the only way to pass parameters is by using a prepared statement. (I'm mostly active in the .Net world, where can do differently) e.g. By using sp_executesql in a single roundtrip.

I was looking through your pr and it looks pretty complete, I think the best is that we simply use your code. As due to limited time on my side I'm still struggling executing the unit tests)

What do you think?

@rene-ye
Copy link
Member

rene-ye commented Apr 12, 2018

I'll look into some other options for the prepared statement, but other than that I'll create the PR in the near future. Thanks for all your time and help.

@eirikbakke
Copy link

I see that you're working on the getResultSetForForeignKeyInformation code--I just reported another related bug at #681 .

@rene-ye
Copy link
Member

rene-ye commented May 3, 2018

#677 has now been merged. Closing the issue.

@rene-ye rene-ye closed this as completed May 3, 2018
MSSQL JDBC automation moved this from In progress to Closed Issues May 3, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
MSSQL JDBC
  
Closed Issues
Development

No branches or pull requests

3 participants