Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving buildParamTypeDefinitions performance #1938

Merged
merged 3 commits into from
May 25, 2023
Merged

Conversation

cogman
Copy link
Contributor

@cogman cogman commented Oct 21, 2022

It was noticed in our applications that the growing of arrays and string builders were a major source of memory pressure. Precomputing the size of both and removing an unneeded string allocation from trim significantly reduces the amount of allocation buildParamTypeDefinitions needs.

It was noticed in our applications that the growing of arrays and string builders were a major source
of memory pressure.  Precomputing the size of both and removing an unneeded string allocation from trim
significantly reduces the amount of allocation buildParamTypeDefinitions needs.
@cogman
Copy link
Contributor Author

cogman commented Oct 21, 2022

@microsoft-github-policy-service agree

@Jeffery-Wasty
Copy link
Member

Thank you for this PR. We'll look over it, test it out, and get back to you as soon as we can.

@Jeffery-Wasty
Copy link
Member

We're not seeing this 'memory pressure' you talk about. Would you be able to provide screenshots showing memory impact before and after this fix? This will help us in prioritizing this PR.

@cogman
Copy link
Contributor Author

cogman commented Oct 21, 2022

@Jeffery-Wasty Yup.

Let me recreate the scenario to show the reduction. In our case, we are dealing with batch statements against very wide tables so that's what I'll put together to show it.

@cogman
Copy link
Contributor Author

cogman commented Oct 21, 2022

Before the Change
After the Change
Flight recording + java used to generate results

See the attached. It's not a perfect win but it does cut back on the number of garbage collections (reducing memory pressure) while doing batch inserts. As an added bonus, because it isn't copying as much data around it also reduces the runtime of that method.

Before the change CPU usage
After the change CPU usage

Both flight recordings are for 1 minute after the JVM had ran for a bit with the attached code.

@Jeffery-Wasty Jeffery-Wasty self-assigned this Oct 21, 2022
@Jeffery-Wasty Jeffery-Wasty added this to In progress in MSSQL JDBC via automation Oct 21, 2022
@Jeffery-Wasty Jeffery-Wasty added the Under Review Used for pull requests under review label Oct 21, 2022
@Jeffery-Wasty Jeffery-Wasty moved this from In progress to Under Peer Review in MSSQL JDBC Oct 21, 2022
@Jeffery-Wasty Jeffery-Wasty moved this from Under Peer Review to Backlog in MSSQL JDBC Nov 9, 2022
@Jeffery-Wasty Jeffery-Wasty added Backlog The topic in question has been recognized and added to development backlog and removed Under Review Used for pull requests under review labels Nov 9, 2022
@Jeffery-Wasty
Copy link
Member

@cogman, this PR is being moved to our backlog for now, and will be assessed again as soon we can. We're doing this for two reasons:

  • The code makes changes to very delicate parts of the code and extensive testing is required before incorporating changes.
  • The memory difference noted doesn't appear to be a large one. From the screenshots you posted, you are going from ~5% usage to ~1%. While this is an improvement, and we're always looking to improve the driver, it's not enough to make this a high priority item at this moment.

@cogman
Copy link
Contributor Author

cogman commented Nov 9, 2022

@Jeffery-Wasty No worries.

2 points though, the important thing about my screenshots wasn't necessarily the CPU improvements (though, those are there) but rather the amount of memory allocated. The number of GCs for the same runtime is roughly half. As I mentioned earlier, this PR is about reducing memory pressure (number of GCs) above all else. For our larger applications this is important to us. More garbage collecting == lower throughput.

Second point is this code, because of it's location, is very heavily tested. When writing this, I noticed the method was invoked in test over 400k times. That's the primary reason I didn't add new tests, the code is thoroughly covered.

Again, no rush, but just thought those are relevant points.

@Jeffery-Wasty Jeffery-Wasty removed their assignment Dec 22, 2022
@Jeffery-Wasty
Copy link
Member

Jeffery-Wasty commented May 16, 2023

closes #1678

@Jeffery-Wasty Jeffery-Wasty removed the Backlog The topic in question has been recognized and added to development backlog label May 17, 2023
@Jeffery-Wasty Jeffery-Wasty moved this from Backlog to Under Peer Review in MSSQL JDBC May 17, 2023
@Jeffery-Wasty
Copy link
Member

Hi @cogman,

We've removed this from our backlog as we're now able to spend time looking into it again. We'll update you on the status as we continue.

David-Engel
David-Engel previously approved these changes May 18, 2023
Copy link
Contributor

@David-Engel David-Engel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tkyc
Copy link
Member

tkyc commented May 25, 2023

I found one issue with the PR when I was doing some testing on our internal pipelines. At the moment I'm still trying to debug the issue and identify the cause, but I've attached a file with the failing test and DDL used in case you want to take a look as well (the test in the file will need to be ran on an AE connection). So far from what I've learned, the issue happens because the renewed build type definition isn't using a precision of 3 for the type ie. it's setting the renewed build type definition as datetime rather than datetime2(3) which is the reason for the failing test.

PR1938-debug.txt

EDIT:

This is likely the fix. Line 470 will need to happen before line 442.

            params[i].renewDefinition = renewDefinition;
            String typeDefinition = param.getTypeDefinition(connection, resultsReader());

The renew action needs to happen before pulling in the static type definition action to avoid errors in created commands
MSSQL JDBC automation moved this from Under Peer Review to In progress May 25, 2023
@cogman
Copy link
Contributor Author

cogman commented May 25, 2023

@tkyc I've made that change. Let me know if there's further changes needed.

@tkyc
Copy link
Member

tkyc commented May 25, 2023

@cogman Everything looks good so far, I'll merge after my tests are done running.

@tkyc tkyc merged commit 823fb86 into microsoft:main May 25, 2023
17 checks passed
MSSQL JDBC automation moved this from In progress to Closed/Merged PRs May 25, 2023
@tkyc
Copy link
Member

tkyc commented May 25, 2023

Just a heads up, next preview release will be early June and the stable release will be on end of July.

@lilgreenbird lilgreenbird added this to the 12.3.1 milestone May 29, 2023
Jeffery-Wasty pushed a commit that referenced this pull request Jun 2, 2023
* Improving prepared statment performance

It was noticed in our applications that the growing of arrays and string builders were a major source
of memory pressure.  Precomputing the size of both and removing an unneeded string allocation from trim
significantly reduces the amount of allocation buildParamTypeDefinitions needs.

* Move renew before pulling type definition

The renew action needs to happen before pulling in the static type definition action to avoid errors in created commands
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
MSSQL JDBC
  
Closed/Merged PRs
5 participants