Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add maxResultBuffer property #1431

Merged
merged 18 commits into from
Nov 27, 2020
Merged

Add maxResultBuffer property #1431

merged 18 commits into from
Nov 27, 2020

Conversation

sawkoj
Copy link
Contributor

@sawkoj sawkoj commented Sep 17, 2020

Implementation of new property - maxResultBuffer. Enables max bytes read during reading result set.
Enable declare in two styles:
• as size of bytes (i.e. 100, 150M, 300K, 400G);
• as percent of max heap memory (i.e. 10p, 15pct, 20percent).

It also validates if defined size isn't bigger than 90% of max heap memory. If so, it limits max bytes read to 90% of memory heap. If maxResultBuffer property isn't provided, driver work isn't changed.

…ssage, changed order of MaxResultBufferPropertyDescription Exception,
… use of StringUtils in MaxResultBufferParser, replaced substring() with endsWith() and charAt(), extracted errorMessage to variable, fixed format of multipliers, added few test cases to MaxResultBufferParserTest, updated setting maxResultBuffer to activeConnectionProperties in SQLServerConnection
…urce, corrected formatting in SQLServerConnection, extracted switch expression to method, reformatted code
@ghost
Copy link

ghost commented Sep 17, 2020

CLA assistant check
All CLA requirements met.

@sawkoj sawkoj changed the title Max result buffer Add maxResultBuffer property Sep 17, 2020
@ulvii ulvii added this to the 9.1.0 milestone Sep 28, 2020
@rene-ye
Copy link
Member

rene-ye commented Oct 1, 2020

Hi @sawkoj, can you give some points regarding why this feature is necessary? The JDBC Statement APIs provide similar "memory" control options in setFetchSize and setMaxFieldSize. Although this doesn't directly allow you to control memory, the intentions are the same. Can you give a use case where your proposed feature would be significantly better than the existing JDBC apis?

@peterbae
Copy link
Contributor

peterbae commented Oct 1, 2020

Thanks for the contribution @sawkoj, sorry for the delay in response. I see that this change allows the user to set a limit to how much data can be read from a resultset, but in what situation will this change be useful? So far, users haven't asked for such feature because they can choose to stream the data instead of storing the entire resultset in memory, which I think is more flexible than the driver throwing an error if the amount of data read goes over the limit specified in this buffer.

@sawkoj
Copy link
Contributor Author

sawkoj commented Oct 2, 2020

Hi @rene-ye @peterbae , a purpose of the added property is to protect user from Out Of Memory Exception. Properties for “memory” control, which you have mentioned, won’t be enough in some cases. The most common case, I can think of, is when we have 2 queries which produce different result sets. Let’s call them A and B. A is large, contains thousands of rows, B’s size doesn’t matter. Upon executing A’s query, only first packet of A is kept in memory. We don’t iterate or do anything with A. Next, we call B query, to get B result set. In this moment, all remaining packets of A’s result set are being read to memory with first packet of B. Here OOM can occur and setting setFetchSize or setMaxFieldSize can prevent that.

Setting these limits can be very inconvenient though. Setting fetchSize changes amount of packets that need to be send to read whole resultSet. It could be a way of controlling of memory usage, but this approach is very error prone (you have to calculate the number of bytes for single row from a given query). If any column in query/table changes, you have to recompute number of rows to fetch if you want to handle memory using setFetchSize. The similar issue applies for setting maxFieldSize. You depend memory control on size of a single column, on change of query/table, you have to take that in consideration in code.

By using maxResultBuffer you just put amount of memory that you want to be used and don’t worry about having OOM and stopping whole application.

@ebrandsberg
Copy link

ebrandsberg commented Oct 2, 2020

Adding to this. This work is being done as part of a larger project at Heimdall Data, which provides a proxy for a variety of database types. We are working to bring consistent controls over memory usage across multiple driver types. This feature has already been included in the Postgres JDBC driver (pgjdbc/pgjdbc#1657), as well as a fix submitted to the Oracle MySQL jdbc driver (https://bugs.mysql.com/bug.php?id=100415), so our next target was the SQL Server driver. To completely understand the need, don't think about one connection, but when you have thousands of connections active, and you need to be able to ensure there is enough memory for all of them. @sawkoj explained one very specific case, but there are other cases where (I believe) adaptive fetching doesn't work, in particular if retrieving large result sets from tables with clustered columnstore indexes. Cursors are not allowed in this case, and can trigger oom problems.

@peterbae peterbae added this to Under Peer Review in MSSQL JDBC Oct 2, 2020
@ulvii ulvii removed this from the 9.1.0 milestone Oct 13, 2020
@rene-ye
Copy link
Member

rene-ye commented Oct 29, 2020

Hi @sawkoj, I've been testing the PR and there is currently some undesirable behavior with the new feature. There are currently some scenarios where if the maxResultBuffer property was set to "-1", the driver will throw an error on connect. New properties should be introduced with a default value where it won't impact the rest of the driver when either not specified or off. The error message also seems to need to work, currently I'm getting:

com.microsoft.sqlserver.jdbc.SQLServerException: Invalid syntax: [Ljava.lang.Object;@628190ec in maxResultBuffer parameter at com.microsoft.sqlserver.jdbc.MaxResultBufferParser.throwNewInvalidMaxResultBufferParameterException(MaxResultBufferParser.java:120)

The above error is easily reproducible if you run our unit tests provided in the repo, if you have any trouble setting it up, we can help with that. Some of the tests require special setup, and it's ok if you can't run them on your end, but the CIs should pass.

long number = -1;

// check for null values and empty String "", if so return -1
if (StringUtils.isEmpty(input)) {
Copy link
Member

@rene-ye rene-ye Oct 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temporary solution would be to change this line to

if (StringUtils.isEmpty(input) || "-1".equalsIgnoreCase(input)) {

But we should ask whether it makes sense to allow users to even input something like -1 at all. -1 is also used as a default value by XAConnection at times.

@@ -355,6 +355,8 @@ protected static ISQLServerDataSource updateDataSource(String connectionString,
break;
case Constants.SEND_TEMPORAL_DATATYPES_AS_STRING_FOR_BULK_COPY:
ds.setSendTemporalDataTypesAsStringForBulkCopy(Boolean.parseBoolean(value));
Copy link
Member

@rene-ye rene-ye Oct 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a break at the end of the case statement or it'll overflow into the next one.

@@ -355,6 +355,8 @@ protected static ISQLServerDataSource updateDataSource(String connectionString,
break;
case Constants.SEND_TEMPORAL_DATATYPES_AS_STRING_FOR_BULK_COPY:
ds.setSendTemporalDataTypesAsStringForBulkCopy(Boolean.parseBoolean(value));
case Constants.MAX_RESULT_BUFFER:
ds.setMaxResultBuffer(value);
Copy link
Member

@rene-ye rene-ye Oct 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a break to the end of the this so the above bug doesn't happen again.

…Parser, added check for negative values supplied as maxResultBuffer parameter, they have their own exception, changed default value of MaxResultBuffer to "-1", addressed bad format of InvalidSyntaxException, added corresponding tests to MaxResultBufferParserTest.java
@sawkoj
Copy link
Contributor Author

sawkoj commented Nov 4, 2020

Hi @rene-ye, thank you for your feedback. According to your suggestions, I changed the default value of maxResultBuffer property to "-1", so from now it should be parsed correctly whether value for the property is specified or not. Also, I fixed the error message (I was unnecessarily packaging it as Array). I refactored the parsing logic, so from now it's gonna throw an error upon providing negative values, tests for negative values had been added as well. Of course I didn't forget to add missing break statements in AbstractTest.java.

/**
* Interface for MaxResultBufferCounter
*/
public interface ICounter {
Copy link
Contributor

@ulvii ulvii Nov 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this interface package-private.

//Counter reference, so maxResultBuffer property can by acknowledged
private ICounter counter;

public ICounter getCounter() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Package-private please.

return counter;
}

public void createCounter(ICounter previousCounter, Properties activeConnectionProperties) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

package-private

}

private static void setMaxResultBuffer(String maxResultBuffer) {
connectionString = TestUtils.addOrOverrideProperty(connectionString, "maxResultBuffer", maxResultBuffer);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because connectionString is defined in AbstractTest and is static, this line will change the connection string for all the tests. Please use a local connection string .

@@ -0,0 +1,280 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add more test for different possible values of the connection property, both valid and invalid cases.

…Counter and createCounter methods in IOBuffer.java
…al copy of connectionString (named localConnectionString), so it won't affect connectionString for other tests
…dateMaxResultBuffer method in MaxResultBufferParser.java accordingly
@sawkoj
Copy link
Contributor Author

sawkoj commented Nov 10, 2020

Hi @ulvii, thank you for the feedback. According to your suggestions, I changed access modifiers in ICounter interface and IOBuffer class methods getCounter(), createCounter() as well. In MaxResultBufferTest I added an another private static field, which is used as local connection String. The original connection string is unaffected. Also I added more test cases to MaxResultBufferParserTest (valid and invalid ones). MaxResultBufferTest class was refactored too, I parameterized existing tests and I added different MaxResultBuffer property values to illustrate the change of behavior.

peterbae
peterbae previously approved these changes Nov 23, 2020
@peterbae
Copy link
Contributor

The PR looks good to me. Once the other team members have reviewed & approved, we will re-run the CI tests and merge the PR.


private void checkForMaxResultBufferOverflow(long number) throws SQLServerException {
if (number > maxResultBuffer) {
logger.log(Level.WARNING, "MaxResultBuffer exceeded: {0}. Property was set to {1}.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change to :

if (logger.isLoggable(Level.SEVERE)) {
    logger.log(Level.SEVERE, "MaxResultBuffer exceeded: {0}. Property was set to {1}.",
            new Object[] {number, maxResultBuffer});
}

try {
number = Long.parseLong(input);
} catch (NumberFormatException e) {
logger.log(Level.INFO, ERROR_MESSAGE, new Object[] {input});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (logger.isLoggable(Level.SEVERE)) {
    logger.log(Level.SEVERE, ERROR_MESSAGE, new Object[] {input});
}

try {
number = Long.parseLong(numberString);
} catch (NumberFormatException e) {
logger.log(Level.INFO, ERROR_MESSAGE, new Object[] {input});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (logger.isLoggable(Level.SEVERE)) {
    logger.log(Level.SEVERE, ERROR_MESSAGE, new Object[] {input});
}

try {
number = Long.parseLong(numberString);
} catch (NumberFormatException e) {
logger.log(Level.INFO, ERROR_MESSAGE, new Object[] {input});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (logger.isLoggable(Level.SEVERE)) {
    logger.log(Level.SEVERE, ERROR_MESSAGE, new Object[] {input});
}

multiplier = 1_000_000_000_000L;
break;
default:
logger.log(Level.INFO, ERROR_MESSAGE, new Object[] {input});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (logger.isLoggable(Level.SEVERE)) {
    logger.log(Level.SEVERE, ERROR_MESSAGE, new Object[] {input});
}

{"R_TokenRequireUrl", "Token credentials require a URL using the HTTPS protocol scheme."},};
};
{"R_TokenRequireUrl", "Token credentials require a URL using the HTTPS protocol scheme."},
{"R_maxResultBufferPropertyDescription",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

R_maxResultBufferPropertyDescription should describe what maxResultBuffer connection property does, please check R_useFmtOnlyPropertyDescription for reference. I'd suggest to create another String resource for MaxResultBuffer property exceeded: {0}. MaxResultBuffer was set to: {1}. and use R_maxResultBufferPropertyDescription for the actual description of the connection property.

…rParser, added description to R_maxResultBufferPropertyDescription in SQLServerResource, added another string resource (R_maxResultBufferPropertyExceeded) in SQLServerResource
@sawkoj
Copy link
Contributor Author

sawkoj commented Nov 25, 2020

Hi @peterbae and @ulvii, thanks for feedback. Loggers were adjusted in MaxResultBufferParser and in MaxResultBufferCounter. I created additional resource: R_maxResultBufferPropertyExceeded when MaxResultBuffer is exceeded. Description of R_maxResultBufferPropertyDescription is updated too.

@ulvii ulvii added this to the 9.1.1 milestone Nov 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Public API Changes in Public API
Projects
MSSQL JDBC
  
Closed/Merged PRs
Development

Successfully merging this pull request may close these issues.

None yet

6 participants