Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSSQL incorrect parametrize types #5268

Open
JakeHamix opened this issue Jul 18, 2022 · 4 comments
Open

MSSQL incorrect parametrize types #5268

JakeHamix opened this issue Jul 18, 2022 · 4 comments

Comments

@JakeHamix
Copy link

JakeHamix commented Jul 18, 2022

Environment

Knex version: 0.95.6
Database + version: SQL Server 2017
OS: Windows/Linux

@smorey2

Bug

We have observed some strange inconsistent performance spikes in a database caused by our application. After some (a lot of) digging around, I have finally determined the cause.

To experienced SQL server users, it's probably nothing new. It's a known implicit convert issue and unlike many others would make you believe, it's not a minor one and in our case ended up being critical.

This can be observed in seemingly random performance problems, where queries take way longer and consume way more resources than expected. While examining the query plan, nothing seems to be fundamentally wrong (except the possible implicit convert warning).

After a while, I managed to isolate this issue only to parameterized queries. Turns out Knex guesses the database type by the parameter variable type while parametrizing the query in _typeForBinding.

_typeForBinding(binding) {
    const Driver = this._driver();

    switch (typeof binding) {
      case 'string':
        return Driver.TYPES.NVarChar;
      case 'boolean':
        return Driver.TYPES.Bit;
      case 'number': {
        if (binding % 1 !== 0) {
          return Driver.TYPES.Float;
        }

        if (binding < SQL_INT4.MIN || binding > SQL_INT4.MAX) {
          if (binding < SQL_BIGINT_SAFE.MIN || binding > SQL_BIGINT_SAFE.MAX) {
            throw new Error(
              `Bigint must be safe integer or must be passed as string, saw ${binding}`
            );
          }

          return Driver.TYPES.BigInt;
        }

        return Driver.TYPES.Int;
      }
      default: {
        // if (binding === null || typeof binding === 'undefined') {
        //   return tedious.TYPES.Null;
        // }

        if (binding instanceof Date) {
          return Driver.TYPES.DateTime;
        }

        if (binding instanceof Buffer) {
          return Driver.TYPES.VarBinary;
        }

        return Driver.TYPES.NVarChar;
      }
    }
  }

So what's happening? It turned out some of our legacy database columns (and notably, some external views) have a column type of VarChar. As you can see in the code above, Knex will parametrize and bind any incoming strings as NVarChar. The database can't compare it directly and will perform a conversion, see here. As the tables grow, this issue is more and more apparent.

Example

Imagine we have some fairly large tables. To keep the queries running smoothly, you will naturally have an index on some columns. In this case, let's just assume a single indexed column varCharColumn. The code can be as simple as:

knex
    .select(columns)
    .from(tableName)
    .where(varCharColumn, 'someStringValue');

This will cause the database to convert the ENTIRE index, which consists of VarChar values, to NVarChar values. This enables it to compare directly to the input value of type NVarChar. Sounds silly, but that's how it is.

Solution

My solution is pretty dirty. I ended up forking Knex, adding an exception to these columns, and hard-coding them by name. Considering them to be legacy and expecting no new ones to be created.

if (typeof binding === 'object' && !!binding && typeof binding.value !== 'undefined') {
          return Driver.TYPES[binding.forceType];
}

Another option is to isolate these cases in your application and input the values as Raw, which the database will handle and issue the correct datatype. Alternatively, you can wrap the value in CAST.

I am not sure where I am going with this. The obvious solution would be to retrieve and cache the column database types on startup and use them afterward, but that's going preeetty far and can be tricky.

At the very least, developers should be informed about how Knex binds the parameter variables - in case they run into similar issues. It took me way longer than I am willing to admit to put this all together.

@kellyrbourg
Copy link
Contributor

+1
We have recently began migrating a large mssql based data access lib to knex and we have legacy and legitimate (managing max index size) reasons to use a mix of both nvarchar and varchar fields. We have noticed the same issues stated here and the impact is significant/unusable.

A few ideas for solutioning:

  • Have a way to have specificType like functionality when we are declaring parameter values for queries so that we can explicitly say which types are nvarchar and which types are varchar.
  • Have a way for users of knex to hook into the built in type mapping at config time so that we can default/extend it to what suites each app best.
  • Create a way to lookup the parameter type based on column name from the schema building.

Our workaround has been to use cast as stated above.

knex
    .select(columns)
    .from(tableName)
    .where(varCharColumn, knex.raw('CAST(? AS VARCHAR(100))', 'someStringValue'));

@kibertoad
Copy link
Collaborator

@kellyrbourg Would you be open to ceate a PR for either 1 or 2?
3 is too intrusive, knex absolutely should not be aware of the schema while building the query

@kellyrbourg
Copy link
Contributor

kellyrbourg commented Aug 6, 2022

Yes, I'm down to take a shot at bullet 2. I think it would have to be something specific to the mssql dialect/config. Is that ok?

@kibertoad
Copy link
Collaborator

Released in 2.3.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants