New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving support for SQL Server 2008 #5616

Merged
merged 17 commits into from May 13, 2016

Conversation

4 participants
@tahosa
Contributor

tahosa commented Mar 21, 2016

This fixes most of the problems related to #5492, and #4404 with regard to support for older versions of SQL Server. It also updates the Docker image to the latest NodeJS stable and adds an npm task for running integration tests in Docker.

The main piece of this update is to alter the assumption that all database engines have the parts of a query in the same order (e.g. column definitions, then table definitions, then where conditions, then joins, then limits, then offsets). SQL Server 2008 has a different ordering, so the existing query builder could not build valid LIMIT and OFFSET type queries.

I pulled the query assembler into an overridable method which remains unchanged for all dialects except mssql. For mssql, there is now version handling to generate queries which work for SQL server 2008 and earlier as well as for 2012 and newer using the better new syntax.

Unit and integration tests were run against all dialects, and against 3 different versions of SQL Server Express (2008, 2012, 2014) and all passed. Where relevant, some tests were modified to ensure complex functionality was tested (e.g. WHERE clauses in nested includes).

@tahosa

This comment has been minimized.

Contributor

tahosa commented Mar 24, 2016

@mickhansen Any comments on this request? Just checking to see if there's anything that needs to change to get this merged.

@mickhansen

This comment has been minimized.

Contributor

mickhansen commented Mar 29, 2016

Sorry for not getting back to you @tahosa, this change brings up a few questions in regards to how we should organize the code.

In general i'm not a whole lot happy about having the order code be duplicated inside limit. I don't know much about SQL Server but apparently you need to enforce an order if there's a limit? But perhaps that could be done earlier as part of a normalization step instead.

@tahosa

This comment has been minimized.

Contributor

tahosa commented Mar 29, 2016

The way it was implemented, yes, an order clause is necessary to use OFFSET FETCH according to the TechNet docs. However, I'm not entirely sure why the special handling is there for the subqueries. I'm doing some testing now to see what pieces of the method can be removed for the newer versions of SQL server.

For 2008 and earlier though, the problem remains because the OFFSET FETCH mechanism doesn't exist. The new code I've added definitely compounds the problem of having code be duplicated due to how the queries must be constructed to get proper behavior for offsets. Limits are much more cleanly handled and most of the complex code in my PR is to handle the offset problem. I had considered creating another dialect for older versions of SQL server, but since 90% or more of the code is the same, I wasn't particularly happy with that approach either.

@tahosa

This comment has been minimized.

Contributor

tahosa commented Mar 29, 2016

After some tinkering and testing, it looks like that ripped code block is necessary, unless we can otherwise pass more information to the function. Since an ORDER BY clause is required for OFFSET FETCH to work, we have to add one if it isn't present. Since having ORDER BY on a subquery counts, that block of code just checks if any of the nested elements have an order statement, and adds one if none do.

I think the current approach is the most straightforward way of implementing this. I can maybe see extracting the ORDER BY build into a re-usable method so we don't have the code duplicated. Is this what you're talking about when you say normalizing it, or are you thinking more along the lines of rejecting the query if it has a limit/offset, but no order?

Randall Crock added some commits Mar 29, 2016

@mickhansen

This comment has been minimized.

Contributor

mickhansen commented Mar 30, 2016

At a higher level than the query generator (likely the query interface), i would parse the options. If limit and no order by, add a default order of the primary key.

That way the query generator doesn't have to care about adding a potential missing order, it just adds the limit.

@tahosa

This comment has been minimized.

Contributor

tahosa commented Mar 30, 2016

That approach would work for the Server 2012 and newer implementation, but we still need the actual order clause (if any) in order to support the older versions. Because we have to write a quite hackish nested query to get offsets to work, we at least need the ordering there to make sure the correct rows are returned for the offset.

@tahosa

This comment has been minimized.

Contributor

tahosa commented Mar 30, 2016

After some experimentation, it looks like implementing it as you describe isn't particularly simple. Since the ordering needs to be applied everywhere OFFSET or LIMIT is needed, including sub-queries, simply adding it at the parent level isn't good enough. It also seems to have some problems when the primary key is aliased, but that may be due to my naive implementation:

// Added to lib/query-interface.js at line 660 inside definition for select()
switch (this.sequelize.options.dialect) {
  case 'mssql': // Add default order if limit or offset is used
    if((options.limit || options.offset) && !options.order) {
      options.order = [[model.primaryKeyField, 'ASC']];
    }
    break;
}
@mickhansen

This comment has been minimized.

Contributor

mickhansen commented Mar 31, 2016

Ah yeah, field aliasing is definitely a problem. We'll have to revisit this issue when i refactor how we handle renaming of fields.

We'll have to consider this carefully, i feel it adds a lot of ugly code to the codebase.
LIMIT doesn't really do that much when using ORDER BY (since unless you have a index in the same direction you need a full table scan), we could consider slicing client side, although that would not be great for when you have billions of rowso f course.

@tahosa

This comment has been minimized.

Contributor

tahosa commented Mar 31, 2016

Is there anything in particular you'd like cleaned up in this PR? With the latest changes, I was at least able to eliminate the code duplication to find the orders, which I think does make it a lot cleaner. I agree though that the query to get SQL 2008 working is not very pretty. I can clean up the formatting and comment it more thoroughly to make it clear what it's doing for better maintainability.

@mickhansen

This comment has been minimized.

Contributor

mickhansen commented Apr 1, 2016

Mostly how selectFragmentBuilder is currently handled by calling it for string matches of SELECT.
I'm not sure if some of the code could be refactored to not write SELECT directly and instead use a method and then MSSQL could wrap that.

@tahosa

This comment has been minimized.

Contributor

tahosa commented Apr 1, 2016

I'm not sure if I understand what you're saying about selectSqlFragmentBuilder. From my reading of your comment, that's already how it behaves. I created that method as part of this work to extract direct building of the first part of SQL statements from selectQuery. I did this precisely to be able to override it from the mssql query generator. All the other dialects still just use the implementation I extracted in the abstract query generator.

I feel this is a little clunky due to how selectQuery is currently written. It's an extremely complex method that I think needs to be broken down into overridable sections to make it easier to follow and make improvements. If I can find the time, I can look into such a refactor.

@djshitcoin

This comment has been minimized.

djshitcoin commented Apr 1, 2016

Thanks, just today I stumbled into the same issues as the authors of the referenced issues. Hope it gets merged soon 👍

@mickhansen mickhansen self-assigned this Apr 4, 2016

@tahosa

This comment has been minimized.

Contributor

tahosa commented Apr 11, 2016

@mickhansen, any more comments or problems stopping this from merging?

@mickhansen

This comment has been minimized.

Contributor

mickhansen commented Apr 12, 2016

@tahosa No i'm sorry i've just been super busy :(

@mickhansen

This comment has been minimized.

Contributor

mickhansen commented Apr 12, 2016

/cc @janmeier

@mickhansen

This comment has been minimized.

Contributor

mickhansen commented Apr 12, 2016

The wrapping code is sort of wracking my brain, we definitely need some unit testing around it so we don't break it.

Randall Crock added some commits Apr 12, 2016

@tahosa

This comment has been minimized.

Contributor

tahosa commented Apr 12, 2016

I've added unit tests for the weird query, as well as some more documentation about what it's doing and cleaning up the formatting a bit.

@tahosa

This comment has been minimized.

Contributor

tahosa commented May 12, 2016

Bumping this again. Any other updates that are needed before this can be merged?

@mickhansen

This comment has been minimized.

Contributor

mickhansen commented May 12, 2016

This is going to sound bikesheddy, but i generally just don't like the approach with selectSqlFragmentBuilder, but i've been too swamped to think of better way myself.

/cc @janmeier

//console.log(`Replacing '${mainQueryItems[0]}' with '${string}'`)
mainQueryItems[i] = string;
break;
}

This comment has been minimized.

@mickhansen

mickhansen May 12, 2016

Contributor

What happens here exactly? You attempt to find the first query item containg a select statement and then replace? Shouldn't there be an easier way of doing this.

This comment has been minimized.

@tahosa

tahosa May 12, 2016

Contributor

You are correct, more or less about what this is for. It was originally designed to handle nesting queries, but since that's handled already by the recursive subquery calls, I'll simplify it.

@mickhansen

This comment has been minimized.

Contributor

mickhansen commented May 12, 2016

Can we rename selectSqlFragmentBuilder to selectFromTableFragment? If we can resolve that and perhaps the loop i guess we can merge - Although i won't like it, but i do definitely see the value in fixing this for a lot of MSSQL dialects.

@tahosa

This comment has been minimized.

Contributor

tahosa commented May 12, 2016

I've changed the function name and clarified what the loop is doing and how it works. I agree that this is less than ideal, but I think it would take a significant reorganization of the selectQuery method to fix the root cause.

The problem stems from the assumption that queries must always be constructed in the order SELECT -> FROM -> WHERE -> ORDER -> OFFSET -> LIMIT. Older versions of MSSQL aren't constructed in this order, and instead are more like SELECT -> LIMIT -> FROM -> WHERE -> ORDER and OFFSET doesn't have a keyword so must be implied through the query's structure.

If I can get some time, I will consider tackling the job of breaking selectQuery down into smaller, more manageable chunks which will allow for a cleanup of the selectFromTableFragment.

@mickhansen

This comment has been minimized.

Contributor

mickhansen commented May 13, 2016

I've had on my own list that we need to break down selectQuery into smaller functions aswell, so if you can find time to start on that it would be greatly appreciated @tahosa.

As for this PR i think we can live with it for now, just waiting on word form @janmeier

@janmeier janmeier merged commit 7af83eb into sequelize:master May 13, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

janmeier added a commit that referenced this pull request May 13, 2016

@janmeier

This comment has been minimized.

Member

janmeier commented May 13, 2016

Thanks a bunch @tahosa ! (:

@janmeier

This comment has been minimized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment