Skip to content

Commit

Permalink
Clarify comments/documentation regarding DBIC/SQLA/DQ relationship
Browse files Browse the repository at this point in the history
This discussion comes up several times per year, just put it to bed for good
  • Loading branch information
ribasushi committed Oct 15, 2015
1 parent 8920356 commit 07fadea
Show file tree
Hide file tree
Showing 4 changed files with 191 additions and 44 deletions.
123 changes: 112 additions & 11 deletions lib/DBIx/Class/SQLMaker.pm
Original file line number Diff line number Diff line change
Expand Up @@ -9,13 +9,13 @@ DBIx::Class::SQLMaker - An SQL::Abstract-based SQL maker class
=head1 DESCRIPTION
This module is a subclass of L<SQL::Abstract> and includes a number of
DBIC-specific workarounds, not yet suitable for inclusion into the
This module is currently a subclass of L<SQL::Abstract> and includes a number of
DBIC-specific extensions/workarounds, not suitable for inclusion into the
L<SQL::Abstract> core. It also provides all (and more than) the functionality
of L<SQL::Abstract::Limit>, see L<DBIx::Class::SQLMaker::LimitDialects> for
more info.
Currently the enhancements to L<SQL::Abstract> are:
Currently the enhancements over L<SQL::Abstract> are:
=over
Expand All @@ -25,10 +25,102 @@ Currently the enhancements to L<SQL::Abstract> are:
=item * C<GROUP BY>/C<HAVING> support (via extensions to the order_by parameter)
=item * A rudimentary multicolumn IN operator
=item * Support of C<...FOR UPDATE> type of select statement modifiers
=back
=head1 ROADMAP
Some maintainer musings on the current state of SQL generation within DBIC as
of Oct 2015
=head2 Folding of most (or all) of L<SQL::Abstract (SQLA)|SQL::Abstract> into DBIC
The rise of complex prefetch use, and the general streamlining of result
parsing within DBIC ended up pushing the actual SQL generation to the forefront
of many casual performance profiles. While the idea behind SQLA's API is sound,
the actual implementation is terribly inefficient (once again bumping into the
ridiculously high overhead of perl function calls).
Given that SQLA has a B<very> distinct life on its own, and is used within an
order of magnitude more projects compared to DBIC, it is prudent to B<not>
disturb the current call chains within SQLA itself. Instead in the near future
an effort will be undertaken to seek a more thorough decoupling of DBIC SQL
generation from reliance on SQLA, possibly to a point where B<DBIC will no
longer depend on SQLA> at all.
B<The L<SQL::Abstract> library itself will continue being maintained> although
it is not likely to gain many extra features, notably dialect support, at least
not within the base C<SQL::Abstract> namespace.
This work (if undertaken) will take into consideration the following
constraints:
=over
=item Main API compatibility
The object returned by C<< $schema->storage->sqlmaker >> needs to be able to
satisfy most of the basic tests found in the current-at-the-time SQLA dist.
While things like L<case|SQL::Abstract/case> or L<logic|SQL::Abstract/logic>
or even worse L<convert|SQL::Abstract/convert> will definitely remain
unsupported, the rest of the tests should pass (within reason).
=item Ability to plug back an SQL::Abstract (or derivative)
During the initial work on L<Data::Query> the test suite of DBIC turned out to
be an invaluable asset to iron out hard-to-reason-about corner cases. In
addition the test suite is much more vast and intricate than the tests of SQLA
itself. This state of affairs is way too valuable to sacrifice in order to gain
faster SQL generation. Thus a compile-time-ENV-check will be introduced along
with an extra CI configuration to ensure that DBIC is used with an off-the-CPAN
SQLA and that it continues to flawlessly run its entire test suite. While this
will undoubtedly complicate the implementation of the better performing SQL
generator, it will preserve both the usability of the test suite for external
projects and will keep L<SQL::Abstract> from regressions in the future.
=back
Aside from these constraints it is becoming more and more practical to simply
stop using SQLA in day-to-day production deployments of DBIC. The flexibility
of the internals is simply not worth the performance cost.
=head2 Relationship to L<Data::Query (DQ)|Data::Query>
When initial work on DQ was taking place, the tools in L<::Storage::DBIHacks
|http://github.com/dbsrgits/dbix-class/blob/current/blead/lib/DBIx/Class/Storage/DBIHacks.pm>
were only beginning to take shape, and it wasn't clear how important they will
become further down the road. In fact the I<regexing all over the place> was
considered an ugly stop-gap, and even a couple of highly entertaining talks
were given to that effect. As the use-cases of DBIC were progressing, and
evidence for the importance of supporting arbitrary SQL was mounting, it became
clearer that DBIC itself would not really benefit in any way from an
integration with DQ, but on the contrary is likely to lose functionality while
the corners of the brand new DQ codebase are sanded off.
The current status of DBIC/DQ integration is that the only benefit is for DQ by
having access to the very extensive "early adopter" test suite, in the same
manner as early DBIC benefitted tremendously from usurping the Class::DBI test
suite. As far as the DBIC user-base - there are no immediate practical upsides
to DQ integration, neither in terms of API nor in performance.
So (as described higher up) the DBIC development effort will in the foreseable
future ignore the existence of DQ, and will continue optimizing the preexisting
SQLA-based solution, potentially "organically growing" its own compatible
implementation. Also (again, as described higher up) the ability to plug a
separate SQLA-compatible class providing the necessary surface API will remain
possible, and will be protected at all costs in order to continue providing DQ
access to the test cases of DBIC.
In the short term, after one more pass over the ResultSet internals is
undertaken I<real soon now (tm)>, and before the SQLA/SQLMaker integration
takes place, the preexisting DQ-based branches will be pulled/modified/rebased
to get up-to-date with the current state of the codebase, which changed very
substantially since the last migration effort, especially in the SQL
classification meta-parsing codepath.
=cut

use base qw/
Expand Down Expand Up @@ -146,8 +238,9 @@ sub select {
if( $limiter = $self->can ('emulate_limit') ) {
carp_unique(
'Support for the legacy emulate_limit() mechanism inherited from '
. 'SQL::Abstract::Limit has been deprecated, and will be removed when '
. 'DBIC transitions to Data::Query. If your code uses this type of '
. 'SQL::Abstract::Limit has been deprecated, and will be removed at '
. 'some future point, as it gets in the way of architectural and/or '
. 'performance advances within DBIC. If your code uses this type of '
. 'limit specification please file an RT and provide the source of '
. 'your emulate_limit() implementation, so an acceptable upgrade-path '
. 'can be devised'
Expand Down Expand Up @@ -211,9 +304,9 @@ sub insert {
# optimized due to hotttnesss
# my ($self, $table, $data, $options) = @_;

# SQLA will emit INSERT INTO $table ( ) VALUES ( )
# FIXME SQLA will emit INSERT INTO $table ( ) VALUES ( )
# which is sadly understood only by MySQL. Change default behavior here,
# until SQLA2 comes with proper dialect support
# until we fold the extra pieces into SQLMaker properly
if (! $_[2] or (ref $_[2] eq 'HASH' and !keys %{$_[2]} ) ) {
my @bind;
my $sql = sprintf(
Expand Down Expand Up @@ -294,7 +387,10 @@ sub _recurse_fields {
# things in the SQLA space need to have more info about the $rs they
# create SQL for. The alternative would be to keep expanding the
# signature of _select with more and more positional parameters, which
# is just gross. All hail SQLA2!
# is just gross.
#
# FIXME - this will have to transition out to a subclass when the effort
# of folding the SQLA machinery into SQLMaker takes place
sub _parse_rs_attrs {
my ($self, $arg) = @_;

Expand Down Expand Up @@ -495,9 +591,14 @@ sub _join_condition {
return $self->_recurse_where($cond);
}

# This is hideously ugly, but SQLA does not understand multicol IN expressions
# FIXME TEMPORARY - DQ should have native syntax for this
# moved here to raise API questions
# !!! EXPERIMENTAL API !!! WILL CHANGE !!!
#
# This is rather odd, but vanilla SQLA does not have support for multicolumn IN
# expressions
# Currently has only one callsite in ResultSet, body moved into this subclass
# of SQLA to raise API questions like:
# - how do we convey a list of idents...?
# - can binds reside on lhs?
#
# !!! EXPERIMENTAL API !!! WILL CHANGE !!!
sub _where_op_multicolumn_in {
Expand Down
4 changes: 2 additions & 2 deletions lib/DBIx/Class/SQLMaker/LimitDialects.pm
Original file line number Diff line number Diff line change
Expand Up @@ -273,8 +273,8 @@ EOS
# method, and the slower BETWEEN query is used instead
#
# FIXME - this is quite expensive, and does not perform caching of any sort
# as soon as some of the DQ work becomes viable consider switching this
# over
# as soon as some of the SQLA-inlining work becomes viable consider adding
# some rudimentary caching support
if (
$rs_attrs->{order_by}
and
Expand Down
20 changes: 13 additions & 7 deletions lib/DBIx/Class/Storage/DBI/mysql.pm
Original file line number Diff line number Diff line change
Expand Up @@ -44,13 +44,19 @@ sub _prep_for_execute {
return $self->next::method(@_) if ( $_[0] eq 'select' or $_[0] eq 'insert' );


# FIXME FIXME FIXME - this is a terrible, gross, incomplete hack
# it should be trivial for mst to port this to DQ (and a good
# exercise as well, since we do not yet have such wide tree walking
# in place). For the time being this will work in limited cases,
# mainly complex update/delete, which is really all we want it for
# currently (allows us to fix some bugs without breaking MySQL in
# the process, and is also crucial for Shadow to be usable)
# FIXME FIXME FIXME - this is a terrible, gross, incomplete, MySQL-specific
# hack but it works rather well for the limited amount of actual use cases
# which can not be done in any other way on MySQL. This allows us to fix
# some bugs without breaking MySQL support in the process and is also
# crucial for more complex things like Shadow to be usable
#
# This code is just a pre-analyzer, working in tandem with ::SQLMaker::MySQL,
# where the possibly-set value of {_modification_target_referenced_re} is
# used to demarcate which part of the final SQL to double-wrap in a subquery.
#
# This is covered extensively by "offline" tests, so when the DQ work
# resumes, this will get flagged. Afaik there are no AST-visitor code of that
# magnitude yet (Oct 2015) within DQ, so a good exercise overall.

# extract the source name, construct modification indicator re
my $sm = $self->sql_maker;
Expand Down
88 changes: 64 additions & 24 deletions lib/DBIx/Class/Storage/DBIHacks.pm
Original file line number Diff line number Diff line change
Expand Up @@ -2,9 +2,24 @@ package #hide from PAUSE
DBIx::Class::Storage::DBIHacks;

#
# This module contains code that should never have seen the light of day,
# does not belong in the Storage, or is otherwise unfit for public
# display. The arrival of SQLA2 should immediately obsolete 90% of this
# This module contains code supporting a battery of special cases and tests for
# many corner cases pushing the envelope of what DBIC can do. When work on
# these utilities began in mid 2009 (51a296b402c) it wasn't immediately obvious
# that these pieces, despite their misleading on-first-sighe-flakiness, will
# become part of the generic query rewriting machinery of DBIC, allowing it to
# both generate and process queries representing incredibly complex sets with
# reasonable efficiency.
#
# Now (end of 2015), more than 6 years later the routines in this class have
# stabilized enough, and are meticulously covered with tests, to a point where
# an effort to formalize them into user-facing APIs might be worthwhile.
#
# An implementor working on publicizing and/or replacing the routines with a
# more modern SQL generation framework should keep in mind that pretty much all
# existing tests are constructed on the basis of real-world code used in
# production somewhere.
#
# Please hack on this responsibly ;)
#

use strict;
Expand Down Expand Up @@ -346,27 +361,53 @@ sub _adjust_select_args_for_complex_prefetch {
});
}

# This is totally horrific - the {where} ends up in both the inner and outer query
# Unfortunately not much can be done until SQLA2 introspection arrives, and even
# then if where conditions apply to the *right* side of the prefetch, you may have
# to both filter the inner select (e.g. to apply a limit) and then have to re-filter
# the outer select to exclude joins you didn't want in the first place
# FIXME: The {where} ends up in both the inner and outer query, i.e. *twice*
#
# This is rather horrific, and while we currently *do* have enough
# introspection tooling available to attempt a stab at properly deciding
# whether or not to include the where condition on the outside, the
# machinery is still too slow to apply it here.
# Thus for the time being we do not attempt any sanitation of the where
# clause and just pass it through on both sides of the subquery. This *will*
# be addressed at a later stage, most likely after folding the SQL generator
# into SQLMaker proper
#
# OTOH it can be seen as a plus: <ash> (notes that this query would make a DBA cry ;)
#
return $outer_attrs;
}

# This is probably the ickiest, yet most relied upon part of the codebase:
# this is the place where we take arbitrary SQL input and break it into its
# constituent parts, making sure we know which *sources* are used in what
# *capacity* ( selecting / restricting / grouping / ordering / joining, etc )
# Although the method is pretty horrific, the worst thing that can happen is
# for a classification failure, which in turn will result in a vocal exception,
# and will lead to a relatively prompt fix.
# The code has been slowly improving and is covered with a formiddable battery
# of tests, so can be considered "reliably stable" at this point (Oct 2015).
#
# I KNOW THIS SUCKS! GET SQLA2 OUT THE DOOR SO THIS CAN DIE!
# A note to implementors attempting to "replace" this - keep in mind that while
# there are multiple optimization avenues, the actual "scan literal elements"
# part *MAY NEVER BE REMOVED*, even if it is limited only ot the (future) AST
# nodes that are deemed opaque (i.e. contain literal expressions). The use of
# blackbox literals is at this point firmly a user-facing API, and is one of
# *the* reasons DBIC remains as flexible as it is. In other words, when working
# on this keep in mind that the following is widespread and *encouraged* way
# of using DBIC in the wild when push comes to shove:
#
# $rs->search( {}, {
# select => \[ $random, @stuff],
# from => \[ $random, @stuff ],
# where => \[ $random, @stuff ],
# group_by => \[ $random, @stuff ],
# order_by => \[ $random, @stuff ],
# } )
#
# Various incarnations of the above are reflected in many of the tests. If one
# gets to fail, you get to fix it. A "this is crazy, nobody does that" is not
# acceptable going forward.
#
# Due to a lack of SQLA2 we fall back to crude scans of all the
# select/where/order/group attributes, in order to determine what
# aliases are needed to fulfill the query. This information is used
# throughout the code to prune unnecessary JOINs from the queries
# in an attempt to reduce the execution time.
# Although the method is pretty horrific, the worst thing that can
# happen is for it to fail due to some scalar SQL, which in turn will
# result in a vocal exception.
sub _resolve_aliastypes_from_select_args {
my ( $self, $attrs ) = @_;

Expand Down Expand Up @@ -678,19 +719,16 @@ sub _group_over_selection {
# of the external order and convert them to MIN(X) for ASC or MAX(X)
# for DESC, and group_by the root columns. The end result should be
# exactly what we expect

# FIXME - this code is a joke, will need to be completely rewritten in
# the DQ branch. But I need to push a POC here, otherwise the
# pesky tests won't pass
# wrap any part of the order_by that "responds" to an ordering alias
# into a MIN/MAX
#
$sql_maker ||= $self->sql_maker;
$order_chunks ||= [
map { ref $_ eq 'ARRAY' ? $_ : [ $_ ] } $sql_maker->_order_by_chunks($attrs->{order_by})
];

my ($chunk, $is_desc) = $sql_maker->_split_order_chunk($order_chunks->[$o_idx][0]);

# we reached that far - wrap any part of the order_by that "responded"
# to an ordering alias into a MIN/MAX
$new_order_by[$o_idx] = \[
sprintf( '%s( %s )%s',
($is_desc ? 'MAX' : 'MIN'),
Expand Down Expand Up @@ -1041,7 +1079,9 @@ sub _extract_colinfo_of_stable_main_source_order_by_portion {
# resultset {where} stacks
#
# FIXME - while relatively robust, this is still imperfect, one of the first
# things to tackle with DQ
# things to tackle when we get access to a formalized AST. Note that this code
# is covered by a *ridiculous* amount of tests, so starting with porting this
# code would be a rather good exercise
sub _collapse_cond {
my ($self, $where, $where_is_anded_array) = @_;

Expand Down

0 comments on commit 07fadea

Please sign in to comment.