Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ModelPaginator and current version of SQLAlchemy #116

Open
ods opened this issue Mar 28, 2013 · 3 comments
Open

ModelPaginator and current version of SQLAlchemy #116

ods opened this issue Mar 28, 2013 · 3 comments

Comments

@ods
Copy link
Contributor

ods commented Mar 28, 2013

It's a pity, but now Query.count() in SQLAlchemy produces very inefficient SQL. See https://bitbucket.org/zzzeek/sqlalchemy/commits/9e7714afcaf5adc4100bce25416c0651ed13644f for details. We have to find a solution how to deal with it. Using patched SA doesn't seem a good way.

@ods
Copy link
Contributor Author

ods commented Mar 28, 2013

Temporal solution is to patch query.py:

--- a/sqlalchemy/orm/query.py   Wed Mar 27 21:07:15 2013 +0400
+++ b/sqlalchemy/orm/query.py   Thu Mar 28 17:47:05 2013 +0400
@@ -2542,44 +2542,116 @@ class Query(object):
                 kwargs.get('offset') is not None or
                 kwargs.get('distinct', False))

+    #def count(self):
+    #    """Return a count of rows this Query would return.
+
+    #    This generates the SQL for this Query as follows::
+
+    #        SELECT count(1) AS count_1 FROM (
+    #            SELECT <rest of query follows...>
+    #        ) AS anon_1
+
+    #    .. versionchanged:: 0.7
+    #        The above scheme is newly refined as of 0.7b3.
+
+    #    For fine grained control over specific columns
+    #    to count, to skip the usage of a subquery or
+    #    otherwise control of the FROM clause,
+    #    or to use other aggregate functions,
+    #    use :attr:`~sqlalchemy.sql.expression.func` expressions in conjunction
+    #    with :meth:`~.Session.query`, i.e.::
+
+    #        from sqlalchemy import func
+
+    #        # count User records, without
+    #        # using a subquery.
+    #        session.query(func.count(User.id))
+
+    #        # return count of user "id" grouped
+    #        # by "name"
+    #        session.query(func.count(User.id)).\\
+    #                group_by(User.name)
+
+    #        from sqlalchemy import distinct
+
+    #        # count distinct "name" values
+    #        session.query(func.count(distinct(User.name)))
+
+    #    """
+    #    col = sql.func.count(sql.literal_column('*'))
+    #    return self.from_self(col).scalar()
+
+    # XXX Old count() implementation from SA revision 6135:032e956e982e
     def count(self):
         """Return a count of rows this Query would return.

-        This generates the SQL for this Query as follows::
-
-            SELECT count(1) AS count_1 FROM (
-                SELECT <rest of query follows...>
-            ) AS anon_1
-
-        .. versionchanged:: 0.7
-            The above scheme is newly refined as of 0.7b3.
-
-        For fine grained control over specific columns
-        to count, to skip the usage of a subquery or
-        otherwise control of the FROM clause,
-        or to use other aggregate functions,
-        use :attr:`~sqlalchemy.sql.expression.func` expressions in conjunction
-        with :meth:`~.Session.query`, i.e.::
-
-            from sqlalchemy import func
-
-            # count User records, without
-            # using a subquery.
-            session.query(func.count(User.id))
-
-            # return count of user "id" grouped
-            # by "name"
-            session.query(func.count(User.id)).\\
-                    group_by(User.name)
-
-            from sqlalchemy import distinct
-
-            # count distinct "name" values
-            session.query(func.count(distinct(User.name)))
+        For simple entity queries, count() issues
+        a SELECT COUNT, and will specifically count the primary
+        key column of the first entity only.  If the query uses 
+        LIMIT, OFFSET, or DISTINCT, count() will wrap the statement 
+        generated by this Query in a subquery, from which a SELECT COUNT
+        is issued, so that the contract of "how many rows
+        would be returned?" is honored.
+
+        For queries that request specific columns or expressions, 
+        count() again makes no assumptions about those expressions
+        and will wrap everything in a subquery.  Therefore,
+        ``Query.count()`` is usually not what you want in this case.
+        To count specific columns, often in conjunction with 
+        GROUP BY, use ``func.count()`` as an individual column expression
+        instead of ``Query.count()``.  See the ORM tutorial
+        for an example.

         """
-        col = sql.func.count(sql.literal_column('*'))
-        return self.from_self(col).scalar()
+        should_nest = [self._should_nest_selectable]
+        def ent_cols(ent):
+            if isinstance(ent, _MapperEntity):
+                return ent.mapper.primary_key
+            else:
+                should_nest[0] = True
+                return [ent.column]
+
+        return self._col_aggregate(sql.literal_column('1'), sql.func.count,
+            nested_cols=chain(*[ent_cols(ent) for ent in self._entities]),
+            should_nest = should_nest[0]
+        )
+
+    def _col_aggregate(self, col, func, nested_cols=None, should_nest=False):
+        context = QueryContext(self)
+
+        for entity in self._entities:
+            entity.setup_context(self, context)
+
+        if context.from_clause:
+            from_obj = list(context.from_clause)
+        else:
+            from_obj = context.froms
+
+        if self._enable_single_crit:
+            self._adjust_for_single_inheritance(context)
+
+        whereclause  = context.whereclause
+
+        if should_nest:
+            if not nested_cols:
+                nested_cols = [col]
+            else:
+                nested_cols = list(nested_cols)
+            s = sql.select(nested_cols, whereclause, 
+                        from_obj=from_obj, use_labels=True,
+                        **self._select_args)
+            s = s.alias()
+            s = sql.select(
+                [func(s.corresponding_column(col) or col)]).select_from(s)
+        else:
+            s = sql.select([func(col)], whereclause, from_obj=from_obj,
+            **self._select_args)
+
+        if self._autoflush and not self._populate_existing:
+            self.session._autoflush()
+        return self.session.scalar(s, params=self._params,
+            mapper=self._mapper_zero())
+    # XXX End of old count() implementation

     def delete(self, synchronize_session='evaluate'):
         """Perform a bulk delete query.

@riffm
Copy link
Contributor

riffm commented Mar 28, 2013

@ods can we subclass instead of patching?

@ods
Copy link
Contributor Author

ods commented Mar 29, 2013

@riffm The only difference I see is that subclassing gives us false feeling that we can safely update library with new version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants