Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to use 'group by and having clause' like sql? #333

Open
namndev opened this issue Jan 7, 2016 · 4 comments
Open

How to use 'group by and having clause' like sql? #333

namndev opened this issue Jan 7, 2016 · 4 comments
Labels

Comments

@namndev
Copy link

namndev commented Jan 7, 2016

Some times, To Query faster I have use 'group by and having clause in sql'.
But in Cubes, I don't known to this work.

  • I find in sql/browser/aggregation_statement() line below:
statement = sql.expression.select(selection,
                                          from_obj=context.star,
                                          use_labels=True,
                                          whereclause=condition,
                                          group_by=group_by)

That: Do not using HAVING.

@namndev namndev closed this as completed Jan 8, 2016
@namndev namndev reopened this Jan 8, 2016
@namndev
Copy link
Author

namndev commented Jan 8, 2016

I try edit this libs bellow:

1. In cubes/cell.py

    def __init__(self, cube=None, cuts=None,  having_clauses=None):
        if not isinstance(cube, Cube):
            raise ArgumentError("Cell cube should be sublcass of Cube, "
                                "provided: %s" % type(cube).__name__)
        self.cube = cube
        self.cuts = cuts if cuts is not None else []
        self.having_clauses = having_clauses if having_clauses is not None else []

    def __and__(self, other):
        """Returns a new cell that is a conjunction of the two provided
        cells. The cube has to match."""
        if self.cube != other.cube:
            raise ArgumentError("Can not combine two cells from different "
                                "cubes '%s' and '%s'."
                                % (self.name, other.name))
        cuts = self.cuts + other.cuts
        having_clauses = self.having_clauses + other.having_clauses
        return Cell(self.cube, cuts=cuts, having_clause=having_clauses)

    def to_dict(self):
        """Returns a dictionary representation of the cell"""
        result = {
            "cube": str(self.cube.name),
            "cuts": [cut.to_dict() for cut in self.cuts]
            "having_clauses": [clause.to_dict() for clause in self.having_clauses]
        }
        return result

2. In cubes/sql/query.py
add 3 methods to end line:

    def clause_for_having(self, cell):
        """Returns a clause for having clause and attr for group. If cell is empty, not contain having or cell is
        `None` then returns `None`."""

        if not cell:
            return None

        clauses = self.clauses_for_having(cell.having_clauses)
        condition = and_(*clauses["condition"])
        clauses["condition"] = condition
        return clauses

    def clauses_for_having(self, having_clauses):
        clauses = []
        groups = []
        for cut in having_clauses:
            hierarchy = str(cut.hierarchy) if cut.hierarchy else None
            if isinstance(cut, PointCut):
                path = cut.path
                hav_conds = self.having_condition(str(cut.dimension),
                                                     path,
                                                     hierarchy, cut.invert)
            clauses.append(hav_conds["condition"])
            groups += hav_conds["group"]
        # return one dict
        dict_clause = {"groups": groups, "condition": clauses}
        return dict_clause

    def having_condition(self, dim, path, hierarchy=None, invert=False):
        """Returns a dict of `Condition` tuple (`attributes`, `conditions`,
        `group_by`) dimension `dim` point at `path` and list group attrs use having. It is a compound
        condition - one equality condition for each path element in form:
        ``level[i].key = path[i]``"""
        conditions = []
        groups = []
        levels = self.level_keys(dim, hierarchy, path)
        for level_key, value in zip(levels, path):
            # Prepare condition: dimension.level_key = path_value
            column = self.column(level_key)
            conditions.append(column == value)
            groups.append(column)

        condition = sql.expression.and_(*conditions)

        if invert:
            condition = sql.expression.not_(condition)

        dict_condition = {"group": groups, "condition": condition}
        return dict_condition

3. In cubes/sql/browser.py
At method aggregation_statement edit line:

statement = sql.expression.select(selection,
                                          from_obj=context.star,
                                          use_labels=True,
                                          whereclause=condition,
                                          group_by=group_by)

to

        # HAVING
        # ------
        having_clauses = context.clause_for_having(cell)
        havings = having_clauses["condition"]
        group_clauses = having_clauses["groups"]
        if group_by is None:
            group_by = []
        for group in group_clauses:
            if group not in group_by:
                group_by.append(group)
        statement = sql.expression.select(selection,
                                          from_obj=context.star,
                                          use_labels=True,
                                          whereclause=condition,
                                          group_by=group_by,
                                          having=havings)

I only using to PointCut in Having Clause because my needs just had so

@jjmontesl
Copy link
Contributor

Thanks for this. This feature shall include unit tests. Could you please provide some?

@jjmontesl
Copy link
Contributor

Also, a further explanation of the use case would be welcome to better understand this patch.

@Stiivi
Copy link
Member

Stiivi commented Jun 7, 2016

As @jjmontesl suggested, can you please provide plain SQL use-case examples which demonstrate the difference with and without the proposed feature?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants