Separate SQL Generator Class

The class should be useable by SQLAlchemy applications in two ways: as a statement generator or as a statement consumer. Python applications should be able to provide statements into the query generator as ‘tables’ or should be able to get the generated statement (also influence statement generation) and use the generated star/snowflake for whatever they want. For example for materialization, use in other advanced, more complex customized queries, schema auditing or validation, value auditing, searching, indexing... In other words, output from the query generator should be useable denormalized star/snowflake with well defined interface.

Proposal:

``` python
class StarSchema(object):
```

Represents a star/snowflake table schema. Attributes:
- `name` – user specific name of the star schema, used for the schemaidentification, debug purposes and logging. Has no effect on the execution or statement composition.
- `metadata` is a SQLAlchemy metadata object where the snowflake tables are described.
- `attributes` is a dictionary of snowflake attributes. The keys are attribute names, the values can be strings, dictionaries or objects with specific attributes (see below)
- `joins` is a list of join specification (see below)
- `tables` are SQL Alchemy selectables (tables or statements) that are referenced in the attributes. This dictionary is looked-up first before the actual metadata. Only table name has to be specified and database schema should not be used in this case.
- `schema` – default database schema containing tables

The columns can be specified as:
- a string with format: `column`, `table.column` or `schema.table.column`. When no table is specified, then the fact table is considered.
- as a list of arguments `[[schema,] table,] column`
- `StarColumn` or any object with attributes `schema`, `table`, `column`, `extract`, `unary` can be used.
- a dictionary with keys same as the attributes of `StarColumn` object

Non-object arguments will be stored as a `StarColumn` objects internally.

The joins can be specified as a list of:
- tuples of column specification in form of (`master`, `detail`)
- a dictionary with keys or object with attributes: `master`, `detail`,
  `alias` and `method`.

`master` is a specification of a column in the master table (fact) and `detail` is a specification of a column in the detail table (usually a dimension). `alias` is an alternative name for the `detail` table to be
joined.

The `method` can be: `match` – `LEFT INNER JOIN`, `master` – `LEFT
OUTER JOIN` or `detail` – `RIGHT OUTER JOIN`.

Note: The class should be cubes-independent, despite residing in the cubes repository. It should not use Cubes model objects. Reason is ability to separate it in the future when complexity of the SQL backend grows into extend where it would not be maintainable within the Cubes library.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Separate SQL Generator Class #253

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Separate SQL Generator Class #253

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions