_Before reading this, make sure you're familiar with the concepts from the_
[Query language guide](../ql).

The query language works well when you want to explore your data interactively. But
sometimes you might want to go further and build your queries in Python code. This is
useful when you want to parametrize your queries, build custom functions around Dictum,
or expose your metric store as a web API.

Dictum provides several helpers for this use case.

Metrics and dimensions can be retrieved by name from the corresponding attributes
(`metrics` and `dimensions`) of the `Project` object. Calculations can be accessed by
their ID both as attributes and as dict keys.

In [3]:
from dictum import Project

project = Project.example("chinook")
project.metrics["revenue"]

In [4]:
project.dimensions.Year
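
The dual access shown above (dict key or attribute) is a common Python pattern. Here is a minimal sketch of how such a container could behave, using a hypothetical `CalculationDict` class. It is not part of Dictum and is not meant to describe how Dictum implements this.

```python
class CalculationDict(dict):
    """Hypothetical container: items are reachable as d["id"] and as d.id."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails,
        # so regular dict methods such as .keys() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


# Plain strings stand in for real metric objects here.
metrics = CalculationDict(
    revenue="<metric: revenue>",
    items_sold="<metric: items_sold>",
)

# Both spellings return the same object.
assert metrics["revenue"] is metrics.revenue
print(metrics.revenue)
```

Key access is the safer choice when the ID comes from a variable or is not a valid Python identifier; attribute access is shorter for interactive use.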

The `Project` object has a `select` method that corresponds to the `select` clause of
the query language. It accepts the metrics we want to select. You can pass them as
strings, each one a "piece" of the actual QL query.

In [6]:
project.select("revenue", "revenue.percent as \"pct\"")

|   | Revenue | pct |
|---|---------|-----|
| 0 | $2,328.60 | 100% |


As you can see, this gets awkward as the expressions become more complex, so you can
pass the metric objects from the project directly instead. Transforms are applied
through the corresponding methods of the metric, with or without `()`.

In [8]:
m = project.metrics
project.select(m.revenue, m.revenue.percent)

|   | Revenue | Revenue (%) |
|---|---------|-------------|
| 0 | $2,328.60 | 100% |


To add a grouping, use the `by` method. Dimensions are passed the same way as metrics,
either as a string or as an object.

In [10]:
d = project.dimensions
(
    project.select(m.revenue, m.revenue.percent)
    .by(d.genre)
)

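|   | Genre | Revenue | Revenue (%) |
|---|-------|---------|-------------|
| 0 | Alternative | $13.86 | 1% |
| 1 | Alternative & Punk | $241.56 | 10% |
| 2 | Blues | $60.39 | 3% |
| 3 | Bossa Nova | $14.85 | 1% |
| 4 | Classical | $40.59 | 2% |
| 5 | Comedy | $17.91 | 1% |
| 6 | Drama | $57.71 | 2% |
| 7 | Easy Listening | $9.90 | 0% |
| 8 | Electronica/Dance | $11.88 | 1% |
| 9 | Heavy Metal | $11.88 | 1% |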


There are `where` and `limit` methods too.

In [22]:
select = (
    project.select(m.revenue, m.revenue.percent)
    .by(d.genre)
    .where(d.music, d.Time.year >= 2012)
    .limit(m.revenue.top(5))
)
select

|   | Genre | Revenue | Revenue (%) |
|---|-------|---------|-------------|
| 0 | Alternative & Punk | $94.05 | 11% |
| 1 | Jazz | $27.72 | 3% |
| 2 | Latin | $142.56 | 17% |
| 3 | Metal | $120.78 | 15% |
| 4 | Rock | $336.60 | 40% |
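
An expression like `d.Time.year >= 2012` produces an object describing the comparison, not a Python boolean. In Python this is usually done by overloading the comparison operators. The sketch below illustrates the general technique with hypothetical `Dim` and `Condition` classes. These names and the rendered text are assumptions made for illustration; they are not Dictum's internals or the exact QL syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    """Hypothetical record of a comparison, kept for later rendering."""

    left: str
    op: str
    right: object

    def render(self) -> str:
        return f"{self.left} {self.op} {self.right!r}"


class Dim:
    """Hypothetical dimension reference that records comparisons."""

    def __init__(self, name: str):
        self.name = name

    def __ge__(self, other):
        # Return a description of the comparison instead of True/False.
        return Condition(self.name, ">=", other)

    def __eq__(self, other):
        return Condition(self.name, "=", other)

    # Defining __eq__ disables the default hash; restore identity hashing.
    __hash__ = object.__hash__


condition = Dim("year") >= 2012
print(condition.render())  # year >= 2012
```

Because the comparison is captured as data, a query builder can translate it for the backend later instead of evaluating it in Python.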


The `select` method (as well as `ql`) doesn't return a Pandas DataFrame; it returns a
special query object. In Jupyter, displaying this object runs the query for you, but to
get the DataFrame in code, call the `df` method. By default, the DataFrame is
unformatted. To keep the formatting, pass `format=True` to the method. Note that in this
case, all values will be strings.

In [13]:
type(select)

dictum.project.analyses.Select
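
Returning a query object rather than a result is what allows calls like `by`, `where` and `limit` to be chained: nothing has to run until the result is requested. Jupyter renders objects through hooks such as `_repr_html_`, so a lazy object can execute itself when displayed. Here is a minimal sketch of that pattern with a hypothetical `LazyQuery` class. It illustrates the idea only and is not Dictum's `Select` implementation.

```python
class LazyQuery:
    """Hypothetical query builder that defers execution until needed."""

    def __init__(self, *metrics):
        self.metrics = list(metrics)
        self.dimensions = []
        self.executions = 0  # counts how many times the query actually ran

    def by(self, *dimensions):
        # Each builder method returns self so that calls can be chained.
        self.dimensions.extend(dimensions)
        return self

    def rows(self):
        # Stand-in for running the query against a real backend.
        self.executions += 1
        return [{"columns": self.dimensions + self.metrics}]

    def _repr_html_(self):
        # Jupyter calls this hook to render the object as a cell result,
        # which is the moment the query finally runs.
        return f"<pre>{self.rows()}</pre>"


query = LazyQuery("revenue").by("genre")
assert query.executions == 0  # building the query does not execute it

html = query._repr_html_()  # what Jupyter would do on display
assert query.executions == 1
```

Outside a notebook no display hook fires, which is why code has to ask for the result explicitly.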

In [14]:
df = select.df()
type(df)

pandas.core.frame.DataFrame

In [19]:
df

|   | genre | revenue | revenue__percent |
|---|-------|---------|------------------|
| 0 | Alternative & Punk | 241.56 | 0.114608 |
| 1 | Jazz | 79.2 | 0.037576 |
| 2 | Latin | 382.14 | 0.181306 |
| 3 | Metal | 261.36 | 0.124002 |
| 4 | Rock | 826.65 | 0.392203 |


In [17]:
select.df(format=True)

|   | Genre | Revenue | Revenue (%) |
|---|-------|---------|-------------|
| 0 | Alternative & Punk | $241.56 | 11% |
| 1 | Jazz | $79.20 | 4% |
| 2 | Latin | $382.14 | 18% |
| 3 | Metal | $261.36 | 12% |
| 4 | Rock | $826.65 | 39% |
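
The relationship between the raw and the formatted values above can be checked with Python's own format specifications. This only illustrates how the raw numbers map to the displayed strings. The `fmt_currency` and `fmt_percent` helpers are hypothetical, and Dictum's actual formatting logic may differ.

```python
def fmt_currency(value: float) -> str:
    # Thousands separator and two decimal places, prefixed with a dollar sign.
    return f"${value:,.2f}"


def fmt_percent(value: float) -> str:
    # The % presentation type multiplies by 100 and appends a percent sign.
    return f"{value:.0%}"


# Raw values taken from the unformatted DataFrame above.
raw = [
    ("Alternative & Punk", 241.56, 0.114608),
    ("Jazz", 79.2, 0.037576),
]

formatted = [
    (genre, fmt_currency(revenue), fmt_percent(share))
    for genre, revenue, share in raw
]
for row in formatted:
    print(row)
```

Formatted values are strings, so they no longer sort or sum numerically. The unformatted DataFrame is the better input for further computation.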


`of` and `within` can be passed as keyword arguments to the metric transform, e.g.
`m.revenue.percent(within=[d.Year])`.
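
To make the effect of `within` concrete, here is a plain-Python sketch of a percentage computed within each year: every row is divided by the total of its own year rather than by the grand total. The numbers are made up and the `percent_within` helper is hypothetical. It mirrors the idea behind the transform, not Dictum's implementation.

```python
from collections import defaultdict


def percent_within(rows, group_key, value_key):
    """Return each row's share of the total of its own group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return [row[value_key] / totals[row[group_key]] for row in rows]


# Made-up data: items sold per quarter.
rows = [
    {"year": 2009, "quarter": "Q1", "items": 30},
    {"year": 2009, "quarter": "Q2", "items": 10},
    {"year": 2010, "quarter": "Q1", "items": 25},
    {"year": 2010, "quarter": "Q2", "items": 75},
]

shares = percent_within(rows, "year", "items")
print(shares)  # [0.75, 0.25, 0.25, 0.75]
```

Without the grouping, all four rows would be divided by the grand total of 140 instead.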

Let's write a function that builds a typical query for us. We'll give it the metric
ID, and it will display that metric as a percentage of the yearly total, broken down by
Year and Quarter.

In [21]:
def percentage_of_quarter(metric_id: str):
    metric = project.metrics[metric_id]
    return (
        project.select(metric.percent(within=[project.d.Year]))
        .by(project.d.Year, project.d.Quarter)
    )

percentage_of_quarter("items_sold")

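|   | Year | Quarter | Number of Items Sold (%) |
|---|------|---------|--------------------------|
| 0 | 2009 | Q1 2009 | 25% |
| 1 | 2009 | Q2 2009 | 25% |
| 2 | 2009 | Q3 2009 | 25% |
| 3 | 2009 | Q4 2009 | 25% |
| 4 | 2010 | Q1 2010 | 25% |
| 5 | 2010 | Q2 2010 | 25% |
| 6 | 2010 | Q3 2010 | 25% |
| 7 | 2010 | Q4 2010 | 25% |
| 8 | 2011 | Q1 2011 | 26% |
| 9 | 2011 | Q2 2011 | 26% |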
