The main way to retrieve data from a Dictum project is the Python API. You can explore
your metrics in Jupyter or retrieve the data as a pandas DataFrame for further analysis.

In this and other example notebooks, we'll use the
[Chinook sample database](https://github.com/lerocha/chinook-database) and the
corresponding Dictum example project.

In [1]:
from dictum import Project

project = Project.example("chinook")

## Exploring the project

You interact with the project through the resulting `project` object. Let's first see
which metrics and dimensions it contains.

In [2]:
revenue = project.model.metrics["revenue"]
revenue

Metric(revenue)

In [3]:
revenue.str_expr

'$revenue'

In [4]:
revenue.name

'Revenue'

In [5]:
revenue.format

FormatConfig(kind='currency', pattern=None, skeleton=None, currency='USD')

In [21]:
import altair as alt

project.chart().mark_area().encode(
    x=alt.X(project.dimensions.Quarter, timeUnit="yearquarter"),
    y=project.m.revenue,
    color=project.d.customer_country,
)

In [3]:
project.select("revenue.percent").by("genre").where("music")

Unnamed: 0,Genre,Revenue
0,Alternative,1%
1,Alternative & Punk,11%
2,Blues,3%
3,Bossa Nova,1%
4,Classical,2%
5,Easy Listening,0%
6,Electronica/Dance,1%
7,Heavy Metal,1%
8,Hip Hop/Rap,1%
9,Jazz,4%


In [6]:
(
    project.chart()
    .mark_bar()
    .encode(
        x=alt.X(project.m.revenue),
        y=alt.Y(project.d.genre),
    )
)

In [3]:
project.dimensions

In [4]:
project.metrics

Now we can start querying Dictum. Metrics and dimensions are available as attributes of
`project.metrics` and `project.dimensions`. To make them easier to use interactively,
it helps to bind them to shorter names.

In [5]:
m, d = project.metrics, project.dimensions

## Selecting a metric

Let's start with the total value of the `Revenue` metric.

In [6]:
project.select(m.revenue)

Unnamed: 0,Revenue
0,"$2,328.60"


Metrics and dimensions can be specified as objects, like above, or by passing their
identifier as a string.

In [7]:
project.select("revenue")  # this is the same as the above

Unnamed: 0,Revenue
0,"$2,328.60"


Notice that the values are formatted according to the format specification of the
`revenue` metric.

In [8]:
m.revenue.calculation.format

FormatConfig(kind='currency', pattern=None, skeleton=None, currency='USD')
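
As a rough illustration of what a `currency` format with `currency='USD'` produces, the sketch below uses only Python's built-in format specification. The helper name `format_usd` is hypothetical, and this is not necessarily how Dictum renders values; it only reproduces the shape of the output shown above.

```python
def format_usd(value: float) -> str:
    """Format a number as US dollars with thousands separators and two decimals."""
    return f"${value:,.2f}"


print(format_usd(2328.6))  # $2,328.60
```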

## Grouping by dimension

We can group the metric by any dimension that's applicable for it.

In [9]:
project.select(m.revenue).by(d.genre)

Unnamed: 0,Genre,Revenue
0,Alternative,$13.86
1,Alternative & Punk,$241.56
2,Blues,$60.39
3,Bossa Nova,$14.85
4,Classical,$40.59
5,Comedy,$17.91
6,Drama,$57.71
7,Easy Listening,$9.90
8,Electronica/Dance,$11.88
9,Heavy Metal,$11.88


## Time dimensions

What about dates? A date is just like any other dimension. Let's group revenue by the
generic `Year` dimension.

In [10]:
project.select(m.revenue).by(d.Year)

Unnamed: 0,Invoice Date,Revenue
0,2009,$449.46
1,2010,$481.45
2,2011,$469.58
3,2012,$477.53
4,2013,$450.58


## Filtering

In addition to grouping, we can filter the data by any dimension.

In [11]:
(
    project.select(m.revenue)
    .by(d.invoice_date.datetrunc("year"))
    .where(d.genre == "Rock")
)

Unnamed: 0,Invoice Date,Revenue
0,2009,$178.20
1,2010,$155.43
2,2011,$156.42
3,2012,$162.36
4,2013,$174.24
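
Conceptually, a query like the one above restricts the rows first and then aggregates what is left per group. The plain-Python sketch below mimics that with made-up invoice lines; the function name `revenue_by_year` and the data are assumptions for illustration only and say nothing about how Dictum executes the query.

```python
from collections import defaultdict
from datetime import date

# Made-up invoice lines: (invoice_date, genre, amount)
rows = [
    (date(2009, 1, 5), "Rock", 0.99),
    (date(2009, 3, 9), "Jazz", 0.99),
    (date(2010, 7, 1), "Rock", 1.98),
    (date(2010, 8, 2), "Rock", 0.99),
]


def revenue_by_year(rows, genre):
    """Sum amounts per invoice year, keeping only rows of the given genre."""
    totals = defaultdict(float)
    for invoice_date, row_genre, amount in rows:
        if row_genre == genre:  # the "where" step
            totals[invoice_date.year] += amount  # the "by" step
    return {year: round(total, 2) for year, total in totals.items()}


print(revenue_by_year(rows, "Rock"))  # {2009: 0.99, 2010: 2.97}
```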


In [12]:
project.dimensions
# customer_orders_amount

## Dimension transforms

A dimension doesn't always have the exact level of granularity that you want. Take
dates, for example: sometimes you might want to group a metric by month, sometimes by
day or quarter. To avoid creating a separate dimension for every level of granularity,
Dictum provides dimension transforms. Transforms are special built-in functions that you
can apply to a dimension to slightly modify how it works. The most common use of
transforms is to specify date granularity.

You can find a list of all available transforms in the
[query language reference](../../../reference/query_language/).

In [16]:
(
    project.select(m.revenue)
    .by(d.invoice_date.datetrunc("quarter"))
)

Unnamed: 0,Invoice Date,Revenue
0,Q1 2009,$110.88
1,Q2 2009,$112.86
2,Q3 2009,$112.86
3,Q4 2009,$112.86
4,Q1 2010,$143.86
5,Q2 2010,$112.86
6,Q3 2010,$111.87
7,Q4 2010,$112.86
8,Q1 2011,$112.86
9,Q2 2011,$144.86


## Generic dates

There are special time dimensions that we call _generic dates_. They stand in for the
default time dimension of the metric being queried. For example, the default time
dimension for `revenue` is `invoice_date`, so if you query _revenue by quarter_, Dictum
understands that you really mean _revenue by quarter of invoice date_.

In [23]:
(
    project.select(m.revenue)
    .by(d.Quarter)
    .where(d.genre == "Rock", d.Time >= "2012-01-01")
)

Unnamed: 0,Invoice Date,Revenue
0,Q1 2012,$43.56
1,Q2 2012,$48.51
2,Q3 2012,$51.48
3,Q4 2012,$18.81
4,Q1 2013,$29.70
5,Q2 2013,$42.57
6,Q3 2013,$47.52
7,Q4 2013,$54.45
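
One way to picture generic dates is as a lookup: the metric supplies its default time dimension and the generic date supplies the granularity. The toy model below spells that out. The two dictionaries and the function `resolve_generic_date` are assumptions made for illustration and do not reflect Dictum's internals.

```python
# Toy lookup tables for illustration only.
DEFAULT_TIME_DIMENSION = {"revenue": "invoice_date"}
GENERIC_DATE_GRAIN = {"Year": "year", "Quarter": "quarter"}


def resolve_generic_date(metric: str, generic: str) -> str:
    """Spell out a generic date as an explicit dimension plus transform."""
    dimension = DEFAULT_TIME_DIMENSION[metric]
    grain = GENERIC_DATE_GRAIN[generic]
    return f'{dimension}.datetrunc("{grain}")'


print(resolve_generic_date("revenue", "Quarter"))  # invoice_date.datetrunc("quarter")
```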


## Selecting multiple metrics

You can select multiple metrics in the same query, even if they are computed from
different tables in the database. The only limitation is that all dimensions in the query
must be compatible with all the metrics that you selected.

In [29]:
m.track_count

In [30]:
m.revenue

In [26]:
(
    project.select(
        m.revenue,
        m.track_count,
    )
    .by(d.genre)
)

Unnamed: 0,Genre,Revenue,Number of Tracks
0,Alternative,$13.86,40
1,Alternative & Punk,$241.56,332
2,Blues,$60.39,81
3,Bossa Nova,$14.85,15
4,Classical,$40.59,74
5,Comedy,$17.91,17
6,Drama,$57.71,64
7,Easy Listening,$9.90,24
8,Electronica/Dance,$11.88,30
9,Heavy Metal,$11.88,28
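
The compatibility requirement becomes clearer if you picture each metric being computed on its own and the results then being lined up on the shared dimension. The sketch below does this in plain Python with a few values copied from the output above. The `combine` helper is hypothetical and only a mental model, not a description of how Dictum runs the query.

```python
# Per-genre results, as if each metric had been computed on its own.
# The numbers are copied from the output above.
revenue = {"Alternative": 13.86, "Blues": 60.39, "Classical": 40.59}
track_count = {"Alternative": 40, "Blues": 81, "Classical": 74}


def combine(**metrics):
    """Join several {dimension value: metric value} mappings on their keys."""
    keys = sorted(set().union(*metrics.values()))
    return [
        {"genre": key, **{name: values.get(key) for name, values in metrics.items()}}
        for key in keys
    ]


for row in combine(revenue=revenue, track_count=track_count):
    print(row)
```

The join only works because both mappings are keyed by the same dimension, here `genre`.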


## Metric transforms

Metrics can be transformed too. Unlike dimension transforms, these are so-called
_table transforms_: they are only valid in the context of a query with a specific set of
dimensions.

In [34]:
(
    project.select(
        m.revenue,
        # when a transform takes no arguments, parentheses are optional
        m.revenue.percent.name("Revenue (%)"),
        m.revenue.total.name("Revenue (Total)")
    )
    .by(d.genre)
)

Unnamed: 0,Genre,Revenue,Revenue (%),Revenue (Total)
0,Alternative,$13.86,1%,"$2,328.60"
1,Alternative & Punk,$241.56,10%,"$2,328.60"
2,Blues,$60.39,3%,"$2,328.60"
3,Bossa Nova,$14.85,1%,"$2,328.60"
4,Classical,$40.59,2%,"$2,328.60"
5,Comedy,$17.91,1%,"$2,328.60"
6,Drama,$57.71,2%,"$2,328.60"
7,Easy Listening,$9.90,0%,"$2,328.60"
8,Electronica/Dance,$11.88,1%,"$2,328.60"
9,Heavy Metal,$11.88,1%,"$2,328.60"
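
The percentages above can be checked by hand: each one is the genre's revenue divided by the total of $2,328.60, rounded to a whole percent. The sketch below redoes that arithmetic for three rows, with the values copied from the output. The helper `percent_of_total` is hypothetical.

```python
# Revenue per genre, copied from the output above.
revenue_by_genre = {
    "Alternative": 13.86,
    "Alternative & Punk": 241.56,
    "Easy Listening": 9.90,
}
grand_total = 2328.60  # total revenue across all genres


def percent_of_total(value: float, total: float) -> str:
    """Share of the total, rendered as a whole-number percentage."""
    return f"{value / total:.0%}"


for genre, revenue in revenue_by_genre.items():
    print(genre, percent_of_total(revenue, grand_total))
# Alternative 1%
# Alternative & Punk 10%
# Easy Listening 0%
```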


In [36]:
(
    project.select(
        m.revenue,
        m.revenue.percent(within=[d.genre]).name("Revenue (%, genre)"),
        m.revenue.percent.name("Revenue (%, total)")
    )
    .by(d.genre, d.artist)
    .where(d.genre.isin("Rock", "Alternative & Punk"))
)

Unnamed: 0,Genre,Artist,Revenue,"Revenue (%, genre)","Revenue (%, total)"
0,Alternative & Punk,Audioslave,$4.95,2%,0%
1,Alternative & Punk,Body Count,$10.89,5%,1%
2,Alternative & Punk,Faith No More,$32.67,14%,3%
3,Alternative & Punk,Foo Fighters,$4.95,2%,0%
4,Alternative & Punk,Green Day,$32.67,14%,3%
5,Alternative & Punk,JET,$6.93,3%,1%
6,Alternative & Punk,Os Mutantes,$7.92,3%,1%
7,Alternative & Punk,Pearl Jam,$5.94,2%,1%
8,Alternative & Punk,R.E.M.,$24.75,10%,2%
9,Alternative & Punk,R.E.M. Feat. Kate Pearson,$8.91,4%,1%
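
The effect of `within` can be sketched as a change of denominator: instead of dividing by the total of the whole table, each row is divided by the total of its own partition. The plain-Python function below is a minimal model of that idea with made-up numbers; its name and signature are assumptions, not Dictum's API. Note that in the output above the percentages of total appear to be taken over the filtered rows only, since the two selected genres account for all of the revenue in that table.

```python
from collections import defaultdict


def percent(rows, value="revenue", within=()):
    """Share of each row's value in the total of its partition.

    With an empty `within`, every row belongs to one partition: the whole table.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[k] for k in within)] += row[value]
    return [row[value] / totals[tuple(row[k] for k in within)] for row in rows]


# Made-up numbers chosen to keep the arithmetic easy to follow.
rows = [
    {"genre": "Rock", "artist": "A", "revenue": 60.0},
    {"genre": "Rock", "artist": "B", "revenue": 20.0},
    {"genre": "Jazz", "artist": "C", "revenue": 20.0},
]

print(percent(rows))                    # [0.6, 0.2, 0.2]
print(percent(rows, within=["genre"]))  # [0.75, 0.25, 1.0]
```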


In [36]:
project.ql("""
select revenue,
    revenue.total of (artist),
    revenue.total of (artist) within (genre)
where artist = 'Iron Maiden'
by genre, artist
""")

Unnamed: 0,Genre,Artist,Revenue,Revenue.1,Revenue.2
0,Blues,Iron Maiden,$3.96,$138.60,$3.96
1,Heavy Metal,Iron Maiden,$11.88,$138.60,$11.88
2,Metal,Iron Maiden,$69.30,$138.60,$69.30
3,Rock,Iron Maiden,$53.46,$138.60,$53.46
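
Judging from the output above, `revenue.total of (artist)` appears to sum revenue over all rows that share the same artist, while adding `within (genre)` narrows the sum to rows that also share the genre. The sketch below reproduces both columns from the per-row revenue under that reading. The helper `total_by` is hypothetical, and the interpretation is an assumption based only on these numbers.

```python
from collections import defaultdict
from decimal import Decimal

# Iron Maiden revenue per genre, copied from the output above.
rows = [
    ("Blues", "Iron Maiden", Decimal("3.96")),
    ("Heavy Metal", "Iron Maiden", Decimal("11.88")),
    ("Metal", "Iron Maiden", Decimal("69.30")),
    ("Rock", "Iron Maiden", Decimal("53.46")),
]


def total_by(rows, key):
    """For each row, sum revenue over all rows that share the same key."""
    totals = defaultdict(Decimal)
    for row in rows:
        totals[key(row)] += row[2]
    return [totals[key(row)] for row in rows]


per_artist = total_by(rows, key=lambda r: r[1])
per_artist_and_genre = total_by(rows, key=lambda r: (r[0], r[1]))

print(per_artist)            # four entries, each Decimal('138.60')
print(per_artist_and_genre)  # equal to the per-row revenue
```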
