Added docs and a plot function to the query builder class
twheys committed May 10, 2019
1 parent b02a82e commit 4a7e976
Showing 12 changed files with 262 additions and 293 deletions.
120 changes: 60 additions & 60 deletions README.rst
@@ -31,33 +31,33 @@ Introduction

|Brand| arose out of an environment where several different teams, each working with data sets often with crossover, were individually building their own dashboard platforms. |Brand| was developed as a centralized way of building dashboards without the legwork.

|Brand| is used to create configurations of data sets using |FeatureSlicer| which backs a database table containing analytics and defines sets of |FeatureDimension| and |FeatureMetric|. A |FeatureDimension| is used to group data by properties, such as a timestamp, an account, a device type, etc. A |FeatureMetric| is used to render quanitifiers such as clicks, ROI, conversions into a widget such as a chart or table.
|Brand| is used to configure data sets using a |FeatureDataSet|, which is backed by a database table containing analytics and defines sets of |FeatureDimension| and |FeatureMetric|. A |FeatureDimension| is used to group data by properties, such as a timestamp, an account, a device type, etc. A |FeatureMetric| is used to render quantifiers such as clicks, ROI, or conversions into a widget such as a chart or table.

A |FeatureSlicer| exposes a rich builder API that allows a wide range of queries to be constructed that can be rendered as several widgets. A |FeatureSlicer| can be used directly in a Jupyter_ notebook, eliminating the need to write repetitive custom queries and render the data in visualizations.
A |FeatureDataSet| exposes a rich builder API that allows a wide range of queries to be constructed that can be rendered as several widgets. A |FeatureDataSet| can be used directly in a Jupyter_ notebook, eliminating the need to write repetitive custom queries and render the data in visualizations.

Slicers
-------
Data Sets
---------

|FeatureSlicer| are the core component of |Brand|. A |FeatureSlicer| is a representation of a data set and is used to execute queries and transform result sets into widgets such as charts or tables.
A |FeatureDataSet| is the core component of |Brand|. It is a representation of a data set and is used to execute queries and transform result sets into widgets such as charts or tables.

A |FeatureSlicer| requires only a couple of definitions in order to use: A database connector, a database table, join tables, and dimensions and metrics. Metrics and Dimension definitions tell |Brand| how to query and use data in widgets. Once a slicer is created, it's query API can be used to build queries with just a few lines of code selecting which dimensions and metrics to use and how to filter the data.
A |FeatureDataSet| requires only a few definitions in order to be used: a database connector, a database table, join tables, and dimensions and metrics. Metric and Dimension definitions tell |Brand| how to query and use data in widgets. Once a data set is created, its query API can be used to build queries with just a few lines of code, selecting which dimensions and metrics to use and how to filter the data.

.. _slicer_example_start:
.. _dataset_example_start:

Instantiating a Slicer
""""""""""""""""""""""
Instantiating a Data Set
""""""""""""""""""""""""

.. code-block:: python
from fireant.slicer import *
from fireant.dataset import *
from fireant.database import VerticaDatabase
from pypika import Tables, functions as fn
vertica_database = VerticaDatabase(user='myuser', password='mypassword')
analytics, accounts = Tables('analytics', 'accounts')
my_slicer = Slicer(
# This is the primary database table that our slicer uses
my_dataset = DataSet(
# This is the primary database table that our dataset uses
table=analytics,
# Define the database connection object
@@ -110,48 +110,48 @@ Instantiating a Slicer
],
)
.. _slicer_example_end:
.. _dataset_example_end:

.. _slicer_query_example_start:
.. _dataset_query_example_start:

Building queries with a Slicer
""""""""""""""""""""""""""""""
Building queries with a Data Set
""""""""""""""""""""""""""""""""

Use the ``data`` attribute start building a slicer query. A slicer query allows method calls to be chained together to select what should be included in the result.
Use the ``query`` property of a data set instance to start building a data set query. A data set query allows method calls to be chained together to select what should be included in the result.

This example uses the slicer defined above
This example uses the data set defined above

.. code-block:: python
from fireant import Matplotlib, Pandas, daily
from fireant import Matplotlib, Pandas, day
matplotlib_chart, pandas_df = my_slicer.data \
matplotlib_chart, pandas_df = my_dataset.query \
.dimension(
# Select the date dimension with a daily interval to group the data by the day it applies to
# dimensions are referenced by `slicer.dimensions.{alias}`
my_slicer.dimensions.date(daily),
# dimensions are referenced by `dataset.fields.{alias}`
day(my_dataset.fields.date),
# Select the device_type dimension to break the data down further by which device it applies to
my_slicer.dimensions.device_type,
my_dataset.fields.device_type,
) \
.filter(
# Filter the result set to data from the year 2018
my_slicer.dimensions.date.between(date(2018, 1, 1), date(2018, 12, 31))
my_dataset.fields.date.between(date(2018, 1, 1), date(2018, 12, 31))
) \
# Add a week over week reference to compare data to values from the week prior
.reference(WeekOverWeek(slicer.dimension.date))
.reference(WeekOverWeek(my_dataset.fields.date)) \
.widget(
# Add a matplotlib chart widget
Matplotlib()
# Add axes with series to the chart
.axis(Matplotlib.LineSeries(slicer.metrics.clicks))
.axis(Matplotlib.LineSeries(my_dataset.fields.clicks))
# metrics are referenced by `slicer.metrics.{alias}`
.axis(Matplotlib.ColumnSeries(slicer.metrics.cost, slicer.metrics.revenue))
# metrics are referenced by `dataset.fields.{alias}`
.axis(Matplotlib.ColumnSeries(my_dataset.fields.cost, my_dataset.fields.revenue))
) \
.widget(
# Add a pandas data frame table widget
Pandas(slicer.metrics.clicks, slicer.metrics.cost, slicer.metrics.revenue)
Pandas(my_dataset.fields.clicks, my_dataset.fields.cost, my_dataset.fields.revenue)
) \
.fetch()
@@ -161,7 +161,7 @@ This example uses the slicer defined above
# Display the chart
print(pandas_df)
.. _slicer_query_example_end:
.. _dataset_query_example_end:
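
To make the week-over-week reference in the query above more concrete, the following is a minimal sketch of the comparison it describes, written in plain Python rather than with |Brand|. The function name ``week_over_week`` and the sample figures are made up for illustration, and this is not how |Brand| implements references internally.

```python
from datetime import date, timedelta


def week_over_week(daily_values):
    """Pair each day's value with the value from seven days earlier.

    ``daily_values`` maps a ``date`` to a number. Days with no value
    one week prior get ``None`` as their reference and delta.
    """
    rows = []
    for current_day in sorted(daily_values):
        value = daily_values[current_day]
        reference = daily_values.get(current_day - timedelta(weeks=1))
        delta = None if reference is None else value - reference
        rows.append((current_day, value, reference, delta))
    return rows


# Hypothetical click counts for two consecutive Mondays and one Tuesday.
clicks = {
    date(2018, 1, 1): 100,
    date(2018, 1, 2): 120,
    date(2018, 1, 8): 130,
}

for current_day, value, reference, delta in week_over_week(clicks):
    print(current_day, value, reference, delta)
```

Only the second Monday has a value one week earlier, so it is the only row with a reference and a delta; the other rows carry ``None`` for both.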

License
-------
@@ -207,43 +207,43 @@ Crafted with ♥ in Berlin.

.. |Brand| replace:: *fireant*

.. |FeatureSlicer| replace:: *Slicer*
.. |FeatureDataSet| replace:: *Data Set*
.. |FeatureMetric| replace:: *Metric*
.. |FeatureDimension| replace:: *Dimension*
.. |FeatureFilter| replace:: *Filter*
.. |FeatureReference| replace:: *Reference*
.. |FeatureOperation| replace:: *Operation*

.. |ClassSlicer| replace:: ``fireant.Slicer``
.. |ClassDataSet| replace:: ``fireant.DataSet``
.. |ClassDatabase| replace:: ``fireant.database.Database``
.. |ClassJoin| replace:: ``fireant.slicer.joins.Join``
.. |ClassMetric| replace:: ``fireant.slicer.metrics.Metric``

.. |ClassDimension| replace:: ``fireant.slicer.dimensions.Dimension``
.. |ClassBooleanDimension| replace:: ``fireant.slicer.dimensions.BooleanDimension``
.. |ClassContDimension| replace:: ``fireant.slicer.dimensions.ContinuousDimension``
.. |ClassDateDimension| replace:: ``fireant.slicer.dimensions.DatetimeDimension``
.. |ClassCatDimension| replace:: ``fireant.slicer.dimensions.CategoricalDimension``
.. |ClassUniqueDimension| replace:: ``fireant.slicer.dimensions.UniqueDimension``
.. |ClassDisplayDimension| replace:: ``fireant.slicer.dimensions.DisplayDimension``

.. |ClassFilter| replace:: ``fireant.slicer.filters.Filter``
.. |ClassComparatorFilter| replace:: ``fireant.slicer.filters.ComparatorFilter``
.. |ClassBooleanFilter| replace:: ``fireant.slicer.filters.BooleanFilter``
.. |ClassContainsFilter| replace:: ``fireant.slicer.filters.ContainsFilter``
.. |ClassExcludesFilter| replace:: ``fireant.slicer.filters.ExcludesFilter``
.. |ClassRangeFilter| replace:: ``fireant.slicer.filters.RangeFilter``
.. |ClassPatternFilter| replace:: ``fireant.slicer.filters.PatternFilter``
.. |ClassAntiPatternFilter| replace:: ``fireant.slicer.filters.AntiPatternFilter``

.. |ClassWidget| replace:: ``fireant.slicer.widgets.base.Widget``
.. |ClassPandasWidget| replace:: ``fireant.slicer.widgets.pandas.Pandas``
.. |ClassHighChartsWidget| replace:: ``fireant.slicer.widgets.highcharts.HighCharts``
.. |ClassHighChartsSeries| replace:: ``fireant.slicer.widgets.highcharts.Series``

.. |ClassReference| replace:: ``fireant.slicer.references.Reference``

.. |ClassOperation| replace:: ``fireant.slicer.operations.Operation``
.. |ClassJoin| replace:: ``fireant.dataset.joins.Join``
.. |ClassMetric| replace:: ``fireant.dataset.fields.Field``

.. |ClassDimension| replace:: ``fireant.dataset.fields.Field``
.. |ClassBooleanDimension| replace:: ``fireant.dataset.dimensions.BooleanDimension``
.. |ClassContDimension| replace:: ``fireant.dataset.dimensions.ContinuousDimension``
.. |ClassDateDimension| replace:: ``fireant.dataset.dimensions.DatetimeDimension``
.. |ClassCatDimension| replace:: ``fireant.dataset.dimensions.CategoricalDimension``
.. |ClassUniqueDimension| replace:: ``fireant.dataset.dimensions.UniqueDimension``
.. |ClassDisplayDimension| replace:: ``fireant.dataset.dimensions.DisplayDimension``

.. |ClassFilter| replace:: ``fireant.dataset.filters.Filter``
.. |ClassComparatorFilter| replace:: ``fireant.dataset.filters.ComparatorFilter``
.. |ClassBooleanFilter| replace:: ``fireant.dataset.filters.BooleanFilter``
.. |ClassContainsFilter| replace:: ``fireant.dataset.filters.ContainsFilter``
.. |ClassExcludesFilter| replace:: ``fireant.dataset.filters.ExcludesFilter``
.. |ClassRangeFilter| replace:: ``fireant.dataset.filters.RangeFilter``
.. |ClassPatternFilter| replace:: ``fireant.dataset.filters.PatternFilter``
.. |ClassAntiPatternFilter| replace:: ``fireant.dataset.filters.AntiPatternFilter``

.. |ClassReference| replace:: ``fireant.dataset.references.Reference``

.. |ClassWidget| replace:: ``fireant.widgets.base.Widget``
.. |ClassPandasWidget| replace:: ``fireant.widgets.pandas.Pandas``
.. |ClassHighChartsWidget| replace:: ``fireant.widgets.highcharts.HighCharts``
.. |ClassHighChartsSeries| replace:: ``fireant.widgets.highcharts.Series``

.. |ClassOperation| replace:: ``fireant.dataset.operations.Operation``

.. |ClassVerticaDatabase| replace:: ``fireant.database.VerticaDatabase``
.. |ClassMySQLDatabase| replace:: ``fireant.database.MySQLDatabase``
20 changes: 18 additions & 2 deletions docs/2_database.rst
@@ -3,7 +3,7 @@ Connecting to the database

In order for |Brand| to connect to your database, a database connector must be used. This takes the form of an instance of a concrete subclass of |Brand|'s ``Database`` class. Database connectors are shipped with |Brand| for all of the supported databases, but it is also possible to write your own. See below on how to extend |Brand| to support additional databases.

To configure a database, instantiate a subclass of |ClassDatabase|. You will use this instance to create a |FeatureSlicer|. It is possible to use multiple databases simultaneous, but |FeatureSlicer| can only use a single database, since they inherently model the structure of a table in the database.
To configure a database, instantiate a subclass of |ClassDatabase|. You will use this instance to create a |FeatureDataSet|. It is possible to use multiple databases simultaneously, but each |FeatureDataSet| can only use a single database, since it inherently models the structure of a table in that database.

Vertica

@@ -78,10 +78,15 @@ Instead of using one of the built in database connectors, you can provide your o
.. code-block:: python
import vertica_python
from pypika import VerticaQuery
from fireant import Database
class MyVertica(Database):
# Vertica client that uses the vertica_python driver.
# Override the custom PyPika Query class (Not necessary but perhaps helpful)
query_cls = VerticaQuery
def __init__(self, host='localhost', port=5433, database='vertica',
user='vertica', password=None,
read_timeout=None):
@@ -105,14 +110,25 @@ Instead of using one of the built in database connectors, you can provide your o
def date_add(self, date_part, interval, field):
return DateAdd(...) # custom DateAdd function
hostage.settings = MyVertica(
Once a Database connector has been set up, it can be used when instantiating |ClassDataSet|.

.. code-block:: python
from fireant import DataSet
my_vertica = MyVertica(
host='example.com',
port=5433,
database='example',
user='user',
password='password123',
)
DataSet(
database=my_vertica,
...
)
In a custom database connector, the ``connect`` function must be overridden to provide a ``connection`` to the database.
The ``trunc_date`` and ``date_add`` functions must also be overridden, since there is no common way to truncate or add dates across SQL databases.
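
The contract described above can be illustrated without a real database server. The sketch below uses a hypothetical ``BaseConnector`` class as a stand-in for |ClassDatabase| and implements the three overrides against SQLite from the Python standard library. The class names, the ``trunc_date`` signature, and the SQL fragments are assumptions made for this example, not |Brand|'s actual interface.

```python
import sqlite3
from abc import ABC, abstractmethod


class BaseConnector(ABC):
    """Hypothetical stand-in for a database connector base class."""

    @abstractmethod
    def connect(self):
        """Return a DB-API connection."""

    @abstractmethod
    def trunc_date(self, field, interval):
        """Return SQL that truncates ``field`` to ``interval``."""

    @abstractmethod
    def date_add(self, date_part, interval, field):
        """Return SQL that adds ``interval`` units of ``date_part`` to ``field``."""


class SQLiteConnector(BaseConnector):
    # strftime patterns that SQLite can use to truncate a timestamp
    _formats = {"day": "%Y-%m-%d", "month": "%Y-%m-01", "year": "%Y-01-01"}

    def __init__(self, path=":memory:"):
        self.path = path

    def connect(self):
        return sqlite3.connect(self.path)

    def trunc_date(self, field, interval):
        return "strftime('{}', {})".format(self._formats[interval], field)

    def date_add(self, date_part, interval, field):
        # SQLite date modifiers look like '+7 days'
        return "date({}, '{:+d} {}s')".format(field, interval, date_part)


connector = SQLiteConnector()
connection = connector.connect()
sql = "SELECT {}, {}".format(
    connector.trunc_date("'2018-03-15 10:30:00'", "month"),
    connector.date_add("day", 7, "'2018-03-15'"),
)
row = connection.execute(sql).fetchone()
connection.close()
print(row)
```

Because the three methods are abstract, a subclass that forgets one of them cannot be instantiated, which mirrors the requirement that a custom connector must provide all of them.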

58 changes: 58 additions & 0 deletions docs/3_dataset.rst
@@ -0,0 +1,58 @@
Creating a |FeatureDataSet|
===========================

A |FeatureDataSet| is a definition of a collection of data that can be queried and transformed into widgets. It consists of four main components: A database connector, a primary database table, join tables, and fields. Once a |FeatureDataSet| has been defined, it can be queried to generate a large variety of visualizations.

**Some Definitions**

Database Connection
The database connector is a connection to a database. It contains all of the connection details and supplies the functions for connecting to the database. It also provides helper functions that generate SQL for operations where database platforms differ.

Table
The base table to query from in your database. This is the table that goes in the ``FROM`` clause of the SQL queries generated by |Brand|.

Join
Joins specify how to join additional tables. They are instantiated with another PyPika Table and a PyPika expression on how to join the two tables. Joins can also join based on another join by using an expression that links it to the other join table (see below for an example). |Brand| will automatically determine which joins are necessary on a per query basis.

Field
Fields are the bread and butter of |Brand|. They define what types of data are available and are ultimately what is referenced when building up queries. Fields are defined with a PyPika expression.
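
As a rough illustration of how the necessary joins could be worked out per query, the sketch below walks from the tables that a query's fields reference back to the base table and collects only the joins on that path. It is plain Python with hypothetical names (``required_joins``, the ``joins`` mapping, the ``regions`` table) and is not |Brand|'s actual resolution logic.

```python
def required_joins(base_table, joins, field_tables):
    """Return the joined tables a query needs, in dependency order.

    ``joins`` maps each joinable table to the table its join criterion
    depends on (the base table or another joined table).
    ``field_tables`` lists the tables referenced by the selected fields.
    """
    needed = []
    for table in field_tables:
        chain = []
        # Walk back toward the base table, collecting each hop.
        while table != base_table:
            chain.append(table)
            table = joins[table]
        # Add hops closest to the base table first, skipping repeats.
        for hop in reversed(chain):
            if hop not in needed:
                needed.append(hop)
    return needed


# 'regions' joins through 'customers', which joins to the base table.
joins = {"customers": "analytics", "regions": "customers"}

print(required_joins("analytics", joins, ["analytics"]))
print(required_joins("analytics", joins, ["regions"]))
```

A query that only touches the base table needs no joins, while a query that selects a field from ``regions`` pulls in ``customers`` as well, because the ``regions`` join depends on it.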

Code Example
------------

.. code-block:: python
from fireant.dataset import *
from fireant.database.vertica import VerticaDatabase
from pypika import Tables, functions as fn
vertica_database = VerticaDatabase(user='jane_doe', password='strongpassword123')
analytics, customers = Tables('analytics', 'customers')
dataset = DataSet(
database=vertica_database,
table=analytics,
joins=[
Join(customers, analytics.customer_id == customers.id),
],
fields=[
# Non-aggregate definition
Field(alias='customer',
definition=customers.id,
label='Customer'),
# Date/Time type, also non-aggregate
Field(alias='date',
definition=analytics.timestamp,
type=DataType.date,
label='Date'),
# Aggregate definition (The SUM function aggregates a group of values into a single value)
Field(alias='clicks',
definition=fn.Sum(analytics.clicks),
label='Clicks'),
],
)
.. include:: ../README.rst
:start-after: _appendix_start:
:end-before: _appendix_end:
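
The difference between aggregate and non-aggregate field definitions in the example above shows up in the SQL that a query eventually runs. The sketch below uses SQLite from the standard library and hand-written SQL to illustrate the idea: the non-aggregate date expression ends up in ``GROUP BY``, while ``SUM(clicks)`` collapses each group into one value. The table contents are made up, and the statement is only an approximation of the kind of query |Brand| would generate.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE analytics (timestamp TEXT, customer_id INTEGER, clicks INTEGER)"
)
connection.executemany(
    "INSERT INTO analytics VALUES (?, ?, ?)",
    [
        ("2018-01-01 09:00:00", 1, 10),
        ("2018-01-01 17:30:00", 2, 5),
        ("2018-01-02 08:15:00", 1, 7),
    ],
)

# date(timestamp) plays the role of the non-aggregate 'date' field,
# SUM(clicks) the role of the aggregate 'clicks' field.
rows = connection.execute(
    "SELECT date(timestamp) AS day, SUM(clicks) AS clicks "
    "FROM analytics GROUP BY date(timestamp) ORDER BY date(timestamp)"
).fetchall()
connection.close()

print(rows)
```

Each distinct day yields one row, with the click counts of all matching records summed into a single value.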
