# Dialog metrics

This example notebook shows how to use Objectiv on Dialog data. It reuses many pieces from the [example notebooks](https://objectiv.io/docs/modeling/example_notebooks/) section of our docs. The docs also contain the overall reference for the [open model hub](https://objectiv.io/docs/modeling/models) and [Bach](https://objectiv.io/docs/modeling/bach).

## Getting started
### Import the required packages for this notebook
The open model hub package can be installed with `pip install objectiv-modelhub` (this installs Bach as well).  
If you are running this notebook from our quickstart, the model hub and Bach are already installed, so you don't have to install them separately.

In [None]:
from modelhub import ModelHub
from bach import display_sql_as_markdown

First, we instantiate the model hub and the Objectiv DataFrame object.

In [None]:
# instantiate the model hub
modelhub = ModelHub(time_aggregation='YYYY-MM-DD')

# get the Bach DataFrame with Dialog data > REPLACE WITH YOUR PG CREDENTIALS
df = modelhub.get_objectiv_dataframe(db_url='',
                                     start_date='2022-03-01',
                                     table_name='')

The `global_contexts` and `location_stack` columns contain most of the event-specific data. Both are JSON columns,
and we can extract data from them by key using the methods of `SeriesGlobalContexts` (the `.gc` accessor) and `SeriesLocationStack` (the `.ls` accessor).

In [None]:
# adding specific contexts to the data as columns
df['application'] = df.global_contexts.gc.application
df['feature_nice_name'] = df.location_stack.ls.nice_name
df['root_location'] = df.location_stack.ls.get_from_context_with_type_series(type='RootLocationContext', key='id')
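
To make the lookup in the cell above more concrete, here is a minimal plain-Python sketch of what "get a key from the context with a given type" means for a single event. It does not use Bach. The `_type` and `id` keys, the sample values, and the choice to return the first match are assumptions made for illustration.

```python
# Hypothetical location stack of one event: a list of context objects.
# The `_type` and `id` keys and all values are assumed for this sketch.
location_stack = [
    {"_type": "RootLocationContext", "id": "home"},
    {"_type": "NavigationContext", "id": "main-menu"},
    {"_type": "LinkContext", "id": "docs"},
]


def get_from_context_with_type(stack, context_type, key):
    """Return `key` from the first context of `context_type`, or None."""
    for context in stack:
        if context.get("_type") == context_type:
            return context.get(key)
    return None


root_location = get_from_context_with_type(location_stack, "RootLocationContext", "id")
print(root_location)
```

In the notebook the same kind of lookup runs in the database for every row, so the result is a new column rather than a single value.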

## Active users today per hour


In [None]:
# model hub: active users today per hour
users_today = modelhub.aggregate.unique_users(df[df.day == df.day.max()], groupby=modelhub.time_agg(df, 'YYYY-MM-DD-HH'))
users_today.head() 

In [None]:
# get the SQL
display_sql_as_markdown(users_today)

## Daily active users

In [None]:
# model hub: unique users, daily
daily_users = modelhub.aggregate.unique_users(df)
daily_users.sort_index(ascending=False).head()

In [None]:
# get the SQL
display_sql_as_markdown(daily_users)
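
As a rough illustration of what a unique-users aggregation computes, the sketch below counts distinct users per time bucket on a handful of made-up events, using only the standard library. The event records and the `unique_users` helper are hypothetical, and the helper takes `strftime` patterns rather than the `'YYYY-MM-DD'` style strings used by the model hub.

```python
from collections import defaultdict
from datetime import datetime

# Made-up events; only the fields needed for the count are included.
events = [
    {"user_id": "u1", "moment": datetime(2022, 3, 1, 9, 0)},
    {"user_id": "u1", "moment": datetime(2022, 3, 1, 17, 30)},
    {"user_id": "u2", "moment": datetime(2022, 3, 1, 10, 15)},
    {"user_id": "u1", "moment": datetime(2022, 3, 2, 8, 45)},
]


def unique_users(events, time_format="%Y-%m-%d"):
    """Count distinct user ids per time bucket (hypothetical helper)."""
    users_per_bucket = defaultdict(set)
    for event in events:
        bucket = event["moment"].strftime(time_format)
        users_per_bucket[bucket].add(event["user_id"])
    return {bucket: len(users) for bucket, users in users_per_bucket.items()}


daily = unique_users(events)             # one bucket per day
monthly = unique_users(events, "%Y-%m")  # one bucket per month
print(daily)
print(monthly)
```

Changing only the bucket format switches between daily and monthly counts, which is the role the `groupby` argument plays in the cells of this notebook.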

## Monthly active users

In [None]:
# model hub: unique users, monthly
monthly_users = modelhub.aggregate.unique_users(df, groupby=modelhub.time_agg(df, 'YYYY-MM'))
monthly_users.head()

In [None]:
# get the SQL
display_sql_as_markdown(monthly_users)

## Daily sessions

In [None]:
# model hub: unique sessions, daily
daily_sessions = modelhub.aggregate.unique_sessions(df)
daily_sessions.sort_index(ascending=False).head()

In [None]:
# get the SQL
display_sql_as_markdown(daily_sessions)

## Monthly sessions

In [None]:
# model hub: unique sessions, monthly
monthly_sessions = modelhub.aggregate.unique_sessions(df, groupby=modelhub.time_agg(df, 'YYYY-MM'))
monthly_sessions.sort_index(ascending=False).head()

In [None]:
# get the SQL
display_sql_as_markdown(monthly_sessions)

## Average daily sessions per user

In [None]:
# use the earlier created users & sessions and calculate average
daily_sessions_user = daily_sessions / daily_users
daily_sessions_user.sort_index(ascending=False).head()

In [None]:
# get the SQL
display_sql_as_markdown(daily_sessions_user)
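
The division above works because both series share the same daily index. A minimal sketch of that idea with plain dictionaries follows. The numbers are invented, and keeping only days present in both inputs is an assumption of this sketch, not a statement about how Bach treats mismatched indexes.

```python
# Invented daily totals, keyed by day.
daily_sessions = {"2022-03-01": 6, "2022-03-02": 3, "2022-03-03": 4}
daily_users = {"2022-03-01": 4, "2022-03-02": 3}


def divide_aligned(numerator, denominator):
    """Divide values that share a key; skip keys missing or zero in `denominator`."""
    return {
        key: numerator[key] / denominator[key]
        for key in numerator
        if denominator.get(key)
    }


sessions_per_user = divide_aligned(daily_sessions, daily_users)
print(sessions_per_user)
```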

## Average monthly sessions per user


In [None]:
# use the earlier created users & sessions and calculate average
monthly_sessions_user = monthly_sessions / monthly_users
monthly_sessions_user.sort_index(ascending=False).head()

In [None]:
# get the SQL
display_sql_as_markdown(monthly_sessions_user)

## Hourly sessions

In [None]:
# model hub: unique sessions, hourly
hourly_sessions = modelhub.aggregate.unique_sessions(df, groupby=modelhub.time_agg(df, 'YYYY-MM-DD-HH'))
hourly_sessions.sort_index(ascending=False).head()

In [None]:
# get the SQL
display_sql_as_markdown(hourly_sessions)

## Average session duration

In [None]:
# model hub: average duration, daily
duration_daily = modelhub.aggregate.session_duration(df)
duration_daily.sort_index(ascending=False).head()

In [None]:
# get the SQL
display_sql_as_markdown(duration_daily)
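
For intuition, the following sketch computes an average session duration per day on made-up events. It assumes that a session's duration is the time between its first and last event, that single-event sessions are left out, and that a session is assigned to the day of its first event. These are simplifying assumptions for illustration and may differ from what the model hub does.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Made-up events; session 3 has a single event and therefore no measurable duration.
events = [
    {"session_id": 1, "moment": datetime(2022, 3, 1, 9, 0)},
    {"session_id": 1, "moment": datetime(2022, 3, 1, 9, 10)},
    {"session_id": 2, "moment": datetime(2022, 3, 1, 12, 0)},
    {"session_id": 2, "moment": datetime(2022, 3, 1, 12, 20)},
    {"session_id": 3, "moment": datetime(2022, 3, 2, 8, 0)},
]


def average_session_duration(events):
    """Average (last event - first event) per day, skipping zero-length sessions."""
    moments_per_session = defaultdict(list)
    for event in events:
        moments_per_session[event["session_id"]].append(event["moment"])

    durations_per_day = defaultdict(list)
    for moments in moments_per_session.values():
        start, end = min(moments), max(moments)
        if end > start:
            durations_per_day[start.strftime("%Y-%m-%d")].append(end - start)

    return {
        day: sum(durations, timedelta()) / len(durations)
        for day, durations in durations_per_day.items()
    }


duration_by_day = average_session_duration(events)
print(duration_by_day)
```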

## Users per feature

In [None]:
# select only user actions, so stack_event_types must be a superset of ['InteractiveEvent']
interactive_events = df[df.stack_event_types >= ['InteractiveEvent']]

# users by feature
users_feature = interactive_events.groupby(['application', 'feature_nice_name', 'event_type']).agg({'user_id':'nunique'})
users_feature.sort_values('user_id_nunique', ascending=False).head()

In [None]:
# get the SQL
display_sql_as_markdown(users_feature)
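
The two steps above (keep only events whose type stack contains `InteractiveEvent`, then count distinct users per group) can be sketched in plain Python as follows. The sample events, the application name and the feature names are invented for illustration.

```python
from collections import defaultdict

# Invented events; `stack_event_types` lists the event type and its parent types.
PRESS = ["AbstractEvent", "InteractiveEvent", "PressEvent"]
LOADED = ["AbstractEvent", "NonInteractiveEvent", "ApplicationLoadedEvent"]
events = [
    {"user_id": "u1", "application": "dialog-website", "feature_nice_name": "Link: about",
     "event_type": "PressEvent", "stack_event_types": PRESS},
    {"user_id": "u1", "application": "dialog-website", "feature_nice_name": "Link: docs",
     "event_type": "PressEvent", "stack_event_types": PRESS},
    {"user_id": "u2", "application": "dialog-website", "feature_nice_name": "Link: docs",
     "event_type": "PressEvent", "stack_event_types": PRESS},
    {"user_id": "u1", "application": "dialog-website", "feature_nice_name": "Root Location: home",
     "event_type": "ApplicationLoadedEvent", "stack_event_types": LOADED},
]


def users_per_feature(events):
    """Count distinct users per (application, feature, event type), most used first."""
    users = defaultdict(set)
    for event in events:
        # Superset check: the type stack must contain 'InteractiveEvent'.
        if set(event["stack_event_types"]) >= {"InteractiveEvent"}:
            key = (event["application"], event["feature_nice_name"], event["event_type"])
            users[key].add(event["user_id"])
    counts = {key: len(user_ids) for key, user_ids in users.items()}
    return dict(sorted(counts.items(), key=lambda item: item[1], reverse=True))


feature_counts = users_per_feature(events)
print(feature_counts)
```

The non-interactive `ApplicationLoadedEvent` is dropped by the filter, so only the two link features remain in the result.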

## Users per root location

In [None]:
# users by root_location
users_root_location = interactive_events.groupby(['application', 'root_location', 'event_type']).agg({'user_id':'nunique'})
users_root_location.sort_values('user_id_nunique', ascending=False).head()

In [None]:
# get the SQL
display_sql_as_markdown(users_root_location)

## First session features

In [None]:
# first, add a column labeling a session as first session to the earlier created interactive_events
interactive_events['is_first_session'] = modelhub.map.is_first_session(interactive_events)

# then, select the features used in this first session
users_first_session_feature = interactive_events[interactive_events.is_first_session == True].groupby(['application', 'feature_nice_name', 'event_type']).agg({'user_id':'nunique'})
users_first_session_feature.sort_values('user_id_nunique', ascending=False).head()

In [None]:
# get the SQL
display_sql_as_markdown(users_first_session_feature)
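
To show what a first-session label looks like per event, here is a small standard-library sketch. It assumes a user's first session is the session of that user's earliest event. The sample events and the helper are hypothetical and only mirror the idea, not the model hub's implementation.

```python
from datetime import datetime

# Invented events: u1 has two sessions, u2 has one.
events = [
    {"user_id": "u1", "session_id": 10, "moment": datetime(2022, 3, 1, 9, 0)},
    {"user_id": "u1", "session_id": 10, "moment": datetime(2022, 3, 1, 9, 5)},
    {"user_id": "u1", "session_id": 11, "moment": datetime(2022, 3, 2, 9, 0)},
    {"user_id": "u2", "session_id": 12, "moment": datetime(2022, 3, 2, 10, 0)},
]


def is_first_session(events):
    """Return one boolean per event: True if it belongs to the user's first session."""
    first_session = {}
    for event in sorted(events, key=lambda e: e["moment"]):
        # Only the earliest event of each user sets that user's first session id.
        first_session.setdefault(event["user_id"], event["session_id"])
    return [event["session_id"] == first_session[event["user_id"]] for event in events]


labels = is_first_session(events)
print(labels)
```

Filtering on these labels before grouping gives the first-session variants of the earlier metrics.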

## First session duration

In [None]:
# label first sessions on the full data, including non-interactive events, so the whole session counts towards its duration
df['is_first_session'] = modelhub.map.is_first_session(df)

# then, use the model hub to calculate average session duration
first_session_duration = modelhub.aggregate.session_duration(df[df.is_first_session == True])
first_session_duration.sort_index(ascending=False).head()

In [None]:
# get the SQL
display_sql_as_markdown(first_session_duration)