This is one of the Objectiv [example notebooks](https://objectiv.io/docs/modeling/example-notebooks/). These notebooks can run [on your own data](https://objectiv.io/docs/modeling/get-started-in-your-notebook/), or you can instead run the [Demo](https://objectiv.io/docs/home/try-the-demo/) to quickly try them out.

# Marketing Analytics
This example notebook shows how to analyze traffic coming from marketing campaigns, as measured via UTM tags. [See here how to get started in your notebook](https://objectiv.io/docs/modeling/get-started-in-your-notebook/).
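
For readers new to UTM tags: they are plain query parameters on the landing URL. The sketch below is only an illustration using the Python standard library; `extract_utm` and the URL are hypothetical, and it does not show how the Objectiv tracker fills the `marketing` global context.

```python
from urllib.parse import urlparse, parse_qs

UTM_KEYS = ('utm_source', 'utm_medium', 'utm_campaign')


def extract_utm(url):
    """Return the UTM query parameters of a URL; missing ones are None."""
    query = parse_qs(urlparse(url).query)
    # parse_qs returns a list of values per key; take the first one if present
    return {key: query.get(key, [None])[0] for key in UTM_KEYS}


# hypothetical landing URL, for illustration only
landing_url = 'https://example.com/docs/?utm_source=twitter&utm_medium=paid&utm_campaign=launch'
utm = extract_utm(landing_url)
print(utm)
```

A URL without UTM parameters yields `None` for every key, which corresponds to the null `utm_source` values that are filtered out later in this notebook.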

## Setup
We first have to instantiate the model hub and an Objectiv DataFrame object.

In [None]:
# set the timeframe of the analysis
start_date = '2022-06-01'
end_date = None

In [None]:
from modelhub import ModelHub
from bach import DataFrame
import pandas as pd

# instantiate the model hub and set the default time aggregation to daily
# and set the global contexts that will be used in this example
modelhub = ModelHub(time_aggregation='%Y-%m-%d', global_contexts=['http', 'marketing', 'application'])
# get a Bach DataFrame with Objectiv data within a defined timeframe
df = modelhub.get_objectiv_dataframe(start_date=start_date, end_date=end_date)

The `location_stack` column, and the columns taken from the global contexts, contain most of the event-specific data. These columns are JSON typed, and we can extract data from them using the keys of the JSON objects, with [`SeriesLocationStack`](https://objectiv.io/docs/modeling/open-model-hub/api-reference/SeriesLocationStack/SeriesLocationStack/) methods for the location stack, or the `context` accessor for global context columns. See the [open taxonomy example](open-taxonomy-how-to.ipynb#Location-stack-&-global-contexts) for how to use the `location_stack` and global contexts.
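
To make the idea of key-based extraction concrete, here is a minimal plain-Python sketch of looking up a value in a location stack by context type. It assumes a location stack is an ordered list of context objects with `_type` and `id` keys; `first_value_for_type` and the sample stack are hypothetical and are not the model hub implementation.

```python
def first_value_for_type(stack, context_type, key):
    """Return `key` from the first context in `stack` with the given `_type`, else None."""
    for context in stack:
        if context.get('_type') == context_type:
            return context.get(key)
    return None


# hypothetical location stack: an ordered list of context objects
location_stack = [
    {'_type': 'RootLocationContext', 'id': 'home'},
    {'_type': 'NavigationContext', 'id': 'navbar-top'},
    {'_type': 'LinkContext', 'id': 'browse-on-github'},
]

root_location = first_value_for_type(location_stack, 'RootLocationContext', 'id')
print(root_location)
```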

In [None]:
# add `feature_nice_name` and `root_location` as columns, so that we can use them for grouping etc. later
df['feature_nice_name'] = df.location_stack.ls.nice_name
df['root_location'] = df.location_stack.ls.get_from_context_with_type_series(type='RootLocationContext', key='id')

In [None]:
# derive a specific DataFrame with added marketing contexts
df_acquisition = df.copy()
# extract referrer and marketing contexts from the respective global context columns
df_acquisition['referrer'] = df_acquisition.http.context.referrer
df_acquisition['utm_source'] = df_acquisition.marketing.context.source
df_acquisition['utm_medium'] = df_acquisition.marketing.context.medium
df_acquisition['utm_campaign'] = df_acquisition.marketing.context.campaign

In [None]:
# also define a DataFrame with only the sessions that came in via a marketing campaign
campaign_sessions = df_acquisition[~df_acquisition['utm_source'].isnull()]['session_id'].unique()
df_marketing_only = df_acquisition[df_acquisition['session_id'].isin(campaign_sessions)]

In [None]:
# define a further selection: which sources to use in the analyses below
source_selection = ['twitter', 'reddit']
sources = DataFrame.from_pandas(engine=df.engine, df=pd.DataFrame({'sources': source_selection}), convert_objects=True).sources
# filter on defined list of UTM Sources
df_marketing_selection = df_marketing_only[(df_marketing_only.utm_source.isin(sources))]

In [None]:
# materialize the DataFrames as temporary tables to reduce the complexity of the underlying queries
df_acquisition = df_acquisition.materialize(materialization='temp_table')
df_marketing_only = df_marketing_only.materialize(materialization='temp_table')
df_marketing_selection = df_marketing_selection.materialize(materialization='temp_table')

#### Available dataframes:
- `df` = all + `feature_nice_name` + `root_location`.
- `df_acquisition` = `df` + referrer + UTMs
- `df_marketing_only` = `df_acquisition`, but only sessions with non-null `utm_source`.
- `df_marketing_selection` = `df_marketing_only`, but filtered for selection, e.g. only `utm_source` in `{'reddit', 'twitter'}`.
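
The difference between the last two DataFrames can be sketched in plain Python: the first filter works at session level (keep every event of a session that has at least one non-null `utm_source`), the second at row level (keep only rows whose `utm_source` is in the selection). The rows below are made up; whether later events in a session carry UTM values depends on your tracking setup.

```python
# hypothetical events, for illustration only
events = [
    {'session_id': 1, 'event': 'landing', 'utm_source': 'twitter'},
    {'session_id': 1, 'event': 'press', 'utm_source': None},
    {'session_id': 2, 'event': 'landing', 'utm_source': None},
    {'session_id': 3, 'event': 'landing', 'utm_source': 'reddit'},
]
source_selection = {'twitter', 'reddit'}

# session-level filter: sessions with at least one non-null utm_source
campaign_sessions = {e['session_id'] for e in events if e['utm_source'] is not None}
marketing_only = [e for e in events if e['session_id'] in campaign_sessions]

# row-level filter: only rows whose utm_source is in the selection
marketing_selection = [e for e in marketing_only if e['utm_source'] in source_selection]

print(len(marketing_only), len(marketing_selection))
```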

### Reference
* [modelhub.ModelHub](https://objectiv.io/docs/modeling/open-model-hub/api-reference/ModelHub/ModelHub/)
* [modelhub.ModelHub.get_objectiv_dataframe](https://objectiv.io/docs/modeling/open-model-hub/api-reference/ModelHub/get_objectiv_dataframe/)
* [using global context data](open-taxonomy-how-to.ipynb#Location-stack-&-global-contexts)
* [modelhub.SeriesLocationStack.ls](https://objectiv.io/docs/modeling/open-model-hub/api-reference/SeriesLocationStack/ls/)
* [bach.DataFrame.from_pandas](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/from_pandas/)
* [bach.Series.isnull](https://objectiv.io/docs/modeling/bach/api-reference/Series/isnull/)
* [bach.DataFrame.materialize](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/materialize/)

# Acquisition

## Users from marketing

In [None]:
# show daily number of people coming from marketing campaigns
users_from_marketing_daily = modelhub.aggregate.unique_users(df_marketing_selection).sort_index(ascending=False)
users_from_marketing_daily.head()

In [None]:
users_from_marketing_daily.sort_index(ascending=True).to_pandas().plot(kind='bar', figsize=[15,5], title='Daily #users from marketing', xlabel='Day')

## Users per source-medium-campaign over full timeframe

In [None]:
# split users by marketing _campaign_ (based on UTM data)
users_per_campaign = modelhub.aggregate.unique_users(df_marketing_selection, ['utm_source', 'utm_medium', 'utm_campaign'])
users_per_campaign.reset_index().dropna(axis=0, how='any', subset='utm_source').sort_values(['unique_users'], ascending=False).head(10)

In [None]:
# Stacked graph per campaign
upc = users_per_campaign.to_frame().reset_index()[['utm_source', 'utm_campaign', 'unique_users']]
upc = upc.to_pandas().groupby(['utm_source', 'utm_campaign'])
upc_pivot = upc.sum().reset_index().pivot(index='utm_source', columns='utm_campaign')['unique_users'].reset_index().sort_values(by=['utm_source'], ascending=False)
upc_pivot.plot.bar(x='utm_source', stacked=True)

## Users from marketing _source_ per day

In [None]:
# users by marketing _source_, per day
source_users_daily = modelhub.agg.unique_users(df_marketing_selection, groupby=['day', 'utm_source'])
source_users_daily = source_users_daily.reset_index()
source_users_daily.sort_values('day', ascending=False).head(20)

## Users from marketing _campaign_ per day

In [None]:
# users by marketing _campaign_ (based on UTM data), per day
users_per_campaign_daily = modelhub.aggregate.unique_users(df_marketing_selection, ['day', 'utm_source', 'utm_medium', 'utm_campaign'])
users_per_campaign_daily = users_per_campaign_daily.reset_index()
users_per_campaign_daily.sort_values('day', ascending=False).head(20)

## Referrers overall

In [None]:
# users by referrer in full timeframe (overall, including coming from marketing campaigns)
referrer_users = modelhub.agg.unique_users(df_acquisition, groupby=['referrer']).sort_values(ascending=False)
referrer_users.head(20)

### Reference
* [bach.Series.sort_index](https://objectiv.io/docs/modeling/bach/api-reference/Series/sort_index/)
* [bach.Series.to_pandas](https://objectiv.io/docs/modeling/bach/api-reference/Series/to_pandas/)
* [modelhub.Aggregate.unique_users](https://objectiv.io/docs/modeling/open-model-hub/models/aggregation/unique_users/)
* [bach.Series.reset_index](https://objectiv.io/docs/modeling/bach/api-reference/Series/reset_index/)
* [bach.Series.group_by](https://objectiv.io/docs/modeling/bach/api-reference/Series/group_by/)
* [bach.DataFrame.dropna](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/dropna/)
* [bach.DataFrame.to_pandas](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/to_pandas/)
* [bach.Series.to_frame](https://objectiv.io/docs/modeling/bach/api-reference/Series/to_frame/)
* [bach.DataFrame.head](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/head/)

# Conversion
This section looks at conversion overall and from marketing. Conversion in this example is defined as clicking any link on the website or docs to our GitHub repo.
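
As a rough illustration of this definition, the sketch below flags an event as a conversion when it has the given event type and its location stack contains the given context. This is one plausible reading written in plain Python, not the model hub's implementation; `is_conversion` and the sample events are hypothetical.

```python
def is_conversion(event, context, event_type):
    """True if the event has `event_type` and its location stack contains `context`."""
    # a stack item matches when it has all key/value pairs of `context`
    in_stack = any(context.items() <= item.items() for item in event['location_stack'])
    return event['event_type'] == event_type and in_stack


github_link = {'id': 'browse-on-github', '_type': 'LinkContext'}

# hypothetical events, for illustration only
events = [
    {'event_type': 'PressEvent',
     'location_stack': [{'_type': 'RootLocationContext', 'id': 'home'},
                        {'_type': 'LinkContext', 'id': 'browse-on-github'}]},
    {'event_type': 'PressEvent',
     'location_stack': [{'_type': 'RootLocationContext', 'id': 'home'},
                        {'_type': 'LinkContext', 'id': 'docs'}]},
    {'event_type': 'VisibleEvent',
     'location_stack': [{'_type': 'LinkContext', 'id': 'browse-on-github'}]},
]

flags = [is_conversion(e, github_link, 'PressEvent') for e in events]
print(flags)
```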

In [None]:
# define the conversion event in `df_acquisition` and `df_marketing_selection`
# in this example: clicking any link leading to our GitHub repo
# create a column that extracts all location stacks that lead to our GitHub repo
location_stack_conversion = {'id': 'browse-on-github', '_type': 'LinkContext'}
modelhub.add_conversion_event(location_stack=df_acquisition.location_stack.json[location_stack_conversion:],
                              event_type='PressEvent',
                              name='github_press')


modelhub.add_conversion_event(location_stack=df_marketing_selection.location_stack.json[location_stack_conversion:],
                              event_type='PressEvent',
                              name='github_press')

df_acquisition['is_conversion_event'] = modelhub.map.is_conversion_event(df_acquisition, 'github_press')
df_marketing_selection['is_conversion_event'] = modelhub.map.is_conversion_event(df_marketing_selection, 'github_press')

### Reference
* [bach.series.series_json.JsonAccessor](https://objectiv.io/docs/modeling/bach/api-reference/Series/Json/json/)
* [modelhub.ModelHub.add_conversion_event](https://objectiv.io/docs/modeling/open-model-hub/api-reference/ModelHub/add_conversion_event/)
* [modelhub.Map.is_conversion_event](https://objectiv.io/docs/modeling/open-model-hub/models/helper-functions/is_conversion_event/)

## Daily conversions from marketing

In [None]:
# calculate daily conversions from marketing (based on UTM data)
conversions_from_marketing = df_marketing_selection[df_marketing_selection.is_conversion_event].dropna(axis=0, how='any', subset='utm_source')
conversions_from_marketing_daily = modelhub.aggregate.unique_users(conversions_from_marketing).sort_index(ascending=False)
conversions_from_marketing_daily.head()

In [None]:
conversions_from_marketing_daily.sort_index(ascending=True).to_pandas().plot(kind='bar', figsize=[15,5], title='Daily #conversions from marketing', xlabel='Day')

## Daily conversion rate from marketing

In [None]:
# calculate daily conversion rate from marketing campaigns overall
# divide conversions from campaigns by total daily number of people coming from campaigns 
conversion_rate_from_marketing = (conversions_from_marketing_daily / users_from_marketing_daily) * 100
conversion_rate_from_marketing.sort_index(ascending=False).fillna(0.0).head(10)
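
The arithmetic behind the daily rate can be sketched in plain Python as converted users divided by all users per day, times 100, with days without conversions counted as 0. The numbers below are invented for illustration.

```python
# hypothetical daily counts, for illustration only
users_per_day = {'2022-06-01': 40, '2022-06-02': 25, '2022-06-03': 10}
conversions_per_day = {'2022-06-01': 4, '2022-06-03': 1}

# days without any conversion are missing from conversions_per_day;
# treat them as 0, similar to the fillna(0.0) used in this notebook
conversion_rate = {
    day: 100 * conversions_per_day.get(day, 0) / users
    for day, users in users_per_day.items()
}
print(conversion_rate)
```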

In [None]:
conversion_rate_from_marketing.fillna(0.0).sort_index(ascending=True).to_pandas().plot(kind='line', figsize=[15,5], title='Daily conversion rate from marketing', xlabel='Day')

## Daily conversions overall

In [None]:
# calculate daily conversions overall (including from marketing campaigns)
conversions_overall = modelhub.aggregate.unique_users(df_acquisition[df_acquisition.is_conversion_event])
conversions_overall.sort_index(ascending=False).head()

In [None]:
# plot daily conversions overall (including from marketing campaigns)
conversions_overall.to_pandas().plot(kind='bar', figsize=[15,5], title='Daily #conversions overall', xlabel='Day')

### Daily conversion rate overall

In [None]:
# calculate daily conversion rate overall (including from marketing campaigns)
daily_users = modelhub.aggregate.unique_users(df_acquisition).sort_index(ascending=False)
conversion_rate_overall = (conversions_overall / daily_users) * 100
conversion_rate_overall.sort_index(ascending=False).head(10)

In [None]:
conversion_rate_overall.sort_index(ascending=True).fillna(0.0).to_pandas().plot(kind='line', figsize=[15,5], title='Daily conversion rate overall', xlabel='Day')

### Reference
* [modelhub.Map.is_conversion_event](https://objectiv.io/docs/modeling/open-model-hub/models/helper-functions/is_conversion_event/)
* [bach.DataFrame.dropna](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/dropna/)
* [modelhub.Aggregate.unique_users](https://objectiv.io/docs/modeling/open-model-hub/models/aggregation/unique_users/)
* [bach.Series.sort_index](https://objectiv.io/docs/modeling/bach/api-reference/Series/sort_index/)
* [bach.DataFrame.to_pandas](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/to_pandas/)
* [bach.DataFrame.fillna](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/fillna/)
* [bach.DataFrame.head](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/head/)

## Conversion split by source & campaign

### Conversions per marketing _source_ over full timeframe

In [None]:
# calculate conversions per marketing _source_ over the full timeframe (based on UTM data)
campaign_conversions_source_timeframe = modelhub.aggregate.unique_users(df_marketing_selection[df_marketing_selection.is_conversion_event], ['utm_source'])
campaign_conversions_source_timeframe.reset_index().dropna(axis=0, how='any', subset='utm_source').sort_values(['unique_users'], ascending=False).head()

### Conversions per marketing _source_ daily

In [None]:
# split daily conversions by marketing _source_ (based on UTM data)
campaign_conversions_source_daily = modelhub.aggregate.unique_users(df_marketing_selection[df_marketing_selection.is_conversion_event], ['day', 'utm_source'])
campaign_conversions_source_daily.reset_index().dropna(axis=0, how='any', subset='utm_source').set_index('day').sort_index(ascending=False).head(10)

### Conversions per marketing _campaign_ over full timeframe

In [None]:
# split conversions by marketing _campaign_ (based on UTM data)
campaign_conversions_campaign = modelhub.aggregate.unique_users(df_marketing_selection[df_marketing_selection.is_conversion_event], ['utm_source', 'utm_medium', 'utm_campaign'])
campaign_conversions_campaign.reset_index().dropna(axis=0, how='any', subset='utm_source').sort_values(['utm_source', 'unique_users'], ascending=False).head()

### Reference
* [modelhub.Aggregate.unique_users](https://objectiv.io/docs/modeling/open-model-hub/models/aggregation/unique_users/)
* [modelhub.Map.is_conversion_event](https://objectiv.io/docs/modeling/open-model-hub/models/helper-functions/is_conversion_event/)
* [bach.DataFrame.dropna](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/dropna/)
* [bach.DataFrame.sort_values](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/sort_values/)
* [bach.Series.sort_index](https://objectiv.io/docs/modeling/bach/api-reference/Series/sort_index/)
* [bach.DataFrame.head](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/head/)

# Avg. duration

## Avg. duration per ad source

In [None]:
# avg duration for users that come from an ad campaign in the full timeframe
duration_per_source = modelhub.aggregate.session_duration(df_marketing_selection, groupby=['utm_source']).to_frame()
duration_per_source.sort_values(['utm_source'], ascending=False).head(10)

## Vs. avg. duration by all users

In [None]:
# vs time spent by all users
modelhub.aggregate.session_duration(df_acquisition, groupby=None).to_frame().head()

## Avg. duration for converted users per _source_

In [None]:
# avg. duration for converted users - per source
# label sessions with a conversion
df_marketing_selection['converted_users'] = modelhub.map.conversions_counter(df_marketing_selection, name='github_press') >= 1
# label hits where at that point in time, there are 0 conversions in the session
df_marketing_selection['zero_conversions_at_moment'] = modelhub.map.conversions_in_time(df_marketing_selection, 'github_press') == 0
# filter on above created labels to find the users who converted for the very first time
converted_users = df_marketing_selection[(df_marketing_selection.converted_users & df_marketing_selection.zero_conversions_at_moment)]

modelhub.aggregate.session_duration(converted_users, groupby=['utm_source']).to_frame().sort_values('utm_source', ascending=False).head(20)

## Avg. duration per converted user

In [None]:
# duration before conversion - per source & user
# label users with a conversion
df_marketing_selection['converted_users'] = modelhub.map.conversions_counter(df_marketing_selection, name='github_press', partition='user_id') >= 1
# label hits where at that point in time, the user has 0 conversions
df_marketing_selection['zero_conversions_at_moment'] = modelhub.map.conversions_in_time(df_marketing_selection, 'github_press', partition='user_id') == 0
# materialize the data frame after adding columns as temporary table to reduce the complexity of the underlying queries
df_marketing_selection = df_marketing_selection.materialize(materialization='temp_table')
# filter on above created labels to find the users who converted for the very first time
converted_users = df_marketing_selection[(df_marketing_selection.converted_users & df_marketing_selection.zero_conversions_at_moment)]

modelhub.aggregate.session_duration(converted_users, groupby=['day', 'utm_source', 'user_id']).to_frame().sort_values('day', ascending=False).head(20)

## Avg. duration before first conversion
Avg. duration for users that converted for the very first time (not including hits or sessions after the moment of conversion).
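
A simplified plain-Python sketch of this idea: per converted user, keep only the hits before the first conversion and measure the time between the first and last of them. It assumes the converting hit itself is excluded and ignores session boundaries, so it is not equivalent to `session_duration`; the function name and data are hypothetical.

```python
from datetime import datetime, timedelta

# hypothetical hits, for illustration only
hits = [
    {'user_id': 'a', 'moment': datetime(2022, 6, 1, 10, 0), 'is_conversion_event': False},
    {'user_id': 'a', 'moment': datetime(2022, 6, 1, 10, 2), 'is_conversion_event': False},
    {'user_id': 'a', 'moment': datetime(2022, 6, 1, 10, 3), 'is_conversion_event': True},
    {'user_id': 'a', 'moment': datetime(2022, 6, 1, 10, 9), 'is_conversion_event': False},
    {'user_id': 'b', 'moment': datetime(2022, 6, 1, 11, 0), 'is_conversion_event': False},
    {'user_id': 'b', 'moment': datetime(2022, 6, 1, 11, 5), 'is_conversion_event': False},
]


def duration_before_first_conversion(hits):
    """Per converted user: time between the first and last hit before the first conversion."""
    by_user = {}
    for hit in sorted(hits, key=lambda h: h['moment']):
        by_user.setdefault(hit['user_id'], []).append(hit)

    durations = {}
    for user_id, user_hits in by_user.items():
        if not any(h['is_conversion_event'] for h in user_hits):
            continue  # skip users that never converted
        before = []
        for hit in user_hits:
            if hit['is_conversion_event']:
                break  # stop at the first conversion
            before.append(hit['moment'])
        if before:
            durations[user_id] = before[-1] - before[0]
    return durations


durations = duration_before_first_conversion(hits)
print(durations)
```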

In [None]:
# avg duration before conversion - overall
modelhub.aggregate.session_duration(converted_users, groupby=None).to_frame().head()

## Avg. duration before first conversion per _source_
Avg. duration per campaign _source_ for users who converted for the very first time (not including hits or sessions after the moment of conversion).

In [None]:
# avg duration before conversion - per source
modelhub.aggregate.session_duration(converted_users, groupby=['utm_source']).to_frame().head()

## Avg. duration with bounces filtered out

In [None]:
# create a dataframe of session durations, excluding sessions with zero duration (aka bounces)
df_marketing_no_bounces = modelhub.aggregate.session_duration(df_marketing_selection, groupby=['utm_source', 'session_id'], exclude_bounces=True).to_frame()

# avg duration for non-bounced users that come from an ad campaign in the full timeframe
df_marketing_no_bounces = df_marketing_no_bounces.reset_index().groupby(['utm_source'])['session_duration'].mean().to_frame()
df_marketing_no_bounces.head(30)
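
The bounce filter can be illustrated in plain Python: a session's duration is the time between its first and last event, sessions with zero duration count as bounces and are dropped, and the remaining durations are averaged per source. The data is made up, and this is a sketch of the concept rather than of how `session_duration` is implemented.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# hypothetical (utm_source, session_id, moment) events, for illustration only
events = [
    ('twitter', 1, datetime(2022, 6, 1, 10, 0)),
    ('twitter', 1, datetime(2022, 6, 1, 10, 4)),
    ('twitter', 2, datetime(2022, 6, 1, 12, 0)),  # single hit: a bounce
    ('twitter', 3, datetime(2022, 6, 2, 9, 0)),
    ('twitter', 3, datetime(2022, 6, 2, 9, 2)),
    ('reddit', 4, datetime(2022, 6, 2, 15, 0)),
    ('reddit', 4, datetime(2022, 6, 2, 15, 10)),
]

# first and last moment per (source, session)
first_last = {}
for source, session_id, moment in events:
    key = (source, session_id)
    start, end = first_last.get(key, (moment, moment))
    first_last[key] = (min(start, moment), max(end, moment))

# keep only sessions with a non-zero duration
durations = defaultdict(list)
for (source, _), (start, end) in first_last.items():
    if end > start:
        durations[source].append(end - start)

# average duration per source
avg_duration = {s: sum(d, timedelta()) / len(d) for s, d in durations.items()}
print(avg_duration)
```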

## Avg. daily duration per campaign _source_

In [None]:
# calculate time spent per campaign source, daily
duration_per_source_daily = modelhub.agg.session_duration(df_marketing_selection, groupby=['utm_source', 'day']).to_frame()
# calculate the number of users per campaign source, daily
source_users_daily = modelhub.agg.unique_users(df_acquisition, groupby=['utm_source', 'day'])
source_users_daily = source_users_daily.reset_index()
# merge them
source_duration_users_daily = duration_per_source_daily.merge(source_users_daily, how='left', on=['utm_source', 'day'])
# also add #conversions
converted_users = campaign_conversions_source_daily.to_frame().rename(columns={"unique_users": "converted_users"})
source_duration_users_daily = source_duration_users_daily.merge(converted_users, how='left', on=['utm_source', 'day'])

source_duration_users_daily = source_duration_users_daily.sort_values(['utm_source', 'day'], ascending=False)
source_duration_users_daily.head(50)

### Reference
* [modelhub.Aggregate.session_duration](https://objectiv.io/docs/modeling/open-model-hub/models/aggregation/session_duration/)
* [bach.DataFrame.sort_values](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/sort_values/)
* [modelhub.Map.conversions_counter](https://objectiv.io/docs/modeling/open-model-hub/models/helper-functions/conversions_counter/)
* [modelhub.Map.conversions_in_time](https://objectiv.io/docs/modeling/open-model-hub/models/helper-functions/conversions_in_time/)
* [bach.Series.to_frame](https://objectiv.io/docs/modeling/bach/api-reference/Series/to_frame/)
* [bach.Series.reset_index](https://objectiv.io/docs/modeling/bach/api-reference/Series/reset_index/)
* [modelhub.Aggregate.unique_users](https://objectiv.io/docs/modeling/open-model-hub/models/aggregation/unique_users/)
* [bach.DataFrame.dropna](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/dropna/)
* [bach.DataFrame.merge](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/merge/)
* [bach.DataFrame.head](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/head/)

# Deep-dive into user behavior from marketing

## Top used product features for users from marketing campaigns

In [None]:
# top used product features for users coming from marketing campaigns
top_product_features_from_marketing = modelhub.aggregate.top_product_features(df_marketing_selection)
top_product_features_from_marketing.head(20)

### Top used product features for users from marketing campaigns, before they convert

In [None]:
# top used product features for users coming from marketing campaigns, before they convert
top_features_before_conversion_from_marketing = modelhub.agg.top_product_features_before_conversion(df_marketing_selection, name='github_press')
top_features_before_conversion_from_marketing.head(20)

In [None]:
# calculate the percentage of converted users per feature: (converted users per feature) / (total users converted)
total_converted_users = df_marketing_selection[df_marketing_selection['is_conversion_event']]['user_id'].unique().count().value
top_conversion_locations = modelhub.agg.unique_users(df_marketing_selection[df_marketing_selection['is_conversion_event']], groupby='feature_nice_name')
top_conversion_locations = (top_conversion_locations / total_converted_users) * 100
# show the results, with .to_frame() for nicer formatting
top_conversion_locations = top_conversion_locations.to_frame().rename(columns={'unique_users': 'converted_users_percentage'})
top_conversion_locations.sort_values(by='converted_users_percentage', ascending=False).head()

### Reference
* [modelhub.Aggregate.top_product_features](https://objectiv.io/docs/modeling/open-model-hub/models/aggregation/top_product_features/)
* [modelhub.Aggregate.top_product_features_before_conversion](https://objectiv.io/docs/modeling/open-model-hub/models/aggregation/top_product_features_before_conversion/)
* [bach.Series.unique](https://objectiv.io/docs/modeling/bach/api-reference/Series/unique/)
* [bach.Series.count](https://objectiv.io/docs/modeling/bach/api-reference/Series/count/)
* [bach.Series.to_frame](https://objectiv.io/docs/modeling/bach/api-reference/Series/to_frame/)
* [bach.DataFrame.head](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/head/)

## Funnel Discovery: flows for _all_ users from marketing campaigns

In [None]:
# select which event type to use for further analysis - PressEvents to focus on what users directly interact with
df_funnel_from_marketing = df_marketing_selection[df_marketing_selection['event_type'] == 'PressEvent']
# instantiate the FunnelDiscovery model from the open model hub
funnel = modelhub.get_funnel_discovery()
# set the maximum n steps
max_steps = 4

In [None]:
# for every user starting their session, find up to n consecutive steps they took
df_steps = funnel.get_navigation_paths(df_funnel_from_marketing, steps=max_steps, by='user_id')
df_steps.head()

In [None]:
# calculate the most frequent consecutive steps that all users took after starting their session, based on the location stack
df_steps.value_counts().to_frame().head(20)
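
Conceptually, counting the most frequent paths amounts to building a step sequence per user and tallying identical sequences. The plain-Python sketch below takes only the first `n` steps per user, which is a simplification and may differ from what `get_navigation_paths` returns; `first_n_steps` and the data are hypothetical.

```python
from collections import Counter

# hypothetical (user_id, order, feature) presses, for illustration only
presses = [
    ('a', 1, 'home'), ('a', 2, 'docs'), ('a', 3, 'github'),
    ('b', 1, 'home'), ('b', 2, 'docs'), ('b', 3, 'github'),
    ('c', 1, 'home'), ('c', 2, 'blog'),
]


def first_n_steps(presses, n):
    """Return the first `n` features pressed per user, in order, as a tuple."""
    paths = {}
    for user_id, _, feature in sorted(presses, key=lambda p: (p[0], p[1])):
        steps = paths.setdefault(user_id, [])
        if len(steps) < n:
            steps.append(feature)
    return {user_id: tuple(steps) for user_id, steps in paths.items()}


# tally identical paths, most frequent first
path_counts = Counter(first_n_steps(presses, 4).values())
print(path_counts.most_common(2))
```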

In [None]:
funnel.plot_sankey_diagram(df_steps, n_top_examples=50)

## Funnel Discovery: flows for _converted_ users from marketing

In [None]:
# add which step resulted in conversion to the dataframe, with the `add_conversion_step_column` param
# filter down to all sequences that have actually converted with the `only_converted_paths` param
df_steps_till_conversion = funnel.get_navigation_paths(df_funnel_from_marketing, steps=max_steps, by='user_id', add_conversion_step_column=True, only_converted_paths=True)
df_steps_till_conversion.head(5)

In [None]:
# plot the Sankey diagram for paths where the first conversion happened at step number 2,
# using the top examples via the `n_top_examples` param
condition_convert_on_step = df_steps_till_conversion['_first_conversion_step_number'] == 2
funnel.plot_sankey_diagram(df_steps_till_conversion[condition_convert_on_step], n_top_examples=15)

## Funnel Discovery: drop-off for users from marketing

In [None]:
# select all events that are not conversion events
df_funnel_non_converted = df_marketing_selection[~df_marketing_selection['is_conversion_event']]
# get the users that did convert
funnel_converted_users = df_marketing_selection[df_marketing_selection['is_conversion_event']]['user_id']
# keep only the events of users that never converted
df_funnel_non_converted = df_funnel_non_converted[~df_funnel_non_converted['user_id'].isin(funnel_converted_users)]
# get the last used feature in the location_stack before dropping off
drop_loc = df_funnel_non_converted.sort_values('moment').groupby('user_id')['feature_nice_name'].to_json_array().json[-1].materialize()
total_count = drop_loc.count().value
# show the last used features by non-converted users, sorted by their usage share compared to all features
drop_loc_percent = (drop_loc.value_counts() / total_count) * 100
drop_loc_percent = drop_loc_percent.to_frame().rename(columns={'value_counts': 'drop_percentage'})
drop_loc_percent.sort_values(by='drop_percentage', ascending=False).head(10)
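
The drop-off calculation can be sketched in plain Python: exclude users who converted, take each remaining user's last used feature, and express each feature's count as a percentage of all non-converted users. The events below are invented for illustration.

```python
from collections import Counter

# hypothetical (user_id, order, feature_nice_name, is_conversion_event) events
events = [
    ('a', 1, 'Link: docs', False), ('a', 2, 'Link: browse-on-github', True),
    ('b', 1, 'Link: docs', False),
    ('c', 1, 'Link: blog', False), ('c', 2, 'Link: docs', False),
    ('d', 1, 'Link: docs', False), ('d', 2, 'Link: blog', False),
    ('e', 1, 'Link: pricing', False),
]

# users with at least one conversion event
converted_users = {user_id for user_id, _, _, converted in events if converted}

# last used feature per non-converted user (later events overwrite earlier ones)
last_feature = {}
for user_id, _, feature, _ in sorted(events, key=lambda e: (e[0], e[1])):
    if user_id not in converted_users:
        last_feature[user_id] = feature

# share of each feature among the last used features
total = len(last_feature)
drop_percentage = {f: 100 * c / total for f, c in Counter(last_feature.values()).items()}
print(drop_percentage)
```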

### Reference
* [modelhub.ModelHub.get_funnel_discovery](https://objectiv.io/docs/modeling/open-model-hub/api-reference/ModelHub/get_funnel_discovery/)
* [modelhub.FunnelDiscovery.get_navigation_paths](https://objectiv.io/docs/modeling/open-model-hub/models/funnels/FunnelDiscovery/get_navigation_paths/)
* [bach.Series.to_frame](https://objectiv.io/docs/modeling/bach/api-reference/Series/to_frame/)
* [bach.DataFrame.head](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/head/)
* [modelhub.FunnelDiscovery.plot_sankey_diagram](https://objectiv.io/docs/modeling/open-model-hub/models/funnels/FunnelDiscovery/plot_sankey_diagram/)
* [bach.DataFrame.rename](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/rename/)
* [bach.DataFrame.sort_values](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/sort_values/)

## Predict User Behavior from campaigns
You can run predictive modeling on your marketing data as well. For example, you can create a feature for the number of times a user came in through a specific marketing source.

The created marketing feature set can be merged with any user-level feature set for logistic regression, like `features_set_sample` from the [logistic regression example notebook](./model-hub-logistic-regression.ipynb). For more details on how to do feature engineering, see our [feature engineering example notebook](./feature-engineering.ipynb).

In [None]:
# create a feature for the number of times a user came in through a specific marketing source:
feature_prepare = df_acquisition.copy()
feature_prepare['utm_source'] = feature_prepare.utm_source.fillna('none')
features_us = feature_prepare.groupby(['user_id', 'utm_source']).session_id.nunique()
features_us_unstacked = features_us.unstack(fill_value=0)
features_us_unstacked.head()
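
What the groupby, `nunique` and `unstack` steps produce can be illustrated in plain Python: count distinct sessions per user and source, then pivot to one row per user with one column per source, filling missing combinations with 0. The rows are hypothetical.

```python
from collections import defaultdict

# hypothetical (user_id, utm_source, session_id) rows, for illustration only
rows = [
    ('u1', 'twitter', 1), ('u1', 'twitter', 2), ('u1', None, 3),
    ('u2', 'reddit', 4), ('u2', 'reddit', 4),
]

# distinct sessions per (user, source); a missing source becomes 'none'
session_ids = defaultdict(set)
for user_id, utm_source, session_id in rows:
    session_ids[(user_id, utm_source or 'none')].add(session_id)

# pivot to one row per user and one column per source, filling gaps with 0
users = sorted({user_id for user_id, _ in session_ids})
sources = sorted({source for _, source in session_ids})
features = {
    user_id: {source: len(session_ids.get((user_id, source), ())) for source in sources}
    for user_id in users
}
print(features)
```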

### Reference
* [bach.DataFrame.copy](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/copy/)
* [bach.DataFrame.fillna](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/fillna/)
* [bach.DataFrame.groupby](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/groupby/)
* [bach.DataFrame.nunique](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/nunique/)
* [bach.Series.unstack](https://objectiv.io/docs/modeling/bach/api-reference/Series/unstack/)
* [bach.DataFrame.head](https://objectiv.io/docs/modeling/bach/api-reference/DataFrame/head/)

## Get the SQL for any analysis
The SQL for any analysis can be exported with one command, so you can use models directly in production, which simplifies data debugging and delivery to tools like Metabase, dbt, etc. See how you can [quickly create BI dashboards with this](https://objectiv.io/docs/home/try-the-demo#creating-bi-dashboards).