# Case Study: Improving marketing effectiveness

## Introduction

Most importantly, we do not expect you to answer everything perfectly. For example, you do not need to be an expert in user behavior or marketing analytics; the case is just an example to demonstrate your data skills. In particular, we want to see:

1. How easy is working with data sets for you?
2. What methods do you know to prepare data and solve analytical issues?
3. What do you focus on when analyzing data, and what is of lesser priority to you?
4. How do you present your findings in an easy-to-digest and actionable way?

You have about 20 minutes to present the case study. After that, we will discuss the results and any questions from your side and our side. We are looking forward to hearing and seeing your findings!

> In order to generate more traffic and conversions, ImmoScout24 invests in paid advertising. The aim is to invest each € where it brings the most impact. Marketing managers would like to know whether they can improve budget allocation to marketing channels for one particular product. Moreover, they wonder if they could improve the way they currently assess the contribution of different channels.

#### Traffic dataset	

Column Name |	Description
------------- | ---------------
date	| date
fullvisitorID	| user identifier
visitStartTime |	session start time
channelGrouping |	default channel grouping
medium	| traffic type
source	| traffic source
isTrueDirect	| whether the session was in fact a direct session but inherited its channelGrouping (the combination of medium and source fields) from previous sessions of the same user
hits |	number of hits per session
bounces |	did the session bounce
timeOnSite |	session length
conversion |	did the session generate a conversion

#### Costs dataset	

Column Name |	Description
------------- | ---------------
date |	date
campaign_name|	name of the campaign
source_medium|	channel identification: combined source and medium information
impressions|	number of impressions
clicks|	number of clicks
spend|	costs


In [27]:
costs = _deepnote_execute_sql('SELECT *\nFROM \'costs_processed.csv\'', 'SQL_DEEPNOTE_DATAFRAME_SQL', audit_sql_comment='', sql_cache_mode='cache_disabled')
costs

Unnamed: 0,date,campaign_name,source_medium,impressions,clicks,spend
0,2020-04-24,de_br_reach_digitale_immobiliensuche,outbrain / display,0,0,0.00
1,2020-04-23,de_fi_generic_baufinanzierungsrechner_s_con_bu...,google / cpc,7486,1739,1941.71
2,2020-04-25,de_fi_generic_schufa_s_con_buyer_financecalc,google / cpc,361,51,174.87
3,2020-03-23,de_br_marke_+_stadt,bing / cpc,0,0,0.00
4,2020-04-05,de_br_marke,google / cpc,75928,44541,1468.00
...,...,...,...,...,...,...
2866,2020-03-27,de_br_marke_immobilien,bing / cpc,0,0,0.00
2867,2020-03-28,de_fi_kredit,bing / cpc,0,0,0.00
2868,2020-03-26,de_br_marke_bundesland,bing / cpc,0,0,0.00
2869,2020-03-21,de_fi_brand_baufinanzierung,google / cpc,0,0,0.00


In [25]:
traffic = _deepnote_execute_sql('SELECT *\nFROM \'final_traffic.csv\'', 'SQL_DEEPNOTE_DATAFRAME_SQL', audit_sql_comment='', sql_cache_mode='cache_disabled')
traffic

Unnamed: 0,date,fullvisitorID,visitStartTime,channelGrouping,medium,source,isTrueDirect,hits,bounces,timeOnSite,conversion
0,2020-03-04,1.000339e+18,1583311404,Organic Search,organic,google,True,2,,446.0,0
1,2020-03-05,1.000339e+18,1583399635,Organic Search,organic,google,True,4,,489.0,0
2,2020-03-06,1.000339e+18,1583504222,Organic Search,organic,google,True,4,,36.0,0
3,2020-03-06,1.000339e+18,1583520755,Organic Search,organic,google,True,1,1.0,,0
4,2020-03-09,1.000339e+18,1583753297,Paid Search (Non-Brand),cpc,google,False,23,,257.0,1
...,...,...,...,...,...,...,...,...,...,...,...
141419,2020-04-30,8.967218e+18,1588241835,Direct,(none),(direct),False,55,,2194.0,0
141420,2020-04-30,9.063905e+18,1588268232,Paid Search (Non-Brand),cpc,google,False,20,,229.0,0
141421,2020-04-30,9.125349e+18,1588235466,Referral,referral,deref-gmx.net,False,42,,705.0,0
141422,2020-04-30,9.203921e+17,1588223800,Paid Search (Non-Brand),cpc,google,False,35,,192.0,0


> 1. How do the marketing channels currently perform? 
2. Has there been any change month over month? (Estimate value of a conversion to be 100 €.)

In [34]:
funnel = _deepnote_execute_sql('WITH cs AS (\n    SELECT\n        date\n        , source_medium\n        , SUM(impressions) AS impressions\n        , SUM(clicks) AS clicks\n        , SUM(spend) AS spend\n    FROM costs\n    GROUP BY 1, 2\n)\n, tf AS (\n    SELECT\n        date\n        , concat(source, \' / \', medium) AS source_medium\n        , SUM(hits) AS hits\n        , SUM(bounces) AS bounces\n        , SUM(timeOnSite) AS timeOnSite\n        , SUM(conversion) AS conversion\n    FROM traffic\n    GROUP BY 1, 2\n    )   \nSELECT\n     source_medium\n    , impressions\n    , clicks\n    , spend\n    , hits\n    , bounces\n    , timeOnSite\n    , conversion\nFROM cs LEFT JOIN tf USING(date, source_medium)', 'SQL_DEEPNOTE_DATAFRAME_SQL', audit_sql_comment='', sql_cache_mode='cache_disabled')
funnel

Unnamed: 0,source_medium,impressions,clicks,spend,hits,bounces,timeOnSite,conversion
0,google / cpc,206281,119570,7206.53,10821.0,13.0,199017.0,95.0
1,google / cpc,192259,118161,5187.34,13356.0,12.0,271365.0,73.0
2,google / cpc,148931,89510,4376.55,10026.0,12.0,201250.0,69.0
3,google / cpc,181233,107727,4801.17,9083.0,6.0,181986.0,75.0
4,google / cpc,165793,98372,4489.34,9915.0,14.0,194972.0,70.0
...,...,...,...,...,...,...,...,...
239,outbrain / display,0,0,0.00,,,,
240,outbrain / display,0,0,0.00,,,,
241,bing / cpc,0,0,0.00,,,,
242,bing / cpc,0,0,0.00,,,,
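
The month-over-month question could be approached by aggregating spend and conversions per channel and month, then valuing each conversion at 100 € as the task states. The following is a minimal sketch, not the notebook's actual analysis: it assumes the join keeps `date` (the `funnel` query above drops it), and the `daily` frame and its numbers are made up for illustration.

```python
import pandas as pd

CONVERSION_VALUE_EUR = 100  # value per conversion, as stated in the task

# Toy stand-in for the joined costs/traffic data with `date` kept (made-up numbers)
daily = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-10", "2020-03-20", "2020-04-10", "2020-04-20"] * 2),
    "source_medium": ["google / cpc"] * 4 + ["bing / cpc"] * 4,
    "spend": [1000.0, 1000.0, 1500.0, 1500.0, 200.0, 200.0, 100.0, 100.0],
    "conversion": [30, 30, 40, 35, 1, 2, 1, 1],
})

# Aggregate to one row per channel and calendar month
monthly = (
    daily.assign(month=daily["date"].dt.to_period("M"))
    .groupby(["source_medium", "month"], as_index=False)[["spend", "conversion"]]
    .sum()
)

# Treat zero spend as missing so the ratios become NaN instead of infinite
spend = monthly["spend"].replace(0, float("nan"))
monthly["revenue"] = monthly["conversion"] * CONVERSION_VALUE_EUR
monthly["roas"] = monthly["revenue"] / spend  # return on ad spend
monthly["cost_per_conversion"] = spend / monthly["conversion"]

# Relative change of ROAS versus the previous month, within each channel
monthly["roas_mom_change"] = monthly.groupby("source_medium")["roas"].pct_change()
print(monthly)
```

Many rows in the costs data have zero spend, which is why the sketch maps zero spend to `NaN` before dividing.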


Typical order of metrics in the context of digital marketing:

1. **Impressions:** The number of times an ad or content is displayed to users. It represents the potential reach of the campaign.

2. **Clicks:** The number of times users interact with an ad or content by clicking on it. Clicks are a measure of user engagement and interest.

3. **Conversion:** The ultimate goal of many campaigns, conversion represents the desired action you want users to take, such as making a purchase, filling out a form, or subscribing.
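
The three stages above translate into two step-to-step rates: click-through rate (clicks per impression) and conversion rate (conversions per click). The sketch below uses a hypothetical helper, `funnel_rates`, applied to the first row of the `funnel` output (google / cpc). One caveat: conversions come from the traffic dataset, which may cover only part of the sessions behind the clicks, so the conversion rate computed this way may be understated.

```python
def funnel_rates(impressions, clicks, conversions):
    """Return the step-to-step rates of an impressions -> clicks -> conversions funnel."""
    # Guard against division by zero for channels without impressions or clicks
    ctr = clicks / impressions if impressions else float("nan")
    cvr = conversions / clicks if clicks else float("nan")
    return {"click_through_rate": ctr, "conversion_rate": cvr}


# First row of the funnel output (google / cpc)
rates = funnel_rates(impressions=206281, clicks=119570, conversions=95)
print(rates)
```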

In [38]:
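import plotly.express as px

# Sum the funnel stages per channel (select several columns with a list, not a tuple)
final = funnel.groupby('source_medium')[['impressions', 'clicks', 'conversion']].sum()

# Draw one funnel chart per channel
for index, row in final.iterrows():
    data = dict(
        number=row.values,
        stage=final.columns
    )
    fig = px.funnel(data, x='number', y='stage', title=f'Funnel for {index}')
    fig.show()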


In [None]:
DeepnoteChart(df_3, """{"layer":[{"layer":[{"mark":{"clip":true,"type":"trail","tooltip":true},"encoding":{"x":{"sort":"ascending","type":"ordinal","field":"date","scale":{"type":"linear","zero":false},"timeUnit":"yearmonthdate"},"y":{"sort":null,"type":"quantitative","field":"convertion_rate","scale":{"type":"linear","zero":false},"format":{"type":"default","decimals":null},"aggregate":"sum","formatType":"numberFormatFromNumberType"},"color":{"sort":null,"type":"nominal","field":"source_medium","scale":{"scheme":"tableau10"}}}},{"mark":{"size":100,"type":"point","opacity":0,"tooltip":true},"encoding":{"x":{"sort":"ascending","type":"ordinal","field":"date","scale":{"type":"linear","zero":false},"timeUnit":"yearmonthdate"},"y":{"sort":null,"type":"quantitative","field":"convertion_rate","scale":{"type":"linear","zero":false},"format":{"type":"default","decimals":null},"aggregate":"sum","formatType":"numberFormatFromNumberType"},"color":{"sort":null,"type":"nominal","field":"source_medium","scale":{"scheme":"tableau10"}}}}]}],"title":"","config":{"legend":{}},"$schema":"https://vega.github.io/schema/vega-lite/v5.json","encoding":{}}""")


> 3. Can we improve the budget allocation based on last month’s performance? 
4. If yes, how should we shift the budgets? 
5. What would be the estimated effect?
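
One simple way to reason about questions 3 to 5 is to move part of the budget from the channel with the fewest conversions per euro to the channel with the most, and to extrapolate the effect linearly. The sketch below is only an illustration under a strong assumption: conversions scale proportionally with spend, which ignores saturation and diminishing returns. The function `shift_budget` and the last-month numbers are hypothetical.

```python
CONVERSION_VALUE_EUR = 100  # value per conversion, as stated in the task


def shift_budget(channels, share):
    """Move `share` of the least efficient channel's spend to the most efficient one.

    `channels` maps a channel name to {"spend": euros, "conversions": count}.
    Assumes conversions scale linearly with spend (no saturation effects).
    """
    # Conversions per euro; with a fixed conversion value this ranks like ROAS
    rate = {n: c["conversions"] / c["spend"] for n, c in channels.items() if c["spend"] > 0}
    best = max(rate, key=rate.get)
    worst = min(rate, key=rate.get)

    moved = channels[worst]["spend"] * share
    new_spend = {n: c["spend"] for n, c in channels.items()}
    new_spend[worst] -= moved
    new_spend[best] += moved

    # Each moved euro earns the best channel's rate instead of the worst one's
    extra_conversions = moved * (rate[best] - rate[worst])
    return {"new_spend": new_spend, "extra_value_eur": extra_conversions * CONVERSION_VALUE_EUR}


# Made-up last-month figures for two channels
last_month = {
    "google / cpc": {"spend": 3000.0, "conversions": 75},
    "bing / cpc": {"spend": 200.0, "conversions": 1},
}
result = shift_budget(last_month, share=0.5)
print(result)
```

In practice the linear assumption would need checking, for example by comparing days with different spend levels within the same channel.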

> 6. Based on traffic data, is there another way we could consider the contribution of the different channels? If yes, show how this would be done and share your insights.
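
One possible alternative to crediting only the converting session is multi-touch attribution over each user's session path. The sketch below compares last-touch credit with linear credit (an equal share for every session up to the first conversion). It is an assumption-laden illustration, not the intended solution: the `sessions` frame is made up, and only the column names are taken from the traffic dataset. The `isTrueDirect` flag could additionally be used for a last-non-direct variant, which is not shown here.

```python
import pandas as pd

# Toy sessions using the traffic dataset's column names (made-up values)
sessions = pd.DataFrame({
    "fullvisitorID": ["a", "a", "a", "b", "b", "c"],
    "visitStartTime": [1, 2, 3, 1, 2, 1],
    "channelGrouping": ["Organic Search", "Paid Search", "Direct",
                        "Paid Search", "Direct", "Referral"],
    "conversion": [0, 0, 1, 0, 1, 0],
})

sessions = sessions.sort_values(["fullvisitorID", "visitStartTime"])

# Keep each user's sessions up to and including the first converting session
prior_conversions = (
    sessions.groupby("fullvisitorID")["conversion"].cumsum() - sessions["conversion"]
)
paths = sessions[prior_conversions == 0]

# Drop users who never convert
paths = paths[paths.groupby("fullvisitorID")["conversion"].transform("max") == 1]

# Last-touch: the converting session's channel gets the full credit
last_touch = paths.loc[paths["conversion"] == 1, "channelGrouping"].value_counts()

# Linear: every session on the path gets an equal share of one conversion
path_length = paths.groupby("fullvisitorID")["channelGrouping"].transform("count")
linear = paths.assign(credit=1 / path_length).groupby("channelGrouping")["credit"].sum()

print(last_touch)
print(linear)
```

In this toy data, last-touch gives all credit to Direct, while linear attribution also credits the search sessions that preceded the conversions.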

> Please prepare a short presentation to summarize your findings. Moreover, provide us with the code/approach you used to arrive at the insights. Please provide an update that answers the following questions, based on the provided datasets:
