<a href="https://colab.research.google.com/github/Diishasing/MARKETING-CAMPAIGN-ANALYSIS/blob/main/MarketingCampaigns.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **BUSINESS PROBLEM**

---


As a marketing agency, our primary objective is to maximize the return on investment (ROI) for our clients' advertising campaigns. We have conducted two ad campaigns, one on Facebook and the other on AdWords, and we need to determine which platform yields better results in terms of clicks, conversions, and overall cost-effectiveness. By identifying the more effective platform, we can allocate our resources more efficiently and optimize our advertising strategies to deliver better outcomes for our clients.

## **Research Question:**

---
**Which ad platform is more effective in terms of conversions, clicks, and overall cost-effectiveness?**


IMPORTING LIBRARIES

In [None]:
import pandas as pd
import numpy as np
import scipy.stats as st
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import coint
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

**DATA DESCRIPTION**

---
The dataset compares the performance of two separate ad campaigns conducted throughout the year 2019: a Facebook Ad campaign and an AdWords Ad campaign. There is one row for each day of 2019, resulting in a total of 365 rows of campaign data to analyze. The dataset includes various performance metrics for each ad campaign, providing insights into their effectiveness and efficiency over time.

Key features included in the dataset are as follows:


*   Date: The date corresponding to each row of campaign data, ranging from January 1st, 2019, to December 31st, 2019.
*   Ad Views: The number of times the ad was viewed.
* Ad Clicks: The number of clicks received on the ad.
* Ad Conversions: The number of conversions resulting from the ad.
* Cost per Ad: The cost associated with running the ad campaign.
* Click-Through Rate (CTR): The ratio of clicks to views, indicating the effectiveness of the ad in generating clicks.
* Conversion Rate: The ratio of conversions to clicks, reflecting the effectiveness of the ad in driving desired actions.
* Cost per click (CPC): The average cost incurred per click on the ad.
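
The three derived metrics in the list above can be illustrated with a minimal sketch. The figures and the column names (`views`, `clicks`, `conversions`, `cost`) are hypothetical and are not taken from `marketing_campaign.csv`.

```python
import pandas as pd

# Hypothetical daily figures for one campaign (illustrative values only)
daily = pd.DataFrame({
    "views": [2000, 2500, 1800],
    "clicks": [40, 50, 45],
    "conversions": [8, 12, 9],
    "cost": [100.0, 120.0, 90.0],  # daily ad cost in dollars
})

# Click-through rate: share of views that resulted in a click (in percent)
daily["ctr_pct"] = daily["clicks"] / daily["views"] * 100
# Conversion rate: share of clicks that resulted in a conversion (in percent)
daily["conversion_rate_pct"] = daily["conversions"] / daily["clicks"] * 100
# Cost per click: ad cost divided by the number of clicks
daily["cpc"] = daily["cost"] / daily["clicks"]

print(daily.round(2))
```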





In [None]:
#loading the dataset
df = pd.read_csv('marketing_campaign.csv')

In [None]:
df.head()

In [None]:
df.shape

In [None]:
df.dtypes

In [None]:
df['Date'] = pd.to_datetime(df['Date'])

In [None]:
df.describe()

COMPARING CAMPAIGNS PERFORMANCE

In [None]:
#distribution of the clicks and conversions
plt.figure(figsize = (15, 6))
plt.subplot(1, 2, 1)
plt.title('Facebook Ad Clicks')
sns.histplot(df['Facebook Ad Clicks'], bins = 7, edgecolor = 'k', kde = True)
plt.subplot(1, 2, 2)
plt.title('Facebook Ad Conversions')
sns.histplot(df['Facebook Ad Conversions'], bins = 7, edgecolor = 'k', kde = True)
plt.show()

plt.figure(figsize = (15, 6))
plt.subplot(1, 2, 1)
plt.title('AdWords Ad Clicks')
sns.histplot(df['AdWords Ad Clicks'], bins = 7, edgecolor = 'k', kde = True)
plt.subplot(1, 2, 2)
plt.title('AdWords Ad Conversions')
sns.histplot(df['AdWords Ad Conversions'], bins = 7, edgecolor = 'k', kde = True)
plt.show()

All the histograms show a roughly symmetrical shape. This suggests that daily clicks and conversions are spread fairly evenly around their central values. In other words, there are few outlier days with unusually high or unusually low numbers of clicks or conversions.
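
A visual impression of symmetry can be cross-checked with a skewness statistic. The sketch below uses synthetic data rather than the campaign data, and the variable names are illustrative; values near zero indicate a roughly symmetric distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Synthetic samples: one roughly symmetric, one clearly right-skewed
symmetric_sample = rng.normal(loc=50, scale=10, size=365)
skewed_sample = rng.exponential(scale=10, size=365)

# Sample skewness: close to 0 for symmetric data, positive for a long right tail
skew_symmetric = stats.skew(symmetric_sample)
skew_skewed = stats.skew(skewed_sample)

print("Skewness (symmetric sample):", round(skew_symmetric, 2))
print("Skewness (right-skewed sample):", round(skew_skewed, 2))
```

On the notebook's data, the same call could be applied to a column such as `df['Facebook Ad Clicks']`.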

**How frequently do we observe days with high numbers of conversions compared to days with low numbers of conversions?**

In [None]:
#creating function to calculate the category for the conversions
def create_conversion_category(conversion_col):
    category = []
    for conversion in df[conversion_col]:
      if conversion < 6:
        category.append('less than 6')
      elif 6 <= conversion < 11:
        category.append('6 - 10')
      elif 11 <= conversion < 16:
        category.append('10 - 15')
      else:
        category.append('more than 15')
    return category

#applying function of different campaign's conversion
df['Facebook Conversion Category'] = create_conversion_category('Facebook Ad Conversions')
df['AdWords Conversion Category'] = create_conversion_category('AdWords Ad Conversions')

In [None]:
df[['Facebook Ad Conversions', 'Facebook Conversion Category', 'AdWords Ad Conversions', 'AdWords Conversion Category']].head()

In [None]:
df['Facebook Conversion Category'].value_counts()

In [None]:
facebook = pd.DataFrame(df['Facebook Conversion Category'].value_counts()).reset_index().rename(columns = {'Facebook Conversion Category': 'Category'})
facebook

In [None]:
df['AdWords Conversion Category'].value_counts()

In [None]:
adwords = pd.DataFrame(df['AdWords Conversion Category'].value_counts()).reset_index().rename(columns = {'AdWords Conversion Category': 'Category'})
adwords

In [None]:
category_df = pd.merge(facebook, adwords, on = 'Category', how = 'outer').fillna(0)
category_df

In [None]:
category_df = category_df.iloc[[3, 1, 0, 2]]
category_df

In [None]:
X_axis = np.arange(len(category_df))
plt.figure(figsize = (15, 6))
plt.bar(X_axis - 0.2, category_df['count_x'], 0.4, label = 'Facebook', color = '#03989E', linewidth = 1, edgecolor = 'k')
plt.bar(X_axis + 0.2, category_df['count_y'], 0.4, label = 'AdWords', color = '#A62372', linewidth = 1, edgecolor = 'k')

plt.xticks(X_axis, category_df['Category'])
plt.xlabel("Conversion Category")
plt.ylabel("Number of days")
plt.title('Frequency of daily conversion categories', fontsize = 15)
plt.legend(fontsize = 15)
plt.show()



*   The data suggests that Facebook had high-conversion days more frequently than AdWords, whose daily conversions were either very low (less than 6) or moderate (6 - 10).
*   There is a substantial difference in the number of high-conversion days between the two campaigns.
* The absence of any AdWords days in the 10 - 15 and more than 15 categories indicates a need to review which strategies were changed or which external factors could have influenced these numbers.
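
The categorization of daily conversions used above can also be sketched with `pd.cut`, which avoids the explicit loop. The counts below are hypothetical, and the labels simply mirror the ones used in this notebook. Because conversions are whole numbers, the right-closed bins reproduce the same boundaries as the `if`/`elif` chain.

```python
import pandas as pd

# Hypothetical daily conversion counts (illustrative values only)
conversions = pd.Series([3, 6, 10, 11, 15, 16, 22])

# Right-closed bins: (-inf, 5], (5, 10], (10, 15], (15, inf]
bins = [-float("inf"), 5, 10, 15, float("inf")]
labels = ["less than 6", "6 - 10", "10 - 15", "more than 15"]
category = pd.cut(conversions, bins=bins, labels=labels)

# Count days per category and present them in the logical order
counts = category.value_counts().reindex(labels)
print(counts)
```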



**Do more clicks on the Ad really lead to more sales?**

In [None]:
plt.figure(figsize = (15, 6))
plt.subplot(1, 2, 1)
plt.title('Facebook')
sns.scatterplot(x = df['Facebook Ad Clicks'], y = df['Facebook Ad Conversions'], color = '#03989E')
plt.xlabel('Clicks')
plt.ylabel('Conversions')
plt.subplot(1, 2, 2)
plt.title('AdWords')
sns.scatterplot(x = df['AdWords Ad Clicks'], y = df['AdWords Ad Conversions'], color = '#03989E')
plt.xlabel('Clicks')
plt.ylabel('Conversions')
plt.show()

In [None]:
facebook_corr = df[['Facebook Ad Clicks', 'Facebook Ad Conversions']].corr()
facebook_corr

In [None]:
adwords_corr = df[['AdWords Ad Clicks', 'AdWords Ad Conversions']].corr()
adwords_corr

In [None]:
print('Correlation Coeff \n----------')
print('Facebook: ', round(facebook_corr.values[0, 1], 2))
print('AdWords: ', round(adwords_corr.values[0, 1], 2))



*   A correlation coefficient of 0.87 indicates a strong positive linear relationship between clicks on Facebook ads and sales. As the number of clicks on Facebook ads increases, sales tend to increase as well.
*   Because the squared correlation is about 0.76, roughly three quarters of the variation in sales can be explained by the variation in clicks on Facebook ads.
* This strong correlation suggests that Facebook advertising is highly effective in driving sales for the business. Increasing investment in Facebook ads or optimizing their performance could potentially lead to even higher sales.
* A correlation coefficient of 0.45 indicates a moderate positive linear relationship between clicks on AdWords ads and sales. While there is still a positive relationship, it is not as strong as with Facebook ads.
* The moderate correlation between clicks on AdWords ads and sales indicates that while AdWords advertising does contribute to sales, its effectiveness may be influenced by other factors. Further analysis is needed to identify these factors and optimize AdWords campaigns accordingly.
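
As a small illustration of how a correlation coefficient relates to the share of variation explained, the sketch below computes Pearson's r and its square on synthetic data. The generated clicks and conversions are assumptions for demonstration and do not reproduce the 0.87 or 0.45 reported above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic data: conversions depend linearly on clicks plus random noise
clicks = rng.integers(20, 80, size=365).astype(float)
conversions = 0.2 * clicks + rng.normal(0, 1.5, size=365)

# Pearson correlation coefficient and its p-value
r, p = stats.pearsonr(clicks, conversions)

# In a simple linear relationship, r squared is the share of variance explained
r_squared = r ** 2

print("Pearson r:", round(r, 2))
print("Share of variance explained (r^2):", round(r_squared, 2))
```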



### **HYPOTHESIS TESTING**

---
**Hypothesis** : Advertising on Facebook will result in a greater number of conversions compared to advertising on AdWords.

**Null Hypothesis** [H0] : There is no difference in the number of conversions between Facebook and AdWords, or the number of conversions from AdWords is greater than or equal to those from Facebook.

H0 : mu_Facebook <= mu_AdWords

**Alternate Hypothesis** [H1] : The number of conversions from Facebook is greater than the number of conversions from AdWords.

H1 : mu_Facebook > mu_AdWords

In [None]:
print('Mean Conversion \n----------')
print('Facebook: ', round(df['Facebook Ad Conversions'].mean(), 2))
print('AdWords: ', round(df['AdWords Ad Conversions'].mean(), 2))

#Welch's t-test (unequal variances); ttest_ind is two-sided by default,
#so with a positive t-statistic the one-sided p-value for H1 is half the value printed here
t_stats, p_value = st.ttest_ind(a = df['Facebook Ad Conversions'], b = df['AdWords Ad Conversions'], equal_var = False)
print('t-statistic: ', round(t_stats, 4))
print('p-value: ', p_value)

#comparing the p-value with the significance level of 5% (0.05)
if p_value < 0.05:
  print('Reject the null hypothesis')
else:
  print('Fail to reject the null hypothesis')



*   The mean number of conversions from Facebook ads (11.74) is substantially higher than the mean number of conversions from AdWords ads (5.98). This suggests that, on average, Facebook advertising is more effective in generating conversions compared to AdWords advertising.
*   The t-statistic (32.88) measures the difference between the means of the two groups relative to the variation within the groups. A larger t-statistic indicates a greater difference between the means of the two groups.
* The p-value (9.35e-134) is extremely small, indicating strong evidence against the null hypothesis.
* The results strongly support the alternate hypothesis, indicating that the number of conversions from Facebook advertising is indeed greater than the number of conversions from AdWords advertising.
* Facebook advertising appears to be a more effective channel for generating conversions compared to AdWords advertising, based on the sample data analyzed.
* Given the significant difference in the number of conversions between Facebook and AdWords, consider reallocating resources towards Facebook advertising efforts. This could involve increasing ad spend, expanding targeting efforts, or experimenting with different ad formats to capitalize on the platform's effectiveness in driving conversions.
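
Since the hypotheses above are one-sided, the test can also be run as an explicitly one-sided Welch t-test. The sketch below uses synthetic Poisson counts instead of the campaign data and assumes SciPy 1.6 or later, where `ttest_ind` accepts the `alternative` argument.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic daily conversion counts for two campaigns (illustrative only)
facebook = rng.poisson(lam=12, size=365)
adwords = rng.poisson(lam=6, size=365)

# Two-sided Welch t-test (the default behaviour of ttest_ind)
two_sided = stats.ttest_ind(facebook, adwords, equal_var=False)

# One-sided Welch t-test for H1: mean(facebook) > mean(adwords)
one_sided = stats.ttest_ind(facebook, adwords, equal_var=False, alternative="greater")

print("t-statistic:", round(one_sided.statistic, 2))
print("two-sided p-value:", two_sided.pvalue)
print("one-sided p-value:", one_sided.pvalue)
```

When the t-statistic is positive, the one-sided p-value is half of the two-sided one, so the conclusion at the 5% level is the same in both versions whenever the two-sided test already rejects.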


### **REGRESSION ANALYSIS**

---

What will happen if I go with the Facebook ad? How many Facebook ad conversions can I expect given a certain number of Facebook ad clicks?

In [None]:
#independent variable
X = df[['Facebook Ad Clicks']]
#dependent variable
y = df['Facebook Ad Conversions']

#initializing and fitting Linear Regression model
reg_model = LinearRegression().fit(X, y)
predict = reg_model.predict(X)

#model evaluation (R2 is the share of variance explained, not a classification accuracy)
r2 = r2_score(y, predict)*100
mse = mean_squared_error(y, predict)
print('R2 Score: ', round(r2, 2), '%')
print('Mean Squared Error: ', round(mse, 2))

In [None]:
plt.figure(figsize = (8, 8))
sns.scatterplot(x = df['Facebook Ad Clicks'], y = df['Facebook Ad Conversions'], color = '#03989E', label = 'Actual data points')
plt.plot(df['Facebook Ad Clicks'], predict, color = '#A62372', label = 'Best fit line')
plt.legend()
plt.show()

In [None]:
print(f'For {50} Clicks, Expected Conversion : {round(reg_model.predict([[50]])[0], 2)}')
print(f'For {90} Clicks, Expected Conversion : {round(reg_model.predict([[90]])[0], 2)}')



*   The model has reasonably good predictive power, with an R2 score of ----%. This suggests that it can effectively predict Facebook ad conversions based on the number of Facebook ad clicks.
*    With the insights provided by the linear regression model, businesses can make informed decisions about resource allocation, budget planning, and campaign optimization.
* For instance, knowing the expected number of Facebook ad conversions for a certain number of Facebook ad clicks can help in setting realistic campaign goals, optimizing ad spend, and assessing the ROI of Facebook advertising efforts.
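
The prediction step described above can be sketched on synthetic data. The slope of 0.25 conversions per click and the noise level are assumptions chosen for illustration, not estimates from the campaign data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Synthetic data: about 0.25 conversions per click plus random noise
clicks = rng.integers(20, 100, size=200).reshape(-1, 1)
conversions = 0.25 * clicks.ravel() + rng.normal(0, 1.0, size=200)

# Fit a simple linear regression of conversions on clicks
model = LinearRegression().fit(clicks, conversions)

# Predict expected conversions for new click counts (one row per scenario)
new_clicks = np.array([[50], [90]])
expected = model.predict(new_clicks)

for c, e in zip(new_clicks.ravel(), expected):
    print(f"For {c} clicks, expected conversions: {e:.2f}")
```

Because the target is one-dimensional, `predict` returns a one-dimensional array, so each prediction is read with a single index.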



### **Analyzing Facebook campaign metrics over time**

In [None]:
#cleaning data (removing unwanted symbols from the columns and converting them to numerical columns)
df['Facebook Click-Through Rate (Clicks / View)'] = df['Facebook Click-Through Rate (Clicks / View)'].apply(lambda x: float(x[:-1]))
df['Facebook Conversion Rate (Conversions / Clicks)'] = df['Facebook Conversion Rate (Conversions / Clicks)'].apply(lambda x: float(x[:-1]))
df['Facebook Cost per Click (Ad Cost / Clicks)'] = df['Facebook Cost per Click (Ad Cost / Clicks)'].apply(lambda x: float(x[1:]))
df['Cost per Facebook Ad'] = df['Cost per Facebook Ad'].apply(lambda x: float(x[1:]))



In [None]:
#filtering the facebook campaign
df = df[['Date',  'Facebook Ad Views',
         'Facebook Ad Clicks', 'Facebook Ad Conversions', 'Cost per Facebook Ad',
         'Facebook Click-Through Rate (Clicks / View)', 'Facebook Conversion Rate (Conversions / Clicks)',
         'Facebook Cost per Click (Ad Cost / Clicks)']].copy()

In which months and on which days of the week do we observe the most conversions?

In [None]:
#extracting month and week day from the date column
df['month'] = df['Date'].dt.month
df['week'] = df['Date'].dt.weekday

In [None]:
plt.figure(figsize = (8, 5))
plt.title('Weekly Conversion')
weekly_conversion = df.groupby('week')['Facebook Ad Conversions'].sum()
week_names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
plt.bar(week_names, weekly_conversion, color = '#03989E', edgecolor = 'k')
plt.show()

In [None]:
plt.figure(figsize = (8, 5))
plt.title('Monthly Conversion')
monthly_conversion = df.groupby('month')['Facebook Ad Conversions'].sum()
month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
plt.plot(month_names, monthly_conversion, '-o', color = '#A62372')
plt.show()



*   Across the weekdays over a year, the total number of conversions remains relatively consistent, indicating a steady level of engagement throughout the week. However, Mondays and Tuesdays exhibit the highest total conversions compared to other days, suggesting that the beginning of the workweek sees heightened user engagement or responsiveness to marketing efforts.
*   Examining the monthly trend in conversions reveals an overall upward trajectory, indicating a general increase in conversions over time. However, certain months stand out: February, April, May, June, August, and November experience a decline in conversions compared to neighboring months. These dips could be influenced by factors such as seasonal fluctuations, changes in consumer behavior, or adjustments in marketing strategies.
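
The weekday aggregation behind the chart can be sketched with named weekdays instead of numeric codes. The two weeks of data below are hypothetical, and the `conversions` column name is illustrative.

```python
import pandas as pd

# Two hypothetical weeks of daily conversions (2019-01-01 was a Tuesday)
dates = pd.date_range("2019-01-01", "2019-01-14", freq="D")
toy = pd.DataFrame({"Date": dates, "conversions": range(1, 15)})

# Label each row with its weekday name
toy["weekday"] = toy["Date"].dt.day_name()

# Sum conversions per weekday and present them in calendar order
order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
weekly = toy.groupby("weekday")["conversions"].sum().reindex(order)

print(weekly)
```

Reindexing by an explicit weekday list keeps the x-axis order independent of how `groupby` sorts the labels.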



**How does the Cost per Conversion (CPC) trend over time?**

---

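Cost per Conversion (CPC) : This metric is used to evaluate the cost effectiveness and profitability of an online advertising campaign. It helps marketers understand how much they are spending to obtain each conversion, allowing them to optimize their spending and targeting strategies effectively. Note that in this section CPC abbreviates cost per conversion (total ad cost divided by total conversions), not the cost per click listed in the data description.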

In [None]:
plt.figure(figsize = (8, 5))
plt.title('Monthly Cost per Conversion (CPC)')
monthly_df = df.groupby('month')[['Facebook Ad Conversions', 'Cost per Facebook Ad']].sum()
monthly_df['Cost per Conversion'] = monthly_df['Cost per Facebook Ad'] / monthly_df['Facebook Ad Conversions']
plt.plot(month_names, monthly_df['Cost per Conversion'], '-o', color = '#A62372')
plt.show()



*   The CPC trend over the 12-month period shows some fluctuations but overall maintains a relatively stable range.
*   May and November have the lowest CPC values, indicating potentially more cost-effective advertising or higher conversion rates during these periods.
* February has the highest CPC value, suggesting that advertising costs may be relatively higher during this month compared to others.
* Lower CPC values in certain months (e.g., May and November) could indicate periods of higher advertising effectiveness or more favourable market conditions.
* Consider allocating more advertising budget to months with historically lower CPC values (e.g., May and November) to maximize ROI.
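
The monthly cost per conversion above is a ratio of monthly totals. The sketch below, on hypothetical figures, shows that this can differ from the average of the daily ratios, which is worth keeping in mind when comparing months.

```python
import pandas as pd

# Hypothetical daily cost and conversions for two months (illustrative only)
toy = pd.DataFrame({
    "month": [1, 1, 2, 2],
    "cost": [100.0, 50.0, 80.0, 120.0],
    "conversions": [10, 5, 4, 16],
})

# Ratio of sums: total monthly cost divided by total monthly conversions
monthly = toy.groupby("month")[["cost", "conversions"]].sum()
monthly["cost_per_conversion"] = monthly["cost"] / monthly["conversions"]

# Mean of daily ratios, for comparison
toy["daily_ratio"] = toy["cost"] / toy["conversions"]
mean_of_ratios = toy.groupby("month")["daily_ratio"].mean()

print(monthly)
print(mean_of_ratios)
```

In this toy example both months cost 10 per conversion on a totals basis, while the mean of daily ratios in the second month is 13.75 because one low-volume day was expensive.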



Is there a long-term equilibrium relationship between advertising spend and conversions that suggests a stable, proportional impact of budget changes on conversions over time?

In [None]:
score, p_value, _ = coint(df['Cost per Facebook Ad'],
                          df['Facebook Ad Conversions'])
print('Cointegration test score:', score)
print('P-value:', p_value)
if p_value < 0.05:
    print('Reject the null hypothesis')
else:
    print('Fail to reject the null hypothesis')



*   Since the p-value is significantly lower than the chosen significance level, we reject the null hypothesis of no cointegration. This indicates that there is a long-term equilibrium relationship between advertising spend (cost) and conversions.
*   Businesses can use this understanding of the stable relationship between cost and conversions to optimize their advertising strategies. By investing in campaigns that demonstrate a strong return on investment (ROI) and adjusting spending based on performance, companies can maximize conversions while minimizing costs.



# **THANK YOU!**