# Market Campaign Analysis
This project aims to analyze the effectiveness of two marketing campaigns, Facebook and AdWords, in terms of clicks, conversions, and cost. The analysis involves data cleaning, exploratory data analysis, hypothesis testing, and regression analysis to identify the most effective platform for maximizing return on investment (ROI).

### Objective:
As a marketing agency, our primary goal is to maximize the return on investment (ROI) for our clients' advertising campaigns. We've conducted two ad campaigns—one on Facebook and the other on AdWords. Our aim is to determine which platform delivers superior results in terms of clicks, conversions, and overall cost-effectiveness. Identifying the most effective platform will enable us to allocate resources more efficiently and optimize our advertising strategies to achieve better outcomes for our clients.

##### Libraries used

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score,mean_squared_error
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import coint
import warnings
warnings.filterwarnings('ignore')

##### Data Description:
The dataset comprises performance data for two separate advertising campaigns conducted throughout 2019: a Facebook Ad campaign and an AdWords campaign. It contains 365 rows, one for each day of the year, with various metrics to analyze the campaigns' effectiveness and efficiency.

Key features included:

- Date: The date corresponding to each campaign data row, ranging from January 1st to December 31st, 2019.
- Ad Views: The number of times the ad was viewed.
- Ad Clicks: The number of clicks received on the ad.
- Ad Conversions: The number of conversions resulting from the ad.
- Cost per Ad: The cost associated with running the ad campaign.
- Click-Through Rate (CTR): The ratio of clicks to views, indicating the ad's effectiveness in generating clicks.
- Conversion Rate: The ratio of conversions to clicks, reflecting the ad's effectiveness in driving desired actions.
- Cost per Click (CPC): The average cost incurred per click on the ad.
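
The three derived metrics in this list are simple ratios of the raw columns. A minimal sketch on made-up numbers follows; the column names `views`, `clicks`, `conversions`, and `cost` are illustrative assumptions, not the dataset's actual headers.

```python
import pandas as pd

# Hypothetical two-day sample; all values are invented for illustration only.
sample = pd.DataFrame({
    'views': [2000, 2500],
    'clicks': [50, 40],
    'conversions': [10, 8],
    'cost': [100.0, 120.0],
})

# Click-through rate: clicks as a percentage of views
sample['ctr_pct'] = sample['clicks'] / sample['views'] * 100
# Conversion rate: conversions as a percentage of clicks
sample['conversion_rate_pct'] = sample['conversions'] / sample['clicks'] * 100
# Cost per click: ad cost divided by clicks
sample['cpc'] = sample['cost'] / sample['clicks']

print(sample)
```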

In [None]:
data = pd.read_csv('marketing_campaign.csv')

In [None]:
data

In [None]:
data.dtypes

In [None]:
data['Date'] = pd.to_datetime(data['Date'])

In [None]:
data.describe()

# Comparing Campaign Performances

In [None]:
# distribution of clicks and conversions

plt.figure(figsize = (15,6))
plt.subplot(1,2,1)
plt.title('Facebook Ad Clicks')
sns.histplot(data['Facebook Ad Clicks'], bins = 7,edgecolor = 'k', kde = True)
plt.subplot(1,2,2)
plt.title('Facebook Ad Conversions')
sns.histplot(data['Facebook Ad Conversions'], bins = 7,edgecolor = 'k', kde = True)
plt.show()

plt.figure(figsize = (15,6))
plt.subplot(1,2,1)
plt.title('AdWords Ad Clicks')
sns.histplot(data['AdWords Ad Clicks'], bins = 7,edgecolor = 'k', kde = True)
plt.subplot(1,2,2)
plt.title('AdWords Ad Conversions')
sns.histplot(data['AdWords Ad Conversions'], bins = 7,edgecolor = 'k', kde = True)
plt.show()

The histograms are roughly symmetric, suggesting that daily clicks and conversions cluster around a central value, with few extreme days at either the high or low end.
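
The visual impression of symmetry can be backed by a number. The sketch below uses the pandas `skew()` method, where values near 0 indicate a symmetric distribution; `skewness_report` is a hypothetical helper and the toy values are invented.

```python
import pandas as pd

def skewness_report(df, columns):
    """Return the sample skewness of each listed column (0 means symmetric)."""
    return df[columns].skew()

# Invented, perfectly symmetric toy values standing in for daily click counts.
toy = pd.DataFrame({'clicks': [30, 40, 50, 60, 70]})
print(skewness_report(toy, ['clicks']))
```

On the real data, the same helper could be called with the four click and conversion column names used in the histograms above.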

#### How often do we observe days with a high number of conversions compared to days with a low number of conversions?
The goal is to determine the frequency of days with high conversion counts compared to days with low conversion counts.

In [None]:
# creating a function to calculate the category bucket for the conversions.

def conversion_category(conversion_col):
    category = []
    for conversion in data[conversion_col]:
        if conversion < 6:
            category.append('less than 6')
        elif 6 <= conversion < 11:
            category.append('6 - 10')
        elif 11 <= conversion <16:
            # the label reads '10 - 15', but this bucket covers counts from 11 to 15
            category.append('10 - 15')
        else:
            category.append('more than 15')
    return category

# applying function of different campaign's conversions
data['Facebook Conversion Category'] = conversion_category('Facebook Ad Conversions')
data['AdWords Conversion Category'] = conversion_category('AdWords Ad Conversions')

In [None]:
data['Facebook Conversion Category'].value_counts()

In [None]:
data['AdWords Conversion Category'].value_counts()

In [None]:
facebook = pd.DataFrame(data['Facebook Conversion Category'].value_counts()).reset_index().rename(columns = {'Facebook Conversion Category' : 'Category'})
facebook

In [None]:
AdWords = pd.DataFrame(data['AdWords Conversion Category'].value_counts()).reset_index().rename(columns = {'AdWords Conversion Category' : 'Category'})
AdWords

In [None]:
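category_df = pd.merge(facebook , AdWords , on = 'Category' , how = 'outer').fillna(0)
# order the buckets explicitly instead of relying on the row order returned by merge
category_order = ['less than 6', '6 - 10', '10 - 15', 'more than 15']
category_df = category_df.set_index('Category').reindex(category_order).fillna(0).reset_index()
category_df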

In [None]:
X_axis = np.arange(len(category_df))
plt.figure(figsize = (15,6))
plt.bar(X_axis - 0.2, category_df['count_x'], 0.4, label = 'Facebook', color = '#03989E', linewidth = 1, edgecolor = 'k')
plt.bar(X_axis + 0.2, category_df['count_y'], 0.4, label = 'AdWords', color = '#A62372', linewidth = 1, edgecolor = 'k')

plt.xticks(X_axis, category_df['Category'])
plt.xlabel('Conversion Category')
plt.ylabel('Number of Days')
plt.title('Frequency of Daily Conversions by Conversion Categories', fontsize = 15)
plt.legend(fontsize = 15)
plt.show()

- The Facebook campaign had more days with high conversion counts than the AdWords campaign, which mainly recorded either very low counts (less than 6) or moderate ones (6-10).
- The number of high-conversion days differs markedly between the two campaigns.
- The lack of any days with conversions between 10-15 and above 15 for the AdWords campaign suggests the need to review the strategies employed or external factors that may have influenced these numbers.
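
The bucketing used for these categories can also be expressed with `pd.cut`. This is a sketch on invented counts, assuming conversions are whole numbers (so "less than 6" is the same as "at most 5"); it reuses the bucket labels from the function above.

```python
import pandas as pd

# Invented conversion counts, one per day, for illustration.
conversions = pd.Series([3, 6, 10, 11, 15, 16, 22])

# Bin edges are right-inclusive: (-inf, 5], (5, 10], (10, 15], (15, inf)
bins = [-float('inf'), 5, 10, 15, float('inf')]
labels = ['less than 6', '6 - 10', '10 - 15', 'more than 15']
categories = pd.cut(conversions, bins=bins, labels=labels)

# sort=False keeps the buckets in their natural order rather than by frequency
print(categories.value_counts(sort=False))
```

Because the result is an ordered categorical, the buckets stay in a fixed order in tables and charts without manual reordering.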

#### Do more clicks really lead to more conversions?

In [None]:
plt.figure(figsize = (15,6))
plt.subplot(1,2,1)
plt.title('Facebook')
sns.scatterplot(x = data['Facebook Ad Clicks'], y = data['Facebook Ad Conversions'], color = '#03989E')
plt.xlabel('Clicks')
plt.ylabel('Conversions')

plt.subplot(1,2,2)
plt.title('AdWords')
sns.scatterplot(x = data['AdWords Ad Clicks'], y = data['AdWords Ad Conversions'], color = '#A62372')
plt.xlabel('Clicks')
plt.ylabel('Conversions')
plt.show()

In [None]:
facebook_click_conversion_corr = data[['Facebook Ad Clicks', 'Facebook Ad Conversions']].corr()
facebook_click_conversion_corr

In [None]:
AdWords_click_conversion_corr = data[['AdWords Ad Clicks', 'AdWords Ad Conversions']].corr()
AdWords_click_conversion_corr

In [None]:
print('Correlation Coefficients : ')
print('Facebook :', round(facebook_click_conversion_corr.values[0,1],2))
print('AdWords :', round(AdWords_click_conversion_corr.values[0,1],2))

- The correlation coefficient of 0.87 for Facebook ads indicates a strong positive linear relationship between clicks and conversions: as the number of clicks on Facebook ads increases, conversions tend to increase as well.
- For the AdWords campaign, the correlation coefficient of 0.45 indicates a moderate positive linear relationship between clicks and conversions. The relationship is weaker than the one observed for Facebook ads, suggesting that AdWords conversions may be influenced by other factors that require further analysis and optimization.
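
For reference, the coefficient reported by `.corr()` is Pearson's r: the covariance of the two series divided by the product of their standard deviations. A minimal sketch on invented numbers, where `pearson_r` is a hypothetical helper written only for illustration:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Invented click and conversion counts for five days.
clicks = [20, 30, 40, 50, 60]
conversions = [5, 7, 8, 12, 13]
print(round(pearson_r(clicks, conversions), 3))
```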

# Hypothesis Testing

The null and alternative hypotheses are as follows:

**Null Hypothesis (H0):** The mean number of daily conversions from Facebook advertising is less than or equal to the mean number of daily conversions from AdWords advertising.
    H0: μ_Facebook ≤ μ_AdWords

**Alternative Hypothesis (H1):** The mean number of daily conversions from Facebook advertising is greater than the mean number of daily conversions from AdWords advertising.
    H1: μ_Facebook > μ_AdWords

- The hypothesis testing aims to determine whether the data supports the claim that Facebook advertising generates more conversions than AdWords advertising.

In [None]:
print('Mean conversions : \n')
print('Facebook : ', round(data['Facebook Ad Conversions'].mean(),2))
print('AdWords : ', round(data['AdWords Ad Conversions'].mean(),2))

In [None]:
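import scipy.stats as st
# Welch's t-test (equal_var = False). ttest_ind is two-sided by default; because H1 is one-sided
# (Facebook > AdWords), a positive t statistic means the one-sided p-value is half the value printed here.
t_stats, p_value = st.ttest_ind(a = data['Facebook Ad Conversions'], b = data['AdWords Ad Conversions'], equal_var = False)
print('\nT statistics', t_stats,'\np-value',p_value)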

In [None]:
if p_value < 0.05:
    print('Reject the null hypothesis.')
else:
    print('Fail to reject the null hypothesis.')

- The mean number of conversions from Facebook ads (11.74) is substantially higher than the mean from AdWords ads (5.98), suggesting that, on average, Facebook advertising is more effective in generating conversions.
- The large T-statistic (32.88) and extremely small p-value (9.35e-134) provide strong evidence against the null hypothesis, supporting the alternative hypothesis that the number of conversions from Facebook advertising is greater than from AdWords advertising.
- The results strongly indicate that Facebook advertising is a more effective channel for generating conversions compared to AdWords advertising, based on the sample data analyzed.
- Given the significant difference in mean daily conversions, it is recommended to consider reallocating resources towards Facebook advertising efforts. This could involve increasing ad spend, expanding targeting efforts, or experimenting with different ad formats to capitalize on Facebook's effectiveness in driving conversions.
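
Since the alternative hypothesis is one-sided, the test can also be run in one-sided form. The sketch below uses synthetic Poisson counts (the means 12 and 6 are arbitrary assumptions, not the dataset) and the `alternative='greater'` argument of `scipy.stats.ttest_ind`, which is available in SciPy 1.6 and later.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic daily conversion counts; the Poisson means are arbitrary assumptions.
facebook = rng.poisson(lam=12, size=365)
adwords = rng.poisson(lam=6, size=365)

# Welch's t-test (unequal variances), testing H1: mean(facebook) > mean(adwords)
result = stats.ttest_ind(facebook, adwords, equal_var=False, alternative='greater')
# Default two-sided version, for comparison
two_sided = stats.ttest_ind(facebook, adwords, equal_var=False)

print('t =', result.statistic)
print('one-sided p =', result.pvalue)
print('two-sided p =', two_sided.pvalue)
```

When the t statistic is positive, the one-sided p-value is half the two-sided one, so a rejection under the two-sided test also holds under the one-sided test.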

# Regression Analysis
##### Determining the expected number of Facebook ad conversions given a certain number of Facebook ad clicks, by establishing a predictive relationship between the two variables using regression analysis

In [None]:
X = data[['Facebook Ad Clicks']]
Y = data[['Facebook Ad Conversions']]

prediction = LinearRegression().fit(X,Y).predict(X)

In [None]:
r2 = r2_score(Y, prediction)*100
mse = mean_squared_error(Y, prediction)
print('R2 Score (%) : ',round(r2,2))
print('Mean Squared Error : ',round(mse,2))

In [None]:
plt.figure(figsize= (8,6))
sns.scatterplot(x = data['Facebook Ad Clicks'], y= data['Facebook Ad Conversions'], color = '#03989E', label = 'Actual data points')
plt.plot(data['Facebook Ad Clicks'], prediction, color = '#A62372', label = 'Best Fit Line')
plt.legend()
plt.show()

In [None]:
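clicks = 60
# pass a DataFrame with the training column name so the input matches the fitted feature
expected = LinearRegression().fit(X,Y).predict(pd.DataFrame({'Facebook Ad Clicks': [clicks]}))[0][0]
print(f'For {clicks} clicks, Expected Conversion : {round(expected,2)}')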

- The linear regression model exhibits reasonably good predictive power, with an R-squared score of 76.35% on the data it was fitted to. This suggests that approximately 76.35% of the variability in Facebook ad conversions can be explained by the number of Facebook ad clicks.
- With the insights provided by this regression model, businesses can make informed decisions about resource allocation, budget planning, and campaign optimization for their Facebook advertising efforts. For instance, knowing the expected number of Facebook ad conversions for a given number of ad clicks can help in setting realistic campaign goals, optimizing ad spend, and assessing the return on investment (ROI) of Facebook advertising campaigns.
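
To make the fitted line and the R-squared score less of a black box, the single-feature case can be computed by hand. This is a sketch on invented numbers; `fit_line` and `r_squared` are hypothetical helpers, and for one feature they should agree with what `LinearRegression` and `r2_score` return.

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
    intercept = y.mean() - slope * x.mean()
    return slope, intercept

def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    y = np.asarray(y, dtype=float)
    residuals = y - (intercept + slope * np.asarray(x, dtype=float))
    return 1 - (residuals ** 2).sum() / ((y - y.mean()) ** 2).sum()

# Invented click and conversion counts for five days.
clicks = [20, 30, 40, 50, 60]
conversions = [5, 7, 8, 12, 13]
slope, intercept = fit_line(clicks, conversions)
print('expected conversions at 60 clicks:', intercept + slope * 60)
print('R^2:', r_squared(clicks, conversions, slope, intercept))
```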

# Analysing Facebook Campaign metrics over time

In [None]:
# cleaning data (removing unwanted symbols from the columns and converting them into numerical columns)

data ['Facebook Click-Through Rate (Clicks / View)'] = data ['Facebook Click-Through Rate (Clicks / View)'].apply(lambda x:float(x[:-1]))
data ['Facebook Conversion Rate (Conversions / Clicks)'] = data ['Facebook Conversion Rate (Conversions / Clicks)'].apply(lambda x:float(x[:-1]))
data ['Facebook Cost per Click (Ad Cost / Clicks)'] = data ['Facebook Cost per Click (Ad Cost / Clicks)'].apply(lambda x: float(x[1:]))
data ['Cost per Facebook Ad'] = data ['Cost per Facebook Ad'].apply(lambda x: float(x[1:]))

In [None]:
#filtering for facebook campaign

data = data [['Date', 'Facebook Ad Views','Facebook Ad Clicks', 'Facebook Ad Conversions', 'Cost per Facebook Ad',
              'Facebook Click-Through Rate (Clicks / View)','Facebook Conversion Rate (Conversions / Clicks)',
              'Facebook Cost per Click (Ad Cost / Clicks)']]

#### In which months and on which days of the week do we observe the most conversions?
The goal is to identify patterns in conversions across different months and days of the week to determine the optimal timing for advertising campaigns.

In [None]:
# extracting month and week day from the date column

data ['month'] = data ['Date'].dt.month
data ['week'] = data ['Date'].dt.weekday

plt.figure(figsize=(8,5))
plt.title('Weekly Conversions')
weekly_conversion = data.groupby('week') [['Facebook Ad Conversions']].sum()
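# dt.weekday numbers the days from Monday = 0 to Sunday = 6, so the labels must start with Monday
week_names= ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
plt.bar(week_names, weekly_conversion ['Facebook Ad Conversions'], color = '#03989E', edgecolor = 'k')
plt.show()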

In [None]:
plt.figure(figsize=(8,5))
plt.title('Monthly Conversions')
monthly_conversion = data.groupby('month') [['Facebook Ad Conversions']].sum()
month_names= ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sept', 'Oct', 'Nov', 'Dec']
plt.plot(month_names, monthly_conversion ['Facebook Ad Conversions'],'-o', color = '#03989E')
plt.show()

- Across weekdays over the year, total conversions remain relatively consistent, indicating a steady level of engagement throughout the week. Mondays and Tuesdays nonetheless show higher conversion totals than other days, suggesting heightened user engagement or responsiveness to marketing efforts at the beginning of the workweek. This reading assumes the bar labels follow the `dt.weekday` numbering, in which Monday is 0 and Sunday is 6.
- An examination of the monthly conversion trends reveals an overall upward trajectory, indicating a general increase in conversions over time. However, certain months, such as February, April, May, June, August, and November, experience a decline in conversions compared to neighboring months. These dips could be influenced by factors such as seasonal fluctuations, changes in consumer behavior, or adjustments in marketing strategies.
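
When grouping by `dt.weekday`, it helps to confirm how pandas numbers the days before attaching labels to a chart. A quick check on a known week (2019-01-07 was a Monday):

```python
import pandas as pd

# Seven consecutive days starting on a Monday cover one full week.
dates = pd.Series(pd.date_range('2019-01-07', periods=7, freq='D'))

mapping = pd.DataFrame({
    'weekday_number': dates.dt.weekday,  # Monday = 0 ... Sunday = 6
    'day_name': dates.dt.day_name(),
})
print(mapping)
```

Using `dt.day_name()` for the labels avoids maintaining a hand-written list of day names in the right order.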

#### How does the Cost Per Conversion (CPC) trend over time?
Cost Per Conversion (CPC) is a metric used to evaluate the cost-effectiveness and profitability of an online advertising campaign. It represents the average cost incurred for each conversion generated by the campaign. Note that in this section CPC denotes cost per conversion, not the cost per click listed among the dataset features. Analyzing the CPC trend over time helps marketers understand their spending in relation to conversions obtained, enabling them to optimize their advertising spend and targeting strategies effectively.

In [None]:
plt.figure(figsize=(8,5))
plt.title('Monthly Cost per Conversion(CPC)')
monthly = data.groupby('month') [['Facebook Ad Conversions','Cost per Facebook Ad']].sum()
monthly['Cost per Conversion(CPC)'] = monthly['Cost per Facebook Ad']/monthly['Facebook Ad Conversions']
plt.plot(month_names, monthly['Cost per Conversion(CPC)'],'-o', color = '#03989E')
plt.show()

- The CPC trend over the 12-month period shows some fluctuations but overall maintains a relatively stable range.
- May and November exhibit the lowest CPC values, indicating potentially more cost-effective advertising or higher conversion rates during these months.
- February has the highest CPC value, suggesting that advertising costs may be relatively higher during this month compared to others.
- Lower CPC values in certain months, such as May and November, could indicate periods of higher advertising effectiveness or more favorable market conditions for driving conversions.
- It is recommended to consider allocating a larger portion of the advertising budget to months with historically lower CPC values, such as May and November, to maximize the return on investment (ROI) and capitalize on periods of higher advertising effectiveness.
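
The monthly figure above is a ratio of totals: total spend in the month divided by total conversions in the month. This differs from averaging each day's cost per conversion, as the sketch below shows on invented daily figures (the column names `cost` and `conversions` are illustrative).

```python
import pandas as pd

# Invented daily figures for a single month, for illustration only.
days = pd.DataFrame({
    'cost': [100.0, 100.0, 100.0],
    'conversions': [5, 10, 20],
})

# Ratio of totals: total spend divided by total conversions
ratio_of_sums = days['cost'].sum() / days['conversions'].sum()
# Mean of daily ratios: gives low-volume days the same weight as busy days
mean_of_ratios = (days['cost'] / days['conversions']).mean()

print(round(ratio_of_sums, 2), round(mean_of_ratios, 2))
```

The ratio of totals answers "what did a conversion cost this month", which is usually the quantity of interest for budget allocation.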

#### Is there a long-term equilibrium relationship between advertising spend and conversions that suggests a stable and predictable impact of budget changes on conversions over time?
The goal is to determine whether there exists a long-term equilibrium relationship between advertising spend and conversions, which would indicate a stable and predictable impact of budget changes on conversions over an extended period.

In [None]:
score, p_value,_ = coint(data ['Cost per Facebook Ad'], data ['Facebook Ad Conversions'])
print('Cointegration test score:', score)
print('P-value:', p_value)
# null hypothesis of the test: the two series are not cointegrated
if p_value < 0.05:
    print("\np-value is less than significance value, Reject the null hypothesis")
else:
    print("\np-value is greater than significance value, Fail to reject the null hypothesis")

A cointegration test is conducted to determine the presence of a long-term equilibrium relationship between advertising spend (cost) and conversions.
- Since the p-value is lower than the chosen significance level, we reject the null hypothesis of no cointegration. This indicates a long-term equilibrium relationship between advertising spend (cost) and conversions, which makes the effect of budget changes on conversions more stable and predictable.
- Businesses can leverage this understanding of the stable relationship between cost and conversions to optimize their advertising strategies. By investing in campaigns that demonstrate a strong return on investment (ROI) and adjusting spending based on performance, companies can maximize conversions while minimizing costs.

## Final Recommendations for Business Optimization

### 1. Allocate More Resources to Facebook Advertising:
- The analysis consistently shows that Facebook advertising generates higher conversions compared to AdWords. Therefore, it's recommended to allocate more resources, such as budget and manpower, to Facebook advertising efforts.

### 2. Optimize Facebook Ad Performance:
- Focus on optimizing Facebook ad performance by:
  - Targeting specific audience segments that are more likely to convert.
  - Experimenting with different ad formats, messaging, and visuals to improve engagement and conversion rates.
  - Monitoring and adjusting ad campaigns regularly based on performance metrics to maximize ROI.

### 3. Consider Seasonal Trends and Timing:
- Take into account seasonal trends and timing when planning advertising campaigns. For example, Mondays and Tuesdays consistently exhibit higher conversion rates, suggesting these days may be more effective for launching campaigns or promoting new products/services.

### 4. Review AdWords Strategy:
- While AdWords still contributes to conversions, the analysis indicates that its effectiveness may be influenced by other factors. Review the AdWords strategy, targeting, and keyword selection to improve performance and ensure it complements the overall advertising strategy.

### 5. Utilize Predictive Models for Planning:
- Use predictive models, such as regression analysis, to forecast the expected number of conversions based on ad clicks. This can help in setting realistic campaign goals and optimizing ad spend for better ROI.

### 6. Monitor CPC Trends and Adjust Spending:
- Continuously monitor Cost Per Conversion (CPC) trends over time. Allocate a larger portion of the advertising budget to months with historically lower CPC values, such as May and November, to maximize ROI and capitalize on periods of higher advertising effectiveness.

### 7. Maintain Consistency in Engagement:
- Maintain consistent engagement throughout the week, as conversion rates remain relatively stable. Ensure marketing efforts are evenly distributed to leverage user engagement across all days of the week.

### 8. Long-term Strategy Based on Cost-Conversion Relationship:
- Leverage the long-term equilibrium relationship between advertising spend and conversions. Optimize advertising strategies based on this relationship to achieve stable and predictable conversion rates over time.

By implementing these suggestions, the business can improve the overall effectiveness of its advertising campaigns, maximize conversions, and achieve better returns on investment.
