<img src="https://www.predicagroup.com/app/uploads/2019/05/customer-churn-1024x662.jpg" width=800px height=800px justify-content=left>

# **SALES AND MARKETING**

***


## DESCRIPTION:
### The dataset contains information on sales and marketing, with the following columns:

1. Month: This column represents the month during which the data was collected or recorded. It serves as a time identifier to track the sales and marketing activities over different months.

2. Social Media: This column refers to the presence or engagement on social media platforms. It likely includes metrics such as the number of followers, likes, shares, comments, or other indicators of social media activity for a specific month.

3. Print Media: This column represents the presence or coverage of marketing efforts in print media. It could include mentions or advertisements in newspapers, magazines, brochures, or other physical printed materials related to the company or product.

4. Electronic Media: This column denotes the presence or exposure of marketing efforts in electronic media channels. It may include advertising on television, radio, online streaming platforms, or any other electronic media outlets.

5. Sales: This column captures the sales figures or revenue generated for a particular month. It represents the actual financial performance of the company or product during that specific time period.

The dataset aims to provide a comprehensive overview of the marketing activities across different media channels (social media, print media, electronic media) and their corresponding impact on sales. By analyzing this data, you can explore the relationships, patterns, and potential influences between marketing efforts and sales performance over time.


# WORKING IN PANDAS (DATA PREPROCESSING AND CLEANING)

In [28]:
import pandas as pd

In [29]:
# Load the dataset into a pandas DataFrame
dataset=pd.read_csv("./Databel - Data.csv")

In [30]:
dataset

Unnamed: 0,Month,Social Media,Print Media,Electronic Media,Sales
0,1/31/2010,4,3,7,404
1,2/28/2010,8,8,7,2948
2,3/31/2010,12,9,13,7095
3,4/30/2010,5,9,8,2563
4,5/31/2010,2,4,4,599
...,...,...,...,...,...
91,8/31/2017,5,8,3,699
92,9/30/2017,5,3,2,647
93,10/31/2017,4,6,3,1685
94,11/30/2017,8,9,5,3027


In [31]:
dataset.shape

(96, 5)

In [32]:
# size returns the total number of elements, i.e. rows x columns (96 x 5 = 480)
dataset.size

480
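The relationship between `shape` and `size` can be checked directly. The sketch below uses only the first three rows shown in the notebook's own output as a small stand-in frame (the name `sample` is hypothetical), so the numbers differ from the full 96-row dataset.

```python
import pandas as pd

# Hypothetical stand-in frame built from the first three rows of the output above
sample = pd.DataFrame({
    "Month": ["1/31/2010", "2/28/2010", "3/31/2010"],
    "Social Media": [4, 8, 12],
    "Print Media": [3, 8, 9],
    "Electronic Media": [7, 7, 13],
    "Sales": [404, 2948, 7095],
})

rows, cols = sample.shape        # shape is a (rows, columns) tuple
total = sample.size              # size is the number of cells in the frame

print(rows, cols, total)
print(total == rows * cols)
```

For the full dataset the same identity gives 96 x 5 = 480, matching the output above.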

In [33]:
dataset.head(6)

Unnamed: 0,Month,Social Media,Print Media,Electronic Media,Sales
0,1/31/2010,4,3,7,404
1,2/28/2010,8,8,7,2948
2,3/31/2010,12,9,13,7095
3,4/30/2010,5,9,8,2563
4,5/31/2010,2,4,4,599
5,6/30/2010,6,11,4,1363


In [34]:
dataset.tail(6)

Unnamed: 0,Month,Social Media,Print Media,Electronic Media,Sales
90,7/31/2017,6,8,5,1665
91,8/31/2017,5,8,3,699
92,9/30/2017,5,3,2,647
93,10/31/2017,4,6,3,1685
94,11/30/2017,8,9,5,3027
95,12/31/2017,1,1,3,80


In [35]:
#This command shows the columns of our dataset
dataset.columns

Index(['Month', 'Social Media', 'Print Media', 'Electronic Media', 'Sales'], dtype='object')

***

In [36]:
# Inspect the structure of the dataset: column dtypes and non-null counts
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 96 entries, 0 to 95
Data columns (total 5 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   Month             96 non-null     object
 1   Social Media      96 non-null     int64 
 2   Print Media       96 non-null     int64 
 3   Electronic Media  96 non-null     int64 
 4   Sales             96 non-null     object
dtypes: int64(3), object(2)
memory usage: 3.9+ KB


# Check for any null or missing values in the dataset.

***

In [37]:
dataset.isnull()

Unnamed: 0,Month,Social Media,Print Media,Electronic Media,Sales
0,False,False,False,False,False
1,False,False,False,False,False
2,False,False,False,False,False
3,False,False,False,False,False
4,False,False,False,False,False
...,...,...,...,...,...
91,False,False,False,False,False
92,False,False,False,False,False
93,False,False,False,False,False
94,False,False,False,False,False


In [38]:
dataset.isnull().sum()

Month               0
Social Media        0
Print Media         0
Electronic Media    0
Sales               0
dtype: int64

In [39]:
dataset.isnull().sum().sum()

0

In [48]:
# Sales was read as text (object dtype); strip the comma separators and convert to integers
dataset["Sales"] = dataset["Sales"].str.replace(",", "").astype(int)

In [49]:
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 96 entries, 0 to 95
Data columns (total 5 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   Month             96 non-null     object
 1   Social Media      96 non-null     int64 
 2   Print Media       96 non-null     int64 
 3   Electronic Media  96 non-null     int64 
 4   Sales             96 non-null     int32 
dtypes: int32(1), int64(3), object(1)
memory usage: 3.5+ KB
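The `info()` output shows that `Month` is still stored as text (`object`). If date-based operations are needed later, both columns could be converted along the following lines. This is a minimal sketch, not the notebook's method: the `sample` frame is hypothetical, and the comma-separated form of the raw `Sales` strings is an assumption inferred from the `str.replace(",", "")` step.

```python
import pandas as pd

# Hypothetical sample; the raw Sales strings with thousands separators are assumed
sample = pd.DataFrame({
    "Month": ["1/31/2010", "2/28/2010", "3/31/2010"],
    "Sales": ["404", "2,948", "7,095"],
})

# Remove separators, then convert; errors="coerce" turns unparseable entries into NaN
sample["Sales"] = pd.to_numeric(
    sample["Sales"].str.replace(",", "", regex=False), errors="coerce"
)

# Parse the month-end date strings (month/day/year) into datetime64 values
sample["Month"] = pd.to_datetime(sample["Month"], format="%m/%d/%Y")

print(sample.dtypes)
```

Using `errors="coerce"` makes bad entries visible as missing values instead of raising mid-conversion, so a follow-up `isnull().sum()` would reveal them. `pd.read_csv` also accepts `thousands=","` and `parse_dates`, which may let the same conversions happen at load time.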


***

# DATA IS CLEANED.


# ANALYSIS PART:
* ## We can now proceed to the analysis phase, focusing on correlation and covariance.



In [50]:
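# Month is still a text column; numeric_only=True restricts the calculation
# to numeric columns (needed in pandas 2.0+, available since pandas 1.5)
dataset.corr(numeric_only=True)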

Unnamed: 0,Social Media,Print Media,Electronic Media,Sales
Social Media,1.0,0.486814,0.528802,0.825858
Print Media,0.486814,1.0,0.32715,0.570283
Electronic Media,0.528802,0.32715,1.0,0.696956
Sales,0.825858,0.570283,0.696956,1.0


In [51]:
# As with corr(), limit the calculation to numeric columns
dataset.cov(numeric_only=True)

Unnamed: 0,Social Media,Print Media,Electronic Media,Sales
Social Media,4.863048,3.424013,2.951316,2998.783
Print Media,3.424013,10.172697,2.640789,2994.978
Electronic Media,2.951316,2.640789,6.405263,2904.416
Sales,2998.782895,2994.977632,2904.415789,2711255.0
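The two tables are linked: a Pearson correlation is the covariance divided by the product of the two standard deviations, and the diagonal of the covariance table holds the variances. As a worked check, the sketch below recomputes the Social Media / Sales correlation from the covariance values printed above (variable names are illustrative; the Sales variance is used as displayed, i.e. rounded).

```python
import math

# Values copied from the covariance table above
cov_social_sales = 2998.782895   # cov(Social Media, Sales)
var_social = 4.863048            # var(Social Media), diagonal entry
var_sales = 2711255.0            # var(Sales), diagonal entry (rounded in the output)

# Pearson r = cov(X, Y) / (std(X) * std(Y))
r = cov_social_sales / math.sqrt(var_social * var_sales)

print(round(r, 3))
```

The result agrees with the 0.825858 entry in the correlation table to within rounding of the displayed values.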


# SOME ANALYSIS QUESTIONS RELATED TO OUR DATA

### 1. Correlation between social media presence and sales

In [56]:
correlation = dataset["Social Media"].corr(dataset["Sales"])
print("Correlation between Social Media and Sales:", correlation.round(1))

Correlation between Social Media and Sales: 0.8


### 2. Correlation between electronic media presence and sales

In [57]:
correlation = dataset["Electronic Media"].corr(dataset["Sales"])
print("Correlation between Electronic Media and Sales:", correlation.round(1))

Correlation between Electronic Media and Sales: 0.7


### 3. Correlation between print media presence and sales


In [58]:
correlation = dataset["Print Media"].corr(dataset["Sales"])
print("Correlation between Print Media and Sales:", correlation.round(1))

Correlation between Print Media and Sales: 0.6
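The three questions above repeat the same calculation for each channel. One possible way to get all three values at once, and rank them, is `DataFrame.corrwith`. The sketch below runs on a hypothetical `sample` frame built from the six rows of the `head(6)` output, so its numbers will not match the full-dataset results (0.8, 0.7, 0.6).

```python
import pandas as pd

# Hypothetical sample: the six rows shown by dataset.head(6)
sample = pd.DataFrame({
    "Social Media": [4, 8, 12, 5, 2, 6],
    "Print Media": [3, 8, 9, 9, 4, 11],
    "Electronic Media": [7, 7, 13, 8, 4, 4],
    "Sales": [404, 2948, 7095, 2563, 599, 1363],
})

channels = ["Social Media", "Print Media", "Electronic Media"]

# Correlate each channel column with Sales, strongest relationship first
ranking = sample[channels].corrwith(sample["Sales"]).sort_values(ascending=False)

print(ranking.round(2))
```

On the full dataset, replacing `sample` with `dataset` should reproduce the Sales column of the correlation table in a single call.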
